AU708174B2 - Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use

Publication number
AU708174B2
Authority
AU
Australia
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU33824/95A
Other versions
AU3382495A (en
Inventor
Fons Bosman
Marie-Ange Buyse
Guy De Martynoff
Geert Maertens
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujirebio Europe NV SA
Original Assignee
Innogenetics NV SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Innogenetics NV SA
Priority to AU33824/95A
Priority claimed from PCT/EP1995/003031 (WO1996004385A2)
Publication of AU3382495A
Application granted
Publication of AU708174B2
Priority to AU57127/99A (AU757962B2)
Anticipated expiration
Ceased

Classifications

    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K16/00Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies
    • C07K16/08Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses
    • C07K16/10Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from viruses from RNA viruses
    • C07K16/1081Togaviridae, e.g. flavivirus, rubella virus, hog cholera virus
    • C07K16/109Hepatitis C virus; Hepatitis G virus
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K39/12Viral antigens
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K39/12Viral antigens
    • A61K39/29Hepatitis virus
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K2039/51Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
    • A61K2039/525Virus
    • A61K2039/5256Virus expressing foreign proteins
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2317/00Immunoglobulins specific features
    • C07K2317/30Immunoglobulins specific features characterized by aspects of specificity or valency
    • C07K2317/34Identification of a linear epitope shorter than 20 amino acid residues or of a conformational epitope defined by amino acid residues
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2710/00MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
    • C12N2710/00011Details
    • C12N2710/24011Poxviridae
    • C12N2710/24111Orthopoxvirus, e.g. vaccinia virus, variola
    • C12N2710/24141Use of virus, viral particle or viral elements as a vector
    • C12N2710/24143Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011Details
    • C12N2770/24011Flaviviridae
    • C12N2770/24211Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24222New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011Details
    • C12N2770/24011Flaviviridae
    • C12N2770/24211Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24234Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011Details
    • C12N2770/24011Flaviviridae
    • C12N2770/24211Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24241Use of virus, viral particle or viral elements as a vector
    • C12N2770/24243Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S435/00Chemistry: molecular biology and microbiology
    • Y10S435/803Physical recovery methods, e.g. chromatography, grinding
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S435/00Chemistry: molecular biology and microbiology
    • Y10S435/81Packaged device or kit
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S435/00Chemistry: molecular biology and microbiology
    • Y10S435/8215Microorganisms
    • Y10S435/911Microorganisms using fungi
    • Y10S435/913Aspergillus
    • Y10S435/915Aspergillus flavus
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S435/00Chemistry: molecular biology and microbiology
    • Y10S435/975Kit

Description

WO 96/04385 PCT/EP95/03031
PURIFIED HEPATITIS C VIRUS ENVELOPE PROTEINS FOR DIAGNOSTIC AND THERAPEUTIC USE
Field of the invention
The present invention relates to the general fields of recombinant protein expression, purification of recombinant proteins, synthetic peptides, diagnosis of HCV infection, prophylactic treatment against HCV infection, and the prognosis/monitoring of the clinical efficiency of treatment of an individual with chronic hepatitis, or the prognosis/monitoring of natural disease.
More particularly, the present invention relates to purification methods for hepatitis C virus envelope proteins; to the use in diagnosis, prophylaxis or therapy of HCV envelope proteins purified according to the methods described in the present invention; and to the use of single or specific oligomeric E1 and/or E2 and/or E1/E2 envelope proteins in assays for monitoring disease, and/or diagnosis of disease, and/or treatment of disease.
The invention also relates to epitopes of the E1 and/or E2 envelope proteins and monoclonal antibodies thereto, as well as their use in diagnosis, prophylaxis or treatment.
Background of the invention
The E2 protein purified from cell lysates according to the methods described in the present invention reacts with approximately 95% of patient sera. This reactivity is similar to the reactivity obtained with E2 secreted from CHO cells (Spaete et al., 1992).
However, the intracellularly expressed form of E2 may more closely resemble the native viral envelope protein because it contains high mannose carbohydrate motifs, whereas the E2 protein secreted from CHO cells is further modified with galactose and sialic acid sugar moieties. When the aminoterminal half of E2 is expressed in the baculovirus system, only about 13 to 21% of sera from several patient groups can be detected (Inoue et al., 1992). After expression of E2 from E. coli, the reactivity of HCV sera was even lower and ranged from 14% (Yokosuka et al., 1992) to 17% (Mita et al., 1992).
About 75% of HCV sera (and 95% of chronic patients) are anti-E1 positive using the purified, vaccinia-expressed recombinant E1 protein of the present invention, in sharp contrast with the results of Kohara et al. (1992) and Hsu et al. (1993). Kohara et al. used a vaccinia-virus expressed E1 protein and detected anti-E1 antibodies in 7 to 23% of patients, while Hsu et al. only detected 14/50 sera using baculovirus-expressed E1.
These results show that not only a good expression system but also a good purification protocol are required to reach a high reactivity of the envelope proteins with human patient sera. This can be obtained using the proper expression system and/or purification protocols of the present invention, which guarantee the conservation of the natural folding of the protein, and the purification protocols of the present invention, which guarantee the elimination of contaminating proteins and which preserve the conformation, and thus the reactivity, of the HCV envelope proteins. The amounts of purified HCV envelope protein needed for diagnostic screening assays are in the range of grams per year. For vaccine purposes, even higher amounts of envelope protein would be needed. Therefore, the vaccinia virus system may be used for selecting the best expression constructs and for limited upscaling, and large-scale expression and purification of single or specific oligomeric envelope proteins containing high-mannose carbohydrates may be achieved when expressed from several yeast strains. In the case of hepatitis B, for example, manufacturing of HBsAg from mammalian cells was much more costly compared with yeast-derived hepatitis B vaccines.
SUMMARY OF THE INVENTION
The present invention provides new purification methods for recombinantly expressed E1 and/or E2 and/or E1/E2 proteins such that said recombinant proteins are directly useable for diagnostic and vaccine purposes as single or specific oligomeric recombinant proteins free from contaminants instead of aggregates.
The present invention also provides compositions comprising purified (single or specific oligomeric) recombinant E1 and/or E2 and/or E1/E2 glycoproteins comprising conformational epitopes from the E1 and/or E2 domains of HCV.
According to a first aspect the present invention provides a method for purifying recombinant HCV single or specific oligomeric envelope proteins selected from the group consisting of E1 and/or E2 and/or E1/E2, wherein a disulphide bond cleavage or reduction step is carried out with a disulphide bond cleavage agent on the recombinantly expressed protein.
According to a second aspect the present invention provides a method according to the first aspect, further comprising at least the following steps:
- lysing recombinant E1 and/or E2 and/or E1/E2 expressing host cells, possibly in the presence of an SH blocking agent such as N-ethylmaleimide (NEM),
- recovering said HCV envelope proteins by affinity purification such as by means of lectin chromatography, such as lentil-lectin chromatography, or by means of immunoaffinity using anti-E1 and/or anti-E2 specific monoclonal antibodies,
- reduction or cleavage of the disulphide bonds with a disulphide bond cleaving agent, such as DTT, preferably also in the presence of an SH blocking agent, such as NEM or Biotin-NEM, and,
- recovering the reduced E1 and/or E2 and/or E1/E2 envelope proteins by gel filtration and possibly also by a subsequent Ni2+-IMAC chromatography and desalting step.
According to a third aspect the present invention provides an isolated HCV envelope protein obtained by a method according to the first aspect.
According to a fourth aspect the present invention provides a method according to the first or second aspect, further comprising at least the following steps:
- growing a host cell transformed with a recombinant vector comprising a nucleotide sequence allowing the expression of a HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 protein in a suitable culture medium,
- causing expression of said vector under suitable conditions, and,
- lysing said transformed host cells, preferably in the presence of an SH group blocking agent, such as N-ethylmaleimide (NEM),
- recovering said HCV envelope protein by affinity purification by means of for instance lectin chromatography or immunoaffinity chromatography using anti-E1 and/or anti-E2 specific monoclonal antibodies, with said lectin being preferably lentil-lectin, followed by,
- incubation of the eluate of the previous step with a disulphide bond cleavage agent, such as DTT, preferably also in the presence of an SH group blocking agent, such as NEM or Biotin-NEM, and,
- isolating the HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 protein by means of gel filtration and possibly also by means of an additional Ni2+-IMAC chromatography and desalting step.
According to a fifth aspect the present invention provides an isolated HCV envelope protein according to the third aspect, for use as a medicament.
According to a sixth aspect the present invention provides an isolated HCV envelope protein according to the third aspect, for use as a vaccine for immunising a mammal, preferably humans, against HCV, comprising administering an effective amount of said composition, possibly accompanied by pharmaceutically acceptable adjuvants, to produce an immune response.
According to a seventh aspect the present invention provides a method for immunising a mammal against HCV, comprising the steps of administering to said mammal an effective amount of an isolated HCV envelope protein according to the third aspect, to produce an immune response.
According to an eighth aspect the present invention provides a vaccine composition for immunising a mammal, preferably humans, against HCV, comprising an effective amount of an isolated HCV envelope protein according to the third aspect possibly accompanied by pharmaceutically acceptable adjuvants.
According to a ninth aspect the present invention provides an isolated HCV envelope protein according to the third aspect, for in vitro detection of HCV antibodies present in a biological sample.
According to a tenth aspect the present invention provides a method for in vitro diagnosis of HCV antibodies present in a biological sample, comprising at least the following steps:
a) contacting said biological sample with an isolated HCV envelope protein according to the third aspect, preferably in an immobilised form under appropriate conditions which allow the formation of an immune complex,
b) removing unbound components,
c) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions,
d) detecting the presence of said immune complexes visually or mechanically.
According to an eleventh aspect the present invention provides a kit for determining the presence of HCV antibodies present in a biological sample, comprising:
- at least one isolated HCV envelope protein according to the third aspect, preferably in an immobilised form on a solid substrate,
- a buffer or components necessary for producing the buffer enabling binding reaction between these proteins and antibodies against HCV present in said biological sample,
- a means for detecting the immune complexes formed in the preceding binding reaction.
According to a twelfth aspect the present invention provides the use of an isolated HCV envelope protein according to the third aspect, comprising HCV E1 protein, more particularly HCV single E1 proteins, for in vitro monitoring HCV disease or prognosing the response to treatment, particularly with interferon, of patients suffering from HCV infection comprising:
- incubating a biological sample from a patient with HCV infection with an E1 protein or a suitable part thereof under conditions allowing the formation of an immunological complex,
- removing unbound components,
- calculating the anti-E1 titers present in said sample at the start of and during the course of treatment,
- monitoring the natural course of HCV disease, or prognosing the response to treatment of said patient on the basis of the amount of anti-E1 titers found in said sample at the start of treatment and/or during the course of treatment.
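The comparison of anti-E1 titers at the start of and during treatment can be sketched in Python. This is only an illustration under stated assumptions: the function name `titer_fold_change`, the sample titer values, and the choice to express each measurement as a fraction of the baseline are hypothetical and are not defined anywhere in the specification, which gives no numerical criterion at this point.

```python
from typing import List, Sequence


def titer_fold_change(baseline: float, follow_ups: Sequence[float]) -> List[float]:
    """Express each follow-up anti-E1 titer as a fraction of the baseline titer.

    Values below 1.0 indicate a decrease relative to the start of treatment.
    """
    if baseline <= 0:
        raise ValueError("baseline titer must be positive")
    return [titer / baseline for titer in follow_ups]


# Hypothetical titers measured at the start of and during treatment.
ratios = titer_fold_change(800.0, [800.0, 400.0, 100.0])

# True when no measurement is higher than the one before it.
decreasing = all(later <= earlier for earlier, later in zip(ratios, ratios[1:]))
```

Any threshold for calling a decrease clinically meaningful would have to come from the assay validation, not from this sketch.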
According to a thirteenth aspect the present invention provides a kit for monitoring HCV disease or prognosing the response to treatment, particularly with interferon, of patients suffering from HCV infection comprising:
- at least one isolated HCV envelope protein, more particularly an E1 protein according to the third aspect,
- a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins and the anti-E1 antibodies present in a biological sample,
- means for detecting the immune complexes formed in the preceding binding reaction,
- possibly also an automated scanning and interpretation device for inferring a decrease of anti-E1 titers during the progression of treatment.
According to a fourteenth aspect the present invention provides a serotyping assay for detecting one or more serological types of HCV present in a biological sample, more particularly for detecting antibodies of the different types of HCV to be detected combined in one assay format, comprising at least the following steps:
a) contacting the biological sample to be analysed for the presence of HCV antibodies of one or more serological types, with at least one isolated HCV E1 and/or E2 and/or E1/E2 protein according to the third aspect, preferentially in an immobilised form under appropriate conditions which allow the formation of an immune complex,
b) removing unbound components,
c) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions,
d) detecting the presence of said immune complexes visually or mechanically (e.g. by means of densitometry, fluorimetry, colorimetry) and inferring the presence of one or more HCV serological types present from the observed binding pattern.
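Step d), inferring serological types from the observed binding pattern, can be illustrated with a short Python sketch. Everything specific in it is an assumption made for illustration: the function name `infer_serotypes`, the type labels, the signal values, and the single fixed cutoff are hypothetical and do not come from the specification.

```python
from typing import Dict, List


def infer_serotypes(signals: Dict[str, float], cutoff: float) -> List[str]:
    """Return the HCV types whose type-specific antigen gave a signal at or above the cutoff."""
    return sorted(hcv_type for hcv_type, signal in signals.items() if signal >= cutoff)


# Hypothetical readout (for example optical density) for each immobilised,
# type-specific envelope antigen in a single assay.
observed = {"type 1": 1.80, "type 2": 0.05, "type 3": 0.92}
reactive_types = infer_serotypes(observed, cutoff=0.5)
```

A real assay would also need to handle cross-reactivity between types, which a single cutoff does not address.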
According to a fifteenth aspect the present invention provides a kit for serotyping one or more serological types of HCV present in a biological sample, more particularly for detecting the antibodies to these serological types of HCV comprising:
- at least one isolated HCV E1 and/or E2 and/or E1/E2 protein according to the third aspect,
- a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins and the anti-E1 antibodies present in a biological sample,
- means for detecting the immune complexes formed in the preceding binding reaction,
- possibly also an automated scanning and interpretation device for detecting the presence of one or more serological types present from the observed binding pattern.
According to a sixteenth aspect the present invention provides an isolated HCV envelope protein according to the third aspect, to raise upon immunisation an E1 and/or E2 specific monoclonal antibody.
According to a seventeenth aspect the present invention provides an isolated HCV envelope protein according to the third aspect, for the preparation of an immunoassay kit.
According to an eighteenth aspect the present invention provides the use of an isolated HCV envelope protein according to the third aspect, for detecting HCV antibodies present in a biological sample.
Unless the context clearly requires otherwise, throughout the description and the claims, the words 'comprise', 'comprising', and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to".
Other embodiments and advantages of the present invention will be clear from the following description.
Definitions
The following definitions serve to illustrate the different terms and expressions used in the present invention.
The term 'hepatitis C virus single envelope proteins' refers to a polypeptide or an analogue thereof (e.g. mimotopes) comprising an amino acid sequence (and/or amino acid analogues) defining at least one HCV epitope of either the E1 or the E2 region. These single envelope proteins in the broad sense of the word may be either monomeric or homo-oligomeric forms of recombinantly expressed envelope proteins. Typically, the sequences defining the epitope correspond to the amino acid sequence of either the E1 or the E2 region of HCV (either identically or via substitution of analogues of the native amino acid residue that do not destroy the epitope). In general, the epitope-defining sequence will be 3 or more amino acids in length, more typically, 5 or more amino acids in length, more typically 8 or more amino acids in length, and even more typically or more amino acids in length. With respect to conformational epitopes, the length of the epitope-defining sequence can be subject to wide variations, since it is believed that these epitopes are formed by the three-dimensional shape of the antigen (e.g. folding).
Thus, the amino acids defining the epitope can be relatively few in number, but widely dispersed along the length of the molecule, being brought into the correct epitope conformation via folding. The portions of the antigen between the residues defining the epitope may not be critical to the conformational structure of the epitope. For example, deletion or substitution of these intervening sequences may not affect the conformational epitope provided sequences critical to epitope conformation are maintained (e.g. cysteines involved in disulfide bonding, glycosylation sites, etc.). A conformational epitope may also be formed by 2 or more essential regions of subunits of a homo-oligomer or hetero-oligomer.
The HCV antigens of the present invention comprise conformational epitopes from the E1 and/or E2 (envelope) domains of HCV. The E1 domain, which is believed to correspond to the viral envelope protein, is currently estimated to span amino acids 192-383 of the HCV polyprotein (Hijikata et al., 1991). Upon expression in a mammalian system (glycosylated), it is believed to have an approximate molecular weight of 35 kDa as determined via SDS-PAGE. The E2 protein, previously called NS1, is believed to span amino acids 384-809 or 384-746 (Grakoui et al., 1993) of the HCV polyprotein and to also be an envelope protein. Upon expression in a vaccinia system (glycosylated), it is believed to have an apparent gel molecular weight of about 72 kDa.
It is understood that these protein endpoints are approximations (e.g. the carboxy terminal end of E2 could lie somewhere in the 730-820 amino acid region, e.g. ending at amino acid 730, 735, 740, 742, 744, 745, preferably 746, 747, 748, 750, 760, 770, 780, 790, 800, 809, 810, 820). The E2 protein may also be expressed together with the E1, P7 (aa 747-809), NS2 (aa 810-1026), NS4A (aa 1658-1711) or NS4B (aa 1712-1972). Expression together with these other HCV proteins may be important for obtaining the correct protein folding.
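The approximate polyprotein coordinates quoted above can be turned into a small lookup for slicing a polyprotein sequence. The sketch below is illustrative only: the names `HCV_REGIONS` and `extract_region` are hypothetical, the coordinates are treated as 1-based and inclusive (an assumption about how the ranges are meant), and the placeholder sequence is not real HCV data.

```python
from typing import Dict, Tuple

# Approximate amino acid ranges quoted in the text, read as 1-based and inclusive.
HCV_REGIONS: Dict[str, Tuple[int, int]] = {
    "E1": (192, 383),
    "E2": (384, 746),   # another form of E2 extends to aa 809
    "P7": (747, 809),
    "NS2": (810, 1026),
    "NS4A": (1658, 1711),
    "NS4B": (1712, 1972),
}


def extract_region(polyprotein: str, region: str) -> str:
    """Return the sub-sequence of a polyprotein covering the named region."""
    start, end = HCV_REGIONS[region]
    if len(polyprotein) < end:
        raise ValueError(f"sequence too short for {region} (needs {end} residues)")
    # Convert the 1-based inclusive range to Python's 0-based half-open slice.
    return polyprotein[start - 1:end]


# Placeholder sequence standing in for a real polyprotein.
placeholder = "X" * 2000
e1_length = len(extract_region(placeholder, "E1"))
e2_length = len(extract_region(placeholder, "E2"))
```

Because the endpoints are stated to be approximations, the table would need adjusting for any particular isolate or construct.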
It is also understood that the isolates used in the examples section of the present invention were not intended to limit the scope of the invention and that any HCV isolate from type 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or any other new genotype of HCV is a suitable source of E1 and/or E2 sequence for the practice of the present invention.
The E1 and E2 antigens used in the present invention may be full-length viral proteins, substantially full-length versions thereof, or functional fragments thereof (e.g. fragments which are not missing sequence essential to the formation or retention of an epitope). Furthermore, the HCV antigens of the present invention can also include other sequences that do not block or prevent the formation of the conformational epitope of interest. The presence or absence of a conformational epitope can be readily determined through screening the antigen of interest with an antibody (polyclonal serum or monoclonal to the conformational epitope) and comparing its reactivity to that of a denatured version of the antigen which retains only linear epitopes (if any). In such screening using polyclonal antibodies, it may be advantageous to adsorb the polyclonal serum first with the denatured antigen and see if it retains antibodies to the antigen of interest.
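The native-versus-denatured comparison described above can be sketched as a simple ratio test. This is a hedged illustration, not a method taken from the specification: the function name `suggests_conformational_epitope`, the default `min_ratio` of 2.0, and the example signals are all assumptions.

```python
def suggests_conformational_epitope(
    native_signal: float,
    denatured_signal: float,
    min_ratio: float = 2.0,  # hypothetical threshold, not from the specification
) -> bool:
    """Return True when an antibody reacts markedly better with native antigen.

    A much weaker signal on the denatured antigen, which retains only linear
    epitopes, is taken here as a hint that a conformational epitope is involved.
    """
    if native_signal < 0 or denatured_signal < 0:
        raise ValueError("signals must be non-negative")
    if denatured_signal == 0:
        # No reactivity with denatured antigen: any native signal counts.
        return native_signal > 0
    return native_signal / denatured_signal >= min_ratio


# Hypothetical assay signals for two antibodies.
conformational = suggests_conformational_epitope(1.6, 0.2)
linear_only = suggests_conformational_epitope(0.5, 0.45)
```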
The HCV antigens of the present invention can be made by any recombinant method that provides the epitope of interest. For example, recombinant intracellular expression in mammalian or insect cells is a preferred method to provide glycosylated E1 and/or E2 antigens in 'native' conformation as is the case for the natural HCV antigens. Yeast cells and mutant yeast strains (e.g. mnn 9 mutant (Kniskern et al., 1994) or glycosylation mutants derived by means of vanadate resistance selection (Ballou et al., 1991)) may be ideally suited for production of secreted high-mannose-type sugars; whereas proteins secreted from mammalian cells may contain modifications including galactose or sialic acids which may be undesirable for certain diagnostic or vaccine applications. However, it may also be possible and sufficient for certain applications, as it is known for proteins, to express the antigen in other recombinant hosts (such as E. coli) and renature the protein after recovery.
The term 'fusion polypeptide' intends a polypeptide in which the HCV antigen(s) are part of a single continuous chain of amino acids, which chain does not occur in nature. The HCV antigens may be connected directly to each other by peptide bonds or be separated by intervening amino acid sequences. The fusion polypeptides may also contain amino acid sequences exogenous to HCV.
The term 'solid phase' intends a solid body to which the individual HCV antigens or the fusion polypeptide comprised of HCV antigens are bound covalently or by noncovalent means such as hydrophobic adsorption.
The term 'biological sample' intends a fluid or tissue of a mammalian individual (e.g. an anthropoid, a human) that commonly contains antibodies produced by the individual, more particularly antibodies against HCV. The fluid or tissue may also contain HCV antigen. Such components are known in the art and include, without limitation, blood, plasma, serum, urine, spinal fluid, lymph fluid, secretions of the respiratory, intestinal or genitourinary tracts, tears, saliva, milk, white blood cells and myelomas.
Body components include biological liquids. The term 'biological liquid' refers to a fluid obtained from an organism. Some biological fluids are used as a source of other products, such as clotting factors (e.g. Factor VIII:C), serum albumin, growth hormone and the like. In such cases, it is important that the source of biological fluid be free of contamination by virus such as HCV.
The term 'immunologically reactive' means that the antigen in question will react specifically with anti-HCV antibodies present in a body component from an HCV infected individual.
The term 'immune complex' intends the combination formed when an antibody binds to an epitope on an antigen.
'E1' as used herein refers to a protein or polypeptide expressed within the first 400 amino acids of an HCV polyprotein, sometimes referred to as the E, ENV or S protein. In its natural form it is a 35 kDa glycoprotein which is found in strong association with membranes. In most natural HCV strains, the E1 protein is encoded in the viral polyprotein following the C (core) protein. The E1 protein extends from approximately amino acid (aa) 192 to about aa 383 of the full-length polyprotein.
The term 'E1' as used herein also includes analogs and truncated forms that are immunologically cross-reactive with natural E1, and includes E1 proteins of genotypes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or any other newly identified HCV type or subtype.
'E2' as used herein refers to a protein or polypeptide expressed within the first 900 amino acids of an HCV polyprotein, sometimes referred to as the NS1 protein. In its natural form it is a 72 kDa glycoprotein that is found in strong association with membranes. In most natural HCV strains, the E2 protein is encoded in the viral polyprotein following the E1 protein. The E2 protein extends from approximately amino acid position 384 to amino acid position 746; another form of E2 extends to amino acid position 809. The term 'E2' as used herein also includes analogs and truncated forms that are immunologically cross-reactive with natural E2. For example, insertions of multiple codons between codon 383 and 384, as well as deletions of amino acids 384-387, have been reported by Kato et al. (1992).
'E1/E2' as used herein refers to an oligomeric form of envelope proteins containing at least one El component and at least one E2 component.
The term 'specific oligomeric' E1 and/or E2 and/or E1/E2 envelope proteins refers to all possible oligomeric forms of recombinantly expressed El and/or E2 envelope proteins which are not aggregates. El and/or E2 specific oligomeric envelope proteins are also referred to as homo-oligomeric El or E2 envelope proteins (see below).
The term 'single or specific oligomeric' E1 and/or E2 and/or E1/E2 envelope proteins refers to single monomeric E1 or E2 proteins (single in the strict sense of the word) as well as specific oligomeric E1 and/or E2 and/or E1/E2 recombinantly expressed proteins. These single or specific oligomeric envelope proteins according to the present invention can be further defined by the formula (E1)x(E2)y, wherein x can be a number between 0 and 100, and y can be a number between 0 and 100, provided that x and y are not both 0. With x=1 and y=0 said envelope proteins include monomeric E1.
The term 'homo-oligomer' as used herein refers to a complex of E1 and/or E2 containing more than one E1 or E2 monomer. E1/E1 dimers, E1/E1/E1 trimers or E1/E1/E1/E1 tetramers, E2/E2 dimers, E2/E2/E2 trimers or E2/E2/E2/E2 tetramers, E1 pentamers and hexamers, E2 pentamers and hexamers, and any higher-order homo-oligomers of E1 or E2 are all 'homo-oligomers' within the scope of this definition. The oligomers may contain one, two, or several different monomers of E1 or E2 obtained from different types or subtypes of hepatitis C virus, including for example those described in the international application published under WO 94/25601 and in European application No. 94870166.9, both by the present applicants. Such mixed oligomers are still homo-oligomers within the scope of this invention, and may allow more universal diagnosis, prophylaxis or treatment of HCV.
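The definitions above can be illustrated with a small sketch that classifies a complex by its number of E1 monomers (x) and E2 monomers (y). The function name and the returned labels are hypothetical and chosen only for illustration; the constraints on x and y are those stated in the definition of 'single or specific oligomeric' proteins.

```python
def classify_envelope_complex(x: int, y: int) -> str:
    """Classify a complex containing x E1 monomers and y E2 monomers.

    Illustrative only: the labels are not terms of art beyond those
    defined in the text (monomer, homo-oligomer, E1/E2 oligomer).
    """
    if not (0 <= x <= 100 and 0 <= y <= 100):
        raise ValueError("x and y must each lie between 0 and 100")
    if x == 0 and y == 0:
        raise ValueError("x and y must not both be 0")
    if x + y == 1:
        # A single monomer ("single in the strict sense of the word")
        return "monomeric E1" if x == 1 else "monomeric E2"
    if y == 0:
        return f"E1 homo-oligomer ({x}-mer)"
    if x == 0:
        return f"E2 homo-oligomer ({y}-mer)"
    # At least one E1 and at least one E2 component
    return f"E1/E2 oligomer ({x} E1 + {y} E2)"
```

For instance, x=1 and y=0 corresponds to monomeric E1, and x=2 and y=0 to an E1/E1 dimer.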
The term 'purified' as applied to proteins herein refers to a composition wherein the desired protein comprises at least 35% of the total protein component in the composition. The desired protein preferably comprises at least 40%, more preferably at least about 50%, more preferably at least about 60%, still more preferably at least about 70%, even more preferably at least about 80%, even more preferably at least about 90%, and most preferably at least about 95% of the total protein component.
The composition may contain other compounds such as carbohydrates, salts, lipids, solvents, and the like, without affecting the determination of the percentage purity as used herein. An 'isolated' HCV protein intends an HCV protein composition that is at least 35% pure.
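The purity definition above amounts to a simple ratio over the protein component only. The following is a minimal sketch of that arithmetic; the function names and the choice of milligrams as the unit are assumptions made for illustration, and non-protein compounds (carbohydrates, salts, lipids, solvents) are deliberately not inputs because they do not affect the percentage.

```python
PURIFIED_THRESHOLD_PERCENT = 35.0  # minimum stated in the definition of 'purified'


def percent_purity(desired_protein_mg: float, total_protein_mg: float) -> float:
    """Return the desired protein as a percentage of the total protein component."""
    if total_protein_mg <= 0:
        raise ValueError("total protein must be positive")
    if desired_protein_mg < 0 or desired_protein_mg > total_protein_mg:
        raise ValueError("desired protein must lie between 0 and the total protein")
    return 100.0 * desired_protein_mg / total_protein_mg


def is_purified(desired_protein_mg: float, total_protein_mg: float) -> bool:
    """True if the composition meets the 35% threshold for 'purified'."""
    return percent_purity(desired_protein_mg, total_protein_mg) >= PURIFIED_THRESHOLD_PERCENT
```

A composition with 7 mg of the desired protein in 10 mg of total protein would be 70% pure under this definition, regardless of how much salt or lipid it contains.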
The term 'essentially purified proteins' refers to proteins purified such that they can be used for in vitro diagnostic methods and as a therapeutic compound. These proteins are substantially free from cellular proteins, vector-derived proteins or other HCV viral components. Usually these proteins are purified to homogeneity (at least pure, preferably 90%, more preferably 95%, more preferably 97%, more preferably 98%, more preferably 99%, even more preferably 99.5%, and most preferably the contaminating proteins should be undetectable by conventional methods like SDS-PAGE and silver staining).
The term 'recombinantly expressed' used within the context of the present invention refers to the fact that the proteins of the present invention are produced by recombinant expression methods be it in prokaryotes, or lower or higher eukaryotes as discussed in detail below.
The term 'lower eukaryote' refers to host cells such as yeast, fungi and the like. Lower eukaryotes are generally (but not necessarily) unicellular. Preferred lower eukaryotes are yeasts, particularly species within Saccharomyces, Schizosaccharomyces, Kluyveromyces, Pichia (e.g. Pichia pastoris), Hansenula (e.g. Hansenula polymorpha), Yarrowia, Schwanniomyces, Zygosaccharomyces and the like. Saccharomyces cerevisiae, S. carlsbergensis and K. lactis are the most commonly used yeast hosts, and are convenient fungal hosts.
The term 'prokaryotes' refers to hosts such as E. coli, Lactobacillus, Lactococcus, Salmonella, Streptococcus, Bacillus subtilis or Streptomyces. These hosts are also contemplated within the present invention.
The term 'higher eukaryote' refers to host cells derived from higher animals, such as mammals, reptiles, insects, and the like. Presently preferred higher eukaryote host cells are derived from Chinese hamster (e.g. CHO), monkey (e.g. COS and Vero cells), baby hamster kidney (BHK), pig kidney (PK15), rabbit kidney 13 cells (RK13), the human osteosarcoma cell line 143 B, the human cell line HeLa and human hepatoma cell lines like Hep G2, and insect cell lines (e.g. Spodoptera frugiperda). The host cells may be provided in suspension or flask cultures, tissue cultures, organ cultures and the like.
Alternatively the host cells may also be transgenic animals.
The term 'polypeptide' refers to a polymer of amino acids and does not refer to a specific length of the product; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide. This term also does not refer to or exclude postexpression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like. Included within the definition are, for example, polypeptides containing one or more analogues of an amino acid (including, for example, unnatural amino acids, PNA, etc.), polypeptides with substituted linkages, as well as other modifications known in the art, both naturally occurring and non-naturally occurring.
The term 'recombinant polynucleotide or nucleic acid' intends a polynucleotide or nucleic acid of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of its origin or manipulation is not associated with all or a portion of a polynucleotide with which it is associated in nature, is linked to a polynucleotide other than that to which it is linked in nature, or does not occur in nature.
The terms 'recombinant host cells', 'host cells', 'cells', 'cell lines', 'cell cultures', and other such terms denoting microorganisms or higher eukaryotic cell lines cultured as unicellular entities refer to cells which can be, or have been, used as recipients for a recombinant vector or other transfer polynucleotide, and include the progeny of the original cell which has been transfected. It is understood that the progeny of a single parental cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, due to natural, accidental, or deliberate mutation.
The term 'replicon' refers to any genetic element, e.g. a plasmid, a chromosome, a virus, a cosmid, etc., that behaves as an autonomous unit of polynucleotide replication within a cell, i.e. one that is capable of replication under its own control.
The term 'vector' is a replicon further comprising sequences providing replication and/or expression of a desired open reading frame.
The term 'control sequence' refers to polynucleotide sequences which are necessary to effect the expression of coding sequences to which they are ligated. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include a promoter, a ribosomal binding site, and terminators; in eukaryotes, generally, such control sequences include promoters, terminators and, in some instances, enhancers. The term 'control sequences' is intended to include, at a minimum, all components whose presence is necessary for expression, and may also include additional components whose presence is advantageous, for example, leader sequences which govern secretion.
The term 'promoter' is a nucleotide sequence which is comprised of consensus sequences which allow the binding of RNA polymerase to the DNA template in a manner such that mRNA production initiates at the normal transcription initiation site for the adjacent structural gene.
The expression 'operably linked' refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. A control sequence 'operably linked' to a coding sequence is ligated in such a way that expression of the coding sequence is achieved under conditions compatible with the control sequences.
An 'open reading frame' (ORF) is a region of a polynucleotide sequence which encodes a polypeptide and does not contain stop codons; this region may represent a portion of a coding sequence or a total coding sequence.
A 'coding sequence' is a polynucleotide sequence which is transcribed into mRNA and/or translated into a polypeptide when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a translation start codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding sequence can include but is not limited to mRNA, DNA (including cDNA), and recombinant polynucleotide sequences.
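The definitions of 'open reading frame' and 'coding sequence' above can be sketched as two simple checks. This is an illustration under stated assumptions, not part of the disclosed method: it assumes the standard genetic code (stop codons TAA, TAG, TGA; start codon ATG), reads the sequence in frame from its first nucleotide, and uses hypothetical function names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def _normalise(seq: str) -> str:
    """Upper-case the sequence and treat RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def is_open_reading_frame(seq: str) -> bool:
    """True if seq, read in frame from position 1, contains no stop codon."""
    seq = _normalise(seq)
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)


def is_complete_coding_sequence(seq: str) -> bool:
    """True if seq starts with ATG, ends with a stop codon, and has no internal stop."""
    seq = _normalise(seq)
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    return (
        seq.startswith("ATG")
        and seq[-3:] in STOP_CODONS
        and is_open_reading_frame(seq[:-3])
    )
```

An ORF in this sense may be only a portion of a coding sequence, which is why the first check does not require start or stop codons at its boundaries.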
As used herein, 'epitope' or 'antigenic determinant' means an amino acid sequence that is immunoreactive. Generally an epitope consists of at least 3 to 4 amino acids, and more usually of at least 5 or 6 amino acids; sometimes the epitope consists of about 7 to 8, or even about 10 amino acids. As used herein, an epitope of a designated polypeptide denotes epitopes with the same amino acid sequence as the epitope in the designated polypeptide, and immunologic equivalents thereof. Such equivalents also include strain, subtype (genotype), or type(group)-specific variants, e.g. of the currently known sequences or strains belonging to genotypes 1a, 1b, 1c, 1d, 1e, 1f, 2a, 2b, 2c, 2d, 2e, 2f, 2g, 2h, 2i, 3a, 3b, 3c, 3d, 3e, 3f, 3g, 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 4k, 4l, 5a, 5b, 6a, 6b, 6c, 7a, 7b, 7c, 8a, 8b, 9a, 9b, 10a, or any other newly defined HCV (sub)type. It is to be understood that the amino acids constituting the epitope need not be part of a linear sequence, but may be interspersed by any number of amino acids, thus forming a conformational epitope.
The term 'immunogenic' refers to the ability of a substance to cause a humoral and/or cellular response, whether alone or when linked to a carrier, in the presence or absence of an adjuvant. 'Neutralization' refers to an immune response that blocks the infectivity, either partially or fully, of an infectious agent. A 'vaccine' is an immunogenic composition capable of eliciting protection against HCV, whether partial or complete.
A vaccine may also be useful for treatment of an individual, in which case it is called a therapeutic vaccine.
The term 'therapeutic' refers to a composition capable of treating HCV infection.
The term 'effective amount' refers to an amount of epitope-bearing polypeptide sufficient to induce an immunogenic response in the individual to which it is administered, or to otherwise detectably immunoreact in its intended system (e.g. immunoassay). Preferably, the effective amount is sufficient to effect treatment, as defined above. The exact amount necessary will vary according to the application. For vaccine applications or for the generation of polyclonal antiserum antibodies, for example, the effective amount may vary depending on the species, age, and general condition of the individual, the severity of the condition being treated, the particular polypeptide selected and its mode of administration, etc. It is also believed that effective amounts will be found within a relatively large, non-critical range. An appropriate effective amount can be readily determined using only routine experimentation.
Preferred ranges of E1 and/or E2 and/or E1/E2 single or specific oligomeric envelope proteins for prophylaxis of HCV disease are 0.01 to 100 µg/dose, preferably 0.1 to µg/dose. Several doses may be needed per individual in order to achieve a sufficient immune response and subsequent protection against HCV disease.
Detailed description of the invention
More particularly, the present invention contemplates a method for isolating or purifying recombinant HCV single or specific oligomeric envelope protein selected from the group consisting of E1 and/or E2 and/or E1/E2, characterized in that upon lysing the transformed host cells to isolate the recombinantly expressed protein a disulphide bond cleavage or reduction step is carried out with a disulphide bond cleaving agent.
The essence of these 'single or specific oligomeric' envelope proteins of the invention is that they are free from contaminating proteins and that they are not disulphide bond linked with contaminants.
The proteins according to the present invention are recombinantly expressed in lower or higher eukaryotic cells or in prokaryotes. The recombinant proteins of the present invention are preferably glycosylated and may contain high-mannose-type, hybrid, or complex glycosylations. Preferentially said proteins are expressed from mammalian cell lines as discussed in detail in the Examples section, or in yeast such as in mutant yeast strains also as detailed in the Examples section.
The proteins according to the present invention may be secreted or expressed within components of the cell, such as the ER or the Golgi Apparatus. Preferably, however, the proteins of the present invention bear high-mannose-type glycosylations and are retained in the ER or Golgi Apparatus of mammalian cells or are retained in or secreted from yeast cells, preferably secreted from yeast mutant strains such as the mnn9 mutant (Kniskern et al., 1994), or from mutants that have been selected by means of vanadate resistance (Ballou et al., 1991).
Upon expression of HCV envelope proteins, the present inventors could show that some of the free thiol groups of cysteines not involved in intra- or inter-molecular disulphide bridges react with cysteines of host or expression-system-derived (e.g. vaccinia) proteins or of other HCV envelope proteins (single or oligomeric), and form aspecific intermolecular bridges. This results in the formation of 'aggregates' of HCV envelope proteins together with contaminating proteins. It was also shown in WO 92/08734 that 'aggregates' were obtained after purification, but it was not described which protein interactions were involved. In patent application WO 92/08734, recombinant E1/E2 proteins expressed with the vaccinia virus system were partially purified as aggregates and only found to be 70% pure, rendering the purified aggregates not useful for diagnostic, prophylactic or therapeutic purposes.
Therefore, a major aim of the present invention resides in the separation of single or specific oligomeric HCV envelope proteins from contaminating proteins, and in the use of the purified proteins (95% pure) for diagnostic, prophylactic and therapeutic purposes. To this end, the present inventors have been able to provide evidence that aggregated protein complexes ('aggregates') are formed on the basis of disulphide bridges and non-covalent protein-protein interactions. The present invention thus provides a means for selectively cleaving the disulphide bonds under specific conditions and for separating the cleaved proteins from contaminating proteins which greatly interfere with diagnostic, prophylactic and therapeutic applications. The free thiol groups may be blocked (reversibly or irreversibly) in order to prevent the reformation of disulphide bridges, or may be left to oxidize and oligomerize with other envelope proteins (see definition homo-oligomer). It is to be understood that such protein oligomers are essentially different from the 'aggregates' described in WO 92/08734 and WO 94/01778, since the level of contaminating proteins is undetectable.
Said disulphide bond cleavage may also be achieved by:
- Performic acid oxidation, in which case the cysteine residues are modified into cysteic acid (Moore et al., 1963).
- Sulfitolysis (R-S-S-R → 2 R-SO3⁻), for example by means of sulphite (SO3²⁻) together with a proper oxidant such as Cu²⁺, in which case the cysteine is modified into S-sulphocysteine (Bailey and Cole, 1959).
- Reduction by means of mercaptans, such as dithiothreitol (DTT), β-mercaptoethanol, cysteine, reduced glutathione, β-mercaptoethylamine, or thioglycollic acid, of which DTT and β-mercaptoethanol are commonly used (Cleland, 1964). This is the preferred method of this invention because it can be performed in an aqueous environment and because the cysteine remains unmodified.
- Reduction by means of a phosphine (e.g. Bu3P) (Ruegg and Rudinger, 1977).
All these compounds are thus to be regarded as agents or means for cleaving disulphide bonds according to the present invention.
Said disulphide bond cleavage (or reducing) step of the present invention is preferably a partial disulphide bond cleavage (reducing) step (carried out under partial cleavage or reducing conditions).
A preferred disulphide bond cleavage or reducing agent according to the present invention is dithiothreitol (DTT). Partial reduction is obtained by using a low concentration of said reducing agent, i.e. for DTT for example in the concentration range of about 0.1 to about 50 mM, preferably about 0.1 to about 20 mM, preferably about to about 10 mM, preferably more than 1 mM, more than 2 mM or more than 5 mM, more preferably about 1.5 mM, about 2.0 mM, about 2.5 mM, about 5 mM or about mM.
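As a worked illustration of preparing a partial-reduction condition, the sketch below computes how much DTT stock to add for a chosen final concentration (simple C1·V1 = C2·V2 dilution) and checks the result against the broadest range stated above (about 0.1 to about 50 mM). The function names and the 1 M stock in the usage note are assumptions for illustration only; the stated limits are approximate ("about"), so the check is indicative rather than strict.

```python
DTT_PARTIAL_MIN_MM = 0.1   # lower end of the broadest stated range (approximate)
DTT_PARTIAL_MAX_MM = 50.0  # upper end of the broadest stated range (approximate)


def stock_volume_ul(stock_mm: float, final_mm: float, final_volume_ul: float) -> float:
    """Volume of stock (in µl) giving final_mm in final_volume_ul, from C1*V1 = C2*V2."""
    if stock_mm <= 0 or final_volume_ul <= 0:
        raise ValueError("stock concentration and final volume must be positive")
    if not (0 < final_mm <= stock_mm):
        raise ValueError("final concentration must be positive and not exceed the stock")
    return final_mm * final_volume_ul / stock_mm


def within_partial_reduction_range(final_mm: float) -> bool:
    """True if final_mm lies within the broadest DTT range given in the text."""
    return DTT_PARTIAL_MIN_MM <= final_mm <= DTT_PARTIAL_MAX_MM
```

With a hypothetical 1 M (1000 mM) DTT stock, a final concentration of 5 mM in 1000 µl would call for `stock_volume_ul(1000.0, 5.0, 1000.0)` µl of stock.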
Said disulphide bond cleavage step may also be carried out in the presence of a suitable detergent (as an example of a means for cleaving disulphide bonds or in combination with a cleaving agent) able to dissociate the expressed proteins, such as DecylPEG, EMPIGEN-BB, NP-40, sodium cholate, Triton X-100.
Said reduction or cleavage step (preferably a partial reduction or cleavage step) is carried out preferably in the presence of (with) a detergent. A preferred detergent according to the present invention is Empigen-BB. The amount of detergent used is preferably in the range of 1 to 10 preferably more than more preferably about of a detergent such as Empigen-BB.
A particularly preferred method for obtaining disulphide bond cleavage employs a combination of a classical disulphide bond cleavage agent as detailed above and a detergent (also as detailed above). As contemplated in the Examples section, the particular combination of a low concentration of DTT (1.5 to 7.5 mM) and about 3.5 of Empigen-BB is proven to be a particularly preferred combination of reducing agent and detergent for the purification of recombinantly expressed E1 and E2 proteins. Upon gel filtration chromatography, said partial reduction is shown to result in the production of possibly dimeric E1 protein and separation of this E1 protein from contaminating proteins that cause false reactivity upon use in immunoassays.
It is, however, to be understood that any other combination of any reducing agent known in the art with any detergent or other means known in the art to make the cysteines better accessible is also within the scope of the present invention, insofar as said combination reaches the same goal of disulphide bridge cleavage as the preferred combination exemplified in the present invention.
Apart from reducing the disulphide bonds, a disulphide bond cleaving means according to the present invention may also include any disulphide bridge exchanging agents (competitive agent being either organic or proteinaceous, see for instance Creighton, 1988) known in the art which allow the following type of reaction to occur:
R1-S-S-R2 + R3-SH → R1-S-S-R3 + R2-SH
R1, R2: compounds of protein aggregates
R3-SH: competitive agent (organic, proteinaceous)
The term 'disulphide bridge exchanging agent' is to be interpreted as including disulphide bond reforming as well as disulphide bond blocking agents.
The present invention also relates to methods for purifying or isolating HCV single or specific oligomeric envelope proteins as set out above further including the use of any SH group blocking or binding reagent known in the art, such as chosen from the following list:
- Glutathione
- 5,5'-dithiobis-(2-nitrobenzoic acid) or bis-(3-carboxy-4-nitrophenyl)-disulphide (DTNB or Ellman's reagent) (Ellman, 1959)
- N-ethylmaleimide (NEM; Benesch et al., 1956)
- N-(4-dimethylamino-3,5-dinitrophenyl) maleimide or Tuppy's maleimide, which provides a color to the protein
- p-chloromercuribenzoate (Grassetti et al., 1969)
- 4-vinylpyridine (Friedman and Krull, 1969), can be liberated after reaction by acid hydrolysis
- acrylonitrile, can be liberated after reaction by acid hydrolysis (Weil and Seibles, 1961)
- NEM-biotin (e.g. obtained from Sigma B1267)
- 2,2'-dithiopyridine (Grassetti and Murray, 1967)
- 4,4'-dithiopyridine (Grassetti and Murray, 1967)
- 6,6'-dithiodinicotinic acid (DTDNA; Brown and Cunningham, 1970)
- 2,2'-dithiobis-(5'-nitropyridine) (DTNP; US patent 3597160) or other dithiobis (heterocyclic derivative) compounds (Grassetti and Murray, 1969)
A survey of the publications cited shows that different reagents for sulphydryl groups will often react with varying numbers of thiol groups of the same protein or enzyme molecule. One may conclude that this variation in reactivity of the thiol groups is due to the steric environment of these groups, such as the shape of the molecule and the surrounding groups of atoms and their charges, as well as to the size, shape and charge of the reagent molecule or ion. Frequently the presence of adequate concentrations of denaturants such as sodium dodecylsulfate, urea or guanidine hydrochloride will cause sufficient unfolding of the protein molecule to permit equal access to all of the reagents for thiol groups. By varying the concentration of denaturant, the degree of unfolding can be controlled and in this way thiol groups with different degrees of reactivity may be revealed. Although to date most of the work reported has been done with p-chloromercuribenzoate, N-ethylmaleimide and DTNB, it is likely that the other more recently developed reagents may prove equally useful. Because of their varying structures, it seems likely, in fact, that they may respond differently to changes in the steric environment of the thiol groups.
Alternatively, conditions such as low pH (preferably lower than pH 6) for preventing free SH groups from oxidizing and thus preventing the formation of large intermolecular aggregates upon recombinant expression and purification of E1 and E2 (envelope) proteins are also within the scope of the present invention.
A preferred SH group blocking reagent according to the present invention is N-ethylmaleimide (NEM). Said SH group blocking reagent may be administered during lysis of the recombinant host cells and after the above-mentioned partial reduction process or after any other process for cleaving disulphide bridges. Said SH group blocking reagent may also be modified with any group capable of providing a detectable label and/or any group aiding in the immobilization of said recombinant protein to a solid substrate, e.g. biotinylated NEM.
Methods for cleaving cysteine bridges and blocking free cysteines have also been described in Darbre (1987), Means and Feeney (1971), and by Wong (1993).
A method to purify single or specific oligomeric recombinant E1 and/or E2 and/or E1/E2 proteins according to the present invention as defined above is further characterized as comprising the following steps:
- lysing recombinant E1 and/or E2 and/or E1/E2 expressing host cells, preferably in the presence of an SH group blocking agent, such as N-ethylmaleimide (NEM), and possibly a suitable detergent, preferably Empigen-BB,
- recovering said HCV envelope protein by affinity purification, for instance by means of lectin chromatography, such as lentil-lectin chromatography, or immunoaffinity chromatography using anti-E1 and/or anti-E2 specific monoclonal antibodies, followed by,
- reduction or cleavage of disulphide bonds with a disulphide bond cleaving agent, such as DTT, preferably also in the presence of an SH group blocking agent, such as NEM or Biotin-NEM, and,
- recovering the reduced HCV E1 and/or E2 and/or E1/E2 envelope proteins, for instance by gel filtration (size exclusion chromatography or molecular sieving) and possibly also by an additional Ni2+-IMAC chromatography and desalting step.
It is to be understood that the above-mentioned recovery steps may also be carried out using any other suitable technique known by the person skilled in the art.
Preferred lectin-chromatography systems include Galanthus nivalis agglutinin (GNA) chromatography, or Lens culinaris agglutinin (LCA) (lentil) lectin chromatography as illustrated in the Examples section. Other useful lectins include those recognizing high-mannose type sugars, such as Narcissus pseudonarcissus agglutinin (NPA), Pisum sativum agglutinin (PSA), or Allium ursinum agglutinin (AUA).
Preferably said method is usable to purify single or specific oligomeric HCV envelope protein produced intracellularly as detailed above.
For secreted E1 or E2 or E1/E2 oligomers, lectins binding complex sugars such as Ricinus communis agglutinin I (RCA I) are preferred lectins.
The present invention more particularly contemplates essentially purified recombinant HCV single or specific oligomeric envelope proteins, selected from the group consisting of E1 and/or E2 and/or E1/E2, characterized as being isolated or purified by a method as defined above.
The present invention more particularly relates to the purification or isolation of recombinant envelope proteins which are expressed from recombinant mammalian cells, such as cells infected with recombinant vaccinia virus.
The present invention also relates to the purification or isolation of recombinant envelope proteins which are expressed from recombinant yeast cells.
The present invention equally relates to the purification or isolation of recombinant envelope proteins which are expressed from recombinant bacterial (prokaryotic) cells.
The present invention also contemplates a recombinant vector comprising a vector sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter sequence followed by a nucleotide sequence allowing the expression of the single or specific oligomeric E1 and/or E2 and/or E1/E2 of the invention.
Particularly, the present invention contemplates a recombinant vector comprising a vector sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter sequence followed by a nucleotide sequence allowing the expression of the single E1 or E1 of the invention.
Particularly, the present invention contemplates a recombinant vector comprising a vector sequence, an appropriate prokaryotic, eukaryotic or viral or synthetic promoter sequence followed by a nucleotide sequence allowing the expression of the single E1 or E2 of the invention.
The segment of the HCV cDNA encoding the desired E1 and/or E2 sequence inserted into the vector sequence may be attached to a signal sequence. Said signal sequence may be that from a non-HCV source, e.g. the IgG or tissue plasminogen activator (tpa) leader sequence for expression in mammalian cells, or the α-mating factor sequence for expression in yeast cells, but particularly preferred constructs according to the present invention contain signal sequences appearing in the HCV genome before the respective start points of the E1 and E2 proteins. The segment of the HCV cDNA encoding the desired E1 and/or E2 sequence inserted into the vector may also include deletions, e.g. of the hydrophobic domain(s) as illustrated in the Examples section, or of the E2 hypervariable region I.
More particularly, the recombinant vectors according to the present invention encompass a nucleic acid having an HCV cDNA segment encoding the polyprotein starting in the region between amino acid positions 1 and 192 and ending in the region between positions 250 and 400 of the HCV polyprotein, more preferably ending in the region between positions 250 and 341, even more preferably ending in the region between positions 290 and 341, for expression of the HCV single E1 protein. Most preferably, the present recombinant vector encompasses a recombinant nucleic acid having an HCV cDNA segment encoding part of the HCV polyprotein starting in the region between positions 117 and 192, and ending at any position in the region between positions 263 and 326, for expression of HCV single E1 protein. Also within the scope of the present invention are forms that have the first hydrophobic domain deleted (positions 264 to 293 plus or minus 8 amino acids), forms to which a 5'-terminal ATG codon and a 3'-terminal stop codon have been added, and forms to which a factor Xa cleavage site and/or 3 to 10, preferably 6, histidine codons have been added.
More particularly, the recombinant vectors according to the present invention encompass a nucleic acid having an HCV cDNA segment encoding the polyprotein starting in the region between amino acid positions 290 and 406 and ending in the region between positions 600 and 820 of the HCV polyprotein, more preferably starting in the region between positions 322 and 406, even more preferably starting in the region between positions 347 and 406, even still more preferably starting in the region between positions 364 and 406, for expression of the HCV single E2 protein. Most preferably, the present recombinant vector encompasses a recombinant nucleic acid having an HCV cDNA segment encoding the polyprotein starting in the region between positions 290 and 406, and ending at any of positions 623, 650, 661, 673, 710, 715, 720, 746 or 809, for expression of HCV single E2 protein. Also within the scope of the present invention are forms to which a 5'-terminal ATG codon and a 3'-terminal stop codon have been added, and forms to which a factor Xa cleavage site and/or 3 to 10, preferably 6, histidine codons have been added.
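The start and end windows given above for E1 and E2 constructs lend themselves to a simple range check. The sketch below encodes only the broadest windows stated in the text (E1: start 1 to 192, end 250 to 400; E2: start 290 to 406, end 600 to 820) and converts amino acid positions into nucleotide coordinates. All names are hypothetical, and the nucleotide coordinates are an assumption: they are counted from the first nucleotide of the polyprotein open reading frame, not from the start of the viral genome.

```python
from typing import NamedTuple, Tuple


class SegmentWindow(NamedTuple):
    """Allowed ranges (1-based, inclusive) for the first and last encoded amino acid."""
    start_min: int
    start_max: int
    end_min: int
    end_max: int


# Broadest windows stated in the text for single E1 and single E2 constructs
E1_WINDOW = SegmentWindow(start_min=1, start_max=192, end_min=250, end_max=400)
E2_WINDOW = SegmentWindow(start_min=290, start_max=406, end_min=600, end_max=820)


def segment_in_window(start: int, end: int, window: SegmentWindow) -> bool:
    """True if a segment from amino acid start to end fits the given window."""
    return (
        window.start_min <= start <= window.start_max
        and window.end_min <= end <= window.end_max
    )


def codon_coordinates(start: int, end: int) -> Tuple[int, int]:
    """Nucleotide span (1-based, inclusive, relative to the ORF start) encoding start..end."""
    if start < 1 or end < start:
        raise ValueError("positions must satisfy 1 <= start <= end")
    return (3 * (start - 1) + 1, 3 * end)
```

For example, a construct encoding amino acids 192 to 326 fits the E1 window and spans nucleotides 574 to 978 of the polyprotein ORF under the stated assumption.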
A variety of vectors may be used to obtain recombinant expression of HCV single or specific oligomeric envelope proteins of the present invention. Lower eukaryotes such as yeasts and glycosylation mutant strains are typically transformed with plasmids, or are transformed with a recombinant virus. The vectors may replicate within the host WO 96/04385 PCT/EP95/03031 19 independently, or may integrate into the host cell genome.
Higher eukaryotes may be transformed with vectors, or may be infected with a recombinant virus, for example a recombinant vaccinia virus. Techniques and vectors for the insertion of foreign DNA into vaccinia virus are well known in the art, and utilize, for example homologous recombination. A wide variety of viral promoter sequences, possibly terminator sequences and poly(A)-addition sequences, possibly enhancer sequences and possibly amplification sequences, all required for the mammalian expression, are available in the art. Vaccinia is particularly preferred since vaccinia halts the expression of host cell proteins. Vaccinia is also very much preferred since it allows the expression of El and E2 proteins of HCV in cells or individuals which are immunized with the live recombinant vaccinia virus. For vaccination of humans the avipox and Ankara Modified Virus (AMV) are particularly useful vectors.
Also known are insect expression transfer vectors derived from baculovirus Autoqrapha californica nuclear polyhedrosis virus (AcNPV), which is a helperindependent viral expression vector. Expression vectors derived from this system usually use the strong viral polyhedrin gene promoter to drive the expression of heterologous genes. Different vectors as well as methods for the introduction of heterologous DNA into the desired site of baculovirus are available to the man skilled in the art for baculovirus expression. Also different signals for posttranslational modification recognized by insect cells are known in the art.
Also included within the scope of the present invention is a method for producing purified recombinant single or specific oligomeric HCV E1 or E2 or E1/E2 proteins, wherein the cysteine residues involved in aggregate formation are replaced at the level of the nucleic acid sequence by other residues such that aggregate formation is prevented. The recombinant proteins expressed by recombinant vectors carrying such a mutated E1 and/or E2 protein encoding nucleic acid are also within the scope of the present invention.
The present invention also relates to recombinant E1 and/or E2 and/or E1/E2 proteins characterized in that at least one of their glycosylation sites has been removed and which are consequently termed glycosylation mutants. As explained in the Examples section, different glycosylation mutants may be desired to diagnose (screening, confirmation, prognosis, etc.) and prevent HCV disease according to the patient in question. An E2 protein glycosylation mutant lacking the GLY4 has for instance been found to improve the reactivity of certain sera in diagnosis. These glycosylation mutants are preferably purified according to the method disclosed in the present invention. Also contemplated within the present invention are recombinant vectors carrying the nucleic acid insert encoding such an E1 and/or E2 and/or E1/E2 glycosylation mutant as well as host cells transformed with such a recombinant vector.
The present invention also relates to recombinant vectors including a polynucleotide which also forms part of the present invention. The present invention relates more particularly to the recombinant nucleic acids as represented in SEQ ID NO 3, 5, 7, 9, 11, 13, 21, 23, 25, 27, 29, 31, 35, 37, 39, 41, 43, 45, 47 and 49, or parts thereof.
The present invention also contemplates host cells transformed with a recombinant vector as defined above, wherein said vector comprises a nucleotide sequence encoding HCV E1 and/or E2 and/or E1/E2 protein as defined above in addition to a regulatory sequence operably linked to said HCV E1 and/or E2 and/or E1/E2 sequence and capable of regulating the expression of said HCV E1 and/or E2 and/or E1/E2 protein.
Eukaryotic hosts include lower and higher eukaryotic hosts as described in the definitions section. Lower eukaryotic hosts include yeast cells well known in the art.
Higher eukaryotic hosts mainly include mammalian cell lines known in the art and include many immortalized cell lines available from the ATCC, including HeLa cells, Chinese hamster ovary (CHO) cells, Baby hamster kidney (BHK) cells, PK15, RK13 and a number of other cell lines.
The present invention relates particularly to a recombinant E1 and/or E2 and/or E1/E2 protein expressed by a host cell as defined above containing a recombinant vector as defined above. These recombinant proteins are particularly purified according to the method of the present invention.
A preferred method for isolating or purifying HCV envelope proteins as defined above is further characterized as comprising at least the following steps:
- growing a host cell as defined above, transformed with a recombinant vector according to the present invention or with a known recombinant vector expressing E1 and/or E2 and/or E1/E2 HCV envelope proteins, in a suitable culture medium,
- causing expression of said vector sequence as defined above under suitable conditions, and,
- lysing said transformed host cells, preferably in the presence of an SH group blocking agent, such as N-ethylmaleimide (NEM), and possibly a suitable detergent, preferably Empigen-BB,
- recovering said HCV envelope protein by affinity purification, such as by means of lectin chromatography or immunoaffinity chromatography using anti-E1 and/or anti-E2 specific monoclonal antibodies, with said lectin being preferably lentil lectin or GNA, followed by,
- incubation of the eluate of the previous step with a disulphide bond cleavage means, such as DTT, preferably followed by incubation with an SH group blocking agent, such as NEM or Biotin-NEM, and,
- isolating the HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 proteins, such as by means of gel filtration and possibly also by a subsequent Ni2+-IMAC chromatography followed by a desalting step.
As a result of the above-mentioned process, E1 and/or E2 and/or E1/E2 proteins may be produced in a form which elutes differently from the large aggregates containing vector-derived components and/or cell components in the void volume of the gel filtration column or the IMAC column, as illustrated in the Examples section. The disulphide bridge cleavage step advantageously also eliminates the false reactivity due to the presence of host and/or expression-system-derived proteins. The presence of NEM and a suitable detergent during lysis of the cells may already partly or even completely prevent the aggregation between the HCV envelope proteins and contaminants.
Ni2+-IMAC chromatography followed by a desalting step is preferably used for constructs bearing a (His)6 tail, as described by Janknecht et al., 1991, and Hochuli et al., 1988.
The present invention also relates to a method for producing monoclonal antibodies in small animals such as mice or rats, as well as a method for screening and isolating human B-cells that recognize anti-HCV antibodies, using the HCV single or specific oligomeric envelope proteins of the present invention.
The present invention further relates to a composition comprising at least one of the following E1 peptides as listed in Table 3:
- E1-31 (SEQ ID NO 56) spanning amino acids 181 to 200 of the Core/E1 V1 region,
- E1-33 (SEQ ID NO 57) spanning amino acids 193 to 212 of the E1 region,
- E1-35 (SEQ ID NO 58) spanning amino acids 205 to 224 of the E1 V2 region (epitope B),
- E1-35A (SEQ ID NO 59) spanning amino acids 208 to 227 of the E1 V2 region (epitope B),
- 1bE1 (SEQ ID NO 53) spanning amino acids 192 to 228 of E1 regions (V1, C1, and V2 regions (containing epitope B)),
- E1-51 (SEQ ID NO 66) spanning amino acids 301 to 320 of the E1 region,
- E1-53 (SEQ ID NO 67) spanning amino acids 313 to 332 of the E1 C4 region (epitope A),
- E1-55 (SEQ ID NO 68) spanning amino acids 325 to 344 of the E1 region.
The present invention also relates to a composition comprising at least one of the following E2 peptides as listed in Table 3:
- Env 67 or E2-67 (SEQ ID NO 72) spanning amino acid positions 397 to 416 of the E2 region (epitope A, recognized by monoclonal antibody 2F10H10, see Figure 19),
- Env 69 or E2-69 (SEQ ID NO 73) spanning amino acid positions 409 to 428 of the E2 region (epitope A),
- Env 23 or E2-23 (SEQ ID NO 86) spanning positions 583 to 602 of the E2 region (epitope E),
- Env 25 or E2-25 (SEQ ID NO 87) spanning positions 595 to 614 of the E2 region (epitope E),
- Env 27 or E2-27 (SEQ ID NO 88) spanning positions 607 to 626 of the E2 region (epitope E),
- Env 17B or E2-17B (SEQ ID NO 83) spanning positions 547 to 566 of the E2 region (epitope D),
- Env 13B or E2-13B (SEQ ID NO 82) spanning positions 523 to 542 of the E2 region (epitope C; recognized by monoclonal antibody 16A6E7, see Figure 19).
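Several of the peptides listed above are 20 residues long and start 12 positions apart (for example 181 to 200, 193 to 212 and 205 to 224), so consecutive peptides overlap by 8 residues. The Python sketch below reproduces such spans. It is an illustration only: the function name is hypothetical, and the length of 20 and offset of 12 are inferred from the listed positions rather than stated as a design rule in the text. Peptides such as E1-35A and 1bE1 do not follow this pattern.

```python
from typing import Iterator, Tuple


def peptide_windows(first: int, last: int, length: int = 20,
                    step: int = 12) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) spans of fixed length, advancing by `step`.

    Hypothetical helper: defaults are inferred from the spans listed above.
    Only windows that fit entirely within [first, last] are produced.
    """
    start = first
    while start + length - 1 <= last:
        yield (start, start + length - 1)
        start += step


# Spans matching E1-31, E1-33 and E1-35 as listed in the text.
e1_spans = list(peptide_windows(181, 224))

# Spans matching E2-23, E2-25 and E2-27 as listed in the text.
e2_spans = list(peptide_windows(583, 626))
```

With these defaults each residue in the interior of the covered stretch appears in at least one peptide, and an 8-residue overlap reduces the chance that a short linear epitope is split between two neighbouring peptides.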
The present invention also relates to a composition comprising at least one of the following E2 conformational epitopes: epitope F recognized by monoclonal antibodies 15C8C1, 12D11F1 and 8G10D1H9, epitope G recognized by monoclonal antibody 9G3E6, epitope H (or C) recognized by monoclonal antibodies 10D3C4 and 4H6B2, or epitope I recognized by monoclonal antibody 17F2C2.
The present invention also relates to an E1 or E2 specific antibody raised upon immunization with a peptide or protein composition, with said antibody being specifically reactive with any of the polypeptides or peptides as defined above, and with said antibody being preferably a monoclonal antibody.
The present invention also relates to an E1 or E2 specific antibody screened from a variable chain library in plasmids or phages or from a population of human B-cells by means of a process known in the art, with said antibody being reactive with any of the polypeptides or peptides as defined above, and with said antibody being preferably a monoclonal antibody.
The E1 or E2 specific monoclonal antibodies of the invention can be produced by any hybridoma liable to be formed according to classical methods from splenic cells of an animal, particularly from a mouse or rat, immunized against the HCV polypeptides or peptides according to the invention, as defined above, on the one hand, and of cells of a myeloma cell line on the other hand, and to be selected by the ability of the hybridoma to produce the monoclonal antibodies recognizing the polypeptides which have been initially used for the immunization of the animals.
The antibodies involved in the invention can be labelled by an appropriate label of the enzymatic, fluorescent, or radioactive type.
The monoclonal antibodies according to this preferred embodiment of the invention may be humanized versions of mouse monoclonal antibodies made by means of recombinant DNA technology, departing from parts of mouse and/or human genomic DNA sequences coding for H and L chains from cDNA or genomic clones coding for H and L chains.
Alternatively the monoclonal antibodies according to this preferred embodiment of the invention may be human monoclonal antibodies. These antibodies according to the present embodiment of the invention can also be derived from human peripheral blood lymphocytes of patients infected with HCV, or vaccinated against HCV. Such human monoclonal antibodies are prepared, for instance, by means of human peripheral blood lymphocytes (PBL) repopulation of severe combined immune deficiency (SCID) mice (for recent review, see Duchosal et al., 1992).
The invention also relates to the use of the proteins or peptides of the invention, for the selection of recombinant antibodies by the process of repertoire cloning (Persson et al., 1991).
Antibodies directed to peptides or single or specific oligomeric envelope proteins derived from a certain genotype may be used as a medicament, more particularly for incorporation into an immunoassay for the detection of HCV genotypes (for detecting the presence of HCV E1 or E2 antigen), for prognosing/monitoring of HCV disease, or as therapeutic agents.
Alternatively, the present invention also relates to the use of any of the above-specified E1 or E2 specific monoclonal antibodies for the preparation of an immunoassay kit for detecting the presence of E1 or E2 antigen in a biological sample, for the preparation of a kit for prognosing/monitoring of HCV disease or for the preparation of an HCV medicament.
The present invention also relates to a method for in vitro diagnosis or detection of HCV antigen present in a biological sample, comprising at least the following steps: (i) contacting said biological sample with any of the E1 and/or E2 specific monoclonal antibodies as defined above, preferably in an immobilized form under appropriate conditions which allow the formation of an immune complex, (ii) removing unbound components, (iii) incubating the immune complexes formed with heterologous antibodies, which specifically bind to the antibodies present in the sample to be analyzed, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions, (iv) detecting the presence of said immune complexes visually or mechanically (e.g. by means of densitometry, fluorimetry, colorimetry).
The present invention also relates to a kit for in vitro diagnosis of HCV antigen present in a biological sample, comprising: at least one monoclonal antibody as defined above, with said antibody being preferentially immobilized on a solid substrate, a buffer or components necessary for producing the buffer enabling binding reaction between these antibodies and the HCV antigens present in the biological sample, a means for detecting the immune complexes formed in the preceding binding reaction, possibly also including an automated scanning and interpretation device for inferring the HCV antigens present in the sample from the observed binding pattern.
The present invention also relates to a composition comprising E1 and/or E2 and/or E1/E2 recombinant HCV proteins purified according to the method of the present invention or a composition comprising at least one peptide as specified above for use as a medicament.
The present invention more particularly relates to a composition comprising at least one of the above-specified envelope peptides or a recombinant envelope protein composition as defined above, for use as a vaccine for immunizing a mammal, preferably humans, against HCV, comprising administering a sufficient amount of the composition, possibly accompanied by pharmaceutically acceptable adjuvant(s), to produce an immune response.
More particularly, the present invention relates to the use of any of the compositions as described here above for the preparation of a vaccine as described above.
Also, the present invention relates to a vaccine composition for immunizing a mammal, preferably humans, against HCV, comprising HCV single or specific oligomeric proteins or peptides derived from the E1 and/or the E2 region as described above.
Immunogenic compositions can be prepared according to methods known in the art. The present compositions comprise an immunogenic amount of recombinant E1 and/or E2 and/or E1/E2 single or specific oligomeric proteins as defined above or E1 or E2 peptides as defined above, usually combined with a pharmaceutically acceptable carrier, preferably further comprising an adjuvant.
The single or specific oligomeric envelope proteins of the present invention, either E1 and/or E2 and/or E1/E2, are expected to provide a particularly useful vaccine antigen, since the formation of antibodies to either E1 or E2 may be more desirable than to the other envelope protein, and since the E2 protein is cross-reactive between HCV types and the E1 protein is type-specific. Cocktails including type 1 E2 protein and E1 proteins derived from several genotypes may be particularly advantageous. Cocktails containing a molar excess of E1 versus E2 or E2 versus E1 may also be particularly useful.
Immunogenic compositions may be administered to animals to induce production of antibodies, either to provide a source of antibodies or to induce protective immunity in the animal.
Pharmaceutically acceptable carriers include any carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in the art.
Preferred adjuvants to enhance effectiveness of the composition include, but are not limited to: aluminium hydroxide (alum), N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP) as found in U.S. Patent No. 4,606,918, N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE) and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate, and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween emulsion. Any of the 3 components MPL, TDM or CWS may also be used alone or combined 2 by 2. Additionally, adjuvants such as Stimulon (Cambridge Bioscience, Worcester, MA) or SAF-1 (Syntex) may be used. Further, Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA) may be used for non-human applications and research purposes.
The immunogenic compositions typically will contain pharmaceutically acceptable vehicles, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, preservatives, and the like, may be included in such vehicles.
Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect. The E1 and E2 proteins may also be incorporated into Immune Stimulating Complexes (ISCOMS) together with saponins, for example Quil A.
Immunogenic compositions used as vaccines comprise a 'sufficient amount' or 'an immunologically effective amount' of the envelope proteins of the present invention, as well as any other of the above-mentioned components, as needed. 'Immunologically effective amount' means that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment, as defined above. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of the individual to be treated (e.g. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, the strain of infecting HCV, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials. Usually, the amount will vary from 0.01 to 1000 µg/dose, more particularly from 0.1 to 100 µg/dose.
The single or specific oligomeric envelope proteins may also serve as vaccine carriers to present homologous haptens (e.g. T cell epitopes or B cell epitopes from the core, NS2, NS3, NS4 or NS5 regions) or heterologous (non-HCV) haptens, in the same manner as Hepatitis B surface antigen (see European Patent Application 174,444). In this use, envelope proteins provide an immunogenic carrier capable of stimulating an immune response to haptens or antigens conjugated to the aggregate. The antigen may be conjugated either by conventional chemical methods, or may be cloned into the gene encoding E1 and/or E2 at a location corresponding to a hydrophilic region of the protein.
Such hydrophilic regions include the V1 region (encompassing amino acid positions 191 to 202), the V2 region (encompassing amino acid positions 213 to 223), the V3 region (encompassing amino acid positions 230 to 242), the V4 region (encompassing amino acid positions 230 to 242), the V5 region (encompassing amino acid positions 294 to 303) and the V6 region (encompassing amino acid positions 329 to 336). Another useful location for insertion of haptens is the hydrophobic region (encompassing approximately amino acid positions 264 to 293). It is shown in the present invention that this region can be deleted without affecting the reactivity of the deleted E1 protein with antisera. Therefore, haptens may be inserted at the site of the deletion.
The immunogenic compositions are conventionally administered parenterally, typically by injection, for example, subcutaneously or intramuscularly. Additional formulations suitable for other methods of administration include oral formulations and suppositories. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.
The present invention also relates to a composition comprising peptides or polypeptides as described above, for in vitro detection of HCV antibodies present in a biological sample.
The present invention also relates to the use of a composition as described above for the preparation of an immunoassay kit for detecting HCV antibodies present in a biological sample.
The present invention also relates to a method for in vitro diagnosis of HCV antibodies present in a biological sample, comprising at least the following steps: (i) contacting said biological sample with a composition comprising any of the envelope peptides or proteins as defined above, preferably in an immobilized form under appropriate conditions which allow the formation of an immune complex, wherein said peptide or protein can be a biotinylated peptide or protein which is covalently bound to a solid substrate by means of streptavidin or avidin complexes, (ii) removing unbound components, (iii) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions, (iv) detecting the presence of said immune complexes visually or mechanically (e.g. by means of densitometry, fluorimetry, colorimetry).
Alternatively, the present invention also relates to competition immunoassay formats in which recombinantly produced purified single or specific oligomeric E1 and/or E2 and/or E1/E2 proteins as disclosed above are used in combination with E1 and/or E2 peptides in order to compete for HCV antibodies present in a biological sample.
The present invention also relates to a kit for determining the presence of HCV antibodies in a biological sample, comprising: at least one peptide or protein composition as defined above, possibly in combination with other polypeptides or peptides from HCV or other types of HCV, with said peptides or proteins being preferentially immobilized on a solid substrate, more preferably on different microwells of the same ELISA plate, and even more preferentially on one and the same membrane strip, a buffer or components necessary for producing the buffer enabling binding reaction between these polypeptides or peptides and the antibodies against HCV present in the biological sample, means for detecting the immune complexes formed in the preceding binding reaction, possibly also including an automated scanning and interpretation device for inferring the HCV genotypes present in the sample from the observed binding pattern.
The immunoassay methods according to the present invention utilize single or specific oligomeric antigens from the E1 and/or E2 domains that maintain linear (in case of peptides) and conformational epitopes (single or specific oligomeric proteins) recognized by antibodies in the sera from individuals infected with HCV. It is within the scope of the invention to use for instance single or specific oligomeric antigens, dimeric antigens, as well as combinations of single or specific oligomeric antigens. The HCV E1 and E2 antigens of the present invention may be employed in virtually any assay format that employs a known antigen to detect antibodies. Of course, a format that denatures the HCV conformational epitope should be avoided or adapted. A common feature of all of these assays is that the antigen is contacted with the body component suspected of containing HCV antibodies under conditions that permit the antigen to bind to any such antibody present in the component. Such conditions will typically be physiologic temperature, pH and ionic strength using an excess of antigen. The incubation of the antigen with the specimen is followed by detection of immune complexes comprised of the antigen.
Design of the immunoassays is subject to a great deal of variation, and many formats are known in the art. Protocols may, for example, use solid supports, or immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels may be, for example, enzymatic, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify the signals from the immune complex are also known; examples of which are assays which utilize biotin and avidin or streptavidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays.
The immunoassay may be, without limitation, in a heterogeneous or in a homogeneous format, and of a standard or competitive type. In a heterogeneous format, the polypeptide is typically bound to a solid matrix or support to facilitate separation of the sample from the polypeptide after incubation. Examples of solid supports that can be used are nitrocellulose (e.g. in membrane or microtiter well form), polyvinyl chloride (e.g. in sheets or microtiter wells), polystyrene latex (e.g. in beads or microtiter plates), polyvinylidene fluoride (known as Immunolon™), diazotized paper, nylon membranes, activated beads, and Protein A beads. For example, Dynatech Immunolon™ 1 or Immunolon™ 2 microtiter plates or 0.25 inch polystyrene beads (Precision Plastic Ball) can be used in the heterogeneous format. The solid support containing the antigenic polypeptides is typically washed after separating it from the test sample, and prior to detection of bound antibodies. Both standard and competitive formats are known in the art.
In a homogeneous format, the test sample is incubated with the combination of antigens in solution. For example, it may be under conditions that will precipitate any antigen-antibody complexes which are formed. Both standard and competitive formats for these assays are known in the art.
In a standard format, the amount of HCV antibodies in the antibody-antigen complexes is directly monitored. This may be accomplished by determining whether labeled anti-xenogeneic (e.g. anti-human) antibodies which recognize an epitope on anti-HCV antibodies will bind due to complex formation. In a competitive format, the amount of HCV antibodies in the sample is deduced by monitoring the competitive effect on the binding of a known amount of labeled antibody (or other competing ligand) in the complex.
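In a competitive format the readout falls as the amount of sample antibody rises. One widely used way to express this competitive effect is percent inhibition of the labeled-antibody signal. The text does not prescribe any particular formula, so the Python sketch below is a general convention offered for illustration, with a hypothetical function name and parameters.

```python
def percent_inhibition(signal_with_sample: float,
                       signal_without_sample: float) -> float:
    """Percent reduction of labeled-antibody signal caused by the sample.

    Hypothetical helper, not taken from the text.
    signal_without_sample: signal of the labeled antibody alone (reference).
    signal_with_sample: signal when the test sample competes for the antigen.
    """
    if signal_without_sample <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (1.0 - signal_with_sample / signal_without_sample)
```

A sample that lowers the signal from 2.0 to 0.5 absorbance units gives 75% inhibition, while a sample that leaves the signal unchanged gives 0%.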
Complexes formed comprising anti-HCV antibody (or in the case of competitive assays, the amount of competing antibody) are detected by any of a number of known techniques, depending on the format. For example, unlabeled HCV antibodies in the complex may be detected using a conjugate of anti-xenogeneic Ig complexed with a label (e.g. an enzyme label).
In an immunoprecipitation or agglutination assay format the reaction between the HCV antigens and the antibody forms a network that precipitates from the solution or suspension and forms a visible layer or film of precipitate. If no anti-HCV antibody is present in the test specimen, no visible precipitate is formed.
There currently exist three specific types of particle agglutination (PA) assays.
These assays are used for the detection of antibodies to various antigens when coated to a support. One type of this assay is the hemagglutination assay using red blood cells (RBCs) that are sensitized by passively adsorbing antigen (or antibody) to the RBC. The addition of specific antigen antibodies present in the body component, if any, causes the RBCs coated with the purified antigen to agglutinate.
To eliminate potential non-specific reactions in the hemagglutination assay, two artificial carriers may be used instead of RBC in the PA. The most common of these are latex particles. However, gelatin particles may also be used. The assays utilizing either of these carriers are based on passive agglutination of the particles coated with purified antigens.
The HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 antigens of the present invention comprised of conformational epitopes will typically be packaged in the form of a kit for use in these immunoassays. The kit will normally contain in separate containers the native HCV antigen, control antibody formulations (positive and/or negative), labeled antibody when the assay format requires the same and signal generating reagents (e.g. enzyme substrate) if the label does not generate a signal directly. The native HCV antigen may be already bound to a solid matrix or separate with reagents for binding it to the matrix. Instructions (e.g. written, tape, CD-ROM, etc.) for carrying out the assay usually will be included in the kit.
Immunoassays that utilize the native HCV antigen are useful in screening blood for the preparation of a supply from which potentially infective HCV is lacking. The method for the preparation of the blood supply comprises the following steps: reacting a body component, preferably blood or a blood component, from the individual donating blood with HCV E1 and/or E2 proteins of the present invention to allow an immunological reaction between HCV antibodies, if any, and the HCV antigen; and detecting whether anti-HCV antibody-HCV antigen complexes are formed as a result of the reacting. Blood contributed to the blood supply is from donors that do not exhibit antibodies to the native HCV antigens, E1 or E2.
In cases of a positive reactivity to the HCV antigen, it is preferable to repeat the immunoassay to lessen the possibility of false positives. For example, in the large scale screening of blood for the production of blood products (e.g. blood transfusion, plasma, Factor VIII, immunoglobulin, etc.) 'screening' tests are typically formatted to increase sensitivity (to ensure no contaminated blood passes) at the expense of specificity; i.e. the false-positive rate is increased. Thus, it is typical to only defer for further testing those donors who are 'repeatedly reactive'; i.e. positive in two or more runs of the immunoassay on the donated sample. However, for confirmation of HCV-positivity, the 'confirmation' tests are typically formatted to increase specificity (to ensure that no false-positive samples are confirmed) at the expense of sensitivity. Therefore the purification method described in the present invention for E1 and E2 will be very advantageous for including single or specific oligomeric envelope proteins into HCV diagnostic assays.
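The trade-off between sensitivity and specificity, and the 'repeatedly reactive' rule, can be made concrete with the standard definitions. The Python sketch below is illustrative only; the function names are hypothetical, and the default of two positive runs simply mirrors the "two or more runs" wording above.

```python
from typing import Sequence


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive samples that the assay reports as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative samples that the assay reports as negative."""
    return true_neg / (true_neg + false_pos)


def is_repeatedly_reactive(run_results: Sequence[bool],
                           min_positive_runs: int = 2) -> bool:
    """True if the sample was positive in at least `min_positive_runs` runs."""
    return sum(run_results) >= min_positive_runs
```

A screening test tuned for sensitivity accepts a lower value from `specificity`, which is why a single reactive run is not treated as conclusive and the sample is retested.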
The solid phase selected can include polymeric or glass beads, nitrocellulose, microparticles, microwells of a reaction tray, test tubes and magnetic beads. The signal generating compound can include an enzyme, a luminescent compound, a chromogen, a radioactive element and a chemiluminescent compound. Examples of enzymes include alkaline phosphatase, horseradish peroxidase and beta-galactosidase. Examples of enhancer compounds include biotin, anti-biotin and avidin. Examples of enhancer compounds binding members include biotin, anti-biotin and avidin. In order to block the effects of rheumatoid factor-like substances, the test sample is subjected to conditions sufficient to block the effect of rheumatoid factor-like substances. These conditions comprise contacting the test sample with a quantity of anti-human IgG to form a mixture, and incubating the mixture for a time and under conditions sufficient to form a reaction mixture product substantially free of rheumatoid factor-like substance.
The present invention further contemplates the use of E1 proteins, or parts thereof, more particularly HCV single or specific oligomeric E1 proteins as defined above, for in vitro monitoring of HCV disease or prognosing the response to treatment (for instance with interferon) of patients suffering from HCV infection, comprising: incubating a biological sample from a patient with hepatitis C infection with an E1 protein or a suitable part thereof under conditions allowing the formation of an immunological complex, removing unbound components, calculating the anti-E1 titers present in said sample (for example at the start of and/or during the course of (interferon) therapy), monitoring the natural course of HCV disease, or prognosing the response to treatment of said patient on the basis of the anti-E1 titers found in said sample at the start of treatment and/or during the course of treatment.
Patients who show a decrease of 2, 3, 4, 5, 7, 10, 15, or preferably more than times of the initial anti-E1 titers could be concluded to be long-term, sustained responders to HCV therapy, more particularly to interferon therapy. It is illustrated in the Examples section that an anti-E1 assay may be very useful for prognosing long-term response to IFN treatment, or to treatment of Hepatitis C virus disease in general.
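The fold decrease referred to above is the ratio of the initial titer to a later titer. The following Python sketch shows that arithmetic. It is an illustration under stated assumptions: the function names are hypothetical, titers are assumed to be positive numbers on the same scale, and the threshold is left as a required argument because the choice of cut-off is a matter for the assay designer.

```python
def fold_decrease(initial_titer: float, later_titer: float) -> float:
    """Return how many times lower the later titer is than the initial titer."""
    if initial_titer <= 0 or later_titer <= 0:
        raise ValueError("titers must be positive")
    return initial_titer / later_titer


def meets_fold_threshold(initial_titer: float, later_titer: float,
                         threshold: float) -> bool:
    """True if the titer has dropped by at least `threshold`-fold.

    Hypothetical helper: the caller supplies the threshold explicitly.
    """
    return fold_decrease(initial_titer, later_titer) >= threshold
```

For example, a titer falling from 1600 to 100 is a 16-fold decrease, which satisfies a 15-fold threshold, whereas a fall from 1600 to 800 is only 2-fold and does not satisfy a 3-fold threshold.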
More particularly the following E1 peptides as listed in Table 3 were found to be useful for in vitro monitoring of HCV disease or prognosing the response to interferon treatment of patients suffering from HCV infection:
E1-31 (SEQ ID NO 56) spanning amino acids 181 to 200 of the Core/E1 V1 region,
E1-33 (SEQ ID NO 57) spanning amino acids 193 to 212 of the E1 region,
E1-35 (SEQ ID NO 58) spanning amino acids 205 to 224 of the E1 V2 region (epitope B),
E1-35A (SEQ ID NO 59) spanning amino acids 208 to 227 of the E1 V2 region (epitope B),
1bE1 (SEQ ID NO 53) spanning amino acids 192 to 228 of E1 regions (V1, C1, and V2 regions (containing epitope )),
E1-51 (SEQ ID NO 66) spanning amino acids 301 to 320 of the E1 region,
E1-53 (SEQ ID NO 67) spanning amino acids 313 to 332 of the E1 C4 region (epitope A),
(SEQ ID NO 68) spanning amino acids 325 to 344 of the E1 region.
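The peptide coordinates listed above can be checked with simple interval arithmetic. The sketch below is illustrative only: the dictionary keys follow the peptide names in the list (the last entry is keyed by its SEQ ID NO because its name is not given here), and the helper names are hypothetical.

```python
# Inclusive amino acid coordinates as listed in the text.
PEPTIDES = {
    "E1-31": (181, 200),
    "E1-33": (193, 212),
    "E1-35": (205, 224),
    "E1-35A": (208, 227),
    "1bE1": (192, 228),
    "E1-51": (301, 320),
    "E1-53": (313, 332),
    "SEQ ID NO 68": (325, 344),
}


def length(span: tuple[int, int]) -> int:
    """Number of residues in an inclusive (start, end) span."""
    return span[1] - span[0] + 1


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of residues shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name, span in PEPTIDES.items():
    print(f"{name}: {span[0]}-{span[1]}, {length(span)} residues")
```

With these coordinates every peptide except 1bE1 is a 20-mer, and neighbouring 20-mers such as E1-31 and E1-33 share 8 residues.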
It is to be understood that smaller fragments of the above-mentioned peptides also fall within the scope of the present invention. Said smaller fragments can be easily prepared by chemical synthesis and can be tested for their ability to be used in an assay as detailed above and in the Examples section.
The present invention also relates to a kit for monitoring HCV disease or prognosing the response to treatment (for instance to interferon) of patients suffering from HCV infection comprising: at least one E1 protein or E1 peptide, more particularly an E1 protein or E1 peptide as defined above, a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins or peptides and the anti-E1 antibodies present in a biological sample, means for detecting the immune complexes formed in the preceding binding reaction, possibly also an automated scanning and interpretation device for inferring a decrease of anti-E1 titers during the progression of treatment.
It is to be understood that E2 proteins and peptides according to the present invention can also be used to a certain degree to monitor/prognose HCV treatment as indicated above for the E1 proteins or peptides, because the anti-E2 levels also decrease in comparison to antibodies to the other HCV antigens. It is to be understood, however, that it might be possible to determine certain epitopes in the E2 region which would also be suited for use in a test for monitoring/prognosing HCV disease.
The present invention also relates to a serotyping assay for detecting one or more serological types of HCV present in a biological sample, more particularly for detecting antibodies of the different types of HCV to be detected combined in one assay format, comprising at least the following steps: (i) contacting the biological sample to be analyzed for the presence of HCV antibodies of one or more serological types, with at least one of the E1 and/or E2 and/or E1/E2 protein compositions or at least one of the E1 or E2 peptide compositions as defined above, preferentially in an immobilized form under appropriate conditions which allow the formation of an immune complex, (ii) removing unbound components, (iii) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions, (iv) detecting the presence of said immune complexes visually or mechanically (by means of densitometry, fluorimetry, colorimetry) and inferring the presence of one or more HCV serological types present from the observed binding pattern.
It is to be understood that the compositions of proteins or peptides used in this method are recombinantly expressed type-specific envelope proteins or type-specific peptides.
The present invention further relates to a kit for serotyping one or more serological types of HCV present in a biological sample, more particularly for detecting the antibodies to these serological types of HCV comprising: at least one E1 and/or E2 and/or E1/E2 protein or E1 or E2 peptide, as defined above, a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins or peptides and the anti-E1 antibodies present in a biological sample, means for detecting the immune complexes formed in the preceding binding reaction, possibly also an automated scanning and interpretation device for detecting the presence of one or more serological types present from the observed binding pattern.
The present invention also relates to the use of a peptide or protein composition as defined above, for immobilization on a solid substrate and incorporation into a reversed phase hybridization assay, preferably for immobilization as parallel lines onto a solid support such as a membrane strip, for determining the presence or the genotype of HCV according to a method as defined above. Combination with other type-specific antigens from other HCV polyprotein regions also lies within the scope of the present invention.
Figure and Table legends

Figure 1: Restriction map of plasmid pgptATA18
Figure 2: Restriction map of plasmid pgsATA18
Figure 3: Restriction map of plasmid pMS66
Figure 4: Restriction map of plasmid pvHCV-11A
Figure 5: Anti-E1 levels in non-responders to IFN treatment
Figure 6: Anti-E1 levels in responders to IFN treatment
Figure 7: Anti-E1 levels in patients with complete response to IFN treatment
Figure 8: Anti-E1 levels in incomplete responders to IFN treatment
Figure 9: Anti-E2 levels in non-responders to IFN treatment
Figure 10: Anti-E2 levels in responders to IFN treatment
Figure 11: Anti-E2 levels in incomplete responders to IFN treatment
Figure 12: Anti-E2 levels in complete responders to IFN treatment
Figure 13: Human anti-E1 reactivity competed with peptides
Figure 14: Competition of reactivity of anti-E1 monoclonal antibodies with peptides
Figure 15: Anti-E1 (epitope 1) levels in non-responders to IFN treatment
Figure 16: Anti-E1 (epitope 1) levels in responders to IFN treatment
Figure 17: Anti-E1 (epitope 2) levels in non-responders to IFN treatment
Figure 18: Anti-E1 (epitope 2) levels in responders to IFN treatment
Figure 19: Competition of reactivity of anti-E2 monoclonal antibodies with peptides
Figure 20: Human anti-E2 reactivity competed with peptides
Figure 21: Nucleic acid sequences of the present invention. The nucleic acid sequences encoding an E1 or E2 protein according to the present invention may be translated (SEQ ID NO 3 to 13, 21-31, 35 and 41-49 are translated in a reading frame starting from residue number 1, SEQ ID NO 37-39 are translated in a reading frame starting from residue number ) into the amino acid sequences of the respective E1 or E2 proteins as shown in the sequence listing.
Figure 22: ELISA results obtained from lentil lectin chromatography eluate fractions of 4 different E1 purifications of cell lysates infected with vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a).
Figure 23: Elution profiles obtained from the lentil lectin chromatography of the 4 different E1 constructs on the basis of the values as shown in Figure 22.
Figure 24: ELISA results obtained from fractions obtained after gelfiltration chromatography of 4 different E1 purifications of cell lysates infected with vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a).
Figure 25: Profiles obtained from purifications of E1 proteins of type 1b, type 3a, and type 5a (from RK13 cells infected with vvHCV39, vvHCV62, and vvHCV63, respectively; purified on lentil lectin and reduced as in examples 5.2 and 5.3) and a standard. The peaks indicated with and represent pure E1 protein peaks (see Figure 24, E1 reactivity mainly in fractions 26 to 30).
Figure 26: Silver staining of an SDS-PAGE as described in example 4 of a raw lysate of E1 vvHCV40 (type 1b) (lane 1), pool 1 of the gelfiltration of vvHCV40 representing fractions 10 to 17 as shown in Figure 25 (lane 2), pool 2 of the gelfiltration of vvHCV40 representing fractions 18 to 25 as shown in Figure 25 (lane 3), and E1 pool (fractions 26 to 30) (lane 4).
Figure 27: Streptavidin-alkaline phosphatase blot of the fractions of the gelfiltration of E1 constructs 39 (type 1b) and 62 (type 3a). The proteins were labelled with NEM-biotin. Lane 1: start gelfiltration construct 39, lane 2: fraction 26 construct 39, lane 3: fraction 27 construct 39, lane 4: fraction 28 construct 39, lane 5: fraction 29 construct 39, lane 6: fraction 30 construct 39, lane 7: fraction 31 construct 39, lane 8: molecular weight marker, lane 9: start gelfiltration construct 62, lane 10: fraction 26 construct 62, lane 11: fraction 27 construct 62, lane 12: fraction 28 construct 62, lane 13: fraction 29 construct 62, lane 14: fraction 30 construct 62, lane 15: fraction 31 construct 62.
Figure 28: Silver staining of an SDS-PAGE gel of the gelfiltration fractions of vvHCV-39 (E1s, type 1b) and vvHCV-62 (E1s, type 3a) run under identical conditions as Figure 26. Lane 1: start gelfiltration construct 39, lane 2: fraction 26 construct 39, lane 3: fraction 27 construct 39, lane 4: fraction 28 construct 39, lane 5: fraction 29 construct 39, lane 6: fraction 30 construct 39, lane 7: fraction 31 construct 39, lane 8: molecular weight marker, lane 9: start gelfiltration construct 62, lane 10: fraction 26 construct 62, lane 11: fraction 27 construct 62, lane 12: fraction 28 construct 62, lane 13: fraction 29 construct 62, lane 14: fraction 30 construct 62, lane 15: fraction 31 construct 62.
Figure 29: Western blot analysis with anti-E1 mouse monoclonal antibody 5E1A10 giving a complete overview of the purification procedure. Lane 1: crude lysate, lane 2: flow-through of lentil chromatography, lane 3: wash with Empigen BB after lentil chromatography, lane 4: eluate of lentil chromatography, lane 5: flow-through during concentration of the lentil eluate, lane 6: pool of E1 after size exclusion chromatography (gelfiltration).
Figure 30: OD280 profile (continuous line) of the lentil lectin chromatography of E2 protein from RK13 cells infected with vvHCV44. The dotted line represents the E2 reactivity as detected by ELISA (as in example 6).
Figure 31A: OD280 profile (continuous line) of the lentil-lectin gelfiltration chromatography E2 protein pool from RK13 cells infected with vvHCV44 in which the E2 pool is applied immediately on the gelfiltration column (non-reduced conditions). The dotted line represents the E2 reactivity as detected by ELISA (as in example 6).
Figure 31B: OD280 profile (continuous line) of the lentil-lectin gelfiltration chromatography E2 protein pool from RK13 cells infected with vvHCV44 in which the E2 pool was reduced and blocked according to Example 5.3 (reduced conditions). The dotted line represents the E2 reactivity as detected by ELISA (as in example 6).
Figure 32: Ni2+-IMAC chromatography and ELISA reactivity of the E2 protein as expressed from vvHCV44 after gelfiltration under reducing conditions as shown in Figure 31B.
Figure 33: Silver staining of an SDS-PAGE of 0.5 µg of purified E2 protein recovered by a 200 mM imidazole elution step (lane 2) and a 30 mM imidazole wash (lane 1) of the Ni2+-IMAC chromatography as shown in Figure 32.
Figure 34: OD profiles of a desalting step of the purified E2 protein recovered by 200 mM imidazole as shown in Figure 33, intended to remove imidazole.
Figure 35A: Antibody levels to the different HCV antigens (Core 1, Core 2, E2HCVR, NS3) for NR and LTR followed during treatment and over a period of 6 to 12 months after treatment, determined by means of the LIAscan method. The average values are indicated by the curves with the open squares.
Figure 35B: Antibody levels to the different HCV antigens (NS4, NS5, E1 and E2) for NR and LTR followed during treatment and over a period of 6 to 12 months after treatment, determined by means of the LIAscan method. The average values are indicated by the curves with the open squares.
Figure 36: Average E1 antibody (E1Ab) and E2 antibody (E2Ab) levels in the LTR and NR groups.
Figure 37: Average E1 antibody (E1Ab) levels for non-responders (NR) and long-term responders (LTR) for type 1b and type 3a.
Figure 38: Relative map positions of the anti-E2 monoclonal antibodies.
Figure 39: Partial deglycosylation of HCV E1 envelope protein. The lysate of vvHCV10A-infected RK13 cells was incubated with different concentrations of glycosidases according to the manufacturer's instructions. Right panel: Glycopeptidase F (PNGase F). Left panel: Endoglycosidase H (Endo H).
Figure 40: Partial deglycosylation of HCV E2 envelope proteins. The lysates of vvHCV64-infected (E2) and vvHCV41-infected (E2s) RK13 cells were incubated with different concentrations of Glycopeptidase F (PNGase F) according to the manufacturer's instructions.
Figure 41: In vitro mutagenesis of HCV E1 glycoproteins. Map of the mutated sequences and the creation of new restriction sites.
Figure 42A: In vitro mutagenesis of HCV E1 glycoprotein (part ). First step of PCR amplification.
Figure 42B: In vitro mutagenesis of HCV E1 glycoprotein (part ). Overlap extension and nested PCR.
Figure 43: In vitro mutagenesis of HCV E1 glycoproteins. Map of the PCR mutated fragments (GLY-# and OVR-#) synthesized during the first step of amplification.
Figure 44A: Analysis of E1 glycoprotein mutants by Western blot expressed in HeLa (left) and RK13 (right) cells. Lane 1: wild type VV (vaccinia virus), lane 2: original E1 protein (vvHCV-10A), lane 3: E1 mutant Gly-1 (vvHCV-81), lane 4: E1 mutant Gly-2 (vvHCV-82), lane 5: E1 mutant Gly-3 (vvHCV-83), lane 6: E1 mutant Gly-4 (vvHCV-84), lane 7: E1 mutant Gly-5 (vvHCV-85), lane 8: E1 mutant Gly-6 (vvHCV-86).
Figure 44B: Analysis of E1 glycosylation mutant vaccinia viruses by PCR amplification/restriction. Lane 1: E1 (vvHCV-10A), BspE I, lane 2: E1.GLY-1 (vvHCV-81), BspE I, lane 4: E1 (vvHCV-10A), Sac I, lane 5: E1.GLY-2 (vvHCV-82), Sac I, lane 7: E1 (vvHCV-10A), Sac I, lane 8: E1.GLY-3 (vvHCV-83), Sac I, lane 10: E1 (vvHCV-10A), Stu I, lane 11: E1.GLY-4 (vvHCV-84), Stu I, lane 13: E1 (vvHCV-10A), Sma I, lane 14: E1.GLY-5 (vvHCV-85), Sma I, lane 16: E1 (vvHCV-10A), Stu I, lane 17: E1.GLY-6 (vvHCV-86), Stu I, lanes 3, 6, 9, 12, 15: Low Molecular Weight Marker, pBluescript SK+, Msp I.
Figure 45: SDS polyacrylamide gel electrophoresis of recombinant E2 expressed in S. cerevisiae. Inocula were grown in leucine selective medium for 72 hrs and diluted 1/15 in complete medium. After 10 days of culture at 28°C, medium samples were taken. The equivalent of 200 µl of culture supernatant concentrated by speedvac was loaded on the gel. Two independent transformants were analysed.
Figure 46: SDS polyacrylamide gel electrophoresis of recombinant E2 expressed in a glycosylation deficient S. cerevisiae mutant. Inocula were grown in leucine selective medium for 72 hrs and diluted 1/15 in complete medium. After 10 days of culture at 28°C, medium samples were taken. The equivalent of 350 µl of culture supernatant, concentrated by ion exchange chromatography, was loaded on the gel.
Table 1: Features of the respective clones and primers used for amplification for constructing the different forms of the E1 protein as depicted in Example 1.
Table 2: Summary of anti-E1 tests
Table 3: Synthetic peptides for competition studies
Table 4: Changes of envelope antibody levels over time
Table 5: Difference between LTR and NR
Table 6: Competition experiments between murine E2 monoclonal antibodies
Table 7: Primers for construction of E1 glycosylation mutants
Table 8: Analysis of E1 glycosylation mutants by ELISA

Example 1: Cloning and expression of the hepatitis C virus E1 protein

1. Construction of vaccinia virus recombination vectors

The pgptATA18 vaccinia recombination plasmid is a modified version of pATA18 (Stunnenberg et al., 1988) with an additional insertion containing the E. coli xanthine guanine phosphoribosyl transferase gene under the control of the vaccinia virus I3 intermediate promoter (Figure 1). The plasmid pgsATA18 was constructed by inserting an oligonucleotide linker with SEQ ID NO 1/94, containing stop codons in the three reading frames, into the Pst I and HindIII-cut pATA18 vector. This created an extra Pac I restriction site (Figure 2). The original HindIII site was not restored.
Oligonucleotide linker with SEQ ID NO 1/94:
5'     G GCATGC AAGCTT AATTAATT      3'
3' ACGTC CGTACG TTCGAA TTAATTAA TCGA
   PstI  SphI   HindIII Pac I   (HindIII)

In order to facilitate rapid and efficient purification by means of Ni2+ chelation of engineered histidine stretches fused to the recombinant proteins, the vaccinia recombination vector pMS66 was designed to express secreted proteins with an additional carboxy-terminal histidine tag. An oligonucleotide linker with SEQ ID NO 2/95, containing unique sites for 3 restriction enzymes generating blunt ends (Sma I, Stu I and Pml I/Bbr PI), was synthesized in such a way that the carboxy-terminal end of any cDNA could be inserted in frame with a sequence encoding the protease factor Xa cleavage site followed by a nucleotide sequence encoding 6 histidines and 2 stop codons (a new Pac I restriction site was also created downstream of the 3' end). This oligonucleotide with SEQ ID NO 2/95 was introduced between the Xma I and Pst I sites of pgptATA18 (Figure 3).
Oligonucleotide linker with SEQ ID NO 2/95:
CCGGG GAGGCCTGCACGTGATCGAGGGCAGACACCATCACCAC CACTAATAGTTAATTAA CTGCA 3'
3' C CTCCGGACGTGCACTAGCTCCCGTCTGTGGTAGTGGTGGTAGTGATTATCAATTAATT G
XmaI PstI

Example 2. Construction of HCV recombinant plasmids

2.1. Constructs encoding different forms of the E1 protein

Polymerase Chain Reaction (PCR) products were derived from the serum samples by RNA preparation and subsequent reverse-transcription and PCR as described previously (Stuyver et al., 1993b). Table 1 shows the features of the respective clones and the primers used for amplification. The PCR fragments were cloned into the Sma I-cut pSP72 (Promega) plasmids. The following clones were selected for insertion into vaccinia recombination vectors: HCCI9A (SEQ ID NO ), HCCI10A (SEQ ID NO ), HCCI11A (SEQ ID NO ), HCCI12A (SEQ ID NO ), HCCI13A (SEQ ID NO 11), and HCCI17A (SEQ ID NO 13) as depicted in Figure 21. cDNA fragments containing the E1-coding regions were cleaved by EcoRI and HindIII restriction from the respective pSP72 plasmids and inserted into the EcoRI/HindIII-cut pgptATA-18 vaccinia recombination vector (described in example 1), downstream of the 11K vaccinia virus late promoter.
The respective plasmids were designated pvHCV-9A, pvHCV-10A, pvHCV-11A, pvHCV-12A, pvHCV-13A and pvHCV-17A, of which pvHCV-11A is shown in Figure 4.
2.2. Hydrophobic region E1 deletion mutants

Clone HCCI37, containing a deletion of codons Asp264 to Val287 (nucleotides 790 to 861, region encoding hydrophobic domain I) was generated as follows: 2 PCR fragments were generated from clone HCCI10A with primer sets HCPr52 (SEQ ID NO 16)/HCPr107 (SEQ ID NO 19) and HCPr108 (SEQ ID NO 20)/HCPr54 (SEQ ID NO 18). These primers are shown in Figure 21. The two PCR fragments were purified from agarose gel after electrophoresis and 1 ng of each fragment was used together as template for PCR by means of primers HCPr52 (SEQ ID NO 16) and HCPr54 (SEQ ID NO 18). The resulting fragment was cloned into the Sma I-cut pSP72 vector and clones containing the deletion were readily identified because of the deletion of 24 codons (72 base pairs). Plasmid pSP72HCCI37 containing clone HCCI37 (SEQ ID NO 15) was selected.
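The codon and nucleotide coordinates quoted for the HCCI37 deletion are mutually consistent if codon 1 starts at nucleotide 1 of the coding sequence. The sketch below checks that arithmetic; the numbering convention is an assumption made for illustration, and the function names are hypothetical.

```python
def codon_to_nucleotides(codon: int) -> tuple[int, int]:
    """Inclusive nucleotide range of a codon, assuming codon 1 = nucleotides 1-3."""
    start = 3 * (codon - 1) + 1
    return start, start + 2


def deletion_summary(first_codon: int, last_codon: int) -> dict[str, int]:
    """Codon count, nucleotide boundaries and length of an in-frame deletion."""
    nt_start = codon_to_nucleotides(first_codon)[0]
    nt_end = codon_to_nucleotides(last_codon)[1]
    return {
        "codons": last_codon - first_codon + 1,
        "nt_start": nt_start,
        "nt_end": nt_end,
        "base_pairs": nt_end - nt_start + 1,
    }


# Deletion of codons Asp264 to Val287 as described for clone HCCI37.
print(deletion_summary(264, 287))
```

Under this convention the deletion spans nucleotides 790 to 861, that is 24 codons or 72 base pairs, matching the figures given in the text.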
A recombinant vaccinia plasmid containing the full-length E1 cDNA lacking hydrophobic domain I was constructed by inserting the HCV sequence surrounding the deletion (fragment cleaved by Xma I and BamH I from the vector pSP72-HCCI37) into the Xma I-BamH I sites of the vaccinia plasmid pvHCV-1 A. The resulting plasmid was named pvHCV-37. After confirmatory sequencing, the amino-terminal region containing the internal deletion was isolated from this vector pvHCV-37 (cleavage by EcoR I and BstE II) and reinserted into the EcoR I and BstE II-cut pvHCV-11A plasmid. This construct was expected to express an E1 protein with both hydrophobic domains deleted and was named pvHCV-38. The E1-coding region of clone HCCI38 is represented by SEQ ID NO 23.
As the hydrophilic region at the E1 carboxy-terminus (theoretically extending to around amino acids 337-340) was not completely included in construct pvHCV-38, a larger E1 region lacking hydrophobic domain I was isolated from the pvHCV-37 plasmid by EcoR I/BamH I cleavage and cloned into an EcoR I/BamH I-cut pgsATA-18 vector. The resulting plasmid was named pvHCV-39 and contained clone HCCI39 (SEQ ID NO ). The same fragment was cleaved from the pvHCV-37 vector by BamH I (of which the sticky ends were filled with Klenow DNA Polymerase I (Boehringer)) and subsequently by EcoR I (cohesive end). This sequence was inserted into the EcoR I and Bbr PI-cut vector pMS-66. This resulted in clone HCCI40 (SEQ ID NO 27) in plasmid containing a 6 histidine tail at its carboxy-terminal end.
2.3. E1 of other genotypes

Clone HCCI62 (SEQ ID NO 29) was derived from a type 3a-infected patient with chronic hepatitis C (serum BR36, clone BR36-9-13, SEQ ID NO 19 in WO 94/25601, and see also Stuyver et al., 1993a) and HCCI63 (SEQ ID NO 31) was derived from a type 5a-infected child with post-transfusion hepatitis (serum BE95, clone PC-4-1, SEQ ID NO 45 in WO 94/25601).
2.4. E2 constructs

The HCV E2 PCR fragment 22 was obtained from serum BE11 (genotype 1b) by means of primers HCPr109 (SEQ ID NO 33) and HCPr72 (SEQ ID NO 34) using techniques of RNA preparation, reverse-transcription and PCR, as described in Stuyver et al., 1993, and the fragment was cloned into the Sma I-cut pSP72 vector. Clone HCCI22A (SEQ ID NO 35) was cut with NcoI/AlwNI or by BamHI/AlwNI and the sticky ends of the fragments were blunted (NcoI and BamHI sites with Klenow DNA Polymerase I (Boehringer), and AlwNI with T4 DNA polymerase (Boehringer)). The BamHI/AlwNI cDNA fragment was then inserted into the vaccinia pgsATA-18 vector that had been linearized by EcoR I and Hind III cleavage and of which the cohesive ends had been filled with Klenow DNA Polymerase (Boehringer). The resulting plasmid was named pvHCV-41 and encoded the E2 region from amino acids Met347 to Gln673, including 37 amino acids (from Met347 to Gly383) of the E1 protein that can serve as signal sequence. The same HCV cDNA was inserted into the EcoR I and Bbr PI-cut vector pMS66, that had subsequently been blunt ended with Klenow DNA Polymerase.
The resulting plasmid was named pvHCV-42 and also encoded amino acids 347 to 683.
The NcoI/AlwNI fragment was inserted in a similar way into the same sites of pgsATA-18 (pvHCV-43) or pMS-66 vaccinia vectors (pvHCV-44). pvHCV-43 and pvHCV-44 encoded amino acids 364 to 673 of the HCV polyprotein, of which amino acids 364 to 383 were derived from the natural carboxy-terminal region of the E1 protein encoding the signal sequence for E2, and amino acids 384 to 673 of the mature E2 protein.
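The residue counts quoted for the E2 constructs follow from the polyprotein coordinates, with mature E2 starting at amino acid 384. The sketch below reproduces that bookkeeping for pvHCV-41, pvHCV-43 and pvHCV-44; the data structure and function name are hypothetical and serve only as an illustration.

```python
E2_MATURE_START = 384  # first residue of mature E2, as stated in the text

# Inclusive polyprotein coordinates (start, end) of each construct.
CONSTRUCTS = {
    "pvHCV-41": (347, 673),
    "pvHCV-43": (364, 673),
    "pvHCV-44": (364, 673),
}


def signal_and_mature_lengths(start: int, end: int,
                              mature_start: int = E2_MATURE_START) -> tuple[int, int]:
    """Return (E1-derived signal residues, mature E2 residues) for a construct."""
    signal_len = max(0, mature_start - start)
    mature_len = max(0, end - max(start, mature_start) + 1)
    return signal_len, mature_len


for name, (start, end) in CONSTRUCTS.items():
    signal_len, mature_len = signal_and_mature_lengths(start, end)
    print(f"{name}: {signal_len} signal residues, {mature_len} mature E2 residues")
```

This gives 37 E1-derived residues for pvHCV-41 (Met347 to Gly383) and 20 for pvHCV-43 and pvHCV-44 (364 to 383), in each case followed by 290 residues of mature E2 (384 to 673).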
2.5. Generation of recombinant HCV-vaccinia viruses

Rabbit kidney RK13 cells (ATCC CCL 37), human osteosarcoma 143B thymidine kinase deficient (TK-) (ATCC CRL 8303), HeLa (ATCC CCL ), and Hep G2 (ATCC HB 8065) cell lines were obtained from the American Type Culture Collection (ATCC, Rockville, Md, USA). The cells were grown in Dulbecco's modified Eagle medium (DMEM) supplemented with 10% foetal calf serum, and with Earle's salts (EMEM) for RK13 and 143B and with glucose (4 g/l) for Hep G2. The vaccinia virus WR strain (Western Reserve, ATCC VR119) was routinely propagated in either 143B or RK13 cells, as described previously (Panicali & Paoletti, 1982; Piccini et al., 1987; Mackett et al., 1982, 1984, and 1986). A confluent monolayer of 143B cells was infected with wild type vaccinia virus at a multiplicity of infection (m.o.i.) of 0.1 (0.1 plaque forming unit (PFU) per cell). Two hours later, the vaccinia recombination plasmid was transfected into the infected cells in the form of a calcium phosphate coprecipitate containing 500 ng of the plasmid DNA to allow homologous recombination (Graham & van der Eb, 1973; Mackett et al., 1985). Recombinant viruses expressing the Escherichia coli xanthine-guanine phosphoribosyl transferase (gpt) protein were selected on rabbit kidney RK13 cells incubated in selection medium (EMEM containing 25 µg/ml mycophenolic acid (MPA), 250 µg/ml xanthine, and 15 µg/ml hypoxanthine; Falkner and Moss, 1988; Janknecht et al., 1991). Single recombinant viruses were purified on fresh monolayers of RK13 cells under a 0.9% agarose overlay in selection medium. Thymidine kinase deficient recombinant viruses were selected and then plaque purified on fresh monolayers of human 143B cells in the presence of 25 µg/ml 5-bromo-2'-deoxyuridine. Stocks of purified recombinant HCV-vaccinia viruses were prepared by infecting either human 143B or rabbit RK13 cells at an m.o.i. of 0.05 (Mackett et al., 1988).
The insertion of the HCV cDNA fragment in the recombinant vaccinia viruses was confirmed on an aliquot (50 µl) of the cell lysate after the MPA selection by means of PCR with the primers used to clone the respective HCV fragments (see Table 1). The recombinant vaccinia-HCV viruses were named according to the vaccinia recombination plasmid number, e.g. the recombinant vaccinia virus vvHCV-10A was derived from recombining the wild type WR strain with the pvHCV-10A plasmid.
Example 3: Infection of cells with recombinant vaccinia viruses

A confluent monolayer of RK13 cells was infected at an m.o.i. of 3 with the recombinant HCV-vaccinia viruses as described in example 2. For infection, the cell monolayer was washed twice with phosphate-buffered saline pH 7.4 (PBS) and the recombinant vaccinia virus stock was diluted in MEM medium. Two hundred µl of the virus solution was added per 10^6 cells such that the m.o.i. was 3, and incubated for min at 24°C. The virus solution was aspirated and 2 ml of complete growth medium (see example 2) was added per 10^6 cells. The cells were incubated for 24 hr at 37°C, during which expression of the HCV proteins took place.
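The infection conditions above fix the virus titer that the diluted stock must have: an m.o.i. of 3 on 10^6 cells in a 200 µl inoculum. The following is a minimal sketch of that calculation; the function names and the example stock titer are hypothetical and not taken from the text.

```python
def required_titer_pfu_per_ml(moi: float, n_cells: float, inoculum_ul: float) -> float:
    """Titer (PFU/ml) the inoculum needs so that `inoculum_ul` delivers `moi` PFU per cell."""
    pfu_needed = moi * n_cells
    return pfu_needed / inoculum_ul * 1000.0  # convert PFU/µl to PFU/ml


def dilution_factor(stock_titer_pfu_per_ml: float, target_titer_pfu_per_ml: float) -> float:
    """Fold dilution of the stock needed to reach the target titer."""
    if stock_titer_pfu_per_ml < target_titer_pfu_per_ml:
        raise ValueError("stock is too dilute for the requested m.o.i.")
    return stock_titer_pfu_per_ml / target_titer_pfu_per_ml


target = required_titer_pfu_per_ml(moi=3, n_cells=1e6, inoculum_ul=200)
# 1.5e9 PFU/ml is an illustrative stock titer only (assumption).
print(target, dilution_factor(1.5e9, target))
```

Under these conditions the diluted virus solution needs to contain about 1.5 × 10^7 PFU/ml.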
Example 4: Analysis of recombinant proteins by means of western blotting

The infected cells were washed two times with PBS, directly lysed with lysis buffer (50 mM Tris.HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 5 mM MgCl2, 1 µg/ml aprotinin (Sigma, Bornem, Belgium)) or detached from the flasks by incubation in 50 mM Tris.HCl pH 7.5/10 mM EDTA/150 mM NaCl for 5 min, and collected by centrifugation (5 min at 1000 g). The cell pellet was then resuspended in 200 µl lysis buffer (50 mM Tris.HCl pH 8.0, 2 mM EDTA, 150 mM NaCl, 5 mM MgCl2, aprotinin, 1% Triton X-100) per 10^6 cells. The cell lysates were cleared for 5 min at 14,000 rpm in an Eppendorf centrifuge to remove the insoluble debris. Proteins of 20 µl lysate were separated by means of sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE). The proteins were then electro-transferred from the gel to a nitrocellulose sheet (Amersham) using a Hoefer HSI transfer unit cooled to 4°C for 2 hr at 100 V constant voltage, in transfer buffer (25 mM Tris.HCl pH 8.0, 192 mM glycine, methanol). Nitrocellulose filters were blocked with Blotto (5% fat-free instant milk powder in PBS; Johnson et al., 1981) and incubated with primary antibodies diluted in Blotto/0.1% Tween 20. Usually, a human negative control serum or serum of a patient infected with HCV was diluted 200 times and preincubated for 1 hour at room temperature with 200 times diluted wild type vaccinia virus-infected cell lysate in order to decrease the non-specific binding. After washing with Blotto/0.1% Tween 20, the nitrocellulose filters were incubated with alkaline phosphatase substrate solution diluted in Blotto/0.1% Tween 20. After washing with 0.1% Tween 20 in PBS, the filters were incubated with alkaline phosphatase substrate solution (100 mM Tris.HCl pH 9.5, 100 mM NaCl, 5 mM MgCl2, 0.38 µg/ml nitroblue tetrazolium, 0.165 µg/ml 5-bromo-4-chloro-3-indolylphosphate).
All steps, except the electrotransfer, were performed at room temperature.
Example 5: Purification of recombinant E1 or E2 protein

5.1. Lysis

Infected RK13 cells (carrying E1 or E2 constructs) were washed 2 times with phosphate-buffered saline (PBS) and detached from the culture recipients by incubation in PBS containing 10 mM EDTA. The detached cells were washed twice with PBS and 1 ml of lysis buffer (50 mM Tris.HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 5 mM MgCl2, 1 µg/ml aprotinin (Sigma, Bornem, Belgium)) containing 2 mM biotinylated N-ethylmaleimide (biotin-NEM) (Sigma) was added per 10" cells at 4°C. This lysate was homogenized with a type B douncer and left at room temperature for 0.5 hours. Another volumes of lysis buffer containing 10 mM N-ethylmaleimide (NEM, Aldrich, Bornem, Belgium) was added to the primary lysate and the mixture was left at room temperature for 15 min. Insoluble cell debris was cleared from the solution by centrifugation in a Beckman JA-14 rotor at 14,000 rpm (30100 g at ) for 1 hour at 4°C.
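The rotor speed and g-force quoted for the clearing spin can be related through the standard conversion RCF = 1.118 × 10^-5 × r × rpm^2, with r in cm. The sketch below applies it; the radius of 13.7 cm is an assumption (taken here as the maximum radius of the JA-14 rotor, which the text does not state), and the function name is hypothetical.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given rotor speed and radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


# Assumed maximum radius of the rotor in cm; not stated in the text.
ASSUMED_RMAX_CM = 13.7

print(round(rcf_from_rpm(14_000, ASSUMED_RMAX_CM)))
```

With this assumed radius, 14,000 rpm corresponds to roughly 30,000 g, in line with the 30100 g quoted in the text.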
5.2. Lectin Chromatography

The cleared cell lysate was loaded at a rate of 1 ml/min on a 0.8 by 10 cm Lentil-lectin Sepharose 4B column (Pharmacia) that had been equilibrated with 5 column volumes of lysis buffer at a rate of 1 ml/min. The lentil-lectin column was washed with to 10 column volumes of buffer 1 (0.1 M potassium phosphate pH 7.3, 500 mM KCl, glycerol, 1 mM 6-NH2-hexanoic acid, 1 mM MgCl2, and 1% DecylPEG (KWANT, Bedum, The Netherlands)). In some experiments, the column was subsequently washed with 10 column volumes of buffer 1 containing 0.5% Empigen-BB (Calbiochem, San Diego, CA, USA) instead of 1% DecylPEG. The bound material was eluted by applying elution buffer (10 mM potassium phosphate pH 7.3, 5% glycerol, 1 mM hexanoic acid, 1 mM MgCl2, 0.5% Empigen-BB, and 0.5 M α-methyl-mannopyranoside). The eluted material was fractionated and fractions were screened for the presence of E1 or E2 protein by means of ELISA as described in example 6. Figure 22 shows ELISA results obtained from lentil lectin eluate fractions of 4 different E1 purifications of cell lysates infected with vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a). Figure 23 shows the profiles obtained from the values shown in Figure 22. These results show that the lectin affinity column can be employed for envelope proteins of the different types of HCV.
5.3. Concentration and partial reduction

The E1- or E2-positive fractions were pooled and concentrated on a Centricon 30 kDa (Amicon) by centrifugation for 3 hours at 5,000 rpm in a Beckman JA-20 rotor at 4°C. In some experiments the E1- or E2-positive fractions were pooled and concentrated by nitrogen evaporation. An equivalent of 3 × 10^8 cells was concentrated to approximately 200 µl. For partial reduction, 30% Empigen-BB (Calbiochem, San Diego, CA, USA) was added to this 200 µl to a final concentration of 3.5%, and 1 M DTT in H2O was subsequently added to a final concentration of 1.5 to 7.5 mM and incubated for 30 min at 37°C. NEM (1 M in dimethylsulphoxide) was subsequently added to a final concentration of 50 mM and left to react for another 30 min at 37°C to block the free sulphydryl groups.
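The additions described above are stock dilutions into a fixed starting volume, so the volume of stock to add follows from a mass balance: V_add = V_0 × C_final / (C_stock − C_final). The sketch below is a minimal illustration of that formula, assuming the sample initially contains none of the added reagent and that volumes are additive; the function names are hypothetical.

```python
def stock_volume_to_add(sample_volume: float, stock_conc: float, final_conc: float) -> float:
    """Volume of stock to add to `sample_volume` to reach `final_conc`.

    Assumes the sample contains none of the reagent beforehand and that
    volumes are additive. Concentrations share one unit, volumes another.
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_volume * final_conc / (stock_conc - final_conc)


def resulting_conc(sample_volume: float, added_volume: float, stock_conc: float) -> float:
    """Concentration reached after adding `added_volume` of stock to the sample."""
    return added_volume * stock_conc / (sample_volume + added_volume)


# 30% Empigen-BB stock added to a 200 µl sample for a 3.5% final concentration.
empigen_ul = stock_volume_to_add(200.0, 30.0, 3.5)
print(f"Empigen-BB stock to add: {empigen_ul:.1f} µl")
```

For the 200 µl concentrate this works out to about 26 µl of 30% Empigen-BB; the same function applies to the DTT and NEM additions when the stock and final concentrations are expressed in the same unit (for example mM).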
5.4. Gel filtration chromatography

A Superdex-200 HR 10/20 column (Pharmacia) was equilibrated with 3 column volumes PBS/3% Empigen-BB. The reduced mixture was injected in a 500 µl sample loop of the Smart System (Pharmacia) and PBS/3% Empigen-BB buffer was added for gelfiltration. Fractions of 250 µl were collected from Vo to . The fractions were screened for the presence of E1 or E2 protein as described in example 6.
Figure 24 shows ELISA results obtained from fractions obtained after gelfiltration chromatography of 4 different E1 purifications of cell lysates infected with vvHCV39 (type 1b), vvHCV40 (type 1b), vvHCV62 (type 3a), and vvHCV63 (type 5a). Figure 25 shows the profiles obtained from purifications of E1 proteins of types 1b, 3a, and 5a (from RK13 cells infected with vvHCV39, vvHCV62, and vvHCV63, respectively; purified on lentil lectin and reduced as in the previous examples). The peaks indicated with and represent pure E1 protein peaks (E1 reactivity mainly in fractions 26 to 30). These peaks show very similar molecular weights of approximately 70 kDa, corresponding to dimeric E1 protein. Other peaks in the three profiles represent vaccinia virus and/or cellular proteins which could be separated from E1 only because of the reduction step as outlined in example 5.3 and because of the subsequent gelfiltration step in the presence of the proper detergent. As shown in Figure 26, pool 1 (representing fractions 10 to 17) and pool 2 (representing fractions 18 to 25) contain contaminating proteins not present in the E1 pool (fractions 26 to 30). The E1 peak fractions were run on SDS/PAGE and blotted as described in example 4. Proteins labelled with NEM-biotin were detected by streptavidin-alkaline phosphatase as shown in Figure 27. It can be readily observed that, amongst others, the 29 kDa and contaminating proteins present before the gelfiltration chromatography (lane 1) are only present at very low levels in the fractions 26 to 30. The band at approximately represents the E1 dimeric form that could not be entirely disrupted into the monomeric E1 form. Similar results were obtained for the type 3a E1 protein (lanes 10 to ), which shows a faster mobility on SDS/PAGE because of the presence of only carbohydrates instead of 6. Figure 28 shows a silver stain of an SDS/PAGE gel run in identical conditions as in Figure 26.
A complete overview of the purification procedure is given in Figure 29.
The presence of purified E1 protein was further confirmed by western blotting as described in example 4. The dimeric E1 protein appeared to be non-aggregated and free of contaminants. The subtype 1b E1 protein purified from cells according to the above scheme was aminoterminally sequenced on a 477 Perkin-Elmer sequencer and appeared to contain a tyrosine as first residue.
This confirmed that the E1 protein had been cleaved from its signal sequence by the signal peptidase at the correct position (between A191 and Y192), in agreement with the finding of Hijikata et al. (1991) that the aminoterminus of the mature E1 protein starts at amino acid position 192.
Purification of the E2 protein
The E2 protein (amino acids 384 to 673) was purified from RK13 cells infected with vvHCV44 as indicated in Examples 5.1 to 5.4. Figure 30 shows the OD280 profile (continuous line) of the lentil lectin chromatography. The dotted line represents the E2 reactivity as detected by ELISA (see example ). Figure 31 shows the same profiles obtained from gel filtration chromatography of the lentil-lectin E2 pool (see Figure ), part of which was reduced and blocked according to the methods set out in example and part of which was applied to the column immediately. Both parts of the E2 pool were run on separate gel filtration columns. It could be demonstrated that E2 forms covalently linked aggregates with contaminating proteins if no reduction has been performed. After reduction and blocking, the majority of contaminating proteins segregated into the Vo fraction. Other contaminating proteins that copurified with the E2 protein were no longer covalently linked to it, since these contaminants could be removed in a subsequent step. Figure 32 shows an additional Ni2+-IMAC purification step carried out for the E2 protein purification. This affinity purification step employs the 6 histidine residues added to the E2 protein as expressed from vvHCV44. Contaminating proteins either run through the column or can be removed by a 30 mM imidazole wash. Figure 33 shows a silver-stained SDS/PAGE of µg of purified E2 protein and a 30 mM imidazole wash. The pure E2 protein could be easily recovered by a 200 mM imidazole elution step. Figure 34 shows an additional desalting step intended to remove imidazole and to allow a switch to the desired buffer, e.g. PBS, carbonate buffer, or saline.
Starting from about 50,000 cm² of RK13 cells infected with vvHCV11A (or for the production of E1 or vvHCV41, vvHCV42, vvHCV43, or vvHCV44 for production of E2 protein, the procedures described in examples 5.1 to 5.5 allow the purification of approximately 1.3 mg of E1 protein and 0.6 mg of E2 protein.
It should also be remarked that secreted E2 protein (constituting approximately 30-40%, with 60-70% being in the intracellular form) is characterized by aggregate formation (contrary to expectations). The same problem is thus posed when purifying secreted E2. The secreted E2 can be purified as disclosed above.
Example 6: ELISA for the detection of anti-E1 or anti-E2 antibodies or for the detection of E1 or E2 proteins
Maxisorb microwell plates (Nunc, Roskilde, Denmark) were coated with 1 volume (50 µl or 100 µl or 200 µl) per well of a 5 µg/ml solution of Streptavidin (Boehringer Mannheim) in PBS for 16 hours at 4°C or for 1 hour at 37°C. Alternatively, the wells were coated with 1 volume of 5 µg/ml of Galanthus nivalis agglutinin (GNA) in 50 mM sodium carbonate buffer pH 9.6 for 16 hours at 4°C or for 1 hour at 37°C. In the case of coating with GNA, the plates were washed 2 times with 400 µl of Washing Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium). Unbound coating surfaces were blocked with 1.5 to 2 volumes of blocking solution (casein and 0.1% NaN3 in PBS) for 1 hour at 37°C or for 16 hours at 4°C. Blocking solution was aspirated. Purified E1 or E2 was diluted to 100-1000 ng/ml (concentration measured at A280 nm), or column fractions to be screened for E1 or E2 (see example ) or E1 or E2 in non-purified cell lysates (example ) were diluted 20 times in blocking solution, and 1 volume of the E1 or E2 solution was added to each well and incubated for 1 hour at 37°C on the Streptavidin- or GNA-coated plates. The microwells were washed 3 times with 1 volume of Washing Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium). Serum samples were diluted 20 times, or monoclonal anti-E1 or anti-E2 antibodies were diluted to a concentration of 20 ng/ml, in Sample Diluent of the Innotest HCV Ab III kit, and 1 volume of the solution was left to react with the E1 or E2 protein for 1 hour at 37°C. The microwells were washed times with 400 µl of Washing Solution of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium).
The bound antibodies were detected by incubating each well for 1 hour at 37°C with a goat anti-human or anti-mouse IgG, peroxidase-conjugated secondary antibody (DAKO, Glostrup, Denmark) diluted 1/80,000 in 1 volume of Conjugate Diluent of the Innotest HCV Ab III kit (Innogenetics, Zwijndrecht, Belgium). After the plates had been washed 3 times with 400 µl of Washing Solution of the same kit, color development was obtained by addition of substrate of the Innotest HCV Ab III kit diluted 100 times in 1 volume of Substrate Solution of the same kit for 30 min at 24°C.
Example 7: Follow up of patient groups with different clinical profiles
7.1. Monitoring of anti-E1 and anti-E2 antibodies
The current hepatitis C virus (HCV) diagnostic assays have been developed for screening and confirmation of the presence of HCV antibodies. Such assays do not seem to provide information useful for monitoring of treatment or for prognosis of the outcome of disease. However, as is the case for hepatitis B, detection and quantification of anti-envelope antibodies may prove more useful in a clinical setting. To investigate the possible use of anti-E1 antibody titer and anti-E2 antibody titer as prognostic markers for the outcome of hepatitis C disease, a series of IFN-α treated patients with long-term sustained response (defined as patients with normal transaminase levels and a negative HCV-RNA test (PCR in the 5' non-coding region) in the blood for a period of at least 1 year after treatment) was compared with patients showing no response or showing a biochemical response with relapse at the end of treatment.
A group of 8 IFN-α treated patients with long-term sustained response (LTR, follow up 1 to 3.5 years, 3 type 3a and 5 type 1b) was compared with 9 patients showing non-complete responses to treatment (NR, follow up 1 to 4 years, 6 type 1b and 3 type 3a). Type 1b (vvHCV-39, see example ) and 3a E1 (vvHCV-62, see example ) proteins were expressed by the vaccinia virus system (see examples 3 and 4) and purified to homogeneity (example ). The samples derived from patients infected with a type 1b hepatitis C virus were tested for reactivity with purified type 1b E1 protein, while samples of a type 3a infection were tested for reactivity of anti-type 3a E1 antibodies in an ELISA as described in example 6. The genotypes of hepatitis C viruses infecting the different patients were determined by means of the Inno-LiPA genotyping assay (Innogenetics, Zwijndrecht, Belgium). Figure 5 shows the anti-E1 signal-to-noise ratios of these patients followed during the course of interferon treatment and during the follow-up period after treatment. LTR cases consistently showed rapidly declining anti-E1 levels (with complete negativation in 3 cases), while anti-E1 levels of NR cases remained approximately constant. Some of the obtained anti-E1 data are shown in Table 2 as average S/N ratios ± SD (mean anti-E1 titer). The anti-E1 titer could be deduced from the signal-to-noise ratio as shown in Figures 5, 6, 7, and 8.
Already at the end of treatment, marked differences could be observed between the 2 groups. Anti-E1 antibody titers had decreased 6.9 times in LTR but only 1.5 times in NR. At the end of follow up, the anti-E1 titers had declined by a factor of 22.5 in the patients with sustained response and had even slightly increased in NR. Therefore, based on these data, a decrease of anti-E1 antibody levels during monitoring of IFN-α therapy correlates with long-term, sustained response to treatment. The anti-E1 assay may be very useful for prognosis of long-term response to IFN treatment, or to treatment of hepatitis C disease in general.
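The fold decreases quoted above are simply ratios of a starting titer to a later titer. The following is a minimal sketch of that arithmetic; the function name and the titer values are hypothetical and are not taken from Table 2.

```python
def fold_decrease(titer_start: float, titer_later: float) -> float:
    """Return how many times a titer has dropped (start divided by later)."""
    if titer_later <= 0:
        raise ValueError("later titer must be positive")
    return titer_start / titer_later


# Hypothetical mean anti-E1 titers, chosen only to illustrate the calculation.
ltr_start, ltr_end_of_treatment = 690.0, 100.0
print(round(fold_decrease(ltr_start, ltr_end_of_treatment), 1))
```

A value above 1 indicates a decline in titer, and a value below 1 indicates an increase, as reported for the NR group at the end of follow up.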
This finding was not expected. On the contrary, the inventors had expected the anti-E1 antibody levels to increase during the course of IFN treatment in patients with long-term response. In hepatitis B, the virus is cleared as a consequence of seroconversion for anti-HBsAg antibodies. In many other virus infections, too, the virus is eliminated when anti-envelope antibodies are raised. However, in the experiments of the present invention, anti-E1 antibodies clearly decreased in patients with a long-term response to treatment, while the antibody level remained approximately constant in non-responding patients. Although the outcome of these experiments was not expected, this non-obvious finding may be very important and useful for clinical diagnosis of HCV infections. As shown in Figures 9, 10, 11, and 12, anti-E2 levels behaved very differently in the same patients, and no obvious decline in titers was observed as for anti-E1 antibodies. Figure 35 gives a complete overview of the pilot study.
As can be deduced from Table 2, the anti-E1 titers were on average at least 2 times higher at the start of treatment in long-term responders compared with incomplete responders to treatment. Therefore, measuring the titer of anti-E1 antibodies at the start of treatment, or monitoring the patient during the course of infection and measuring the anti-E1 titer, may become a useful marker for clinical diagnosis of hepatitis C.
Furthermore, the use of more defined regions of the E1 or E2 proteins may become desirable, as shown in example 7.3.
7.2. Analysis of E1 and E2 antibodies in a larger patient cohort
The pilot study led the inventors to conclude that, in cases where infection was completely cleared, antibodies to the HCV envelope proteins changed more rapidly than antibodies to the more conventionally studied HCV antigens, with E1 antibodies changing most vigorously. We therefore included more type 1b- and 3a-infected LTR and further supplemented the cohort with a matched series of NR, such that both groups included 14 patients each. Some partial responders (PR) and responders with relapse (RR) were also analyzed.
Figure 36 depicts average E1 antibody (E1Ab) and E2 antibody (E2Ab) levels in the LTR and NR groups, and Tables 4 and 5 show the statistical analyses. In this larger cohort, higher E1 antibody levels before IFN-α therapy were associated with LTR (P 0.03). Since much higher E1 antibody levels were observed in type 3a-infected patients compared with type 1b-infected patients (Figure 37), the genotype was taken into account (Table ). Within the type 1b-infected group, LTR also had higher E1 antibody levels than NR at the initiation of treatment [P 0.05]; the limited number of type 3a-infected NR did not allow statistical analysis.
Of the antibody levels monitored in LTR during the 1.5-year follow up period, only E1 antibodies cleared rapidly compared with levels measured at initiation of treatment [P 0.0058, end of therapy; P 0.0047 and P 0.0051 at 6 and 12 months after therapy, respectively]. This clearance remained significant within type 1- or type 3-infected LTR (average P values 0.05). These data confirmed the initial finding that E1Ab levels decrease rapidly in the early phase of resolution. This feature seems to be independent of viral genotype. In NR, PR, or RR, no changes in any of the antibodies measured were observed throughout the follow up period. In patients who responded favourably to treatment, with normalization of ALT levels and HCV-RNA negativity during treatment, there was a marked difference between sustained responders (LTR) and responders with a relapse (RR). In contrast to LTR, RR did not show any decrease in E1 antibody levels, indicating the presence of occult HCV infection that could be demonstrated neither by PCR or other classical techniques for detection of HCV-RNA, nor by ALT levels. The minute quantities of viral RNA still present in the RR group during treatment seemed to be capable of anti-E1 B cell stimulation. Anti-E1 monitoring may therefore be able to discriminate LTR not only from NR, but also from RR.
7.3. Monitoring of antibodies of defined regions of the E1 protein
Although the molecular biological approach of identifying HCV antigens resulted in an unprecedented breakthrough in the development of viral diagnostics, the method of immune screening of λgt11 libraries predominantly yielded linear epitopes dispersed throughout the core and non-structural regions, and analysis of the envelope regions had to await cloning and expression of the E1/E2 region in mammalian cells. This approach sharply contrasts with many other viral infections, for which epitopes in the envelope regions had already been mapped long before the genomic structure was deciphered.
Such epitopes and corresponding antibodies often had neutralizing activity useful for vaccine development and/or allowed the development of diagnostic assays with clinical or prognostic significance (e.g. antibodies to hepatitis B surface antigen).
As no HCV vaccines or tests allowing clinical diagnosis and prognosis of hepatitis C disease are available today, the characterization of viral envelope regions exposed to immune surveillance may significantly contribute to new directions in HCV diagnosis and prophylaxis.
Several 20-mer peptides (Table 3) that overlapped each other by 8 amino acids were synthesized according to a previously described method (EP-A-0 489 968) based on the HC-J1 sequence (Okamoto et al., 1990). None of these, except peptide env35 (also referred to as E1-35), was able to detect antibodies in sera of approximately 200 HCV cases. Only 2 sera reacted slightly with the env35 peptide. However, by means of the anti-E1 ELISA described in example 6, it was possible to discover additional epitopes as follows: the anti-E1 ELISA described in example 6 was modified by mixing 50 µg/ml of E1 peptide with the 1/20 diluted human serum in sample diluent.
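A set of 20-mers overlapping by 8 residues corresponds to a window of 20 amino acids advanced in steps of 12. The sketch below illustrates such a tiling under that assumption; the function name and the input sequence are hypothetical placeholders, not the HC-J1 sequence or the peptides of Table 3.

```python
def tile_peptides(sequence: str, length: int = 20, overlap: int = 8):
    """Return (1-based start position, peptide) pairs for overlapping peptides."""
    step = length - overlap  # 20 - 8 = 12 residues between consecutive starts
    if step <= 0:
        raise ValueError("overlap must be smaller than the peptide length")
    last_start = len(sequence) - length
    return [
        (start + 1, sequence[start:start + length])
        for start in range(0, last_start + 1, step)
    ]


# Placeholder 60-residue sequence used only to show the tiling pattern.
demo_sequence = "ACDEFGHIKLMNPQRSTVWY" * 3
for position, peptide in tile_peptides(demo_sequence):
    print(position, peptide)
```

Each peptide shares its last 8 residues with the first 8 residues of the next one, so a linear epitope of up to 9 residues is always fully contained in at least one peptide.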
Figure 13 shows the reactivity of human sera to the recombinant E1 protein (expressed from vvHCV-40) in the presence of single E1 peptides or of a mixture of E1 peptides. While only 2% of the sera could be detected by means of E1 peptides coated on strips in a Line Immunoassay format, over half of the sera contained anti-E1 antibodies which could be competed by means of the same peptides when tested on the recombinant E1 protein. Some of the murine monoclonal antibodies obtained from Balb/C mice after injection with purified E1 protein were subsequently competed for reactivity to E1 with the single peptides (Figure 14). Clearly, the region of env53 contained the predominant epitope, as the addition of env53 could substantially compete reactivity of several sera with E1, and antibodies to the env31 region were also detected. This finding was surprising, since the env53 and env31 peptides had not shown any reactivity when coated directly to the solid phase.
Therefore, peptides were synthesized using technology described previously by the applicant (in WO 93/18054). The following peptides were synthesized:
peptide NH2-SNSSEAADMIMHTPGCV-GK-biotin (SEQ ID NO 51), spanning amino acids 208 to 227 of the HCV polyprotein in the E1 region;
peptide biotin-env53 ('epitope A'), biotin-GG-ITGHRMAWDMMMNWSPTTAL-COOH (SEQ ID NO 52), spanning amino acids 313 to 332 of the HCV polyprotein in the E1 region;
peptide 1bE1 ('epitope B'), H2N-YEVRNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCGK-biotin (SEQ ID NO 53), spanning amino acids 192 to 228 of the HCV polyprotein in the E1 region.
These were compared with the reactivities of peptides E1a-BB (biotin-GG-TPTVATRDGKLPATQLRRHIDLL, SEQ ID NO 54) and E1b-BB (biotin-GG-TPTLAARDASVPTTTIRRHVDLL, SEQ ID NO 55), which are derived from the same region of sequences of genotype 1a and 1b, respectively, and which were described at the IXth international virology meeting in Glasgow, 1993 ('epitope C'). Reactivity of a panel of HCV sera was tested on epitopes A, B and C, and epitope B was also compared with (of 47 HCV-positive sera, 8 were positive on epitope B and none reacted with ). Reactivity towards epitopes A, B, and C was tested directly on the biotinylated peptides (50 µg/ml) bound to streptavidin-coated plates as described in example 6.
Clearly, epitopes A and B were most reactive, while epitope C and env35A-biotin were much less reactive. The same series of patients that had been monitored for their reactivity towards the complete E1 protein (example ) was tested for reactivity towards epitopes A, B, and C. Little reactivity was seen to epitope C, while, as shown in Figures 15, 16, 17, and 18, epitopes A and B reacted with the majority of sera.
However, antibodies to the most reactive epitope (epitope A) did not seem to predict remission of disease, while the anti-1bE1 antibodies (epitope B) were present almost exclusively in long-term responders at the start of IFN treatment. Therefore, anti-1bE1 (epitope B) antibodies and anti-env53 (epitope A) antibodies could be shown to be useful markers for prognosis of hepatitis C disease. The env53 epitope may be advantageously used for the detection of cross-reactive antibodies (antibodies that cross-react between major genotypes), and antibodies to the env53 region may be very useful for universal E1 antigen detection in serum or liver tissue. Monoclonal antibodies that recognized the env53 region were reacted with a random epitope library. In 4 clones that reacted upon immunoscreening with the monoclonal antibody 5E1A10, the sequence -GWD- was present. Because of its analogy with the universal HCV sequence present in all HCV variants in the env53 region, the sequence AWD is thought to contain the essential sequence of the env53 cross-reactive murine epitope. The env31 region clearly also contains a variable region which may contain an epitope in the amino terminal sequence -YQVRNSTGL- (SEQ ID NO 93) and may be useful for diagnosis.
Env31 or E1-31, as shown in Table 3, is a part of the peptide 1bE1. Peptides E1-33 and E1-51 also reacted to some extent with the murine antibodies, and peptide E1-55 (containing the variable region 6 spanning amino acid positions 329-336) also reacted with some of the patient sera.
Anti-E2 antibodies clearly followed a different pattern than the anti-E1 antibodies, especially in patients with a long-term response to treatment. Therefore, it is clear that the decrease in anti-envelope antibodies could not be measured as efficiently with an assay employing a recombinant E1/E2 protein as with a single anti-E1 or anti-E2 protein.
The anti-E2 response would clearly blur the anti-E1 response in an assay measuring both kinds of antibodies at the same time. Therefore, the ability to test anti-envelope antibodies to the single E1 and E2 proteins was shown to be useful.
7.4. Mapping of anti-E2 antibodies
Of the 24 anti-E2 Mabs, only three could be competed for reactivity to recombinant E2 by peptides, two of which reacted with the HVRI region (peptides E2-67 and E2-69, designated as epitope A) and one of which recognized an epitope competed by peptide E2-13B (epitope C). The majority of murine antibodies recognized conformational anti-E2 epitopes (Figure 19). A human response to HVRI (epitope A) and, to a lesser extent, HVRII (epitope B), a third linear epitope region (competed by peptides E2-23, E2-25 or E2-27, designated epitope E) and a fourth linear epitope region (competed by peptide E2-17B, epitope D) could also frequently be observed, but the majority of sera reacted with conformational epitopes (Figure 20). These conformational epitopes could be grouped according to their relative positions as follows: the IgG antibodies in the supernatant of hybridomas 15C8C1, 12D11F1, 9G3E6, 8G10D1H9, 10D3C4, 4H6B2, 17F2C2, 5H6A7, and 15B7A2, recognizing conformational epitopes, were purified by means of protein A affinity chromatography, and 1 mg/ml of the resulting IgGs was biotinylated in borate buffer in the presence of biotin. Biotinylated antibodies were separated from free biotin by means of gel filtration chromatography. Pooled biotinylated antibody fractions were diluted 100 to 10,000 times. E2 protein bound to the solid phase was detected by the biotinylated IgG in the presence of 100 times the amount of non-biotinylated competing antibody and subsequently detected by alkaline phosphatase-labeled streptavidin.
Percentages of competition are given in Table 6. Based on these results, 4 conformational anti-E2 epitope regions (epitopes F, G, H and I) could be delineated (Figure 38). Alternatively, these Mabs may recognize mutant linear epitopes not represented by the peptides used in this study. Mabs 4H6B2 and 10D3C4 competed reactivity of 16A6E7, but unlike 16A6E7, they did not recognize peptide E2-13B. These Mabs may recognize variants of the same linear epitope (epitope C) or recognize a conformational epitope which is sterically hindered or changes conformation after binding of 16A6E7 to the E2-13B region (epitope H).
Example 8: E1 glycosylation mutants
8.1. Introduction
The E1 protein encoded by vvHCV10A and the E2 protein encoded by vvHCV41 to 44, expressed from mammalian cells, contain 6 and 11 carbohydrate moieties, respectively. This could be shown by incubating the lysate of vvHCV10A-infected or vv. .v I A-,Iec rted I 3 cells with decreasing concentrations of glycosidases (PNGase F or Endoglycosidase H (Boehringer Mannheim Biochemica), according to the manufacturer's instructions), such that the proteins in the lysate (including E1) are partially deglycosylated (Fig. 39 and 40, respectively).
Mutants devoid of some of their glycosylation sites could allow the selection of envelope proteins with improved immunological reactivity. For HIV, for example, gpl proteins lacking certain selected sugar-addition motifs have been found to be particularly useful for diagnostic or vaccine purposes. The addition of a new oligosaccharide side chain in the hemagglutinin protein of an escape mutant of the A/Hong Kong/3/68 (H3N2) influenza virus prevents reactivity with a neutralizing monoclonal antibody (Skehel et al., 1984). When novel glycosylation sites were introduced into the influenza hemagglutinin protein by site-specific mutagenesis, dramatic antigenic changes were observed, suggesting that the carbohydrates serve as a modulator of antigenicity (Gallagher et al., 1988). In another analysis, the 8 carbohydrate-addition motifs of the surface protein gp70 of the Friend Murine Leukemia Virus were deleted. Although seven of the mutations did not affect virus infectivity, mutation of the fourth glycosylation signal with respect to the amino terminus resulted in a non-infectious phenotype (Kayman et al., 1991). Furthermore, it is known in the art that addition of N-linked carbohydrate chains is important for stabilization of folding intermediates and thus for efficient folding, prevention of malfolding and degradation in the endoplasmic reticulum, oligomerization, biological activity, and transport of glycoproteins (see reviews by Rose et al., 1988; Doms et al., 1993; Helenius, 1994).
After alignment of the different envelope protein sequences of HCV genotypes, it may be inferred that not all 6 glycosylation sites on the HCV subtype 1b E1 protein are required for proper folding and reactivity, since some are absent in certain (sub)types. The fourth carbohydrate motif (on Asn251), present in types 1b, 6a, 7, 8, and 9, is absent in all other types known today. This sugar-addition motif may be mutated to yield a type 1b E1 protein with improved reactivity. Also, the type 2b sequences show an extra glycosylation site in the V5 region (on Asn299). The isolate S83, belonging to genotype 2c, even lacks the first carbohydrate motif in the V1 region (on Asn), while it is present on all other isolates (Stuyver et al., 1994). However, even among the completely conserved sugar-addition motifs, the presence of the carbohydrate may not be required for folding, but may have a role in evasion of immune surveillance. Therefore, identification of the carbohydrate-addition motifs which are not required for proper folding (and reactivity) is not obvious, and each mutant has to be analyzed and tested for reactivity. Mutagenesis of a glycosylation motif (NXS or NXT sequences) can be achieved by mutating the codons for N, S, or T in such a way that these codons encode amino acids different from N in the case of N, and/or amino acids different from S or T in the case of S and in the case of T. Alternatively, the X position may be mutated into P, since it is known that NPS or NPT are not frequently modified with carbohydrates. After establishing which carbohydrate-addition motifs are required for folding and/or reactivity and which are not, combinations of such mutations may be made.
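The NXS/NXT rule described above, with proline excluded at the X position, can be sketched as a simple pattern search. The example below is an illustration only: the function name is hypothetical, and the input is the 1bE1 peptide sequence quoted in example 7.3, numbered on the assumption that its first residue is polyprotein position 192.

```python
import re

# N-X-S or N-X-T, where X is any residue except proline.
# The lookahead lets overlapping motifs be reported independently.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str, first_position: int = 1) -> list:
    """Return positions of Asn residues that sit in an N-X-S/T motif."""
    return [m.start() + first_position for m in SEQUON.finditer(protein.upper())]


# 1bE1 peptide as quoted in example 7.3 (stated there to span residues 192 to 228).
peptide_1be1 = "YEVRNVSGIYHVTNDCSNSSIVYEAADMIMHTPGCGK"
print(find_sequons(peptide_1be1, first_position=192))  # [196, 209]
```

The Asn in the NDC triplet is not reported, because the third position is neither S nor T. A motif search of this kind identifies candidate sites only; as the text notes, whether a given site can be removed without loss of folding or reactivity has to be tested experimentally.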
8.2. Mutagenesis of the E1 protein
All mutations were performed on the E1 sequence of clone HCCI10A (SEQ ID NO. ). The first round of PCR was performed using sense primer 'GPT' (see Table 7), targeting the GPT sequence located upstream of the vaccinia 11K late promoter, and an antisense primer (designated GLY#, with # representing the number of the glycosylation site, see Fig. 41) containing the desired base change to obtain the mutagenesis. The six GLY# primers (each specific for a given glycosylation site) were designed according to the following criteria:
- Modification of the codon encoding the N-glycosylated Asn (AAC or AAT) to a Gln codon (CAA or CAG). Glutamine was chosen because it is very similar to asparagine (both amino acids are neutral and contain non-polar residues; glutamine has a longer side chain, with one more -CH2- group).
- Introduction of silent mutations in one or several of the codons downstream of the glycosylation site, in order to create a new unique or rare restriction enzyme site (e.g. a second SmaI site). Without modifying the amino acid sequence, this mutation provides a way to distinguish the mutated sequences from the original E1 sequence (pvHCV-10A) or from each other (Figure 41). This additional restriction site may also be useful for the construction of new hybrid (double, triple, etc.) glycosylation mutants.
- 18 nucleotides extend 5' of the first mismatched nucleotide and 12 to 16 nucleotides extend to the 3' end. Table 7 depicts the sequences of the six GLY# primers overlapping the sequence of N-linked glycosylation sites.
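The first design criterion, replacing an Asn codon (AAC or AAT) by a Gln codon (CAA or CAG), can be illustrated with a short sketch. The function name and the 12-nucleotide input are hypothetical; the fragment is not taken from the E1 sequence or from the primers of Table 7.

```python
ASN_CODONS = {"AAC", "AAT"}
GLN_CODONS = {"CAA", "CAG"}


def asn_to_gln(coding_dna: str, codon_number: int, gln_codon: str = "CAG") -> str:
    """Replace the Asn codon at a 1-based codon number with a Gln codon."""
    if gln_codon not in GLN_CODONS:
        raise ValueError("replacement must be CAA or CAG")
    start = 3 * (codon_number - 1)
    codon = coding_dna[start:start + 3].upper()
    if codon not in ASN_CODONS:
        raise ValueError(f"codon {codon_number} is {codon!r}, not an Asn codon")
    return coding_dna[:start] + gln_codon + coding_dna[start + 3:]


# Hypothetical in-frame fragment encoding Val-Asn-Val-Ser (an N-X-S motif).
fragment = "GTGAACGTGTCC"
print(asn_to_gln(fragment, codon_number=2))  # GTGCAGGTGTCC
```

The check on the original codon guards against an off-by-one error in the reading frame, which would otherwise silently introduce an unintended substitution.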
For site-directed mutagenesis, the 'mispriming' or 'overlap extension' method (Horton, 19.) was used. The concept is illustrated in Figures 42 and 43. First, two separate fragments were amplified from the target gene for each mutated site. The PCR product obtained from the 5' end (product GLY#) was amplified with the 5' sense GPT primer (see Table 7) and with the respective 3' antisense GLY# primers. The second fragment (product OVR#) was amplified with the 3' antisense TKR primer and the respective sense primers (OVR# primers, see Table 7, Figure 43).
The OVR# primers target part of the GLY# primer sequence. Therefore, the two groups of PCR products share an overlap region of identical sequence. When these intermediate products are mixed (GLY-1 with OVR-1, GLY-2 with OVR-2, etc.), melted at high temperature, and reannealed, the top sense strand of product GLY# can anneal to the antisense strand of product OVR# (and vice versa) in such a way that the two strands act as primers for one another (see Fig. ). Extension of the annealed overlap by Taq polymerase during two PCR cycles created the full-length mutant molecule E1Gly#, which carries the mutation destroying glycosylation site number #. Sufficient quantities of the E1GLY# products for cloning were generated in a third PCR by means of a common set of two internal nested primers. These two new primers overlap the 3' end of the vaccinia 11K promoter (sense GPT-2 primer) and the 5' end of the vaccinia thymidine kinase locus (antisense TKR-2 primer, see Table ), respectively. All PCRs were performed under the conditions described in Stuyver et al. (1993).
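The joining of the GLY# and OVR# products through their shared overlap can be pictured as merging two sequences whose ends are identical. The sketch below is a simplified, single-strand illustration under that assumption; it ignores the antisense strand and the polymerase step, and the function name and both sequences are hypothetical.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 12) -> str:
    """Merge two sense-strand sequences that share an identical overlap.

    The end of `left` must equal the start of `right` over at least
    `min_overlap` nucleotides; the longest such overlap is used.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no shared overlap of the required length")


# Hypothetical products: the last 12 nt of the 5' product equal the first
# 12 nt of the 3' product, standing in for the mutated primer region.
product_gly = "ATGGCCAAGCTTGGATCCCAGGTG"
product_ovr = "GGATCCCAGGTGTCCGAATTCTAA"
print(join_by_overlap(product_gly, product_ovr))
```

Because the mutated codon lies inside the overlap, it is present in both intermediate products and is therefore carried into the full-length molecule.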
Each of these PCR products was cloned by EcoRI/BamHI cleavage into the EcoRI/BamHI-cut vaccinia vector containing the original E1 sequence (pvHCV-10A).
The selected clones were analyzed for length of insert by EcoRI/BamHI cleavage and for the presence of each new restriction site. The sequences overlapping the mutated sites were confirmed by double-stranded sequencing.
8.3. Analysis of E1 glycosylation mutants
Starting from the 6 plasmids containing the mutant E1 sequences as described in example 8.2, recombinant vaccinia viruses were generated by recombination with wt vaccinia virus as described in example 2.5. Briefly, 175 cm² flasks of subconfluent RK13 cells were infected with the 6 recombinant vaccinia viruses carrying the mutant E1 sequences, as well as with the vvHCV-10A (carrying the non-mutated E1 sequence) and wt vaccinia viruses. Cells were lysed after 24 hours of infection and analyzed on western blot as described in example 4 (see Figure 44A). All mutants showed a faster mobility (corresponding to a smaller molecular weight of approximately 2 to 3 kDa) on SDS-PAGE than the original E1 protein, confirming that one carbohydrate moiety was not added. Recombinant viruses were also analyzed by PCR and restriction enzyme analysis to confirm the identity of the different mutants. Figure 44B shows that all mutants (as shown in Figure 41) contained the expected additional restriction sites.
Another part of the cell lysate was used to test the reactivity of the different mutants by ELISA. The lysates were diluted 20 times and added to microwell plates coated with the lectin GNA as described in example 6. Captured (mutant) E1 glycoproteins were left to react with 20-times diluted sera of 24 HCV-infected patients as described in example 6. Signal-to-noise values (OD of GLY#/OD of wt) for the six mutants and E1 are shown in Table 8. The table also shows the ratios between S/N values of GLY# and E1 proteins. It should be understood that using cell lysates of the different mutants to compare reactivity with patient sera may result in observations that are the consequence of different expression levels rather than reactivity levels. Such difficulties can be overcome by purification of the different mutants as described in example 5, and by testing identical quantities of all the different E1 proteins. However, the results shown in table 5 already indicate that removal of the 1st (GLY1), 3rd (GLY3), and 6th (GLY6) glycosylation motifs reduces reactivity of some sera, while removal of the 2nd and 5th sites does not. Removal of GLY4 seems to improve the reactivity of certain sera. These data indicate that different patients react differently to the glycosylation mutants of the present invention. Thus, such mutant E1 proteins may be useful for the diagnosis (screening, confirmation, prognosis, etc.) and prevention of HCV disease.
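The two quantities reported for each serum, the signal-to-noise value (OD of the antigen lysate divided by OD of the wt vaccinia lysate) and the ratio between the S/N values of a mutant and of unmodified E1, can be sketched as follows. The function names and all OD values are hypothetical and serve only to illustrate the arithmetic; they are not data from Table 8.

```python
def signal_to_noise(od_antigen: float, od_wt: float) -> float:
    """S/N value: OD obtained with the antigen lysate over OD with wt lysate."""
    if od_wt <= 0:
        raise ValueError("wt OD must be positive")
    return od_antigen / od_wt


def relative_reactivity(sn_mutant: float, sn_e1: float) -> float:
    """Ratio between the S/N value of a GLY# mutant and that of unmodified E1."""
    return sn_mutant / sn_e1


# Hypothetical ODs for one serum: wt lysate, unmodified E1, and one mutant.
od_wt, od_e1, od_gly4 = 0.05, 1.00, 1.50
sn_e1 = signal_to_noise(od_e1, od_wt)
sn_gly4 = signal_to_noise(od_gly4, od_wt)
print(round(sn_e1, 1), round(sn_gly4, 1), round(relative_reactivity(sn_gly4, sn_e1), 2))
```

A ratio above 1 would indicate a serum that reacts more strongly with the mutant than with unmodified E1, subject to the caveat in the text that lysates may differ in expression level.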
Example 9: Expression of HCV E2 protein in glycosylation-deficient yeasts
The E2 sequence corresponding to clone HCCL41 was provided with the α-mating factor pre/pro signal sequence and inserted in a yeast expression vector, and S. cerevisiae cells transformed with this construct secreted E2 protein into the growth medium. It was observed that most glycosylation sites were modified with high-mannose type glycosylations upon expression of such a construct in S. cerevisiae strains (Figure 45). This resulted in too high a level of heterogeneity and in shielding of reactivity, which is not desirable for either vaccine or diagnostic purposes. To overcome this problem, S. cerevisiae mutants with modified glycosylation pathways were generated by selection of vanadate-resistant clones. Such clones were analyzed for modified glycosylation pathways by analysis of the molecular weight and heterogeneity of the glycoprotein invertase. This allowed us to identify different glycosylation-deficient S. cerevisiae mutants. The E2 protein was subsequently expressed in some of the selected mutants and left to react with a monoclonal antibody as described in example 7, on western blot as described in example 4 (Figure 46).
Example 10. General utility

The present results show that not only a good expression system but also a good purification protocol is required to reach a high reactivity of the HCV envelope proteins with human patient sera. This can be obtained using the HCV envelope protein expression systems of the present invention, which guarantee the conservation of the natural folding of the protein, and/or the purification protocols of the present invention, which guarantee the elimination of contaminating proteins and preserve the conformation, and thus the reactivity, of the HCV envelope proteins. The amounts of purified HCV envelope protein needed for diagnostic screening assays are in the range of grams per year. For vaccine purposes, even higher amounts of envelope protein would be needed. Therefore, the vaccinia virus system may be used for selecting the best expression constructs and for limited upscaling, and large-scale expression and purification of single or specific oligomeric envelope proteins containing high-mannose carbohydrates may be achieved by expression in several yeast strains. In the case of hepatitis B, for example, manufacturing of HBsAg from mammalian cells was much more costly than manufacturing of yeast-derived hepatitis B vaccines.
The purification method disclosed in the present invention may also be used for 'viral envelope proteins' in general. Examples are those derived from Flaviviruses, the newly discovered GB-A, GB-B and GB-C Hepatitis viruses, Pestiviruses (such as Bovine Viral Diarrhoea Virus (BVDV), Hog Cholera Virus (HCV), Border Disease Virus), but also less related viruses such as Hepatitis B Virus (mainly for the purification of HBsAg).
The envelope protein purification method of the present invention may be used for intra- as well as extracellularly expressed proteins in lower or higher eukaryotic cells or in prokaryotes as set out in the detailed description section.
Table 1: Recombinant vaccinia plasmids and viruses

Plasmid name   Name             cDNA subclone construction       Length (nt/aa)   Vector used for insertion
pvHCV-13A      E1s              EcoR I - Hind III                472/157          pgptATA-18
pvHCV-12A      E1s              EcoR I - Hind III                472/158          pgptATA-18
pvHCV-9A       E1               EcoR I - Hind III                631/211          pgptATA-18
pvHCV-11A      E1s              EcoR I - Hind III                625/207          pgptATA-18
pvHCV-17A      E1s              EcoR I - Hind III                625/208          pgptATA-18
pvHCV-10A      E1               EcoR I - Hind III                783/262          pgptATA-18
pvHCV-18A      COREs            Acc I (KI) - EcoR I (KI)         403/130          pgptATA-18
pvHCV-34       CORE             Acc I (KI) - Fsp I               595/197          pgptATA-18
pvHCV-33       CORE-E1          Acc I (KI)                       1150/380         pgptATA-18
[illegible]    CORE-E1b.his     EcoR I - BamH I (KI)             1032/352         pMS-66
pvHCV-36       CORE-E1n.his     EcoR I - Nco I (KI)              1106/376         pMS-66
pvHCV-37       E1A              Xma I - BamH I                   711/239          pvHCV-10A
pvHCV-38       E1As             EcoR I - BstE II                 553/183          pvHCV-11A
pvHCV-39       E1Ab             EcoR I - BamH I                  960/313          pgsATA-18
[illegible]    E1Ab.his         EcoR I - BamH I (KI)             960/323          pMS-66
pvHCV-41       E2bs             BamH I (KI) - AlwN I (T4)        1005/331         pgsATA-18
pvHCV-42       E2bs.his         BamH I (KI) - AlwN I (T4)        1005/341         pMS-66
pvHCV-43       E2ns             Nco I (KI) - AlwN I (T4)         932/314          pgsATA-18
pvHCV-44       E2ns.his         Nco I (KI) - AlwN I (T4)         932/321          pMS-66
pvHCV-62       E1s (type 3a)    EcoR I - Hind III                625/207          pgsATA-18
pvHCV-63       E1s (type 5)     EcoR I - Hind III                625/207          pgsATA-18
pvHCV-64       E2               BamH I - Hind III                1410/463         pgsATA-18
[illegible]    E1-E2            BamH I - Hind III                2072/691         [illegible]
pvHCV-66       CORE-E1-E2       BamH I - Hind III                2427/809         pvHCV-33

Table 1 continued: Recombinant vaccinia plasmids and viruses

Plasmid name   Name             cDNA subclone construction       Length (nt/aa)   Vector used for insertion
pvHCV-81       E1*-GLY 1        EcoR I - BamH I                  783/262          pvHCV-10A
pvHCV-82       E1*-GLY 2        EcoR I - BamH I                  783/262          pvHCV-10A
pvHCV-83       E1*-GLY 3        EcoR I - BamH I                  783/262          pvHCV-10A
pvHCV-84       E1*-GLY 4        EcoR I - BamH I                  783/262          pvHCV-10A
[illegible]    E1*-GLY 5        EcoR I - BamH I                  783/262          pvHCV-10A
pvHCV-86       E1*-GLY 6        EcoR I - BamH I                  783/262          pvHCV-10A

nt: nucleotide
aa: amino acid
KI: Klenow DNA Pol filling
T4: T4 DNA Pol filling
Position: amino acid position in the HCV polyprotein sequence
Table 2. Summary of anti-E1 tests

S/N ± SD (mean anti-E1 titer)

       Start of treatment       End of treatment         Follow-up
LTR    6.94 ± 2.29 (1:3946)     4.48 ± 2.69 (1:568)      2.99 ± 2.69 (1:175)
NR     5.77 ± 3.77 (1:1607)     5.29 ± 3.99 (1:1060)     6.08 ± 3.73 (1:1978)

LTR: Long-term, sustained response for more than 1 year
NR: No response, response with relapse, or partial response

Table 3. Synthetic peptides for competition studies

PROTEIN   PEPTIDE       AMINO ACID SEQUENCE      POSITION   SEQ ID NO
E1        E1-31         LLSCLTVPASAYQVRNSTGL     181-200    56
E1        E1-33         QVRNSTGLYHVTNDCPNSSI     193-212    57
E1        [illegible]   NDCPNSSIVYEAHDAILHTP     205-224    58
E1        [illegible]   SNSSIVYEAADMIMHTPGCV     208-227    59
E1        E1-37         HDAILHTPGCVPCVREGNVS     217-236    60
E1        E1-39         CVREGNVSRCWVAMTPTVAT     229-248    61
E1        E1-41         AMTPTVATRDGKLPATQLRR     241-260    62
E1        E1-43         LPATQLRRHIDLLVGSATLC     253-272    63
E1        E1-45         LVGSATLCSALYVGDLCGSV     265-284    64
E1        E1-49         QLFTFSPRRHWTTQGCNCSI     289-308    65
E1        E1-51         TQGCNCSIYPGHITGHRMAW     301-320    66
E1        E1-53         ITGHRMAWDMMMNWSPTAAL     313-332    67
E1        [illegible]   NWSPTAALVMAQLLRIPQAI     325-344    68
E1        E1-57         LLRIPQAILDMIAGAHWGVL     337-356    69
E1        E1-59         AGAHWGVLAGIAYFSMVGNM     349-368    70
E1        E1-63         VVLLLFAGVDAETIVSGGQA     373-392    71
E2        E2-67         SGLVSLFTPGAKQNIQLINT     397-416    72
E2        E2-69         QNIQLINTNGSWHINSTALN     409-428    73
E2        E2-$3B        LNCNESLNTGWLAGLIYQHK     427-446    74
E2        E2-$1B        AGLIYQHKFNSSGCPERLAS     439-458    75
E2        E2-1B         GCPERLASCRPLTDFDQGWG     451-470    76
E2        E2-3B         TDFDQGWGPISYANGSGPDQ     463-482    77
E2        [illegible]   ANGSGPDQRPYCWHYPPKPC     475-494    78
E2        E2-7B         WHYPPKPCGIVPAKSVCGPV     487-506    79
E2        E2-9B         AKSVCGPVYCFTPSPVVVGT     499-518    80
E2        E2-11B        PSPVVVGTTDRSGAPTYSWG     511-530    81
E2        E2-13B        GAPTYSWGENDTDVFVLNNT     523-542    82
E2        E2-17B        GNWFGCTWMNSTGFTKVCGA     547-566    83
E2        E2-19B        GFTKVCGAPPVCIGGAGNNT     559-578    84
E2        E2-21         IGGAGNNTLHCPTDCFRKHP     571-590    85
E2        E2-23         TDCFRKHPDATYSRCGSGPW     583-602    86
E2        E2-25         SRCGSGPWITPRCLVDYPYR     595-614    87
E2        E2-27         CLVDYPYRLWHYPCTINYTI     607-626    88
E2        E2-29         PCTINYTIFKIRMYVGGVEH     619-638    89
E2        E2-31         MYVGGVEHRLEAACNWTPGE     631-650    90
E2        E2-33         ACNWTPGERCDLEDRDRSEL     643-662    91
E2        E2-35         EDRDRSELSPLLLTTTQWQV     655-674    92
Table 4. Change of envelope antibody levels over time (complete study, 28 patients)

Wilcoxon signed rank test (P values)

                       E1Ab NR   E1Ab NR   E1Ab NR   E1Ab LTR   E1Ab LTR   E1Ab LTR   E2Ab NR    E2Ab LTR
                       All       type 1b   type 3a   All        type 1b    type 3a
End of therapy*        0.1167    0.2604    0.285     0.0058**   0.043**    0.0499**   0.0186**   0.0640
6 months follow up*    0.86      0.7213    0.5930    [illegible] 0.043**   0.063      0.04326    0.0464**
12 months follow up*   0.7989    0.3105    1         0.0051**   0.0679     0.0277**   0.0869     0.0058**

* Data were compared with values obtained at initiation of therapy
** P values < 0.05

Table 5. Difference between LTR and NR (complete study)

Mann-Whitney U test (P values)

Columns: E1Ab S/N (All), E1Ab titers (All), E1Ab S/N (type 1b), E1Ab S/N (type 3a), E2Ab S/N (All)

Initiation of therapy: 0.0257*, 0.05, 0.68, 0.1078
End of therapy: 0.1742, 0.1295
6 months follow up: 1, 0.6099, 0.42 5, 0.3081
12 months follow up: 0.67, 0.23, 0.4386, 0.6629

* P values < 0.05
Table 6. Competition experiments between murine E2 monoclonal antibodies

Decrease of anti-E2 reactivity of biotinylated anti-E2 mabs

Biotinylated mabs: 17H10F4D10, 2F10H10, 16A6E7, 10D3C4, 4H6B2, 17C2F2, 9G3E6, 12D11F1, 15C8C1, 8G10D1H9
Competitors: 17H10F4D10, 2F10H10, 16A6E7, 10D3C4, 4H6B2, 17C2F2, 9G3E6, 12D11F1, 15C8C1, 8G10D1H9
Competitor controls: 15B7A2, 5H6A7, 23C12H9

ND, not done
Table 7. Primers

SEQ ID NO. 96    GPT     5'-GTTTAACCACTGCATGATG-3'
SEQ ID NO. 97    TKR     5'-GTCCCATCGAGTGCGGCTAC-3'
SEQ ID NO. 98    GLY1    5'-CGTGACATGGTACATTCCGGACACTTGGCGCACTTCATAAGCGGA-3'
SEQ ID NO. 99    GLY2    5'-TGCCTCATACACAATGGAGCTCTGGGACGAGTCGTTCGTGAC-3'
SEQ ID NO. 100   GLY3    5'-TACCCAGCAGCGGGAGCTCTGTTGCTCCCGAACGCAGGGCAC-3'
SEQ ID NO. 101   GLY4    5'-TGTCGTGGTGGGGACGGAGGCCTGCCTAGCTGCGAGCGTGGG-3'
SEQ ID NO. 102   GLY5    5'-CGTTATGTGGCCCGGGTAGATTGAGCACTGGCAGTCCTGCACCGTCTC-3'
SEQ ID NO. 103   GLY6    5'-CAGGGCCGTTGTAGGCCTCCACTGCATCATCATATCCCAAGC-3'
SEQ ID NO. 104   OVR1    5'-CCGGAATGTACCATGTCACGAACGAC-3'
SEQ ID NO. 105   OVR2    5'-GCTCCATTGTGTATGAGGCAGCGG-3'
SEQ ID NO. 106   OVR3    5'-GAGCTCCCGCTGCTGGGTAGCGC-3'
SEQ ID NO. 107   OVR4    5'-CCTCCGTCCCCACCACGACAATACG-3'
SEQ ID NO. 108   OVR5    5'-CTACCCGGGCCACATAACGGGTCACCG-3'
SEQ ID NO. 109   OVR6    5'-GGAGGCCTACAACGGCCCTGGTGG-3'
SEQ ID NO. 110   GPT-2   5'-TTCTATCGATTAAATAGAATTC-3'
SEQ ID NO. 111   TKR-2   5'-GCCATACGCTGACAGCCGATCCC-3'

Nucleotides underlined represent the additional restriction site.
Nucleotides in bold represent mutations with respect to the original HCCl10A sequence.

Table 8

SERUM     S/N GLY1   S/N GLY2   S/N GLY3   S/N GLY4   S/N GLY5   S/N GLY6   S/N E1
1         1.802462   2.400795   1.642718   2.578154   2.482051   2.031487   2.828205
2         2.120971   1.76818    1.715477   3.824038   1.79376    1.495737   2.227036
3         1.403871   2.325495   2.261646   3.874605   2.409344   2.131613   2.512792
4         1.205597   2.639308   2.354748   1.499387   2.627358   2.527925   2.790881
5         2.120191   2.459019   1.591818   3.15       1.715311   2.494833   3.131579
6         2.866913   5.043993   4.833742   4.71302    4.964765   4.784027   4.869128
7         1.950345   2.146302   1.96692    4.198751   2.13912    2.02069    2.287753
8         1.866183   1.595477   1.482099   3.959542   1.576336   1.496489   1.954198
9         1.730193   1.688973   1.602222   3.710507   1.708937   1.704976   1.805556
10        2.468162   2.482212   2.191558   5.170841   3.021807   2.677757   2.616822
11        1.220654   1.467582   1.464216   4.250784   1.562092   1.529608   1.55719
12        1.629403   2.070524   1.721164   3.955153   2.07278    1.744221   2.593886
13        5.685561   7.556682   7.930538   8.176816   8.883408   8.005561   8.825112
14        3.233604   2.567613   2.763055   6.561122   2.940334   2.499952   3.183771
15        3.763498   3.621928   3.016099   5.707668   3.125561   2.621704   3.067265
16        1.985105   3.055649   2.945628   5.684498   3.338912   2.572385   3.280335
17        2.317721   2.933792   2.515305   5.604813   2.654224   2.363301   2.980354
18        6.675179   7.65433    5.775357   6.4125     5.424107   5.194107   7.191964
19        1.93476    2.127712   1.980185   3.813321   2.442804   1.506716   2.771218
20        2.47171    2.921288   2.557384   3.002535   3.126761   2.665433   3.678068
21        4.378633   4.680101   4.268633   4.293038   4.64557    2.781063   5.35443
22        1.188748   1.150781   0.97767    2.393011   1.153656   1.280743   1.167286
23        2.158889   1.661914   1.336775   3.68213    1.817901   1.475062   2.083333
24        1.706992   1.632785   1.20376    2.481585   1.638211   1.716423   1.78252
Sum S/N   59.88534   69.65243   62.09872   102.6978   69.26511   61.32181   76.54068
Average   2.495223   2.902185   2.587447   4.279076   2.886046   2.555075   3.189195

SERUM     GLY1/E1    GLY2/E1    GLY3/E1    GLY4/E1    GLY5/E1    GLY6/E1
1         0.637316   0.848876   0.580834   0.911587   0.877607   0.718296
2         0.952374   0.793961   0.770296   1.717097   0.805447   0.671626
3         0.55869    0.925463   0.900053   1.541952   0.958831   0.848305
4         0.431977   0.94569    0.84373    0.537245   0.941408   0.90578
5         0.677036   0.785233   0.508312   1.005882   0.547746   0.796669
6         0.588794   1.035913   0.992733   0.967939   1.019642   0.982522
7         0.852516   0.93817    0.859761   1.835317   0.935031   0.883264
8         0.954961   0.816436   0.758418   2.026172   0.806641   0.765781
9         0.958261   0.935431   0.887385   2.05505    0.946488   0.944294
10        0.94319    0.94856    0.837488   1.976      1.154762   1.023286
11        0.783882   0.942455   0.940294   2.72978    1.003148   0.982288
12        0.628171   0.798232   0.663547   1.524798   0.799102   0.672435
13        0.644248   0.85627    0.898633   0.92654    1.006606   0.907134
14        1.015652   0.806469   0.867856   2.060802   0.923538   0.785217
15        1.226988   1.180833   0.983319   1.860833   1.019006   0.854737
16        0.605153   0.931505   0.897966   1.732902   1.017857   0.784184
17        0.777666   0.984377   0.843962   1.880587   0.890574   0.79296
18        0.928144   1.064289   0.803029   0.89162    0.75419    0.72221
19        0.698162   0.76779    0.714554   1.376045   0.881491   0.543702
20        0.672013   0.794245   0.695306   0.816335   0.850109   0.724683
21        0.817759   0.874061   0.797215   0.801773   0.867612   0.519395
22        1.018386   0.98586    0.837558   2.050064   0.988323   1.097197
23        1.036267   0.797719   0.641652   1.767422   0.872593   0.70803
24        0.957628   0.915998   0.675314   1.392178   0.919042   0.962919
Sum GLY#/E1       19.36524   21.67384   19.19921   36.38592   21.78679   19.59691
Average GLY#/E1   0.806885   0.903077   0.799967   1.51608    0.907783   0.816538
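The relation between the mutagenic GLY primers and the OVR primers of Table 7 can be checked with a short script. The sketch below assumes that each OVR primer pairs with the GLY primer of the same number and that the two primers are complementary at their 5' ends, as would be expected for overlap-extension mutagenesis; this pairing is an assumption, and the function names are illustrative. The sequences are copied from Table 7 with extraction spaces removed.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences (5' to 3') copied from Table 7.
GLY1 = "CGTGACATGGTACATTCCGGACACTTGGCGCACTTCATAAGCGGA"
OVR1 = "CCGGAATGTACCATGTCACGAACGAC"
GLY2 = "TGCCTCATACACAATGGAGCTCTGGGACGAGTCGTTCGTGAC"
OVR2 = "GCTCCATTGTGTATGAGGCAGCGG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overlap(primer_a, primer_b):
    """Length of the longest 5' stretch over which the two primers are
    exactly complementary (antiparallel); 0 if there is none."""
    best = 0
    for k in range(1, min(len(primer_a), len(primer_b)) + 1):
        if reverse_complement(primer_a[:k]) == primer_b[:k]:
            best = k
    return best


for name_a, a, name_b, b in [("GLY1", GLY1, "OVR1", OVR1),
                             ("GLY2", GLY2, "OVR2", OVR2)]:
    print(f"{name_a}/{name_b}: {five_prime_overlap(a, b)} nt overlap")
```

For the two pairs shown, the first 20 nucleotides of each GLY primer are the exact reverse complement of the first 20 nucleotides of the corresponding OVR primer, which is consistent with the assumed pairing.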
REFERENCES
Bailey, J. and Cole, R. (1959) J. Biol. Chem. 234, 1733-1739.
Ballou, Hitzeman, Lewis, M. Ballou, C. (1991) PNAS 88, 3209-3212.
Benesch, Benesch, Gutcho, M. Lanfer, L. (1956) Science 123, 981.
Cavins, J. Friedman. (1970) Anal. Biochem. 35, 489.
Cleland, W. (1964) Biochemistry 3, 480 Creighton E. (1988) BioEssays 8, 57 Darbre, John Wiley Sons Ltd. (1987) Practical Protein Chemistry
A
Handbook.
Darbre, John Wiley Sons Ltd. (1987) Practical Proteinchemistry p.69-79.
Doms et al, (1993), Virology 193, 545-562.
Ellman, G. (1959) Arch. Biochem. Biohys. 82, Falkner, F. Moss, B. (1988) J. Virol. 62, 1849-1854.
Friedman, M. Krull. (1969) Biochem. Biophys. Res. Commun. 37, 630.
Gallagher J. (1988) J. Cell Biol. 107, 2059-2073.
Glazer, Delange, Sigman, D. (1975) North Holland publishing company, EI evi, .Bimedical. UPar. FrModification of protein 1 6).
Graham, F. van der Eb, A. (1973) Virology 52, 456-467.
Grakoui et al. (1993) Journal of Virology 67:1385-1395.
WO 96/04385 PCT/E-P95/0303 1 74 Grassetti, D. Murray, J. (1969) Analyt. Chim. Acta. 46, 139.
Grassetti, D. Murray, J. (1967) Arch. Biochem Biophys. 119, 41.
Helenius, Mol. Biol. Cell (1994), 5: 253-265.
Hijikata, Kato, Ootsuyama, Nakagawa, M. Shimotohno, K. (1 991) Proc.
NatI. Acad. Sci. U.S.A. 88(13):5547-51.
Hochuli, Bannwarth, Dbbeli, Gentz, Stfiber, D. (1988) Biochemistry 88, 8976.
Hsu, Donets, Greenberg, H. Feinstone, S. (1993) Hepatology 17:763-771.
Inoue, Suzuki, Matsuura, Harada, Chiba, Watanabe, Saito, I. Miyamura, T. (1992) J. Gen. Virol. 73:2151-2154.
Janknecht, de Martynoff, G. et (199 1) Proc. NatI. Acad. Sci. USA 88, 8972- 8976.
Kayman (1991) J. Virology 65, 5323-5332.
Kato, Oostuyama, Tanaka, Nakagawa, Muraiso, Ohkoshi,
S.,
Hijikata, Shimitohno, K. (1992) Virus Res. 22:107-123.
Kniskern, Hagopian, Burke, Schultz, Montgomery, Hurni, Yu Ip, Schulman, Maigetter, Wampler, Kubek, Sitrin, West, Ellis, Miller, W. (1994) Vaccine 12:1021-1025.
Kohara, Tsukiyama-Kohara, Maki, Asano, Yoshizawa, Miki, K., I-11 MatsLuura d, Saito, M~tiyamura, T. Nomoto, A. (1 992) J. Gen. Virol. 73:2313-2318.
Mackett, Smith, G. Moss, B. (1985) In: 'DNA cloning: a practical approach' (Ed. Glover, IRL Press, Oxford.
I
WO 96/04385 PCT/EP95/03031 Mackett, Smith, G. (1986) J. Gen. Virol. 67, 2067-2082.
Mackett, Smith, G. Moss, B. (1984) J. Virol. 49, 857-864.
Mackett, Smith, G. Moss, B. (1984) Proc. Natl. Acad. Sci. USA 79, 7415- 7419.
Means, G. (1971) Holden Day, Inc.
Means, G. Feeney, R. (1971) Holden Day p.105 p. 217.
Mita, Hayashi, Ueda, Kasahara, Fusamoto, Takamizawa, A., Matsubara, Okayama, H. Kamada T. (1992) Biochem. Biophys. Res. Comm.
183:925-930.
Moore, S. (1963) J. Biol. Chem. 238, 235-237.
Okamoto, Okada, Sugiyama, Yotsumoto, Tanaka, Yoshizawa, H., Tsuda, Miyakawa, Y. Mayumi, M. (1990) Jpn. J. Exp. Med. 60:167-177.
Panicali Paoletti (1982) Proc. Natl. Acad. Sci. USA 79, 4927-4931.
Piccini, Perkus, M. Paoletti, E. (1987) Meth. Enzymol. 153, 545-563.
Rose (1988) Annu. Rev. Cell Biol. 1988, 4: 257-288; Ruegg, V. and Rudinger, J. (1977) Methods Enzymol. 47, 111-116.
Shan, S. Wong (1993) CRC-press p. 30-33.
Spaete, Alexander, Rugroden, Choo, Berger, Crawford, Kuo, Leng, Lee, Ralston, et al. (1992) Virology 188(2):819-30.
Skehel, (1984) Proc. Natl. Acad. Sci. USA 81, 1179-1783.
WO 96/04385 PCTJEP95/0303 1 76 Stunnenberg, Lange, Philipson, Miitenburg, R. van der Viiet, R. (1 988) Nuci. Acids Res. 16, 2431-2444.
Stuyver, Van Arnhemn, Wyseur, DeLeys, R. Maertens, G. (1 9 93a) Biochem. Biophys. Res. Commun. 192, 635-641.
Stuyver, Rossau, Wyseur, Duhamel, Vanderborght, Van Heuverswyn, Maertens, G. (1 993b) J. Gen. Virol. 74, 1093-1102.
Stuyver, Van Arnhem, Wyseur, Hernandez, Delaporte, Maertens, G. (1994), Proc. Natl. Acad. Sci. USA 91:10134-10138.
Weil, L. Seibler, S. (196 1) Arch. Biochem. Biophys. 95, 470.
Yokosuka, Ito, Imazeki, Ohto, M. Omata, M. (1 992) Biochem. Biophys.
Res. Commun. 189:565-571.
Miller P, Yano J, Yano E, Carroll C, Jayaram K, Ts'o P (1 979) Biochemistry 18:5134-43.
Nielsen P, Egholm M, Berg R, Buchardt 0 (1991) Science 254:1497-500.
Nielsen P, Egholm M, Berg R, Buchardt 0 (1993) Nucleic-Acids-Res. 21:197-200.
Asseline U, Delarue M, Lancelot G, Toulme F, Thuong N (1984) Proc. NatI. Acad.
Sd. USA 81 :3297-301.
Matsukura M, Shinozuka K, Zon G, Mitsuya H, Reitz M, Cohen J, Broder S (1987) Proc. NatI. Acad. Sd. USA 84:7706-10.
SEQUENCE LISTING

GENERAL INFORMATION:

APPLICANT:
NAME: Innogenetics N.V.
STREET: Industriepark Zwijnaarde 7 Bus 4
CITY: Gent
COUNTRY: Belgium
POSTAL CODE (ZIP): 9052
TELEPHONE: 00-32-09.241.07.11
TELEFAX: 00-32-09.241.07.99

(ii) TITLE OF INVENTION: Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use.
(iii) NUMBER OF SEQUENCES: 111
(iv) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.25 (EPO)

CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/EP/95/03031

INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGCATGCAAG CTTAATTAAT T                                              21

INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 68 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCGGGGAGGC CTGCACGTGA TCGAGGGCAG ACACCATCAC CACCATCACT AATAGTTAAT    60
TAACTGCA                                                             68

INFORMATION FOR SEQ ID NO: 3:
SEQUENCE CHARACTERISTICS:
LENGTH: 642 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..639
(ix) FEATURE:
NAME/KEY: mat_peptide
LOCATION: 1..636
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA CTG TCC TGT    48
Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1 5 10 15
CTG ACC ATT CCA GCT TCC GCT TAT GAG GTG CGC AAC GTG TCC GGG ATG    96
Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met
20 25 30
TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA    144
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala
35 40 45
GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG    192
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
50 55 60
AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG CTC GCA GCT    240
Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
65 70 75 80
AGG AAC GCC AGC GTC CCC ACC ACG ACA ATA CGA CGC CAC GTC GAT TTG    288
Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu
85 90 95
CTC GTT GGG GCG GCT GCT CTC TGT TCC GCT ATG TAC GTG GGG GAT CTC    336
Leu Val Gly Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Leu
100 105 110
TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC TCG CCT CGC    384
Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
115 120 125
CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC    432
Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His
130 135 140
ATA ACA GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT    480
Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro
145 150 155 160
ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC CGG ATC CCA CAA GCT GTC    528
Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val
165 170 175
GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG GGC CTC GCC    576
Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala
180 185 190
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT GTG ATG CTA    624
Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu
195 200 205
CTC TTT GCT CTC TAATAG                                             642
Leu Phe Ala Leu
210

INFORMATION FOR SEQ ID NO: 4:
SEQUENCE CHARACTERISTICS:
LENGTH: 212 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
1 5 10 15
Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Met
20 25 30
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala
35 40 45
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
50 55 60
Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
65 70 75 80
Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu
85 90 95
Leu Val Gly Ala Ala Ala Leu Cys Ser Ala Met Tyr Val Gly Asp Leu
100 105 110
Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
115 120 125
Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His
130 135 140
Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro
145 150 155 160
Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro Gln Ala Val
165 170 175
Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala
180 185 190
Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu
195 200 205
Leu Phe Ala Leu
210

INFORMATION FOR SEQ ID NO: 5:
SEQUENCE CHARACTERISTICS:
LENGTH: 795 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..792
(ix) FEATURE:
NAME/KEY: mat_peptide
LOCATION: 1..789
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATG TTG GGT AAG GTC ATC GAT ACC CTT
Met Leu Gly Lys Val Ile Asp Thr Leu
1
GTG
Val
GCC
Ala
ACA
Thr
CTG
Leu
TCC
Ser
TAT
Tyr
GTT
Val
CTC
Leu
GTC
Val 145
ATT
Ile
CAT
His
TTG
Leu
CTG
Leu
TAC
Tyr
GCG
Ala 100
AAC
Asn
AGG
Arg Leu 5
CCG
Pro
GGC
Gly
CCC
Pro
ACC
Thr
CAT
His
GAC
Asp
AAC
Asn
AAC
Asn Val
CTC
Leu
GTC
Val
GGT
Gly
GTT
Val 70
GTC
Val
ATG
Met
TCT
Ser
GCC
Ala Gly 150
GGC
Gly
GTT
Val
TCT
Ser
GCT
Ala
AAC
Asn
ATG
Met
CGC
Arg 120
GTC
Val Ala Ala
ACA
Thr 10
CCC
Pro
GAG
Glu
TCT
Ser
GCT
Ala
TGC
Cys 90
ACC
Thr
TGG
Trp
ACC
Thr
TTC
Phe Cys Gly Phe Ala
CTA
Leu
GAC
Asp
ATC
Ile
TAT
Tyr 75
TCC
Ser
CCC
Pro
GTA
Val
ACG
Thr
TGT
Cys 155 GGG -GGC Gly Gly GGC GTG Gly Val TTC CTC Phe Leu GAA GTG Glu Val AAC TCA Asn Ser GGG TGC Gly Cys GCG CTC Ala Leu 125 ACA ATA Thr Ile 140 TCC GCT Ser Ala TGC GGC TTC GCC GAC CTC Asp
GCC
Ala
TAT
Tyr
GCT
Ala
AAC
Asn
ATT
Ile
CCC
Pro
CCC
Pro
CGC
Arg
TAC
Tyr Leu
AGG
Arg
GCA
Ala
TTG
Leu
GTG
Val
GTG
Val
TGC
Cys
ACG
Thr
CAC
His
GTG
Val 160 96 144 192 240 288 336 384 432 GGG GAC CTC TGC GGA TCT GTC Gly Asp Leu Cys Gly Ser Val 165 TTC CTC GTC TCC CAG CTG TTC Phe Leu Val Ser Gln Leu Phe ACC ATC Thr Ile 175 81 TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys 180 185 CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met 195 200 205 TGG TCG CCT ACA ACG GCC CTG GTG GTA TCG CAG CTG CTC Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gin Leu Leu 210 215 220 CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA Gin Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly 225 230 235 GGT CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys 245 250 GTG ATG CTA CTC TTT GCT CCC TAATAG Val Met Leu Leu Phe Ala Pro 260 INFORMATION FOR SEQ ID NO: 6: SEQUENCE CHARACTERISTICS: LENGTH: 263 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: TCA ATC Ser Ile 190 ATG ATG Met Met CGG ATC Arg Ile GTC CTG Val Leu GTT TTG Val Leu 255
TAT
Tyr
AAC
Asn
CCA
Pro
GCG
Ala 240
ATT
Ile 576 624 672 720 768 Met Leu Gly Lys Val Ile Asp Thr Leu Val Ala Thr Leu Ser Tyr Val Leu Val 145 Gly Leu Gly Ser Gly Glu Arg Ala 130 Asp Tyr Ile Ala His Asn Leu Cys Leu Met Tyr Ala Ala 100 Glu Asn 115 Ala Arg Leu Leu Leu Val Gly Val 70 Val Met Ser Ala Gly 150 Gly Val 40 Ser Ala Asn Met Arg 120 Val Ala Thr Pro Glu Ser Ala Cys Thr Trp Thr Phe Cys Gly Phe Ala Leu Asp Ile Tyr 75 Ser Pro Val Thr Cys 155 Gly Gly Phe Glu Asn Gly Ala Thr 140 Ser Gly Ala Val Asn Leu Leu Val Arg Ser Ser Cys Val 110 Leu Thr 125 Ile Arg Ala Met Asp Leu Ala Arg Tyr Ala Ala Leu Asn Val Ile Val Pro Cys Pro Thr Arg His Tyr Val 160 Gly Asp Leu Cys Ser Val Phe Leu Val Ser Gin Leu Phe Thr Ile 175 82 Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr 180 185 190 Pro Gly His Iie Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 195 200 205 Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gin Leu Leu Arg Ile Pro 210 215 220 Gin Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala 225 230 235 240 Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Ile 245 250 255 Val Met Leu Leu Phe Ala Pro 260 INFORMATION FOR SEQ ID NO: 7: SEQUENCE CHARACTERISTICS: LENGTH: 633 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..630 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..627 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: ATG TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC GGC TTC GCC GAC CTC 48 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGT GCT GCC AGA 96 Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 GCC CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA 144 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 ACA GGG AAT TTG CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA 192 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55 CTG TCC TGT CTG ACC 
ATT CCA GCT TCC GCT TAT GAG GTG CGC AAC GTG 240 Lu Ser Cs Leu Thr Ile Pro Ala Ser Ala Tyr Glu val Arg Asn Val 70 75 TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 83 TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro 100 105 GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val 115 120 CTC GCA GCT AGG AAC GCC AGC GTC CCC ACT ACG Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr 130 135 GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys 145 150 155 GGG GAT CTC TGC GGA TCT GTC TTC CTC GTC TCC Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser 165 170 TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys 180 185 CCC GGC CAC ATA ACA GGT CAC CGT ATG GCT TGG Pro Gly His Ile Thr Gly His Arg Met Ala Trp 195 200 TGG TAATAG Trp 210 INFORMATION FOR SEQ ID NO: 8: SEQUENCE CHARACTERISTICS: LENGTH: 209 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys 1 5 10 Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu 25 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 40 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile 55 Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr 70 75 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser 90 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro 100 105 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val 115 120 Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr GGG TGC I Gly Cys GCG CTC Ala Leu 125 ACA ATA Thr Ile 140 TCC GCT Ser Ala CAG CTG Gin Leu AAT TGC Asn Cys GAT ATG Asp Met 205
GTG
Val 110
ACC
Thr
CGA
Arg
ATG
Met
TTC
Phe
TCA
Ser 190
ATG
Met CCC TGC Pro Cys CCC ACG Pro Thr CGC CAC Arg His TAC GTG Tyr Val 160 ACC ATC Thr Ile 175 ATC TAT Ile Tyr ATG AAC Met Asn 336 384 432 480 528 576 624 Gly Gly Gly Phe 31u Asn Gly Ala rhr Phe Ala Gly Ala Val Asn Leu Leu Val Arg Ser Ser Cys Val 110 Leu Thr 125 Ile Arg Asp Leu Ala Arg Tyr Ala Ala Leu Asn Val Ile Val Pro Cys Pro Thr Arg His 84 130 135 140 Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 145 150 155 160 Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Ile 165 170 175 Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr 180 185 190 Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 195 200 205 Trp INFORMATION FOR SEQ ID NO: 9: SEQUENCE CHARACTERISTICS: LENGTH: 483 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..480 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..477 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: ATG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCC CTG CTG TCC TGT 48 Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys 1 5 10 CTG ACC ATA CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC GGG GTG 96 Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Val 25 TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATA GTG TAT GAG GCA 144 Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala 40 GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT CGG GAG 192 Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu 55 GGC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT CCC ACG CTC GCG GCC 240 Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala 70 75 AGG AAC GCC AGC GTC CCC ACA ACG ACA ATA CGA CGC CAC GTC GAT TTG 288 Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu 90 CTC GTT GGG GCT GCT GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC 336 Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu 85 100
TGC
Cys
CGG
Arg
GTA
Val 145 (2) GGA TCT GTT TTC CTT GTT TCC CAG CTG TTC A Gly Ser Val Phe Leu Val Ser Gin Leu Phe T 115 120 CAT CAA ACA GTA CAG GAC TGC AAC TGC TCA A His Gin Thr Val Gin Asp Cys Asn Cys Ser I 130 135 1 TCA GGT CAC CGC ATG GCT TGG GAT ATG ATG A Ser Gly His Arg Met Ala Trp Asp Met Met M 150 155 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 159 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110 LCC TTC TCA CCT CGC 'hr Phe Ser Pro Arg 125 ,TC TAT CCC GGC CAT le Tyr Pro Gly His TG AAC TGG TCC TAATAG et Asn Trp Ser 160 384 432 483 Met Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser 1 Leu Tyr Ala Gly Arg Leu Cys Arg Val 145 Ala Asn Met Arg Val Ala Phe Val Arg 10 'al Arg Asn Val Ser Gly ;er Ser Ile Val Tyr Glu ys Val Pro Cys Val Arg eu Thr Pro Thr Leu Ala le Arg Arg His Val Asp 90 .la Met Tyr Val Gly Asp 110 eu Phe Thr Phe Ser Pro 125 ys Ser Ile Tyr Pro Gly 140 et Met Met Asn Trp Ser 155 INFORMATION FOR SEQ ID NO: 11: SEQUENCE CHARACTERISTICS: LENGTH: 480 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO
_E
86 ;iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..477 (ix) FEATURE: NAME/KEY: mat_pe LOCATION: 1..474 ptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: ATG TCC GGT TGC Met Ser Gly Cys TCT TTC TCT ATC TTC CTC Ser Phe Ser Ile Phe 1
CTG
Leu
TAC
Tyr
GCG
Ala
GGC
Gly
AGG
Arg
CTC
Leu
TGC
Cys
CGG
Arg
GTA
Val 145
ACC
Thr
CAT
His
GAC
Asp
AAC
Asn
AAC
Asn
GTT
Val
GGA
Gly
CAT
His 130
TCA
Ser
ATA
Ile
GTC
Val
ATG
Met
TCC
Ser
GCC
Ala
GGG
Gly
TCT
Ser 115
CAA
Gin
GGT
Gly 5 CCA GCT Pro Ala ACG AAC Thr Asn ATC ATG Ile Met TCC CGT Ser Arg AGC GTC Ser Val GCT GCT Ala Ala 100 GTT TTC Val Phe ACA GTA Thr Val CAC CGC His Arg TCC GCT TAT GAA Ser Ala GAC TGC Asp Cys CAC ACC His Thr 55 TGC TGG Cys Trp 70 CCC ACA Pro Thr GCT TTC Ala Phe CTT GTT Leu Val CAG GAC Gin Asp 135 ATG GCT Met Ala 150 Tyr
TCC
Ser 40
CCC
Pro
GTG
Val
ACG
Thr
TGT
Cys
TCC
Ser 120
TGC
Cys Glu 25
AAC
Asn
GGG
Gly
GCG
Ala
ACA
Thr
TCC
Ser 105
CAG
Gln
AAC
Asn Leu
GTG
Val
TCA
Ser
TGC
Cys
CTC
Leu
ATA
Ile 90
GCT
Ala
CTG
Leu
TGC
Cys TTG GCC Leu Ala
CGC
Arg
AGC
Ser
GTG
Val
ACT
Thr 75
CGA
Arg
ATG
Met
TTC
Phe
TCA
Ser
AAC
Asn
ATA
Ile
CCC
Pro
CCC
Pro
CGC
Arg
TAC
Tyr
ACC
Thr
ATC
Ile 140
GTG
Val
GTG
Val
TGC
Cys
ACG
Thr
CAC
His
GTG
Val
TTC
Phe 125
TAT
Tyr
TCC
Ser
TAT
Tyr
GTT
Val
CTC
Leu
GTC
Val
GGG
Gly 110
TCA
Ser
CCC
Pro
GGG
Gly
GAG
Glu
CGG
Arg
GCG
Ala
GAT
Asp
GAT
Asp
CCT
Pro
GGC
Gly CTG CTG TCC TGT Leu Leu Ser Cys
GTG
Val
GCA
Ala
GAG
Glu
GCC
Ala
TTG
Leu
CTC
Leu
CGC
Arg
CAT
His 96 144 192 240 288 336 384 432 480 TGG GAT ATG Trp Asp Met ATG ATG AAC TGG TAATAG Met Met Asn Trp 155 INFORMATION FOR SEQ ID NO: 12: SEQUENCE CHARACTERISTICS: LENGTH: 158 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: Met Ser Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys 1 5 10 Leu Tyr Ala Gly Arg Leu Cys Arg Val 145 Thr Ile His Val Asp Met Asn Ser Asn Ala Val Gly Gly Ser 115 His Gin 130 Ser Gly Pro Ala Thr Asn Ile Met Ser Arg Ser Val Ala Ala 100 Val Phe Thr Val His Arg Ser Asp His Cys 70 Pro Ala Leu Gin Met 150 Ala Tyr Cys Ser 40 Thr Pro 55 Trp Val Thr Thr Phe Cys Val Ser 120 Asp Cys 135 Ala Trp 87 Glu Val Arg 25 Asn Ser Ser Gly Cys Val Ala Leu Thr 75 Thr Ile Arg 90 Ser Ala Met 105 Gin Leu Phe Asn Cys Ser Asp Met Met 155 Asn Val Ser Gly Val Ile Val Tyr Glu Ala Pro Cys Val Arg Glu Pro Thr Leu Ala Ala Arg His Val Asp Leu Tyr Val Gly Asp Leu 110 Thr Phe Ser Pro Arg 125 Ile Tyr Pro Gly His 140 Met Asn Trp (2) INFORMATION FOR SEQ ID NO: 13: SEQUENCE CHARACTERISTICS: LENGTH: 636 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..633 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..630 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: CTG GGT AAG GCC ATC GAT ACC CTT ACG TGC GGC TTC GCC GAC CTC Leu Gly Lys Ala Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 5 10 GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 CTG GCG CAT GGC GTC CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 GGG AAT TTG CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTA Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55
ATG
Met 1
GTG
Val
GCC
Ala
ACA
Thr 88 CTG TCC TGT CTA ACC ATT CCA GCT TCC GCT TAC GAG GTG CGC AAC GTG 240 Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 110 GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 115 120 125 CTC GCG GCT AGG AAC GCC AGC ATC CCC ACT ACA ACA ATA CGA CGC CAC 432 Leu Ala Ala Arg Asn Ala Ser Ile Pro Thr Thr Thr Ile Arg Arg His 130 135 140 GTC GAT TTG CTC GTT GGG GCG GCT GCT TTC TGT TCC GCT ATG TAC GTG 480 Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 145 150 155 160 GGG GAT CTC TGC GGA TCT GTC TTC CTC GTC TCC CAG CTG TTC ACC ATC 528 Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Ile 165 170 175 TCG CCT CGC CGG CAT GAG ACG GTG CAG GAC TGC AAT TGC TCA ATC TAT 576 Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr 180 185 190 CCC GGC CAC ATA ACG GGT CAC CGT ATG GCT TGG GAT ATG ATG ATG AAC 624 Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 195 200 205 TGG TAC TAATAG 640 Trp Tyr 640 210 INFORMATION FOR SEQ ID NO: 14: SEQUENCE CHARACTERISTICS: LENGTH: 210 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: Met Leu Gly Lys Ala Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 Thr Gly Asn Leu Pro Gly Cys ser Phe Ser Ile Phe Leu Leu Ala Leu 55 Leu Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val I I 89 90 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 
110 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 115 120 125 Leu Ala Ala Arg Asn Ala Ser Ile Pro Thr Thr Thr Ile Arg Arg His 130 135 140 Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 145 150 155 160 Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile 165 170 175 Ser Pro Arg Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr 180 185 190 Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 195 200 205 Trp Tyr 210
INFORMATION FOR SEQ ID NO: 15: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: ATGCCCGGTT GCTCTTTCTC TATCTT 26
INFORMATION FOR SEQ ID NO: 16: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: ATGTTGGGTA AGGTCATCGA TACCCT 26
INFORMATION FOR SEQ ID NO: 17: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: CTATTAGGAC CAGTTCATCA TCATATCCCA 30
INFORMATION FOR SEQ ID NO: 18: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: CTATTACCAG TTCATCATCA TATCCCA 27
INFORMATION FOR SEQ ID NO: 19: SEQUENCE CHARACTERISTICS: LENGTH: 36 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: ATACGACGCC ACGTCGATTC CCAGCTGTTC ACCATC 36
INFORMATION FOR SEQ ID NO: 20: SEQUENCE CHARACTERISTICS: LENGTH: 36 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: GATGGTGAAC AGCTGGGAAT CGACGTGGCG TCGTAT 36
INFORMATION FOR SEQ ID NO: 21: SEQUENCE CHARACTERISTICS: LENGTH: 723 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..720 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..717 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ATG
Met 1
GTG
Val
GCC
Ala
ACA
Thr
CTG
Leu
TCC
Ser
TAT
Tyr
GTT
Val
CTC
Leu TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly
GGG
Gly
CTG
Leu
GGG
Gly
TCC
Ser
GGG
Gly
GAG
Glu
CGG
Arg
GCA
Ala 130
TAC
Tyr
GCG
Ala
AAT
Asn
TGT
Cys
ATG
Met
GCA
Ala
GAG
Glu 115
GCT
Ala
ATT
Ile
CAT
His
TTG
Leu
CTG
Leu
TAC
Tyr
GCG
Ala 100
AAC
Asn
AGG
Arg
CTC
Leu
GTC
Val
GGT
Gly
GTT
Val 70
GTC
Val
ATG
Met
TCT
Ser
GCC
Ala
GTC
Val
CGG
Arg
TGC
Cys 55
CCA
Pro
ACG
Thr
ATC
Ile
TCC
Ser
AGC
Ser 135
GCC
Ala 25
CTG
Leu
TTC
Phe
TCC
Ser
GAC
Asp
CAC
His 105
TGC
Cys
CCC
Pro
CTA
Leu
GAC
Asp
ATC
Ile
TAT
Tyr 75
TCC
Ser
CCC
Pro
GTA
Val
ACG
Thr
GGG
Gly
GGC
Gly
TTC
Phe
GAA
Glu
AAC
Asn
GGG
Gly
GCG
Ala
ACA
Thr 140 TTC GCC GAC CTC Phe Ala Asp Leu GGC GCT GCC AGG Gly Ala Ala Arg GTG AAC TAT GCA Val Asn Tyr Ala CTC TTG GCT TTG Leu Leu Ala Leu GTG CGC AAC GTG Val Arg Asn Val TCA AGC ATT GTG Ser Ser Ile Val TGC GTG CCC TGC Cys Val Pro Cys 110 CTC ACC CCC ACG Leu Thr Pro Thr 125 ATA CGA CGC CAC Ile Arg Arg His 48 96 144 192 240 288 336 384 432 92 GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG 480 Val Asp Ser Gin Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val 145 150 155 160 CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT 528 Gin Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg 165 170 175 ATG GCT TGG GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC CTG GTG 576 Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val 180 185 190 GTA TCG CAG CTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG GTG GCG 624 Val Ser Gin Leu Leu Arg Ile Pro Gin Ala Val Val Asp Met Val Ala 195 200 205 GGG GCC CAT TGG GGA GTC CTG GCG GGT CTC GCC TAC TAT TCC ATG GTG 672 Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val 210 215 220 GGG AAC TGG GCT AAG GTT TTG ATT GTG ATG CTA CTC TTT GCT CCC TAATAG 723 Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu Leu Phe Ala Pro 225 230 235 240 INFORMATION FOR SEQ ID NO: 22: SEQUENCE CHARACTERISTICS: LENGTH: 239 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55 Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 110 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 115 120 125 Leu Ala Ala Arg Asn Ala Ser Val Pro Thr 
Thr Thr Ile Arg Arg His 130 135 140 Val Asp Ser Gin Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val 145 150 155 160 Gin Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg 165 170 175 I I 93 Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val 180 185 190 Val Ser Gin Leu Leu Arg Ile Pro Gin Ala Val Val Asp Met Val Ala 195 200 205 Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val 210 215 220 Gly Asn Trp Ala Lys Val Leu Ile Val Met Leu Leu Phe Ala Pro 225 230 235 INFORMATION FOR SEQ ID NO: 23: SEQUENCE CHARACTERISTICS: LENGTH: 561 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..558 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..555 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: ATG TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 48 Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 96 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 144 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 192 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55 CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 240 Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 110 GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 94 115 120 125 CTC GCA GCT AGG AAC GCC AGC GTC CCC ACC ACG ACA 
ATA CGA CGC CAC Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His 130 135 140 GTC GAT TCC CAG CTG TTC ACC ATC TCG CCT CGC CGG CAT GAG ACG GTG Val Asp Ser Gin Leu Phe Thr Ile Ser Pro Arg Arg His Glu Thr Val 145 150 155 160 CAG GAC TGC AAT TGC TCA ATC TAT CCC GGC CAC ATA ACG GGT CAC CGT Gin Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg 165 170 175 ATG GCT TGG GAT ATG ATG ATG AAC TGG TAATAG Met Ala Trp Asp Met Met Met Asn Trp 180 185 INFORMATION FOR SEQ ID NO: 24: SEQUENCE CHARACTERISTICS: LENGTH: 185 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Val Ala Thr Leu Ser Tyr Val Leu Val 145 Gin Met Gly Leu Gly Ser Gly Glu Arg Ala 130 Asp Asp Ala Tyr Ala Asn Cys Met Ala Glu 115 Ala Ser Cys Trp Ala Leu Phe Ser Asp His 105 Cys Pro Ser Pro Trp 185 Gly Gly Gly Val Phe Leu Glu Val Asn Ser Gly Cys Ala Leu 125 Thr Ile 140 Arg His Ile Thr Ala Asn Leu Arg Ser Val 110 Thr Arg Glu Gly INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 606 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: CDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..603 (ix) FEATURE: NAME/KEY: mat peptide LOCATION: 1..600 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: ATG TTG GGT AAG GTC ATC Met Leu Gly Lys Val Ile 1 5 GTG GGG TAC ATT CCG CTC Val Gly Tyr Ile Pro Leu GCC CTG GCG CAT GGC GTC Ala Leu Ala His Gly Val ACA GGG AAT TTG CCC GGT Thr Gly Asn Leu Pro Gly CTG TCC TGT CTG ACC GTT Leu Ser Cys Leu Thr Val 70 TCC GGG ATG TAC CAT GTC Ser Gly Met Tyr His Val TAT GAG GCA GCG GAC ATG Tyr Glu Ala Ala Asp Met 100 GTT CGG GAG AAC AAC TCT Val Arg Glu Asn Asn Ser 115 CTC GCA GCT AGG AAC GCC Leu Ala Ala Arg Asn Ala 130 GTC GAT TCC CAG CTG TTC Val Asp Ser Gin Leu Phe 145 150 CAG GAC TGC AAT TGC TCA Gin Asp Cys Asn Cys Ser 165 ATG GCT TGG GAT ATG ATG GAT ACC CTT ACA TGC GGC TTC GCC GAC 
Asp Thr Leu Thr Cys Gly Phe Ala 10
CCC
Pro
GAG
Glu
TCT
Ser
GCT
Ala
TGC
Cys 90
ACC
Thr
TGG
Trp
ACC
Thr
CCT
Pro
GGC
Gly 170
TCG
CTA
Leu
GAC
Asp
ATC
Ile
TAT
Tyr 75
TCC
Ser
CCC
Pro
GTA
Val
ACG
Thr
CGC
Arg 155
CAC
His
CCT
Asp
GCC
Ala
TAT
Tyr
GCT
Ala
AAC
Asn
ATT
Ile
CCC
Pro
CCC
Pro
CGC
Arg
ACG
Thr
CAC
His 175
CTG
CTC
Leu
AGG
Arg
GCA
Ala
TTG
Leu
GTG
Val
GTG
Val
TGC
Cys
ACG
Thr
CAC
His
GTG
Val 160
CGT
Arg
GTG
48 96 144 192 240 288 336 384 432 480 528 576 96 Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val 180 185 190 GTA TCG CAG CTG CTC CGG ATC CTC TAATAG Val Ser Gin Leu Leu Arg Ile Leu 195 200 INFORMATION FOR SEQ ID NO: 26: SEQUENCE CHARACTERISTICS: LENGTH: 200 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Val Ala Thr Leu Ser Tyr Val Leu Val 145 Gly Leu Gly Ser Gly Glu Arg Ala 130 Asp Tyr Ala Asn Cys Met Ala Glu 115 Ala Ser Ile His Leu Leu Tyr Ala 100 Asn Arg Gin Pro Gly Pro Thr His Asp Asn Asn Leu Cys 165 Leu Val Gly Val Arg Val 40 Gly Cys Ser 55 Val Pro Ala 70 Val Thr Asn Met Ile Met Ser Ser Arg 120 Ala Ser Val 135 Phe Thr Ile 150 Ser Ile Tyr Ala Pro 25 Leu Glu Phe Ser Ser Ala Asp Cys 90 His Thr 105 Cys Trp Pro Thr Ser Pro Pro Gly 170 Leu Asp Ile Tyr 75 Ser Pro Val Thr Arg 155 His Gly Gly Phe Glu Asn Gly Ala Thr 140 Arg Ile Gly Ala Val Asn Leu Leu Val Arg Ser Ser Cys Val 110 Leu Thr 125 Ile Arg His Glu Thr Gly Asp Leu Ala Arg Tyr Ala Ala Leu Asn Val Ile Val Pro Cys Pro Thr Arg His Thr Val 160 His Arg 175 Gin Asp Cys Asn Asp Met 180 Leu Leu Met Arg Asn Trp 185 Leu Ser Pro Thr Thr Ala Leu Val 190 INFORMATION FOR SEQ ID NO: 27: SEQUENCE CHARACTERISTICS: LENGTH: 636 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA 97 (iii) (iii) HYPOTHETICAL: NO ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..633 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 1..630 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: TTG OGT AAG GTC ATC GAT ACC CTT ACA TGC G Leu Gly Lys Val Ile Asp Thr Leu Thr Cvs G
ATG
Met GC TTC GCC GAC CTC ly Phe Ala Asp Leu 1
GTG
Val
GCC
Ala
ACA
Thr
CTG
Leu
TCC
Ser
TAT
Tyr
GTT
Val
CTC
Leu
GTC
Val 145
CAG
Gln
ATG
Met
GTA
Val
CAT
GGG
Gly
CTG
Leu
GGG
Gly
TCC
Ser
GGG
Gly
GAG
Glu
CGG
Arg
GCA
Ala 130
GAT
Asp
GAC
Asp
GCT
Ala
TCG
Ser
CAC
TAC ATT Tyr Ile GCG CAT Ala His AAT TTG Asn Leu TGT CTG Cys Leu ATG TAC Met Tyr GCA GCG Ala Ala 100 GAG AAC Glu Asn 115 GCT AGG Ala Arg TCC CAG Ser Gln TGC AAT Cys Asn TGG GAT Trp Asp 160 CAG CTG Gln Leu 195
TAATAG
5
CCG
Pro
GGC
Gly
CCC
Pro
ACC
Thr
CAT
His
GAC
Asp
AAC
Asn
AAC
Asn
CTG
Leu
TGC
Cys 165
ATG
Met
CTC
Leu
CTC
Leu
GTC
Val
GGT
Gly
GTT
Val 70
GTC
Val
ATG
Met
TCT
Ser GCC Ala
TTC
Phe 150
TCA
Ser
ATG
Met
CGG
Arg
GTC
Val
CGG
Arg
TGC
Cys 55
CCA
Pro
ACG
Thr
ATC
Ile
TCC
Ser
AGC
Ser 135
ACC
Thr
ATC
Ile
ATG
Met
ATC
Ile
GGC
Gly
GTT
Val 40
TCT
Ser
GCT
Ala
AAC
Asn
ATG
Met
CGC
Arg 120
GTC
Val
ATC
Ile
TAT
Tyr
AAC
Asn
GTG
Val 200 GCC Ala 25
CTG
Leu
TTC
Phe
TCC
Ser
GAC
Asp
CAC
His 105
TGC
Cys
CCC
Pro
TCG
Ser
CCC
Pro
TGG
Trp 185
ATC
Ile 10
CCC
Pro
GAG
Glu
TCT
Ser
GCT
Ala
TGC
Cys 90
ACC
Thr
TGG
Trp
ACC
Thr
CCT
Pro
GGC
Gly 170
TCG
Ser
GAG
Glu
CTA
Leu
GAC
Asp
ATC
Ile
TAT
Tyr 75
TCC
Ser
CCC
Pro
GTA
Val
ACG
Thr
CGC
Arg 155
CAC
His
CCT
Pro
GGC
Gly
GGG
Gly
GGC
Gly
TTC
Phe
GAA
Glu
AAC
Asn
GGG
Gly
GCG
Ala
ACA
Thr 140
CGG
Arg
ATA
Ile
ACA
Thr
AGA
Arg
GGC
Gly
GTG
Val
CTC
Leu
GTG
Val
TCA
Ser
TGC
Cys
CTC
Leu 125
ATA
Ile
CAT
His
ACG
Thr
ACG
Thr
CAC
His 205
GCT
Ala
AAC
Asn
TTG
Leu
CGC
Arg
AGC
Ser
GTG
Val 110
ACC
Thr
CGA
Arg
GAG
Glu
GGT
Gly GCC Ala 190
CAT
His GCC Ala
TAT
Tyr
GCT
Ala
AAC
Asn
ATT
Ile
CCC
Pro
CCC
Pro
CGC
Arg
ACG
Thr
CAC
His 175
CTG
Leu
CAC
His
AGG
Arg
GCA
Ala
TTG
Leu
GTG
Val
GTG
Val
TGC
Cys
ACG
Thr
CAC
His
GTG
Val 160
CGT
Arg
GTG
Val
CAC
His 96 144 192 240 288 336 384 432 480 528 576 624 98 His His 210 INFORMATION FOR SEQ ID NO: 28: SEQUENCE CHARACTERISTICS: LENGTH: 210 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: Met Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala val Ala Thr Leu Ser Tyr Val Leu Val 145 Gin Met Val His Gly Leu Gly Ser Gly Glu Arg Ala 130 Asp Asp Ala Ser His 210 Leu Val Gly Val 70 Val Met Ser Ala Phe 150 Ser Met Arg Gly Val Ser Ala Asn Met Arg 120 Val Ile Tyr Asn Val 200 Leu Asp Ile Tyr 75 Ser Pro Val Thr Arg 155 His Pro Gly Gly Gly Gly Val Phe Leu Glu Val Asn Ser Gly Cys Ala Leu 125 Thr Ile 140 Arg His Ile Thr Thr Thr Arg His 205 Asp Leu Ala Arg Tyr Ala Ala Leu Asn Val Ile Val Pro Cys Pro Thr Arg His Thr Val 160 His Arg 175 Leu Val His His INFORMATION FOR SEQ ID NO: 29: SEQUENCE CHARACTERISTICS: LENGTH: 630 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO 99 (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1. .627 (ix) FEATURE: NAME/KEY: mat-peptide (B3) LOCATION: 1. .624 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: ATG GGT AAG GTC ATC GAT Met Gly Lys Val Ile Asp
ACC
Thr 1
GGG
Gly
CTT
Leu
GGG
Gly
TCT
Ser
GGC
Gly
GAG
Glu
CAG
Gln
GCA
Ala
GAC
Asp 145
GAC
Asp
CCT
Pro
TAC
Tyr
GCG
Ala
AAT
Asn 50
TGC
Cys
CTC
Leu
GCC
Ala
GAC
Asp
GTC
Val 130
CTA
Leu ATG Met
CGT
Arg
ATC
Ile
CAT
His 35
TTG
Leu
TTA
Leu
TAT
Tyr
GAT
Asp
GGC
Gly 115
AAG
Lys
TTA
Leu
TGT
Cys
CGC
Arg
CCG
Pro 20
GGC
Gly
CCC
Pro
ATT
Ile
GTC
Val
GAC
Asp 100
AAT
Asn
TAC
Tyr
GTG
Val
GGG
Gly
CAT
His 180 5
CTC
Leu
GTG
Val
GGT
Gly
CAT
His
CTT
Leu
GTT
Val
ACA
Thr
GTC
Val
GGC
Gly
GCT
Ala 165
CAA
Gln GTC GGC Val Gly AGG GCC Arg Ala TGC TCC Cys Ser 55 CCA GCA Pro Ala 70 ACC AAC Thr Asn ATT CTG Ile Leu TCC ACG Ser Thr GGA GCA Gly Ala 135 GCG GCC Ala Ala 150 GTC TTC Val Phe ACG GTC Thr Val CTT ACG TGC GGA TTC Leu Thr Cys Gly Phe 10 GCT CCC GTA GGA GGC Ala Pro Val Gly Gly 25 CTT GAA GAC GGG ATA Leu Glu Asp Gly Ile 40 TTT TCT ATT TTC CTT Phe Ser Ile Phe Leu GCT AGT CTA GAG TGG Ala Ser Leu Glu Trp 75 GAC TGT TCC AAT AGC Asp Cys Ser Asn Ser 90 CAC ACA CCC GGC TGC His Thr Pro Gly Cys 105 TGC TGG ACC CCA GTG Cys Trp Thr Pro Val 120 ACC ACC GCT TCG ATA Thr Thr Ala Ser Ile 140 ACG ATG TGC TCT GCG Thr Met Cys Ser Ala 155 CTC GTG GGA CAA GCC Leu Val Gly Gln Ala 170 CAG ACC TGT AAC TGC Gln Thr Cys Asn Cys 185
GCC
Ala
GTC
Val
AAT
Asn
CTC
Leu
CGG
Arg
AGT
Ser
ATA
Ile
ACA
Thr 125
CGC
Arg
CTC
Leu
TTC
Phe
TCG
Ser GAT CTC ATG Asp Leu Met
GCA
Ala
TTC
Phe
GCT
Ala
AAT
Asn
ATT
Ile
CCT
Pro 110
CCT
Pro
AGT
Ser
TAC
Tyr
ACG
Thr
CTG
Leu 190
AGA
Arg
GCA
Ala
CTG
Leu
ACG
Thr
GTG
Val
TGT
Cys
ACA
Thr
CAT
His
GTG
Val
TTC
Phe 175
TAC
Tyr
GCC
Ala
ACA
Thr
TTC
Phe
TCT
Ser
TAC
Tyr
GTC
Val
GTG
Val
GTG
Val
GGT
Gly 160
AGA
Arg
CCA
Pro 96 144 192 240 288 336 384 432 480 528 576 GGC CAT C~TT Gly His Leu 195
TAATAG
TCA nn rAT cna Ser Gly His Arg ATG OCT TaG nAT AT ATO TACG Met Ala Trp Asp Met Met Met Asn Trp 200 205 100 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 208 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Gly Lys Val Ile Asp Thr Leu Thr Gly Phe Ala Asp Leu Met Gly Leu Gly Ser Gly Glu Gln Ala Asp 145 Asp Pro Gly Tyr Ala Asn Cys Leu Ala Asp Val 130 Leu Met Arg His Ile His Leu Leu Tyr Asp Gly 115 Lys Leu Cys Arg Leu 195 Pro Gly Pro Ile Val Asp 100 Asn Tyr Val Gly His 180 Ser Leu Val Val Arg Gly Cys His Pro 70 Leu Thr Val Ile Thr Ser Val Gly Gly Ala 150 Ala Val 165 Gin Thr Gly Ala Ser 55 Ala Asn Leu Thr Ala 135 Ala Phe Val SAla Leu 40 Phe Ala Asp His Cys 120 Thr Thr Leu Gin Met 200 Pro 25 Glu Ser Ser Cys Thr 105 Trp Thr Met Val Thr 185 Val Asp Ile Leu Ser 90 Pro Thr Ala Cys Gly 170 Cys Gly Gly Phe Glu 75 Asn Gly Pro Ser Ser 155 Gin Asn Gly Val Ile Asn Leu Leu Trp Arg Ser Ser Cys Ile Val Thr 125 Ile Arg 140 Ala Leu Ala Phe Cys Ser Ala Phe Ala Asn Ile Pro 110 Pro Ser Tyr Thr Leu 190 Arg Ala Leu Thr Val Cys Thr His Val Phe 175 Tyr Ala Thr Phe Ser Tyr Val Val Val Gly 160 Arg Pro Gly His Arg Ala Trp Asp Met Met 205 Met Asn Trp INFORMATION FOR SEQ ID NO: 31: SEQUENCE CHARACTERISTICS: LENGTH: 630 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO 101 (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..627 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 1..624 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: ATG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC GCC GAT CTC ATG 48 Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met 1 5 10 GGG TAT ATC CCG CTC GTA GGC GGC CCC ATT GGG GGC GTC GCA AGG GCT 96 Gly Tyr Ile Pro Leu Val Gly Gly Pro Ile Gly Gly Val Ala Arg Ala 25 CTC GCA CAC GGT GTG AGG GTC CTT GAG GAC GGG GTA AAC TAT GCA ACA 144 Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr 40 GGG AAT TTA CCC GGT TGC 
TCT TTC TCT ATC TTT ATT CTT GCT CTT CTC 192 Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu 55 TCG TGT CTG ACC GTT CCG GCC TCT GCA GTT CCC TAC CGA AAT GCC TCT 240 Ser Cys Leu Thr Val Pro Ala Ser Ala Val Pro Tyr Arg Asn Ala Ser 70 75 GGG ATT TAT CAT GTT ACC AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT 288 Gly Ile Tyr His Val Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr 90 GAG GCA GAT AAC CTG ATC CTA CAC GCA CCT GGT TGC GTG CCT TGT GTC 336 Glu Ala Asp Asn Leu Ile Leu His Ala Pro Gly Cys Val Pro Cys Val 100 105 110 ATG ACA GGT AAT GTG AGT AGA TGC TGG GTC CAA ATT ACC CCT ACA CTG 384 Met Thr Gly Asn Val Ser Arg Cys Trp Val Gin Ile Thr Pro Thr Leu 115 120 125 TCA GCC CCG AGC CTC GGA GCA GTC ACG GCT CCT CTT CGG AGA GCC GTT 432 Ser Ala Pro Ser Leu Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val 130 135 140 GAC TAC CTA GCG GGA GGG GCT GCC CTC TGC TCC GCG TTA TAC GTA GGA 480 Asp Tyr Leu Ala Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly 145 150 155 160 GAC GCG TGT GGG GCA CTA TTC TTG GTA GGC CAA ATG TTC ACC TAT AGG 528 Asp Ala Cys Gly Ala Leu Phe Leu Val Gly Gin Met Phe Thr Tyr Arg 165 170 175 CCT CGC CAG CAC GCT ACG GTG CAG AAC TGC AAC TGT TCC ATT TAC AGT 576 Pro Arg Gin His Ala Thr Val Gin Asn Cys Asn Cys Ser Ile Tyr Ser 180 185 190 GGC CAT GTT ACC GGC CAC CGG ATG GCA TGG GAT ATG ATG ATG AAC TGG 624 Gly His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 195 200 205 TAATAG 630 INFORMATION FOR SEQ ID NO: 32: ww 102 SEQUENCE CHARACTERISTICS: LENGTH: 208 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: Met Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met 1 5 10 ic Gly Leu Gly Ser Gly Glu Met Ser Asp 145 Asp Tyr Ala Asn Cys Ile Ala Thr Ala 130 Tyr Ala Ile His Leu Leu Tyr Asp Gly 115 Pro Leu Cys Pro Gly Pro Thr His Asn 100 Asn Ser Ala Gly Leu Val Gly Val Arg Val Gly Cys Ser 55 Val Pro Ala 70 Val Thr Asn Leu Ile Leu Val Ser Arg Leu Gly Ala 135 Gly Gly Ala 150 Ala Leu Phe 165.
Ala Thr Val Gly Leu 40 Phe Ser Asp His Cys 120 Val Ala Leu Gin SPro 25 Glu Ser Ala Cys Ala 105 Trp Thr Leu Val Asn 185 Ile Gly Gly Val Ala Arg Ala Asp Ile Val Pro 90 Pro Val Ala Cys Gly 170 Cys Gly Val Asn Tyr Ala Thr Phe Pro 75 Asn Gly Gin Pro Ser 155 Gin Asn Ile Leu Tyr Arg Ser Ser Cys Val Ile Thr 125 Leu Arg 140 Ala Leu Met Phe Cys Ser Ala Asn Ile Pro 110 Pro Arg Tyr Thr Ile Leu Ala Val Cys Thr Ala Val Tyr 175 Tyr Leu Ser Tyr Val Leu Val Gly 160 Arg Ser Pro Arg Gin His 180 190 Gly His Val Thr Gly His Arg Met 195 200 Ala Trp Asp Met Met Asn Trp INFORMATION FOR SEQ ID NO: 33: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 103 TGGGATATGA TGATGAACTG
GTC
INFORMATION FOR SEQ ID NO: 34: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: CTATTATGGT GGTAAGCCAC AGAGCAGGAG
INFORMATION FOR SEQ ID NO: 35: SEQUENCE CHARACTERISTICS: LENGTH: 1476 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1473 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..1470
TGG
Trp 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: GAT ATG ATG ATG AAC TGG TCG CCT ACA ACG GCC Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala 5 10 CAG CTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG Gln Leu Leu Arg Ile Pro Gln Ala Val Val Asp Met 25 CAT TGG GGA GTC CTG GCG GGC CTC GCC TAC TAT TCC His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser 40 TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 55 CTG GTG GTA TCG Leu Val Val Ser GTG GCG GGG GCC Val Ala Gly Ala ATG GTG GGG AAC Met Val Gly Asn GGC GTC GAC GGG Gly Val Asp Gly 144 192 240 CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT ACC AGG GGC CTT His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu 70 75 GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC 104 Ala Gin Lys Val Ser Leu Phe Ser Pro Gly Ser Ile Gin Leu Val Asn 90
ACC
Thr
TCC
Ser
AAC
Asn
AAG
Lys 145
TCG
Ser
ATT
Ile
AGC
Ser
AAC
Asn
CCG
Pro 225
TTC
Phe
AAC
Asn
GCC
Ala
ATG
Met
TTC
Phe 305
TTC
Phe
GAC
Asp
TGG
Trp
AAC
Asn
CTC
Leu
TCG
Ser 130
TTC
Phe
GAC
Asp
GTA
Val
CCT
Pro
TGG
Trp 210
CCG
Pro
ACC
Thr
AAC
Asn
ACC
Thr
GTT
Val 290
ACC
Thr
GAA
Glu
AGG
Arg
CAG
Gln GGC AGT Gly Ser 100 CAA ACA Gln Thr 115 TCT GGA Ser Gly GCT CAG Ala Gln CAG AGG Gln Arg CCC GCG Pro Ala 180 GTT GTG Val Val 195 GGG GCG Gly Ala CGA GGC Arg Gly AAG ACG Lys Thr ACC TTG Thr Leu 260 TAC GCC Tyr Ala 275 CAT TAC His Tyr ATC TTC Ile Phe GCC GCA Ala Ala GAT AGA Asp Arg 340 ATA CTG Ile Leu
TGG
Trp
GGG
Gly
TGC
Cys
GGG
Gly
CCC
Pro 165
TCT
Ser
GTG
Val
AAC
Asn
AAC
Asn
TGT
Cys 245
ACC
Thr
AGA
Arg
CCA
Pro
AAG
Lys
TGC
Cys 325
TCA
Ser
CCC
Pro
CAC
His
TTC
Phe
CCA
Pro
TGG
Trp 150
TAC
Tyr
CAG
Gln
GGG
Gly
GAC
Asp
TGG
Trp 230
GGG
Gly
TGC
Cys
TGC
Cys
TAT
Tyr
GTT
Val 310
AAT
Asn
GAG
Glu
TGT
Cys
ATC
Ile
TTT
Phe
GAG
Glu
GGT
Gly
TGC
Cys
GTG
Val
ACG
Thr
TCG
Ser 215 TTC Phe
GGC
Gly
CCC
Pro
GGT
Gly
AGG
Arg 295
AGG
Arg
TGG
Trp
CTT
Leu
TCC
Ser
AAC
Asn
GCC
Ala 120
CGC
Arg
CCC
Pro
TGG
Trp
TGC
Cys
ACC
Thr 200
GAT
Asp
GGC
Gly
CCC
Pro
ACT
Thr
TCT
Ser 280
CTC
Leu
ATG
Met
ACT
Thr
AGC
Ser
TTC
Phe
AGG
Arg 105
GCA
Ala
TTG
Leu
CTC
Leu
CAC
His
GGT
Gly 185
GAT
Asp
GTG
Val
TGT
Cys
CCG
Pro
GAC
Asp 265
GGG
Gly
TGG
Trp
TAC
Tyr
CGA
Arg
CCG
Pro 345
ACC
Thr
ACT
Thr
CTA
Leu
GCC
Ala
ACT
Thr
TAC
Tyr 170
CCA
Pro
CGG
Arg
CTG
Leu
ACA
Thr
TGC
Cys 250
TGT
Cys
CCC
Pro
CAC
His
GTG
Val
GGA
Gly 330
CTG
Leu
ACC
Thr GCC CTG Ala Leu TTC TAC Phe Tyr AGC TGT Ser Cys 140 TAC ACT Tyr Thr 155 GCG CCT Ala Pro GTG TAT Val Tyr TTT GGT Phe Gly ATT CTC Ile Leu 220 TGG ATG Trp Met 235 AAC ATC Asn Ile TTT CGG Phe Arg TGG CTG Trp Leu TAC CCC Tyr Pro 300 GGG GGC Gly Gly 315 GAG CGT Glu Arg CTG CTG Leu Leu CTG CCG Leu Pro AAC TGC Asn Cys 110 AAA CAC Lys His 125 CGC TCC Arg Ser GAG CCT Glu Pro CGA CCG Arg Pro TGC TTC Cys Phe 190 GTC CCC Val Pro 205 AAC AAC Asn Asn AAT GGC Asn Gly GGG GGG Gly Gly AAG CAC Lys His 270 ACA CCT Thr Pro 285 TGC ACT Cys Thr GTG GAG Val Glu TGT GAC Cys Asp TCT ACA Ser Thr 350 GCC CTA Ala Leu
AAC
Asn
AAA
Lys
ATC
Ile
AAC
Asn
TGT
Cys 175
ACC
Thr
ACG
Thr
ACG
Thr
ACT
Thr
GCC
Ala 255
CCC
Pro
AGG
Arg
GTC
Val
CAC
His
TTG
Leu 335
ACA
Thr
TCC
Ser
GAC
Asp
TTC
Phe
GAC
Asp
AGC
Ser 160
GGT
Gly
CCG
Pro
TAT
Tyr
CGG
Arg
GGG
Gly 240
GGC
Gly
GAG
Glu
TGT
Cys
AAC
Asn
AGG
Arg 320
GAG
Glu
GAG
Glu
ACC
Thr 336 384 432 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 105 355 360 GGC CTG ATC CAC CTC CAT CAG AAC ATC GTG GAC GT Gly Leu Ile His Leu His Gin Asn Ile Val Asp Va 370 375 38( GGT GTA GGG TCG GCG GTT GTC TCC CTT GTC ATC AAJ Gly Val Gly Ser Ala Val Val Ser Leu Val Ile Ly 385 390 395 CTG TTG CTC TTC CTT CTC CTG GCA GAC GCG CGC ATC Leu Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile 405 410 TGG ATG ATG CTG CTG ATA GCT CAA GCT GAG GCC GCC Trp Met Met Leu Leu Ile Ala Gin Ala Glu Ala Ala 420 425 GTG GTC CTC AAT GCG GCG GCC GTG GCC GGG GCG CAT Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala His 435 440 TTC CTT GTG TTC TTC TGT GCT GCC TGG TAC ATC AAG Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile Lys 450 455 460 CCT GGT GCG GCA TAC GCC TTC TAT GGC GTG TGG CCG Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp Pro 465 470 475 CTG CTG GCC TTA CCA CCA CGA GCT TAT GCC TAGTAA Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala 485 490 INFORMATION FOR SEQ ID NO: 36: SEQUENCE CHARACTERISTICS: LENGTH: 490 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala 1 5 10 Gin Leu Leu Arg Ile Pro Gin Ala Val Val Asp Met 25 His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr Ser 40 Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 55 His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 70 75 Val Ser Leu Phe Ser Pro Gly Ser Ala Gin Lys Ile 90 Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 100 105 Ser Leu Gin Thr Gly Phe Phe Ala Ala Leu Phe Tyr 115 120 365 3 CAA 1 Gin 0 A TGG s Trp C TGC e Cys
TTA
Leu
GGC
Gly 445
GGC
Gly
CTG
Leu
TAC
Tyr
GAG
Glu
GCC
Ala
GAG
Glu 430
ACT
Thr
AGG
Arg
CTC
Leu CTG TAC Leu Tyr TAT GTC Tyr Val 400 TGC TTA Cys Leu 415 AAC CTG Asn Leu CTT TCC Leu Ser CTG GTC Leu Val CTG CTT Leu Leu 480 1152 1200 1248 1296 1344 1392 1440 1476 Val Ser Gly Ala Gly Asn Asp Gly Gly Leu Val Asn Asn Asp Lys Phe 106 Axg Leu Ala Ser Asn Lys 145 Ser Ile Ser Asn Pro 225 Phe Asn Ala Met Phe 305 Phe Asp Trp Gly Gly 385 Leu Trp Val Phe Pro 465 Ser 130 Phe Asp Val Pro Trp 210 Pro Thr Asn Thr Val 290 Thr Glu Ser Ala Gin Pro Val 195 Gly Arg Lys Thr Tyr 275 His Ile Ala Gly Gin Arg Ala 180 Val Ala.
Gly.
Thr Leu 260 Ala Tyr Phe Ala Arg 340 Leu His Ser Phe Leu 420 Asn Phe Ala Cys Gly Pro 165 Ser Va1 Asn Asn Cys 245 Thr Arg Pro Lys Cys 325 Ser Pro Leu Ala Leu 405 Leu Ala Phe Tyr Pro Trp 150 Tyr Gin Gly Asp Trp 230 Gly Cys Cys Tyr Val 310 Asn Glu Cys His Val 390 Leu Ile Ala Cys Ala 470 lu 135 ly Cys Va1 rhr Ser 215 Phe ly Pro Gly Arg 295 Arg Trp Leu Ser Gin 375 Val Leu Ala Ala Ala 455 Phe Cys Arg Ser Ile Asp 140 Pro Trp Cys Thr 200 Asp Gly Pro Thr Ser 280 Leu Met Thr Ser Phe 360 Asn Ser Ala Gin Val 440 Ala Tyr Leu His Gly 185 Asp Val Cys Pro Asp 265 Gly Trp Tyr Arg Pro 345 Thr Ile Leu Asp Ala 425 Ala Trp Gly Thr Tyr 170 Pro Arg Leu Thr Cys 250 Cys Pro His Val Gly 330 Leu Thr Val Val Ala 410 Glu Gly Tyr Va1 Tyr 155 Ala Val Phe Ile Trp 235 Asn Phe Trp Tyr Gly 315 Glu Leu Leu Asp Ile 395 Arg Ala Ala Ile Trp 475 Thr Pro Tyr Gly Leu 220 Met Ile Arg Leu Pro 300 Gly Arg Leu Pro Va1 380 Lys Ile Ala His Lys 460 Glu Arg Cys Val 205 Asn Asn Gly Lys Thr 285 Cys Va1 Cys Ser Ala 365 Gin Trp Cys Leu Gly 445 Gly Pro Pro Phe 190 Pro Asn Gly Gly His 270 Pro Thr Glu Asp Thr 350 Leu Tyr Glu Ala Glu 430 Thr Arg Asn Cys 175 Thr Thr Thr Thr Ala 255 Pro Arg Va1 His Leu 335 Thr Ser Leu Tyr Cys 415 Asn Leu Leu Ser 160 Gly Pro Tyr Arg Gly 240 Gly Glu Cys Asn Arg 320 Glu Glu Thr Tyr Val 400 Leu Leu Ser Val Leu 480 Arg Asp Gln Ile 355 Leu Ile 370 Val Gly Leu Leu Met Met Val Leu 435 Leu Val 450 Gly Ala Pro Leu Leu Leu Leu Leu Ala Leu Pro 485 Pro Arg Aia Tyr Ala 490 107 INFORMATION FOR SEQ ID NO: 37: SEQUENCE CHARACTERISTICS: LENGTH: 1021 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 2..1018 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 2..1015 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: G ATC CCA CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA Ile Pro Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly 1 5 10 GTC CTG Val
GTT
Val
GTG
Val
TTT
Phe
AGT
Ser
ACA
Thr
GGA
Gly
CAG
Gln
AGG
Arg
GCG
Ala 160 Leu
TTG
Leu
TCA
Ser
AGC
Ser
TGG
Trp
GGG
Gly
TGC
Cys
GGG
Gly
CCC
Pro 145
TCT
Ser GCG GGC Ala Gly GTT GTG Val Val GGA GGG Gly Gly CCC GGG Pro Gly CAC ATC His Ile TTC TTT Phe Phe CCA GAG Pro Glu 115 TGG GGT Trp Gly 130 TAC TGC Tyr Cys CAG GTG Gln Val CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG Leu
ATG
Met
GCA
Ala
TCG
Ser
AAC
Asn
GCC
Ala 100
CGC
Arg
CCC
Pro TGG Trp Ala
CTA
Leu
GCA
Ala
GCT
Ala
AGG
Arg 85
GCA
Ala
TTG
Leu
CTC
Leu
CAC
His Tyr
CTC
Leu
GCC
Ala
CAG
Gln 70
ACT
Thr
CTA
Leu
GCC
Ala
ACT
Thr
TAC
Tyr 150 Tyr Ser TTT GCC Phe Ala 40 TCC GAT Ser Asp 55 AAA ATC Lys Ile GCC CTG Ala Leu TTC TAC Phe Tyr AGC TGT Ser Cys 120 TAC ACT Tyr Thr 135 Ala Pro Met 25
GGC
Gly
ACC
Thr
CAG
Gln
AAC
Asn
AAA
Lys 105
CGC
Arg
GAG
Glu Arg Val
GTC
Val
AGG
Arg
CTC
Leu
TGC
Cys 90
CAC
His
TCC
Ser
CCT
Pro Pro
TTC
Phe 170 Gly
GAC
Asp
GGC
Gly
GTA
Val
AAC
Asn
AAA
Lys
ATC
Ile
AAC
Asn Cys 155 Asn
GGG
Gly
CTT
Leu
AAC
Asn
GAC
Asp
TTC
Phe
GAC
Asp
AGC
Ser 140 jGl Gly Trp
CAT
His
GTG
Val
ACC
Thr
TCC
Ser
AAC
Asn
AAG
Lys 125
TCG
Ser
ATT
Ile GCT AAG Ala Lys ACC CGC Thr Arg TCC CTC Ser Leu AAC GGC Asn Gly CTC CAA Leu Gin TCG TCT Ser Ser 110 TTC GCT Phe Ala GAC CAG Asp Gin GTA CCC Val Pro 94 142 190 238 286 334 382 430 478 TGC GGT Cys Gly 165 CCA GTG TAT TGC Pro Val Tyr Cys ACC CCG AGC CCT Thr Pro Ser Pro
I
108 GTG GTG GGG ACG ACC Val Val Gly Thr Thr GAT CGG TTT GGT GTC Asp Arg Phe Gly Val
GCG
Ala
GGC
Gly
ACG
Thr
TTG
Leu 240
GCC
Ala
TAC
Tyr
TTC
Phe
GCA
Ala
AGA
Arg 320
AAC
Asn
AAC
Asn
TGT
Cys 225
ACC
Thr
AGA
Arg
CCA
Pro
AAG
Lys
TGC
Cys 305
TCA
GAC
Asp
TGG
Trp 210
GGG
Gly
TGC
Cys
TGC
Cys
TAT
Tyr
GTT
Val 290
AAT
Asn
GAG
TCG
Ser 195
TTC
Phe
GGC
Gly
CCC
Pro
GGT
Gly
AGG
Arg 275
AGG
Arg
TGG
Trp
CTT
180
GAT
Asp
GGC
Gly
CCC
Pro
ACT
Thr
TCT
Ser 260
CTC
Leu
ATG
Met
ACT
Thr
GTG
Val
TGT
Cys
CCG
Pro
GAC
Asp 245
GGG
Gly
TGG
Trp
TAC
Tyr
CGA
Arg
CTG
Leu
ACA
Thr
TGC
Cys 230
TGT
Cys
CCC
Pro
CAC
His
GTG
Val
GGA
Gly 310 ATT CTC Ile Leu 200 TGG ATG Trp Met 215 AAC ATC Asn Ile TTT CGG Phe Arg TGG CTG Trp Leu TAC CCC Tyr Pro 280 GGG GGC Gly Gly 295 GAG CGT Glu Arg 185
AAC
Asn
AAT
Asn
GGG
Gly
AAG
Lys
ACA
Thr 265
TGC
Cys
GTG
Val
TGT
Cys
CCC
Pro
AAC
Asn
GGC
Gly
GGG
Gly
CAC
His 250
CCT
Pro
ACT
Thr
GAG
Glu
GAC
Asp ACG TAT AAC TGG GGG Thr Tyr Asn
ACG
Thr
ACT
Thr
GCC
Ala 235
CCC
Pro
AGG
Arg
GTC
Val
CAC
His
TTG
Leu 315
CGG
Arg
GGG
Gly 220
GGC
Gly
GAG
Glu
TGT
Cys
AAC
Asn
AGG
Arg 300
GAG
Glu
CCG
Pro 205
TTC
Phe
AAC
Asn
GCC
Ala
ATG
Met
TTC
Phe 285
TTC
Phe
GAC
Asp Trp Gly 190 CCG CGA Pro Arg ACC AAG Thr Lys AAC ACC Asn Thr ACC TAC Thr Tyr 255 GTT CAT Val His 270 ACC ATC Thr Ile GAA GCC Glu Ala AGG GAT Arg Asp 622 670 718 766 814 862 910 958 1006 1021 AGC CCG CTG CTG CTG TCT Ser Glu Leu Ser Pro Leu Leu Leu Ser 325 ACA ACA GAG TGG Thr Thr Glu Trp 330 CAG AGT Gin Ser 335 GGC AGA GCT TAATTA Gly Arg Ala INFORMATION FOR SEQ ID NO: 38: SEQUENCE CHARACTERISTICS: LENGTH: 338 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: Ile Pro Gin Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val 1 5 10 Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val 25 Leu Val Val Met Leu Leu Ph Ala Gly val Astp Gly HJ l Iri. y Vdl 40 Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe 55 Ser Pro Gly Ser Ala Gin Lys Ile Gin Leu Val Asn Thr Asn Gly Ser 109 Trp Giy Cys Gly Pro 145 Ser Val1 Asn Asn Cys 225 Thr Arg Pro Lys Cys 305 Ser Arg His Phe Pro Trp 130 Tyr Gin Gly Asp Trp 210 Gly Cys Cys Tyr Val1 290 Asn Giu Ala Ile Phe Glu 115 Gly Cys Val Thr Ser 195 Phe Gly Pro Gly Mrg 275 Arg Trp Leu Asn Ala 100 Arg Pro Trp Cys Thr 180 Asp Giy Pro Thr Ser 260 Leu Met Thr Ser Arg Ala Leu Leu His Giy 165 Asp Val1 Cys Pro Asp 245 Gly Trp Tyr Arg Pro 325 Thr Ala Leu Asn Leu Phe Ala Ser Thr Tyr 135 Tyr Ala 150 Pro Val Mrg Phe Leu Ile Thr Trp 215 Cys Asn 230 Cys Phe Pro Trp His Tyr Val Gly 295 Gly Glu 310 Leu Leu Tyr Cys 120 Thr Pro Tyr Giy Leu 200 Met Ile Mrg Leu Pro 280 Gly Mrg Leu Lys 105 Arg Glu Arg Cys Val 185 Asn Asn Gly Lys Thr 265 Cys Val1 Cys Ser Cys 90 His Ser Pro Pro Phe 170 Pro Asn Giy Gly His 250 Pro Thr Giu Asp Thr 330 Asn Lys Ile Asn Cys 155 Thr Thr Thr Thr Ala 235 Pro Arg Val His Leu 315 Thr Asp Phe Asp Ser 140 Gly Pro Tyr Arg Gly 220 Gly Giu Cys Asn Arg 300 Giu Giu Ser Asn Lys 125 Ser Ile Ser Asn Pro 205 Phe Asn Ala Met Phe 285 Phe Asp Trp Leu Ser 110 Phe Asp Val1 Pro Trp 190 Pro Thr Asn Thr Val1 270 Thr Giu Gln Ser Aila Gin Pro Val1 175 Gly Mrg Lys Thr Tyr 255 
His Ile Ala Thr Gly Gin Arg Ala 160 Val Al a Gly Thr Leu 240 Al a Tyr Phe Aila Mrg 320 Gly Arg Asp Gin Ser 335 INFORMATION FOR SEQ ID NO: 39: SEQUENCE CHARACTERISTICS: LENGTH: 1034 base pairs TYPE: nucieic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: CDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: 110 NAME/KEY: CDS LOCATION: 2. .1032 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 2. .1029 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: G ATC CCA CAA GCT GTC GTG GAC ATG GTG GCG GGG GCC CAT TGG GGA Ile Pro Gin Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly 1 5 10 GTC CTG GCG GGC CTC GCC TAC TAT TCC ATG GTG GGG AAC TGG GGT AAG Val Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys 25
GTT
Val
GTG
Val
TTT
Phe
AGT
Ser
ACA
Thr
GGA
Gly
CAG
Gln
AGG
Arg
GCG
Ala 160
GTG
Val
GCG
Ala
GGC
Gly
ACG
Thr TTG GTT GTG ATG Leu
TCA
Ser
AGC
Ser
TGG
Trp
GGG
Gly
TGC
Cys
GGG
Gly
CCC
Pro 145
TCT
Ser
GTG
Val
AAC
Asn AAC Asn
TGT
Cys 225 Val
GGA
Gly
CCC
Pro
CAC
His
TTC
Phe
CCA
Pro
TGG
Trp 130
TAC
Tyr
CAG
Gln
GGG
Gly
GAC
Asp
TGG
Trp 210
GGG
Gly Val
GGG
Gly
GGG
Gly
ATC
Ile
TTT
Phe
GAG
Glu 115
GGT
Gly
TGC
Cys
GTG
Val ACG Thr TCG Ser 195
TTC
Phe
GGC
Gly Met
GCA
Ala
TCG
Ser
AAC
Asn
GCC
Ala 100
CGC
Arg
CCC
Pro
TGG
Trp
TGC
Cys
ACC
Thr 180
GAT
Asp
GGC
Gly
CCC
Pro GTA CTC TTT GCC GGC GTC GAG GGG CAT ACC GG Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg 40
GCA
Ala
GCT
Ala
AGG
Arg 85
GCA
Ala
TTG
Leu
CTC
Leu
CAC
His GGT Gly 165
GAT
Asp
GTG
Val
TGT
Cys
CCG
Pro TGC GAT Ser Asp 55 AAA ATG ACC AGG GGC CTT GTG TCC CTC Arg Gly Leu Val Ser CTC GTA AAC ACC Gin Lys Ile Gin
ACT
Thr
CTA
Leu
GCC
Ala
ACT
Thr
TAC
Tyr 150
CCA
Pro
CGG
Arg
CTG
Leu
ACA
Thr
TGC
Cys 230
GCC
Ala
TTC
Phe
AGC
Ser
TAC
Tyr 135
GCG
Ala
GTG
Val
TTT
Phe
ATT
Ile TGG Trp 215
CTG
Leu
TAC
Tyr
TGT
Cys 120
ACT
Thr
CCT
Pro
TAT
Tyr
GGT
Gly
CTC
Leu 200 ATG
Met
AAC
Asn
AAA
Lys 105
CGC
Arg
GAG
Glu
CGA
Arg
TGC
Cys
GTC
Val 185
AAC
Asn Asn Leu
TGC
Cys 90
CAC
His
TCC
Ser
CCT
Pro
CCG
Pro
TTC
Phe 170
CCC
Pro
AAC
Asn GGC Gly Val
AAC
Asn
AAA
Lys
ATC
Ile
AAC
Asn
TGT
Cys 155
ACC
Thr
ACG
Thr
ACG
Thr Thr
GCC
Ala 235 Asn
GAC
Asp
TTC
Phe
GAC
Asp
AGC
Ser 140
GGT
Gly
CCG
Pro
TAT
Tyr
CGG
Arg Gly 220 Thr
TCC
Ser
AAC
Asn
AAG
Lys 125
TCG
Ser
ATT
Ile
AGC
Ser
AAC
Asn
CCG
Pro 205 Phe
AAC
Asn
CTC
Leu
TCT
Ser 110
TTC
Phe
GAC
Asp
GTA
Val
CCT
Pro
TGG
Trp 190
CCG
Pro Thr Leu
GGC
Gly
CAA
Gln
TCT
Ser
GCT
Ala
CAG
Gln
CCC
Pro
GTT
Val 175
GGG
Gly
CGA
Arg Lys 142 190 238 286 334 382 430 478 526 574 622 6 7 0 718 AAC ATC GGG GGG Asn Ilie Gly Gly GGG AAG AAC AGC Gly Asn Asn Thr ~1_111 111 TTG ACC TGC CCC ACT GAC TGT TTT CGG AAG CAC CC Leu Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pr 240 245 250 GCC AGA TGC GGT TCT GGG CCC TGG CTG ACA CCT AGC Ala Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arc 260 265 TAC CCA TAT AGG CTC TGG CAC TAC CCC TGC ACT GTC Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val 275 280 TTC AAG GTT AGG ATG TAC GTG GGG GGC GTG GAG CAC Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His 290 295 GCA TGC AAT TGG ACT CGA GGA GAG CGT TGT GAC TTG Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu 305 310 315 AGA TCA GAG CTT AGC CCG CTG CTG CTG TCT ACA ACA Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr 320 325 330 CAG ACA CCA TCA CCA CCA TCA CTA AT AG Gin Thr Pro Ser Pro Pro Ser Leu 340 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 343 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1le Pro Gin Ala Val Val Asp Met Val Ala Gly Ala 1 5 10 Leu Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn 25 Leu Val Val Met Leu Leu Phe Ala Gly Val Asp Gly 40 ier Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu 55 ;er Pro Gly Ser Ala Gin Lys Ile Gin Leu Val Asn 70 75 'rp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp 90 hly Phe Phe Ala Ala Leu Phe Tyr Lys His Lys Phe 100 105 -ys Pro Glu Arg Leu Ala Ser Cys Arg Ser Ile Asp 115 120 ;ly Trp Gly Pro Leu Thr Tyr Thr Glu Pro Asn Ser 130 135 140 ro Tyr Cys Trp His Tyr Ala Pro Arg Pro Cys Gly 45 150 155 C GAG o Glu
TGT
Cys
AAC
Asn
AGG
Arg 300
GAG
Glu
GGT
Gly His I Trp I His T Val S Thr A Ser L Asn S 1 Lys P 125 Ser A Ile V GCC ACC Ala Thr ATG GTT Met Val 270 TTC ACC Phe Thr 285 TTC GAA Phe Glu GAC AGG Asp Arg GAT CGA Asp Arg
TAC
Tyr 255
CAT
His
ATC
Ile
GCC
Ala
GAT
Asp
GGG
Gly 335 766 814 862 910 958 1006 1034
I
I
G
C
G
P
1 :rp Gly Val la Lys Val 'hr Arg Val er Leu Phe sn Gly Ser eu Gin Thr er Ser Gly he Ala Gin sp Gin Arg al Pro Ala 160 112 Ser Gin Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val 165 170 175 Val Gly Thr Thr Asp Arg Phe Gly Val Pro Thr Tyr Asn Trp Gly Ala 180 185 190 Asn Asp Ser Asp Val Leu Ile Leu Asn Asn Thr Arg Pro Pro Arg Gly 195 200 205 Asn Trp Phe Gly Cys Thr Trp Met Asn Gly Thr Gly Phe Thr Lys Thr 210 215 220 Cys Gly Gly Pro Pro Cys Asn Ile Gly Gly Ala Gly Asn Asn Thr Leu 225 230 235 240 Thr Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala Thr Tyr Ala 245 250 255 Arg Cys Gly Ser Gly Pro Trp Leu Thr Pro Arg Cys Met Val His Tyr 260 265 270 Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Val Asn Phe Thr Ile Phe 275 280 285 Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Phe Glu Ala Ala 290 295 300 Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp Arg 305 310 315 320 Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gly Asp Arg Gly Gin 325 330 335 Thr Pro Ser Pro Pro Ser Leu 340 INFORMATION FOR SEQ ID NO: 41: SEQUENCE CHARACTERISTICS: LENGTH: 945 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..942 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..939 (xi) SEQUENCE DESCRIPTION: SEO ID NO: 41: ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC 48 Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 1 5 10 GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT 96 Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 113
ACC
Thr
CAG
Gln
AAC
Asn
AAA
Lys
CGC
Arg
GAG
Glu
CGA
Arg
TGC
Cys 145
GTC
Val1 *AGG GGC *Arg Gly CTC GTA Leu Val *TGC AAC Cys Asn CAC AAA Hius Lys TCC ATC Ser Ile CCT AAC Pro Asn 11is CCG TGT Pro Cys 130 TTC ACC Phe Thr CCC ACG Pro Thr
CT
Le
AAI
As2
GA(
Asi Trc Phe
GAC
Asp 100
AGC
Ser
GGT
Gly
'CG
ro
'AT
.'yr TGTG TO u Val Se: CACC AA( ni Thr Asr TCC CTC Ser Let 7C AAC TCG Asn Ser AAG TTC Lys Phe TCG GAC Ser Asp ATT GTA Ile Val AGC CCT Ser Pro 150 AAC TGG CTC TTT rLeu Phe 40 GGC AGT SGly Ser 55 CAA ACA Gin Thr TCT GGA Ser Gly GCT CAG Ala Gin CAG AGG Gin Arg 120 CCC GCG Pro Ala 135 GTT GTG Vai Val GGG GCG
AGC
Ser
TGG
Trp
GGG
Gly
TGC
Cys
GGG
Gly 105
CCC
Pro TCT Ser
GTG
Val CCC Pro
CAC
His
TTC
Phe
CCA
Pro 90
TGG
Trp
TAC
Tyr
CAG
Gln
GGG
Gly
GAC
Asp 170 GGG TCG )Gly Ser ATC AAC Ile Asn TTT GCC Phe Ala 75 GAG CGC Glu Arg GGT CCC Gly Pro TGC TGG Cys Trp GTG TGC Val Cys 140 ACG ACC Thr Thr 155 TCG GAT C Ser Asp I
GC'
Al~
AGC
Arc
GCA
Ala
TTG
Leu
CTC
Leu
CAC
His 125
GGT
Gly
GAT
Asp
GTG
tal T CAG AAA a Gin Lys ACT GCC Thr Ala CTA TTC Leu Phe GCC AGC Ala Ser ACT TAC Thr Tyr 110 TAC GCG Tyr Ala CCA GTG Pro Val CGG TTTC Arg PheC CTG ATT C Leu Ile L
ATC
Ile
CTG
Leu
TAC
Tyr
TGT
Cys
ACT
Thr
CCT
Pro
TAT
Tyr 3GT fly
TC
jCu 144 192 240 288 336 384 432 480 528 Asn Trp Gly Ala Asn 165 AAC AAC Asn Asn AAT GGC Asn Gly GGG GGG Gly Gly 210 AAG CAC Lys His 225 ACA CCT Thr Pro TGC ACT Cys Thr '3G GAG ~P Val Glu
ACG
Thr
ACT
Thr 195
GCC
Ala
CCC
Pro
AGG
Arg
GTC
Val
CAC
His 275
CGG
Arg 180
GGG
Gly
GGC
Gly
GAG
Glu
TGT
Cys
AAC
Asn 260
AGG
Arg
CCG
Pro
TTC
Phe
AAC
Asn
GCC
Ala
ATG
Met 245
TTC
Phe TTC Phe
CCG
Pro
ACC
Thr
AAC
Asn
ACC
Thr 230
GTT
Val
ACC
Thr GAA Glu CGA GGC AAC TGG TTC GGC TGT Arg Gly Asn Trp Phe Gly Cys 185 ACA TGG ATG Thr Trp Met 190 ACG TGT Thr Cys 200 TTG ACC Leu Thr GCC AGA Ala Arg TAC CCA Tyr Pro TTC AAG Phe Lys 265 GCA TGC Ala Cys 280
GGG
Gly
TGC
Cys
TGC
Cys
TAT
Tyr 250
GTT
Val
AAT
Asn
GGC
Gly
CCC
Pro
GGT
Gly 235
AGG
Arg
AGG
Arg
TGG
Trp
CCC
Pro
ACT
Thr 220
TCT
Ser
CTC
Leu
ATG
Met
ACT
Thr
AGC
Ser 300 CCG TGC AAC Pro Cys Asn 205 GAC TGT TTT Asp Cys Phe GGG CCC TGG Gly Pro Trp TGG CAC TAC Trp His Tyr 255 TAC GTG GGG Tyr Val Gly 270 CGA GGA GAG Arg Gly Glu 285 CCG CTG CTG Pro Leu Leu
ATC
Ile
CGG
Arg
CTG
Leu 240
CCC
Pro
GGC
Gly
CGT
Arg
CTG
:,eu 576 624 672 720 768 816 864 912 TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT Cys Asp Leu Glu Asp Mrg Asp Arg Ser Glu Leu 290 295 114 TCT ACA ACA GAG TGG CAG AGC TTA ATT AAT TAG 945 Ser Thr Thr Glu Trp Gin Ser Leu Ile Asn 305 310 INFORMATION FOR SEQ ID NO: 42: SEQUENCE CHARACTERISTICS: LENGTH: 314 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 1 5 10 Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 25 Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gin Lys Ile 40 Gin Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 55 Asn Cys Asn Asp Ser Leu Gin Thr Gly Phe Phe Ala Ala Leu Phe Tyr 70 75 Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 90 Arg Ser Ile Asp Lys Phe Ala Gin Gly Trp Gly Pro Leu Thr Tyr Thr 100 105 110 Glu Pro Asn Ser Ser Asp Gin Arg Pro Tyr Cys Trp His Tyr Ala Pro 115 120 125 Arg Pro Cys Gly Ile Val Pro Ala Ser Gin Val Cys Gly Pro Val Tyr 130 135 140 Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe Gly 145 150 155 160 Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile Leu 165 170 175 Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp Met 180 185 190 Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly Gly Pro Pro Cys Asn Ile 195 200 205 Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe Arg 210 215 220 Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp Leu 225 230 235 240 Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arq Leu TrD His Tvr Pro 245 250 255 Cys Thr Val Asn Phe Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly 260 265 270 Val Glu His Arg Phe Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg 275 280 285 115 Cys Asp Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu 290 295 300 Ser Thr Thr Glu Trp Gin Ser Leu Ile Asn 305 310 INFORMATION FOR SEQ ID NO: 43: SEQUENCE CHARACTERISTICS: LENGTH: 961 base pairs TYPE: nucleic acid STRANDEDNESS: single 
TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..958 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 1..955 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: ATG GTG GGG AAC TGG GCT AAG GTT TTG GTT GTG ATG CTA CTC TTT GCC 48 Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 1 5 10 GGC GTC GAC GGG CAT ACC CGC GTG TCA GGA GGG GCA GCA GCC TCC GAT 96 Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 25 ACC AGG GGC CTT GTG TCC CTC TTT AGC CCC GGG TCG GCT CAG AAA ATC 144 Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gin Lys Ile 40 CAG CTC GTA AAC ACC AAC GGC AGT TGG CAC ATC AAC AGG ACT GCC CTG 192 Gin Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 55 AAC TGC AAC GAC TCC CTC CAA ACA GGG TTC TTT GCC GCA CTA TTC TAC 240 Asn Cys Asn Asp Ser Leu Gin Thr Gly Phe Phe Ala Ala Leu Phe Tyr 70 75 AAA CAC AAA TTC AAC TCG TCT GGA TGC CCA GAG CGC TTG GCC AGC TGT 288 Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys 90 CGC TCC ATC GAC AAG TTC GCT CAG GGG TGG GGT CCC CTC ACT TAC ACT 336 Arg Ser Ile Asp Lys Phe Ala Gin Gly Trp Gly Pro Leu Thr Tyr Thr 100 105 110 GAG CCT AAC AGC TCG GAC CAG AGG CCC TAC TGC TGG CAC TAC GCG CCT 384 Glu Pro Asn Ser Ser Asp Gin Arg Pro Tyr Cys Trp His Tyr Ala Pro 115 120 125 CGA CCG TGT GGT ATT GTA CCC GCG TCT CAG GTG TGC GGT CCA GTG TAT 432 Arg Pro Cys Gly Ile Val Pro Ala Ser Gin Val Cys Gly Pro Val Tyr 130 135 140 TGC TTC ACC CCG AGC CCT GTT GTG GTG GGG ACG ACC GAT CGG TTT GGT 480 116 Cys Phe Thr Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Phe 145 150 155 GTC CCC ACG TAT AAC TGG GGG GCG AAC GAC TCG GAT GTG CTG ATT C Val Pro Thr Tyr Asn Trp Gly Ala Asn Asp Ser Asp Val Leu Ile L 165 170 175 AAC AAC ACG CGG CCG CCG CGA GGC AAC TGG TTC GGC TGT ACA TGG A Asn Asn Thr Arg Pro Pro Arg Gly Asn Trp Phe Gly Cys Thr Trp M 180 185 190 AAT GGC ACT GGG TTC ACC AAG ACG TGT GGG GGC CCC CCG TGC AAC A Asn Gly Thr Gly Phe Thr Lys Thr Cys Gly GIy Pro 
Pro Cys Asn I 195 200 205 GGG GGG GCC GGC AAC AAC ACC TTG ACC TGC CCC ACT GAC TGT TTT C Gly Gly Ala Gly Asn Asn Thr Leu Thr Cys Pro Thr Asp Cys Phe A 210 215 220 AAG CAC CCC GAG GCC ACC TAC GCC AGA TGC GGT TCT GGG CCC TGG C: Lys His Pro Glu Ala Thr Tyr Ala Arg Cys Gly Ser Gly Pro Trp LE 225 230 235 24 ACA CCT AGG TGT ATG GTT CAT TAC CCA TAT AGG CTC TGG CAC TAC CC Thr Pro Arg Cys Met Val His Tyr Pro Tyr Arg Leu Trp His Tyr Pr 245 250 255 TGC ACT GTC AAC TTC ACC ATC TTC AAG GTT AGG ATG TAC GTG GGG GG 275 280 285 TGT GAC TTG GAG GAC AGG GAT AGA TCA GAG CTT AGC CCG CTG CTG CT
TAG
INFORMATION FOR SEQ ID NO: 44: SEQUENCE CHARACTERISTICS:
LENGTH: 319 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: Met Val Gly Asn Trp Ala Lys Val Leu Val Val Met Leu Leu Phe Ala 1 5 10 Gly Val Asp Gly His Thr Arg Val Ser Gly Gly Ala Ala Ala Ser Asp 25 Thr Arg Gly Leu Val Ser Leu Phe Ser Pro Gly Ser Ala Gin Lys ile 40 Gin Leu Val Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 55 3ly 160
:TC
jeu
,TG
let
TC
le 3G rg rG
C
!C
y
T
g u -ic 528 576 624 672 720 768 816 864 912 958 961 117 Gly Phe Phe Ala Ala Leu Phe Asn Cys Asn Asp Ser Leu Gin Thr Lys His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Ser Giu Pro Arg Pro 130 Cys Phe 145 Val Pro Asn Asn Asn Gly Gly Gly 210 Lys His 225 Thr Pro Cys Thr Val Glu Cys Asp 290 Ile Asn 115 Cys Thr Thr Thr Thr 195 Ala Pro Arg Val His 275 Leu Asp 100 Ser Gly Pro Tyr Arg 180 Gly Gly Giu Cys Asn 260 Arg Glu Lys Phe Ala Ser Ile Ser Asn 165 Pro Phe Asn Al a Met 245 Phe Phe A.sp Asp Val1 Pro 150 Trp Pro Thr Asn Thr 230 Val1 Thr Glu Arg Gin Pro 135 *Val Gly Arg Lys Thr 215 Tyr His Ile Ala Asp 295 Gly Gin Arg 120 Al a Val Ala Gly Thr 200 Leu Ala Tyr Phe Ala 280 Arg Gln Gly 105 Pro Ser Val1 Asn Asn 185 Cys Thr Arg Pro Lys 265 Cys Ser Thr Trp Tyr Gin Gly Asp 170 Trp Gly Cys Cys Tyr 250 Val1 Asn Glu Pro *Gly Cys Val Thr 155 Ser Phe G ly Pro Gly 235 Arg Arg Trp, Leu Ser 315 Pro Trp Cys 140 Thr Asp Gly Pro Thr 220 Ser ELeu M1et rhr Ser 300 Pro Leu His 125 Gly Asp Val Cys Pro 205 Asp Gly Trp Tyr Arg 285 Pro Pro Thr 110 Tyr Pro Arg Leu Thr 190 Cys Cys Pro His Val 270 Gly Leu Ser Tyr Ala Val1 Phe Ile 175 Trp Asn Phe Trp Tyr 255 Gly Glu Leu Leu Thr Pro Tyr Gly 160 Leu Met Ile Arg Leu 240 Pro Gly Arg Ser Thr Thr Gly Asp Arg 305 310 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 1395 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1. .1392 (ix) FEATURE: NAME/KEY: matpeptide 118 LOCATION: 1. .1389 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: ATG GTG GCG GGG GCC CAT TGG GGA GTC CTG GCG GGC CTC GCC TAC TAT Met Vai Ala Gly Ala His Trp Gly Val Leu Ala Gly Leu Ala Tyr Tyr 1 5 10 TCC ATG GTG GGG AAC TGG Ser Met Val Gly Asn Trp GCT AAG GTT TTG GTT GTG ATG CTA CTC 'TTT Ala~ uys Val Leu Val Val M 25TC GG GG
GCC
Ala
GAT
Asp
ATC
Ile
CTG
Leu
TAC
Tyr
TGT
Cys
ACT
Thr
CCT
Pro 145
TAT
Tyr
GGT
Gly
CTC
Leu
ATG
Met
ATC
225
GGC
Gly
ACC
Thr
CAG
Gln
AAC
Asn
AAA
Lys
CGC
Arg
GAG
Glu 130
CGA
Arg
TGC
Cys
GTC
Val kiAC2 %sn2 k.AT ),sn 3GG C ,Jly C GTC GAC Val Asp AGG GGC Arg Gly GGG CAT ACC CGC GTC Gly His Thr Arg Val 40 CTT GTG TCC CTC TTI Leu Vai Ser Leu Phe Ser Giy Gly A AGC CCC GGG T
CTC
Leu
TGC
Cys
CAC
His
TCC
Ser 115
CCT
Pro
CCG
Pro
TTC
Phe
CCC
Pro
AAC
Asn 195
GGC
Gly
GGG
GTA
Val
AAC
Asn
AAA
Lys 100
ATC
Ile
AAC
Asn
TGT
Cys
ACC
Thr
ACG
Thr 180
ACG
Thr
ACT
Thr
GCC
Ala
AAC
Asn
GAC
Asp
TTC
Phe
GAC
Asp
AGC
Ser
GGT
Gly
CCG
Pro 165
TAT
Tyr
CGG
Arg GGG Gly GGC Gly
ACC AAC GGC AGT Thr Asn Gly Ser 70 TCC CTC CAA ACA Ser
AAC
Asn
AAG
Lys
TCG
Ser
ATT
Ile 150
AGC
Ser
AAC
Asn
CCG
Pro
TTC
Phe
AAC
Asn 230 Leu
TCG
Ser
TTC
Phe
GAC
Asp 135
GTA
Val
CCT
Pro
TGG
Trp
CCG
Pro
ACC
Thr 215
AAC
Asn Gin
TCT
Ser
GCT
Ala 120
CAG
Gln
CCC
Pro
GTT
Val
GGG
Gly
CGA
Arg 200
AAG
Lys
ACC
Thr Thr
GGA
Gly 105
CAG
Gln
AGG
Arg
GCG
Ala
GTG
Val
GCG
Ala 185
GGC
Gly
ACG
Thr
TTG
Leu Ser
TGG
Trp
GGG
Gly 90
TGC
Cys
GGG
Gly
CCC
Pro
TCT
Ser
GTG
Val 170
AAC
Asn
AAC
Asn
TGT
Cys
ACC
Thr Pro
CAC
His 75
TTC
Phe
CCA
Pro
TGG
Trp
TAC
Tyr
CAG
Gln 155
GGG
Gly
GAC
Asp
TGG
Trp
GGG
Gly
TGC
Cys 235 Gly
ATC
Ile
TTT
Phe
GAG
Glu
GGT
Gly
TGC
Cys 140
GTG
Val
ACG
Thr
TCG
Ser
TTC
Phe
GGC
Gly 220
CCC
Pro
S
A
A
GI
A:
C(
CC
12
TC
Ty.
TG
AC
Th
G.A
As G1 20 Pr rh let Leu Leu Phe ;CA GCA GCC TCC .la Ala Ala Ser CG GCT CAG AAA er Ala Gin Lys AC AGG ACT GCC sn Arg Thr Ala CC GCA CTA TTC la Ala Leu Phe 3C TTG GCC AGC Leu Ala Ser 110 'C CTC ACT TAC :o Leu Thr Tyr G CAC TAC GCG p His Tyr Ala IC GGT CCA GTG 's Gly Pro Val 160 'C OAT CGG TTT Lr Asp Arg Phe 175 TGTG CTG ATT p Val Leu Ile 190 C TGT ACA TGG y Cys Thr Trp C CCG TOC AAC o Pro Cys Asn T GAC TOT -rTT r Asp Cys Phe 240 96 144 192 240 288 336 384 432 480 528 576 624 672 720 CGG AAG CAC Arg Lys His CCC GAG GCC ACC TAC AGA TGC GGT TCT GGG CCC Pro Giu 245 Ala Thr Tyr Arg 250 Cys Gly Ser Gly Pro 255
CTG
Leu
CCC
Pro
GGC
Gly
CGT
Arg 305
CTG
Leu
CCG
Pro
GTG
Val
AAA~
Lys
ATC
Ile 385 GCC 'I Ala I CAT G His G AAG G Lys G
ACA
Thr
TGC
Cys
GTG
Val 290
TGT
Cys
TCT
Ser
GCC
Ala
CAA
TGG
rrp 370 rGC rTA ~eu GC2 ;ly
GC
~ly1 CCT AGG TOT Pro Arg Cys 260 ACT GTC AAC Thr Val Asn 275 GAG CAC AG Giu His Arg GAC TTG GAG Asp Leu Glu ACA ACA GAG Thr Thr Giu 325 CTA TCC ACC Leu Ser Thr 340
ATG
Met
TTC
Phe
TTC
Phe
GAC
Asp 310
TGG
Trp
GGC
Gly GTT CAT Vai His ACC ATC Thr Ile 280 GAA GCC Giu Ala 295 AGG OAT Arg Asp CAG ATA Gin Ile CTO ATC Leu Ile 119 TAC CCA TAT AGO Tyr Pro Tyr Arg 265 TTC AAG OTT AGO Phe Lys Val Arg GCA TOC AAT TOO Aia Cys Asn Trp 300 AGA TCA GAG CTT- Arg Ser Oiu Leu 315 CTO CCC TOT TC Leu Pro Cys Ser 330 CAC CTC CAT CAG His Leu His Gin 345
CTC
Leu
ATG
Met 285
ACT
Thr
AGC
Ser
TTC
Phe
AAC
Asn TOG CAC TA Trp His Ty 270 TAC OTO 00 Tyr Vai Gil COA OGA GA Arg Giy Oh CCO CTG CT( Pro Leu LeL 320 ACC ACC CTG Thr Thr Leu 335 ATC OTO GAC Ile Vai Asp 350
C
r
G
y 8i6 864 912 960 1008 1056 1104 1152 1200 i2 48 1296 1344 rAC T'yr 355 33AG 3lu 3CC la
;AG
;iu
~CT
'hr ,00 .rg 35
CTG
Leu
TAT
Tyr
TGC
Cys
AAC
Asn
CTT
Leu 420
CTG
Leu
TAC
Tyr
GTC
Val
TTA
Leu
CTG
Leu 405
TCC
Ser
GTC
Val
GGT
Gly
CTG
Leu
TGG
Trp 390
GTG
Val
TTC
Phe
CCT
Pro
GTA
Val
TTG
Leu 375
ATG
Met
GTC
Val
CTT
Leu
GGT
Gly GGG Gly 360
CTC
Leu
ATG
Met
CTC
Leu
GTG
Val
GCG
Ala 440 TCG GCG Ser Ala TTC CTT Phe Leu CTG CTG Leu Leu AAT GCG Asn Ala 410 TTC TTC Phe Phe 425 GCA TAC Ala Tyr
GTT
Val
CTC
Leu
ATA
Ile 395
GCG
Ala
TGT
Cys
GCC
Ala GTC TCC Val Ser 365 CTG GCA Leu Ala 380 GCT CAA Ala Gln GCC GTG Ala Val GCT GCC Ala Ala TTC TAT Phe Tyr 445
CTT
Leu
GAC
Asp
GCT
Ala GCC Ala
TGG
Trp 430
GGC
Gly
GTC
Val
GCG
Ala
GAG
Glu GGG Gly 415
TAC
Tyr
GTG
Val
ATC
Ile
CGC
Arg GCC Ala 400
GCG
Ala
ATC
Ile
TGG
Trp 4 CCO CTO CTC CTO CTT CTO CTG 0CC TTA CCA CCA CGA OCT TAT 0CC TAGTAA Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro Arg Aia Tyr Ala 450 455 460 INFORMATION FOR SEQ ID NO: 46: SEQUENCE CHARACTERISTICS: LENGTH: 463 amino acids TYPE: amino acid TOPOLOGY: iinear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION- qfn Tn Mrn. Met Val Ala Gly Ala His Trp Gly Val Leu Ala Oly Leu Ala Tyr Tyr 1 5 10 Ser Met Val Oly Asn Trp, Ala Lys Val Leu Val Val Met Leu Leu ?he 25 1395 Ala Gly Val Asp Gly His Thr Arg Val 40 120 Ser Gly Gly Ala Ala Ala Ser Asp Thr Arg Gly Leu Val Ser Leu Phe Ser Pr 55 Asn Ile Leu Tyr Cys Thr Pro 145 Tyr Gly Leu Met Ile 225 Arg Leu Pro Gly Arg 305 Leu Pro I Vai C Lys '1 Gin Leu Val Asn Thr Gly Ser Trr Asn Lys Arg Glu 130 Arg Cys Val1 Asn Asn 210 Gly Lys rhr Cys la 1 290 ;er ;ln rp Cys His Ser 115 Pro Pro Phe Pro Asn 195 Gly Gly His Pro Thr 275 Glu Asp Thr Leu Tyr 355 Glu Asn ILys 100 Ile Asn Cys Thr Thr 180 Thr Thr Ala Pro Arg 260 Val His Leu Thr Ser 340 Leu Tyr Asp Phe Asp Ser Gly Pro 165 Tyr Arg Giy Gly Glu 245 Cys Asn Arg Giu Glu 325 ['hr ['yr la 1 70 Ser Asn Lys Ser Ile 150 Ser Asn Pro Phe Asn 230 Ala Met Phe Phe Asp 310 Trp C Gly I Gly Leu L Leu Gin Thr Set Phe Asp 135 Val1 Pro Trp Pro Thr 215 Asn rhr la 1 ['hr 3lu ~95 krg ln ~eu Tal ~eu 175 *Ser Ala 120 Gin Pro Val Gly Arg 200 Lys Thr Tyr His Ile 280 Ala.
Asp.
Ile Ile Gly 360 Leu Giy 105 Gin Arg Ala Val1 Al a 185 Gly Thr Leu Aia Tyr 265 Phe Al a Arg Lieu -His 345 Ser Phe Gly 90 'Cys Gly Pro Ser Val 170 Asn Asn Cys Thr Arg 250 Pro Lys Cys Ser Pro 330 Leu Ala Leu I Hi~ PhE Prc Trp Tyr Gin 155 Gly Asp Trp Gly Cys 235 Cys Tyr Val1 ksn G1u 315 :ys -[is fal ~eu oGly s Ile SPhe Glu Gly Cys 140 *Val *Thr Ser Phe Gly 220 Pro Gly Arg I Arg Trp 1* 300 Leu S Ser P Gin A Val S 3 Leu A 380 Se: Asi Al~ Arc Pro 125 Trp Cys Thr Asp Gly 205 Pro ['hr 3er ~eu ~et 'hr ;er 'he ~sn er l1a r Al~ 'i Arc a Ala Leu 110 Leu His Gly Asp Val1 190 Cys Pro Asp Gly Trp 270 Tyr Arg Pro Thr Ile 350 Leu Asp aGin Lys Thr Ala Leu Phe IAla Thr Tyr Tyr Ala Pro Val 160 Arg Phe 175 Leu Ile Thr Trp Cys Asn Cys Phe 240 Pro Trp 255 His Tyr Val Gly Gly Giu Leu Leu 320 Thr Leu 335 Val Asp Val Ile Ala Arg Cys Ala Cys Leu Trp 390 Met Met Leu Leu AlGiAaGi Ala Gln Ala Glu 121 Ala Leu Glu Asn Leu Val Val Leu Asn Ala Ala Ala Val Ala Gly Ala 405 410 415 His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys Ala Ala Trp Tyr Ile 420 425 430 Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala Phe Tyr Gly Val Trp 435 440 445 Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro Arg Ala Tyr Ala 450 455 460 INFORMATION FOR SEQ ID NO: 47: SEQUENCE CHARACTERISTICS: LENGTH: 2082 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL:
NO
(iii) ANTI-SENSE: NO (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..2079 (ix) FEATURE: NAME/KEY: matpeptide LOCATION: 1..2076 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: AAT TTG GGT AAG GTC ATC GAT ACC CTT ACA TGC GGC TTC GCC GAC CTC 48 Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 GTG GGG TAC ATT CCG CTC GTC GGC GCC CCC CTA GGG GGC GCT GCC AGG 96 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 GCC CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA 144 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 ACA GGG AAT TTG CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG 192 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55 CTG TCC TGT CTG ACC GTT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG 240 Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 TCC GGG ATG TAC CAT GTC ACG AAC GAC TGC TCC AAC TCA AGC ATT GTG 288 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 TAT GAG GCA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC 336 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 110 GTT CGG GAG AAC AAC TCT TCC CGC TGC TGG GTA GCG CTC ACC CCC ACG 384 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 122 115 CTC OCA GCT Leu Ala Ala 130 GTC GAT ?TG Val Asp Leu 145 GGG GAC CTC Gly Asp Leu TCG CCT CGC Ser Pro Arg CCC GGC CAC Pro Gly His 195 TGG TCG CCT Trp Ser Pro 210 CAA GCT GTC Gin Ala Val 225 GGC CTC GCC Gly Leu Ala GTG ATG CTA C Val Met Leu L
AGG
Arg
CTC
Leu
TGC
Cys
CGG
Arg 180
ATA
Ile
A~CA
rhr 3TGC lal I
[AC
[yr2 2
AA
As:
GT.
Va
GG)
Glj 165
CAT
His
ACG
Thr kCG rhr 3AC ~sp
.'AT
yr :45 120 C GCC AGC GTC CCC n Ala Ser Val Pro 135 rGGG GCG GCT GCT L Gly Ala Ala Ala 150 STCT GTC ITC CTC 'Ser Val Phe Leu GAG ACG GTG CAG Giu Thr Val Gin 185 GGT CAC CGT ATG Gly His Arg Met 200 GCC CTG GTG GTA Ala Leu Val Val 215 ATG GTG GCG GGG Met Val Ala Gly 230 TCC ATG GTG GGG Ser Met Val Gly GCC GGC GTC GAC G Ala Gly Val Asp G 265
ACC
Thr
TTC
Phe
GTC
Val1 170
GAC
Asp
GCT
Ala
TCG
Ser 3CCC kla 1 AC2 ~sn 2 ~50
~GGC
Ily H
AC
Th TGr Cy~ 15!
TC(
Sez
TGC
Cys
TGG
Trp
CAG
31n
'AT
is ~35
'GG
rp
'AT
i1s 12 G ACA AT r Thr Ii 140 T TCC GC' s Ser AL.
SCAG CT( Gin Lei AAT TG( Asn CyE GAT ATC Asp Met 205 CTG CTC Leu Leu 220 TGG GGA Trp Gly GCT AAG Ala Lys ACC CGC Thr Arg A CGA CGC CAC e Arg Arg His T' ATG TAC GTG a Met Tyr Val 160 3 TTC ACC ATC 1 Phe Thr Ile 175 TCA ATC TAT Ser Ile Tyr 190 IATG ATG AAC .Met Met Asn CGG ATC CCA Arg Ile Pro GTC CTG GCG Vai Leu Ala 240 GTT TTG GTT Val Leu Val 255 GTG TCA GGA Val Ser Gly 432 480 528 576 624 672 720 768 816 .eu Phe 260
GGG
Gly
GGG
Gly
ATC
Ile 305
TTT
Phe
GAG
Glu
GGT
Gly
TGC
Cys
GTG
Val 385
GCA
Ala
TCG
Ser 290
AAC
Asn
GCC
Ala
CGC
Arg
CC
Pro
TGG
Trp 370
GCA
Ala 275
GCT
Ala
AGG
Arg
GCA
Ala
TTG
Leu
CTC
Leu 355
CAC
His
GCC
Ala
CAG
Gln
ACT
Thr
CTA
Leu
GCC
Al a 340
ACT
Thr
TAC
TCC
Ser
AAA
Lys
GCC
Al a
TTC
Phe 325
AGC
Ser
TAC
Tyr (rrr.
GAT
Asp
ATC
Ile
CTG
Leu 310
TAC
Tyr
TGT
Cys
ACT
Thr ACC AGG Thr Arg 280 CAG CTC Gin Leu 295 AAC TGC Asn Cys AAA CAC Lys His CGC TCC Arg Ser GAG CCT Glu Pro 360 Arg Pro 375
GGC
Gly
GTA
Val
AAC
Asn
AAA
Lys
ATC
Ile 345
AAC
Asn Cys
CTT
Leu
AAC
Asn
GAC
Asp
TTC
Phe 330
GAC
Asp
AGC
Ser Gly
GTG
Val
ACC
Thr
TCC
Ser 315
AAC
Asn
AAG
Lys
TCG
Ser Ile
AGC
Ser 395
TCC
Ser
AAC
Asn 300
CTC
Leu
TCG
Ser
TTC
Phe
GAC
Asp Val1 380
CTC
Leu 285
GGC
Gly
CAA
Gln
TCT
Ser
GCT
Ala
CAG
Gln 365 Pro
TTT
Phe
AGT
Ser
ACA
Thr
GGA
Gly
CAG
Gln 350
AGG
Arg Ala
AGC
Ser
TGG
Trp
GGG
Gly
TGC
Cys 335
GGG
Gly
CCC
Pro
TCT
Ser
CCC
Pro
CAC
His
TTC
Phe 320
CCA
Pro
TGG
Trp
TAC
Tyr
CAG
Gln
GGG
Gly 400 864 912 960 1008 1056 1104 1152 1200 T'yr Ala Pro rGC GGT CCA GTG 'ys Gly Pro Val TGC TTC ACC CCG Cys Phe Thr Pro CCT GTT GTG GTG Pro Val Val Val
AC
Th
TC(
Se~
TT(
PhE
GGC
Gly
CCC
Pro 465
GGT
Gly
AGG
Arg
AGG
Arg
TGG
Trp, G ACC GAT CGG r Thr Asp Arg G GAT GTG CTG r Asp Val Leu 420 GGC TGT ACA Gly Cys Thr 435 CCC CCG TGC Pro Pro Cys 450 ACT GAC TGT Thr Asp Cys TCT GGG CCC Ser Gly Pro CTC TGG CAC Leu Trp His 500 ATG TAC GTG Met Tyr Val C 515 ACT CGA GGA G Thr Arg Gly G 530 Tr.
Ph~ 401
ATT
Ile
TGG
Trp
AAC
Asn
TTT
Phe
TGG
Trp 485 TAC Tyr
GGG
Gly
GAG
[GGT GTC SGly Val CTC AAC Leu Asn ATG AAT Met Asn ATC GGG Ile Gly 455 CGG AAG Arg Lys 470 CTG ACA Leu Thr CCC TGC Pro Cys GGC GTGC Giy Val C CGT TGTC Arg Cys A 535 CCC ACG Pro Thr AAC ACG Asn Thr 425 GGC ACT Gly Thr 440 GGG GCC Gly Ala CAC CCC His Pro CCT AGG Pro Arg %.CT GTC r'hr Val 505 ,AG CAC2 ;lu His 1 20 AC TTG G sp Leu G 123 TAT AAC Tyr Asn 410 CGG CCG Arg Pro GGG TTC Gly Phe GGC AAC Gly Asn GAG GCC Glu Ala 475 rGT ATG Cys Met 490 kAC TTC .7 ks5n Phe I3 GG TTC G ~rg Phe G AG GAC ~lu Asp A 5
TG(
Trp
CCG
Pro
ACC
Thr
AAC
Asn 460
ACC
Thr GTT Val
ACC
Thr
GAA
Glu
GG
rg G GGG GCG SGly Ala CGA GGC Arg Gly 430 AAG ACG Lys Thr 445 ACC TTG Thr Leu TAC GCC Tyr Ala CAT TAC His Tyr ATC TTC Ile Phe 510 GCC GCA Ala Ala C 525 GAT AGA I Asp Arg S
AA(
Asi 41~
AAC
Asn
TGT
Cys
ACC
Thr
AGA
Arg
CCA
Pro 495 AAG Lys
TGC
Cys
TCA
Ser
GAC
Asp
TGG
ITrp
GG
Gly
TGC
Cys
TGC
Cys 480
TAT
Tyr
GTT
Val
AAT
Asn
GAG
Giu 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1728 1776 1824 1872 1920 1968 2016 CTT AGC Leu Ser 545 TCC TTC Ser Phe CAG AAC Gin Asn GTC TCC Val Ser CTG GCA Leu Ala 610 GCT CAA Ala Gin 625 GCC GTG Ala Vai GCT GCC Ala Ala
CCG
Pro
ACC
Thr
ATC
Ile
CTT
Leu 595
GAC
Asp
GCT
Ala
GCC
Ala
TGG
CTG CTG Leu Leu ACC CTG Thr Leu 565 GTG GAC Val Asp 580 GTC ATC Val Ile GCG CGC Ala Arg GAG GCC Glu Ala GGG GCG Gly Ala TAC ATC CTG TCT ACA Leu 550
CCG
Pro
GTG
Val
AAA
Lys
ATC
Ile
GCC
Ala 630
CAT
His Ser
GCC
Ala
CAA
Gin
TGG
Trp
TGC
Cys 615
TTA
Leu
GGC
Gly Thr
CTA
Leu
TAC
Tyr
GAG
Glu 600
GCC
Ala
GAG
Glu
ACT
ACA GAG TGG Thr Giu Trp 555 TCC ACC GGC Ser Thr Gly 570 CTG TAC GGT Leu Tyr Gly 585 TAT GTC CTG Tyr Val Leu TGC TTA TGG Cys Leu Trp AAC CTG GTG Asn Leu Val 635 CTT TCC TTC Leu Ser Phe CAG ATA CTG CCC TGT Gin Ile Leu Pro Cys CTG ATC Leu Ile GTA GGG Val Gly TTG CTC Leu Leu 605 ATG ATG Met Met 620 GTC CTC Val Leu CTT GTG Leu Val GGT GCG Gly Ala CAC CTC CAT His Leu His 575 TCG GCG GTT Ser Ala Val 590 TTC CTT CTC Phe Leu Leu CTG CTG ATA Leu Leu Ile AAT GCG GCG Asn Ala Ala 640 TTC TTC TGT Phe Phe Cvs 655 GCA TAC GCC Ala Tyr Ala 650 Trp Tyr 660 Ile AAG GGC AGG Lys Gly Arg CTG GTC Leu Val 665
CCT
Pro TTC TAT GGC GTG TGG CCG CTG CTC CTG CTT CTG CTG GCC TTA CCA CCA 2064 124 Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro 675 680 685 CGA GCT TAT GCC TAGTAA Arg Ala Tyr Ala 2082 690 INFORMATION FOR SEQ ID NO: 48: SEQUENCE CHARACTERISTICS: LENGTH: 692 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu 1 5 10 Val Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg 25 Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala 40 Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu 55 60 Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Glu Val Arg Asn Val 70 75 Ser Gly Met Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val 90 Tyr Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys 100 105 110 Val Arg Glu Asn Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr 115 120 125 Leu Ala Ala Arg Asn Ala Ser Val Pro Thr Thr Thr Ile Arg Arg His 130 135 140 Val Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val 145 150 155 160 Gly Asp Leu Cys Gly Ser Val Phe Leu Val Ser Gin Leu Phe Thr Ile 165 170 175 Ser Pro Arg Arg His Glu Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr 180 185 190 Pro Gly His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn 195 200 205 Trp Ser Pro Thr Thr Ala Leu Val Val Ser Gin Leu Leu Arg Ile Pro 210 215 220 Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu Ala 22 230 235 240 Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val Leu Val 245 250 255 Val Met Leu Leu Phe Ala Gly Val Asp Gly His Thr Arg Val Ser Gly 260 265 270 Gly Ala Ala Ala Ser Asp 275 125 Thr Arg Gly Leu 280 Val Ser Leu Phe Ser Pro 285 Gly Ser 290 lie Asn 305 Phe Ala Glu Arg Gly Pro Cys Trp 370 Val Cys 385 Thr Thr Ser Asp Phe Gly Gly Pro 450 Pro Thr 465 Gly Ser Arg Leu Arg Met Trp Thr 530 Leu Ser I 545 Ser Phe Gin Asn I Val Ser I Leu Ala 610 Ala Gin 625 Ala Gin Lys Ile Gin Leu Val Asn Thr Ar Al Leu Leu 355 
His Gly Asp Val Cys 435 Pro Asp Gly Trp Tyr 515 k.rg ?ro rhr le .eu ~sp la Thi Leu Ala 340 Thr Tyr Pro Arg Leu 420 Thr Cys Cys Pro His 500 Val Gly Leu Thr Va1 580 Va1 Ala Glu 295 Ala Leu Asn 310 Phe Tyr Lys 325 Ser Cys Arg Tyr Thr Glu Ala Pro Arg 375 Val Tyr Cys 390 Phe Gly Val 405 Ile Leu Asn Trp Met Asn Asn Ile Gly 455 Phe Arg Lys 470 Trp Leu Thr 485 Tyr Pro Cys Gly Gly Val C Glu Arg Cys 1 535 Leu Leu Ser 1 550 Leu Pro Ala L 565 Asp Val Gin I Ile Lys Trp G 6 Arg Ile Cys A 615 Ala Ala Leu G 630 Asn Gly Ser Trp His 300 Leu Gin Thr Gly Phe Cys Asn As His Ser Pro 360 Pro Phe Pro Asn Gly 440 Gly His ?ro rhr lu j20 ~sp 'hr leu 'yr ;lu .00 la Ilu Lys lie 345 Asn Cys Thr Thr Thr 425 Thr Ala Pro Arg Val 505 His Leu Thr Ser Leu 585 Tyr Cys i Asn Ph 33' As Sei Gl Pro Tyr 410 Arg Gly Gly Glu Cys 490 Asn Arg Glu 31u rhr 570 Tyr al Leu Leu p Ser 315 e Asn 0 Lys Ser Ile Ser 395 Asn Pro Phe Asn.
Ala 475 Met Phe Phe Asp Trp 555 Gly I Gly I Leu I Trp r' Val 635 Sei Phe Asp Val 380 Pro Trp Pro Thr Asn 460 Thr Val rhr Glu krg 540 ,ln eu lal 4eu let Tal Ser Ala Gin 365 Pro Val Gly Arg Lys 445 Thr Tyr His Ile Ala 525 Asp Ile Ile Gly Leu I 605 Met Leu 1 Gil Glr Arc Ala Val Ala Gly 430 Thr Leu Ala Tyr Phe 510 Ala Arg Leu His Ser 590 Phe eu %sn r Cys 335 1 Gly Pro Ser Va1 Asn 415 Asn Cys Thr Arg Pro 495 Lys Cys Ser Pro Leu 575 Ala Leu Leu Ala 320 Pro Trp Tyr Gin Gly 400 Asp Trp Gly Cys Cys 480 Tyr Va1 Asn Glu Cys 560 His Val Leu Ile Ala 640 126 Ala Val Ala Gly Ala His Gly Thr Leu Ser Phe Leu Val Phe Phe Cys 645 650 655 Ala Ala Trp Tyr Ile Lys Gly Arg Leu Val Pro Gly Ala Ala Tyr Ala 660 665 670 Phe Tyr Gly Val Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Pro 675 680 685 Arg Ala Tyr Ala 690 INFORMATION FOR SEQ ID NO: 49: SEQUENCE
CHARACTERISTICS:
LENGTH: 2433 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO
ATG
Met 1
CGC
Arg
GGA
Gly
ACT
Thr
ATC
Ile
TAC
Tyr
CTC
Leu (ix) FEATURE: NAME/KEY: CDS LOCATION: 1..2430 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 1..2427 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT Ser Thr Asn Pro Lys Pro Gin Arg Lys Thr Lys Arg 5 10 CGC CCA CAG GAC GTC AAG TTC CCG GGC GGT GGT CAG Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gin 25 GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly 40 AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT GGG AGG Arg Lys Thr Ser Glu Arg Ser Gin Pro Arg Gly Arg 55 CCC AAG GCT CGC CGA CCC GAG GGT AGG GCC TGG GCT Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp Ala 70 75 CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG GGG TGG Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp 90 CTG TCA rrCC CC TCT C nCCT A.o TGG...
Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro 100 105 AAC ACC AAC Asn Thr Asn
ATC
Ile
GTG
Val
CGA
Arg
CAG
Gin
GCA
Ala
ACA
Thr 110
GTT
Val
CGC
Arg
CAA
Gin
CCC
Pro
GGA
Gly
GAC
Asp
GGT
Gly
GCG
Ala
CCT
Pro
GGG
Gly
TGG
Trp
CCC
Pro 96 144 192 240 288 336 CGG CGT AGG Arg Arg Arg 115 TCG CGT AAT TTG Ser Arg Asn Leu AAG GTC ATC GAT Lys Val Ile Asp
ACC
Thr 125 CTT ACA TGC Leu Thr Cys TTC GCC GAC Phe Ala Asp 127 CTC GTG GGG TAC ATT CCG CTC GTC Leu Val Gly Tyr Ile Pro Leu Val GGC GCC CCC CTA Gly Ala Pro Leu 130 GGG GGC Gly Gly 145 GGC GTG Gly Val TTC CTC Phe Leu GAA GTG Glu Val AAC TCA Asn Ser 210 GGG TGC Gly Cys 225 GCG CTC Ala Leu ACA ATA Thr Ile TCC GCT Ser Ala CAG CTG Gin Leu 290 AAT TGC Asn Cys 305 GAT ATG Asp Met CTG CTC Leu Leu TGG GGA Trp Gly GCT AAG Ala Lys 370 ACC CGC Thr Arg 385 135 GCT GCC Ala Ala AAC TAT Asn Tyr TTG GCT Leu Ala 180 CGC AAC Arg Asn 195 AGC ATT Ser Ile GTG CCC Val Pro ACC CCC Thr Pro CGA CGC Arg Arg 260 ATG TAC Met Tyr 275 TTC ACC Phe Thr TCA ATC Ser Ile ATG ATG Met Met CGG ANTC Arg Ile 340 GTC CTG Val Leu 355 GTT TTG Vai Leu GTG TCA Vai Ser
AGG
Arg
GCA
Ala 165
TTG
Leu
GTG
Val1
GTG
Val1
TGC
Cys
ACG
Thr 245
CAC
His
GTG
Vai
ATC
Ile
TAT
Tyr
AAC
Asn 325
CCA
Pro
GCG
Ala
GTT
Val1
GGA
Gly GCC CTG GCG Ala Leu Ala 150 ACA GGG AAT Thr Gly Asn CTG TCC TGT Leu Ser Cys TCC GGG ATG Ser Gly Met 200 TAT GAG GCA Tyr Giu Ala 215 GTT CGG GAG Val Arg Glu 230 CTC GCA GCT Leu Ala Ala GTC GAT TTG Val Asp Leu GGG GAC CTC Gly Asp Leu 280 TCG CCT CGC Ser Pro Arg 295 CCC GGC CAC Pro Gly His 310 TGG TCG CCT Trp Ser Pro CAA GCT GTC Gin Ala Val GGC CTC GCC Gly Leu Ala 360 GTG ATG CTA Vai Met Leu 375 GGG GCA GCA Gly Ala Ala 390 CAT GGC His Gly TTG CCC Leu Pro 170 CTG ACC Leu Thr 185 TAC CAT Tyr His GCG GAC Ala Asp AAC AAC Asn Asn AGG AAC Arg Asn 250 CTC GTT Leu Val 265 TGC GGA Cys Gly CGG CAT Arg His ATA ACG Ile Thr ACA ACG Thr Thr 330 GTG GAC Val Asp 345 TAC TAT Tyr Tyr CTC TTT Leu Phe GCC TCC Ala Ser 140 GTC CGG Val Arg 155 GGT TGC Gly Cys GTT CCA Val Pro GTC ACG Val Thr ATG ATC Met Ile 220 TCT TCC Ser Ser 235 GCC AGC Ala Ser GGG GCG Gly Ala TCT GTC Ser Val GAG ACG Giu Thr 300 GGT CAC Gly His 315 GCC CTG Ala Leu ATG GTG Met Val TCC ATG Ser Met GCC GGC Ala Gly 380 GAT ACC Asp Thr 395
GTT
Val1
TCT
Ser
GCT
Al a
AAC
Asn 205
ATG
Met
CGC
Arg
GTC
Val1
GCT
Ala
TTC
Phe 285
GTG
Val
CGT
Arg
GTG
Val1
GCG
Ala
GTG
Val1 365
GTC
Val1
AGG
Arg
CTG
Leu
TTC
Phe
TCC
Ser 190
GAC
Asp
CAC
His
TGC
Cys
CCC
Pro
GCT
Ala 270
CTC
Leu
CAG
Gin
ATG
Met
GTA
Val
GGG
Gly 350
GGG
Gly
GAC
Asp
GGC
Gly GAG GAC Giu Asp 160 TCT ATC Ser Ile 175 GCT TAT Ala Tyr TGC TCC Cys Ser ACC CCC Thr Pro TGG GTA Trp Val 240 ACC ACG Thr Thr 255 TTC TGT Phe Cys GTC TCC Val Ser GAC TGC Asp Cys GCT TGG Ala Trp 320 TCG CAG Ser Gin 335 GCC CAT Ala His AAC TGG Asn Trp GGG CAT Gly His CTT GTG Leu Val 400 480 528 576 624 672 720 768 816 864 912 960 1008 1056 1104 1152 1200 1248 TCC CTC TITT AGC CCC GGG TCG GCT CAG AAA ATC CAG CTC GTA AAC ACC 128 Ser Leu Phe Ser Pro Cly Ser Ala Gin 405
AAC
Asn
CTC
Leu
TCG
Ser
TTC
Phe 465
GAC
Asp
GTA
Val
CCT
Pro
TGG
Trp
CCG
Pro 545
ACC
Thr
AAC
Asn
ACC
Thr
GTT
Val1
ACC
Thr 625
GAA
Glu
AGG
Arg
CAG
Gin GGC AGT Gly Ser CAA ACA Gin Thr 435 TCT GGA Ser Gly 450 GCT CAG Ala Gin CAG AGG Gin Arg CCC GCG Pro Ala GTT GTG Val Val 515 GGG GG Gly Ala 530 CGA GGC Arg Gly AAG ACG Lys Thr ACC TTG Thr Leu TAG GCC Tyr Ala 595 CAT TAC His Tyr 610 ATC TTC Ile Phe GCC GCA Ala Ala GAT AGA Asp Arg ATA CTG Ile Leu
TGG
Trp 420
GGG
Gly
TGC
Gys
GGG
Gly
CCC
Pro
TCT
Ser 500
GTG
Val1
AAC
Asn
AAC
Asn
TGT
Cys
ACC
Thr 580
AGA
Arg
CCA
Pro
AAG
L~ys
TGC
Cys
TCA
Ser 660
CCC
Pro
CAC
His
TTC
Phe
CCA
Pro
TG
Trp,
TAC
Tyr 485
GAG
Gln
GGG
Gly
GAC
Asp
TGG
Trp
GGG
Gly 565
TGC
Gys rGC :Cys
TAT
Tyr
GTT
Val
AAT
Asn 645
GAG
Giu
TGT
Cys
ATC
Ile
TTT
Phe
GAG
Glu
GGT
Giy 470
TGC
Cys
GTG
Val
ACG
Thr
TG
Ser TrC Phe 550
GCC
Gly
CCC
Pro
GT
Gly
AGG
Arg
AG
Arg 630
TGG
Trp
GTT
Leu
TCC
Ser AAC AGG Asn Arg GGC GCA Ala Ala 440 CGC TTG Arg Leu 455 CCC CTC Pro Leu TGG CAC Trp His TGG GGT Cys Gly ACC GAT Thr Asp 520 GAT GTG Asp Val 535 GGC TGT Gly Cys CCC GCG Pro Pro ACT GAG Thr Asp TGT CCC Ser Gly 600 CTC TGG Leu Trp 615 ATG TAG Met Tyr ACT GGA Thr Arg AGG CCC Ser Pro TTC ACC Phe Thr
ACT
Thr 425
CTA
Leu
CC
Ala
ACT
Thr
TAG
Tyr
GCA
Pro 505
CGG
Arg
GTG
Leu
ACA
Thr
TGC
Cys.
TGT
Gys 585
CCC
Pro
GAG
His
GTG
Val
GGA
Cly
CTG
Leu 665
ACC
Thr Lys 410
GCC
Ala
TTC
Phe
AGC
Ser
TAG
Tyr
C
Ala 490
GTC
Val
TTT
Phe
ATT
Ile
TGO
Trp
AAG
Asn 570
TTT
Phe
TGG
Trp
TAG
Tyr
GGG
Gly
GAG
Glu 650
GTG
Leu
CG
Leu
GTG
Leu
TAG
Tyr
TGT
Cys
ACT
Thr 475
GCT
Pro
TAT
Tyr
GT
Gly
CTC
Leu
ATG
Met 555
ATC
Ile
CCC
Arg
CTG
Leu
CCC
Pro
GCC
Gly 635
CGT
Arg
CTG
Leu
CCC
Pro
AAC
Asn
AAA
Lys
CC
Arg 460
GAG
Giu
GGA
Arg
TGG
Cys
GTC
Val1
AAC
Asn 540
AAT
Asn
GGG
Cly
AAC
Lys
ACA
Thr
TC
Cys 620
GTG
Val1
TGT
Gys
TCT
Ser
CC
Al a Ile Gin Leu Val Asn Thr
TGC
Gys
CAC
His 445
TCC
Ser
GCT
Pro
CCC
Pro
TTC
Phe
CCC
Pro 525
AAC
Asn
GCC
Cly
CCC
Gly
CAC
His
GGT
Pro 605
ACT
Thr
GAG
Giu
GAG
Asp
ACA
Thr
CTA
Leu
AAG
Asn 430
AAA
Lys
ATG
Ile
AAC
Asn
TGT
Cys
ACC
Thr 510
ACG
Thr
ACG
Thr
ACT
Thr
CC
Al a
CCC
Pro 590
AGC
Arg
GTC
Val
CAG
His
TTG
Leu
ACA
Thr 670
TCC
Ser 415
GAG
Asp
TTC
Phe
GAG
Asp
AGC
Ser
GGT
Gly 495
CCG
Pro
TAT
Tyr
CGG
Arg
GGG
Gly
GCC
Gly 575
GAG
Giu
TOT
Gys
AAC
Asn
AGG
Arg
GAG
Glu 655
GAG
Giu
ACC
Thr
TCC
Ser
AAC
Asn
AAG
Lys
TCG
Ser 480
ATT
Ile
AGC
Ser
AAG
Asn
CCG
Pro
TTC
Phe 560
AAC
Asn
CC
Ala
ATG
Met
TTC
Phe
TTC
Phe 640
GAG
Asp
TGG
Trp
GC
Gly 1296 1344 1392 1440 1488 1536 *i1584 1632 1680 1728 1776 1824 1872 1920 1968 2016 2064 129
CTG
Leu
GTA
Val 705
TTG
Leu
ATG
Met
GTC
Val
CTT
Leu
GGT
Gly 785
CTG
Leu
ATC
Ile 690
GGG
Gly
CTC
Leu
ATG
Met
CTC
Leu
GTG
Val 770
GCG
Ala
GCC
Ala 675 SCAC CTC CAT CAG His Leu His Gin TCG GCG GTT GTC Ser Ala Val Val 710 TTC CTT CTC CTG Phe Leu Leu Leu 725 CTG CTG ATA GCT Leu Leu Ile Ala 740 AAT GCG GCG GCC Asn Ala Ala Ala 755 TTC TTC TGT GCT Phe Phe Cys Ala GCA TAC GCC TTC Ala Tyr Ala Phe 790 TTA CCA CCA CGA Leu Pro Pro Arg 805 680 685 SAAC ATC GTG GAC GTG CAA TAC CTG Asn Ile Val Asp Val Gin Tyr Leu 695 700 TCC CTT GTC ATC AAA TGG GAG TAT Ser Leu Val Ile Lys Trp Glu Tyr 715 GCA GAC GCG CGC ATC TGC GCC TGC Ala Asp Ala Arg Ile Cys Ala Cys 730 CAA GCT GAG GCC GCC TTA GAG AAC Gin Ala Glu Ala Ala Leu Glu Asn 745 750 GTG GCC GGG GCG CAT GGC ACT CTT Val Ala Gly Ala His Gly ThrLeu 760 765 GCC TGG TAC ATC AAG GGC AGG CTG Ala Trp Tyr Ile Lys Gly Arg Leu 775 780 TAT GGC GTG TGG CCG CTG CTC CTG Tyr Gly Val Trp Pro Leu Leu Leu 795 GCT TAT GCC TAGTAA Ala Tyr Ala 810 TAC GGT Tyr Gly GTC CTG Val Leu 720 TTA TGG Leu Trp 735 CTG GTG Leu Val TCC TTC Ser Phe GTC CCT Val Pro CTT CTG Leu Leu 800 2112 2160 2208 2256 2304 2352 2400 2433
C
T
I
T
L
A
INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 809 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: let Ser Thr Asn Pro Lys Pro Gin Arg Lys Thr Lys 1 5 10 rg Arg Pro Gin Asp Val Lys Phe Pro Gly Gly Gly 25 ;ly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu 40 'hr Arg Lys Thr Ser Glu Arg Ser Gin Pro Arg Gly 55 le Pro Lys Ala Arg Arg Pro Glu Gly Arg Ala Trp 70 75 yr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly 90 eu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly 100 105 rg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp 115 120 Arg Gin Gly Arg Ala Trp Pro Thr 125 Thr Val Arg Gin Pro Gly Asp Thr Asn Gly Ala Pro Gly Trp Pro Cys Gly Phe 130 Gly Gly 145 Gly Val Phe Leu Glu Val Asn Ser 210 Gly Cys 225 Al a Al a Asn Leu Arg 195 Ser Val Asp Leu Ala Arg Tyr Ala 165 Ala Leu 180 Asn Val Ile Val Pro Cys Val1 Ala 150 Thr Leu Ser Tyr Val 230 Gly Tyr Ile 135 Leu Ala His Gly Asn Leu Ser Cys Leu 185 Gly Met Tyr 200 Giu Ala Ala 215 Arg Giu Asn 1 Pro Giy Pro 170 Thr His Asp Asn 30 pLeu Val Giy 140 Val Arg Val 155 Giy Cys Ser Val Pro Ala Val Thr Asn 205 Met Ile Met 220 Ser Ser Arg 235 Ala Leu Phe Ser 190 Asp His Cys Pro Giu Ser 175 Ala Cys Thr Trp Leu Asp 160 Ile Tyr Ser Pro Val1 240 Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala Ser Val Pro 245 250 Thr Thr 255 Thr Ser Gin Asn 305 Asp Leu Trp, Ala Thr 385 Ser Asn Leu Ser Phe 465 Asp Ile Ala Leu 290 Cys Met Leu Gly Lys 370 Arg Leu Gly Gln Ser 450 kl a 31n Arg Met 275 Phe Ser Met Arg Val1 355 Val Val Phe Ser Thr 435 Gly Gin Arg Arc 260 Tyr Thr Ile Met Ile 340 Leu Leu Ser Ser Trp 420 Giy Cys Giy Pro His Val Asp Lei Val Ile Tyr Asn 325 Pro Ala Val Gly Pro 405 His Phe Pro Trp Tyr 485 Gly Ser Pro 310 Trp Gin Gly Val Gly 390 Gly Ile Phe Glu Gly 470 Cys Asp Pro 295 Gly Ser Ala Leu Met 375 Ala Ser ksn k.la krg 155 Pro Vrp Leu 280 Mrg His Pro Val1 Ala 360 Leu Ala.
Ala Arg Ala 440 Leu Leu His Let 265 Cys Ile Thr Val 345 Tyr Leu Al a Gin rhr 425 Leu kla r'hr ['yr Val Gly His Thr Thr 330 Asp Tyr Phe Ser Lys 410 Ala Phe Ser Tyr Ala 490 Gly Ser Giu Gly 315 Ala Met Ser Ala Asp 395 Ile Leu T'yr Thr 475 Pro Ala Ala Ala Phe Val Phe 285 Thr Val 300 His Arg Leu Val Val Ala Met Val 365 Gly Val 380 Thr Arg Gin Leu Asn Cys Lys His 445 Arg Ser 460 Glu Pro Mrg Pro 270 Leu Gir Met Val Gly 350 Gly Asp Gly Val1 As n 430 Lys Ile Asn 1Val Asp Ala Ser 335 Ala Asn Gly Leu Asn 415 Asp Phe Asp Ser Gly 495 Cys Ser Cys Trp 320 Gin His Trp His Val1 400 Thr Ser Asn L~ys Ser 480 Ile Val Pro Ala Ser Gin Val 500 Pro Val Val Vai Gly Thr 515 131 Pro Val T} 505 Arg Phe Gi Pro Ser Asn 525 Trp Gly 530 Pro Arg 545 Thr Lys Asn Thr Thr Tyr Val His 610 Ala Asn Asp Ser Gly Thr Leu Ala 595 Tyr Asn Cys Thr 580 Mrg Pro Thr 625 Giu Arg Gin Leu Val1 705 Leu Met Val1 Leu Gly 785 Leu Ile Phe Lys Val Ala Asp Ile Ile 690 Gly Leu Met Leu Val 770 Ala Al a Ala Mrg Leu 675 His Ser Phe Leu Asn 755 Phe Al a Leu Cys Ser 660 Pro Leu Ala Leu Leu 740 Ala Phe Tyr Pro Asn 645 Giu Cys His Val1 Leu 725 Ile Ala Cys Ala Pro 805 Phe 550 rGly Pro Gly Mrg Arg 630 Trp Leu Ser Gin Val 710 Leu Ala Ala Ala Phe 790 Arg Pro Pro Cys Vai Leu Il Cys Thr Trr Thr Ser Leu 615 Met Thr Ser Phe Asn 695 Ser kla 'ln lal ~75 ['yr lia *Asp *Gly 600 Trp Tyr Arg Pro Thr 680 Ile Leu Asp Ala Ala 760 Trp Gly N Tyr I Cys 585 Pro His Val1 Gly Leu 665 Thr Val1 Val Ala G1u 745 31y C'yr Tal l a Asr 570C Phe Trp Tyr Gly Glu 650 Leu Leu Asp Ile Axg 730 Al a Ala Ile Trp Leu Asn 540 Met Asn 555 Sle Gly Arg Lys Leu Thr Pro Cys 620 Gly Val 635 Mrg Cys Leu Ser Pro Ala Val Gin 700 Lys Trp, 715 Ile Cys I Ala Leu C His Gly I Asr Glj G i His Pro 605 Thr Glu Asp Thr Leu 685 I'yr 31u l a flu 1 Thr Thr Ala Pro 590 Arg Val1 His Leu Thr 670 Ser Leu Tyr Cys Asn 750 Arc Gil G i) 575 Glu Cys Asn Arg Glu 655 Glu Thr Tyr L~eu 735 4 eu gPro SPhe 560 'Asn Ala Met Phe Phe 640 Asp Trp Gly Gly Leu 720 Trp, Val1 Phe Lys Pro 795 765 Gly Arg 780 Leu 
Leu Leu Val Leu Leu INFORMATION FOR SEQ ID NO: 51: SEQUENCE CHARACTERISTICS: LENGTH: 17 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 132 (ii) MOLECULE TYPE: peptide (ix) FEATURE: NAME/KEY: Modified-site LOCATION: 1..17 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: Ser Asn Ser Ser Glu Ala Ala Asp Met Ile Met His Thr Pro Gly Cys 1 5 10 Val INFORMATION FOR SEQ ID NO: 52: SEQUENCE CHARACTERISTICS: LENGTH: 22 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (ix) FEATURE: NAME/KEY: Modified-site LOCATION: 1..22 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: Gly Gly Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 1 5 10 Ser Pro Thr Thr Ala Leu INFORMATION FOR SEQ ID NO: 53: SEQUENCE CHARACTERISTICS: LENGTH: 37 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (ix) FEATURE: NAME/KEY: Modified-site LOCATION: 1..37 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp Cys 1 5 10 Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr 25 Pro Gly Cys Gly Lys INFORMATION FOR SEQ ID NO: 54: 133 SEQUENCE CHARACTERISTICS: LENGTH: 25 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (ix) FEATURE: NAME/KEY: Modified-site LOCATION: 1..25 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: Gly Gly Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Ala Thr 1 5 10 Gln Leu Arg Arg His Ile Asp Leu Leu INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 25 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (ix) FEATURE: NAME/KEY: Modified-site LOCATION: 1..25 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Gly Gly Thr Pro Thr Leu Ala Ala Arg Asp Ala Ser Val Pro Thr Thr 1 5 10 Thr Ile Arg Arg His Val Asp Leu Leu INFORMATION FOR SEQ ID NO: 56: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) 
MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Gin Val Arg Asn 1 5 10 Ser Thr Gly Leu INFORMATION FOR SEQ ID NO: 57: 134 SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: Gin Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp Cys Pro 1 5 10 Asn Ser Ser Ile INFORMATION FOR SEQ ID NO: 58: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala His Asp Ala Ile 1 5 10 Leu His Thr Pro INFORMATION FOR SEQ ID NO: 59: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met His Thr 1 5 10 Pro Gly Cys Val INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 19 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 135 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: His Asp Ala Ile Leu His Thr Pro Gly Val Pro Cys Val Arg Glu Gly 1 5 10 Asn Val Ser INFORMATION FOR SEQ ID NO: 61: SEQUENCE
CHARACTERISTICS:
LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: Cys Val Arg Glu Gly Asn Val Ser Arg Cys Trp Val Ala Met Thr Pro 1 5 10 Thr Val Ala Thr INFORMATION FOR SEQ ID NO: 62: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Ala Thr 1 5 10 Gin Leu Arg Arg INFORMATION FOR SEQ ID NO: 63: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: Leu Pro Ala Thr Gin Leu Arg Arg His Ile Asp Leu Leu Val Gly SeT 1 5 10 Ala Thr Leu Cys INFORMATION FOR SEQ ID NO: 64: 136 SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: Leu Val Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu 1 5 10 Cys Gly Ser Val INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Gly Cys 1 5 10 Asn Cys Ser Ile INFORMATION FOR SEQ ID NO: 66: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His 1 5 10 Arg Met Ala Trp INFORMATION FOR SEQ ID NO: 67: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 137 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro 1 5 10 Thr Ala Ala Leu INFORMATION 
FOR SEQ ID NO: 68: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: Asn Trp Ser Pro Thr Ala Ala Leu Val Met Ala Gin Leu Leu Arg Ile 1 5 10 Pro Gin Ala Ile INFORMATION FOR SEQ ID NO: 69: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: Leu Leu Arg Ile Pro Gin Ala Ile Leu Asp Met Ile Ala Gly Ala His 1 5 10 Trp Gly Val Leu INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Me: 1 5 10 Val Gly Asn Met 138 INFORMATION FOR SEQ ID NO: 71: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr Ile Val Ser 1 5 10 Gly Gly Gin Ala INFORMATION FOR SEQ ID NO: 72: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: Ser Gly Leu Val Ser Leu Phe Thr Pro Gly Ala Lys Gin Asn Ile Gin 1 5 10 Leu Ile Asn Thr INFORMATION FOR SEQ ID NO: 73: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: Gin Asn Ile Gln Leu Ile Asn Thr Asn Gly Gin Trp His Ile Asn Ser 1 5 10 Thr Ala Leu Asn INFORMATION FOR SEQ ID NO: 74: SEQUENCE CHARACTERISTICS: LENGTH: 21 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 139 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: Leu Asn Cys Asn Glu Ser Leu Asn Thr Gly Trp Trp Leu 
Ala Gly Leu 1 5 10 Ile Tyr Gin His Lys INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ala Gly Leu Ile Tyr Gin His Lys Phe Asn Ser Ser Gly Cys Pro Glu 1 5 10 Arg Leu Ala Ser INFORMATION FOR SEQ ID NO: 76: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp 1 5 10 Gin Gly Trp Gly INFORMATION FOR SEQ ID NO: 77: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTTON: SEQ ID NO: 77: Thr Asp Phe Asp Gin Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser 1 5 10 Gly Pro Asp Gin 140 INFORMATION FOR SEQ ID NO: 78: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro Tyr Cys Trp His Tyr Pro 1 5 10 Pro Lys Pro Cys INFORMATION FOR SEQ ID NO: 79: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys Ser Val 1 5 10 Cys Gly Pro Val INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val 1 5 10 Val Val Gly Thr INFORMATION FOR SEQ ID NO: 81: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 141 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: Pro Ser Pro 
Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr 1 5 10 Tyr Ser Trp Gly INFORMATION FOR SEQ ID NO: 82: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn Asp Thr Asp Val Phe Val 1 5 10 Leu Asn Asn Thr INFORMATION FOR SEQ ID NO: 83: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys 1 5 10 Val Cys Gly Ala INFORMATION FOR SEQ ID NO: 84: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: Gly Phe Thr Lys Val Cys Gly Ala Pro Pro Val Cys Ile Gly Gly Ala 1 5 10 Gly Asn Asn Thr 142 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 19 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ile Gly Gly Ala Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Arg 1 5 10 Lys His Pro INFORMATION FOR SEQ ID NO: 86: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: Thr Asp Cys Phe Arg Lys His Pro Asp Ala Thr Tyr Ser Arg Cys Gly 1 5 10 Ser Gly Pro Trp INFORMATION FOR SEQ ID NO: 87: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Leu Val Asp 1 5 10 Tyr Pro Tyr Arg INFORMATION FOR SEQ ID NO: 88: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 143 (xi) SEQUENCE 
DESCRIPTION: SEQ ID NO: 88: Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile 1 5 10 Asn Tyr Thr Ile INFORMATION FOR SEQ ID NO: 89: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: Pro Cys Thr Ile Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly 1 5 10 Gly Val Glu His INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Tyr Val Gly Gly Val Glu His Arg Leu Glu Ala Ala Cys Asn Trp 1 5 10 Thr Pro Gly Glu INFORMATION FOR SEQ ID NO: 91: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide xi) SEQUENCE DESCRITION.: SEQ ID NO: 91: Ala Cys Asn Trp Thr Pro Gly Glu Arg Cys Asp Leu Glu Asp Arg Asp 1 5 10 Arg Ser Glu Leu 144 INFORMATION FOR SEQ ID NO: 92: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) Glu SEQUENCE DESCRIPTION: SEQ ID NO: 92: Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr 10 Gin Trp Gin Val INFORMATION FOR SEQ ID NO: 93: SEQUENCE CHARACTERISTICS: LENGTH: 9 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: Tyr Gln Val Arg Asn Ser Thr Gly Leu 1 INFORMATION FOR SEQ ID NO: 94: SEQUENCE CHARACTERISTICS: LENGTH: 29 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: ACGTCCGTAC GTTCGAATTA ATTAATCGA INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 60 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (iii) HYPOTHETICAL: NO I ll I Illll 145 (iii) 
ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: CCTCCGGACG TGCACTAGCT CCCGTCTGTG GTAGTGGTGG TAGTGATTAT CAATTAATTG INFORMATION FOR SEQ ID NO: 96: SEQUENCE CHARACTERISTICS: LENGTH: 19 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: GTTTAACCAC TGCATGATG 19 INFORMATION FOR SEQ ID NO: 97: SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: GTCCCATCGA GTGCGGCTAC INFORMATION FOR SEQ ID NO: 98: SEQUENCE CHARACTERISTICS: LENGTH: 45 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
II
146 CGTGACATGG TACATTCCGG ACACTTGGCG CACTTCATAA GCGGA INFORMATION FOR SEQ ID NO: 99: SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL:
NO
(iii) ANTI-SENSE:
NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: TGCCTCATAC ACAATGGAGC TCTGGGACGA GTCGTTCGTG AC 42 INFORMATION FOR SEQ ID NO: 100: SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL:
NO
(iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: TACCCAGCAG CGGGAGCTCT GTTGCTCCCG AACGCAGGGC AC 42 INFORMATION FOR SEQ ID NO: 101: SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: TGTCGTGGTG GGGACGGAGG CCTGCCTAGC TGCGAGCGTG GG 42 INFOPRMATIO' FOR SEQ ID NO: 102: SEQUENCE CHARACTERISTICS: LENGTH: 48 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 147 (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL:
NO
(iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: CGTTATGTGG CCCGGGTAGA TTGAGCACTG GCAGTCCTGC ACCGTCTC 48
INFORMATION FOR SEQ ID NO: 103: SEQUENCE CHARACTERISTICS: LENGTH: 42 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: CAGGGCCGTT CTAGGCCTCC ACTGCATCAT CATATCCCAA GC 42
INFORMATION FOR SEQ ID NO: 104: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: CCGGAATGTA CCATGTCACG AACGAC 26
INFORMATION FOR SEQ ID NO: 105: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: GCTCCATTGT GTATGAGGCA GCGG 24
INFORMATION FOR SEQ ID NO: 106: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: GAGCTCCCGC TGCTGGGTAG CGC 23
INFORMATION FOR SEQ ID NO: 107: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: CCTCCGTCCC CACCACGACA ATACG 25
INFORMATION FOR SEQ ID NO: 108: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: CTACCCGGGC CACATAACGG GTCACCG 27
INFORMATION FOR SEQ ID NO: 109: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: GGAGGCCTAC AACGGCCCTG GTGG 24
INFORMATION FOR SEQ ID NO: 110: SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: TTCTATCGAT TAAATAGAAT TC 22
INFORMATION FOR SEQ ID NO: 111: SEQUENCE CHARACTERISTICS: LENGTH: 23 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iii) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: GCCATACGCT CACAGCCGAT CCC 23
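Each entry in the listing above states a LENGTH and then repeats the residue count after the sequence itself. As a minimal sketch, assuming only the sequences and lengths printed for SEQ ID NO: 102 to 111, the following Python snippet checks that each declared length agrees with the number of bases actually listed. The names PRIMERS, normalize, declared_length_mismatches and reverse_complement are illustrative and are not part of the specification.

```python
# Sequences and declared lengths copied from SEQ ID NO: 102-111 in the listing.
# The dictionary layout and all function names are illustrative assumptions.
PRIMERS = {
    102: ("CGTTATGTGG CCCGGGTAGA TTGAGCACTG GCAGTCCTGC ACCGTCTC", 48),
    103: ("CAGGGCCGTT CTAGGCCTCC ACTGCATCAT CATATCCCAA GC", 42),
    104: ("CCGGAATGTA CCATGTCACG AACGAC", 26),
    105: ("GCTCCATTGT GTATGAGGCA GCGG", 24),
    106: ("GAGCTCCCGC TGCTGGGTAG CGC", 23),
    107: ("CCTCCGTCCC CACCACGACA ATACG", 25),
    108: ("CTACCCGGGC CACATAACGG GTCACCG", 27),
    109: ("GGAGGCCTAC AACGGCCCTG GTGG", 24),
    110: ("TTCTATCGAT TAAATAGAAT TC", 22),
    111: ("GCCATACGCT CACAGCCGAT CCC", 23),
}

# Translation table for Watson-Crick complements of unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq: str) -> str:
    """Remove the 10-base grouping spaces and upper-case the sequence."""
    return "".join(seq.split()).upper()


def declared_length_mismatches(primers: dict) -> dict:
    """Return {seq_id: (declared, actual)} for entries whose lengths disagree."""
    mismatches = {}
    for seq_id, (seq, declared) in primers.items():
        actual = len(normalize(seq))
        if actual != declared:
            mismatches[seq_id] = (declared, actual)
    return mismatches


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return normalize(seq).translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # An empty dict means every declared length matches its sequence.
    print(declared_length_mismatches(PRIMERS))
```

The reverse_complement helper is included only as a convenience for comparing single-stranded oligonucleotides against one another; the listing itself does not state which of these sequences are intended to pair.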

Claims (14)

1. Method for purifying recombinant HCV single or specific oligomeric envelope proteins selected from the group consisting of E1 and/or E2 and/or E1/E2, wherein a disulphide bond cleavage or reduction step is carried out with a disulphide bond cleavage agent on the recombinantly expressed protein.
2. Method according to claim 1, wherein said disulphide bond cleavage or reduction step is carried out under partial cleavage or reducing conditions.
3. Method according to claim 1 or claim 2, wherein said disulphide bond cleavage agent is dithiothreitol (DTT), preferably in a concentration range of 0.1 to 50 mM, preferably 0.1 to 20 mM, more preferably 0.5 to 10 mM.
4. Method according to claim 1, wherein said disulphide bond cleavage agent is combined with a detergent.
5. Method according to claim 4, wherein said detergent is Empigen-BB, preferably at a concentration of 1 to 10%, more preferably at a concentration of
6. Method according to claim 1 or claim 2, wherein said disulphide bond cleaving agent comprises a combination of a classical disulphide bond cleavage agent, such as DTT, and a detergent, such as Empigen-BB.
7. Method according to any one of claims 1 to 6, further comprising the step of blocking disulphide bond reformation with an SH group blocking agent.
8. Method according to claim 7, wherein said SH group blocking agent is N-ethylmaleimide (NEM) or a derivative thereof.
9. Method according to claim 7, wherein said step of blocking the disulphide bond reformation is brought about by low pH conditions.
10. Method according to any one of claims 1 to 9, further comprising at least the following steps:
- lysing recombinant E1 and/or E2 and/or E1/E2 expressing host cells, possibly in the presence of an SH blocking agent such as N-ethylmaleimide (NEM),
- recovering said HCV envelope proteins by affinity purification such as by means of lectin-chromatography, such as lentil-lectin chromatography, or by means of immunoaffinity using anti-E1 and/or anti-E2 specific monoclonal antibodies,
- reduction or cleavage of the disulphide bonds with a disulphide bond cleaving agent, such as DTT, preferably also in the presence of an SH blocking agent, such as NEM or Biotin-NEM, and,
- recovering the reduced E1 and/or E2 and/or E1/E2 envelope proteins by gel filtration and possibly also by a subsequent Ni2+-IMAC chromatography and desalting step.
11. An isolated HCV envelope protein obtained by a method according to any one of claims 1 to 4
12. An isolated HCV envelope protein according to claim 11, wherein said recombinant HCV envelope proteins are expressed from recombinant mammalian cells such as by using a vaccinia virus based system.
13. An isolated HCV envelope protein according to claim 11, wherein said recombinant HCV envelope proteins are expressed from recombinant yeast cells.
14. Method according to any one of claims 1 to 10, further comprising at least the following steps:
- growing a host cell transformed with a recombinant vector comprising a nucleotide sequence allowing the expression of a HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 protein in a suitable culture medium,
- causing expression of said vector under suitable conditions, and,
- lysing said transformed host cells, preferably in the presence of an SH group blocking agent, such as N-ethylmaleimide (NEM),
- recovering said HCV envelope protein by affinity purification by means of for instance lectin-chromatography or immunoaffinity chromatography using anti-E1 and/or anti-E2 specific monoclonal antibodies, with said lectin being preferably lentil-lectin, followed by,
- incubation of the eluate of the previous step with a disulphide bond cleavage agent, such as DTT, preferably also in the presence of an SH group blocking agent, such as NEM or Biotin-NEM, and,
- isolating the HCV single or specific oligomeric E1 and/or E2 and/or E1/E2 protein by means of gel filtration and possibly also by means of an additional Ni2+-IMAC chromatography and desalting step.
15. An isolated HCV envelope protein according to any one of claims 11 to 13, for use as a medicament.
16. An isolated HCV envelope protein according to any one of claims 11 to 13, for use as a vaccine for immunising a mammal, preferably humans, against HCV, comprising administrating an effective amount of said composition, possibly accompanied by pharmaceutically acceptable adjuvants, to produce an immune response.
17. Method for immunising a mammal against HCV, comprising the steps of administering to said mammal an effective amount of an isolated HCV envelope protein according to any one of claims 11 to 13, to produce an immune response.
18. Method according to claim 17, wherein said mammal is human.
19. Vaccine composition for immunising a mammal, preferably humans, against HCV, comprising an effective amount of an isolated HCV envelope protein according to any one of claims 11 to 13, possibly accompanied by pharmaceutically acceptable adjuvants.
20. An isolated HCV envelope protein according to any one of claims 11 to 13, for in vitro detection of HCV antibodies present in a biological sample.
21. Method for in vitro diagnosis of HCV antibodies present in a biological sample, comprising at least the following steps:
a) contacting said biological sample with an isolated HCV envelope protein according to any one of claims 11 to 13, preferably in an immobilised form under appropriate conditions which allow the formation of an immune complex,
b) removing unbound components,
c) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions,
d) detecting the presence of said immune complexes visually or mechanically.
22. Kit for determining the presence of HCV antibodies present in a biological sample, comprising:
- at least one isolated HCV envelope protein according to any one of claims 11 to 13, preferably in an immobilised form on a solid substrate,
- a buffer or components necessary for producing the buffer enabling binding reaction between these proteins and antibodies against HCV present in said biological sample,
- a means for detecting the immune complexes formed in the preceding binding reaction.
23. Use of an isolated HCV envelope protein according to any one of claims 11 to 13, comprising HCV E1 protein, more particularly HCV single E1 proteins, for in vitro monitoring HCV disease or prognosing the response to treatment, particularly with interferon, of patients suffering from HCV infection comprising:
- incubating a biological sample from a patient with HCV infection with an E1 protein or a suitable part thereof under conditions allowing the formation of an immunological complex,
- removing unbound components,
- calculating the anti-E1 titers present in said sample at the start of and during the course of treatment,
- monitoring the natural course of HCV disease, or prognosing the response to treatment of said patient on the basis of the amount of anti-E1 titers found in said sample at the start of treatment and/or during the course of treatment.
24. Kit for monitoring HCV disease or prognosing the response to treatment, particularly with interferon, of patients suffering from HCV infection comprising:
- at least one isolated HCV envelope protein, more particularly an E1 protein according to any one of claims 11 to 13,
- a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins and the anti-E1 antibodies present in a biological sample,
- means for detecting the immune complexes formed in the preceding binding reaction,
- possibly also an automated scanning and interpretation device for inferring a decrease of anti-E1 titers during the progression of treatment.
25. A serotyping assay for detecting one or more serological types of HCV present in a biological sample, more particularly for detecting antibodies of the different types of HCV to be detected combined in one assay format, comprising at least the following steps:
a) contacting the biological sample to be analysed for the presence of HCV antibodies of one or more serological types, with at least one isolated HCV E1 and/or E2 and/or E1/E2 protein according to any one of claims 11 to 13, preferentially in an immobilised form under appropriate conditions which allow the formation of an immune complex,
b) removing unbound components,
c) incubating the immune complexes formed with heterologous antibodies, with said heterologous antibodies being conjugated to a detectable label under appropriate conditions,
d) detecting the presence of said immune complexes visually or mechanically (e.g. by means of densitometry, fluorimetry, colorimetry) and inferring the presence of one or more HCV serological types present from the observed binding pattern.
26. Kit for serotyping one or more serological types of HCV present in a biological sample, more particularly for detecting the antibodies to these serological types of HCV comprising:
- at least one isolated HCV E1 and/or E2 and/or E1/E2 protein according to any one of claims 11 to 13,
- a buffer or components necessary for producing the buffer enabling the binding reaction between these proteins and the anti-E1 antibodies present in a biological sample,
- means for detecting the immune complexes formed in the preceding binding reaction,
- possibly also an automated scanning and interpretation device for detecting the presence of one or more serological types present from the observed binding pattern.
27. An isolated HCV envelope protein according to any one of claims 11 to 13 to raise upon immunisation an E1 and/or E2 specific monoclonal antibody.
28. An isolated HCV envelope protein according to any one of claims 11 to 13, for the preparation of an immunoassay kit.
29. Use of an isolated HCV envelope protein according to any one of claims 11 to 13, for detecting HCV antibodies present in a biological sample.
30. Use of an isolated HCV envelope protein according to any one of claims 11 to 13 for the manufacture of a medicament for immunising a mammal against HCV.
31. Use according to claim 30, wherein said mammal is human.
32. Method according to any one of claims 1 to 10 and substantially as herein described with reference to any one of the examples.
33. An isolated HCV envelope protein according to any one of claims 11 to 13 and substantially as herein described with reference to any one of the examples and/or any one of the figures.
34. Method according to claim 17 or claim 18 and substantially as herein described with reference to any one of the examples.
35. A vaccine composition according to claim 19 and substantially as herein described with reference to any one of the examples.
36. Method according to claim 21 and substantially as herein described with reference to any one of the examples.
37. Kit according to claim 22 and substantially as herein described with reference to any one of the examples.
38. Use according to claim 23 and substantially as herein described with reference to any one of the examples.
39. Kit according to claim 24 and substantially as herein described with reference to any one of the examples.
40. Serotyping assay according to claim 25 and substantially as herein described with reference to any one of the examples.
41. Kit according to claim 26 and substantially as herein described with reference to any one of the examples.

DATED this 2nd day of June 1999
INNOGENETICS N.V.
Attorney: IVAN A. RAJKOVIC, Fellow Institute of Patent Attorneys of Australia, of BALDWIN SHELSTON WATERS
AU33824/95A 1994-07-29 1995-07-31 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use Ceased AU708174B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU33824/95A AU708174B2 (en) 1994-07-29 1995-07-31 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use
AU57127/99A AU757962B2 (en) 1994-07-29 1999-10-29 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP94870132 1994-07-29
EP94870132 1994-07-29
PCT/EP1995/003031 WO1996004385A2 (en) 1994-07-29 1995-07-31 Purified hepatitis c virus envelope proteins for diagnostic and therapeutic use
AU33824/95A AU708174B2 (en) 1994-07-29 1995-07-31 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU57127/99A Division AU757962B2 (en) 1994-07-29 1999-10-29 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use

Publications (2)

Publication Number Publication Date
AU3382495A AU3382495A (en) 1996-03-04
AU708174B2 AU708174B2 (en) 1999-07-29

Family

ID=25622598

Family Applications (1)

Application Number Title Priority Date Filing Date
AU33824/95A Ceased AU708174B2 (en) 1994-07-29 1995-07-31 Purified hepatitis C virus envelope proteins for diagnostic and therapeutic use

Country Status (1)

Country Link
AU (1) AU708174B2 (en)

Also Published As

Publication number Publication date
AU3382495A (en) 1996-03-04
