WO1995032292A2 - Detection of viral antigens coded by reverse-reading frames - Google Patents

Detection of viral antigens coded by reverse-reading frames

Publication number
WO1995032292A2
Authority
WO
WIPO (PCT)
Application number
PCT/US1995/006266
Other languages
French (fr)
Other versions
WO1995032292A3 (en)
Inventor
Kirk E. Fry
Jungsuh P. Kim
Frederick A. Murphy
Jeffrey M. Linnen
Original Assignee
Genelabs Technologies, Inc.
Application filed by Genelabs Technologies, Inc.
Priority to AU25941/95A (AU2594195A)
Publication of WO1995032292A2
Publication of WO1995032292A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/24011 Flaviviridae
    • C12N2770/24211 Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24222 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes

Abstract

The present invention describes a novel method to determine whether a test subject is infected with a selected virus, where the virus has an RNA genome. The method includes the identification of polypeptide antigens coded by reverse open reading frames, that is, reading frames coded in the opposite direction to the major known viral reading frames. Further, the invention includes the reverse frame polypeptide antigens, methods of identifying and producing such polypeptide antigens, and antibodies that are specifically immunoreactive with said polypeptide antigens. These polypeptide antigens and antibodies are useful in diagnostic and therapeutic applications.

Description

DETECTION OF VIRAL ANTIGENS CODED BY REVERSE-READING FRAMES
FIELD OF INVENTION
This invention relates to a novel method to determine whether a subject is infected with a virus. The method includes the use of antigens coded by reverse open reading frames, that is, reading frames coded in the opposite direction to the major known viral reading frames. Also included in the invention are the reverse frame antigens, methods of identifying and producing such antigens, and antibodies that are specifically immunoreactive with said antigens. The invention also relates to diagnostic and therapeutic methods involving these antigens and antibodies.
REFERENCES
Abstracts, The 1992 San Diego Conf.: Genetic Recognition, Clin. Chem. 39(4):705 (1993).
Alter, H.J., Abstracts of Int. Symp. on Viral Hepatitis and Liver Dis., p. 47 (1993).
Altschul, S., et al., J. Mol. Biol. 215:403-10 (1990).
Ascadi, G., et al., Nature 352:815 (1991).
Ausubel, F.M., et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY. John Wiley and Sons, Inc., Media PA.
Barany, F., PCR Methods Appl. 1:5 (1991).
Baron, S., et al., JAMA 266:1375 (1991).
Beames, et al., Biotechniques 11:378 (1991).
Blackburn, G.F., et al., Clin. Chem. 37:1534-1539 (1991).
Bradley, D.W., et al., J. Infec. Dis. 148:2 (1983).
Bradley, D.W., et al., J. Gen. Virol. 69:1 (1988).
Bradley, D.W., et al., Proc. Nat. Acad. Sci. USA 84:6277 (1987).
Briand, J.-P., et al., J. Immunol. Meth. 156:255 (1992).
Cahill, P., et al., Clin. Chem. 37:1482 (1991).
Cathal, G., et al., DNA 2:329-335 (1983).
Chambers, T.J., et al., Ann. Rev. Microbiol. 44:649 (1990).
Chambers, T.J., et al., PNAS 87:8898 (1990).
Chomczynski, et al., Anal. Biochem. 162:159 (1987).
Christian, R.B., et al., J. Mol. Biol. 227:771 (1992).
Crea, R., U.S. Patent No. 4,888,286, issued December 19, 1989.
DeGraaf, M.E., et al., Gene 128:13 (1993).
DiBisceglie, A.M., et al., Hepatology 16:649 (1992).
DiBisceglie, A.M., et al., NEJM 321:1506 (1989).
DiCesare, J., et al., Biotechniques 15:152-157 (1993).
Dienstag, J.L., et al., Sem. Liver Disease 6:67 (1986).
Eaton, M.A.W., et al., U.S. Patent No. 4,719,180, issued Jan. 12, 1988.
Egholm, et al., Nature 365:566 (1993).
EPO patent application 88310922.5, filed 11/18/88.
Farci, P., et al., NEJM 330:88 (1994).
Felgner and Rhodes, Nature 349:251 (1991).
Folgori, A., et al., EMBO J. 12:2236 (1994).
Francki, R.I.B., et al., Arch. Virol. Suppl. 2:223 (1991).
Frank, R., and Doring, R., Tetrahedron 44:6031-6040 (1988).
Geysen, M., et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1984).
Gingeras, T.R., et al., Ann. Biol. Clin. 48:498 (1990).
Gingeras, T.R., et al., J. Inf. Dis. 164:1066 (1991).
Goeddel, D.V., Methods in Enzymology 185 (1990).
Grakoui, A., et al., J. Virol. 7:2832 (1993).
Guatelli, J.C., et al., Proc. Natl. Acad. Sci. USA 82:1874 (1990).
Gubler, U., et al., Gene 25:263 (1983).
Guthrie, C., and G.R. Fink, Methods in Enzymology 194 (1991).
Gutterman, J.U., PNAS 91:1198 (1994).
Harlow, E., et al., ANTIBODIES: A LABORATORY MANUAL. Cold Spring Harbor Laboratory Press (1988).
Hieter, P.A., et al., Cell 22:197-207 (1980).
Hijikata, M., et al., PNAS 88:5547 (1991).
Hochuli, E., in GENETIC ENGINEERING, PRINCIPLES AND PRACTICE, VOL. 12 (J. Setlow, Ed.) Plenum, NY, pp. 87-98 (1990).
Holodniy, M., et al., Biotechniques 12:36 (1992).
Hopp, T.P., et al., Proc. Natl. Acad. Sci. USA 78:3824-3828 (1981).
Horn, T., and Urdea, M.S., Nuc. Acids Res. 17:6959 (1989).
Houghten, R.A., Proc. Natl. Acad. Sci. USA 82:5131 (1985).
Hudson, D., J. Org. Chem. 52:617 (1988).
Irwin, M.J., et al., J. Virol. 68:5036 (1994).
Jacob, J.R., et al., in THE MOLECULAR BIOLOGY OF HCV. Section 4, pages 387-392 (1991).
Jacob, J.R., et al., Hepatology 10:921-927 (1989).
Jacob, J.R., et al., J. Infect. Dis. 161:1121-1127 (1990).
Kakumu, S., et al., Gastroenterol. 105:507 (1993).
Katz, E.D., and Dong, M., Biotechniques 8:546 (1990).
Kawasaki, E.S., et al., in PCR TECHNOLOGY: PRINCIPLES AND APPLICATIONS OF DNA AMPLIFICATION (H.A. Erlich, ed.) Stockton Press (1989).
Koonin, E.V., and Dolja, V.V., Critical Reviews in Biochem. & Mol. Biol. 28:375-430 (1993).
Krausslich, H.G., et al., VIRAL PROTEINASES AS TARGETS FOR CHEMOTHERAPY (Cold Spring Harbor Press, Plainville, NY) (1989).
Kumar, R., et al., AIDS Res. Human Retroviruses 5(3):345-354 (1989).
Larder, B.A., and Kemp, S.D., Science 246:1155 (1989).
Lanford, R.E., et al., In Vitro Cell. Dev. Biol. 25:174-182 (1989).
Lomell, H., et al., Clin. Chem. 48:492 (1990).
Maniatis, T., et al., MOLECULAR CLONING: A LABORATORY MANUAL. Cold Spring Harbor Laboratory (1982).
Marshall, W.S., and Caruthers, M.H., Science 259:1564 (1993).
Messing, J., Methods in Enzymol. 101:20 (1983).
Michelle, et al., International Symposium on Viral Hepatitis.
Miller, J.H., EXPERIMENTS IN MOLECULAR GENETICS. Cold Spring Harbor Laboratories, Cold Spring Harbor, NY (1972).
Morrissey, D.V., et al., Anal. Biochem. 181:345 (1989).
Moss, B., et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (Section IV, Unit 16) (1991).
Mullis, K.B., U.S. Patent No. 4,683,202, issued 28 July 1987.
Mullis, K.B., et al., U.S. Patent No. 4,683,195, issued 28 July 1987.
Osikowicz, G., et al., Clin. Chem. 36:1586 (1990).
Patterson, J.L., and Fernandez-Larsson, R., Rev. Infect. Dis. 12:1139 (1990).
Pearson, W.R., and Lipman, D.J., PNAS 85:2444-2448 (1988).
Pearson, W.R., Methods in Enzymology 183:63-98 (1990).
Porath, J., Protein Exp. and Purif. 2:63 (1992).
Pritchard, C.G., and Stefano, J.E., Ann. Biol. Chem. 48:492 (1990).
Reichard, O., et al., Lancet 337:1058 (1991).
Reilly, P.R., et al., BACULOVIRUS EXPRESSION VECTORS: A LABORATORY MANUAL (1992).
Reyes, G., et al., Science 247:1335 (1990).
Reyes, G., et al., Molecular and Cellular Probes 5:473-481 (1991).
Roberts, N.A., et al., Science 248:358 (1990).
Sanger, et al., Proc. Natl. Acad. Sci. 74:5463 (1977).
Sambrook, J., et al., in MOLECULAR CLONING: A LABORATORY MANUAL. Cold Spring Harbor Laboratory Press, Vol. 2 (1989).
Saiki, R.K., et al., Science 239:487-491 (1988).
Scharf, S.J., et al., Science 221:1076 (1986).
Schuler, G.D., et al., Proteins: Struc., Func. and Genet. 9:180 (1989).
Scott, J.K., and Smith, G.P., Science 149:386 (1990).
Scott, J.K., et al., Proc. Natl. Acad. Sci. USA 89:5398 (1992).
Smith, D.B., et al., Gene 62:31 (1988).
Smith, J.P., Curr. Opin. Biotechnol. 2:668 (1991).
Sreenivasan, M.A., et al., J. Gen. Virol. 65:1005 (1984).
Tam, A., et al., Virology 185:120 (1991).
Tam, J.P., Proc. Natl. Acad. Sci. USA 85:5409 (1988).
Tonkinson, J.L., and Stein, C.A., Antiviral Chem. and Chemother. 4(4):193-200 (1993).
Ulmer, et al., Science 259:1745 (1993).
Urdea, M., Clin. Chem. 21:725 (1993).
Urdea, M., et al., AIDS 7:S11 (1993).
Wages, J.M., et al., Amplifications 10:1-6 (1993).
Walker, G.T., PCR Methods Appl. 2:1-6 (1993).
Wang, A.M., et al., in PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS (M.A. Innis, et al., eds.) Academic Press (1990).
Wang, B., et al., Proc. Natl. Acad. Sci. USA 90:4156 (1993).
Whetsell, A.J., et al., J. Clin. Micro. 30:845 (1992).
Wolf, J.A., et al., Nature 242:1465 (1990).
Vacca, J.P., et al., PNAS 91:4096 (1994).
VanGemen, B., et al., J. Virol. Methods 43:177 (1993).
Valenzuela, P., et al., Nature 298:344 (1982).
Valenzuela, P., et al., in HEPATITIS B, eds. I. Millman, et al., Plenum Press, pages 225-236 (1984).
Yarbrough, et al., J. Virol. 65:5790 (1991).
Yoshio, T., et al., U.S. Patent No. 4,849,350, issued July 18, 1989.
BACKGROUND OF THE INVENTION
Viral hepatitis resulting from a virus other than hepatitis A virus (HAV) and hepatitis B virus (HBV) has been referred to as non-A, non-B hepatitis (NANBH). NANBH can be further defined based on the mode of transmission of an individual type, for example, enteric versus parenteral.
One form of NANBH, known as enterically transmitted NANBH or ET-NANBH, is contracted predominantly in poor-sanitation areas where food and drinking water have been contaminated by fecal matter. The molecular cloning of the causative agent, referred to as the hepatitis E virus (HEV), has recently been described (Reyes et al., 1990; Tam et al.).
A second form of NANBH, known as parenterally transmitted NANBH, or PT-NANBH, is transmitted by parenteral routes, typically by exposure to blood or blood products. The rate of this hepatitis varied by (i) locale, (ii) whether ALT testing was done in blood banks, and (iii) elimination of patients at high risk for AIDS. Approximately 10% of transfusions caused PT-NANBH infection and about half of those went on to a chronic disease state (Dienstag). After implementation of anti-HCV testing, HCV seroconversion per unit transfused decreased to less than 1% among heart surgery patients (Alter).
Human plasma samples documented as having produced post-transfusion NANBH in human recipients have been used successfully to produce PT-NANBH infection in chimpanzees (Bradley). RNA isolated from infected chimpanzee plasma has been used to construct cDNA libraries in an expression vector for immunoscreening with serum from human subjects with chronic PT-NANBH infection. This procedure identified a PT-NANBH specific cDNA clone, and the viral sequence was then used as a probe to identify a set of overlapping fragments making up 7,300 contiguous basepairs of a PT-NANBH viral agent. The sequenced viral agent has been named the hepatitis C virus (HCV) (for example, the sequence of HCV is presented in EPO patent application 88310922.5, filed 11/18/88). The full-length sequence (~9,500 nt) of HCV is now available.
Primate transmission studies conducted at the Centers for Disease Control (CDC; Phoenix, AZ, 1973-1975; 1978-1983) originally provided substantial evidence for the existence of multiple agents of non-A, non-B hepatitis (NANBH): the primary agents associated with the majority of cases of NANBH are now recognized to be HCV and HEV (see above), for PT-NANBH and ET-NANBH, respectively. Later epidemiologic studies conducted at the CDC (Atlanta, GA, 1989-present) using both research (prototype) and commercial tests for anti-HCV antibody showed that approximately 20% of all community-acquired NANBH was also non-C. Further testing of these samples for the presence of HEV (co-owned, co-pending U.S. Application Serial No. 07/372,711, filed 28 June 1989, herein incorporated by reference) has indicated that these cases of community-acquired non-A, non-B, non-C hepatitis were also non-E. Liver biopsy specimens, sera and plasma of Sentinel County patients (study of Drs. Miriam Alter and Kris Krawczynski) also showed that many bona fide cases of NANBH that were also non-C hepatitis (negative for all markers of HCV infection, both serologically and by Reverse Transcriptase-Polymerase Chain Reaction (RT-PCR; Kawasaki, et al.; Wang, et al., 1990)) developed subsequently into chronic hepatitis with presentation of chronic persistent hepatitis (CPH) or chronic active hepatitis (CAH) consistent with a viral infection.
SUMMARY OF THE INVENTION
The present invention describes polypeptide antigens encoded by the reverse-frame of a selected virus having an RNA genome, where the polypeptide antigen is specifically immunoreactive with serum infected with the selected RNA virus. Reverse-frames are defined as open reading frames that are transcribed and translated in the opposite direction to the major known reading frames for the virus.
In one embodiment of the present invention the selected virus is a single, positive strand RNA virus. Exemplary viruses of this group are Hepatitis G Virus, also disclosed herein, and Hepatitis C Virus.
In another aspect, the present invention includes a method for detecting serum infected with a virus having an RNA genome. In this method, serum from a test subject is reacted with a reverse-frame polypeptide antigen. The polypeptide antigen is then examined for the presence of bound antibody. Alternatively, antibodies against the reverse-frame polypeptide antigen may be used to detect the presence of the reverse-frame polypeptide antigen in a sample.
In one embodiment of the detection method, a polypeptide antigen is attached to a solid support. The serum is then exposed to the polypeptide antigen/support followed by addition of a reporter-labelled anti-human antibody. The polypeptide antigen/support is then examined to detect the presence of reporter-labelled antibody bound to the polypeptide antigen/support.
The invention also includes antibodies directed against reverse-frame polypeptide antigens, including monoclonal antibodies and substantially isolated preparations of polyclonal antibodies.
Further, the invention includes diagnostic kits containing the above described reverse-frame polypeptide antigens and/or antibodies against these polypeptide antigens.
In another embodiment, the present invention includes a method of identifying a polypeptide antigen that is specifically immunoreactive with antibodies against a selected virus having an RNA genome. In the method, polynucleotide sequences corresponding to the coding sequences for identifiable viral proteins are determined for the selected virus. A second polynucleotide sequence complementary to the first polynucleotide (encoding identifiable viral protein(s)) is examined for the presence of an open reading frame (ORF). The immunological properties of the polypeptide encoded by the open reading frame are then examined to determine if the polypeptide is specifically immunoreactive with antibodies (e.g., infected serum) against the virus. In one embodiment, the first polynucleotide is the genomic strand of a single, positive strand RNA virus (for example, HCV) that encodes a polyprotein.
Also, the following step can be included in the method of identifying a polypeptide antigen. Reverse-frames from a number of variants can be compared to determine the reverse-frame coding sequences that are conserved between variants. These conserved reverse-frame polypeptides are then evaluated for their antigenic properties.
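The ORF-examination step of the identification method described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used by the inventors: the function names, the `min_codons` cutoff, and the choice to treat any stop-free stretch of codons as an open reading frame are assumptions made for the example.

```python
# Illustrative sketch: scan the strand complementary to a viral coding strand
# for open reading frames. All names and the min_codons default are hypothetical.

COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement as upper-case DNA (U is read as T)."""
    return seq.translate(COMPLEMENT)[::-1].upper()


def find_reverse_orfs(coding_strand: str, min_codons: int = 30):
    """List (frame, start, end, sequence) for stop-free stretches on the
    complementary strand that are at least min_codons codons long.

    Coordinates are 0-based positions on the reverse-complemented sequence;
    'end' is exclusive and excludes the terminating stop codon.
    """
    rc = reverse_complement(coding_strand)
    orfs = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(rc) - 2, 3):
            if rc[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos, rc[start:pos]))
                start = pos + 3
        # Stretch that runs to the end of the sequence without a stop codon.
        end = frame + ((len(rc) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            orfs.append((frame, start, end, rc[start:end]))
    return orfs
```

Candidate frames found this way would still have to be expressed and tested for immunoreactivity; the scan only narrows down which complementary-strand regions are worth examining.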
These and other objects and features of the invention will be more fully appreciated when the following detailed description of the invention is read in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 shows the relationship of the SEQ ID NO:14 open reading frame to the 470-20-1 clone.
Figure 2 shows an exemplary protein profile from gradient fractions eluted from a glutathione affinity column.
Figure 3 shows an exemplary sodium dodecyl sulfate polyacrylamide gel electrophoresis analysis of fraction samples from Figure 2.
Figure 4A shows an exemplary protein profile from gradient fractions eluted from an anion exchange column.
Figures 4B and 4C show exemplary sodium dodecyl sulfate polyacrylamide gel electrophoresis analysis of fraction samples from Figure 4A.
Figures 5A and 5B show amino acid alignments of HGV with two other members of the Flaviviridae family: Hog Cholera Virus and Hepatitis C Virus.
Figure 6 shows a map of a portion of the vector pGEX-Hisb-GE3-2, a bacterial expression plasmid carrying an HGV epitope.
Figures 7A to 7D show the results of Western blot analysis of the purified HGV GE3-2 protein.
Figures 8A to 8D show the results of Western blot analysis of the purified HGV Y5-10 antigen.
Figures 9A to 9D show the results of Western blot analysis of the following antigens: Y5-5, GE3-2 and Y5-10.
Figure 10 shows the relative positions of two exemplary reverse open reading frame antigens.
Figures 11A, 11B and 11C show a multiple sequence alignment for the K3 clones.
DETAILED DESCRIPTION OF THE INVENTION
I. DEFINITIONS
The terms defined below have the following meaning herein:
1. "nonA/nonB/nonC/nonD/nonE hepatitis viral agent {N-(ABCDE)}," herein provisionally designated HGV, means a virus, virus type, or virus class which (i) is transmissible in some primates, including mystax, chimpanzees or humans, (ii) is serologically distinct from hepatitis A virus (HAV), hepatitis B virus (HBV), hepatitis C virus (HCV), hepatitis D virus (HDV), and hepatitis E virus (HEV) (although HGV may co-infect a subject with these viruses), and (iii) is a member of the virus family Flaviviridae.
2. "HGV variants" are defined as viral isolates that have at least about 40%, preferably 55%, more preferably 70%, or most preferably 80% global sequence homology, that is, sequence identity over a length (comparable to SEQ ID NO:14) of the viral genome polynucleotide sequence, to the HGV polynucleotide sequences disclosed herein. "Sequence homology" is determined essentially as follows. Two polynucleotide sequences of the same length (preferably, the entire viral genome) are considered to be homologous to one another, if, when they are aligned using the ALIGN program, over 40%, or preferably 55%, more preferably 70%, or most preferably 80% of the nucleic acids in the highest scoring alignment are identically aligned using a ktup of 1, the default parameters and the default PAM matrix.
The ALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs (Pearson, et al., 1988; Pearson, 1990; program available from William R. Pearson, Department of Biological Chemistry, Box 440, Jordan Hall, Charlottesville, VA).
In determining whether two viruses are "highly homologous" to each other, the complete sequences of all the viral proteins (or the polyprotein) for one virus are optimally, globally aligned with the viral proteins or polyprotein of the other virus using the ALIGN program of the above suite using a ktup of 1, the default parameters and the default PAM matrix. Regions of dissimilarity or similarity are not excluded from the analysis. Differences in lengths between the two sequences are considered as mismatches. Alternatively, viral structural protein regions are typically used to determine relatedness between viral isolates. Highly homologous viruses have over 40%, or preferably 55%, more preferably 70%, or most preferably 80% global polypeptide sequence identity.
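The identity thresholds above can be illustrated with a short Python sketch. It does not reproduce the ALIGN program: it assumes the two sequences have already been globally aligned (gaps written as "-"), counts every gap column as a mismatch so that length differences lower the score, and then reports which of the 40/55/70/80% tiers is exceeded. The function names and tier labels are hypothetical.

```python
# Illustrative sketch: percent identity over a finished global alignment,
# with gap columns (length differences) counted as mismatches.

def global_percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def homology_tier(percent_identity: float):
    """Return a label for the highest threshold exceeded, or None if 40% is not exceeded."""
    tiers = ((80.0, "most preferred"), (70.0, "more preferred"),
             (55.0, "preferred"), (40.0, "minimum"))
    for threshold, label in tiers:
        if percent_identity > threshold:
            return label
    return None
```

For example, `global_percent_identity("ACGT-A", "ACGTTA")` counts five identical columns out of six, and the single gap column is scored as a mismatch.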
3. Two nucleic acid fragments are considered to be "selectively hybridizable" to an HGV polynucleotide, if they are capable of specifically hybridizing to HGV or a variant thereof (e.g., a probe that hybridizes to HGV nucleic acid but not to polynucleotides from other members of the virus family Flaviviridae) or specifically priming a polymerase chain reaction: (i) under typical hybridization and wash conditions, as described, for example, in Maniatis, et al., pages 320-328, and 382-389, or (ii) using reduced stringency wash conditions that allow at most about 25-30% basepair mismatches, for example: 2 x SSC, 0.1% SDS, room temperature twice, 30 minutes each; then 2 x SSC, 0.1% SDS, 37°C once, 30 minutes; then 2 x SSC room temperature twice, 10 minutes each, or (iii) selecting primers for use in typical polymerase chain reactions (PCR) under standard conditions (for example, in Saiki, R.K., et al.), which result in specific amplification of sequences of HGV or its variants.
Preferably, highly homologous nucleic acid strands contain less than 20-30% basepair mismatches, even more preferably less than 5-20% basepair mismatches. These degrees of homology can be selected by using wash conditions of appropriate stringency for identification of clones from gene libraries (or other sources of genetic material) , as is well known in the art.
4. An "HGV polynucleotide," as used herein, is defined as follows. For polynucleotides greater than about 100 nucleotides, HGV polynucleotides encompass polynucleotide sequences encoded by HGV variants and homologous sequences as defined in "2" above. For polynucleotides less than about 100 nucleotides in length, HGV polynucleotides encompass sequences that selectively hybridize to sequences of HGV or its variants. Further, HGV polynucleotides include polynucleotides encoding HGV polypeptides (see below).
The term "polynucleotide" as used herein refers to a polymeric molecule having a backbone that supports bases capable of hydrogen bonding to typical nucleic acids, where the polymer backbone presents the bases in a manner to permit such hydrogen bonding in a sequence specific fashion between the polymeric molecule and a typical nucleic acid (e.g., single-stranded DNA). Such bases are typically inosine, adenosine, guanosine, cytosine, uracil and thymidine. Numerous polynucleotide modifications are known in the art, for example, labels, methylation, and substitution of one or more of the naturally occurring nucleotides with an analog. Polymeric molecules include double and single stranded RNA and DNA, and backbone modifications thereof, for example, methylphosphonate linkages. Further, such polymeric molecules include alternative polymer backbone structures such as, but not limited to, polyvinyl backbones (Pitha, 1970a/b) and morpholino backbones (Summerton, et al., 1992, 1993). A variety of other charged and uncharged polynucleotide analogs have been reported. Numerous backbone modifications are known in the art, including, but not limited to, uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, and carbamates) and charged linkages (e.g., phosphorothioates and phosphorodithioates). In addition, linkages may contain the following exemplary modifications: pendant moieties, such as proteins (including, for example, nucleases, toxins, antibodies, signal peptides and poly-L-lysine); intercalators (e.g., acridine and psoralen); chelators (e.g., metals, radioactive metals, boron and oxidative metals); alkylators; and other modified linkages (e.g., alpha anomeric nucleic acids).
5. An "HGV polypeptide" is defined herein as any polypeptide homologous to an HGV polypeptide. "Homology," as used herein, is defined as follows. In one embodiment, a polypeptide is homologous to an HGV polypeptide if it is encoded by nucleic acid that selectively hybridizes to sequences of HGV or its variants. In another embodiment, a polypeptide is homologous to an HGV polypeptide if it is encoded by HGV or its variants, as defined above. Polypeptides of this group are typically larger than 15, preferably 25, or more preferably 35, contiguous amino acids. Further, for polypeptides longer than about 60 amino acids, sequence comparisons for the purpose of determining "polypeptide homology" are performed using the local alignment program LALIGN. The polypeptide sequence is compared against the HGV amino acid sequence or any of its variants, as defined above, using the LALIGN program with a ktup of 1, default parameters and the default PAM.
Any polypeptide with an optimal alignment longer than 60 amino acids and greater than 65%, preferably 70%, or more preferably 80% of identically aligned amino acids is considered to be a "homologous polypeptide." The LALIGN program is found in the FASTA version 1.7 suite of sequence comparison programs (Pearson, et al., 1988; Pearson, 1990; program available from William R. Pearson, Department of Biological Chemistry, Box 440, Jordan Hall, Charlottesville, VA).
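The "homologous polypeptide" criterion just stated reduces to two comparisons once a local alignment has been obtained. The Python sketch below assumes the alignment length and the number of identically aligned residues are already known (for instance, read from the output of a local alignment program); it does not call LALIGN, and the function and parameter names are hypothetical.

```python
# Illustrative sketch: apply the length (> 60 residues) and identity (> 65%)
# cutoffs to the summary statistics of a local alignment.

def is_homologous_polypeptide(alignment_length: int,
                              identical_positions: int,
                              min_length: int = 60,
                              min_identity_pct: float = 65.0) -> bool:
    """True if the alignment is longer than min_length and its percent
    identity is greater than min_identity_pct."""
    if alignment_length <= 0:
        return False
    identity_pct = 100.0 * identical_positions / alignment_length
    return alignment_length > min_length and identity_pct > min_identity_pct
```

Passing `min_identity_pct=70.0` or `80.0` applies the preferred and more preferred cutoffs, respectively.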
6. A polynucleotide is "derived from" HGV if it has the same or substantially the same basepair sequence as a region of an HGV genome, cDNA of HGV or complements thereof, or if it displays homology as noted under "2", "3" or "4" above.
A polypeptide is "derived from" HGV if it is (i) encoded by an open reading frame of an HGV polynucleotide, or (ii) displays homology to HGV polypeptides as noted under "2" and "5" above, or (iii) is specifically immunoreactive with HGV positive sera.
7. "Substantially isolated" and "purified" are used in several contexts and typically refer to at least partial purification of an HGV virus particle, component (e.g., polynucleotide or polypeptide), or related compound (e.g., anti-HGV antibodies) away from unrelated or contaminating components (e.g., serum cells, proteins, non-HGV polynucleotides and non-anti-HGV antibodies). Methods and procedures for the isolation or purification of compounds or components of interest are described below (e.g., affinity purification of fusion proteins and recombinant production of HGV polypeptides).
8. In the context of the present invention, the phrase "nucleic acid sequences," when referring to sequences which encode a protein, polypeptide, or peptide, is meant to include degenerate nucleic acid sequences which encode homologous protein, polypeptide or peptide sequences as well as the disclosed sequence.
9. An "epitope" is the antigenic determinant defined as the specific portion of an antigen with which the antigen binding portion of a specific antibody interacts.
10. An antigen or epitope is "specifically immunoreactive" with HGV positive sera when the epitope/antigen binds to antibodies present in the HGV infected sera but does not bind to antibodies present in the majority (greater than about 90%, preferably greater than 95%) of sera from individuals who are not or have not been infected with HGV. "Specifically immunoreactive" antigens or epitopes may also be immunoreactive with monoclonal or polyclonal antibodies generated against specific HGV epitopes or antigens.
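The negative-panel part of this definition can be expressed as a simple proportion. The Python sketch below assumes a panel of sera from uninfected individuals has been tested against the antigen and each result recorded as reactive (True) or non-reactive (False); the function names and the panel representation are hypothetical, and the 90% and 95% cutoffs are taken from the definition above.

```python
# Illustrative sketch: check whether an antigen leaves more than 90% (or 95%)
# of sera from uninfected individuals non-reactive.

def nonreactive_fraction(negative_panel_reactive: list) -> float:
    """Fraction of uninfected-donor sera that did NOT react with the antigen."""
    if not negative_panel_reactive:
        raise ValueError("panel is empty")
    nonreactive = sum(1 for reacted in negative_panel_reactive if not reacted)
    return nonreactive / len(negative_panel_reactive)


def meets_specificity_cutoff(negative_panel_reactive: list,
                             cutoff: float = 0.90) -> bool:
    """True if the non-reactive fraction is greater than the cutoff."""
    return nonreactive_fraction(negative_panel_reactive) > cutoff
```

The definition also requires binding to antibodies in HGV-infected sera; that positive-panel requirement is not modeled here.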
An antibody or antibody composition (e.g., polyclonal antibodies) is "specifically immunoreactive" with HGV when the antibody or antibody composition is immunoreactive with an HGV antigen but not with HAV, HBV, HCV, HDV or HEV antigens. Further, "specifically immunoreactive antibodies" are not immunoreactive with antigens typically present in normal sera obtained from subjects not infected with or exposed to HGV, HAV, HBV, HCV, HDV or HEV.
II. ISOLATION OF HGV ASSOCIATED SEQUENCES
As one approach toward identifying clones containing HGV sequences, a cDNA library was prepared from HGV-infected sera in the expression vector lambda gt11 (Example 1). Polynucleotide sequences were then selected for the expression of peptides which are immunoreactive with serum PNF 2161. PNF 2161 was believed to contain an etiologic agent of NANBH other than HCV. First round screening was typically performed using the PNF 2161 serum (used to generate the phage library). It is also possible to screen with other suspected N-(ABCDE) sera.
Recombinant proteins identified by this approach provide candidates for peptides which can serve as substrates in diagnostic tests. Further, the nucleic acid coding sequences identified by this approach serve as useful hybridization probes for the identification of additional HGV coding sequences.
The sera described above were used to generate cDNA libraries in lambda gt11 (Example 1). In the method illustrated in Example 1, infected serum was precipitated in 8% PEG without dilution, and the libraries were generated from the resulting pelleted virus. Sera from infected human sources were treated in the same fashion.
As an advantageous alternative to PEG precipitation, ultracentrifugation can be used to pellet particulate agents from infected sera or other biological specimens. To isolate viral particles from which nucleic acids could be extracted, serum (up to 2 ml) is diluted to approximately 10 ml with PBS, spun at 3K for 10 minutes, and the supernatant is centrifuged for a minimum of 2 hours at 40,000 rpm (approximately 110,000 x g) in a Ti70.1 rotor (Beckman Instruments, Fullerton, CA) at 4°C. The supernatant is then aspirated and the pellet extracted by standard nucleic acid extraction techniques.
cDNA libraries were generated using random primers in reverse transcription reactions with RNA extracted from pelleted sera as starting material. The resulting molecules were ligated to Sequence Independent Single Primer Amplification (SISPA; Reyes, et al., 1991) linker primers and expanded in a non-selective manner, and then cloned into a suitable vector, for example, lambda gt11, for expression and screening of peptide antigens. Alternatively, the lambda gt10 vector may also be used.
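For the ultracentrifugation step described above, rotor speed and relative centrifugal force are related by the standard formula RCF = 1.118 x 10^-5 x r x rpm^2, with the radius r in centimeters. The Python sketch below applies that formula; the function names are hypothetical, and the radius is back-calculated from the stated speed and force rather than taken from a rotor data sheet.

```python
# Illustrative sketch: convert between rotor speed (rpm) and relative
# centrifugal force (x g) using RCF = 1.118e-5 * r_cm * rpm**2.

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_from_rcf(rcf: float, rpm: float) -> float:
    """Radius in cm at which the given speed produces the given RCF."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


if __name__ == "__main__":
    # Radius implied by "40,000 rpm (approximately 110,000 x g)".
    print(round(radius_from_rcf(110_000, 40_000), 2), "cm")
```

Run as a script, this reports the radius at which 40,000 rpm corresponds to 110,000 x g; the force experienced by a sample varies along the length of the tube, so a single figure is necessarily an approximation.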
Lambda gt11 is a particularly useful expression vector which contains a unique EcoRI insertion site 53 base pairs upstream of the translation termination codon of the β-galactosidase gene. Thus, an inserted sequence is expressed as a β-galactosidase fusion protein which contains the N-terminal portion of the β-galactosidase gene product, the heterologous peptide, and optionally the C-terminal region of the β-galactosidase peptide (the C-terminal portion being expressed when the heterologous peptide coding sequence does not contain a translation termination codon).
This vector also produces a temperature-sensitive repressor (cI857) which causes viral lysogeny at permissive temperatures, e.g., 32°C, and leads to viral lysis at elevated temperatures, e.g., 42°C. Advantages of this vector include: (1) highly efficient recombinant clone generation, (2) ability to select lysogenized host cells on the basis of host-cell growth at permissive, but not non-permissive, temperatures, and (3) production of recombinant fusion protein. Further, since phage containing a heterologous insert produces an inactive β-galactosidase enzyme, phage with inserts are typically identified using a colorimetric substrate conversion reaction employing β-galactosidase.
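The read-through condition for the fusion protein described above (the C-terminal β-galactosidase region is expressed only when the insert lacks a translation termination codon) can be checked with a short Python sketch. It assumes the insert sequence is supplied already in the reading frame of the upstream β-galactosidase coding sequence; the function names are hypothetical, and effects of linkers or of insert length on the downstream frame are not modeled.

```python
# Illustrative sketch: look for an in-frame stop codon in a cloned insert.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(insert: str, frame: int = 0):
    """Return the 0-based position of the first in-frame stop codon, or None."""
    seq = insert.upper().replace("U", "T")
    for pos in range(frame, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


def reads_through_insert(insert: str, frame: int = 0) -> bool:
    """True if translation can continue past the insert (no in-frame stop)."""
    return first_in_frame_stop(insert, frame) is None
```

A stop codon that is present in the sequence but out of frame, as in "ATGCTAAGC", does not terminate translation and is therefore ignored by the check.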
Example 1 describes the preparation of a cDNA library for the N-(ABCDE) hepatitis sera PNF 2161. The library was immunoscreened using PNF 2161 (Example 3). A number of lambda gt11 clones were identified which were immunoreactive. Immunopositive clones were plaque-purified and their immunoreactivity retested. The immunoreactivity of the clones with normal human sera was also tested.
These clones were also examined for the "exogenous" nature of the cloned insert sequence. This basic test establishes that the cloned fragment does not represent a portion of human or other potentially contaminating nucleic acids (e.g., E. coli, S. cerevisiae and mitochondrial). The clone inserts were isolated by EcoRI digestion following polymerase chain reaction amplification. The inserts were purified, then radiolabelled and used as hybridization probes against membrane-bound normal human DNA, normal mystax DNA and bacterial DNA (control DNAs) (Example 4A).
Clone 470-20-1 (PNF2161 cDNA source) was one of the clones isolated by immunoscreening with the PNF 2161 serum. The clone was not reactive with normal human sera. The clone has a large open reading frame (203 base pairs; SEQ ID NO:3), in-frame with the β-galactosidase gene of the lambda gt11 vector. The clone is exogenous by genomic DNA hybridization analysis and genomic PCR analysis, using human, yeast and E. coli genomic DNAs (Example 4B).
The sequence was present in PNF2161 serum as determined by RT-PCR (Example 4C). RT-PCR of serially diluted PNF 2161 RNA suggested at least about 10^5 copies of 470-20-1 specific sequence per ml. The sequence was also detected in sucrose density gradient fractions at densities consistent with the sequence banding in association with a virus-like particle (Example 5).
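A lower bound on copy number from a serial-dilution RT-PCR can be estimated from the highest dilution that still yields a signal. The sketch below shows the arithmetic only; the dilution, the serum-equivalent volume per reaction and the detection limit are hypothetical values, not figures taken from the Examples.

```python
import math


def min_copies_per_ml(last_positive_dilution: float,
                      serum_equiv_ml: float,
                      detection_limit_copies: float = 1.0) -> float:
    """Lower-bound estimate of target copies per ml of serum.

    last_positive_dilution: fold dilution of the last positive reaction
    serum_equiv_ml: serum volume represented by the undiluted RNA input
    detection_limit_copies: copies assumed necessary for a positive signal
    """
    return detection_limit_copies * last_positive_dilution / serum_equiv_ml


if __name__ == "__main__":
    # HYPOTHETICAL inputs: signal still seen at a 1000-fold dilution of RNA
    # equivalent to 0.01 ml of serum, assuming one copy is detectable.
    estimate = min_copies_per_ml(1000, 0.01)
    print(f"at least about 10^{math.log10(estimate):.0f} copies per ml")
```

Because the detection limit is rarely a single copy in practice, an estimate of this kind is a floor rather than a measurement.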
Bacterial lysates of E. coli expressing a second clone, clone 470-expl, (SEQ ID NO:28) were also shown to be specifically immunoreactive with PNF 2161 serum at comparable levels to clone 470-20-1. The coding sequence of 470-expl was flanked by termination codons (based on sequence comparisons to SEQ ID NO:14, also see Figure 1) and had an internal methionine.
Further sequences (SEQ ID NO:14) adjacent to clone 470-20-1 were obtained by anchor polymerase chain reaction (Anchor PCR) using primers from clone 470-20-1 (Example 6). In this case a PNF 2161 2-cDNA source library was used as template, where the cDNA/complement double-stranded DNA products were ligated to lambda arms, but the mixture was not packaged.
470-20-1 specific primers were used in amplification reactions with SISPA-amplified PNF 2161 cDNA as a template (Example 4). The identity of the amplified DNA fragments was confirmed by (i) size and (ii) hybridization with a 470-20-1 specific oligonucleotide probe (SEQ ID NO:16). The 470-20-1 specific signal was detected in cDNA amplified by PCR from SISPA-amplified PNF 2161, demonstrating the presence of the 470-20-1 sequences in the source material. The 470-20-1 specific primers were also used in amplification reactions with the following RNA sources as substrate: normal mystax liver RNA, normal tamarin (Saguinus labiatus) liver RNA, and MY131 liver RNA (Example 4). The results from these experiments demonstrate the 470-20-1 sequences are present in the parent serum sample (PNF 2161) and in an RNA liver sample from an animal challenged with the PNF 2161 sample (MY131). Both normal control RNAs were negative for the presence of 470-20-1 sequences. Further, PNF 2161 serum and other cloning source or related source materials were directly tested by PCR using primers from selected cloned sequences. Specific amplification products were detected by hybridization to a specific oligonucleotide probe 470-20-1-152F (SEQ ID NO:16). A specific signal was reproducibly detected in multiple extracts of PNF 2161, with the 470-20-1 specific primers.
The association between HGV and liver disease is further supported by the data presented in Example 4F. Sera from hepatitis patients and from blood donors with abnormal liver function were assessed for the presence of HGV by RT-PCR screening, using HGV specific primers. HGV specific sequences were detected in 6/152 of these serum samples. No HGV positives were detected among the control samples (n = 11).
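The detection frequencies reported above (6 of 152 test sera, 0 of 11 controls) can be put in context with a confidence interval for each proportion. The following sketch uses the Wilson score interval, which is one common choice and is not a method named in the specification.

```python
import math


def wilson_interval(positives: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z = 1.96 gives an
    approximate 95% interval)."""
    p = positives / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    for label, pos, n in [("hepatitis/abnormal liver function sera", 6, 152),
                          ("control samples", 0, 11)]:
        lo, hi = wilson_interval(pos, n)
        print(f"{label}: {pos}/{n} = {pos / n:.1%} "
              f"(interval {lo:.1%} to {hi:.1%})")
```

The interval for the control group is wide because only 11 samples were tested.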
The results presented above indicate the isolation of a viral agent associated with N-(ABCDE) viral infection of liver (i.e., hepatitis) and/or infection, and resulting disease, of other tissue and cell types. Cloning of further HGV isolates (JC, BG34, T55806 and EB20) is described in Example 15.
III. FURTHER CHARACTERIZATION OF HGV RECOMBINANT ANTIGENS.
A. SCREENING RECOMBINANT LIBRARIES.
Further candidate HGV antigens can be obtained from the libraries of the present invention using the screening methods described above. The cDNA library described above has been deposited with the American Type Culture Collection, 12301 Parklawn Dr., Rockville, MD, 20852, and has been assigned the following designation: PNF 2161 cDNA source, ATCC 75268. A second PNF 2161 cDNA library has been generated essentially as described for the first PNF 2161 cDNA library, except that the second PNF 2161 cDNA source library was ligated to lambda gt11 arms but was not packaged. This non-packaged library was used to obtain the extension clones described below. A packaged version of this second library (PNF 2161 2-cDNA source library) has been deposited with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD, 20852, and has been assigned the following designation: PNF 2161 2-cDNA source, ATCC 75837.
In addition to the recombinant libraries generated above, other recombinant libraries from N-(ABCDE) hepatitis sera can likewise be generated and screened as described herein.
B. EPITOPE MAPPING, CROSS HYBRIDIZATION AND ISOLATION OF GENOMIC SEQUENCES.
Antigen encoding DNA fragments can be identified by (i) immunoscreening, as described above, or (ii) computer analysis of coding sequences (e.g., SEQ ID NO:14) using an algorithm (such as, "ANTIGEN," Intelligenetics, Mountain View, CA) to identify potential antigenic regions. An antigen-encoding DNA fragment can be subcloned. The subcloned insert can then be fragmented by partial DNase I digestion to generate random fragments or by specific restriction endonuclease digestion to produce specific subfragments. The resulting DNA fragments can be inserted into the lambda gt11 vector and subjected to immunoscreening in order to provide an epitope map of the cloned insert.
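One way to turn immunoscreening results for overlapping subfragments into an epitope map is to take the region shared by all immunopositive fragments. The sketch below assumes a single linear epitope and uses hypothetical fragment coordinates; it illustrates the reasoning and is not a procedure taken from the Examples.

```python
from typing import List, Optional, Tuple

# Each fragment is (start, end, immunoreactive), with coordinates in base
# pairs on the cloned insert and the end position exclusive.
Fragment = Tuple[int, int, bool]


def common_reactive_region(fragments: List[Fragment]) -> Optional[Tuple[int, int]]:
    """Return the interval shared by every immunoreactive fragment, or None
    when there is no reactive fragment or no shared interval (the latter
    would suggest more than one epitope)."""
    reactive = [(s, e) for s, e, is_reactive in fragments if is_reactive]
    if not reactive:
        return None
    start = max(s for s, _ in reactive)
    end = min(e for _, e in reactive)
    return (start, end) if start < end else None


if __name__ == "__main__":
    # HYPOTHETICAL screening results for subfragments of a 203 bp insert.
    results = [(0, 90, True), (30, 120, True), (60, 150, True), (100, 203, False)]
    print(common_reactive_region(results))
```

Non-reactive fragments could be used to narrow the candidate region further; that refinement is omitted here to keep the sketch short.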
In addition, the DNA fragments can be employed as probes in hybridization experiments to identify overlapping HGV sequences, and these in turn can be further used as probes to identify a set of contiguous clones. The generation of sets of contiguous clones allows the elucidation of the sequence of the HGV genome.
Any of the above-described clone sequences (e.g., derived from SEQ ID NO:14 or clone 470-20-1) can be used to probe the cDNA and DNA libraries, generated in a vector such as lambda gt10 or "LAMBDA ZAP II" (Stratagene, San Diego, CA). Specific subfragments of known sequence may be isolated by polymerase chain reaction or after restriction endonuclease cleavage of vectors carrying such sequences. The resulting DNA fragments can be used as radiolabelled probes against any selected library. In particular, the 5' and 3' terminal sequences of the clone inserts are useful as probes to identify additional clones.
Further, the sequences provided by the 5' end of cloned inserts are useful as sequence specific primers in first-strand cDNA or DNA synthesis reactions (Maniatis et al.; Scharf et al.). For example, specifically primed PNF 2161 cDNA and DNA libraries can be prepared by using specific primers derived from SEQ ID NO:14 on PNF 2161 nucleic acids as a template. The second strand of the new cDNA is synthesized using RNase H and DNA polymerase I. The above procedures identify or produce DNA/cDNA molecules corresponding to nucleic acid regions that are 5' adjacent to the known clone insert sequences. These newly isolated sequences can in turn be used to identify further flanking sequences, and so on, to identify the sequences composing the entire genome for HGV. As described above, after new HGV sequences are isolated, the polynucleotides can be cloned and immunoscreened to identify specific sequences encoding HGV antigens.
Extension clone sequences (SEQ ID NO:14), containing further sequences of interest, were obtained for clone PNF 470-20-1 (SEQ ID NO:3) using the "Anchor PCR" method described in Example 6. Briefly, the strategy consists of ligating PNF 2161 SISPA cDNA to lambda gt11 arms and amplifying the ligation reaction with a gt11-specific primer and one of two 470-20-1 specific primers.
The amplification products are electrophoretically separated, transferred to filters and the DNA bound to the filters is probed with a 470-20-1 specific probe. Bands corresponding to hybridization positive band signals were gel purified, cloned and sequenced.
C. PREPARATION OF ANTIGENIC POLYPEPTIDES AND ANTIBODIES.
The recombinant peptides of the present invention can be purified by standard protein purification procedures which may include differential precipitation, molecular sieve chromatography, ion-exchange chromatography, isoelectric focusing, gel electrophoresis and affinity chromatography.
In one embodiment of the present invention, the polynucleotide sequences of the antigens of the present invention have been cloned in the plasmid pGEX (Example 7A) or various derivatives thereof (pGEX-GLI). The plasmid pGEX (Smith, et al., 1988) and its derivatives express the polypeptide sequences of a cloned insert fused in-frame to the protein glutathione-S-transferase (sj26). In one vector construction, plasmid pGEX-hisB, an amino acid sequence of 6 histidines is introduced at the carboxy terminus of the fusion protein.
The various recombinant pGEX plasmids can be transformed into appropriate strains of E. coli and fusion protein production can be induced by the addition of IPTG (isopropyl-thio galactopyranoside) as described in Example 7A. Solubilized recombinant fusion protein can then be purified from cell lysates of the induced cultures using glutathione agarose affinity chromatography (Example 7A). Insoluble fusion protein expressed by the plasmid pGEX-hisB can be purified by means of immobilized metal ion affinity chromatography (Porath) in buffers containing 6 M urea or 6 M guanidinium isothiocyanate, both of which are useful for the solubilization of proteins. Alternatively, insoluble proteins expressed in pGEX-GLI or derivatives thereof can be purified using combinations of centrifugation to remove soluble proteins followed by solubilization of insoluble proteins and standard chromatographic methodologies, such as ion exchange or size exclusion chromatography; other such methods are known in the art.
In the case of β-galactosidase fusion proteins (such as those produced by lambda gt11 clones) the fused protein can be isolated readily by affinity chromatography, by passing cell lysis material over a solid support having surface-bound anti-β-galactosidase antibody. For example, purification of a β-galactosidase/fusion protein, derived from 470-20-1 coding sequences, by affinity chromatography is described in Example 7B.
Also included in the invention is an expression vector, such as the lambda gt11 or pGEX vectors described above, containing HGV coding sequences and expression control elements which allow expression of the coding regions in a suitable host. The control elements generally include a promoter, translation initiation codon, and translation and transcription termination sequences, and an insertion site for introducing the insert into the vector.
The DNA encoding the desired antigenic polypeptide can be cloned into any number of commercially available vectors to generate expression of the polypeptide in the appropriate host system. These systems include, but are not limited to, the following: baculovirus expression (Reilly, et al.; Beames, et al.; Pharmingen; Clontech, Palo Alto, CA), vaccinia expression (Moss, et al.), expression in bacteria (Ausubel, et al.; Clontech), expression in yeast (Goeddel; Guthrie and Fink), expression in mammalian cells (Clontech; Gibco-BRL, Grand Island, NY). These recombinant polypeptide antigens can be expressed directly or as fusion proteins. A number of features can be engineered into the expression vectors, such as leader sequences which promote the secretion of the expressed sequences into culture medium. The recombinantly produced HGV polypeptide antigens are typically isolated from lysed cells or culture media. Purification can be carried out by methods known in the art including salt fractionation, ion exchange chromatography, and affinity chromatography. Immunoaffinity chromatography can be employed using antibodies generated based on the HGV antigens identified by the methods of the present invention.
HGV polypeptide antigens may also be isolated from HGV particles (see below). Continuous antigenic determinants of polypeptides are generally relatively small, typically 6 to 10 amino acids in length. Smaller fragments have been identified as antigenic regions, for example, in conformational epitopes. HGV polypeptide antigens are identified as described above. The resulting DNA coding regions of either strand can be expressed recombinantly either as fusion proteins or isolated polypeptides. In addition, amino acid sequences can be conveniently chemically synthesized using a commercially available synthesizer (Applied Biosystems, Foster City, CA) or "PIN" technology (Applied Biosystems).
In another embodiment, the present invention includes mosaic proteins that are composed of multiple epitopes. An HGV mosaic polypeptide typically contains at least two epitopes of HGV, where the polypeptide substantially lacks amino acids normally intervening between the epitopes in the native HGV coding sequence. Synthetic genes (Crea; Yoshio et al.; Eaton et al.) encoding multiple, tandem epitopes can be constructed that will produce mosaic proteins using standard recombinant DNA technology and the polypeptide expression vector/host systems described above. Further, multiple antigen peptides can be synthesized chemically by methods described previously (Tam, J.P., 1988; Briand et al.). For example, a small immunologically inert core matrix of lysine residues with α- and ε-amino groups can be used to anchor multiple copies of the same or different synthetic peptides (typically 6-15 residues long) representing epitopes of interest. Mosaic proteins or multiple antigen peptide antigens give higher sensitivity and specificity in immunoassays due to the signal amplification resulting from distribution of multiple epitopes.
Antigens obtained by any of these methods can be used for antibody generation, diagnostic tests and vaccine development.
In another aspect, the invention includes specific antibodies directed against the polypeptide antigens of the present invention. Antigens obtained by any of these methods may be directly used for the generation of antibodies or they may be coupled to appropriate carrier molecules. Many such carriers are known in the art and are commercially available (e.g., Pierce, Rockford IL).
Typically, to prepare antibodies, a host animal, such as a rabbit, is immunized with the purified antigen or fused protein antigen. Hybrid, or fused, proteins may be generated using a variety of coding sequences derived from other proteins, such as glutathione-S-transferase or β-galactosidase. The host serum or plasma is collected following an appropriate time interval, and this serum is tested for antibodies specific against the antigen. Example 8 describes the production of rabbit serum antibodies which are specific against the 470-20-1 antigen in the SJ26/470-20-1 hybrid protein. These techniques are equally applicable to all immunogenic sequences derived from HGV, including, but not limited to, those derived from the coding sequence presented as SEQ ID NO:14.
The gamma globulin fraction or the IgG antibodies of immunized animals can be obtained, for example, by use of saturated ammonium sulfate precipitation or DEAE Sephadex chromatography, affinity chromatography, or other techniques known to those skilled in the art for producing polyclonal antibodies.
Alternatively, purified antigen or fused antigen protein may be used for producing monoclonal antibodies. Here the spleen or lymphocytes from an immunized animal are removed and immortalized or used to prepare hybridomas by methods known to those skilled in the art. To produce a human-derived hybridoma, a human lymphocyte donor is selected. A donor known to be infected with an HGV may serve as a suitable lymphocyte donor. Lymphocytes can be isolated from a peripheral blood sample. Epstein-Barr virus (EBV) can be used to immortalize human lymphocytes or a suitable fusion partner can be used to produce human-derived hybridomas. Primary in vitro sensitization with viral specific polypeptides can also be used in the generation of human monoclonal antibodies.
Antibodies secreted by the immortalized cells are screened to determine the clones that secrete antibodies of the desired specificity, for example, by using the ELISA or Western blot method (Example 9; Ausubel et al.). Using the antibodies of the present invention other antigenic peptides and epitopes can be isolated.
D. ELISA AND PROTEIN BLOT SCREENING.
When HGV antigens are identified, typically through plaque immunoscreening as described above, the antigens can be expressed and purified. The antigens can then be screened rapidly against a large number of suspected HGV hepatitis sera using alternative immunoassays, such as ELISAs or Protein Blot Assays (Western blots) employing the isolated antigen peptide. The antigen fusion polypeptides can be isolated as described above, usually by affinity chromatography to the fusion partner such as β-galactosidase or glutathione-S-transferase. Alternatively, the antigen itself can be purified using antibodies generated against it (see below).
A general ELISA assay format is presented in Example 9. Harlow, et al., describe a number of useful techniques for immunoassays and antibody/antigen screening.
The purified antigen polypeptide, or fusion polypeptide containing the antigen of interest, is attached to a solid support, for example, a multiwell polystyrene plate. Sera to be tested are diluted and added to the wells. After a period of time sufficient for the binding of antibodies to the bound antigens, the sera are washed out of the wells. A labelled reporter antibody is added to each well along with an appropriate substrate: wells containing antibodies bound to the purified antigen polypeptide or fusion polypeptide containing the antigen are detected by a positive signal.
A typical format for protein blot analysis using the polypeptide antigens of the present invention is presented in Example 9. General protein blotting methods are described by Ausubel, et al. In Example 9, the 470-20-1/sj26 fusion protein was used to screen a number of sera samples. The results presented in Example 9 demonstrate that several different source N-(ABCDE) hepatitis sera are immunoreactive with the polypeptide antigen.
The results presented above demonstrate that the polypeptide antigens of the present invention can, by these methods, be rapidly screened against panels of suspected HGV infected serum samples for the detection of HGV.
E. CELL CULTURE SYSTEMS, ANIMAL MODELS AND ISOLATION OF HGV.
HGV may be propagated in animal model systems. Infectivity studies have been carried out in chimpanzees, cynomolgus monkey and four mystax subjects (Example 4G). These studies have yielded further information about HGV infectivity in these animal models. The HGV described in the present specification has the advantage of being capable of infecting tamarins, cynomolgus monkeys and chimpanzees.
Alternatively, primary hepatocytes obtained from infected animals (chimpanzees, baboons, monkeys, or humans) can be cultured in vitro. A serum-free medium, supplemented with growth factors and hormones, has been described which permits the long-term maintenance of differentiated primate hepatocytes (Lanford, et al.; Jacob, et al., 1989, 1990, 1991). In addition to primary hepatocyte cultures, immortalized cultures of infected cells may also be generated. For example, primary liver cultures may be fused to a variety of cells (like HepG2) to provide stable immortalized cell lines. Primary hepatocyte cell cultures may also be immortalized by introduction of oncogenes or genes causing a transformed phenotype. Such oncogenes or genes can be derived from a number of sources known in the art including SV40, human cellular oncogenes and Epstein Barr Virus. Further, un-infected hepatocytes (e.g., primary or continuous hepatoma cell lines) may be infected by exposing the cells in culture to the HGV either as partially purified particle preparations (prepared, for example, from infected sera by differential centrifugation and/or molecular sieving) or in infectious sera. These infected cells can then be propagated and the virus passaged by methods known in the art. Further, other cell types, such as lymphoid cell lines, may be useful for the propagation of HGV.
Protein similarity studies of HGV have detected amino acid regions similar to other viruses in the family Flaviviridae. It is known that members of this family of viruses can be propagated in a variety of tissue culture systems (ATCC-Viruses catalogue, 1990). By analogy it is likely that HGV can be propagated in one or more of the following tissue culture systems: HeLa cells, primary hamster kidney cells, monkey kidney cells, Vero cells, LLC-MK2 (rhesus monkey kidney cells), KB cells (human oral epidermoid carcinoma cells), duck embryo cells, primary sheep leptomeningeal cells, primary sheep choroid plexus cells, pig kidney cells, bovine embryonic kidney cells, bovine turbinate cells, chick embryo cells, primary rabbit kidney cells, BHK-21 cells, or PK-13 cells.
In addition to expression of HGV, regions of HGV polynucleotide sequences, cDNA or in vitro transcribed RNA can be introduced by recombinant means into tissue culture cells. Such recombinant manipulations allow the individual expression of individual components of the HGV.
RNA samples can be prepared from infected tissue or, in particular, from infected cell cultures. The RNA samples can be fractionated on gels and transferred to membranes for hybridization analysis using probes derived from the cloned HGV sequences.
HGV particles may be isolated from infected sera, infected tissue, the above-described cell culture media, or the cultured infected cells by methods known in the art. Such methods include techniques based on size fractionation (i.e., ultrafiltration, precipitation, sedimentation), using anionic and/or cationic exchange materials, separation on the basis of density, hydrophilic properties, and affinity chromatography. During the isolation procedure the HGV can be identified (i) using the anti-HGV hepatitis associated agent antibodies of the present invention, (ii) by using hybridization probes based on identified HGV nucleic acid sequences (e.g., Example 5) or (iii) by RT-PCR.
Antibodies directed against HGV can be used in purification of HGV particles through immunoaffinity chromatography (Harlow, et al.; Pierce). Antibodies directed against HGV polypeptides or fusion polypeptides (such as 470-20-1) are fixed to solid supports in such a manner that the antibodies maintain their immunoselectivity. To accomplish such attachment of antibodies to solid support, bifunctional coupling agents (Pierce; Pharmacia, Piscataway, NJ) containing spacer groups are frequently used to retain accessibility of the antigen binding site of the antibody.
HGV particles can be further characterized by standard procedures including, but not limited to, immunofluorescence microscopy, electron microscopy, Western blot analysis of proteins composing the particles, infection studies in animal and/or cell systems utilizing the partially purified particles, and sedimentation characteristics. The results presented in Example 5 suggest that the viral particle of the present invention is more similar to an enveloped viral particle than to a non-enveloped viral particle.
HGV particles can be disrupted to obtain HGV genomes. Disruption of the particles can be achieved by, for example, treatment with detergents in the presence of chelating agents. The genomic nucleic acid can then be further characterized. Characterization may include analysis of DNase and RNase sensitivity. The strandedness (Example 4F) and conformation (e.g., circular) of the genome can be determined by techniques known in the art, including visualization by electron microscopy and sedimentation characteristics.
The isolated genomes also make it possible to sequence the entire genome whether it is segmented or not, and whether it is an RNA or DNA genome (using, for example, RT-PCR, chromosome walking techniques, or PCR which utilizes primers from adjacent cloned sequences). Determination of the entire sequence of HGV allows genomic organization studies and the comparison of the HGV sequences to the coding and regulatory sequences of known viral agents.
F. SCREENING FOR AGENTS HAVING ANTI-HGV HEPATITIS ACTIVITY.
The use of cell culture and animal model systems for propagation of HGV provides the ability to screen for anti-hepatitis agents which inhibit the production of infectious HGV: in particular, drugs that inhibit the replication of HGV. Cell culture and animal models allow the evaluation of the effect of such anti-hepatitis drugs on normal cellular functions and viability. Potential anti-viral agents (including, for example, small molecules, complex mixtures such as fungal extracts, and antisense oligonucleotides) are typically screened for antiviral activity over a range of concentrations. The effect on HGV replication and/or antigen production is then evaluated, typically by monitoring viral macromolecular synthesis or accumulation of macromolecules (e.g., DNA, RNA or protein). This evaluation is often made relative to the effect of the anti-viral agent on normal cellular function (DNA replication, RNA transcription, general protein translation, etc.).
The detection of the HGV can be accomplished by many methods including those described in the present specification. For example, antibodies can be generated against the antigens of the present invention and these antibodies used in antibody-based assays (Harlow, et al.) to identify and quantitate HGV antigens in cell culture. HGV antigens can be quantitated in culture using competition assays: polypeptides encoded by the cloned HGV sequences can be used in such assays. Typically, a recombinantly produced HGV antigenic polypeptide is produced and used to generate a monoclonal or polyclonal antibody. The recombinant HGV polypeptide is labelled using a reporter molecule. The inhibition of binding of this labelled polypeptide to its cognate antibody is then evaluated in the presence of samples (e.g., cell culture media or sera) that contain HGV antigens. The level of HGV antigens in the sample is determined by comparison of levels of inhibition to a standard curve generated using unlabelled recombinant proteins at known concentrations.
The HGV sequences of the present invention are particularly useful for the generation of polynucleotide probes/primers that may be used to quantitate the amount of HGV nucleic acid sequences produced in a cell culture system. Such quantification can be accomplished in a number of ways. For example, probes labelled with reporter molecules can be used in standard dot-blot hybridizations or competition assays of labelled probes with infected cell nucleic acids. Further, there are a number of methods using the polymerase chain reaction to quantitate target nucleic acid levels in a sample (Osikowicz, et al.).
Protective antibodies can also be identified using the cell culture and animal model systems described above. For example, polyclonal or monoclonal antibodies are generated against the antigens of the present invention. These antibodies are then used to pre-treat an infectious HGV-containing inoculum (e.g., serum) before infection of cell cultures or animals. The ability of a single antibody or mixtures of antibodies to protect the cell culture or animal from infection is evaluated. For example, in cell culture and animals the absence of viral antigen and/or nucleic acid production serves as a screen. Further, in animals, the absence of HGV hepatitis disease symptoms, e.g., elevated ALT values, is also indicative of the presence of protective antibodies.
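Reading an antigen level off a competition-assay standard curve amounts to interpolating between standards of known concentration. The sketch below interpolates linearly on a log10 concentration scale; the interpolation scheme, the units and the standard values are assumptions for illustration and are not specified in the text.

```python
import math
from typing import List, Tuple


def concentration_from_inhibition(standards: List[Tuple[float, float]],
                                  inhibition: float) -> float:
    """Estimate antigen concentration from percent inhibition.

    standards: (concentration, percent_inhibition) pairs obtained with
    unlabelled recombinant protein; inhibition is assumed to rise strictly
    with concentration.
    """
    pts = sorted(standards, key=lambda p: p[1])
    for (c0, i0), (c1, i1) in zip(pts, pts[1:]):
        if i0 <= inhibition <= i1 and i1 > i0:
            frac = (inhibition - i0) / (i1 - i0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("inhibition lies outside the standard curve")


if __name__ == "__main__":
    # HYPOTHETICAL standard curve: concentration in ng/ml, percent inhibition.
    curve = [(1, 10.0), (10, 30.0), (100, 60.0), (1000, 85.0)]
    print(f"{concentration_from_inhibition(curve, 45.0):.1f} ng/ml")
```

Samples falling outside the range of the standards raise an error rather than being extrapolated, which mirrors the usual practice of re-assaying such samples at a different dilution.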
Alternatively, convalescent sera can be screened for the presence of protective antibodies and then these sera used to identify HGV hepatitis associated agent antigens that bind with the antibodies. The identified HGV antigen is then recombinantly or synthetically produced. The ability of the antigen to generate protective antibodies is tested as above. After initial screening, the antigen or antigens identified as capable of generating protective antibodies, either singly or in combination, can be used as a vaccine to inoculate test animals. The animals are then challenged with infectious HGV. Protection from infection indicates the ability of the animals to generate antibodies that protect them from infection (humoral immunity). Further, use of the animal models allows identification of antigens that activate cellular immunity.
G. VACCINES AND THE GENERATION OF PROTECTIVE IMMUNITY.
Vaccines can be prepared from one or more of the immunogenic polypeptides identified by the method of the present invention. Genomic organization similarities between the isolated sequences from HGV and other known viral proteins may provide information concerning the polypeptides that are likely to be candidates for effective vaccines. In addition, a number of computer programs can be used to identify likely regions of isolated sequences that encode protein antigenic determinant regions (for example, Hopp, et al.; "ANTIGEN," Intelligenetics, Mountain View CA).
Vaccines containing immunogenic polypeptides as active ingredients are typically prepared as injectables either as solutions or suspensions. Further, the immunogenic polypeptides may be prepared in a solid or lyophilized state that is suitable for resuspension, prior to injection, in an aqueous form. The immunogenic polypeptides may also be emulsified or encapsulated in liposomes. The polypeptides are frequently mixed with pharmaceutically acceptable excipients that are compatible with the polypeptides. Such excipients include, but are not limited to, the following and combinations of the following: saline, water, sugars (such as dextrose and sorbitol), glycerol, alcohols (such as ethanol [EtOH]), and others known in the art. Further, vaccine preparations may contain minor amounts of other auxiliary substances such as wetting agents, emulsifying agents (e.g., detergents), and pH buffering agents. In addition, a number of adjuvants are available which may enhance the effectiveness of vaccine preparations. Examples of such adjuvants include, but are not limited to, the following: the group of related compounds including N-acetyl-muramyl-L-threonyl-D-isoglutamine and N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine, and aluminum hydroxide.
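Computer prediction of antigenic determinants of the kind attributed to Hopp, et al. is commonly implemented as a sliding-window average of per-residue hydrophilicity values. The sketch below follows that general idea; the scale values are the commonly tabulated Hopp-Woods figures reproduced from memory and should be checked against the original publication, and the function names and test peptide are hypothetical.

```python
# Per-residue hydrophilicity values (commonly tabulated Hopp-Woods scale;
# ASSUMPTION: verify against the published table before relying on them).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(seq: str, window: int = 6) -> list:
    """Mean hydrophilicity of every window-length stretch of the sequence."""
    seq = seq.upper()
    return [sum(HOPP_WOODS[a] for a in seq[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def top_window(seq: str, window: int = 6) -> tuple:
    """Return (start index, peptide, score) of the most hydrophilic window."""
    profile = hydrophilicity_profile(seq, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, seq.upper()[best:best + window], profile[best]


if __name__ == "__main__":
    # HYPOTHETICAL peptide: a charged stretch flanked by leucines.
    print(top_window("LLLLLLKDEKRDLLLLLL"))
```

A high-scoring window marks a candidate region only; immunoreactivity still has to be confirmed experimentally, for example by the immunoscreening described earlier.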
The immunogenic polypeptides used in the vaccines of the present invention may be recombinant, synthetic or isolated from, for example, attenuated HGV particles. The polypeptides are commonly formulated into vaccines in neutral or salt forms. Pharmaceutically acceptable or¬ ganic and inorganic salts are well known in the art. HGV hepatitis associated agent vaccines are paren- terally administered, typically by subcutaneous or intra¬ muscular injection. Other possible formulations include oral and suppository formulations. Oral formulations commonly employ excipients (e.g., pharmaceutical grade sugars, saccharine, cellulose, and the like) and usually contain within 10-98% immunogenic polypeptide. Oral compositions take the form of pills, capsules, tablets, solutions, suspensions, powders, etc., and may be formu¬ lated to allow sustained or long-term release. Supposi- tory formulations use traditional binders and carriers and typically contain between 0.1% and 10% of the immunogenic polypeptide. In view of the above information, multivalent vac¬ cines against HGV hepatitis associated agents can be generated which are composed of one or more structural or non-structural viral-agent polypeptides. These vaccines can contain, for example, recombinant expressed HGV polypeptides, polypeptides isolated from HGV virions, synthetic polypeptides or assembled epitopes in the form of mosaic polypeptides. In addition, it may be possible to prepare vaccines, which confer protection against HGV hepatitis infection through the use of inactivated HGV. Such inactivation might be achieved by preparation of viral lysates followed by treatment of the lysates with appropriate organic solvents, detergents or formalin. Vaccines may also be prepared from attenuated HGV strains. Such attenuated HGV may be obtained utilizing the above described cell culture and/or animal model systems. Typically, attenuated strains are isolated after multiple passages in vitro or in vivo . 
Detection of attenuated strains is accomplished by methods known in the art. Attenuated HGV can be detected, in infected animals or in HGV-infected in vitro cultures, using antibody probes against HGV antigens, sequence-specific hybridization probes, or amplification with sequence-specific primers.
Alternatively, or in addition to the above methods, attenuated HGV strains may be constructed based on the genomic information that can be obtained from the information presented in the present specification. Typically, a region of the infectious agent genome that encodes, for example, a polypeptide that is related to viral pathogenesis can be deleted. The deletion should not interfere with viral replication. Further, the recombinant attenuated HGV construct allows the expression of an epitope or epitopes that are capable of giving rise to protective immune responses against the HGV. The desired immune response may include both humoral and cellular immunity. The genome of the attenuated HGV is then used to transform cells and the cells are grown under conditions that allow viral replication. Such attenuated strains are useful not only as vaccines, but also as production sources of viral antigens and/or HGV particles.
Hybrid particle immunogens that contain HGV epitopes can also be generated. The immunogenicity of HGV epitopes may be enhanced by expressing the epitope in eucaryotic systems (e.g., mammalian or yeast systems) where the epitope is fused or assembled with known particle forming proteins. One such protein is the hepatitis B surface antigen. Recombinant constructs where the HGV epitope is directly linked to coding sequence for the particle forming protein will produce hybrid proteins that are immunogenic with respect to the HGV epitope and the particle forming protein.
Alternatively, selected portions of the particle-forming protein coding sequence, which are not involved in particle formation, may be replaced with coding sequences corresponding to HGV epitopes. For example, regions of specific immunoreactivity to the particle-forming protein can be replaced by HGV epitope sequences.
The hepatitis B surface antigen has been shown to be expressed and assembled into particles in the yeast Saccharomyces cerevisiae and in mammalian cells (Valenzuela, et al., 1982 and 1984; Michelle, et al.). These particles have been shown to have enhanced immunoreactivity. Formation of these particles using hybrid proteins, i.e., recombinant constructs with heterologous viral sequences, has been previously disclosed (EPO 175,261, published 26 March 1986). Such hybrid particles containing HGV epitopes may also be useful in vaccine applications.
The vaccines of the present invention are administered in dosages compatible with the method of formulation, and in such amounts that will be pharmacologically effective for prophylactic or therapeutic treatments. The quantity of immunogen administered depends on the subject being treated, the capacity of the treatment subject's immune system for generation of a protective immune response, and the desired level of protection.
HGV vaccines of the present invention can be administered in single or multiple doses. Dosage regimens are also determined relative to the treatment subject's needs and tolerances. In addition to the HGV immunogenic polypeptides, vaccine formulations may be administered in conjunction with other immunoregulatory agents.
In an additional approach to HGV vaccination, DNA constructs encoding HGV proteins under appropriate regulatory control are introduced directly into mammalian tissue, in vivo. Introduction of such constructs produces "genetic immunization". Similar DNA constructs have been shown to be taken up by cells and the encoded proteins expressed (Wolf, et al.; Ascadi, et al.). Injected DNA does not appear to integrate into host cell chromatin or replicate. This expression gives rise to substantial humoral and cellular immune responses, including protection from in vivo viral challenge in animal systems (Wang, et al., 1993; Ulmer, et al.). In one embodiment, the DNA construct is injected into skeletal muscle following pre-treatment with local anesthetics, such as bupivacaine hydrochloride with methylparaben in isotonic saline, to facilitate cellular DNA uptake. The injected DNA constructs are taken up by muscle cells and the encoded proteins expressed.
Compared to vaccination with soluble viral subunit proteins, genetic immunization has the advantage of authentic in vivo expression of the viral proteins. These viral proteins are expressed in association with host cell histocompatibility antigens, and other proteins, as would occur with natural viral infection. This type of immunization is capable of inducing both humoral and cellular immune responses, in contrast to many soluble subunit protein vaccines. Accordingly, this type of immunization retains many of the beneficial features of live attenuated vaccines, without the use of infectious agents for vaccination and attendant safety concerns.
Direct injection of plasmid or other DNA constructs encoding the desired vaccine antigens into in vivo tissues is one delivery means. Other means of delivery of the DNA constructs can be employed as well. These include a variety of lipid-based approaches in which the DNA is packaged using liposomes, cationic lipid reagents or cytofectins (such as lipofectin). These approaches facilitate in vivo uptake and expression, as summarized by Felgner and Rhodes (1991). Various modifications to these basic approaches include the incorporation of peptides, or other moieties, to facilitate (i) targeting to particular cells, (ii) the intracellular disposition of the DNA construct following uptake, or (iii) expression. Alternatively, the sequences encoding the desired vaccine antigens may be inserted into a suitable retroviral vector. The resulting recombinant retroviral vector is then inoculated into the subject for in vivo expression of the vaccine antigen. The antigen then induces the immune responses. As noted above, this approach has been shown to induce both humoral and cellular immunity to viral antigens (Irwin, et al.).
Further, the HGV vaccines of the present invention may be administered in combination with other vaccine agents, for example, with other hepatitis vaccines.
H. SYNTHETIC PEPTIDES.
When the coding sequences of HGV polypeptide antigens are determined, synthetic peptides can be generated which correspond to these polypeptides. Synthetic peptides can be commercially synthesized or prepared using standard methods and apparatus in the art (Applied Biosystems, Foster City CA). Alternatively, oligonucleotide sequences encoding peptides can be either synthesized directly by standard methods of oligonucleotide synthesis, or, in the case of large coding sequences, synthesized by a series of cloning steps involving a tandem array of multiple oligonucleotide fragments corresponding to the coding sequence (Crea; Yoshio et al.; Eaton et al.). Oligonucleotide coding sequences can be expressed by standard recombinant procedures (Maniatis et al.; Ausubel et al.).
IV. CHARACTERIZATION OF THE VIRAL GENOME.
As shown in Example 4, the HGV genome appears to be an RNA molecule and has the closest sequence similarity to viral sequences that are categorized in the Flaviviridae family of viruses. This family includes the Flaviviruses, Pestiviruses and an unclassified genus made up of one member, Hepatitis C virus. HGV does not have significant global (i.e., over the length of the virus) sequence identity with other established members of the Flaviviridae, with the exception of the protein motifs discussed below.
In general, members of the Flaviviridae are enveloped viruses that have densities in sucrose gradients between 1.1 and 1.23 g/ml and are sensitive to heat, organic solvents and detergents. As shown in Example 5, HGV has density characteristics similar to an enveloped Flaviviridae virus (HCV). The integrity of the HGV virion also appears to be sensitive to organic solvents (Example 5).
Flaviviridae virions contain a single molecule of linear single-stranded (ss) RNA which also serves as the only mRNA that codes for the viral proteins. The ssRNA molecule is typically between 9 and 12 kilobases long.
Viral proteins are derived from one polyprotein precursor that is subsequently processed to the mature viral proteins. Most members of the Flaviviridae do not contain poly(A) tails at their 3' ends. Virions are about 15-20% lipid by weight.
Members of the Flaviviridae family have a core protein and two or three membrane-associated proteins. The analogous structural proteins of members in the three genera of the Flaviviridae family show little similarity to one another at the sequence level. The nonstructural proteins contain conserved motifs for RNA dependent RNA polymerase (RDRP), helicase, and a serine protease. These short blocks of conserved amino acids, or motifs, can be detected using computer algorithms known in the art such as "MACAW" (Schuler, et al.). These motifs are presumably related to constraints imposed by substrates processed by these proteins (Koonin and Dolja). The order of these motifs is conserved in all members of the Flaviviridae family. The genome of HGV contains at least the protein motifs found in the RNA dependent RNA polymerase (RDRP) of members of the Flaviviridae family (see Figure 5, "GDD" sequence).
Members of the Flaviviridae family are known to replicate in a wide variety of animals ranging from (i) hematophagous arthropod vectors (ticks and mosquitoes), where they do not cause disease, to (ii) a large range of vertebrate hosts (humans, primates, other mammals, marsupials, and birds). Over 30 members of the Flaviviridae family cause diseases in man, ranging from febrile illness, or rash, to potentially fatal diseases such as hemorrhagic fever, encephalitis, or hepatitis. At least 10 members of the Flaviviridae family cause severe and economically important diseases in domestic animals.
V. DETECTION OF ANTIGENS CODED BY SHORT REVERSE READING FRAMES COINCIDENT WITH KNOWN READING FRAMES
The present invention provides antigens useful for the determination of whether a test subject (e.g., human patient or animal) has been infected with a virus having an RNA genome, and a method for identifying such antigens. RNA viruses include, but are not limited to, the following families: Picornaviridae, Caliciviridae, Reoviridae, Birnaviridae, Togaviridae, Flaviviridae, Orthomyxoviridae, Paramyxoviridae, Rhabdoviridae, Filoviridae, Coronaviridae, Bunyaviridae, Retroviridae, and Arenaviridae. These families include single- and double-stranded RNA genomes, segmented and non-segmented genomes. In a preferred embodiment, the method of the present invention is applied to RNA viruses having single-strand genomes.
The method of the present invention teaches the expression and subsequent induction of antibodies to a protein or proteins coded by "reverse reading frames" of RNA viruses. "Reverse reading frames" are defined as open reading frames that are transcribed and translated in the opposite direction to the major known reading frames for the virus, i.e., identifiable viral proteins.
Identification of reverse-reading frame encoded antigens can be accomplished as follows. Coding regions of viral polynucleotides are examined to determine the coding regions corresponding to coding sequences for identifiable viral proteins. Such identifiable viral proteins include, for example, typical viral structural (e.g., capsid) and non-structural (e.g., RNA dependent RNA polymerase, reverse transcriptase, and proteases) proteins. A further example of such identifiable viral proteins includes the polyprotein of members of Flaviviridae.
The complement (i.e., the reverse frame) of the polynucleotide strand encoding identifiable viral protein(s) is evaluated for open reading frames using the following method. First, conserved open frames are identified among the complement strands of variants of a selected virus. Typically, variants are chosen that show low global sequence identity relative to each other. A program such as DM.EXE (MS-DOS program from David Mount and Bruce Conrad, University of Arizona, Tucson, AZ) or alternatively the PC/GENE suite of programs (Intelligenetics, Mountain View, CA) facilitates the identification of open reading frames in the reverse frame.
Reverse open reading frames that are conserved between, for example, two variants are then examined in other isolates. Reverse open reading frames that are conserved in a number of variants of a virus (e.g., among many HCV variants) are candidates for reverse frame antigens. As longer reverse open reading frames are more difficult to conserve, the longest frames should be examined first.
In general, the starting codons of the frames are conserved but minor variations of the terminations and length can be accepted. Frames can be as short as about 12 amino acids, but preferably the reading frame is at least about 30 amino acids in length, and even more preferably at least about 30 to 100 amino acids in length. Although it is preferred to compare variants for conserved reverse open reading frames, it is also within the scope of the invention to select any reverse open reading frame and screen the encoded protein, as described below, for antigenic activity.
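The scan for reverse open reading frames described above can be sketched as follows. This is only an illustrative sketch, not the DM.EXE or PC/GENE procedure: the function names, the requirement that a frame begin at an ATG and end at a stop codon, and the default minimum of 12 codons (taken from the "about 12 amino acids" figure above) are assumptions made for the example. Candidates are returned longest first, following the suggestion that the longest frames be examined first.

```python
# Hypothetical helper names; a sketch under the assumptions stated above.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")  # RNA "U" is read as "T"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3' in the DNA alphabet."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def reverse_orfs(seq, min_codons=12):
    """List ATG-to-stop frames found on the complement of `seq`.

    Each entry is (frame, start, end, n_codons). Coordinates are 0-based
    positions on the reverse-complement strand, `end` is exclusive and
    includes the stop codon, and n_codons counts the ATG through the last
    sense codon. Results are sorted longest first.
    """
    rc = reverse_complement(seq)
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rc) - 2, 3):
            codon = rc[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a frame
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    found.append((frame, start, i + 3, n_codons))
                start = None                   # look for the next frame
    return sorted(found, key=lambda orf: orf[3], reverse=True)
```

Running the same scan on two or more variants and keeping only frames whose start codons coincide would give the "conserved" candidates; that comparison step is left out here because it depends on how the variants are aligned.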
After identification of reverse-frame coding sequences, the polypeptide encoded by the sequence is produced, for example, recombinantly or synthetically (e.g., solid phase chemical synthesis). In one embodiment, recombinant proteins coded by the reverse open reading frames are expressed in E. coli expression systems. The antigens are screened against sera known to be specifically immunoreactive with viral antigens from the virus whose genome is being evaluated. For example, the antigens are used to detect antibodies in humans or animals infected with RNA viruses. Specific examples are given below for HGV and HCV.
The diagnostic utility of reverse-frame antigens identified by this method is evaluated by immunological screening of panels of sera from subjects known or suspected to be infected with the viral agent from which the reverse frame antigens were derived. Exemplary embodiments of antigen selection using this method, and use of such antigens in diagnostic assays, are described below.
A. Detection of Viral Antibodies
The method of the present invention includes detection of viral antibodies based on the detection of an antigen coded by the reverse reading frame from the expected major coding open frame. In one embodiment of the present invention, a reverse reading frame antigen was identified for the RNA virus HGV: the antigen encoded by the 470-20-1 clone was detected with antibodies from several N-(ABCDE) hepatitis sera, including PNF 2161. The sequence of the 470-20-1 clone was extended by Anchored PCR cloning (Example 6).
Analysis of the regions surrounding the original clone 470-20-1 open reading frame revealed an extended open reading frame of approximately 161 amino acids (SEQ ID NO:28). Analysis of the opposite strand to the protein coding strand of 470-20-1 revealed that it consisted of a completely open reading frame for a polyprotein sequence (Figure 1). Similarity analysis of the polyprotein detected sequence similarity to members of the Flaviviridae family (see Section IV). All members of Flaviviridae code for their known viral proteins using a long open reading frame to produce a polyprotein that is subsequently processed to the individual viral proteins. The sequence similarity of HGV to Flaviviridae is seen in the long, open, reverse-reading frame relative to the coding sequences for the 470-20-1 antigen, implying that the 470-20-1 antigen is actually coded in the opposite direction from the expected major coding region. Yet, the 470-20-1 antigen has been useful to detect HGV infection in sera (Example 9).
Further reverse-frame HGV antigens have been identified as follows. Three distinct immunogenic regions were isolated from three different HGV-epitope libraries. All three epitopic regions are encoded by the negative strand (i.e., the opposite strand relative to the strand encoding the polyprotein) of the HGV virus. The antigenic regions encoded by the negative strand are all contained within relatively short and separate open reading frames (ORFs). The three libraries constructed for screening are described below.
The first immunogenic region is defined by a single clone K1-2-3a (SEQ ID NO:111; SEQ ID NO:112). K1-2-3 was isolated from a library designated NS3 which was generated by polymerase chain reaction amplification from PNF 2161 serum nucleic acids using the primer set 470ep-f9 (SEQ ID NO:98) and 470ep-R9 (SEQ ID NO:99). These primers amplify a fragment of HGV from the NS3 region. Fragment F9/R9 was amplified from 1 μl of PNF 2161 SISPA amplified DNA.
Amplifications were for 30 cycles of 1 minute at 94°C, 2 minutes at 52°C and 3 minutes at 72°C. The expected 777 nucleotide product was gel purified.
The primers were also used for amplification of the same fragment from a larger clone that was also obtained from PNF 2161 serum nucleic acids. The two purified DNA fragments were combined and partially digested with DNase I. The partially digested sample (designated the F9/R9 library) was ligated to KL1 SISPA linkers and digested with EcoRI. The F9/R9 DNA was ligated into lambda gt11 and packaged.
The clone K1-2-3a was isolated by screening of the library expressing the F9/R9 fragment. Ten plates at 30,000 plaques/plate were screened with PNF 2161 plasma diluted 1/100 in AIB. Twenty-two first round positive plaques were identified. Clone K1-2-3a was purified from one of these plaques and was repeatedly immunoreactive against PNF 2161 sera.
Sequencing of the K1-2-3a clone (SEQ ID NO:111; SEQ ID NO:112) indicated that it expresses a 44 amino acid insert. Analysis of the position of the K1-2-3a sequence with respect to the sequence of the negative strand of HGV indicated that K1-2-3 is contained within a 100 amino acid ORF that is located in the negative strand of the NS3 gene of HGV. This ORF contains 1 methionine. The total size of the ORF from the methionine to the termination codon is 51 amino acids. This methionine residue is also contained within the K1-2-3 sequence at position 4.
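The relationship between the full ORF length and the methionine-to-termination length quoted above is simple arithmetic, sketched below. The function names are hypothetical, and the sketch assumes that residue positions are 1-based and that both lengths count residues up to, but not including, the termination codon. Under that assumption, a 51 amino acid methionine-to-stop frame inside a 100 amino acid ORF would place the methionine at residue 50 of the ORF; that position is an inference from the two stated lengths, not a figure given in the text.

```python
def met_to_stop_length(orf_length_aa, met_position):
    """Residues from the methionine (1-based position) to the end of the ORF."""
    if not 1 <= met_position <= orf_length_aa:
        raise ValueError("methionine must lie inside the ORF")
    return orf_length_aa - met_position + 1


def implied_met_position(orf_length_aa, met_to_stop_aa):
    """Inverse relation: 1-based methionine position implied by the two lengths."""
    if not 1 <= met_to_stop_aa <= orf_length_aa:
        raise ValueError("Met-to-stop frame cannot exceed the ORF")
    return orf_length_aa - met_to_stop_aa + 1


# Lengths stated for the ORF containing the clone: 100 aa overall, 51 aa from Met.
met_pos = implied_met_position(100, 51)
print(met_pos, met_to_stop_length(100, met_pos))
```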
The next reverse-frame immunogenic region was designated the K3 region. The K3 series of clones was isolated from a library designated NS2. The library was generated using the primers given in Table 1 and SISPA amplified PNF 2162 DNA as template.
Table 1
Fragments (primer pair)                              nt    Region
9E3-REV (SEQ ID NO:100), E39-94PR (SEQ ID NO:101)    592   aa 358 (of 389) of E2 to aa 166 of NS-2
GEP-F12 (SEQ ID NO:102), GEP-R12 (SEQ ID NO:106)     663   aa 144 (of 313) of NS-2 to aa 51 of NS-3
GEP-F14 (SEQ ID NO:103), GEP-R13 (SEQ ID NO:107)     715   aa 357 - 594 of NS-3
470epF8 (SEQ ID NO:97), GEP-R14 (SEQ ID NO:108)      648   aa 716 - 847 of NS-5 (716 to end)
All amplifications were for 35 cycles of 94°C/1 minute, 48°C/2 minutes, and 73°C/3 minutes. All amplifications yielded at least a fragment of the expected size. The amplified products were mixed in an approximately 1:1:1:1 ratio and partially digested with DNase I. As above, the digestion products were ligated to KL1 SISPA linkers, amplified and EcoRI digested. The digested fragments were ligated into lambda gt11. The ligation reactions were packaged.
The packaged ligation products were plated. Screening of this epitope library with PNF 2161 serum resulted in the isolation of 35 putatively immunoreactive plaques. Of the 35 positive areas, 22 were repeatedly immunoreactive with PNF 2161 serum. Twelve of the positive plaques were purified, re-screened and sequenced.
Eight of the 12 clones contained essentially the same insert (not counting repeated sequences and linkers). These clones are K3-8-5A (SEQ ID NO:131; SEQ ID NO:132), K3-10-1D (SEQ ID NO:113; SEQ ID NO:114), K3-8-4C (SEQ ID NO:129; SEQ ID NO:130), K3-8-7C (SEQ ID NO:135; SEQ ID NO:136), K3-14-3A (SEQ ID NO:119; SEQ ID NO:120), K3-14-6A (SEQ ID NO:123; SEQ ID NO:124), K3-14-2A (SEQ ID NO:117; SEQ ID NO:118), and K3-14-5A (SEQ ID NO:121; SEQ ID NO:122). One of the 12 was the same as these 8 clones except for a 3 nt insertion (K3-17-1A; SEQ ID NO:125, SEQ ID NO:126).
One of the 12 clones was a unique chimera (K3-8-3A; SEQ ID NO:127, SEQ ID NO:128). Two of the 12 clones were unique long clones (K3-11-1A — SEQ ID NO:115, SEQ ID NO:116; and K3-8-6A — SEQ ID NO:133, SEQ ID NO:134).
All of the K3 clones express the negative strand of HGV (i.e., relative to the coding strand for the polyprotein). All of the K3 clones have completely open reading frames through their entire inserts. An alignment of these clones is presented as Figures 11A, 11B and 11C.
The K3 clones are contained within the PCR fragment derived from amplification with the 9e3-rev (SEQ ID NO:100) and E39-94pr (SEQ ID NO:101) primers. This fragment contains the COOH terminal 31 amino acids of the HGV E2 gene and the amino terminal 166 amino acids of the HGV NS2 gene.
All of the K3 clones contain a frame shift relative to the consensus sequence of the reverse strand of HGV: 11 of the 12 clones are missing 1 C residue; and the 12th clone (K3-17-1) contains 3 additional C residues.
The 5' end of all of the K3 clones is contained within a 171 amino acid ORF of the negative strand. This ORF contains a methionine at position 23, such that the greatest possible length of the methionine to termination codon open reading frame is 149 amino acids (approximately 18 kd). All of the K3 clones (except K3-8-6) have their 5' terminus defined by the PCR primer E39-94pr (SEQ ID NO:101), which corresponds to amino acid 87 of the 171 amino acid ORF. All of the clones continue in this ORF until the occurrence of the frame shift at amino acid 140. At this point, all clones frame shift into the 8th amino acid of a new ORF (Figure 11B). The clones all then express the sequence SEQ ID NO:149.
Then the reading frames of all the clones, except K3-8-6 and K3-11-1, shift to an 8 nucleotide sequence of unknown origin (coding the amino acids QHS) then into the sequence of the reverse primer 9e3-rev (SEQ ID NO:100) which expresses the amino acids SEQ ID NO:148 (Figure 11C). SEQ ID NO:148 is in the same frame as the common sequence SEQ ID NO:147 at amino acid 277 of the long combined frame (amino acid 144 of the 2nd frame).
The 2 clones K3-11-1 and K3-8-6 are co-linear with the new frames until their inserts end at amino acids 192 and 259. In summary, this group of clones contains multiple disparately located sequences, whose final contribution to the observed immunoreactivity is being determined. Primers for the subcloning of various permutations of the amino acid sequences from the K3 region have been designed. Subfragments of the K3 region will be cloned into the expression vector pGEX-HIS-B. Preliminary data confirm that 2 of these sequences are highly immunoreactive with PNF 2161 sera when expressed as a fusion protein with sj26.
The last negative strand immunogenic region is defined by the clones Y10-13-1 (SEQ ID NO:137; SEQ ID NO:138) and Y10-13-2 (SEQ ID NO:139; SEQ ID NO:140). These clones were derived from the envelope protein coding region. The env library was generated by PCR amplification of 1 μl of PNF 2161 SISPA-amplified material using the primers presented in Table 2.
Table 2
Fragments (primer pair)                              nt    Region
GEP-F15 (SEQ ID NO:104), GEP-R15 (SEQ ID NO:109)     525   ~182 amino acids of the COOH terminus of E2
GEP-F17 (SEQ ID NO:110), GEP-R16 (SEQ ID NO:105)     765   the COOH terminus of E1 through ~aa 220 of E2
PCR amplification was for 35 cycles of 94°C/1 minute, 52°C/1.5 minutes, 72°C/3 minutes. The amplified products were purified, partially digested with DNase I, and ligated to KL1 linkers. The ligated KL1 DNAs were amplified, digested with EcoRI and ligated into lambda gt11. This library was screened with the HGV-positive serum R34587: 150,000 recombinant phage were screened. From this screening, positive areas were isolated, plaque purified and re-screened. Three plaques were identified that were repeatedly reactive with R34587 sera. Two of these plaques, Y10-13-1 and Y10-13-2, were sequenced.
The clones Y10-13-1 and Y10-13-2 are contained within the PCR fragment defined by GEP-F17 and GEP-R16. The inserts of both clones represent continuous open reading frames. They are contained within a 139 amino acid ORF of the negative strand. This ORF has a methionine present at amino acid 22 (where the longest open reading frame is 117 amino acids, methionine to termination codon). Both clones start downstream of the methionine (Y10-13-1 = amino acids 39-116 of the ORF; Y10-13-2 = amino acids 57-116 of the ORF). The epitopes in all of the above clones will be mapped.
Further reverse-frame HGV antigens can be identified using the above-described methods and a selected HGV polynucleotide (e.g., SEQ ID NO:14 or SEQ ID NO:156, Example 13).
B. Reverse-Reading Frame Encoded Antigens in Other RNA Viruses.
The virus HCV is a member of the Flaviviridae family. Three members of the HCV group of viruses were analyzed for conserved, reverse open reading frames (accession numbers/viral designation, Genbank Ver. 83, Intelligenetics, Mountain View, CA): M58335/HPCHUMR; D90208/HPCJCG; and M62321/HPCPLYPRE. Two exemplary reverse open reading frames were identified that were conserved between the three members. Each of these open reading frames starts with a methionine codon and ends at a termination codon. Figure 10 shows a schematic of the inverse sequence of the HCV genome based on the 9401 base pair sequence obtained from isolate HPCPLYPRE. The open boxes in Figure 10 show several exemplary open reading frames; inverse ORF1 and inverse ORF2 represent the position of the two conserved open reading frames. The coordinates for these open reading frames are presented in Table 3.
Table 3
Virus    ORF   Start   End    ORF Size   SEQ ID NO:
M62321   1r    2876    3259   128        141
M62321   2r    3404    3835   144        142
M58335   1r    2900    3199   107        143
M58335   2r    3533    3934   134        144
D90208   1r    2900    3220   100        145
D90208   2r    3533    3935   134        146
Coordinates are expressed as number of base pairs from the 3' end of the positive strand of the virus.
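As a reading aid for Table 3, the sketch below shows one way the ORF Size column can be related to the Start and End coordinates. It assumes, and this is only an assumption, that Start and End are inclusive nucleotide positions and that ORF Size is the number of codons spanned; the function name is hypothetical. The assumption is checked here only against the two M62321 entries.

```python
def orf_size_from_coordinates(start, end):
    """Codons spanned by an inclusive nucleotide interval [start, end]."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return span // 3


# M62321 entries as listed in Table 3: (ORF label, Start, End).
for label, start, end in [("1r", 2876, 3259), ("2r", 3404, 3835)]:
    print(label, orf_size_from_coordinates(start, end))
```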
The present invention provides a novel method to determine whether a test subject has been infected with a virus. Experiments performed in support of the present invention suggest the expression and subsequent induction of antibodies to a polypeptide or polypeptides coded by reverse frames in the opposite direction of the major known reading frames of RNA viruses. This phenomenon forms the basis of a diagnostic assay based on detection of antibodies directed against polypeptide antigens coded for by the reverse frame of RNA viruses.
The reverse-frame antigens of the present invention can be utilized in the applications exemplified herein for HGV embodiments, for example, vaccine, antibodies, methods and diagnostics.
VI. Utility
A. IMMUNOASSAYS FOR HGV.
One utility for the antigens obtained by the methods of the present invention is their use as diagnostic reagents for the detection of antibodies present in the sera of test subjects infected with HGV hepatitis virus, thereby indicating infection in the subject. Examples include the 470-20-1 antigen, antigens encoded by SEQ ID NO:14 or its complement, and antigens encoded by portions of either strand of the complete viral sequence. The antigens of the present invention can be used singly, or in combination with each other, in order to detect HGV. The antigens of the present invention may also be coupled with diagnostic assays for other hepatitis agents such as HAV, HBV, HCV, and HEV.
In one diagnostic configuration, test serum is reacted with a solid phase reagent having a surface-bound antigen obtained by the methods of the present invention, e.g., the 470-20-1 antigen. After binding of anti-HGV antibody to the reagent and removal of unbound serum components by washing, the reagent is reacted with reporter-labelled anti-human antibody to bind reporter to the reagent in proportion to the amount of bound anti-HGV antibody on the solid support. The reagent is again washed to remove unbound labelled antibody, and the amount of reporter associated with the reagent is determined. Typically, the reporter is an enzyme which is detected by incubating the solid phase in the presence of a suitable fluorometric or colorimetric substrate (Sigma, St. Louis, MO).
The solid surface reagent in the above assay is prepared by known techniques for attaching protein material to solid support material, such as polymeric beads, dip sticks, 96-well plates or filter material. These attachment methods generally include non-specific adsorption of the protein to the support or covalent attachment of the protein, typically through a free amine group, to a chemically reactive group on the solid support, such as an activated carboxyl, hydroxyl, or aldehyde group. Alternatively, streptavidin coated plates can be used in conjunction with biotinylated antigen(s).
Also forming part of the invention is an assay system or kit for carrying out this diagnostic method. The kit generally includes a support with surface-bound recombinant HGV antigen (e.g., the 470-20-1 antigen, as above), and a reporter-labelled anti-human antibody for detecting surface-bound anti-HGV antigen antibody.
In a second diagnostic configuration, known as a homogeneous assay, antibody binding to a solid support produces some change in the reaction medium which can be directly detected in the medium. Known general types of homogeneous assays proposed heretofore include (a) spin-labelled reporters, where antibody binding to the antigen is detected by a change in reporter mobility (broadening of the spin splitting peaks), (b) fluorescent reporters, where binding is detected by a change in fluorescence efficiency or polarization, (c) enzyme reporters, where antibody binding causes enzyme/substrate interactions, and (d) liposome-bound reporters, where binding leads to liposome lysis and release of encapsulated reporter. The adaptation of these methods to the protein antigen of the present invention follows conventional methods for preparing homogeneous assay reagents.
In each of the assays described above, the assay method involves reacting the serum from a test individual with the protein antigen and examining the antigen for the presence of bound antibody. The examining may involve attaching a labelled anti-human antibody to the antibody being examined (for example from acute, chronic or convalescent phase) and measuring the amount of reporter bound to the solid support, as in the first method, or may involve observing the effect of antibody binding on a homogeneous assay reagent, as in the second method.
A third diagnostic configuration involves use of HGV antibodies capable of detecting HGV-specific antigens. The HGV antigens may be detected, for example, using an antigen capture assay where HGV antigens present in candidate serum samples are reacted with an HGV-specific monoclonal or polyclonal antibody. The antibody is bound to a solid substrate and the antigen is then detected by a second, different labelled anti-HGV antibody.
Antibodies can be prepared, utilizing the peptides of the present invention, by standard methods. Further, substantially isolated antibodies (essentially free of serum proteins which may affect reactivity) can be generated (e.g., affinity purification (Harlow et al.)).
B. HYBRIDIZATION ASSAYS FOR HGV.
One utility for the nucleic acid sequences obtained by the methods of the present invention is their use as diagnostic agents for HGV sequences present in sera, thereby indicating infection in the individual. Primers and/or probes derived from the coding sequences of the present invention, in particular, Clone 470-20-1 and SEQ ID NO:14, can be used singly, or in combination with each other, in order to detect HGV.
In one diagnostic configuration, test serum is reacted under PCR or RT-PCR conditions using primers derived from, for example, 470-20-1 sequences. The presence of HGV, in the serum used in the amplification reaction, can be detected by specific amplification of the sequences targeted by the primers. Example 4 describes the use of polymerase chain amplification reactions, employing primers derived from the clones of the present invention, to screen different source material. The results of these amplification reactions demonstrate the ability of primers derived from the clones of the present invention (for example, 470-20-1), to detect homologous sequences by amplification reactions employing a variety of different source templates. The amplification reactions in Example 4 included use of nucleic acids obtained directly from sera as template material.
Alternatively, probes can be derived from the HGV sequences of the present invention. These probes can then be labelled and used as hybridization probes against nucleic acids obtained from test serum or tissue samples. The probes can be labelled using a variety of reporter molecules and detected accordingly: for example, radioactive isotopic labelling and chemiluminescent detection reporter systems (Tropix, Bedford, Mass.).
Target amplification methods, embodied by the polymerase chain reaction, the self-sustained sequence replication technique ["3SR," (Guatelli, et al.; Gingeras, et al., 1990) also known as "NASBA" (VanGemen, et al.)], the ligase chain reaction (Barany), strand-displacement amplification ["SDA," (Walker)], and other techniques, multiply the number of copies of the target sequence. Signal amplification techniques, exemplified by branched-chain DNA probes (Horn and Urdea; Urdea; Urdea, et al.) and the Q-beta replicase method (Cahill, et al.; Lomell, et al.), first bind a specific molecular probe, then replicate all of or part of this probe or in some other manner amplify the probe signal.
For the detection of the specific nucleic acid sequences disclosed in the present invention or contiguous sequences in the same or a similar (related) viral genome, amplification and detection methodologies may be employed as alternatives to amplification by the PCR. A number of such techniques are known to the field of nucleic acid diagnostics (The 1992 San Diego Conference: Genetic Recognition, Clin. Chem. 29(4):705 (1993)).
1. SELF-SUSTAINED SEQUENCE REPLICATION.
The Self-Sustained Sequence Replication (3SR) technique results in amplification to a similar magnitude as PCR, but isothermally. Rather than thermal cycle-driven PCR, the 3SR operates as a concerted three-enzyme reaction of a) cDNA synthesis by reverse transcriptase, b) RNA strand degradation by RNase H, and c) RNA transcription by T7 RNA polymerase.
As the entire reaction sequence occurs isothermally (typically at 42°C), expensive temperature-cycling instrumentation is not required. In the absence of duplex denaturation via heating, organic solvents, or other mechanism, only single-stranded templates (i.e., predominantly RNA) are amplified. Suitable primers for use in 3SR amplification can be selected from the viral sequences of the present invention by those having ordinary skill in the art. For example, for isothermal amplification of viral sequences by the 3SR technique, primer 470-20-1-77F (SEQ ID NO:9) is modified by the addition of the T7 promoter sequence and a preferred T7 transcription initiation site to the 5'-end of the oligonucleotide. This modification results in a suitable 3SR primer T7-470-20-1-77F (SEQ ID NO:9). Primer 470-20-1-211R (SEQ ID NO:10) can be used in these reactions either without modification or T7 promoter. RNA extracted from PNF 2161 is incubated with AMV reverse transcriptase (30 U), RNase H (3 U), T7 RNA polymerase (100 U), in 100 μl reactions containing 20 mM Tris-HCl, pH 8.1 (at room temperature), 15 mM MgCl2, 10 mM KCl, 2 mM spermidine HCl, 5 mM dithiothreitol (DTT), 1 mM each of dATP, dCTP, dGTP, and TTP, 7 mM each of ATP, CTP, GTP, and UTP, and 0.15 μM each primer. Amplification takes place during incubation at 42°C for 1-2 h.
Initially, primer T7-470-20-1-77F anneals to the target RNA, and is extended by AMV reverse transcriptase to form cDNA complementary to the starting RNA strand. Following degradation of the RNA strand by RNase H, reverse transcriptase catalyzes the synthesis of the second strand DNA, resulting in a double-stranded template containing the (double-stranded) T7 promoter sequence. RNA transcription results in production of single-stranded RNA. This RNA then serves to re-enter the cycle for additional rounds of amplification, finally resulting in a pool of high-concentration product RNA. The product is predominantly single-stranded RNA of the same strand as the primer containing the T7 promoter (T7-470-20-1-77F), with much smaller amounts of cDNA.
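The primer modification described above (a T7 promoter and initiation site added to the 5'-end of a target-specific primer) can be sketched in a few lines of Python. This is a minimal illustration, not the sequence of SEQ ID NO:9: the promoter string is the commonly cited T7 consensus with a GGG initiation site, and both the function name add_t7_tail and the example primer are hypothetical.

```python
# Commonly cited T7 promoter consensus followed by a GGG initiation site.
# Assumption: the patent does not spell out the promoter sequence it used.
T7_PROMOTER = "TAATACGACTCACTATAGGG"


def add_t7_tail(primer: str) -> str:
    """Return the primer with the T7 promoter added to its 5'-end."""
    primer = primer.upper()
    if set(primer) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G and T")
    # Sequences are written 5'->3', so a 5' tail is a simple prefix.
    return T7_PROMOTER + primer


# Hypothetical target-specific primer, for illustration only.
tailed = add_t7_tail("acgtacgtac")
print(tailed)
```

The target-specific 3' portion is left unchanged, so annealing to the RNA target is unaffected; only the double-stranded product gains a functional promoter.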
Alternatively, the other primer (470-20-1-211R) may contain the T7 promoter, or both primers may contain the promoter, resulting in production of both strands of RNA as products of the reaction. Products of the 3SR reaction may be detected, characterized, or quantitated by standard techniques for the analysis of RNA (e.g., Northern blots, RNA slot or dot blots, direct gel electrophoresis with RNA-staining dyes) . Further, the products may be detected by methods making use of biotin-avidin affinity interactions or specific hybridizations of nucleic acid probes.
In one technique for rapid and specific analysis of 3SR products, solution hybridization of the product to radiolabelled oligonucleotide 470-20-1-152R (SEQ ID NO:21) is followed by non-denaturing polyacrylamide gel electrophoresis. This assay (a gel mobility shift-type assay) results in the detection of specific probe-product hybrid as a slower-moving band than the band corresponding to unhybridized oligonucleotide.
2. LIGASE CHAIN REACTION (LCR)
As another example of a detection system, the HGV sequence may form the basis for design of ligase chain reaction (LCR) primers. LCR makes use of the nick-closing activity of DNA ligase to join two immediately adjacent oligonucleotides possessing adjacent 5'-phosphate ("donor" oligo) and 3'-hydroxyl ("acceptor" oligo) termini. The property of DNA ligase to join only fully complementary ends in a template-dependent way leads to a high degree of specificity, in that ligation will not occur unless the termini to be linked are perfectly matched in sequence to the target strand.
As an alternative to PCR, with some advantages in terms of specificity for discrimination of single base mismatches between primer and target nucleic acid, the LCR may be used to detect or "type" strains of virus possessing homology to HGV sequences. These techniques are suitable for assessing the presence of specific mutations when such base changes are known to confer drug resistance (e.g., Larder and Kemp; Gingeras, et al., 1991).
In the presence of template-complementary donor and acceptor oligonucleotides and oligonucleotides complementary to the donor and acceptor, exponential amplification by LCR is possible. In this embodiment, each round of ligation generates additional template for subsequent rounds, in a cyclic reaction.
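The template-dependent specificity of LCR described above can be illustrated with a small Python sketch. It models only the sequence requirement (acceptor and donor annealed immediately adjacent to each other with every base matched), not ligase kinetics. The function name ligatable and all sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def ligatable(target: str, acceptor: str, donor: str) -> bool:
    """True if acceptor (3'-OH) and donor (5'-phosphate) anneal to the
    target immediately adjacent to one another with no mismatches."""
    # The joined product acceptor+donor must be fully complementary
    # to one contiguous stretch of the target strand.
    return reverse_complement(acceptor + donor) in target


# Hypothetical target and oligos, for illustration only.
target = "GGATCCAGTCTGAACGTTAGC"
product = reverse_complement(target[3:15])
acceptor, donor = product[:6], product[6:]
print(ligatable(target, acceptor, donor))
```

Changing the 3'-terminal base of the acceptor makes the check fail, which mirrors the single-base discrimination the text attributes to the ligase.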
For example, primer 470-20-1-211R (SEQ ID NO:10), an adjacent oligonucleotide (B, SEQ ID NO:22) and cognate oligos (211R', SEQ ID NO:23, and B', SEQ ID NO:24), can be used to perform LCR amplification of the sequence of this invention. Reverse transcription is first performed by standard methods to generate cDNA, which is then amplified in reactions containing 0.1-1 μM each of the four LCR primers, 20 mM Tris-HCl, pH 8.3 (room temperature), 25 mM KCl, 10 mM MgCl2, 10 mM dithiothreitol (DTT), 0.5 mM NAD+, 0.01% Triton X-100, and 5 Units of DNA ligase (Ampligase, Epicentre Technologies, Madison, WI, or other commercial supplier of thermostable DNA ligase), in 25 μl reactions. Thermal cycling is performed at 94°C for 1 min 30 s; then 94°C for 1 min and 65°C for 2 min, repeated for 25-40 cycles. Specificity of product synthesis depends on primer-template match at the 3'-terminal position. Products are detected by polyacrylamide gel electrophoresis, followed by ethidium bromide staining; alternatively, one of the acceptor oligos (211R' or B) is 5'-radiolabelled for visualization by autoradiography following gel electrophoresis.
Alternatively, a donor oligo is 3'-end-labelled with a specific bindable moiety (e.g., biotin), and the acceptor is 5'-labelled with a specific detectable group (e.g., a fluorescent dye), for solid phase capture and detection.
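As a rough planning aid, the hold time of the LCR cycling program given above (94°C for 1 min 30 s, then 25-40 cycles of 94°C for 1 min and 65°C for 2 min) can be totalled as follows. The sketch ignores ramp times between temperatures, which depend on the instrument; the function name program_minutes is hypothetical.

```python
def program_minutes(initial_s: int, step_seconds: list, cycles: int) -> float:
    """Total hold time in minutes for an initial step plus repeated cycles.

    Ramp times between temperatures are not included.
    """
    return (initial_s + cycles * sum(step_seconds)) / 60


# Initial denaturation 90 s; each cycle 60 s at 94 C plus 120 s at 65 C.
shortest = program_minutes(90, [60, 120], 25)
longest = program_minutes(90, [60, 120], 40)
print(shortest, longest)
```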
3. METHODS FOR ANALYSIS OF AMPLIFIED DNA
Numerous techniques have been described for the analysis of amplified DNA. Several such techniques are advantageous for high-throughput applications, where gel electrophoresis is impractical, for example, rapid and high-resolution HPLC techniques (Katz and Dong). However, in general, methods for infectious disease organism screening using nucleic acid probes involve a separate post-amplification hybridization step in order to assure requisite specificity for pathogen detection.
One such detection embodiment is an affinity-based hybrid capture technique (Holodniy, et al.). In this embodiment the PCR is conducted with one biotinylated primer. Following amplification, the double-stranded product is denatured then hybridized to a peroxidase-labelled probe complementary to the strand having incorporated the biotinylated primer. The hybridized product is then incubated in a buffer which is in contact with an avidin (or streptavidin) coated surface (e.g., membrane filter, microwell, latex or paramagnetic beads). The mass of coated solid phase which contacts the volume of PCR product to be analyzed by this method must contain sufficient biotin-binding sites to capture essentially all of the free biotinylated primer, as well as the much lower concentration of biotinylated PCR product. Following three to four washes of the solid phase, bound hybridized product is detected by incubation with o-phenylenediamine in citrate buffer containing hydrogen peroxide. Alternatively, capture may be mediated by probe-coated surfaces, followed by affinity-based detection via the biotinylated primer and an avidin-reporter enzyme conjugate (Whetsell, et al.).
4. ADDITIONAL METHODS
Viral sequences of the present invention may also form the basis for a signal amplification approach to detection, using branched-chain DNA probes. Branched-chain probes (Horn and Urdea; Urdea) have been described for detection and quantification of rare RNA and DNA sequences (Urdea, et al.). In this method, an oligonucleotide probe (RNA, DNA, or nucleic acid analogue) is synthesized with a sequence complementary to the target RNA or DNA. The probe also contains a unique branching sequence or sequences not complementary to the target RNA or DNA.
This unique sequence constitutes a target for hybridization of branched secondary detector probes, each of which contains one or more other unique sequences, serving as targets for tertiary probes. At each branch point in the signal amplification pathway, a different unique sequence directs hybridization of secondary, tertiary, etc., detection probes. The last probe in the series typically is linked to an enzyme useful for detection (e.g., alkaline phosphatase). The sequential hybridization of primers eventually results in the buildup of a highly-branched structure, the arms of which terminate in enzyme-linked probes.
Enzymatic turnover provides a final amplification, and the choice of highly sensitive chemiluminescent substrates (e.g., LumiPhos, Lumigen, Detroit, MI, as a substrate for alkaline phosphatase labels) results in exquisite sensitivity, on the order of 10,000 molecules or less of original target sequence per assay. In such a detection method, amplification depends only on molecular hybridization, rather than enzymatic mechanisms, and is thus far less susceptible to inhibitory substances in clinical specimens than, for example, PCR. Thus, this detection method allows the use of crude techniques for nucleic acid release in test samples, without extensive purification before assay.
Amplification for sensitive detection of the viral sequences of the present invention may also be accomplished by the Q-β replicase technique (Cahill, et al.; Lomell, et al.; Pritchard, et al.). In this method, a specific probe is designed to be complementary to the target sequence. This probe is then inserted by standard molecular cloning techniques into the sequence of the replicatable RNA from Q-β phage. Insertion into a specific region of the replicon does not prevent replication by Q-β replicase.
Following molecular hybridization, and several cycles of washing, the replicase is added and amplification of the probe RNA ensues. "Reversible target capture" is one known technique for reducing the potential background from replication of unhybridized probes (Morrissey, et al . ) . Amplified replicons are detectable by standard molecular hybridization techniques employing DNA, RNA or nucleic acid analogue probes.
Additional methods for amplification and detection of rare DNA or RNA sequences are known in the literature and preferred to the PCR for some applications in the field of molecular diagnostics. These alternative techniques may form the basis for detection, characterization (e.g., sequence diversity existing as multiple related strains of the sequence described herein, genotypic changes characteristic of drug resistance) , or quantification of the sequence disclosed in the present invention.
Also forming part of the invention are assay systems or kits for carrying out the amplification/hybridization assay methods just described. Such kits generally include either specific primers for use in amplification reactions or hybridization probes.
The following examples illustrate, but in no way are intended to limit the present invention.
MATERIALS AND METHODS
E. coli DNA polymerase I (Klenow fragment) was obtained from Boehringer Mannheim Biochemicals (BMB) (Indianapolis, IN). T4 DNA ligase and T4 DNA polymerase were obtained from New England Biolabs (Beverly, MA); Nitrocellulose and "NYTRAN" filters were obtained from Schleicher and Schuell (Keene, NH).
Synthetic oligonucleotide linkers and primers were prepared using commercially available automated oligonucleotide synthesizers. Alternatively, custom designed synthetic oligonucleotides may be purchased from commercial suppliers. cDNA synthesis kit and random priming labeling kits were obtained from BMB (Indianapolis, IN) or GIBCO/BRL (Gaithersburg, MD).
Standard molecular biology and cloning techniques were performed essentially as previously described in Ausubel, et al., Sambrook, et al., and Maniatis, et al.
Common manipulations relevant to employing antisera and/or antibodies for screening and detection of immunoreactive protein antigens were performed essentially as described (Harlow, et al.). Similarly, ELISA and Western blot assays for the detection of anti-viral antibodies were performed either as described by their manufacturer (Abbott, N. Chicago, IL, Genelabs Diagnostics, Singapore) or using standard techniques known in the art (Harlow, et al.).
EXAMPLE 1
CONSTRUCTION OF PNF2161 cDNA LIBRARIES
A. ISOLATION OF RNA FROM SERA.
One milliliter of undiluted PNF 2161 serum was precipitated by the addition of PEG (MW 6,000) to 8% and centrifugation at 12K, for 15 minutes in a microfuge, at 4°C. RNA was extracted from the resulting serum pellet essentially as described by Chomczynski.
The pellet was treated with a solution containing 4M guanidinium isothiocyanate, 0.18% 2-mercaptoethanol, and 0.5% sarcosyl. The treated pellet was extracted several times with acidic phenol-chloroform, and the RNA was precipitated with ethanol. This solution was held at -70°C for approximately 10 minutes and then spun in a microfuge at 4°C for 10 minutes. The resulting pellet was resuspended in 100 μl of DEPC-treated (diethyl pyrocarbonate) water, and 10 μl of 3M NaOAc, pH = 5.2, two volumes of 100% ethanol and one volume of 100% isopropanol were added to the solution. The solution was held at -70°C for at least 10 minutes. The RNA pellet was recovered by centrifugation in a microfuge at 12,000 x g for 15 minutes at 5°C. The pellet was washed in 70% ethanol and dried under vacuum.
B. SYNTHESIS OF CDNA
(i) FIRST STRAND SYNTHESIS
The synthesis of cDNA molecules was accomplished as follows. The above described RNA preparations were transcribed into cDNA, according to the method of Gubler et al. using random nucleotide hexamer primers (cDNA Synthesis Kit, BMB, Indianapolis, IN or GIBCO/BRL).
After the second-strand cDNA synthesis, T4 DNA polymerase was added to the mixture to maximize the number of blunt-ends of cDNA molecules. The reaction mixture was incubated at room temperature for 10 minutes. The reaction mixture was extracted with phenol/chloroform and chloroform/isoamyl alcohol. The cDNA was precipitated by the addition of two volumes of 100% ethanol and chilling at -70°C for 15 minutes. The cDNA was collected by centrifugation, the pellet washed with 70% ethanol and dried under vacuum.
C. AMPLIFICATION OF THE DOUBLE STRANDED CDNA MOLECULES.
The cDNA pellet was resuspended in 12 μl distilled water. To the resuspended cDNA molecules the following components were added: 5 μl phosphorylated linkers (Linker AB, a double strand linker comprised of SEQ ID NO:1 and SEQ ID NO:2, where SEQ ID NO:2 is in a 3' to 5' orientation relative to SEQ ID NO:1, as a partially complementary sequence to SEQ ID NO:1), 2 μl 10x ligation buffer (0.66 M Tris.Cl pH=7.6, 50 mM MgCl2, 50 mM DTT, 10 mM ATP) and 1 μl T4 DNA ligase (0.3 to 0.6 Weiss Units). Typically, the cDNA and linker were mixed at a 1:100 ratio. The reaction was incubated at 14°C overnight. The following morning the reaction was incubated at 70°C for three minutes to inactivate the ligase.
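The working concentrations in the linker ligation above follow from the listed volumes. The short Python sketch below totals the four components and dilutes the 10x buffer accordingly. It assumes that the four listed components make up the whole reaction volume; the variable and function names are illustrative.

```python
def final_concentration(stock: float, stock_volume_ul: float,
                        total_volume_ul: float) -> float:
    """Concentration after diluting a stock into the total reaction volume."""
    return stock * stock_volume_ul / total_volume_ul


# Volumes listed in the text (microliters).
volumes_ul = {"cDNA in water": 12, "linkers": 5,
              "10x ligation buffer": 2, "T4 DNA ligase": 1}
total_ul = sum(volumes_ul.values())

# 10x buffer composition from the text, expressed in mM.
buffer_10x_mm = {"Tris-Cl": 660.0, "MgCl2": 50.0, "DTT": 50.0, "ATP": 10.0}

final_mm = {
    name: final_concentration(conc, volumes_ul["10x ligation buffer"], total_ul)
    for name, conc in buffer_10x_mm.items()
}
print(total_ul, final_mm)
```

Under that assumption the reaction is 20 μl and the buffer is at 1x, i.e. about 66 mM Tris-Cl, 5 mM MgCl2, 5 mM DTT and 1 mM ATP.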
To 100 μl of 10 mM Tris-Cl buffer, pH 8.3, containing 1.5 mM MgCl2 and 50 mM KCl (Buffer A) was added about 1 μl of the linker-ligated cDNA preparation, 2 μM of a primer having the sequence shown as SEQ ID NO:1, 200 μM each of dATP, dCTP, dGTP, and dTTP, and 2.5 units of Thermus aquaticus DNA polymerase (Taq polymerase). The reaction mixture was heated to 94°C for 30 sec for denaturation, allowed to cool to 50°C for 30 sec for primer annealing, and then heated to 72°C for 0.5-3 minutes to allow for primer extension by Taq polymerase. The amplification reaction, involving successive heating, cooling, and polymerase reaction, was repeated an additional 25-40 times with the aid of a Perkin-Elmer Cetus DNA thermal cycler (Mullis; Mullis, et al.; Reyes, et al., 1991; Perkin-Elmer Cetus, Norwalk, CT).
After the amplification reactions, the solution was extracted with phenol/chloroform and chloroform/isoamyl alcohol and precipitated with two volumes of ethanol. The resulting amplified cDNA pellets were resuspended in 20 μl TE (pH=7.5).
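For orientation, the theoretical yield of the cycling described above can be estimated with the usual exponential model, in which each cycle multiplies the template by (1 + efficiency). This is an idealized upper bound rather than a measured result, since real reactions plateau; the function name fold_amplification and the efficiency parameter are illustrative assumptions.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0 to 1).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


# One initial cycle plus "an additional 25-40" gives 26 to 41 cycles in total.
low = fold_amplification(26)
high = fold_amplification(41)
print(f"{low:.3g} to {high:.3g} fold at perfect efficiency")
```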
D. CLONING OF THE CDNA INTO LAMBDA VECTORS.
The linkers used in the construction of the cDNAs contained an EcoRI site which allowed for direct insertion of the amplified cDNAs into lambda gt11 vectors (Promega, Madison WI or Stratagene, La Jolla, CA). Lambda vectors were purchased from the manufacturer (Promega) which were already digested with EcoRI and treated with alkaline phosphatase, to remove the 5' phosphate and prevent self-ligation of the vector.
The EcoRI-digested cDNA preparations were ligated into lambda gt11 (Promega). The conditions of the ligation reactions were as follows: 1 μl vector DNA (Promega, 0.5 mg/ml); 0.5 or 3 μl of the PCR amplified insert cDNA; 0.5 μl 10 x ligation buffer (0.5 M Tris-HCl, pH=7.8; 0.1 M MgCl2; 0.2 M DTT; 10 mM ATP; 0.5 g/ml bovine serum albumin (BSA)), 0.5 μl T4 DNA ligase (0.3 to 0.6 Weiss units) and distilled water to a final reaction volume of 5 μl.
The ligation reactions were incubated at 14°C overnight (12-18 hours). The ligated cDNA was packaged by standard procedures using a lambda DNA packaging system ("GIGAPAK", Stratagene, LaJolla, CA), and then plated at various dilutions to determine the titer. A standard X-gal blue/white assay was used to determine recombinant frequency of the libraries (Miller; Maniatis et al.).
Percent recombination in each library was also determined as follows. A number of random clones were selected and corresponding phage DNA isolated. Polymerase chain reaction (Mullis; Mullis, et al.) was then performed using isolated phage DNA as template and lambda DNA sequences, derived from lambda sequences flanking the EcoRI insert site for the cDNA molecules, as primers. The presence or absence of insert was evident from gel analysis of the polymerase chain reaction products. The cDNA-insert phage libraries generated from serum sample PNF 2161 were deposited with the American Type Culture Collection, 12301 Parklawn Dr., Rockville MD 20852, and have been assigned the deposit designation ATCC 75268 (PNF 2161 cDNA source).
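The recombinant frequency from a blue/white assay is a simple proportion: in lambda gt11, an insert at the EcoRI site interrupts the β-galactosidase gene, so recombinant plaques are clear and non-recombinant plaques are blue. A minimal Python sketch follows; the plaque counts are invented for illustration and are not data from this library.

```python
def percent_recombinant(clear_plaques: int, blue_plaques: int) -> float:
    """Percentage of plaques that are clear (insert-containing)."""
    total = clear_plaques + blue_plaques
    if total == 0:
        raise ValueError("no plaques counted")
    return 100.0 * clear_plaques / total


# Hypothetical counts from one titer plate.
print(percent_recombinant(90, 10))
```

The same function applies to the PCR check described above, with clones that show an insert band counted in place of clear plaques.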
EXAMPLE 2
IMMUNOSCREENING OF RECOMBINANT LIBRARIES
The lambda gt11 libraries generated in Example 1 were immunoscreened for the production of antigens recognizable by the PNF 2161 serum from which the libraries were generated. The phage were plated for plaque formation using the Escherichia coli bacterial plating strain E. coli KM392. Alternatively, E. coli Y1090R- may be used (Promega, Madison WI). The fusion proteins expressed by the lambda gt11 clones were screened with serum antibodies essentially as described by Ausubel, et al.
Each library was plated at approximately 2 x 10^4 phages per 150 mm plate. Plates were overlaid with nitrocellulose filters overnight. Filters were washed with TBS (10 mM Tris, pH 7.5; 150 mM NaCl), blocked with AIB (TBS buffer with 1% gelatin) and incubated with a primary antibody diluted 100 times in AIB.
After washing with TBS, filters were incubated with a second antibody, goat-anti-human IgG conjugated to alkaline phosphatase (Promega). Reactive plaques were developed with a substrate (for example, BCIP, 5-bromo-4-chloro-3-indolyl-phosphate), with NBT (nitro blue tetrazolium salt (Sigma)). Positive areas from the primary screening were replated and immunoscreened until pure plaques were obtained.
EXAMPLE 3
SCREENING OF THE PNF 2161 LIBRARY
The cDNA library of PNF 2161 in lambda gt11 was screened, as described in Example 2, with PNF 2161 sera. The results of the screening are presented in Table 4.
Table 4
PNF2161 Libraries
[Table 4 appears as an image in the published application; its column notes follow.]
cDNA library constructed from the indicated human source.
Percent recombinant clones in the indicated λgt11 library as determined by blue/white plaque assay and confirmed by PCR amplification of randomly selected clones.
Antisera source used for the immunoscreening of each indicated library.
One of the clones isolated by the above screen (PNF 2161 clone 470-20-1, SEQ ID NO:3; β-galactosidase in-frame fusion translated sequence, SEQ ID NO:4), was used to generate extension clones, as described in Example 6. The clone 470-20-1 is deposited at Genelabs Technologies, Incorporated, 505 Penobscot Drive, Redwood City, CA 94063. Clone 470-20-1 nucleic acid sequence is presented as SEQ ID NO:3 (protein sequence SEQ ID NO:4). The isolated nucleic acid sequence without the SISPA cloning linkers is presented as SEQ ID NO:19 (protein SEQ ID NO:20).
EXAMPLE 4
CHARACTERIZATION OF THE IMMUNOREACTIVE 470-20-1 CLONE
A. SOUTHERN BLOT ANALYSIS OF IMMUNOREACTIVE CLONES.
The inserts of immunoreactive clones were screened for their ability to hybridize to the following control DNA sources: normal human peripheral blood lymphocyte (purchased from Stanford University Blood Bank, Stanford, California) DNA, and Escherichia coli KM392 genomic DNA (Ausubel, et al.; Maniatis, et al.; Sambrook, et al.). Ten micrograms of human lymphocyte DNA and 2 micrograms of E. coli genomic DNA were digested with EcoRI and HindIII. The restriction digestion products were electrophoretically fractionated on an agarose gel (Ausubel, et al.) and transferred to nylon or nitrocellulose membranes (Schleicher and Schuell, Keene, NH) as per the manufacturer's instructions.
Probes from the immunoreactive clones were prepared as follows. Each clone was amplified using primers corresponding to lambda gt11 sequences that flank the EcoRI cloning site of the gt11 vector. Amplification was carried out by polymerase chain reactions utilizing each immunoreactive clone as template. The resulting amplification products were digested with EcoRI, the amplified fragments gel purified and eluted from the gel (Ausubel, et al.). The resulting amplified fragments, derived from the immunoreactive clones, were then random prime labelled using a commercially available kit (BMB) employing 32P-dNTPs.
The random primed probes were then hybridized to the above-prepared nylon membrane to test for hybridization of the insert sequences to the control DNAs. The 470-20-1 insert did not hybridize with any of the control DNAs.
As positive hybridization controls, a probe derived from a human C-kappa gene fragment (Hieter) was used as single gene copy control for human DNA and an E. coli polymerase gene fragment was similarly used for E. coli DNA.
B. GENOMIC PCR.
PCR detection was developed first to verify exogenicity with respect to several genomic DNAs which could have been inadvertently cloned during library construction, then to test for the presence of the cloned sequence in the cloning source and related specimen materials. Several different types of specimens, including SISPA-amplified nucleic acids and nucleic acids extracted from the primary source, and nucleic acids extracted from related source materials (e.g., from animal passage studies) , were tested.
The term "genomic PCR" refers to testing for the presence of specific sequences in genomic DNA from relevant organisms. For example, a genomic PCR for a Mystax-derived clone would include genomic DNAs as follows:
1. human DNA (1 μg/rxn.)
2. Mystax DNA (0.1-1 μg/rxn.)
3. E. coli (10-100 ng/rxn.)
4. yeast (10-100 ng/rxn.)
Human and Mystax DNAs are tested, as the immediate and ultimate source for the agent. E. coli genomic DNA, as a frequent contaminant of commercial enzyme preparations, is tested. Yeast is also tested, as a ubiquitous organism, whose DNA can contaminate reagents and thus, be cloned.
In addition, a negative control (i.e., buffer or water only), and positive controls to include approximately 10^5 copies/rxn., are also amplified. Amplification conditions vary, as may be determined for individual sequences, but follow closely the following standard PCR protocol: PCR was performed in reactions containing 10 mM Tris, pH 8.3, 50 mM KCl, 1.75 mM MgCl2, 1.0 μM each primer, 200 μM each dATP, dCTP, and dGTP, and 300 μM dUTP, 2.5 units Taq DNA polymerase, and 0.2 units uracil-N-glycosylase per 100 μl reaction. Cycling was for at least 1 minute at 94°C, followed by 30 to 40 repetitions of denaturation (92-94°C for 15 seconds), annealing (55-56°C for 30 seconds), and extension (72°C for 30 seconds). PCR reagents were assembled, and amplification reactions were constituted, in a specially-designated laboratory maintained free of amplified DNA. As a further barrier to contamination by amplified sequences and thus compromise of the test by "false positives," the PCR was performed with dUTP replacing TTP, in order to render the amplified sequences biochemically distinguishable from native DNA. To enzymatically render unamplifiable any contaminating PCR product, the enzyme uracil-N-glycosylase was included in all genomic PCR reactions. Upon conclusion of thermal cycling, the reactions were held at 72°C to prevent renaturation of uracil-N-glycosylase and possible degradation of amplified U-containing sequences.
A "HOT START PCR" was performed, using standard techniques ("AMPLIWAX", Perkin-Elmer Biotechnology; alternatively, manual techniques were used) , in order to make the above general protocol more robust for amplification of diverse sequences, which ideally require different amplification conditions for maximal sensitivity and specificity.
Detection of amplified DNA was performed by hybridization to specific oligonucleotide probes located internal to the two PCR primer sequences and having no or minimal overlap with the primers. In some cases, direct visualization of electrophoresed PCR products was performed, using ethidium bromide fluorescence, but probe hybridization was in each case also performed, to help ensure discrimination between specific and non-specific amplification products. Hybridization to radiolabelled probes in solution was followed by electrophoresis in 8-15% polyacrylamide gels (as appropriate to the size of the amplified sequence) and autoradiography.
Clone 470-20-1 was tested by genomic PCR, against human, E. coli, and yeast DNAs. No specific sequence was detected in negative control reactions, nor in any genomic DNA which was tested, and 10^5 copies of DNA/reaction resulted in a readily-detectable signal. This sensitivity (i.e., 10^5/reaction) is adequate for detection of single-copy human sequences in reactions containing 1 μg total DNA, representing the DNA from approximately 1.5 x 10^5 cells.
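The cell-equivalent figure quoted above can be checked with a short calculation. The sketch assumes roughly 6.6 pg of DNA per diploid human cell, a commonly used approximation that is not stated in the text; the constant and function names are illustrative.

```python
# Assumption: about 6.6 pg of genomic DNA per diploid human cell.
PG_DNA_PER_DIPLOID_CELL = 6.6


def cell_equivalents(dna_ug: float) -> float:
    """Approximate number of diploid human cells represented by a DNA mass."""
    picograms = dna_ug * 1e6
    return picograms / PG_DNA_PER_DIPLOID_CELL


cells = cell_equivalents(1.0)
print(f"{cells:.2e} cell equivalents in 1 ug of human DNA")
```

With these numbers 1 μg corresponds to about 1.5 x 10^5 cells, so even a sequence present once per cell would be at or above the 10^5 copies/reaction detection level.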
C. DIRECT SERUM PCR
Serum or other cloning source or related source materials were directly tested by PCR using primers from selected cloned sequences. In these experiments, HGV viral particles were directly precipitated from sera with polyethylene glycol (PEG), or, in the case of PNF and certain other sera, were pelleted by ultracentrifugation. For purification of RNA, the pelleted materials were dissolved in guanidinium thiocyanate and extracted by the acid guanidinium phenol technique (Chomczynski, et al.).
Alternatively, a modification of this method afforded through and implemented by the use of commercially available reagents, e.g., "TRIREAGENT" (Molecular Research Center, Cincinnati, OH) or "TRIZOL" (Life Technologies, Gaithersburg, MD), and associated protocols was used to isolate RNA. In addition, RNA suitable for PCR analysis was isolated directly from serum or other fluids containing virus, without prior concentration or pelleting of virus particles, through the use of "PURESCRIPT" reagents and protocols (Gentra Systems, Minneapolis, MN). Isolated DNA was used directly as a template for the PCR. RNA was reverse transcribed using reverse transcriptase (Gibco/BRL), and the cDNA product was then used as a template for subsequent PCR amplification. In the case of 470-20-1, nucleic acid from the equivalent of 20-50 μl of PNF serum was used as the input template into each RT-PCR or PCR reaction. Primers were designed based on the 470-20-1 sequence, as follows: 470-20-1-77F (SEQ ID NO:9) and 470-20-1-211R (SEQ ID NO:10). Reverse transcription was performed using MMLV-RT (Gibco/BRL) and random hexamers (Promega) by incubation at room temperature for approximately 10 minutes, 42°C for 15 minutes, and 99°C for 5 minutes, with rapid cooling to 4°C. The synthesized cDNA was amplified directly, without purification, by PCR, in reactions containing 1.75 mM MgCl2, 0.2-1 μM each primer, 200 μM each dATP, dCTP, dGTP, and dTTP, and 2.5-5.0 units Taq DNA polymerase ("AMPLITAQ", Perkin-Elmer) per 100 μl reaction. Cycling was for at least one minute at 94°C, followed by 40-45 repetitions of denaturation (94°C for 15 seconds for 10 cycles; 92°C or 94°C for 15 seconds for the succeeding cycles), annealing (55°C for 30 seconds), and extension (72°C for 30 seconds), in the "GENEAMP SYSTEM 9600" thermal cycler (Perkin-Elmer) or comparable cycling conditions in other thermal cyclers (Perkin-Elmer; MJ Research, Watertown, MA).
Positive controls consisted of (i) previously amplified PCR product whose concentration was estimated using the Hoechst 33258 fluorescence assay, (ii) purified plasmid DNA containing the DNA sequence of interest, or (iii) purified RNA transcripts derived from plasmid clones in which the DNA sequence of interest is disposed under the transcriptional control of phage RNA promoters such as T7, T3, or SP6 and RNA prepared through the use of commercially available in vitro transcription kits. In addition, an aliquot of positive control DNA corresponding to approximately 10-100 copies/rxn. can be spiked into reactions containing nucleic acids extracted from the cloning source specimen, as a control for the presence of inhibitors of DNA amplification reactions. Each separate extract was tested with at least one positive control.
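Turning a fluorometric mass estimate into a copy number, as needed to prepare positive controls at a stated number of copies per reaction, is a standard conversion. The sketch below assumes an average of about 650 g/mol per base pair of double-stranded DNA; the 150 bp fragment length in the example is hypothetical and is not the actual amplicon size.

```python
AVOGADRO = 6.022e23
# Assumption: average molar mass of one base pair of double-stranded DNA.
GRAMS_PER_MOL_PER_BP = 650.0


def dsdna_copies(mass_ng: float, length_bp: int) -> float:
    """Number of double-stranded DNA molecules in a given mass."""
    moles = mass_ng * 1e-9 / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO


def mass_ng_for_copies(copies: float, length_bp: int) -> float:
    """Mass in nanograms that contains the requested number of molecules."""
    return copies / AVOGADRO * length_bp * GRAMS_PER_MOL_PER_BP * 1e9


# Hypothetical 150 bp control fragment at 10^5 copies per reaction.
needed_ng = mass_ng_for_copies(1e5, 150)
print(f"{needed_ng:.2e} ng per reaction")
```

Because the amounts involved are far below what can be pipetted directly, such controls are in practice prepared by serial dilution from a quantified stock.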
Specific products were detected by hybridization to a specific oligonucleotide probe, 470-20-1-152F (SEQ ID NO:16), for confirmation of specificity. Hybridization of 10 μl of PCR product was performed in solution in 20 μl reactions containing approximately 1 x 10^6 cpm of 32P-labelled 470-20-1-152F. Specific hybrids were detected following electrophoretic separation from unhybridized oligonucleotide in polyacrylamide gels, and autoradiography.
In addition to PNF, extracted nucleic acids from normal serum were also reverse transcribed and amplified, using the "serum PCR" protocol sequence. No signal was detected in normal human serum. The specific signal in PNF serum was reproducibly detected in multiple extracts with the 470-20-1 specific primers.
D. AMPLIFICATION FROM SISPA UNCLONED NUCLEIC ACIDS
SISPA (Sequence-Independent Single Primer Amplification) amplified cDNA was used as template (Example 1). Sequence-specific primers designed from selected cloned sequences were used to amplify DNA fragments of interest from the templates. Typically, the templates were the SISPA-amplified samples used in the cloning manipulations. For example, amplification primers 470-20-1-77F (SEQ ID NO:9) and 470-20-1-211R (SEQ ID NO:10) were selected from the clone 470-20-1 sequence (SEQ ID NO:3). These primers were used in amplification reactions with the SISPA-amplified PNF2161 cDNA as a template.
The identity of the amplified DNA fragments was confirmed by (i) hybridization with the specific oligonucleotide probe 470-20-1-152F (SEQ ID NO:16), designed based on the 470-20-1 sequence (SEQ ID NO:3), and/or (ii) size. The probe used for DNA blot detection was labelled with digoxigenin using terminal transferase according to the manufacturer's recommendations (BMB). Hybridization to the amplified DNA was then performed using either Southern blot or liquid hybridization (Kumar, et al., 1989) analyses.
Positive control DNA used in the amplification reactions was previously amplified PCR product whose concentration was estimated by the Hoechst 33258 fluorescence assay, or, alternatively, purified plasmid DNA containing the cloned inserts of interest.
The 470-20-1 specific signal was detected in cDNA amplified by PCR from SISPA-amplified PNF2161. Negative control reactions were nonreactive, and positive control DNA templates were detected.
E. AMPLIFICATION FROM LIVER RNA SAMPLES.
RNA was prepared from liver biopsy material following the methods of Cathal, et al., wherein tissue was extracted in 5M guanidine thiocyanate followed by direct precipitation of RNA by 4M LiCl. After washing of the RNA pellet with 2M LiCl, residual contaminating protein was removed by extraction with phenol:chloroform and the RNA was recovered by ethanol precipitation.
The 470-20-1 specific primers were also used in amplification reactions with the following RNA sources as substrate: normal mystax liver RNA, normal tamarin (Saguinus labiatus) liver RNA, and MY131 liver RNA. MY131 is a mystax that was infected with PNF 2161 plasma. MY131 liver RNA did not give amplified products with the non-coding primers (SEQ ID NO:7 and SEQ ID NO:8) of HCV.
The amplification reactions were carried out in duplicate for two experiments. The results of these amplification reactions are presented in Table 5.
Table 5
PCR with 470-20-1 Primers
[Table 5 is reproduced only as an image (imgf000074_0001) in the source document.]
These results demonstrate that the 470-20-1 sequences are present in the parent serum sample (PNF 2161) and in a liver RNA sample from a passage animal of the PNF 2161 sample (MY131). In contrast, both control RNAs were negative for the presence of 470-20-1 sequences.
F. SCREENING OF A SERUM PANEL FOR HGV SEQUENCES BY POLYMERASE CHAIN REACTION USING RNA TEMPLATES.
1. PCR SCREENING OF HIGH-ALT DONORS FOR HGV
The disease association between HGV and liver disease was assessed by polymerase chain reaction screening, using HGV specific primers, of sera from hepatitis patients and from blood donors with abnormal liver function. The latter consisted of serum from blood donations with serum ALT levels greater than 45 International Units per ml. A serum panel consisting of 152 total sera was selected. The following sera were selected for the serum panel: 104 high-ALT sera from screened blood donations at the Stanford University Blood Bank (SUBB); 34 N-(ABCDE) hepatitis sera from northern California, Egypt, and Peru; and 14 sera from other N-(ABCDE) donors suspected of having liver disease and/or hepatitis virus infection. The negative controls for the panel were as follows: 9 highly-screened blood donors (SUBB) notable for the absence of risk factors for viral infections ("supernormal" sera, e.g., O-negative, Rh-negative; negative for HIV, known hepatitis agents, and CMV; whose multiple previous blood donations had been transfused without causing disease); and 2 random blood donors. These sera were assayed for the presence of HGV specific sequences by RT-PCR using the 470-20-1 primers 77F (SEQ ID NO:9) and 211R (SEQ ID NO:10).
RNA extraction and RT-PCR were performed essentially as described in Example 4C, except that the primer 470-20-1-211R was 5'-biotinylated to facilitate rapid screening of amplified products by a method involving hybridization in solution, followed by affinity capture of hybridized probe using streptavidin-coated paramagnetic beads. Methods for the analysis of nucleic acids by hybridization to specific labelled probes with capture of the hybridized sequences through affinity interactions are well known in the art of nucleic acid analysis.
Depending on the amount of serum available for testing, RNA from 30 to 50 μl of serum was used per RT/PCR reaction. Each serum was tested in duplicate, with positive controls corresponding to 10, 100, or 1000 copies of RNA transcript per reaction and with appropriate negative (buffer) controls. No negative controls were reactive, and at least 10 copies per reaction were detectable in each PCR run. Indeterminate results were defined as specific hybridizing signal being present in only one of two duplicate reactions.
Efficient, highly sensitive analysis of the products from the amplification analysis of this serum panel was performed using an instrument specifically designed for affinity-based hybrid capture using electrochemiluminescent oligonucleotide probes (QPCR System 5000™, Perkin-Elmer). Assays utilizing the QPCR 5000™ have been described (DiCesare, et al.; Wages, et al.).
The products of each reaction were assayed by hybridization to probe 470-20-1-152F (5'-end-labelled with an electrochemiluminescent ruthenium chelate), and measurement using the "QPCR 5000." Based on a cutoff of the sum of the mean and three times the standard deviation of negative controls in a given amplification run, a total of 34 possible positives were selected for confirmatory testing.
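The cutoff rule and the duplicate-reaction scoring described above can be sketched in Python as follows. This is a minimal illustration, not the instrument's software: the function names and signal values are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and a signal is treated as reactive only when it is strictly above the cutoff.

```python
from statistics import mean, stdev


def run_cutoff(negative_control_signals):
    """Cutoff for one amplification run: mean + 3 x SD of the negative controls.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(negative_control_signals) + 3 * stdev(negative_control_signals)


def call_serum(replicate_signals, cutoff):
    """Score a serum tested in duplicate against the run cutoff."""
    hits = sum(1 for signal in replicate_signals if signal > cutoff)
    if hits == len(replicate_signals):
        return "reactive"
    if hits == 0:
        return "nonreactive"
    return "indeterminate"  # signal present in only one of two reactions


# Hypothetical signal values, for illustration only.
cutoff = run_cutoff([100, 110, 90, 105, 95])
```

Recomputing the cutoff for each run, as the text specifies, keeps the call independent of run-to-run drift in background signal.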
The 34 samples were analyzed by solution hybridization and electrophoresis (Example 4C). Out of these 34 samples, 6 sera (i.e., 6/152) were shown to have specific hybridizing sequences in duplicate reactions. Of these six samples, three were strongly reactive by comparison with positive controls: one High-ALT serum from SUBB, and two N-(ABCDE) sera from Egypt. A second blood sample was obtained from the highly positive SUBB serum donor one year after the initial sample was taken. The second serum sample was confirmed to be HGV positive by the PCR methods described above. This result confirms persistent infection by HGV in a human. The serum was designated "JC." Further, the serum donor was HCV negative and antibody negative for HAV and HBV.
In addition, a third N-(ABCDE) serum from Egypt, a northern California blood donor with N-(ABCDE) hepatitis, and a N-(ABCDE) hepatitis serum, were also shown to be weakly positive by this method. Two other sera gave indeterminate results, defined as the presence of specific sequences in one of two amplification reactions. Subsequent PCR analysis of replicate serum aliquots from these positive and indeterminate sera resulted in positive results in 6 of 8 sera tested and indeterminate results in the remaining 2 sera. Thus, the specific hybridizing signal was reproducibly detected in 8 of the 152 serum samples tested.
In contrast, none of the random donor or highly-screened "supernormal" sera (total 11) was positive in either set of PCR analyses.
These results reinforce the disease association between HGV and liver disease.
Further testing of sera from High-ALT donors has yielded the following results. A total of 495 sera have been tested, in addition to the initial panel of 104 sera described above. Of these 495 specimens, 6 were identified as HGV positive using the primer pair 470-20-1-77F (SEQ ID NO:9) and 470-20-1-211R (SEQ ID NO:10). Positive scores are based on repeated reactivity in at least 2 separate reactions. Accordingly, a detection rate of approximately 1-2% has been observed (8 of 599 tested).
G. INFECTIVITY OF HGV IN PRIMATES.
Two chimpanzees (designated CH1323 and CH1356), six cynomolgus monkeys (CY143, CY8904, CY8908, CY8912, CY8917, and CH8918), and four Mystax (MY98, MY187, MY229, MY254) subjects were inoculated with PNF 2161. Pre-inoculation and post-inoculation sera were monitored for ALT and for the presence of HGV RNA sequences (as determined by PCR screening, described above).
One cynomolgus monkey (CY8904) showed one positive RNA PCR result and one indeterminate result from a total of 17 separate blood draws. In one chimpanzee, designated CH1356, sustained viremia was observed by RNA PCR. As shown in Table 6, no significant ALT elevation was observed, and circulating virus was detected only at time points considerably after inoculation. Viremia was observed at and following 118 days post-inoculation. Suggestive reactivity was also observed at the first post-inoculation time point (8 days), which may indicate residual inoculum.
Table 6
ALT and PCR Results from CH1356 Following Inoculation with PNF 2161
Days Post-Inoculation    ALT*             HGV PCR
0                        59               -
8                        65               +
15                       85               -
22                       89               -
29                       89               -
36                       86               -
39                       31               -
47                       74               -
54                       40               -
61                       57               -
84                       65               +
89                       63               +
98                       64               -
118                      84               +
125                      73               +
134                      74               +
159                      80               +
610                      not available    +
* Average ALT baseline before inoculation was 100.
The data presented above indicate that HGV infection was established in experimental primate subjects.
H. CHARACTERIZATION OF THE VIRAL GENOME.
The isolation of 470-20-1 from a cDNA library (Example 1) suggests that the viral genome detected in PNF 2161 is RNA. Further experiments to confirm the identity of the HGV viral genome as RNA include the following.
Selective degradation of either RNA or DNA (e.g., by DNase-free RNase or RNase-free DNase) in the original cloning source followed by amplification with HGV specific primers and detection of the amplification products serves to distinguish RNA from DNA templates.
An alternative method makes use of amplification reactions (nucleic acids from the original cloning source as template and HGV specific primers) that employ (i) a DNA-dependent DNA polymerase, in the absence of any RNA-dependent DNA polymerase (i.e., reverse transcriptase) in the reactions, and (ii) a DNA-dependent DNA polymerase and an RNA-dependent DNA polymerase in the reactions. In this method, if the HGV genome is DNA or has a DNA intermediate, then amplified product is detected in both types of amplification reactions. If the HGV genome is only RNA, the amplified product is detected in only the reverse transcriptase-containing reactions.
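The decision rule in the preceding paragraph can be summarized as a small Python sketch. The function name and the wording of the returned labels are hypothetical; the two outcomes not covered by the text (no product in either reaction, or product only without reverse transcriptase) are flagged rather than interpreted.

```python
def infer_genome_type(product_without_rt: bool, product_with_rt: bool) -> str:
    """Interpret paired amplification reactions run with and without reverse transcriptase."""
    if product_without_rt and product_with_rt:
        # Template amplifies with a DNA-dependent DNA polymerase alone.
        return "DNA genome or DNA intermediate"
    if product_with_rt:
        # Product appears only when an RNA-dependent DNA polymerase is present.
        return "RNA only"
    if product_without_rt:
        return "inconsistent result; repeat the reactions"
    return "no template detected"
```

For example, the result reported for PNF 2161 in this section (amplification only when PCR was preceded by reverse transcription) corresponds to `infer_genome_type(False, True)`.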
Total nucleic acid (i.e., DNA or RNA) was extracted from PNF 2161, using proteinase K and SDS followed by phenol extraction, as described in Example 4C. The purified nucleic acid was then amplified using polymerase chain reaction (PCR) where either (i) the PCR was preceded by a reverse transcription step, or (ii) the reverse transcription step was omitted. Amplification was reproducibly obtained only when the PCR reactions were preceded by reverse transcription. As a control, DNA templates were successfully amplified in separate reactions. These results demonstrate that the nature of the HGV viral genome is RNA.
The strand of the cloned, double-stranded DNA sequence that was originally present in PNF 2161 may be deduced by various means, including the following. Northern or dot blotting of the unamplified genomic RNA from an infected source serum can be performed, followed by hybridization of duplicate blots to probes corresponding to each strand of the cloned sequence. Alternatively, single-stranded cDNA probes isolated from M13 vectors (Messing), or multiple strand-specific oligonucleotide probes, are used for added sensitivity. If the source serum contains single-stranded RNA, only one probe (i.e., sequences from one strand of the 470-20-1 clones) yields a signal, under appropriate conditions of hybridization stringency. If the source serum contains double-stranded RNA, both strand-probes will yield a signal.
The polymerase chain reaction, prefaced by reverse transcription using one or the other specific primer, represents a much more sensitive alternative to Northern blotting. Genomic RNA extracted from purified virions present in PNF 2161 serum was used as the input template into each RT/PCR. Rather than cDNA synthesis with random hexamers, HGV sequence-specific primers were used. One cDNA synthesis reaction was performed with a primer complementary to one strand of the cloned sequence (e.g., 470-20-1-77F); a second cDNA synthesis reaction was performed using a primer derived from the opposite strand (e.g., 470-20-1-211R).
The resulting first strand cDNA was amplified using two HGV specific primers. Controls were included for successful amplification by PCR (e.g., DNA controls). RNA transcripts from each strand of the cloned sequence were also used, to control for the reverse transcription efficiency obtained when using the specific primers described.
Specific products were detected by agarose gel electrophoresis with ethidium bromide staining. DNA controls (i.e., double-stranded DNA controls for the PCR amplification) were successfully amplified regardless of the primer used for reverse transcription. Single-stranded RNA transcripts (i.e., controls for reverse transcription efficiency and strand specificity) were amplified only when the opposite-strand primer was used for cDNA synthesis.
The PNF-derived HGV polynucleotide gave rise to a specific amplified product only when the primer 470-20-1-211R was used for reverse transcription, thus indicating that the original HGV polynucleotide sequence present in the serum is complementary to 470-20-1-211R and is likely a single-stranded RNA.
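The strand-assignment logic of this experiment can be expressed as a short Python sketch. The function name and return strings are hypothetical. The underlying reasoning is taken from the text: a sequence-specific primer can prime cDNA synthesis only on an RNA strand complementary to it, so the primer (or primers) that yield product identify the strand(s) present.

```python
def infer_strand(rt_product_by_primer):
    """Deduce template strandedness from strand-specific RT-PCR results.

    rt_product_by_primer maps the primer used for cDNA synthesis to whether
    a specific amplified product was subsequently detected.
    """
    priming = sorted(p for p, detected in rt_product_by_primer.items() if detected)
    if not priming:
        return "no template detected"
    if len(priming) == 1:
        return f"single-stranded RNA complementary to {priming[0]}"
    # Both primers found a complementary template.
    return "both strands present (e.g., double-stranded RNA)"


# Result reported above for the PNF-derived polynucleotide.
pnf_result = infer_strand({"470-20-1-77F": False, "470-20-1-211R": True})
```

The single-stranded RNA transcript controls described above serve to confirm that each primer is capable of priming on its complementary strand, so that a negative result with one primer reflects absence of that strand and not failed reverse transcription.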
EXAMPLE 5
SUCROSE DENSITY GRADIENT SEPARATION OF PNF2161
A. BANDING OF PNF-2161 AGENT.
A continuous gradient of 10-60% sucrose ("ULTRAPURE", Gibco/BRL) in TNE (50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM EDTA) was prepared using a gradient maker from Hoefer Scientific (San Francisco, CA) . Approximately 12.5 ml of the gradient was overlaid with 0.4 ml of PNF serum which had been stored at -70°C, rapidly thawed at 37°C, then diluted in TNE.
The gradient was then centrifuged in the SW40 rotor (Beckman Instruments) at 40,000 rpm (approximately 200,000 x g at r_av) at 4°C for approximately 18 hours. Fractions of approximately 0.6 ml were collected from the bottom of the tube, and 0.5 ml was weighed directly into the ultracentrifuge tube, for calculation of density.
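The two calculations implied above, conversion of rotor speed to relative centrifugal force and calculation of density from a weighed aliquot, can be sketched in Python. The relation RCF = 1.118 x 10^-5 x r(cm) x rpm^2 is the standard conversion. The average radius of 11.27 cm used in the example is an assumption (a commonly quoted value for this rotor class) and should be checked against the rotor manual; the aliquot mass is a hypothetical value chosen for illustration.

```python
def relative_centrifugal_force(rpm, radius_cm):
    """RCF (x g) from rotor speed and radius, using the standard conversion."""
    return 1.118e-5 * radius_cm * rpm ** 2


def fraction_density(mass_g, volume_ml=0.5):
    """Density (g/ml) of a fraction from the mass of a weighed aliquot."""
    return mass_g / volume_ml


# Assumed average radius for the SW40 rotor; not stated in the text.
ASSUMED_R_AV_CM = 11.27
rcf_at_r_av = relative_centrifugal_force(40_000, ASSUMED_R_AV_CM)

# Hypothetical aliquot mass, for illustration only.
example_density = fraction_density(0.549)
```

With the assumed radius, the computed force is close to the "approximately 200,000 x g" stated in the text.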
Table 7
Measured Densities of PNF Fractions and Presence of 470-20-1
Fraction Density 470-20-1 Detected*
1 1.274 —
2 1.274 —
3 1.266 —
4 1.266 —
5 1.260 —
6 1.254 —
7 1.248 +
8 1.206 +
9 1.146 +
10 1.126 +++
11 1.098 ++++
12 1.068 +++
13 1.050 +
14 1.034 +
15 1.036 +
16 1.018 -
17 1.008 +
18 1.020 +
* "+" and "-" scores were initially based on 40-cycle PCR. In order to distinguish "+", "++", "+++", and "++++", fractions giving initial positive scores (7-18) were amplified with 30 cycles of PCR.
The putative viral particles were then pelleted by centrifugation at 40,000 rpm in the Ti70.1 rotor (approximately 110,000 x g) at 4°C for 2 hours, and RNA was extracted using the acid guanidinium phenol technique ("TRI REAGENT", Molecular Research Center, Cincinnati, OH), and alcohol-precipitated using glycogen as a carrier to improve recovery. The purified nucleic acid was dissolved in an RNase-free buffer containing 2 mM DTT and 1 U/μl recombinant RNasin. Analysis of the gradient fractions by RNA PCR (Example 4C) showed a distinct peak in the 470-20-1 specific signal, localized in fractions of density ranging from 1.126 to 1.068 g/ml (Table 7). The 470-20-1 signal was thus shown, under these conditions, to form a discrete band, consistent with the expected behavior of a viral particle in a sucrose gradient.
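The location of the peak can be recovered from the Table 7 data with a few lines of Python. The sketch below transcribes the table and treats the number of "+" symbols as an ordinal score; the threshold of three or more "+" for calling a fraction part of the peak is an assumption made for illustration, not a criterion stated in the text.

```python
# (fraction, density in g/ml, 470-20-1 score) transcribed from Table 7.
TABLE_7 = [
    (1, 1.274, "-"), (2, 1.274, "-"), (3, 1.266, "-"), (4, 1.266, "-"),
    (5, 1.260, "-"), (6, 1.254, "-"), (7, 1.248, "+"), (8, 1.206, "+"),
    (9, 1.146, "+"), (10, 1.126, "+++"), (11, 1.098, "++++"),
    (12, 1.068, "+++"), (13, 1.050, "+"), (14, 1.034, "+"),
    (15, 1.036, "+"), (16, 1.018, "-"), (17, 1.008, "+"), (18, 1.020, "+"),
]


def peak_fractions(rows, min_plus=3):
    """Return (fraction, density) for rows scoring at least min_plus '+' symbols."""
    return [(frac, dens) for frac, dens, score in rows if score.count("+") >= min_plus]


peak = peak_fractions(TABLE_7)
peak_density_range = (max(d for _, d in peak), min(d for _, d in peak))
```

Under this threshold the peak comprises fractions 10 through 12, matching the density range of 1.126 to 1.068 g/ml given in the text.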
B. RELATIVE VIRAL PARTICLE DENSITIES.
PNF 2161 has been demonstrated to be co-infected with HCV (see above). In order to compare the properties of the 470-20-1 viral particle to other known hepatitis viral particles, the serum PNF 2161 and a sample of purified Hepatitis A Virus were layered on a sucrose gradient (as described above). Fractions (0.6 ml) were collected and pelleted, and the RNA was extracted. The isolated RNA from each fraction was subjected to amplification reactions (PCR) using HAV (SEQ ID NO:5; SEQ ID NO:6), HCV (SEQ ID NO:7; SEQ ID NO:8) and 470-20-1 (SEQ ID NO:9, SEQ ID NO:10) specific primers. Product bands were identified by electrophoretic separation of the amplification reactions on agarose gels followed by ethidium bromide staining. The results of this analysis are presented in Table 8.
Table 8
Average Density    HAV    HCV    470-20-1
1.269              -      -      -
1.263              +      -      -
1.260              +      -      -
1.246              ++     -      -
1.238              ++     -      -
1.240              +      -      -
1.207              +      -      -
1.193              +      -      -
1.172              +      +      -
1.150              +      +      +
1.134              +      +      +
1.118              +      +      +
1.103              +      +      +
1.118              +      +      +
1.103              +      +      +
1.088              +      +      +
1.084              -      +      +
1.080              -      +      +
1.070              -      +      +
1.057              -      +      +
1.035              -      +      -
1.017              -      -      -
1.009              -      -      -
These results suggest that 470-20-1 particles are more similar to HCV particles than to HAV.
Further, serum PNF 2161 and HAV particles were treated with chloroform before sucrose gradient centrifugation. The results of these experiments suggest that the 470-20-1 agent may be an enveloped virus, since its properties are more similar to those of an enveloped Flaviviridae member (HCV) than to those of a non-enveloped virus (HAV).
EXAMPLE 6
GENERATION OF 470-20-1 EXTENSION CLONES
RNA was extracted directly from PNF2161 serum as described in Example 1. The RNA was passed through a "CHROMA SPIN" 100 gel filtration column (Clontech) to remove small molecular weight impurities. cDNA was synthesized using a BMB cDNA synthesis kit. After cDNA synthesis, the PNF cDNA was ligated to a 50 to 100 fold excess of KL-1/KL-2 SISPA or JML-A/JML-B linkers (SEQ ID NO:11/SEQ ID NO:12, and SEQ ID NO:17/SEQ ID NO:18, respectively) and amplified for 35 cycles using either the primer KL-1 or the primer JML-A.
The 470 extension clones were generated by anchored PCR of a 1 μl aliquot from a 10 μl ligation reaction containing EcoRI digested (dephosphorylated) lambda gt11 arms (1 μg) and EcoRI digested PNF cDNA (0.2 μg). PCR amplification (40 cycles) of the ligation reaction was carried out using the lambda gt11 reverse primer (SEQ ID NO:13) in combination with either 470-20-1-77F (SEQ ID NO:9) or 470-20-1-211R (SEQ ID NO:10). All primer concentrations for PCR were 0.2 μM. The amplification products (9 μl/100 μl) were separated on a 1.5% agarose gel, blotted to "NYTRAN" (Schleicher and Schuell, Keene, NH), and probed with a digoxigenin-labelled oligonucleotide probe specific for 470-20-1. The digoxigenin labelling was performed according to the manufacturer's recommendations using terminal transferase (BMB). Bands that hybridized were gel-purified, cloned into the "TA CLONING VECTOR pCR II" (Invitrogen), and sequenced.
Sequencing was carried out using "DYEDEOXY TERMINATOR CYCLE SEQUENCING" (a modification of the procedure of Sanger, et al.) on an Applied Biosystems model 373A DNA sequencing system according to the manufacturer's recommendations (Applied Biosystems, Foster City, CA). Sequence data is presented in the Sequence Listing. Sequences were compared with "GENBANK", EMBL database and dbEST (National Library of Medicine) sequences at both nucleic acid and amino acid levels. Search programs FASTA, BLASTP, BLASTN and BLASTX (Altschul, et al.) indicated that these sequences were novel as both nucleic acid and amino acid sequences.
Numerous clones having both 5' and 3' extensions to 470-20-1 were identified. All sequences are based on a consensus sequence from the sequencing of at least two independent isolates. This Anchor PCR approach was repeated in a similar manner to obtain further 5' and 3' extension sequences. These PCR amplification reactions were carried out using the lambda gt11 reverse primer (SEQ ID NO:13) in combination with HGV specific primers derived from sequences obtained from previous extension clones. The substrate for these reactions was unpackaged PNF 2161 2-cDNA source DNA. The individual consensus sequences were aligned, overlapping sequences were identified, and 9391 base pairs of the HGV sequence are presented as SEQ ID NO:14. This sequence represents a continuous open reading frame (SEQ ID NO:15). The relationship between the original 470-20-1 clone and the sequences obtained by extension is shown schematically in Figure 1. As seen in the figure, the DNA strand having opposite polarity to the protein coding sequence of 470-20-1 comprises a long continuous open reading frame.
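The observation that the long open reading frame lies on the strand opposite to the 470-20-1 coding sequence can be checked computationally by scanning both strands of a DNA sequence. The Python sketch below is illustrative only and is not the analysis used to prepare SEQ ID NO:14 or SEQ ID NO:15. The function names are hypothetical, and an "open frame" is defined here simply as the longest stretch without a stop codon in a given frame, irrespective of any start codon.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_frame(seq):
    """Return (frame, start, length_nt) of the longest stop-free run in frames 0-2."""
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        run_start = frame
        for pos in range(frame, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if pos - run_start > best[2]:
                    best = (frame, run_start, pos - run_start)
                run_start = pos + 3
        # Close the final run at the last complete codon of this frame.
        end = frame + ((len(seq) - frame) // 3) * 3
        if end - run_start > best[2]:
            best = (frame, run_start, end - run_start)
    return best


def longest_open_frame_both_strands(seq):
    """Scan the given strand and its reverse complement."""
    return {
        "forward": longest_open_frame(seq),
        "reverse": longest_open_frame(reverse_complement(seq)),
    }
```

Applied to a cloned insert, a much longer stop-free run on the "reverse" entry than on the "forward" entry would correspond to the situation depicted in Figure 1.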
The amino acid sequence of HGV was compared against all viral sequences in the PIR database (IntelliGenetics, Inc., Mountain View, CA) of protein sequences. The comparison was carried out using the "SSEARCH" program of the "FASTA" suite of programs, version 1.7 (Pearson, et al.). Regions of local sequence similarity were found between the HGV sequences and two viruses in the Flaviviridae family of viruses. The similarity alignments are presented in Figures 5A and 5B.
Present in these alignments are motifs for the RNA dependent RNA polymerase (RDRP) of these viruses. Conserved RDRP amino acid motifs are indicated in Figures 5A and 5B by stars and uppercase, bold letters (Koonin and Dolja). These alignments demonstrate that this portion of the HGV coding sequence corresponds to RDRP. This alignment data, combined with the data concerning the RNA genome of HGV, supports the placement of HGV as a member of the Flaviviridae family.
The global amino acid sequence identities of the HGV polyprotein (SEQ ID NO:15) with HoCV (Hog Cholera Virus) and HCV are 17.1% and 25.5%, respectively. Such levels of global sequence identity demonstrate that HGV is a separate viral entity from both HoCV and HCV. To illustrate, in two members of the Flaviviridae family of viruses, BVDV (Bovine Diarrhea Virus) and HCV, 16.2% of the amino acids can be globally aligned with HGV. Members within a genus generally show high homology when aligned globally; for example, BVDV vs. HoCV show 71.2% identity. Various members (variants) of the unnamed genus of which HCV is a member are between 65% and 100% identical when globally aligned.
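A global percent-identity figure of the kind quoted above is computed from a pairwise global alignment. The Python sketch below shows one common way to do so, given two already-aligned sequences of equal length with "-" marking gaps. The text does not state how the denominator was defined for the reported values, so the convention used here (all alignment columns except those that are gaps in both sequences) is an assumption, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two globally aligned sequences of equal length.

    Assumption: the denominator counts every alignment column except
    columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(aligned_a, aligned_b))
    matches = sum(1 for a, b in pairs if a == b and a != gap)
    columns = sum(1 for a, b in pairs if not (a == gap and b == gap))
    return 100.0 * matches / columns if columns else 0.0


# Toy alignment, for illustration only: 4 identical positions in 6 columns.
toy_identity = percent_identity("MKV-LA", "MRVGLA")
```

Because different denominator conventions give different percentages, identity values are comparable only when computed the same way.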
EXAMPLE 7
ISOLATION OF 470-20-1 FUSION PROTEIN
A. EXPRESSION AND PURIFICATION OF 470-20-1/GLUTATHIONE-S-TRANSFERASE FUSION PROTEIN
Expression of a glutathione-S-transferase (sj26) fused protein containing the 470-20-1 peptide was achieved as follows. A 237 base pair insert (containing 17 nucleotides of SISPA linkers on both sides) corresponding to the original lambda gt11 470-20-1 clone was isolated from the lambda gt11 470-20-1 clone by polymerase chain reaction using primers gt11 F (SEQ ID NO:25) and gt11 R (SEQ ID NO:13), followed by EcoRI digestion. The insert was cloned into a modified pGEX vector, pGEX MOV. pGEX MOV encodes sj26 protein fused with six histidines at the carboxy terminal end (sj26his). The 470-20-1 polypeptide coding sequences were introduced into the vector at a cloning site located downstream of the sj26his coding sequence in the vector. Thus, the 470-20-1 polypeptide is expressed as an sj26his/470-20-1 fusion protein. The sj26 protein and six histidine region of the fusion protein allow the affinity purification of the fusion protein by dual chromatographic methods employing glutathione-conjugated beads (Smith, D.B., et al.) and immobilized metal ion beads (Hochula; Porath).
E. coli strain W3110 (ATCC catalogue number 27352) was transformed with pGEX MOV and pGEX MOV containing the 470-20-1 insert. Sj26his protein and 470-20-1 fusion protein were induced by the addition of 2 mM isopropyl-β-thiogalactopyranoside (IPTG). The fusion proteins were purified either by glutathione-affinity chromatography or by immobilized metal ion chromatography (IMAC) according to the published methods (Smith, D.B., et al.; Porath) in conjunction with conventional ion-exchange chromatography.
The purified 470-20-1 fusion protein was immunoreactive with PNF 2161. However, purified sj26his protein was not immunoreactive with PNF 2161, indicating the presence of specific immunoreaction between the 470-20-1 peptide and PNF 2161.
B. ISOLATION OF 470-20-1/β-GALACTOSIDASE FUSION PROTEIN
KM392 lysogens infected either with lambda phage gt11 or with gt11/470-20-1 are incubated at 32°C until the culture reaches an O.D. of 0.4. The culture is then incubated in a 43°C water bath for 15 minutes to induce gt11 peptide synthesis, and further incubated at 37°C for 1 hour. Bacterial cells are pelleted and lysed in lysis buffer (10 mM Tris, pH 7.4, 2% "TRITON X-100" and 1% aprotinin). Bacterial lysates are clarified by centrifugation (10K, for 10 minutes, Sorvall JA20 rotor) and the clarified lysates are incubated with Sepharose 4B beads conjugated with anti-β-galactosidase (Promega).
Binding and elution of β-galactosidase fusion proteins are performed according to the manufacturer's instructions. Typically, binding of the proteins and washing of the column are done with lysis buffer. Bound proteins are eluted with 0.1 M carbonate/bicarbonate buffer, pH 10. The purified 470-20-1/β-galactosidase protein is immunoreactive with both PNF2161 and anti-β-galactosidase antibody. However, β-galactosidase, expressed by gt11 lysogen and purified, is not immunoreactive with PNF2161 but is immunoreactive with anti-β-galactosidase antibody.
EXAMPLE 8
PURIFICATION OF THE 470-20-1 FUSION PROTEIN AND PREPARATION OF ANTI-470-20-1 ANTIBODY
A. GLUTATHIONE AFFINITY PURIFICATION
Materials included 50 ml glutathione affinity matrix, reduced form (Sigma), XK 26/30 Pharmacia column, 2.5 x 10 cm Bio-Rad "ECONO-COLUMN" (Richmond, CA), Gilson (Middleton, WI) HPLC, DTT (Sigma), glutathione, reduced form (Sigma), urea, and sodium phosphate dibasic.
The following solutions were used in purification of the fusion protein:
Buffer A: phosphate buffered saline, pH 7.4;
Buffer B: 50 mM Tris pH 8.5, 8 mM glutathione (reduced form); and
Strip buffer: 8 M urea, 100 mM Tris pH 8.8, 10 mM glutathione, 1.5 M NaCl.
E. coli carrying the plasmid pGEX MOV containing the 470-20-1 insert were grown in a fermentor (20 liters). The bacteria were collected and lysed in phosphate buffered saline (PBS) containing 2 mM phenylmethyl sulfonyl fluoride (PMSF) using a micro-fluidizer. Unless otherwise noted, all of the following procedures were carried out at 4°C.
The crude lysate was prepared for loading by placing lysed bacteria into "OAKRIDGE" tubes and spinning at 20K rpm (40,000 x g) in a Beckman model JA-20 rotor. The supernatant was filtered through a 0.4 μm filter and then through a 0.2 μm filter.
The 2.5 x 10 cm "ECONO-COLUMN" was packed with the glutathione affinity matrix that was swelled in PBS for two hours at room temperature. The column was brought into equilibrium by washing with 4 bed volumes of PBS.
The column was loaded with the crude lysate at a flow rate of 8 ml per minute. Subsequently, the column was washed with 5 column volumes of PBS at the same flow rate. The column was eluted by setting the flow rate to 0.75-1 ml/min. and introducing Buffer B. Buffer B was pumped through the column for 5 column volumes and two-minute fractions were collected. An exemplary elution profile is shown in Figure 2. The content and purity of the proteins present in the fractions were assessed by standard SDS-PAGE (Figure 3). The 470-20-1/sj26his fusion protein was identified based on its predicted molecular weight and its immunoreactivity to PNF 2161 serum. For further manipulations, the protein can be isolated from fractions containing the fusion protein or from the gel by extraction of gel regions containing the fusion protein.
B. PURIFICATION OF CLONE 470-20-1 FUSION PROTEIN BY ANION EXCHANGE.
Solutions include the following:
Buffer A (10 mM sodium phosphate pH 8.0, 4 M urea, 10 mM DTT);
Buffer B (10 mM sodium phosphate pH 8.0, 4 M urea, 10 mM DTT, 2.0 M NaCl); and
Strip Buffer (8 M urea, 100 mM Tris pH 8.8, 10 mM glutathione, 1.5 M NaCl).
Crude lysate (or other protein source, such as pooled fractions from above) was loaded onto a "HIGH-Q-50" (Biorad, Richmond, CA) column at a flow rate of 4.0 ml/min. The column was then washed with Buffer A for 5 column volumes at a flow rate of 4.0 ml/min.
After these washes, a gradient was started and ran from Buffer A to Buffer B in 15 column volumes. The gradient then stepped to 100% Buffer B for one column volume. An exemplary gradient is shown in Figure 4A. Fractions were collected every 10 minutes. Purity of the 470-20-1/sj26his fusion protein was assessed by standard SDS-PAGE (Figures 4B and 4C) and relevant fractions were pooled (approximately fractions 34 through 37, Figure 4C).
C. PREPARATION OF ANTI-470-20-1 ANTIBODY
The purified 470-20-1/sj26his fusion protein is injected subcutaneously in Freund's adjuvant in a rabbit. Approximately 1 mg of fusion protein is injected at days 0 and 21, and rabbit serum is typically collected at 6 and 8 weeks.
A second rabbit is similarly immunized with purified sj26his protein.
Minilysates are prepared from bacteria expressing the 470-20-1/sj26his fusion protein, sj26his protein, and β-galactosidase/470-20-1 fusion protein. The lysates are fractionated on a gel and transferred to a membrane. Separate Western blots are performed using the sera from the two rabbits.
Serum from the animal immunized with 470-20-1 fusion protein is immunoreactive with all sj26his fusion protein in minilysates of IPTG induced E. coli W3110 that are transformed either with pGEX MOV or with pGEX MOV containing 470-20-1 insert. This serum is also immunoreactive with the fusion protein in the minilysate from the 470-20-1 lambda gtll construct.
The second rabbit serum is immunoreactive with both sj26his and 470-20-1/sj26his fusion proteins in the minilysates. This serum is not expected to be immunoreactive with the 470-20-1/β-galactosidase fusion protein in the minilysate from the 470-20-1 lambda gt11 construct. None of the sera are expected to be immunoreactive with β-galactosidase.
Anti-470-20-1 antibody present in the sera from the animal immunized with the fusion protein is purified by affinity chromatography (using the 470-20-1 ligand) .
Alternatively, the fusion protein can be cleaved to provide the 470-20-1 antigen free of the sj26 protein sequences. The 470-20-1 antigen alone is then used to generate antibodies as described above.
EXAMPLE 9
SEROLOGY
A. WESTERN BLOT ANALYSIS OF SERA PANELS
The 470-20-1 fusion antigen (described above) was used to screen panels of sera. Many of the panels were of human sera derived both from individuals suffering from hepatitis and from uninfected controls.
Affinity purified 470-20-1 fusion antigen (Example 8) was loaded onto a 12% SDS-PAGE gel at 2 μg/cm. The gel was run for two hours at 200V. The antigen was transferred from the gel to a nitrocellulose filter. The membrane was then blocked for 2 hours using a solution of 1% bovine serum albumin, 3% normal goat serum, 0.25% gelatin, 100 mM NaPO4, 100 mM NaCl, and 1% nonfat dry milk. The membrane was then dried and cut into 1-2 mm strips; each strip contained the 470-20-1 fusion antigen. Each strip was typically rehydrated with TBS (150 mM NaCl; 20 mM Tris HCl, pH 7.5) and incubated in panel sera (1:100) overnight with rocking at room temperature.
The strips were washed twice for five minutes each time in TBS plus "TWEEN 20" (0.05%), and then washed twice for five minutes each time in TBS. The strips were then incubated in secondary antibody (Promega anti-human IgG- Alkaline Phosphatase conjugate, 1:7500), for 1 hour with rocking at room temperature. The strips were then washed twice x 5 minutes in TBS + "TWEEN 20", then twice x 5 minutes in TBS.
Bound antibody was detected by incubating the strips in a substrate solution containing BCIP (Example 2) and NBT (Example 2) in pH 9.5 buffer (100 mM Tris, 100 mM NaCl, 5 mM MgCl2) . Color development was allowed to proceed for approximately 15 minutes at which point color development was halted by 3 washes in distilled H20.
Test sera were derived from the following groups of individuals: (i) blood donors, negative for HBV Ab, surface Ag, negative for HCV, HIV, HTLV-1 Abs; (ii) HBV, sera from individuals who are infected with Hepatitis B virus; (iii) HCV, sera from individuals infected with Hepatitis C virus by virtue of being reactive in a second- generation HCV ELISA assay; and (iv) HXV, individuals serologically negative for HAV, HBV, HCV, or HEV.
The results of these screens are presented in Table 9.
Table 9
470-20-1 Sera Panelling Result Summary

Sample        No. Human Sera Tested   +             IND*          -
blood donor   30                      1 (3.3%)      2 (6.7%)      27 (90.0%)
HBV           40                      7 (17.5%)     4 (10.0%)     29 (72.5%)
HCV           38                      11 (28.95%)   11 (28.95%)   16 (42.1%)
HXV           122                     20 (16.4%)    12 (9.8%)     90 (73.8%)

* Indeterminate, weak reactivity
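The percentages in Table 9 follow directly from the counts. The short sketch below recomputes them as a consistency check; the dictionary layout and the function name `panel_percentages` are illustrative choices, and the assignment of the third count in each row to a "negative" column is an assumption based on the row totals.

```python
# Counts per panel as (positive, indeterminate, negative), read from Table 9.
# Treating the third count as "negative" is an assumption from the row totals.
panels = {
    "blood donor": (1, 2, 27),
    "HBV": (7, 4, 29),
    "HCV": (11, 11, 16),
    "HXV": (20, 12, 90),
}


def panel_percentages(counts):
    """Return each count as a percentage of the panel total, rounded to 2 places."""
    total = sum(counts)
    return tuple(round(100.0 * c / total, 2) for c in counts)


for name, counts in panels.items():
    print(name, sum(counts), panel_percentages(counts))
```

Each row total reproduces the number of sera tested (30, 40, 38 and 122), and the recomputed percentages agree with the table to the precision shown.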
These results suggest the presence of the 470-20-1 antigen in a number of different sera samples. The antigen is not immunoreactive with normal human sera.
B. GENERAL ELISA PROTOCOL FOR DETECTION OF ANTIBODIES
Polystyrene 96-well plates ("IMMULON II" (PGC)) are coated with 5 μg/ml (100 μL per well) antigen in 0.1 M sodium bicarbonate buffer, pH 9.5. Plates are sealed with "PARAFILM" and stored at 4°C overnight.
Plates are aspirated and blocked with 300 μL 10% normal goat serum and incubated at 37°C for 1 hr.
Plates are washed 5 times with PBS containing 0.5% "TWEEN-20". Antisera are diluted in 1 x PBS, pH 7.2. The desired dilution(s) of antisera (0.1 mL) are added to each well and the plate is incubated 1 hour at 37°C. The plates are then washed 5 times with PBS containing 0.5% "TWEEN-20".
Horseradish peroxidase (HRP) conjugated goat anti-human antiserum (Cappel) is diluted 1/5,000 in PBS. 0.1 mL of this solution is added to each well. The plate is incubated 30 min at 37°C, then washed 5 times with PBS.
Sigma ABTS (substrate) is prepared just prior to addition to the plate. The reagent consists of 50 ml 0.05 M citric acid, pH 4.2, 0.078 ml 30% hydrogen peroxide solution and 15 mg ABTS. 0.1 ml of the substrate is added to each well, then incubated for 30 min at room temperature. The reaction is stopped with the addition of 0.050 mL 5% SDS (w/v). The relative absorbance is determined at 410 nm.
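For readers scaling the substrate recipe, the working concentrations can be worked out from the stated amounts. The following is a minimal sketch that assumes the liquid volumes are simply additive and that the dissolved ABTS does not change the volume; the variable and function names are illustrative.

```python
def final_concentration(stock_conc, stock_volume_ml, total_volume_ml):
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_ml / total_volume_ml


citrate_ml = 50.0      # 0.05 M citric acid, pH 4.2
peroxide_ml = 0.078    # 30% hydrogen peroxide stock
abts_mg = 15.0
well_volume_ml = 0.1   # substrate added per well

# Assumption: volumes are additive and the ABTS solid adds no volume.
total_ml = citrate_ml + peroxide_ml

abts_mg_per_ml = abts_mg / total_ml
peroxide_percent = final_concentration(30.0, peroxide_ml, total_ml)
wells_covered = int(total_ml / well_volume_ml)

print(f"ABTS: {abts_mg_per_ml:.3f} mg/ml")
print(f"H2O2: {peroxide_percent:.4f} %")
print(f"Wells per batch: {wells_covered}")
```

Under these assumptions one batch contains roughly 0.3 mg/ml ABTS and about 0.047% hydrogen peroxide, and is enough for about five 96-well plates.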
EXAMPLE 10
Preliminary Mapping of HGV Epitopes
An approximately 7.3 kb coding sequence of HGV was subcloned as 77 distinct but overlapping cDNA fragments. The length of most cDNA fragments ranged from about 200 bp to about 500 bp. The cDNA fragments were cloned separately into the expression vector pGEX-HisB. This vector is similar to pGEX-MOV, described above. pGEX-HisB is a modification of pGEX-2T (Genbank accession number A01438; a commercially available expression vector). The vector pGEX-2T has been modified by insertion of an NcoI site directly downstream from the thrombin cleavage site. This site is followed by a BamHI site, which is followed by a poly-histidine (six histidines) encoding sequence, followed by the EcoRI site found in pGEX-2T. Coding sequences of interest are typically inserted between the NcoI site and the BamHI site. In Figure 6 (SEQ ID NO:96), the inserted sequence encodes the GE3-2 antigen. The rest of the vector sequence is identical to pGEX-2T. Expression of fusion protein is carried out essentially as described above with other pGEX-derived expression vectors.
Cloning of all 24 fragments was carried out essentially as described below, where specific primers were selected for each of the 24 coding regions. Typically, the 5' primer contained an NcoI restriction site and the 3' primer contained a BamHI restriction site. The NcoI primers in the amplified fragments allowed in-frame fusion of amplified coding sequences to the GST-SJ26 coding sequence in the expression vector pGEX-HisB. Expressed recombinant proteins were analyzed for specific immunoreactivity against putative HGV-infected human sera by Western blot.
Two fragments designated GE3 and GE9 encoded antigens that gave a clear immunogenic response when reacted with putative HGV-infected human sera.
A. CLONING OF GE3, GE9, GE15, AND GE17.
The coding sequence inserts for clones GE3 and GE9 were generated by polymerase chain reaction from SISPA-amplified double-stranded cDNA or RNA obtained from PNF 2161, using PCR primers specific for each fragment.
In the GE3-5' primer (GE-3F, SEQ ID NO:30) a silent point mutation was introduced to modify a natural NcoI restriction site. The GE3-3' primer was GE-3R (SEQ ID NO:31). The GE9-5' primer was GE-9F (SEQ ID NO:32) and the GE9-3' primer was GE-9R (SEQ ID NO:33). The GE15-5' primer was GE-15F (SEQ ID NO:92) and the GE15-3' primer was GE-15R (SEQ ID NO:93). The GE17-5' primer was GE-17F (SEQ ID NO:94) and the GE17-3' primer was GE-17R (SEQ ID NO:95). Using these primers, PCR amplification products were generated. The amplification products were gel purified, digested with NcoI and BamHI, and gel purified again. The purified NcoI/BamHI GE3, GE9, GE15 and GE17 fragments were independently ligated into dephosphorylated, NcoI/BamHI cut pGEX-HisB vectors.
Each ligation mixture was transformed into E. coli W3110 strain and ampicillin resistant colonies were selected. The ampicillin resistant colonies were resuspended in a Tris/EDTA buffer and analyzed by PCR, using primers homologous to pGEX vector sequences flanking the inserted molecules, to confirm the presence of insert sequences. Four candidate clones were designated GE3-2 (SEQ ID NO:34), GE9-2 (SEQ ID NO:36), GE15-1 (SEQ ID NO:88) and GE17-2 (SEQ ID NO:90), respectively.
B. EXPRESSION OF THE GE3-2, GE9-2, GE15-1, AND GE17-2 FUSION PROTEINS.
Colonies of ampicillin resistant bacteria carrying GE3-2, GE9-2, GE15-1, and GE17-2-containing vectors were individually inoculated into LB medium containing ampicillin. The cultures were grown to OD of 0.8 to 0.9 at which time IPTG (isopropylthio-beta-galactoside; Gibco-BRL) was added to a final concentration of 0.3 to 0.5 mM, for the induction of protein expression. Incubation in the presence of IPTG was continued for 3 to 4 hours.
Bacterial cells were harvested by centrifugation and resuspended in SDS sample buffer (0.0625 M Tris, pH 6.8, 10% glycerol, 5% mercaptoethanol, 2.3% SDS). The resuspended pellet was boiled for 5 min. and then cleared of cellular debris by centrifugation. The supernatants obtained from IPTG-induced cultures of GE3-2, GE9-2, GE15-1, and GE17-2 were analyzed by SDS-polyacrylamide gel electrophoresis (PAGE). The proteins from these gels were then transferred to nitrocellulose filters (i.e., by Western blotting). The filters were then exposed to PNF 2161, JC and super normal serum. JC is the HGV-positive serum identified in Example 4F that was rejected by the blood bank for having high ALT. A second sample, taken one year after the initial serum sample, was also positive for HGV by PCR analysis. Immunoreactivity of JC serum with bands at the appropriate molecular weight for the fusion proteins demonstrated the successful expression of the fusion protein by the bacterial cells.
The fusion proteins were purified from bacterial cell lysates essentially as in Example 7 using dual chromatographic methods employing glutathione-conjugated beads (Smith, D.B., et al.) and immobilized metal ion beads (Hochula; Porath).
C. WESTERN BLOT ANALYSIS OF PURIFIED HGV PROTEINS.
Various amounts of the purified HGV proteins (e.g., GE3-2 and GE9-2 proteins) were loaded on 12% acrylamide gels. Following PAGE, proteins were transferred from the gels to nitrocellulose membranes, using standard procedures. Individual membranes were incubated with one of a number of human or mouse sera. Excess sera were removed by washing the membranes.
These membranes were incubated with alkaline phosphatase-conjugated goat anti-human antibody (Promega) or alkaline phosphatase-conjugated goat anti-mouse antibodies (Sigma), depending on the serum being used for screening. The membranes were washed again, to remove excess goat anti-human IgG antibody, and exposed to NBT/BCIP. Photographs of exemplary stained membranes having the GE3 fusion protein are shown in Figures 7A to 7D.
The Figures show the results of Western blot analysis of the purified GE3-2 protein using the following sera: N-(ABCDE) human (JC) serum (Figure 7A), N-(ABDE) human (PNF 2161) serum (Figure 7B), a super normal (SN2) serum (Figure 7C), and mouse monoclonal antibody (RM001) directed against GST-SJ26 protein (Figure 7D). In each of the figures, lane 1 contains molecular weight standards, and lanes 2-5 contain, respectively, the following amounts of the GE3-2 fusion protein: 4 μg, 2 μg, 1 μg, and 0.5 μg. Numbers represent loading amounts in micrograms per 0.6 centimeter of gel (well size). Dilutions of the human JC, PNF 2161 and Super Normal 2 sera were 1:100. The anti-sj26 dilution was 1:1000. The band seen at about 97K in the JC blot is reactivity against a minor contaminant in the GE3-2 fusion protein preparation. Protein marker sizes are 142.9, 97.2, 50, 35.1, 29.7 and 21.9 KD.
As shown in Figures 7A to 7D, GE3-2 showed specific immunoreactivity with JC serum. GE3-2 reacted weakly with PNF 2161 serum and would be scored as indeterminate or negative.
In parallel experiments, GE9-2 showed weak but specific immunoreactivity toward PNF 2161 serum. Further, GE15-1 and GE-17 showed weak but specific immunoreactivity toward PNF 2161 and T55806. T55806 is human serum that contains HGV; this serum was identified as HGV positive by PCR, as described in Example 4.
EXAMPLE 11
Construction of an Exemplary Epitope Library
Polymerase Chain Reactions were employed to amplify 3 overlapping DNA fragments from PNF 2161 SISPA-amplified cDNA. The PNF 2161 SISPA-amplified cDNA was prepared using the JML-A/B linkers (SEQ ID NO:38 and SEQ ID NO:39). One microliter of this material was re-amplified for 30 cycles (1 minute at 94°C, 1.5 minutes at 55°C and 2 minutes at 72°C) using 1 μM of the JML-A primers. The total reaction volume was 100 μl. The products from 3 of these amplifications were combined and separated from excess PCR primers by a single pass through a "WIZARD PCR COLUMN" (Promega) following the manufacturer's instructions. The "WIZARD PCR COLUMN" is a silica based resin that binds DNA in high ionic strength buffers and will release DNA in low ionic strength buffers. The amplified DNA was eluted from the column with 100 μl distilled H2O.
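The re-amplification program described above (30 cycles of 1 minute at 94°C, 1.5 minutes at 55°C and 2 minutes at 72°C) can be written down as data, which makes it easy to compare with the other cycling programs used in these examples. This is only an illustrative sketch: the `Step` class and `program_minutes` function are hypothetical names, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float    # hold temperature in degrees Celsius
    minutes: float   # hold time in minutes


def program_minutes(steps, cycles):
    """Total hold time of a cycling program, ignoring ramp times."""
    return cycles * sum(step.minutes for step in steps)


# Re-amplification of the SISPA-amplified cDNA with the JML-A primers.
reamplification = [Step(94, 1.0), Step(55, 1.5), Step(72, 2.0)]

print(program_minutes(reamplification, cycles=30))
```

The hold times alone add up to 135 minutes for this program; actual run time on an instrument would be longer because of temperature ramping.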
The eluted DNA was fractionated on a 1.5% agarose TBE gel (Maniatis, et al.) and visualized with UV light following ethidium bromide staining. A strong smear of DNA fragments between 150 and 1000 bp was observed. One microliter of the re-amplified cDNA was used as template in PCR reactions with each primer pair presented in Table 10.
Table 10
[Table 10: primer pairs and expected fragment sizes; reproduced as an image in the original.]
The primers were designed to result in the amplification of HGV specific DNA fragments of the sizes indicated in Table 10. In the amplification reactions, the primer pairs were used at a concentration of 1 μM. Amplifications were for 30 cycles of 1 minute at 94°C, 1.5 minutes at 54°C and 3 minutes at 72°C in a total reaction volume of 100 μl. Each of the three different primer pair PCR reactions resulted in the specific amplification of products having the expected sizes. For each primer pair reaction, amplification products from 3 independent PCR reactions were combined and purified using a "WIZARD PCR COLUMN" as described above. The purified products were eluted in 50 μl dH2O. Samples from each purified product (14 μl, containing approximately 1 - 2 μg of each primer-pair amplified DNA fragment) were combined. The combined sample of all three different amplified fragments was added to 5 μl of 10X DNAse Digestion buffer (500 mM Tris pH 7.5, 100 mM MnCl2) and 2 μl of dH2O. From this digestion mixture, a 10 μl sample was removed and placed in a tube containing 5 μl of Stop solution (100 mM EDTA, pH 8.0). This sample was the 0 "minutes of digestion" time point. The rest of the digestion reaction was placed at 25°C. To the digestion mixture 1 μl of 1/25 diluted RNase-free DNAse I (Stratagene) was added. At various time points 10 μl aliquots were withdrawn and mixed with 5 μl of Stop solution. The DNAse I digested DNA products were analyzed on a 1.5% agarose TBE gel.
The results of several digestion experiments showed that 40 minutes of digestion provided a good distribution of DNA fragments in the size range of 100 - 300 bp. A DNAse I digestion was then repeated with the entire digestion being left for 40 minutes at room temperature. The digestion was stopped by the addition of 18 μl of Stop Buffer and the digested DNA products were purified using a "WIZARD PCR COLUMN." The "WIZARD PCR COLUMN" was eluted with 50 μl of dH2O and the eluted DNA added to the following reaction mixture: 7 μl of Restriction Enzyme Buffer C (Promega, 10 mM MgCl2, 1 mM DTT, 50 mM NaCl, 10 mM Tris, pH 7.9, 1X concentration); 11 μl of 1.25 mM dNTPs; and 2 μl T4 DNA Polymerase (Boehringer-Mannheim). This reaction mixture was held at 37°C for 30 minutes, at which point 70 μl of pH 8.0 phenol/CHCl3 was added and mixed. The phenol/CHCl3 was removed and extracted once to yield a total aqueous volume of 150 μl containing the DNA sample. The DNA was ethanol precipitated using 2 volumes of absolute ethanol and 0.5 volume of 7.5 M NH4-acetate. The DNA was pelleted by centrifugation for 15 minutes at 14,000 rpm in an "EPPENDORF MICROFUGE", dried for 5 minutes at 42°C and resuspended in 25 μl of dH2O.
The DNA was ligated to 5' phosphorylated SISPA linkers KL1 (SEQ ID NO:46) and KL2 (SEQ ID NO:47). Several different concentrations of SISPA linkers and DNA were tested. The highest level of ligation (assessed as described below) occurred under the following ligation reaction conditions: 6 μl of DNA, 2 μl of 5.0 x 10^-12 M KL1/KL2 linkers, 1 μl of 10X ligase buffer (New England Biolabs), and 1 μl of 400 Units/μl T4 DNA Ligase (New England Biolabs) in a total reaction volume of 10 μl. Ligations were carried out overnight at 16°C.
Two reactions were run in parallel as follows. A 2 μl sample of the ligated material was amplified using the KL1 SISPA primer in a total reaction volume of 100 μl (25 cycles of 1 minute at 94°C, 1.5 minutes at 55°C and 2 minutes at 72°C). The degree of ligation was assessed by separating 1/5 of the PCR amplification products by electrophoresis using a 1.5% agarose TBE gel. The gel was stained with ethidium bromide and the bands visualized with UV light.
The amplification products from the duplicate reactions were purified using "WIZARD PCR COLUMNS" and the purified DNA eluted in 50 μl of dH2O. A twenty-five microliter aliquot of the PCR KL1/KL2 amplified DNA was digested with 36 Units of EcoRI (Promega) in a total volume of 30 μl. The reaction was carried out overnight at 37°C. The digested DNA was purified using a "SEPHADEX G25" spin column.
The EcoRI digested DNA was ligated in overnight reactions to λgt11 arms that were pre-digested with EcoRI and treated with calf intestinal alkaline phosphatase (Stratagene, La Jolla, CA). The ligation mixture was packaged using a "GIGAPACK GOLD PACKAGING EXTRACT" (Stratagene) following the manufacturer's instructions. Titration of the amount of recombinant phage obtained was performed by plating a 1/10 dilution of the packaged phage on a lawn of KM-392, where the plate contained 20 μl of a 100 mg/ml solution of X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside; Sigma) and 20 μl of a 0.1 M solution of IPTG (isopropyl-1-thio-β-D-galactoside; Sigma). A titer was obtained of 1.2 x 10^6 phage/ml containing over 75% recombinant phage.
The percentage of recombinant plaques was confirmed by PCR analysis of 8 randomly picked plaques using primers 11F (SEQ ID NO:25) and 11R (SEQ ID NO:13). This packaged library, containing the DNA fragments derived from the digestion of the F1/R1, F2/R3, and F4/R4 amplified DNAs, was designated library Y5.
EXAMPLE 12
Immunoscreening of the Y5 Library
A. ISOLATION OF IMMUNOREACTIVE CLONES.
Two HGV positive sera, PNF 2161 and JC, were used for immunoscreening of the Y5 library, essentially as described in Example 2. The Y5 phage library was plated onto 20 plates at approximately 15,000 phage per plate. The plates were incubated for approximately 5 hours and were overlaid with nitrocellulose filters (Schleicher and Schuell) overnight. The filters were blocked by incubation in AIB (1% gelatin plus 0.02% Na azide) for approximately 6 hours. The blocked filters were washed once with TBS.
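The scale of this screen can be estimated from the plating density together with the library titer reported in Example 11. The sketch below is a back-of-the-envelope calculation only; it assumes the packaged library was plated at the reported titer (read here as 1.2 x 10^6 phage/ml) without further dilution, and uses the reported lower bound of 75% recombinant phage. All names are illustrative.

```python
def plating_volume_ul(phage_per_plate, titer_per_ml):
    """Volume of phage stock (in microliters) needed for one plate."""
    return 1000.0 * phage_per_plate / titer_per_ml


titer_per_ml = 1.2e6          # titer reported in Example 11 (assumed undiluted)
phage_per_plate = 15_000
plates = 20
recombinant_fraction = 0.75   # reported as "over 75%", so this is a lower bound

volume_ul = plating_volume_ul(phage_per_plate, titer_per_ml)
total_phage = phage_per_plate * plates
min_recombinants = int(total_phage * recombinant_fraction)

print(f"{volume_ul:.1f} ul of stock per plate")
print(f"{total_phage} phage screened, at least {min_recombinants} recombinant")
```

Under these assumptions each plate needs about 12.5 μl of stock, and the 20 plates cover roughly 300,000 phage, of which at least 225,000 would be expected to carry inserts.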
Ten Y5 library filters were incubated overnight, with agitation, with PNF 2161 serum and ten filters with JC serum. In order to reduce non-specific antibody binding, both sera had been pre-treated by incubation overnight with nitrocellulose filters to which wild type λgt11 were adsorbed.
The filters were removed from the sera, washed 3 times with TBS and incubated with goat anti-human alkaline phosphatase-conjugated secondary antibody (Promega; diluted 1/7500 in AIB) for one hour. The filters were washed 4 times with TBS. Bound secondary antibody was detected by incubation of the filters in AP buffer (100 mM NaCl, 5 mM MgCl2, 100 mM Tris pH 9.5) containing NBT and BCIP. Plaques that tested positive in the initial screen were picked and eluted in 500 μl of PDB (100 mM NaCl, 8.1 mM MgSO4, 50 mM Tris pH 7.5, 0.02% gelatin). The immunoreactive phage were purified by replating the eluted phage at a total density of 100 - 500 plaques per 100 mm plate. The plates were re-immunoscreened with the appropriate HGV-positive sera, essentially as described above. After color development several isolated, positive plaques were picked and put into 500 μl of PDB. After 1 hour of incubation, 2 μl of the re-purified phage PDB solution was used as template in a PCR reaction containing the 11F (SEQ ID NO:25) and 11R (SEQ ID NO:13) PCR primers. These primers are homologous to sequences located 70 nucleotides (nt) 5' and 90 nt 3' of the EcoRI site of λgt11. The PCR reactions were amplified through 30 cycles of 94°C for 1 minute, 55°C for 1.5 minutes and 72°C for 2 minutes.
The PCR amplification reactions were size-fractionated on agarose gels. PCR amplification of purified plaques resulted in a single band for each single-plaque amplification reaction, where the amplified fragment contained the DNA insert plus approximately 140 bp of 5' and 3' phage flanking sequences. The amplified products, from PCR reactions resulting in single bands, were purified using a "S-300 HR" spin column (Pharmacia), following the manufacturer's instructions. The DNA was quantitated and sequenced employing an Applied Biosystems automated sequencer 373A and appropriate protocols.
The above-described screening of the Y5 library with JC sera resulted in the purification and DNA sequencing of the positive-strand clones presented in Table 11. Positive-strand clones correspond to the 5' to 3' translation of the HGV sequence presented in SEQ ID NO:14, that is, the polyprotein reading frame.
Table 11
Clone    Screening   Insert Size     Insert Size      Nucleic Acid   Encoded Protein
         Sera        (base pairs)    (amino acids)    SEQ ID NO:     SEQ ID NO:
Y5-10    JC          210             62               48             49
Y5-12    JC          333             94               50             51
Y5-26    JC          303             93               52             53
Y5-5     JC          153             36               54             55
Y5-3     JC          162             44               56             57
Y5-27    JC          288             86               58             59
Y5-25    JC          165             36               60             61
Y5-20    JC          165             19 (1)           62             63
Y5-16    JC          234             63               64             65

(1) The clone contained a double insert; nt 69 to 126 of the clone insert correspond to HGV sequences.
These clones delineated 2 immunogenic regions within the putative NS5 protein of HGV. These two regions are specifically delineated by Y5-10 and Y5-5.
Further, screening of the Y5 library with PNF 2161 sera resulted in the purification and DNA sequencing of the following negative-strand clones presented in Table 12. Negative-strand clones correspond to the 5' to 3' translation of the sequence complementary to the HGV sequence presented in SEQ ID NO:14.
Table 12

Clone    Screening   Insert Size     Insert Size      Nucleic Acid   Encoded Protein
         Sera        (base pairs)    (amino acids)    SEQ ID NO:     SEQ ID NO:
Y5-50    PNF 2161    349             104              66             67
Y5-52    PNF 2161    119             20 (1)           68             69
Y5-53    PNF 2161    250             33 (2)           70             71
Y5-55    PNF 2161    143             19 (3)           72             73
Y5-56    PNF 2161    366             110              74             75
Y5-57    PNF 2161    231             65               76             77
Y5-60    PNF 2161    151             38               78             79
Y5-63    PNF 2161    125 (4)         25               80             81

(1) The clone contained a double insert; nt 46 to 105 of the clone insert correspond to HGV sequences.
(2) The clone contained a double insert; nt 19 to 118 of the clone insert correspond to HGV sequences.
(3) The clone contained a double insert; nt 70 to 126 of the clone insert correspond to HGV sequences.
(4) The insert contains an extra, non-HGV sequence between nucleotides 19 and 35.
All of these sequences contain portions of the original HGV clone 470-20-1 isolated using the PNF 2161 serum.
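The distinction drawn above between positive-strand and negative-strand clones comes down to which strand of the cDNA is read 5' to 3'. The sketch below shows one way to translate a DNA sequence in all three forward frames and all three frames of its reverse complement, using the standard genetic code. It is an illustration only: the function names are hypothetical, and the demonstration input is the 18-nt SISPA linker strand of SEQ ID NO:1, chosen because it is short, not because it is an HGV sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, frame=0):
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Translations of the given strand (+1..+3) and its complement (-1..-3)."""
    rc = reverse_complement(dna)
    frames = {f"+{f + 1}": translate(dna, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(rc, f) for f in range(3)})
    return frames


linker_top = "GGAATTCGCGGCCGCTCG"  # SEQ ID NO:1, used only as a short test input
for name, peptide in six_frames(linker_top).items():
    print(name, peptide)
```

In this scheme a positive-strand clone corresponds to one of the "+" frames of the sequence as listed, and a negative-strand clone to one of the "-" frames; stop codons appear as `*` and codons containing ambiguous bases as `X`.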
B. FURTHER CHARACTERIZATION OF IMMUNOREACTIVE CLONES.
Clones Y5-10, Y5-16, and Y5-5 were selected for subcloning into the expression vector pGEX-HisB. PCR primers were designed which removed the extraneous linker sequences at the end of these clones. These primers also introduced (i) an NcoI site at the 5' end (relative to the coding sequence) of each insert, and (ii) a BamHI site at the 3' end of each insert. Using these primers (see Table 13), the DNA fragments were amplified from 2 μl of the plaque pure stocks.
Table 13
Clone    Primer Set
Y5-10    Y5-10-F1    SEQ ID NO:82
         Y5-10-R1    SEQ ID NO:83
Y5-16    Y5-16F1     SEQ ID NO:84
         470ep-R3    SEQ ID NO:85
Y5-5     Y5-5-F1     SEQ ID NO:86
         470ep-R3    SEQ ID NO:85
Amplifications were performed as follows: 30 cycles of 94°C for 1 minute, 50°C for 1.5 minutes, and 72°C for 2 minutes. After amplification the resulting DNAs were purified using "WIZARD PCR" spin columns, the samples eluted in 50 μl, and digested overnight with NcoI and BamHI. A minimum of 30 units of each enzyme was used in the restriction endonuclease digestions (NcoI, Boehringer Mannheim; BamHI, Promega). The digested PCR fragments were ligated overnight to expression vector pGEX-HisB that had been digested with NcoI and BamHI. Each set of ligated plasmids was independently used to transform E. coli strain W3110, using a heat shock protocol (Ausubel, et al.; Maniatis, et al.). Transformants were selected on LB plates containing 100 μg/ml ampicillin and resistant colonies were used to inoculate 2 ml of LB containing 100 μg/ml ampicillin. Cultures expressing non-recombinant sj26/his protein were also prepared. After incubation overnight at 37°C the cultures were diluted 1/10 into 2 ml of fresh LB plus ampicillin and grown for an additional 1 hour at 37°C. IPTG was added to a final concentration of 0.2 mM and the cultures were grown for an additional 3 hours at 37°C. The bacteria were pelleted by centrifugation and the bacterial pellet was resuspended in 100 μl PBS. To the pellet, 100 μl of 2X SDS sample buffer (0.125 M Tris, pH 6.8, 10% glycine, 5% β-mercaptoethanol, 2.3% SDS) was added. The resulting lysates were vortexed and heated to 100°C for 5 minutes. Aliquots (15 μl) of each lysate were loaded onto a 12% acrylamide SDS-PAGE gel.
The expressed proteins were size-fractionated by electrophoresis. The separated proteins were transferred from the gel to nitrocellulose filters using standard techniques (Harlow, et al.). An additional gel containing the expressed proteins was stained using Coomassie blue protein stain.
Transformants carrying plasmids Y5-10, Y5-5 and Y5-16 expressed significant amounts of correctly sized recombinant fusion proteins. The identity of the recombinant fusions was confirmed by incubating a Western blot (prepared above) with a murine monoclonal antibody that is specifically immunoreactive with sj26 (Sierra BioSource, Gilroy, CA).
Additional confirmation that the picked colonies contained the appropriate insert was obtained as follows. A solution for each colony was prepared by inoculating 40 μl of TE solution with a toothpick carrying a small amount of bacteria putatively expressing a recombinant clone. A 5 μl sample was taken from each solution and separately PCR amplified.
The amplifications employed the appropriate forward primer (e.g., Y5-10 F for a colony putatively expressing Y5-10) and a reverse primer (SEQ ID NO:87) homologous to a sequence located 3' to the cloning sites of the plasmid pGEX-HisB. The PCR amplifications were for 25 cycles as follows: 94°C for 1 minute, 50°C for 1.5 minutes and 72°C for 2 minutes. All of the colonies selected for further analysis produced a correctly sized DNA band with no other obvious bands under these conditions.
The immunoreactivity of the antigens expressed from the Y5-10, Y5-16, and Y5-5 inserts (expressed as sj26-his fusion proteins) was determined as follows. Aliquots (15 μl) of the crude lysates prepared above were size-fractionated by SDS-PAGE using a 12% acrylamide gel. The proteins were electro-blotted ("NOVEX MINICELL MINIBLOT II," San Diego, CA) onto nitrocellulose filters. The filters were then individually incubated with one of the following sera: JC, PNF 2161, and Super normal serum 4 (SN4) (R05072) as a negative control. In addition, one filter was incubated with anti-sj26 monoclonal antibodies (RM001; Sierra BioSource).
As expected, the recombinant proteins produced by the bacteria expressing the antigens encoded by the Y5-10, Y5-5, and Y5-16 inserts all reacted with JC sera. No reactivity was observed with either PNF 2161 or SN4 sera. All proteins appeared to be expressed at similar levels as determined by their reactivity to the anti-sj26 monoclonal antibody. The Y5-5 and Y5-10 encoded proteins were selected for further purification.
E. coli carrying Y5-5- and Y5-10-containing pGEX-HisB vectors were cultured and expression of the fusion protein induced as described above. The cells were lysed in PBS, containing 2 mM PMSF, using a French Press at 1500 psi. The crude lysate was spun to remove cellular debris. The supernatant was loaded onto the glutathione affinity column at a high flow rate and the column was washed with 10 column volumes of PBS. The Y5-5 and Y5-10 fusion proteins were eluted with 10 mM Tris pH 8.8 containing 10 mM glutathione.
Each of the fusion protein samples was diluted 1/10 with Buffer A (10 mM Tris pH 8.8, containing 8 M urea) and loaded onto a nickel charged-chelating "SEPHAROSE" fast flow column. Each column was repeatedly washed with Buffer A until no further contaminants were eluted. The fusion proteins were eluted using a gradient of imidazole in Buffer A. An imidazole gradient was run from 0 to 0.5 M imidazole in 20 column volumes. Fractions were collected. Each set of fractions was analyzed by standard SDS-PAGE using 12% polyacrylamide gels. Pools of the Y5-5 and Y5-10 fusion protein-containing fractions were separately made.
Figures 8A to 8D show the results of Western blot analysis of the following samples (μg/lane): lane 1, Y5-10 antigen 1.6 μg; lane 2, Y5-10 antigen 0.8 μg; lane 3, Y5-10 antigen 0.4 μg; and lane 4, Y5-10 antigen 0.2 μg. Human serum JC (Figure 8A) and Super Normal 2 serum (Figure 8B) were diluted 1:100. The anti-GST mouse monoclonal antibody RM001 (Figure 8C) was diluted 1:1000. Figure 8D shows the Y5-10 antigen resolved by SDS-PAGE, transferred onto the nitrocellulose membrane and stained with Ponceau S protein stain (Kodak, Rochester, NY; Sigma). The arrow indicates the location of the Y5-10 antigen. These results demonstrate that Y5-10 is specifically immunoreactive with N-(ABCDE) human serum JC.
Figures 9A to 9D show the results of Western blot analysis of the following samples: lane 1, Y5-5 antigen 3.2 μg; lane 2, Y5-5 antigen 1.6 μg; lane 3, Y5-5 antigen 0.8 μg; lane 4, Y5-5 antigen 0.4 μg; lane 5, Y5-5 antigen 0.2 μg; lane 6, GE3-2 antigen 0.4 μg; and lane 7, Y5-10 antigen 0.4 μg. Human serum JC (Figure 9A), T55806 (Figure 9B), and Super Normal 2 serum (Figure 9C) were diluted 1:100. RM001, the anti-GST mouse monoclonal antibody (Figure 9D), was diluted 1:1000. Arrows indicate the locations of antigens Y5-5, GE3-2 and Y5-10. These results show specific immunoreactivity of the Y5-5 antigen with the JC serum. Further, the antigens GE3-2 and Y5-10 were reactive with T55806. However, the Y5-5 antigen was not reactive with the HGV-positive serum T55806.
The Y5-10 antigen was also size-fractionated by SDS polyacrylamide gel electrophoresis. The gel was stained using Coomassie blue protein stain. The gel was scanned for purity with a laser densitometer. The purity of the Y5-10 fusion protein was approximately 95%.
EXAMPLE 13
Cloning Further HGV Isolates
A. THE JC VARIANT.
One milliliter of JC serum was spun at 40,000 rpm for 2 hours. The resulting pellet was extracted using "TRIREAGENT" (MRC, Cincinnati, OH), resulting in the formation of 3 phases. The upper phase contained RNA only. This phase was taken and ethanol precipitated.
HGV cDNA molecules were generated from the JC sample by two methods. The first method was amplification (RT-PCR) of the JC nucleic acid sample using specific and nested primers. The primer sequences were based on the HGV sequence obtained from PNF 2161 serum. The criteria used to select the primers were (i) regions having a high G/C content, and (ii) no repetitious sequences.
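The two primer-selection criteria named above can be screened for computationally. The sketch below is one possible way to do so and is not taken from the source: the thresholds (`min_gc=0.5`, `max_run=4`) are hypothetical, and the longest single-base run is used only as a crude stand-in for "repetitious sequences".

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_run(seq):
    """Length of the longest run of a single repeated base."""
    if not seq:
        return 0
    seq = seq.upper()
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def passes_criteria(seq, min_gc=0.5, max_run=4):
    """Hypothetical filter: high G/C content and no long single-base runs."""
    return gc_fraction(seq) >= min_gc and longest_run(seq) <= max_run


candidate = "GGAATTCGCGGCCGCTCG"  # SEQ ID NO:1, used only as example input
print(round(gc_fraction(candidate), 3), longest_run(candidate),
      passes_criteria(candidate))
```

A fuller screen would also look for repeated k-mers and self-complementarity; those checks are omitted here to keep the example short.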
The second method used to generate HGV cDNA molecules was amplification using HGV (PNF 2161) specific primers followed by identification of HGV specific sequences with 32P-labelled oligonucleotide probes. Such DNA hybridizations were carried out essentially as described by Sambrook, et al. (1989). The PCR derived clones were either (i) cloned into the "TA" vector (Invitrogen, San Diego, CA) and sequenced with vector primers (TAR and TAF), or (ii) sequenced directly after PCR amplification. Both the probe and primer sequences were based on the HGV variant obtained from the PNF 2161 serum.
These two approaches yielded multiply-overlapping HGV fragments from the JC serum. Each of these fragments was cloned and sequenced. The sequences were aligned to obtain the HGV (JC-variant) consensus sequence presented as SEQ ID NO:156 (polypeptide sequence, SEQ ID NO:157). The sequence of each region of the HGV (JC-variant) virus was based on a consensus from at least three different, overlapping, independent clones.
B. OTHER HGV VARIANTS.
In addition to the HGV PNF 2161-variant and JC-variant sequences, three partial HGV isolates have been obtained from the sera BG34, T55806 and EB20 by methods similar to those described above. The partial sequences of these isolates are presented as SEQ ID NO:150 (BG34 nucleic acid), SEQ ID NO:151 (BG34 protein), SEQ ID NO:152 (T55806 nucleic acid), SEQ ID NO:153 (T55806 protein), SEQ ID NO:154 (EB20-2 nucleic acid) and SEQ ID NO:155 (EB20-2 protein).
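For the JC variant, each region of the consensus was based on at least three overlapping, independent clones. A simple way to express that rule is a column-wise majority vote over pre-aligned clone sequences that only calls a base where enough clones cover the position. The sketch below is an assumption-laden illustration, not the procedure used in the source: the alignment is taken as given, `-` marks positions a clone does not cover, and the toy reads are invented for the example.

```python
from collections import Counter


def consensus(aligned, min_depth=3, gap="-", unknown="N"):
    """Majority-vote consensus of equal-length, pre-aligned sequences.

    A base is called only where at least `min_depth` clones cover the
    column and one base holds a strict majority; otherwise `unknown`.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    called = []
    for column in zip(*aligned):
        bases = [b for b in column if b != gap]
        if len(bases) < min_depth:
            called.append(unknown)
            continue
        base, count = Counter(bases).most_common(1)[0]
        called.append(base if count > len(bases) / 2 else unknown)
    return "".join(called)


# Invented toy alignment of four overlapping clones (not HGV data).
reads = [
    "ACGTAC--",
    "ACGTACGT",
    "ACCTACGT",
    "--GTACGT",
]
print(consensus(reads))
```

With the toy reads, the single mismatch in the third clone is outvoted and every column is covered by at least three clones, so the full eight-base consensus is called.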
While the invention has been described with reference to specific methods and embodiments, it will be appreciated that various modifications and changes may be made without departing from the invention.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Genelabs Technologies, Inc.
(B) STREET: 505 Penobscot Drive
(C) CITY: Redwood City
(D) STATE: CA
(E) COUNTRY: USA
(F) POSTAL CODE: 94063
(ii) TITLE OF INVENTION: Detection of Viral Antigens Coded by Reverse Reading Frames
(iii) NUMBER OF SEQUENCES: 157
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Dehlinger & Associates
(B) STREET: 350 Cambridge Avenue, Suite 250
(C) CITY: Palo Alto
(D) STATE: CA
(E) COUNTRY: USA
(F) ZIP: 94306
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: Patentin Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/246,985
(B) FILING DATE: 20-MAY-1994
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/285,561
(B) FILING DATE: 03-AUG-1994
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/329,729
(B) FILING DATE: 26-OCT-1994
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/344,271
(B) FILING DATE: 23-NOV-1994
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/357,509
(B) FILING DATE: 16-DEC-1994
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/389,886
(B) FILING DATE: 15-FEB-1995
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Fabian, Gary R.
(B) REGISTRATION NUMBER: 33,875
(C) REFERENCE/DOCKET NUMBER: 4600-0202.41
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (415) 324-0880
(B) TELEFAX: (415) 324-0960
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: SISPA primer, top strand Linker AB
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGAATTCGCG GCCGCTCG 18
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Linker AB, bottom strand
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
CGAGCGGCCG CGAATTCCTT 20
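Reading the two Linker AB strands listed above (SEQ ID NO:1 and SEQ ID NO:2) side by side, the bottom strand appears to be the reverse complement of the top strand followed by two additional T residues. This is an observation drawn from the listed sequences rather than a statement made in the listing itself. The short Python sketch below checks it; the helper name `reverse_complement` and the variable names are illustrative.

```python
# Complement map for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


top = "GGAATTCGCGGCCGCTCG"        # SEQ ID NO:1, spaces removed
bottom = "CGAGCGGCCGCGAATTCCTT"   # SEQ ID NO:2, spaces removed

paired = reverse_complement(top)          # region that base-pairs with the top strand
is_complementary = bottom.startswith(paired)
overhang = bottom[len(paired):]           # bases of the bottom strand left unpaired
```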
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 237 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: PNF 2161 CLONE 470-20-1
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..237
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GAA TTC GCG GCC GCT CGG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT 48
Glu Phe Ala Ala Ala Arg Ala Val Ser Asp Ser Trp Met Thr Ser Asn 1 5 10 15
GAG TCA GAG GAC GGG GTA TCC TCC TGC GAG GAG GAC ACC GGC GGG GTC 96
Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr Gly Gly Val 20 25 30
TTC TCA TCT GAG CTG CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA 144
Phe Ser Ser Glu Leu Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly 35 40 45
GTA CGG GGG ATG TCT TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA 192
Val Arg Gly Met Ser Ser Pro His Thr Gly Ile Ser Arg Leu Leu Pro 50 55 60
CAA AGA GAG GGT GTA CTG CAG TCC TCC ACG AGC GGC CGC GAA TTC 237
Gln Arg Glu Gly Val Leu Gln Ser Ser Thr Ser Gly Arg Glu Phe 65 70 75
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 79 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Glu Phe Ala Ala Ala Arg Ala Val Ser Asp Ser Trp Met Thr Ser Asn 1 5 10 15
Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr Gly Gly Val 20 25 30
Phe Ser Ser Glu Leu Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly 35 40 45
Val Arg Gly Met Ser Ser Pro His Thr Gly Ile Ser Arg Leu Leu Pro 50 55 60
Gln Arg Glu Gly Val Leu Gln Ser Ser Thr Ser Gly Arg Glu Phe 65 70 75
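The relationship between the coding sequence of SEQ ID NO:3 (CDS 1..237) and the 79-residue protein of SEQ ID NO:4 can be checked by translating the nucleotides with the standard genetic code. The Python sketch below is illustrative only: the table construction, the function name `translate`, and the use of one-letter residue codes (instead of the three-letter codes used in the listing) are choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1, ignoring spaces and any trailing partial codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO:3 as listed above, one listing row per string.
SEQ_ID_3 = (
    "GAA TTC GCG GCC GCT CGG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT "
    "GAG TCA GAG GAC GGG GTA TCC TCC TGC GAG GAG GAC ACC GGC GGG GTC "
    "TTC TCA TCT GAG CTG CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA "
    "GTA CGG GGG ATG TCT TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA "
    "CAA AGA GAG GGT GTA CTG CAG TCC TCC ACG AGC GGC CGC GAA TTC"
)

protein = translate(SEQ_ID_3)
```

The resulting string has 79 residues, in agreement with the stated length of SEQ ID NO:4 (237 / 3 = 79).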
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HAV-R1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
GTTGACCAAC TGAGTCTGAA GC 22
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HAV-F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
GATTGGAAAT CTGATCCGTC CC 22
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HCV-LANR
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
TCGCGACCCA ACACTACTC 19
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HCV 1532
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GGGGGCGACA CTCCACCA 18
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470-20-1-77F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CTCTTTGTGG TAGTAGCCGA GAGAT 25
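Comparing Primer 470-20-1-77F (SEQ ID NO:9) with clone 470-20-1 (SEQ ID NO:3) suggests that the primer is the reverse complement of a 25-nucleotide stretch of the clone. This is an inference from the two listed sequences, not a statement made in the listing. A minimal Python sketch for locating such a primer on either strand follows; the function names and the returned tuple layout are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_primer(template, primer):
    """Return (strand, 0-based start) of the primer on the template, or None."""
    start = template.find(primer)
    if start >= 0:
        return ("+", start)
    start = template.find(reverse_complement(primer))
    if start >= 0:
        return ("-", start)
    return None


# SEQ ID NO:3 (clone 470-20-1), spaces removed when joined.
CLONE_470_20_1 = (
    "GAA TTC GCG GCC GCT CGG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT "
    "GAG TCA GAG GAC GGG GTA TCC TCC TGC GAG GAG GAC ACC GGC GGG GTC "
    "TTC TCA TCT GAG CTG CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA "
    "GTA CGG GGG ATG TCT TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA "
    "CAA AGA GAG GGT GTA CTG CAG TCC TCC ACG AGC GGC CGC GAA TTC"
).replace(" ", "")

PRIMER_77F = "CTCTTTGTGGTAGTAGCCGAGAGAT"  # SEQ ID NO:9, spaces removed

hit = locate_primer(CLONE_470_20_1, PRIMER_77F)
```

A "-" strand result means the primer anneals to the strand shown in the listing and is extended toward its 5' end.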
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470-20-1-211R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CGAATGAGTC AGAGGACGGG GTAT 24
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer KL-1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
GCAGGATCCG AATTCGCATC TAGAGAT 27
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer KL-2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ATCTCTAGAT GCGAATTCGG ATCCTGCGA 29
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: LAMBDA GT11, REVERSE PRIMER
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
GGCAGACATG GCCTGCCCGG 20
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9391 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HGV-PNF 2161 Variant
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 459..9077
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
ACGTGGGGGA GTTGATCCCC CCCCCCCGGC ACTGGGTGCA AGCCCCAGAA ACCGACGCCT 60
ATCTAAGTAG ACGCAATGAC TCGGCGCCGA CTCGGCGACC GGCCAAAAGG TGGTGGATGG 120
GTGATGACAG GGTTGGTAGG TCGTAAATCC CGGTCACCTT GGTAGCCACT ATAGGTGGGT 180
CTTAAGAGAA GGTTAAGATT CCTCTTGTGC CTGCGGCGAG ACCGCGCACG GTCCACAGGT 240
GTTGGCCCTA CCGGTGGGAA TAAGGGCCCG ACGTCAGGCT CGTCGTTAAA CCGAGCCCGT 300
TACCCACCTG GGCAAACGAC GCCCACGTAC GGTCCACGTC GCCCTTCAAT GTCTCTCTTG 360
ACCAATAGGC GTAGCCGGCG AGTTGACAAG GACCAGTGGG GGCCGGGGGC TTGGAGAGGG 420
ACTCCAAGTC CCGCCCTTCC CGGTGGGCCG GGAAATGC ATG GGG CCA CCC AGC 473
Met Gly Pro Pro Ser 1 5
TCC GCG GCG GCC TGC AGC CGG GGT AGC CCA AGA ATC CTT CGG GTG AGG 521
Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Arg Ile Leu Arg Val Arg 10 15 20
GCG GGT GGC ATT TCC TTT TTC TAT ACC ATC ATG GCA GTC CTT CTG CTC 569
Ala Gly Gly Ile Ser Phe Phe Tyr Thr Ile Met Ala Val Leu Leu Leu 25 30 35
CTT CTC GTG GTT GAG GCC GGG GCC ATT CTG GCC CCG GCC ACC CAC GCT 617
Leu Leu Val Val Glu Ala Gly Ala Ile Leu Ala Pro Ala Thr His Ala 40 45 50
TGT CGA GCG AAT GGG CAA TAT TTC CTC ACA AAT TGT TGT GCC CCG GAG 665
Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn Cys Cys Ala Pro Glu 55 60 65
GAC ATC GGG TTC TGC CTG GAG GGT GGA TGC CTG GTG GCC CTG GGG TGC 713
Asp Ile Gly Phe Cys Leu Glu Gly Gly Cys Leu Val Ala Leu Gly Cys 70 75 80 85
ACG ATT TGC ACT GAC CAA TGC TGG CCA CTG TAT CAG GCG GGT TTG GCT 761
Thr Ile Cys Thr Asp Gln Cys Trp Pro Leu Tyr Gln Ala Gly Leu Ala 90 95 100
GTG CGG CCT GGC AAG TCC GCG GCC CAA CTG GTG GGG GAG CTG GGT AGC 809
Val Arg Pro Gly Lys Ser Ala Ala Gln Leu Val Gly Glu Leu Gly Ser 105 110 115
CTA TAC GGG CCC CTG TCG GTC TCG GCC TAT GTG GCT GGG ATC CTG GGC 857
Leu Tyr Gly Pro Leu Ser Val Ser Ala Tyr Val Ala Gly Ile Leu Gly 120 125 130
CTG GGT GAG GTG TAC TCG GGT GTC CTA ACG GTG GGA GTC GCG TTG ACG 905
Leu Gly Glu Val Tyr Ser Gly Val Leu Thr Val Gly Val Ala Leu Thr 135 140 145
CGC CGG GTC TAC CCG GTG CCT AAC CTG ACG TGT GCA GTC GCG TGT GAG 953
Arg Arg Val Tyr Pro Val Pro Asn Leu Thr Cys Ala Val Ala Cys Glu 150 155 160 165
CTA AAG TGG GAA AGT GAG TTT TGG AGA TGG ACT GAA CAG CTG GCC TCC 1001
Leu Lys Trp Glu Ser Glu Phe Trp Arg Trp Thr Glu Gln Leu Ala Ser 170 175 180
AAC TAC TGG ATT CTG GAA TAC CTC TGG AAG GTC CCA TTT GAT TTC TGG 1049
Asn Tyr Trp Ile Leu Glu Tyr Leu Trp Lys Val Pro Phe Asp Phe Trp 185 190 195
AGA GGC GTG ATA AGC CTG ACC CCC TTG TTG GTT TGC GTG GCC GCA TTG 1097
Arg Gly Val Ile Ser Leu Thr Pro Leu Leu Val Cys Val Ala Ala Leu 200 205 210
CTG CTG CTT GAG CAA CGG ATT GTC ATG GTC TTC CTG TTG GTG ACG ATG 1145
Leu Leu Leu Glu Gln Arg Ile Val Met Val Phe Leu Leu Val Thr Met 215 220 225
GCC GGG ATG TCG CAA GGC GCC CCT GCC TCC GTT TTG GGG TCA CGC CCC 1193
Ala Gly Met Ser Gln Gly Ala Pro Ala Ser Val Leu Gly Ser Arg Pro 230 235 240 245
TTT GAC TAC GGG TTG ACT TGG CAG ACC TGC TCT TGC AGG GCC AAC GGT 1241
Phe Asp Tyr Gly Leu Thr Trp Gln Thr Cys Ser Cys Arg Ala Asn Gly 250 255 260
TCG CGT TTT TCG ACT GGG GAG AAG GTG TGG GAC CGT GGG AAC GTT ACG 1289
Ser Arg Phe Ser Thr Gly Glu Lys Val Trp Asp Arg Gly Asn Val Thr 265 270 275
CTT CAG TGT GAC TGC CCT AAC GGC CCC TGG GTG TGG TTG CCA GCC TTT 1337
Leu Gln Cys Asp Cys Pro Asn Gly Pro Trp Val Trp Leu Pro Ala Phe 280 285 290
TGC CAA GCA ATC GGC TGG GGT GAC CCC ATC ACT TAT TGG AGC CAC GGG 1385
Cys Gln Ala Ile Gly Trp Gly Asp Pro Ile Thr Tyr Trp Ser His Gly 295 300 305
CAA AAT CAG TGG CCC CTT TCA TGC CCC CAG TAT GTC TAT GGG TCT GCT 1433
Gln Asn Gln Trp Pro Leu Ser Cys Pro Gln Tyr Val Tyr Gly Ser Ala 310 315 320 325
ACA GTC ACT TGC GTG TGG GGT TCC GCT TCT TGG TTT GCC TCC ACC AGT 1481
Thr Val Thr Cys Val Trp Gly Ser Ala Ser Trp Phe Ala Ser Thr Ser 330 335 340
GGT CGC GAC TCG AAG ATA GAT GTG TGG AGT TTA GTG CCA GTT GGC TCT 1529
Gly Arg Asp Ser Lys Ile Asp Val Trp Ser Leu Val Pro Val Gly Ser 345 350 355
GCC ACC TGC ACC ATA GCC GCA CTT GGA TCA TCG GAT CGC GAC ACG GTG 1577
Ala Thr Cys Thr Ile Ala Ala Leu Gly Ser Ser Asp Arg Asp Thr Val 360 365 370
CCT GGG CTC TCC GAG TGG GGA ATC CCG TGC GTG ACG TGT GTT CTG GAC 1625
Pro Gly Leu Ser Glu Trp Gly Ile Pro Cys Val Thr Cys Val Leu Asp 375 380 385
CGT CGG CCT GCC TCC TGC GGC ACC TGT GTG AGG GAC TGC TGG CCC GAG 1673
Arg Arg Pro Ala Ser Cys Gly Thr Cys Val Arg Asp Cys Trp Pro Glu 390 395 400 405
ACC GGG TCG GTT AGG TTC CCA TTC CAT CGG TGC GGC GTG GGG CCT CGG 1721
Thr Gly Ser Val Arg Phe Pro Phe His Arg Cys Gly Val Gly Pro Arg 410 415 420
CTG ACA AAG GAC TTG GAA GCT GTG CCC TTC GTC AAC AGG ACA ACT CCC 1769
Leu Thr Lys Asp Leu Glu Ala Val Pro Phe Val Asn Arg Thr Thr Pro 425 430 435
TTC ACC ATT AGG GGG CCC CTG GGC AAC CAG GGC CGA GGC AAC CCG GTG 1817
Phe Thr Ile Arg Gly Pro Leu Gly Asn Gln Gly Arg Gly Asn Pro Val 440 445 450
CGG TCG CCC TTG GGT TTT GGG TCC TAC GCC ATG ACC AGG ATC CGA GAT 1865
Arg Ser Pro Leu Gly Phe Gly Ser Tyr Ala Met Thr Arg Ile Arg Asp 455 460 465
ACC CTA CAT CTG GTG GAG TGT CCC ACA CCA GCC ATT GAG CCT CCC ACC 1913
Thr Leu His Leu Val Glu Cys Pro Thr Pro Ala Ile Glu Pro Pro Thr 470 475 480 485
GGG ACG TTT GGG TTC TTC CCC GGG ACG CCG CCT CTC AAC AAC TGC ATG 1961
Gly Thr Phe Gly Phe Phe Pro Gly Thr Pro Pro Leu Asn Asn Cys Met 490 495 500
CTC TTG GGC ACG GAA GTG TCC GAG GCA CTT GGG GGG GCT GGC CTC ACG 2009
Leu Leu Gly Thr Glu Val Ser Glu Ala Leu Gly Gly Ala Gly Leu Thr 505 510 515
GGG GGG TTC TAT GAA CCC CTG GTG CGC AGG TGT TCG AAG CTG ATG GGA 2057
Gly Gly Phe Tyr Glu Pro Leu Val Arg Arg Cys Ser Lys Leu Met Gly 520 525 530
AGC CGA AAT CCG GTT TGT CCG GGG TTT GCA TGG CTC TCT TCG GGC AGG 2105
Ser Arg Asn Pro Val Cys Pro Gly Phe Ala Trp Leu Ser Ser Gly Arg 535 540 545
CCT GAT GGG TTT ATA CAT GTC CAG GGT CAC TTG CAG GAG GTG GAT GCA 2153
Pro Asp Gly Phe Ile His Val Gln Gly His Leu Gln Glu Val Asp Ala 550 555 560 565
GGC AAC TTC ATC CCG CCC CCG CGC TGG TTG CTC TTG GAC TTT GTA TTT 2201
Gly Asn Phe Ile Pro Pro Pro Arg Trp Leu Leu Leu Asp Phe Val Phe 570 575 580
GTC CTG TTA TAC CTG ATG AAG CTG GCT GAG GCA CGG TTG GTC CCG CTG 2249
Val Leu Leu Tyr Leu Met Lys Leu Ala Glu Ala Arg Leu Val Pro Leu 585 590 595
ATC TTG CTG CTG CTA TGG TGG TGG GTG AAC CAG CTG GCA GTC CTA GGG 2297
Ile Leu Leu Leu Leu Trp Trp Trp Val Asn Gln Leu Ala Val Leu Gly 600 605 610
CTG CCG GCT GTG GAA GCC GCC GTG GCA GGT GAG GTC TTC GCG GGC CCT 2345
Leu Pro Ala Val Glu Ala Ala Val Ala Gly Glu Val Phe Ala Gly Pro 615 620 625
GCC CTG TCC TGG TGT CTG GGA CTC CCG GTC GTC AGT ATG ATA TTG GGT 2393
Ala Leu Ser Trp Cys Leu Gly Leu Pro Val Val Ser Met Ile Leu Gly 630 635 640 645
TTG GCA AAC CTG GTG CTG TAC TTT AGA TGG TTG GGA CCC CAA CGC CTG 2441
Leu Ala Asn Leu Val Leu Tyr Phe Arg Trp Leu Gly Pro Gln Arg Leu 650 655 660
ATG TTC CTC GTG TTG TGG AAG CTT GCT CGG GGA GCT TTC CCG CTG GCC 2489
Met Phe Leu Val Leu Trp Lys Leu Ala Arg Gly Ala Phe Pro Leu Ala 665 670 675
CTC TTG ATG GGG ATT TCG GCG ACC CGC GGG CGC ACC TCA GTG CTC GGG 2537
Leu Leu Met Gly Ile Ser Ala Thr Arg Gly Arg Thr Ser Val Leu Gly 680 685 690
GCC GAG TTC TGC TTC GAT GCT ACA TTC GAG GTG GAC ACT TCG GTG TTG 2585
Ala Glu Phe Cys Phe Asp Ala Thr Phe Glu Val Asp Thr Ser Val Leu 695 700 705
GGC TGG GTG GTG GCC AGT GTG GTA GCT TGG GCC ATT GCG CTC CTG AGC 2633
Gly Trp Val Val Ala Ser Val Val Ala Trp Ala Ile Ala Leu Leu Ser 710 715 720 725
TCG ATG AGC GCA GGG GGG TGG AGG CAC AAA GCC GTG ATC TAT AGG ACG 2681
Ser Met Ser Ala Gly Gly Trp Arg His Lys Ala Val Ile Tyr Arg Thr 730 735 740
TGG TGT AAG GGG TAC CAG GCA ATC CGT CAA AGG GTG GTG AGG AGC CCC 2729
Trp Cys Lys Gly Tyr Gln Ala Ile Arg Gln Arg Val Val Arg Ser Pro 745 750 755
CTC GGG GAG GGG CGG CCT GCC AAA CCC CTG ACC TTT GCC TGG TGC TTG 2777
Leu Gly Glu Gly Arg Pro Ala Lys Pro Leu Thr Phe Ala Trp Cys Leu 760 765 770
GCC TCG TAC ATC TGG CCA GAT GCT GTG ATG ATG GTG GTG GTT GCC TTG 2825
Ala Ser Tyr Ile Trp Pro Asp Ala Val Met Met Val Val Val Ala Leu 775 780 785
GTC CTT CTC TTT GGC CTG TTC GAC GCG TTG GAT TGG GCC TTG GAG GAG 2873
Val Leu Leu Phe Gly Leu Phe Asp Ala Leu Asp Trp Ala Leu Glu Glu 790 795 800 805
ATC TTG GTG TCC CGG CCC TCG TTG CGG CGT TTG GCT CGG GTG GTT GAG 2921
Ile Leu Val Ser Arg Pro Ser Leu Arg Arg Leu Ala Arg Val Val Glu 810 815 820
TGC TGT GTG ATG GCG GGT GAG AAG GCC ACA ACC GTC CGG CTG GTC TCC 2969
Cys Cys Val Met Ala Gly Glu Lys Ala Thr Thr Val Arg Leu Val Ser 825 830 835
AAG ATG TGT GCG AGA GGA GCT TAT TTG TTC GAT CAT ATG GGC TCT TTT 3017
Lys Met Cys Ala Arg Gly Ala Tyr Leu Phe Asp His Met Gly Ser Phe 840 845 850
TCG CGT GCT GTC AAG GAG CGC CTG TTG GAA TGG GAC GCA GCT CTT GAA 3065
Ser Arg Ala Val Lys Glu Arg Leu Leu Glu Trp Asp Ala Ala Leu Glu 855 860 865
CCT CTG TCA TTC ACT AGG ACG GAC TGT CGC ATC ATA CGG GAT GCC GCG 3113
Pro Leu Ser Phe Thr Arg Thr Asp Cys Arg Ile Ile Arg Asp Ala Ala 870 875 880 885
AGG ACT TTG TCC TGC GGG CAG TGC GTC ATG GGT TTA CCC GTG GTT GCG 3161
Arg Thr Leu Ser Cys Gly Gln Cys Val Met Gly Leu Pro Val Val Ala 890 895 900
CGC CGT GGT GAT GAG GTT CTC ATC GGC GTC TTC CAG GAT GTG AAT CAT 3209
Arg Arg Gly Asp Glu Val Leu Ile Gly Val Phe Gln Asp Val Asn His 905 910 915
TTG CCT CCC GGG TTT GTT CCG ACC GCG CCT GTT GTC ATC CGA CGG TGC 3257
Leu Pro Pro Gly Phe Val Pro Thr Ala Pro Val Val Ile Arg Arg Cys 920 925 930
GGA AAG GGC TTC TTG GGG GTC ACA AAG GCT GCC TTG ACA GGT CGG GAT 3305
Gly Lys Gly Phe Leu Gly Val Thr Lys Ala Ala Leu Thr Gly Arg Asp 935 940 945
CCT GAC TTA CAT CCA GGG AAC GTC ATG GTG TTG GGG ACG GCT ACG TCG 3353
Pro Asp Leu His Pro Gly Asn Val Met Val Leu Gly Thr Ala Thr Ser 950 955 960 965
CGA AGC ATG GGA ACA TGC TTG AAC GGC CTG CTG TTC ACG ACC TTC CAT 3401
Arg Ser Met Gly Thr Cys Leu Asn Gly Leu Leu Phe Thr Thr Phe His 970 975 980
GGG GCT TCA TCC CGA ACC ATC GCC ACA CCC GTG GGG GCC CTT AAT CCC 3449
Gly Ala Ser Ser Arg Thr Ile Ala Thr Pro Val Gly Ala Leu Asn Pro 985 990 995
AGA TGG TGG TCA GCC AGT GAT GAT GTC ACG GTG TAT CCA CTC CCG GAT 3497
Arg Trp Trp Ser Ala Ser Asp Asp Val Thr Val Tyr Pro Leu Pro Asp 1000 1005 1010
GGG GCT ACT TCG TTA ACA CCT TGT ACT TGC CAG GCT GAG TCC TGT TGG 3545
Gly Ala Thr Ser Leu Thr Pro Cys Thr Cys Gln Ala Glu Ser Cys Trp 1015 1020 1025
GTC ATC AGA TCC GAC GGG GCC CTA TGC CAT GGC TTG AGC AAG GGG GAC 3593
Val Ile Arg Ser Asp Gly Ala Leu Cys His Gly Leu Ser Lys Gly Asp 1030 1035 1040 1045
AAG GTG GAG CTG GAT GTG GCC ATG GAG GTC TCT GAC TTC CGT GGC TCG 3641
Lys Val Glu Leu Asp Val Ala Met Glu Val Ser Asp Phe Arg Gly Ser 1050 1055 1060
TCT GGC TCA CCG GTC CTA TGT GAC GAA GGG CAC GCA GTA GGA ATG CTC 3689
Ser Gly Ser Pro Val Leu Cys Asp Glu Gly His Ala Val Gly Met Leu 1065 1070 1075
GTG TCT GTG CTT CAC TCC GGT GGT AGG GTC ACC GCG GCA CGG TTC ACT 3737
Val Ser Val Leu His Ser Gly Gly Arg Val Thr Ala Ala Arg Phe Thr 1080 1085 1090
AGG CCG TGG ACC CAA GTG CCA ACA GAT GCC AAA ACC ACT ACT GAA CCC 3785
Arg Pro Trp Thr Gln Val Pro Thr Asp Ala Lys Thr Thr Thr Glu Pro 1095 1100 1105
CCT CCG GTG CCG GCC AAA GGA GTT TTC AAA GAG GCC CCG TTG TTT ATG 3833
Pro Pro Val Pro Ala Lys Gly Val Phe Lys Glu Ala Pro Leu Phe Met 1110 1115 1120 1125
CCT ACG GGA GCG GGA AAG AGC ACT CGC GTC CCG TTG GAG TAC GAT AAC 3881
Pro Thr Gly Ala Gly Lys Ser Thr Arg Val Pro Leu Glu Tyr Asp Asn 1130 1135 1140
ATG GGG CAC AAG GTC TTA ATC TTG AAC CCC TCA GTG GCC ACT GTG CGG 3929
Met Gly His Lys Val Leu Ile Leu Asn Pro Ser Val Ala Thr Val Arg 1145 1150 1155
GCC ATG GGC CCG TAC ATG GAG CGG CTG GCG GGT AAA CAT CCA AGT ATA 3977
Ala Met Gly Pro Tyr Met Glu Arg Leu Ala Gly Lys His Pro Ser Ile 1160 1165 1170
TAC TGT GGG CAT GAT ACA ACT GCT TTC ACA AGG ATC ACT GAC TCC CCC 4025
Tyr Cys Gly His Asp Thr Thr Ala Phe Thr Arg Ile Thr Asp Ser Pro 1175 1180 1185
CTG ACG TAT TCA ACC TAT GGG AGG TTT TTG GCC AAC CCT AGG CAG ATG 4073
Leu Thr Tyr Ser Thr Tyr Gly Arg Phe Leu Ala Asn Pro Arg Gln Met 1190 1195 1200 1205
CTA CGG GGC GTT TCG GTG GTC ATT TGT GAT GAG TGC CAC AGT CAT GAC 4121
Leu Arg Gly Val Ser Val Val Ile Cys Asp Glu Cys His Ser His Asp 1210 1215 1220
TCA ACC GTG CTG TTA GGC ATT GGG AGA GTC CGG GAG CTG GCG CGT GGG 4169
Ser Thr Val Leu Leu Gly Ile Gly Arg Val Arg Glu Leu Ala Arg Gly 1225 1230 1235
TGC GGG GTG CAA CTA GTG CTC TAC GCC ACC GCT ACA CCT CCC GGA TCC 4217
Cys Gly Val Gln Leu Val Leu Tyr Ala Thr Ala Thr Pro Pro Gly Ser 1240 1245 1250
CCT ATG ACG CAG CAC CCT TCC ATA ATT GAG ACA AAA TTG GAC GTG GGC 4265
Pro Met Thr Gln His Pro Ser Ile Ile Glu Thr Lys Leu Asp Val Gly 1255 1260 1265
GAG ATT CCC TTT TAT GGG CAT GGA ATA CCC CTC GAG CGG ATG CGA ACC 4313
Glu Ile Pro Phe Tyr Gly His Gly Ile Pro Leu Glu Arg Met Arg Thr 1270 1275 1280 1285
GGA AGG CAC CTC GTG TTC TGC CAT TCT AAG GCT GAG TGC GAG CGC CTT 4361
Gly Arg His Leu Val Phe Cys His Ser Lys Ala Glu Cys Glu Arg Leu 1290 1295 1300
GCT GGC CAG TTC TCC GCT AGG GGG GTC AAT GCC ATT GCC TAT TAT AGG 4409
Ala Gly Gln Phe Ser Ala Arg Gly Val Asn Ala Ile Ala Tyr Tyr Arg 1305 1310 1315
GGT AAA GAC AGT TCT ATC ATC AAG GAT GGG GAC CTG GTG GTC TGT GCT 4457
Gly Lys Asp Ser Ser Ile Ile Lys Asp Gly Asp Leu Val Val Cys Ala 1320 1325 1330
ACA GAC GCG CTT TCC ACT GGG TAC ACT GGA AAT TTC GAC TCC GTC ACC 4505
Thr Asp Ala Leu Ser Thr Gly Tyr Thr Gly Asn Phe Asp Ser Val Thr 1335 1340 1345
GAC TGT GGA TTA GTG GTG GAG GAG GTC GTT GAG GTG ACC CTT GAT CCC 4553
Asp Cys Gly Leu Val Val Glu Glu Val Val Glu Val Thr Leu Asp Pro 1350 1355 1360 1365
ACC ATT ACC ATC TCC CTG CGG ACA GTG CCT GCG TCG GCT GAA CTG TCG 4601
Thr Ile Thr Ile Ser Leu Arg Thr Val Pro Ala Ser Ala Glu Leu Ser 1370 1375 1380
ATG CAA AGA CGA GGA CGC ACG GGT AGG GGC AGG TCT GGA CGC TAC TAC 4649
Met Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Ser Gly Arg Tyr Tyr 1385 1390 1395
TAC GCG GGG GTG GGC AAA GCC CCT GCG GGT GTG GTG CGC TCA GGT CCT 4697
Tyr Ala Gly Val Gly Lys Ala Pro Ala Gly Val Val Arg Ser Gly Pro 1400 1405 1410
GTC TGG TCG GCG GTG GAA GCT GGA GTG ACC TGG TAC GGA ATG GAA CCT 4745
Val Trp Ser Ala Val Glu Ala Gly Val Thr Trp Tyr Gly Met Glu Pro 1415 1420 1425
GAC TTG ACA GCT AAC CTA CTG AGA CTT TAC GAC GAC TGC CCT TAC ACC 4793
Asp Leu Thr Ala Asn Leu Leu Arg Leu Tyr Asp Asp Cys Pro Tyr Thr 1430 1435 1440 1445
GCA GCC GTC GCG GCT GAT ATC GGA GAA GCC GCG GTG TTC TTC TCT GGG 4841
Ala Ala Val Ala Ala Asp Ile Gly Glu Ala Ala Val Phe Phe Ser Gly 1450 1455 1460
CTC GCC CCA TTG AGG ATG CAC CCT GAT GTC AGC TGG GCA AAA GTT CGC 4889
Leu Ala Pro Leu Arg Met His Pro Asp Val Ser Trp Ala Lys Val Arg 1465 1470 1475
GGC GTC AAC TGG CCC CTC TTG GTG GGT GTT CAG CGG ACC ATG TGT CGG 4937
Gly Val Asn Trp Pro Leu Leu Val Gly Val Gln Arg Thr Met Cys Arg 1480 1485 1490
GAA ACA CTG TCT CCC GGC CCA TCG GAT GAC CCC CAA TGG GCA GGT CTG 4985
Glu Thr Leu Ser Pro Gly Pro Ser Asp Asp Pro Gln Trp Ala Gly Leu 1495 1500 1505
AAG GGC CCA AAT CCT GTC CCA CTC CTG CTG AGG TGG GGC AAT GAT TTA 5033
Lys Gly Pro Asn Pro Val Pro Leu Leu Leu Arg Trp Gly Asn Asp Leu 1510 1515 1520 1525
CCA TCT AAA GTG GCC GGC CAC CAC ATA GTG GAC GAC CTG GTC CGG AGA 5081
Pro Ser Lys Val Ala Gly His His Ile Val Asp Asp Leu Val Arg Arg 1530 1535 1540
CTC GGT GTG GCG GAG GGT TAC GTC CGC TGC GAC GCT GGG CCG ATC TTG 5129
Leu Gly Val Ala Glu Gly Tyr Val Arg Cys Asp Ala Gly Pro Ile Leu 1545 1550 1555
ATG ATC GGT CTA GCT ATC GCG GGG GGA ATG ATC TAC GCG TCA TAC ACC 5177
Met Ile Gly Leu Ala Ile Ala Gly Gly Met Ile Tyr Ala Ser Tyr Thr 1560 1565 1570
GGG TCG CTA GTG GTG GTG ACA GAC TGG GAT GTG AAG GGG GGT GGC GCC 5225
Gly Ser Leu Val Val Val Thr Asp Trp Asp Val Lys Gly Gly Gly Ala 1575 1580 1585
CCC CTT TAT CGG CAT GGA GAC CAG GCC ACG CCT CAG CCG GTG GTG CAG 5273
Pro Leu Tyr Arg His Gly Asp Gln Ala Thr Pro Gln Pro Val Val Gln 1590 1595 1600 1605
GTT CCT CCG GTA GAC CAT CGG CCG GGG GGT GAA TCA GCA CCA TCG GAT 5321
Val Pro Pro Val Asp His Arg Pro Gly Gly Glu Ser Ala Pro Ser Asp 1610 1615 1620
GCC AAG ACA GTG ACA GAT GCG GTG GCA GCC ATC CAG GTG GAC TGC GAT 5369
Ala Lys Thr Val Thr Asp Ala Val Ala Ala Ile Gln Val Asp Cys Asp 1625 1630 1635
TGG ACT ATC ATG ACT CTG TCG ATC GGA GAA GTG TTG TCC TTG GCT CAG 5417
Trp Thr Ile Met Thr Leu Ser Ile Gly Glu Val Leu Ser Leu Ala Gln 1640 1645 1650
GCT AAG ACG GCC GAG GCC TAC ACA GCA ACC GCC AAG TGG CTC GCT GGC 5465
Ala Lys Thr Ala Glu Ala Tyr Thr Ala Thr Ala Lys Trp Leu Ala Gly 1655 1660 1665
TGC TAT ACG GGG ACG CGG GCC GTT CCC ACT GTA TCC ATT GTT GAC AAG 5513
Cys Tyr Thr Gly Thr Arg Ala Val Pro Thr Val Ser Ile Val Asp Lys 1670 1675 1680 1685
CTC TTC GCC GGA GGG TGG GCG GCT GTG GTG GGC CAT TGC CAC AGC GTG 5561
Leu Phe Ala Gly Gly Trp Ala Ala Val Val Gly His Cys His Ser Val 1690 1695 1700
ATT GCT GCG GCG GTG GCG GCC TAC GGG GCT TCA AGG AGC CCG CCG TTG 5609
Ile Ala Ala Ala Val Ala Ala Tyr Gly Ala Ser Arg Ser Pro Pro Leu 1705 1710 1715
GCA GCC GCG GCT TCC TAC CTG ATG GGG TTG GGC GTT GGA GGC AAC GCT 5657
Ala Ala Ala Ala Ser Tyr Leu Met Gly Leu Gly Val Gly Gly Asn Ala 1720 1725 1730
CAG ACG CGC CTG GCG TCT GCC CTC CTA TTG GGG GCT GCT GGA ACC GCC 5705
Gln Thr Arg Leu Ala Ser Ala Leu Leu Leu Gly Ala Ala Gly Thr Ala 1735 1740 1745
TTG GGC ACT CCT GTC GTG GGC TTG ACC ATG GCA GGT GCG TTC ATG GGG 5753
Leu Gly Thr Pro Val Val Gly Leu Thr Met Ala Gly Ala Phe Met Gly 1750 1755 1760 1765
GGG GCC AGT GTC TCC CCC TCC TTG GTC ACC ATT TTA TTG GGG GCC GTC 5801
Gly Ala Ser Val Ser Pro Ser Leu Val Thr Ile Leu Leu Gly Ala Val 1770 1775 1780
GGA GGT TGG GAG GGT GTT GTC AAC GCG GCG AGC CTA GTC TTT GAC TTC 5849
Gly Gly Trp Glu Gly Val Val Asn Ala Ala Ser Leu Val Phe Asp Phe 1785 1790 1795
ATG GCG GGG AAA CTT TCA TCA GAA GAT CTG TGG TAT GCC ATC CCG GTA 5897
Met Ala Gly Lys Leu Ser Ser Glu Asp Leu Trp Tyr Ala Ile Pro Val 1800 1805 1810
CTG ACC AGC CCG GGG GCG GGC CTT GCG GGG ATC GCT CTC GGG TTG GTT 5945
Leu Thr Ser Pro Gly Ala Gly Leu Ala Gly Ile Ala Leu Gly Leu Val 1815 1820 1825
TTG TAT TCA GCT AAC AAC TCT GGC ACT ACC ACT TGG TTG AAC CGT CTG 5993
Leu Tyr Ser Ala Asn Asn Ser Gly Thr Thr Thr Trp Leu Asn Arg Leu 1830 1835 1840 1845
CTG ACT ACG TTA CCA AGG TCT TCA TGT ATC CCG GAC AGT TAC TTT CAG 6041
Leu Thr Thr Leu Pro Arg Ser Ser Cys Ile Pro Asp Ser Tyr Phe Gln 1850 1855 1860
CAA GTT GAC TAT TGC GAC AAG GTC TCA GCC GTG CTC CGG CGC CTG AGC 6089
Gln Val Asp Tyr Cys Asp Lys Val Ser Ala Val Leu Arg Arg Leu Ser 1865 1870 1875
CTC ACC CGC ACA GTG GTT GCC CTG GTC AAC AGG GAG CCT AAG GTG GAT 6137
Leu Thr Arg Thr Val Val Ala Leu Val Asn Arg Glu Pro Lys Val Asp 1880 1885 1890
GAG GTA CAG GTG GGG TAT GTC TGG GAC CTG TGG GAG TGG ATC ATG CGC 6185
Glu Val Gln Val Gly Tyr Val Trp Asp Leu Trp Glu Trp Ile Met Arg 1895 1900 1905
CAA GTG CGC GTG GTC ATG GCC AGA CTC AGG GCC CTC TGC CCC GTG GTG 6233
Gln Val Arg Val Val Met Ala Arg Leu Arg Ala Leu Cys Pro Val Val 1910 1915 1920 1925
TCA CTA CCC TTG TGG CAT TGC GGG GAG GGG TGG TCC GGG GAA TGG TTG 6281
Ser Leu Pro Leu Trp His Cys Gly Glu Gly Trp Ser Gly Glu Trp Leu 1930 1935 1940
CTT GAC GGT CAT GTT GAG AGT CGC TGC CTC TGT GGC TGC GTG ATC ACT 6329
Leu Asp Gly His Val Glu Ser Arg Cys Leu Cys Gly Cys Val Ile Thr 1945 1950 1955
GGT GAC GTT CTG AAT GGG CAA CTC AAA GAA CCA GTT TAC TCT ACC AAG 6377
Gly Asp Val Leu Asn Gly Gln Leu Lys Glu Pro Val Tyr Ser Thr Lys 1960 1965 1970
CTG TGC CGG CAC TAT TGG ATG GGG ACT GTC CCT GTG AAC ATG CTG GGT 6425
Leu Cys Arg His Tyr Trp Met Gly Thr Val Pro Val Asn Met Leu Gly 1975 1980 1985
TAC GGT GAA ACG TCG CCT CTC CTG GCC TCC GAC ACC CCG AAG GTT GTG 6473
Tyr Gly Glu Thr Ser Pro Leu Leu Ala Ser Asp Thr Pro Lys Val Val 1990 1995 2000 2005
CCC TTC GGG ACG TCT GGC TGG GCT GAG GTG GTG GTG ACC ACT ACC CAC 6521
Pro Phe Gly Thr Ser Gly Trp Ala Glu Val Val Val Thr Thr Thr His 2010 2015 2020
GTG GTA ATC AGG AGG ACC TCC GCC TAT AAG CTG CTG CGC CAG CAA ATC 6569
Val Val Ile Arg Arg Thr Ser Ala Tyr Lys Leu Leu Arg Gln Gln Ile 2025 2030 2035
CTA TCG GCT GCT GTA GCT GAG CCC TAC TAC GTC GAC GGC ATT CCG GTC 6617
Leu Ser Ala Ala Val Ala Glu Pro Tyr Tyr Val Asp Gly Ile Pro Val 2040 2045 2050
TCA TGG GAC GCG GAC GCT CGT GCG CCC GCC ATG GTC TAT GGC CCT GGG 6665
Ser Trp Asp Ala Asp Ala Arg Ala Pro Ala Met Val Tyr Gly Pro Gly 2055 2060 2065
CAA AGT GTT ACC ATT GAC GGG GAG CGC TAC ACC TTG CCT CAT CAA CTG 6713
Gln Ser Val Thr Ile Asp Gly Glu Arg Tyr Thr Leu Pro His Gln Leu 2070 2075 2080 2085
AGG CTC AGG AAT GTG GCA CCC TCT GAG GTT TCA TCC GAG GTG TCC ATT 6761
Arg Leu Arg Asn Val Ala Pro Ser Glu Val Ser Ser Glu Val Ser Ile 2090 2095 2100
GAC ATT GGG ACG GAG ACT GAA GAC TCA GAA CTG ACT GAG GCC GAT CTG 6809
Asp Ile Gly Thr Glu Thr Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu 2105 2110 2115
CCG CCG GCG GCT GCT GCT CTC CAA GCG ATC GAG AAT GCT GCG AGG ATT 6857
Pro Pro Ala Ala Ala Ala Leu Gln Ala Ile Glu Asn Ala Ala Arg Ile 2120 2125 2130
CTT GAA CCG CAC ATT GAT GTC ATC ATG GAG GAC TGC AGT ACA CCC TCT 6905
Leu Glu Pro His Ile Asp Val Ile Met Glu Asp Cys Ser Thr Pro Ser 2135 2140 2145
CTT TGT GGT AGT AGC CGA GAG ATG CCT GTA TGG GGA GAA GAC ATC CCC 6953
Leu Cys Gly Ser Ser Arg Glu Met Pro Val Trp Gly Glu Asp Ile Pro 2150 2155 2160 2165
CGT ACT CCA TCG CCA GCA CTT ATC TCG GTT ACT GAG AGC AGC TCA GAT 7001
Arg Thr Pro Ser Pro Ala Leu Ile Ser Val Thr Glu Ser Ser Ser Asp 2170 2175 2180
GAG AAG ACC CCG TCG GTG TCC TCC TCG CAG GAG GAT ACC CCG TCC TCT 7049
Glu Lys Thr Pro Ser Val Ser Ser Ser Gln Glu Asp Thr Pro Ser Ser 2185 2190 2195
GAC TCA TTC GAG GTC ATC CAA GAG TCC GAG ACA GCC GAA GGG GAG GAA 7097
Asp Ser Phe Glu Val Ile Gln Glu Ser Glu Thr Ala Glu Gly Glu Glu 2200 2205 2210
AGT GTC TTC AAC GTG GCT CTT TCC GTA TTA AAA GCC TTA TTT CCA CAG 7145
Ser Val Phe Asn Val Ala Leu Ser Val Leu Lys Ala Leu Phe Pro Gln 2215 2220 2225
AGC GAC GCG ACC AGG AAG CTT ACC GTC AAG ATG TCG TGC TGC GTT GAA 7193
Ser Asp Ala Thr Arg Lys Leu Thr Val Lys Met Ser Cys Cys Val Glu 2230 2235 2240 2245
AAG AGC GTC ACG CGC TTT TTC TCA TTG GGG TTG ACG GTG GCT GAT GTT 7241
Lys Ser Val Thr Arg Phe Phe Ser Leu Gly Leu Thr Val Ala Asp Val 2250 2255 2260
GCT AGC CTG TGT GAG ATG GAA ATC CAG AAC CAT ACA GCC TAT TGT GAC 7289
Ala Ser Leu Cys Glu Met Glu Ile Gln Asn His Thr Ala Tyr Cys Asp 2265 2270 2275
CAG GTG CGC ACT CCG CTT GAA TTG CAG GTT GGG TGC TTG GTG GGC AAT 7337
Gln Val Arg Thr Pro Leu Glu Leu Gln Val Gly Cys Leu Val Gly Asn 2280 2285 2290
GAA CTT ACC TTT GAA TGT GAC AAG TGT GAG GCT AGG CAA GAA ACC TTG 7385
Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg Gln Glu Thr Leu 2295 2300 2305
GCC TCC TTC TCT TAC ATT TGG TCT GGA GTG CCG CTG ACT AGG GCC ACG 7433
Ala Ser Phe Ser Tyr Ile Trp Ser Gly Val Pro Leu Thr Arg Ala Thr 2310 2315 2320 2325
CCG GCC AAG CCT CCC GTG GTG AGG CCG GTT GGC TCT TTG TTA GTG GCC 7481
Pro Ala Lys Pro Pro Val Val Arg Pro Val Gly Ser Leu Leu Val Ala 2330 2335 2340
GAC ACT ACT AAG GTG TAT GTT ACC AAT CCA GAC AAT GTG GGA CGG AGG 7529
Asp Thr Thr Lys Val Tyr Val Thr Asn Pro Asp Asn Val Gly Arg Arg 2345 2350 2355
GTG GAC AAG GTG ACC TTC TGG CGT GCT CCT AGG GTT CAT GAT AAG TAC 7577
Val Asp Lys Val Thr Phe Trp Arg Ala Pro Arg Val His Asp Lys Tyr 2360 2365 2370
CTC GTG GAC TCT ATT GAG CGC GCT AAG AGG GCC GCT CAA GCC TGC CTA 7625
Leu Val Asp Ser Ile Glu Arg Ala Lys Arg Ala Ala Gln Ala Cys Leu 2375 2380 2385
AGC ATG GGT TAC ACT TAT GAG GAA GCA ATA AGG ACT GTA AGG CCA CAT 7673
Ser Met Gly Tyr Thr Tyr Glu Glu Ala Ile Arg Thr Val Arg Pro His 2390 2395 2400 2405
GCT GCC ATG GGC TGG GGA TCT AAG GTG TCG GTT AAG GAC TTA GCC ACC 7721
Ala Ala Met Gly Trp Gly Ser Lys Val Ser Val Lys Asp Leu Ala Thr 2410 2415 2420
CCC GCG GGG AAG ATG GCC GTC CAT GAC CGG CTT CAG GAG ATA CTT GAA 7769
Pro Ala Gly Lys Met Ala Val His Asp Arg Leu Gln Glu Ile Leu Glu 2425 2430 2435
GGG ACT CCG GTC CCC TTT ACT CTT ACT GTG AAA AAG GAG GTG TTC TTC 7817
Gly Thr Pro Val Pro Phe Thr Leu Thr Val Lys Lys Glu Val Phe Phe 2440 2445 2450
AAA GAC CGG AAG GAG GAG AAG GCC CCC CGC CTC ATT GTG TTC CCC CCC 7865
Lys Asp Arg Lys Glu Glu Lys Ala Pro Arg Leu Ile Val Phe Pro Pro 2455 2460 2465
CTG GAC TTC CGG ATA GCT GAA AAG CTC ATC TTG GGA GAC CCA GGC CGG 7913
Leu Asp Phe Arg Ile Ala Glu Lys Leu Ile Leu Gly Asp Pro Gly Arg 2470 2475 2480 2485
GTA GCC AAG GCG GTG TTG GGG GGG GCC TAC GCC TTC CAG TAC ACC CCA 7961
Val Ala Lys Ala Val Leu Gly Gly Ala Tyr Ala Phe Gln Tyr Thr Pro 2490 2495 2500
AAT CAG CGA GTT AAG GAG ATG CTC AAG CTA TGG GAG TCT AAG AAG ACC 8009
Asn Gln Arg Val Lys Glu Met Leu Lys Leu Trp Glu Ser Lys Lys Thr 2505 2510 2515
CCT TGC GCC ATC TGT GTG GAC GCC ACC TGC TTC GAC AGT AGC ATA ACT 8057
Pro Cys Ala Ile Cys Val Asp Ala Thr Cys Phe Asp Ser Ser Ile Thr 2520 2525 2530
GAA GAG GAC GTG GCT TTG GAG ACA GAG CTA TAC GCT CTG GCC TCT GAC 8105
Glu Glu Asp Val Ala Leu Glu Thr Glu Leu Tyr Ala Leu Ala Ser Asp 2535 2540 2545
CAT CCA GAA TGG GTG CGG GCA CTT GGG AAA TAC TAT GCC TCA GGC ACC 8153
His Pro Glu Trp Val Arg Ala Leu Gly Lys Tyr Tyr Ala Ser Gly Thr 2550 2555 2560 2565
ATG GTC ACC CCG GAA GGG GTG CCC GTC GGT GAG AGG TAT TGC AGA TCC 8201
Met Val Thr Pro Glu Gly Val Pro Val Gly Glu Arg Tyr Cys Arg Ser 2570 2575 2580
TCG GGT GTC CTA ACA ACT AGC GCG AGC AAC TGC TTG ACC TGC TAC ATC 8249 Ser Gly Val Leu Thr Thr Ser Ala Ser Asn Cys Leu Thr Cys Tyr He 2585 2590 2595
AAG GTG AAA GCT GCC TGT GAG AGA GTG GGG CTG AAA AAT GTC TCT CTT 8297 Lys Val Lys Ala Ala Cys Glu Arg Val Gly Leu Lys Asn Val Ser Leu 2600 2605 2610
CTC ATA GCC GGC GAT GAC TGC TTG ATC ATA TGT GAG CGG CCA GTG TGC 8345 Leu He Ala Gly Asp Asp Cys Leu He He Cys Glu Arg Pro Val Cys 2615 2620 2625
GAC CCA AGC GAC GCT TTG GGC AGA GCC CTA GCG AGC TAT GGG TAC GCG 8393 Asp Pro Ser Asp Ala Leu Gly Arg Ala Leu Ala Ser Tyr Gly Tyr Ala 2630 2635 2640 2645
TGC GAG CCC TCA TAT CAT GCA TCA TTG GAC ACG GCC CCC TTC TGC TCC 8441 Cys Glu Pro Ser Tyr His Ala Ser Leu Asp Thr Ala Pro Phe Cys Ser 2650 2655 2660
ACT TGG CTT GCT GAG TGC AAT GCA GAT GGG AAG CGC CAT TTC TTC CTG 8489 Thr Trp Leu Ala Glu Cys Asn Ala Asp Gly Lys Arg His Phe Phe Leu 2665 2670 2675
ACC ACG GAC TTC CGG AGG CCG CTC GCT CGC ATG TCG AGT GAG TAT AGT 8537 Thr Thr Asp Phe Arg Arg Pro Leu Ala Arg Met Ser Ser Glu Tyr Ser 2680 2685 2690
GAC CCG ATG GCT TCG GCG ATC GGT TAC ATC CTC CTT TAT CCT TGG CAC 8585 Asp Pro Met Ala Ser Ala Ile Gly Tyr Ile Leu Leu Tyr Pro Trp His 2695 2700 2705
CCC ATC ACA CGG TGG GTC ATC ATC CCT CAT GTG CTA ACG TGC GCA TTC 8633 Pro Ile Thr Arg Trp Val Ile Ile Pro His Val Leu Thr Cys Ala Phe 2710 2715 2720 2725
AGG GGT GGA GGC ACA CCG TCT GAT CCG GTT TGG TGC CAG GTG CAT GGT 8681 Arg Gly Gly Gly Thr Pro Ser Asp Pro Val Trp Cys Gln Val His Gly 2730 2735 2740
AAC TAC TAC AAG TTT CCA CTG GAC AAA CTG CCT AAC ATC ATC GTG GCC 8729 Asn Tyr Tyr Lys Phe Pro Leu Asp Lys Leu Pro Asn Ile Ile Val Ala 2745 2750 2755
CTC CAC GGA CCA GCA GCG TTG AGG GTT ACC GCA GAC ACA ACT AAA ACA 8777 Leu His Gly Pro Ala Ala Leu Arg Val Thr Ala Asp Thr Thr Lys Thr 2760 2765 2770
AAG ATG GAG GCT GGT AAG GTT CTG AGC GAC CTC AAG CTC CCT GGC TTA 8825 Lys Met Glu Ala Gly Lys Val Leu Ser Asp Leu Lys Leu Pro Gly Leu 2775 2780 2785
GCA GTC CAC CGA AAG AAG GCC GGG GCG TTG CGA ACA CGC ATG CTC CGC 8873 Ala Val His Arg Lys Lys Ala Gly Ala Leu Arg Thr Arg Met Leu Arg 2790 2795 2800 2805
TCG CGC GGT TGG GCT GAG TTG GCT AGG GGC TTG TTG TGG CAT CCA GGC 8921 Ser Arg Gly Trp Ala Glu Leu Ala Arg Gly Leu Leu Trp His Pro Gly 2810 2815 2820
CTA CGG CTT CCT CCC CCT GAG ATT GCT GGT ATC CCG GGG GGT TTC CCT 8969 Leu Arg Leu Pro Pro Pro Glu Ile Ala Gly Ile Pro Gly Gly Phe Pro 2825 2830 2835
CTC TCC CCC CCC TAT ATG GGG GTG GTA CAT CAA TTG GAT TTC ACA AGC 9017 Leu Ser Pro Pro Tyr Met Gly Val Val His Gln Leu Asp Phe Thr Ser 2840 2845 2850
CAG AGG AGT CGC TGG CGG TGG TTG GGG TTC TTA GCC CTG CTC ATC GTA 9065 Gln Arg Ser Arg Trp Arg Trp Leu Gly Phe Leu Ala Leu Leu Ile Val 2855 2860 2865
GCC CTC TTC GGG TGAACTAAAT TCATCTGTTG CGGCAAGGTC TGGTGACTGA 9117
Ala Leu Phe Gly
2870
TCATCACCGG AGGAGGTTCC CGCCCTCCCC GCCCCAGGGG TCTCCCCGCT GGGTAAAAAG 9177
GGCCCGGCCT TGGGAGGCAT GGTGGTTACT AACCCCCTGG CAGGGTCAAA GCCTGATGGT 9237
GCTAATGCAC TGCCACTTCG GTGGCGGGTC GCTACCTTAT AGCGTAATCC GTGACTACGG 9297
GCTGCTCGCA GAGCCCTCCC CGGATGGGGC ACAGTGCACT GTGATCTGAA GGGGTGCACC 9357
CCGGGAAGAG CTCGGCCCGA AGGCCGGTTC TACT 9391
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2873 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Met Gly Pro Pro Ser Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Arg 1 5 10 15
Ile Leu Arg Val Arg Ala Gly Gly Ile Ser Phe Phe Tyr Thr Ile Met 20 25 30
Ala Val Leu Leu Leu Leu Leu Val Val Glu Ala Gly Ala Ile Leu Ala 35 40 45
Pro Ala Thr His Ala Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn 50 55 60
Cys Cys Ala Pro Glu Asp Ile Gly Phe Cys Leu Glu Gly Gly Cys Leu 65 70 75 80
Val Ala Leu Gly Cys Thr Ile Cys Thr Asp Gln Cys Trp Pro Leu Tyr 85 90 95
Gln Ala Gly Leu Ala Val Arg Pro Gly Lys Ser Ala Ala Gln Leu Val 100 105 110
Gly Glu Leu Gly Ser Leu Tyr Gly Pro Leu Ser Val Ser Ala Tyr Val 115 120 125
Ala Gly Ile Leu Gly Leu Gly Glu Val Tyr Ser Gly Val Leu Thr Val 130 135 140
Gly Val Ala Leu Thr Arg Arg Val Tyr Pro Val Pro Asn Leu Thr Cys 145 150 155 160
Ala Val Ala Cys Glu Leu Lys Trp Glu Ser Glu Phe Trp Arg Trp Thr 165 170 175
Glu Gln Leu Ala Ser Asn Tyr Trp Ile Leu Glu Tyr Leu Trp Lys Val 180 185 190
Pro Phe Asp Phe Trp Arg Gly Val Ile Ser Leu Thr Pro Leu Leu Val 195 200 205
Cys Val Ala Ala Leu Leu Leu Leu Glu Gln Arg Ile Val Met Val Phe 210 215 220
Leu Leu Val Thr Met Ala Gly Met Ser Gln Gly Ala Pro Ala Ser Val 225 230 235 240
Leu Gly Ser Arg Pro Phe Asp Tyr Gly Leu Thr Trp Gln Thr Cys Ser 245 250 255
Cys Arg Ala Asn Gly Ser Arg Phe Ser Thr Gly Glu Lys Val Trp Asp 260 265 270
Arg Gly Asn Val Thr Leu Gln Cys Asp Cys Pro Asn Gly Pro Trp Val 275 280 285
Trp Leu Pro Ala Phe Cys Gln Ala Ile Gly Trp Gly Asp Pro Ile Thr 290 295 300
Tyr Trp Ser His Gly Gln Asn Gln Trp Pro Leu Ser Cys Pro Gln Tyr 305 310 315 320
Val Tyr Gly Ser Ala Thr Val Thr Cys Val Trp Gly Ser Ala Ser Trp 325 330 335
Phe Ala Ser Thr Ser Gly Arg Asp Ser Lys Ile Asp Val Trp Ser Leu 340 345 350
Val Pro Val Gly Ser Ala Thr Cys Thr Ile Ala Ala Leu Gly Ser Ser 355 360 365
Asp Arg Asp Thr Val Pro Gly Leu Ser Glu Trp Gly Ile Pro Cys Val 370 375 380
Thr Cys Val Leu Asp Arg Arg Pro Ala Ser Cys Gly Thr Cys Val Arg 385 390 395 400
Asp Cys Trp Pro Glu Thr Gly Ser Val Arg Phe Pro Phe His Arg Cys 405 410 415
Gly Val Gly Pro Arg Leu Thr Lys Asp Leu Glu Ala Val Pro Phe Val 420 425 430
Asn Arg Thr Thr Pro Phe Thr Ile Arg Gly Pro Leu Gly Asn Gln Gly 435 440 445
Arg Gly Asn Pro Val Arg Ser Pro Leu Gly Phe Gly Ser Tyr Ala Met 450 455 460
Thr Arg Ile Arg Asp Thr Leu His Leu Val Glu Cys Pro Thr Pro Ala 465 470 475 480
Ile Glu Pro Pro Thr Gly Thr Phe Gly Phe Phe Pro Gly Thr Pro Pro 485 490 495
Leu Asn Asn Cys Met Leu Leu Gly Thr Glu Val Ser Glu Ala Leu Gly 500 505 510
Gly Ala Gly Leu Thr Gly Gly Phe Tyr Glu Pro Leu Val Arg Arg Cys 515 520 525
Ser Lys Leu Met Gly Ser Arg Asn Pro Val Cys Pro Gly Phe Ala Trp 530 535 540
Leu Ser Ser Gly Arg Pro Asp Gly Phe Ile His Val Gln Gly His Leu 545 550 555 560
Gln Glu Val Asp Ala Gly Asn Phe Ile Pro Pro Pro Arg Trp Leu Leu 565 570 575
Leu Asp Phe Val Phe Val Leu Leu Tyr Leu Met Lys Leu Ala Glu Ala 580 585 590
Arg Leu Val Pro Leu Ile Leu Leu Leu Leu Trp Trp Trp Val Asn Gln 595 600 605
Leu Ala Val Leu Gly Leu Pro Ala Val Glu Ala Ala Val Ala Gly Glu 610 615 620
Val Phe Ala Gly Pro Ala Leu Ser Trp Cys Leu Gly Leu Pro Val Val 625 630 635 640
Ser Met Ile Leu Gly Leu Ala Asn Leu Val Leu Tyr Phe Arg Trp Leu 645 650 655
Gly Pro Gln Arg Leu Met Phe Leu Val Leu Trp Lys Leu Ala Arg Gly 660 665 670
Ala Phe Pro Leu Ala Leu Leu Met Gly Ile Ser Ala Thr Arg Gly Arg 675 680 685
Thr Ser Val Leu Gly Ala Glu Phe Cys Phe Asp Ala Thr Phe Glu Val 690 695 700
Asp Thr Ser Val Leu Gly Trp Val Val Ala Ser Val Val Ala Trp Ala 705 710 715 720
Ile Ala Leu Leu Ser Ser Met Ser Ala Gly Gly Trp Arg His Lys Ala 725 730 735
Val Ile Tyr Arg Thr Trp Cys Lys Gly Tyr Gln Ala Ile Arg Gln Arg 740 745 750
Val Val Arg Ser Pro Leu Gly Glu Gly Arg Pro Ala Lys Pro Leu Thr 755 760 765
Phe Ala Trp Cys Leu Ala Ser Tyr Ile Trp Pro Asp Ala Val Met Met 770 775 780
Val Val Val Ala Leu Val Leu Leu Phe Gly Leu Phe Asp Ala Leu Asp 785 790 795 800
Trp Ala Leu Glu Glu Ile Leu Val Ser Arg Pro Ser Leu Arg Arg Leu 805 810 815
Ala Arg Val Val Glu Cys Cys Val Met Ala Gly Glu Lys Ala Thr Thr 820 825 830
Val Arg Leu Val Ser Lys Met Cys Ala Arg Gly Ala Tyr Leu Phe Asp 835 840 845
His Met Gly Ser Phe Ser Arg Ala Val Lys Glu Arg Leu Leu Glu Trp 850 855 860
Asp Ala Ala Leu Glu Pro Leu Ser Phe Thr Arg Thr Asp Cys Arg Ile 865 870 875 880
Ile Arg Asp Ala Ala Arg Thr Leu Ser Cys Gly Gln Cys Val Met Gly 885 890 895
Leu Pro Val Val Ala Arg Arg Gly Asp Glu Val Leu Ile Gly Val Phe 900 905 910
Gln Asp Val Asn His Leu Pro Pro Gly Phe Val Pro Thr Ala Pro Val 915 920 925
Val Ile Arg Arg Cys Gly Lys Gly Phe Leu Gly Val Thr Lys Ala Ala 930 935 940
Leu Thr Gly Arg Asp Pro Asp Leu His Pro Gly Asn Val Met Val Leu 945 950 955 960
Gly Thr Ala Thr Ser Arg Ser Met Gly Thr Cys Leu Asn Gly Leu Leu 965 970 975
Phe Thr Thr Phe His Gly Ala Ser Ser Arg Thr Ile Ala Thr Pro Val 980 985 990
Gly Ala Leu Asn Pro Arg Trp Trp Ser Ala Ser Asp Asp Val Thr Val 995 1000 1005
Tyr Pro Leu Pro Asp Gly Ala Thr Ser Leu Thr Pro Cys Thr Cys Gln 1010 1015 1020
Ala Glu Ser Cys Trp Val Ile Arg Ser Asp Gly Ala Leu Cys His Gly 1025 1030 1035 1040
Leu Ser Lys Gly Asp Lys Val Glu Leu Asp Val Ala Met Glu Val Ser 1045 1050 1055
Asp Phe Arg Gly Ser Ser Gly Ser Pro Val Leu Cys Asp Glu Gly His 1060 1065 1070
Ala Val Gly Met Leu Val Ser Val Leu His Ser Gly Gly Arg Val Thr 1075 1080 1085
Ala Ala Arg Phe Thr Arg Pro Trp Thr Gln Val Pro Thr Asp Ala Lys 1090 1095 1100
Thr Thr Thr Glu Pro Pro Pro Val Pro Ala Lys Gly Val Phe Lys Glu 1105 1110 1115 1120
Ala Pro Leu Phe Met Pro Thr Gly Ala Gly Lys Ser Thr Arg Val Pro 1125 1130 1135
Leu Glu Tyr Asp Asn Met Gly His Lys Val Leu Ile Leu Asn Pro Ser 1140 1145 1150
Val Ala Thr Val Arg Ala Met Gly Pro Tyr Met Glu Arg Leu Ala Gly 1155 1160 1165
Lys His Pro Ser Ile Tyr Cys Gly His Asp Thr Thr Ala Phe Thr Arg 1170 1175 1180
Ile Thr Asp Ser Pro Leu Thr Tyr Ser Thr Tyr Gly Arg Phe Leu Ala 1185 1190 1195 1200
Asn Pro Arg Gln Met Leu Arg Gly Val Ser Val Val Ile Cys Asp Glu 1205 1210 1215
Cys His Ser His Asp Ser Thr Val Leu Leu Gly Ile Gly Arg Val Arg 1220 1225 1230
Glu Leu Ala Arg Gly Cys Gly Val Gln Leu Val Leu Tyr Ala Thr Ala 1235 1240 1245
Thr Pro Pro Gly Ser Pro Met Thr Gln His Pro Ser Ile Ile Glu Thr 1250 1255 1260
Lys Leu Asp Val Gly Glu Ile Pro Phe Tyr Gly His Gly Ile Pro Leu 1265 1270 1275 1280
Glu Arg Met Arg Thr Gly Arg His Leu Val Phe Cys His Ser Lys Ala 1285 1290 1295
Glu Cys Glu Arg Leu Ala Gly Gln Phe Ser Ala Arg Gly Val Asn Ala 1300 1305 1310
Ile Ala Tyr Tyr Arg Gly Lys Asp Ser Ser Ile Ile Lys Asp Gly Asp 1315 1320 1325
Leu Val Val Cys Ala Thr Asp Ala Leu Ser Thr Gly Tyr Thr Gly Asn 1330 1335 1340
Phe Asp Ser Val Thr Asp Cys Gly Leu Val Val Glu Glu Val Val Glu 1345 1350 1355 1360
Val Thr Leu Asp Pro Thr Ile Thr Ile Ser Leu Arg Thr Val Pro Ala 1365 1370 1375
Ser Ala Glu Leu Ser Met Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg 1380 1385 1390
Ser Gly Arg Tyr Tyr Tyr Ala Gly Val Gly Lys Ala Pro Ala Gly Val 1395 1400 1405
Val Arg Ser Gly Pro Val Trp Ser Ala Val Glu Ala Gly Val Thr Trp 1410 1415 1420
Tyr Gly Met Glu Pro Asp Leu Thr Ala Asn Leu Leu Arg Leu Tyr Asp 1425 1430 1435 1440
Asp Cys Pro Tyr Thr Ala Ala Val Ala Ala Asp Ile Gly Glu Ala Ala 1445 1450 1455
Val Phe Phe Ser Gly Leu Ala Pro Leu Arg Met His Pro Asp Val Ser 1460 1465 1470
Trp Ala Lys Val Arg Gly Val Asn Trp Pro Leu Leu Val Gly Val Gln 1475 1480 1485
Arg Thr Met Cys Arg Glu Thr Leu Ser Pro Gly Pro Ser Asp Asp Pro 1490 1495 1500
Gln Trp Ala Gly Leu Lys Gly Pro Asn Pro Val Pro Leu Leu Leu Arg 1505 1510 1515 1520
Trp Gly Asn Asp Leu Pro Ser Lys Val Ala Gly His His Ile Val Asp 1525 1530 1535
Asp Leu Val Arg Arg Leu Gly Val Ala Glu Gly Tyr Val Arg Cys Asp 1540 1545 1550
Ala Gly Pro Ile Leu Met Ile Gly Leu Ala Ile Ala Gly Gly Met Ile 1555 1560 1565
Tyr Ala Ser Tyr Thr Gly Ser Leu Val Val Val Thr Asp Trp Asp Val 1570 1575 1580
Lys Gly Gly Gly Ala Pro Leu Tyr Arg His Gly Asp Gln Ala Thr Pro 1585 1590 1595 1600
Gln Pro Val Val Gln Val Pro Pro Val Asp His Arg Pro Gly Gly Glu 1605 1610 1615
Ser Ala Pro Ser Asp Ala Lys Thr Val Thr Asp Ala Val Ala Ala Ile 1620 1625 1630
Gln Val Asp Cys Asp Trp Thr Ile Met Thr Leu Ser Ile Gly Glu Val 1635 1640 1645
Leu Ser Leu Ala Gln Ala Lys Thr Ala Glu Ala Tyr Thr Ala Thr Ala 1650 1655 1660
Lys Trp Leu Ala Gly Cys Tyr Thr Gly Thr Arg Ala Val Pro Thr Val 1665 1670 1675 1680
Ser Ile Val Asp Lys Leu Phe Ala Gly Gly Trp Ala Ala Val Val Gly 1685 1690 1695
His Cys His Ser Val Ile Ala Ala Ala Val Ala Ala Tyr Gly Ala Ser 1700 1705 1710
Arg Ser Pro Pro Leu Ala Ala Ala Ala Ser Tyr Leu Met Gly Leu Gly 1715 1720 1725
Val Gly Gly Asn Ala Gln Thr Arg Leu Ala Ser Ala Leu Leu Leu Gly 1730 1735 1740
Ala Ala Gly Thr Ala Leu Gly Thr Pro Val Val Gly Leu Thr Met Ala 1745 1750 1755 1760
Gly Ala Phe Met Gly Gly Ala Ser Val Ser Pro Ser Leu Val Thr Ile 1765 1770 1775
Leu Leu Gly Ala Val Gly Gly Trp Glu Gly Val Val Asn Ala Ala Ser 1780 1785 1790
Leu Val Phe Asp Phe Met Ala Gly Lys Leu Ser Ser Glu Asp Leu Trp 1795 1800 1805
Tyr Ala Ile Pro Val Leu Thr Ser Pro Gly Ala Gly Leu Ala Gly Ile 1810 1815 1820
Ala Leu Gly Leu Val Leu Tyr Ser Ala Asn Asn Ser Gly Thr Thr Thr 1825 1830 1835 1840
Trp Leu Asn Arg Leu Leu Thr Thr Leu Pro Arg Ser Ser Cys Ile Pro 1845 1850 1855
Asp Ser Tyr Phe Gln Gln Val Asp Tyr Cys Asp Lys Val Ser Ala Val 1860 1865 1870
Leu Arg Arg Leu Ser Leu Thr Arg Thr Val Val Ala Leu Val Asn Arg 1875 1880 1885
Glu Pro Lys Val Asp Glu Val Gln Val Gly Tyr Val Trp Asp Leu Trp 1890 1895 1900
Glu Trp Ile Met Arg Gln Val Arg Val Val Met Ala Arg Leu Arg Ala 1905 1910 1915 1920
Leu Cys Pro Val Val Ser Leu Pro Leu Trp His Cys Gly Glu Gly Trp 1925 1930 1935
Ser Gly Glu Trp Leu Leu Asp Gly His Val Glu Ser Arg Cys Leu Cys 1940 1945 1950
Gly Cys Val Ile Thr Gly Asp Val Leu Asn Gly Gln Leu Lys Glu Pro 1955 1960 1965
Val Tyr Ser Thr Lys Leu Cys Arg His Tyr Trp Met Gly Thr Val Pro 1970 1975 1980
Val Asn Met Leu Gly Tyr Gly Glu Thr Ser Pro Leu Leu Ala Ser Asp 1985 1990 1995 2000
Thr Pro Lys Val Val Pro Phe Gly Thr Ser Gly Trp Ala Glu Val Val 2005 2010 2015
Val Thr Thr Thr His Val Val Ile Arg Arg Thr Ser Ala Tyr Lys Leu 2020 2025 2030
Leu Arg Gln Gln Ile Leu Ser Ala Ala Val Ala Glu Pro Tyr Tyr Val 2035 2040 2045
Asp Gly Ile Pro Val Ser Trp Asp Ala Asp Ala Arg Ala Pro Ala Met 2050 2055 2060
Val Tyr Gly Pro Gly Gln Ser Val Thr Ile Asp Gly Glu Arg Tyr Thr 2065 2070 2075 2080
Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala Pro Ser Glu Val Ser 2085 2090 2095
Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Thr Glu Asp Ser Glu Leu 2100 2105 2110
Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala Leu Gln Ala Ile Glu 2115 2120 2125
Asn Ala Ala Arg Ile Leu Glu Pro His Ile Asp Val Ile Met Glu Asp 2130 2135 2140
Cys Ser Thr Pro Ser Leu Cys Gly Ser Ser Arg Glu Met Pro Val Trp 2145 2150 2155 2160
Gly Glu Asp Ile Pro Arg Thr Pro Ser Pro Ala Leu Ile Ser Val Thr 2165 2170 2175
Glu Ser Ser Ser Asp Glu Lys Thr Pro Ser Val Ser Ser Ser Gln Glu 2180 2185 2190
Asp Thr Pro Ser Ser Asp Ser Phe Glu Val Ile Gln Glu Ser Glu Thr 2195 2200 2205
Ala Glu Gly Glu Glu Ser Val Phe Asn Val Ala Leu Ser Val Leu Lys 2210 2215 2220
Ala Leu Phe Pro Gln Ser Asp Ala Thr Arg Lys Leu Thr Val Lys Met 2225 2230 2235 2240
Ser Cys Cys Val Glu Lys Ser Val Thr Arg Phe Phe Ser Leu Gly Leu 2245 2250 2255
Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu Ile Gln Asn His 2260 2265 2270
Thr Ala Tyr Cys Asp Gln Val Arg Thr Pro Leu Glu Leu Gln Val Gly 2275 2280 2285
Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala 2290 2295 2300
Arg Gln Glu Thr Leu Ala Ser Phe Ser Tyr Ile Trp Ser Gly Val Pro 2305 2310 2315 2320
Leu Thr Arg Ala Thr Pro Ala Lys Pro Pro Val Val Arg Pro Val Gly 2325 2330 2335
Ser Leu Leu Val Ala Asp Thr Thr Lys Val Tyr Val Thr Asn Pro Asp 2340 2345 2350
Asn Val Gly Arg Arg Val Asp Lys Val Thr Phe Trp Arg Ala Pro Arg 2355 2360 2365
Val His Asp Lys Tyr Leu Val Asp Ser Ile Glu Arg Ala Lys Arg Ala 2370 2375 2380
Ala Gln Ala Cys Leu Ser Met Gly Tyr Thr Tyr Glu Glu Ala Ile Arg 2385 2390 2395 2400
Thr Val Arg Pro His Ala Ala Met Gly Trp Gly Ser Lys Val Ser Val 2405 2410 2415
Lys Asp Leu Ala Thr Pro Ala Gly Lys Met Ala Val His Asp Arg Leu 2420 2425 2430
Gln Glu Ile Leu Glu Gly Thr Pro Val Pro Phe Thr Leu Thr Val Lys 2435 2440 2445
Lys Glu Val Phe Phe Lys Asp Arg Lys Glu Glu Lys Ala Pro Arg Leu 2450 2455 2460
Ile Val Phe Pro Pro Leu Asp Phe Arg Ile Ala Glu Lys Leu Ile Leu 2465 2470 2475 2480
Gly Asp Pro Gly Arg Val Ala Lys Ala Val Leu Gly Gly Ala Tyr Ala 2485 2490 2495
Phe Gln Tyr Thr Pro Asn Gln Arg Val Lys Glu Met Leu Lys Leu Trp 2500 2505 2510
Glu Ser Lys Lys Thr Pro Cys Ala Ile Cys Val Asp Ala Thr Cys Phe 2515 2520 2525
Asp Ser Ser Ile Thr Glu Glu Asp Val Ala Leu Glu Thr Glu Leu Tyr 2530 2535 2540
Ala Leu Ala Ser Asp His Pro Glu Trp Val Arg Ala Leu Gly Lys Tyr 2545 2550 2555 2560
Tyr Ala Ser Gly Thr Met Val Thr Pro Glu Gly Val Pro Val Gly Glu 2565 2570 2575
Arg Tyr Cys Arg Ser Ser Gly Val Leu Thr Thr Ser Ala Ser Asn Cys 2580 2585 2590
Leu Thr Cys Tyr Ile Lys Val Lys Ala Ala Cys Glu Arg Val Gly Leu 2595 2600 2605
Lys Asn Val Ser Leu Leu Ile Ala Gly Asp Asp Cys Leu Ile Ile Cys 2610 2615 2620
Glu Arg Pro Val Cys Asp Pro Ser Asp Ala Leu Gly Arg Ala Leu Ala 2625 2630 2635 2640
Ser Tyr Gly Tyr Ala Cys Glu Pro Ser Tyr His Ala Ser Leu Asp Thr 2645 2650 2655
Ala Pro Phe Cys Ser Thr Trp Leu Ala Glu Cys Asn Ala Asp Gly Lys 2660 2665 2670
Arg His Phe Phe Leu Thr Thr Asp Phe Arg Arg Pro Leu Ala Arg Met 2675 2680 2685
Ser Ser Glu Tyr Ser Asp Pro Met Ala Ser Ala Ile Gly Tyr Ile Leu 2690 2695 2700
Leu Tyr Pro Trp His Pro Ile Thr Arg Trp Val Ile Ile Pro His Val 2705 2710 2715 2720
Leu Thr Cys Ala Phe Arg Gly Gly Gly Thr Pro Ser Asp Pro Val Trp 2725 2730 2735
Cys Gln Val His Gly Asn Tyr Tyr Lys Phe Pro Leu Asp Lys Leu Pro 2740 2745 2750
Asn Ile Ile Val Ala Leu His Gly Pro Ala Ala Leu Arg Val Thr Ala 2755 2760 2765
Asp Thr Thr Lys Thr Lys Met Glu Ala Gly Lys Val Leu Ser Asp Leu 2770 2775 2780
Lys Leu Pro Gly Leu Ala Val His Arg Lys Lys Ala Gly Ala Leu Arg 2785 2790 2795 2800
Thr Arg Met Leu Arg Ser Arg Gly Trp Ala Glu Leu Ala Arg Gly Leu 2805 2810 2815
Leu Trp His Pro Gly Leu Arg Leu Pro Pro Pro Glu Ile Ala Gly Ile 2820 2825 2830
Pro Gly Gly Phe Pro Leu Ser Pro Pro Tyr Met Gly Val Val His Gln 2835 2840 2845
Leu Asp Phe Thr Ser Gln Arg Ser Arg Trp Arg Trp Leu Gly Phe Leu 2850 2855 2860
Ala Leu Leu Ile Val Ala Leu Phe Gly 2865 2870
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: PROBE 470-20-1-152F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
TCGGTTACTG AGAGCAGCTC AGATGAG 27
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: JML-A, PRIMER
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
AGGAATTCAG CGGCCGCGAG 20
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: JML-B, PRIMER
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
CTCGCGGCCG CTGAATTCCT TT 22
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 203 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470-20-1 CLONE, WITHOUT SISPA LINKERS
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..203
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
G GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC GGG 46
Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly 1 5 10 15
GTA TCC TCC TGC GAG GAG GAC ACC GGC GGG GTC TTC TCA TCT GAG CTG 94
Val Ser Ser Cys Glu Glu Asp Thr Gly Gly Val Phe Ser Ser Glu Leu 20 25 30
CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA GTA CGG GGG ATG TCT 142
Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met Ser 35 40 45
TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA CAA AGA GAG GGT GTA 190
Ser Pro His Thr Gly Ile Ser Arg Leu Leu Pro Gln Arg Glu Gly Val 50 55 60
CTG CAG TCC TCC A 203
Leu Gln Ser Ser 65
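The CDS feature of SEQ ID NO:19 (LOCATION: 2..203) can be checked against the printed residues with a short script. The following is only an illustrative sketch and is not part of the original listing: the helper name translate and the constant names are hypothetical, and it assumes the standard genetic code with the reading frame starting at nucleotide 2. The trailing single nucleotide at position 203 does not complete a codon and is ignored.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, start: int = 1) -> str:
    """Translate DNA into one-letter residues, reading from 1-based position start.

    Spaces are ignored; an incomplete codon at the 3' end is dropped.
    """
    seq = dna.replace(" ", "").upper()
    offset = start - 1
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Nucleotide sequence of SEQ ID NO:19, copied from the listing above.
SEQ_ID_19 = (
    "G GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC GGG "
    "GTA TCC TCC TGC GAG GAG GAC ACC GGC GGG GTC TTC TCA TCT GAG CTG "
    "CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA GTA CGG GGG ATG TCT "
    "TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA CAA AGA GAG GGT GTA "
    "CTG CAG TCC TCC A"
)

# The CDS starts at nucleotide 2, so the leading G is skipped.
protein = translate(SEQ_ID_19, start=2)
print(len(protein), protein)
```

The result can be compared residue by residue with SEQ ID NO:20; for example, the ATA codon in the third row corresponds to isoleucine and the CAA and CAG codons to glutamine.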
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 67 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly Val 1 5 10 15
Ser Ser Cys Glu Glu Asp Thr Gly Gly Val Phe Ser Ser Glu Leu Leu 20 25 30
Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met Ser Ser 35 40 45
Pro His Thr Gly Ile Ser Arg Leu Leu Pro Gln Arg Glu Gly Val Leu 50 55 60
Gln Ser Ser 65
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470-20-1-152R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CTCATCTGAG CTGCTCTCAG TAACCGA 27
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: OLIGONUCLEOTIDE B
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
CTGTCTCGGA CTCTTGGATG ACCT 24
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: COGNATE OLIGONUCLEOTIDE 211R'
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
ATACCCCGTC CTCTGACTCA TTCG 24
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: COGNATE OLIGONUCLEOTIDE B'
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
AGGTCATCCA AGAGTCCGAG ACAG 24
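Several oligonucleotides in this listing appear to be strand-complementary pairs (for example SEQ ID NO:16 with SEQ ID NO:21, and SEQ ID NO:22 with SEQ ID NO:24). A minimal sketch of how that relationship could be checked is given below; the function and constant names are hypothetical and are not part of the original listing, and the sequences are copied from the entries above.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (spaces are ignored)."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


def is_reverse_complement_pair(first: str, second: str) -> bool:
    """True if second reads as the opposite strand of first."""
    return reverse_complement(first) == second.replace(" ", "").upper()


# Sequences copied from the listing above.
SEQ_ID_16 = "TCGGTTACTG AGAGCAGCTC AGATGAG"  # probe 470-20-1-152F
SEQ_ID_21 = "CTCATCTGAG CTGCTCTCAG TAACCGA"  # 470-20-1-152R
SEQ_ID_22 = "CTGTCTCGGA CTCTTGGATG ACCT"     # oligonucleotide B
SEQ_ID_24 = "AGGTCATCCA AGAGTCCGAG ACAG"     # cognate oligonucleotide B'

print(is_reverse_complement_pair(SEQ_ID_16, SEQ_ID_21))
print(is_reverse_complement_pair(SEQ_ID_22, SEQ_ID_24))
```

The same check can be applied to any other forward/reverse pair in the listing by substituting the corresponding sequences.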
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: LAMBDA GT 11 FORWARD PRIMER, 20mer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
CACATGGCTG AATATCGACG 20
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: PROBE 470-201-1-142R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
TCGGTTACTG AGAGCAGCTC AGATGAG 27
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: PROBE 470-20-1-152F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
TCGGTTACTG AGAGCAGCTC AGATGAG 27
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 570 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone 470EXP1
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..570
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
GCT GTA TGG TTC TGG ATT TCC ATC TCA CAC AGG CTA GCA ACA TCA GCC 48 Ala Val Trp Phe Trp Ile Ser Ile Ser His Arg Leu Ala Thr Ser Ala 1 5 10 15
ACC GTC AAC CCC AAT GAG AAA AAG CGC GTG ACG CTC TTT TCA ACG CAG 96 Thr Val Asn Pro Asn Glu Lys Lys Arg Val Thr Leu Phe Ser Thr Gln 20 25 30
CAC GAC ATC TTG ACG GTA AGC TTC CTG GTC GCG TCG CTC TGT GGA AAT 144 His Asp Ile Leu Thr Val Ser Phe Leu Val Ala Ser Leu Cys Gly Asn 35 40 45
AAG GCT TTT AAT ACG GAA AGA GCC ACG TTG AAG ACA CTT TCC TCC CCT 192 Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser Pro 50 55 60
TCG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC GGG 240 Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly 65 70 75 80
GTA TCC TCC TGC GAG GAG GAC ACC GAC GGG GTC TTC TCA TCT GAG CTG 288 Val Ser Ser Cys Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu Leu 85 90 95
CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA GTA CGG GGG ATG TCT 336 Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met Ser 100 105 110
TCT CCC CAT ACA GGC ATC TCT CGG CTA CTA CCA CAA AGA GAG GGT GTA 384 Ser Pro His Thr Gly Ile Ser Arg Leu Leu Pro Gln Arg Glu Gly Val 115 120 125
CTG CAG TCC TCC ATG ATG ACA TCA ATG TGC GGT TCA AGA ATC CTC GCA 432 Leu Gln Ser Ser Met Met Thr Ser Met Cys Gly Ser Arg Ile Leu Ala 130 135 140
GCA TTC TCG ATC GCT TGG AGA GCA GCA GCC GCC GGC GGC AGA TCG GCC 480 Ala Phe Ser Ile Ala Trp Arg Ala Ala Ala Ala Gly Gly Arg Ser Ala 145 150 155 160
TCA GTC AGT TCT GAG TCT TCA GTC TCC GTC CCA ATG TCA ATG GAC ACC 528 Ser Val Ser Ser Glu Ser Ser Val Ser Val Pro Met Ser Met Asp Thr 165 170 175
TCG GAT GAA ACC TCA GAG GGT GCC ACA TTC CTG AGC CTC AGT 570
Ser Asp Glu Thr Ser Glu Gly Ala Thr Phe Leu Ser Leu Ser 180 185 190
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 190 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Ala Val Trp Phe Trp Ile Ser Ile Ser His Arg Leu Ala Thr Ser Ala 1 5 10 15
Thr Val Asn Pro Asn Glu Lys Lys Arg Val Thr Leu Phe Ser Thr Gln 20 25 30
His Asp Ile Leu Thr Val Ser Phe Leu Val Ala Ser Leu Cys Gly Asn 35 40 45
Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser Pro 50 55 60
Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly 65 70 75 80
Val Ser Ser Cys Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu Leu 85 90 95
Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met Ser 100 105 110
Ser Pro His Thr Gly Ile Ser Arg Leu Leu Pro Gln Arg Glu Gly Val 115 120 125
Leu Gln Ser Ser Met Met Thr Ser Met Cys Gly Ser Arg Ile Leu Ala 130 135 140
Ala Phe Ser Ile Ala Trp Arg Ala Ala Ala Ala Gly Gly Arg Ser Ala 145 150 155 160
Ser Val Ser Ser Glu Ser Ser Val Ser Val Pro Met Ser Met Asp Thr 165 170 175
Ser Asp Glu Thr Ser Glu Gly Ala Thr Phe Leu Ser Leu Ser 180 185 190
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE-3F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
GCCGCCATGG TCTCATGGGA CGCGGACGCT CGTGCGCCCG CGATG 45
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE-3R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
GCGCGGATCC GATAAGTGCT GGCGATGGAG TACG 34
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE-9F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GGCACCATGG TCACCCCGGA AG 22
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE-9R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GCTCGGATCC GGAGCAGAAG GGGGCCGT 28
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 364 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: GE3-2
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 2..364
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
G GTC TCA TGG GAC GCG GAC GCT CGT GCG CCC GCG ATG GTC TAT GGC 46
Val Ser Trp Asp Ala Asp Ala Arg Ala Pro Ala Met Val Tyr Gly 1 5 10 15
CCT GGG CAA AGT GTT ACC ATT GAC GGG GAG CGC TAC ACC TTG CCT CAT 94 Pro Gly Gln Ser Val Thr Ile Asp Gly Glu Arg Tyr Thr Leu Pro His 20 25 30
CAA CTG AGG CTC AGG AAT GTG GCA CCC TCT GAG GTT TCA TCC GAG GTG 142 Gln Leu Arg Leu Arg Asn Val Ala Pro Ser Glu Val Ser Ser Glu Val 35 40 45
TCC ATT GAC ATT GGG ACG GAG ACT GAA GAC TCA GAA CTG ACT GAG GCC 190 Ser Ile Asp Ile Gly Thr Glu Thr Glu Asp Ser Glu Leu Thr Glu Ala 50 55 60
GAT CTG CCG CCG GCG GCT GCT GCT CTC CAA GCG ATC GAG AAT GCT GCG 238 Asp Leu Pro Pro Ala Ala Ala Ala Leu Gln Ala Ile Glu Asn Ala Ala 65 70 75
AGG ATT CTT GAA CCG CAC ATT GAT GTC ATC ATG GAG GAC TGC AGT ACA 286 Arg Ile Leu Glu Pro His Ile Asp Val Ile Met Glu Asp Cys Ser Thr 80 85 90 95
CCC TCT CTT TGT GGT AGT AGC CGA GAG ATG CCT GTA TGG GGA GAA GAC 334 Pro Ser Leu Cys Gly Ser Ser Arg Glu Met Pro Val Trp Gly Glu Asp 100 105 110
ATC CCC CGT ACT CCA TCG CCA GCA CTT ATC 364
Ile Pro Arg Thr Pro Ser Pro Ala Leu Ile 115 120
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 121 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Val Ser Trp Asp Ala Asp Ala Arg Ala Pro Ala Met Val Tyr Gly Pro 1 5 10 15
Gly Gln Ser Val Thr Ile Asp Gly Glu Arg Tyr Thr Leu Pro His Gln 20 25 30
Leu Arg Leu Arg Asn Val Ala Pro Ser Glu Val Ser Ser Glu Val Ser 35 40 45
Ile Asp Ile Gly Thr Glu Thr Glu Asp Ser Glu Leu Thr Glu Ala Asp 50 55 60
Leu Pro Pro Ala Ala Ala Ala Leu Gln Ala Ile Glu Asn Ala Ala Arg 65 70 75 80
Ile Leu Glu Pro His Ile Asp Val Ile Met Glu Asp Cys Ser Thr Pro 85 90 95
Ser Leu Cys Gly Ser Ser Arg Glu Met Pro Val Trp Gly Glu Asp Ile 100 105 110
Pro Arg Thr Pro Ser Pro Ala Leu Ile 115 120
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 290 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone GE9-2
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..290
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CC ATG GTC ACC CCG GAA GGG GTG CCC GTT GGT GAG AGG TAT TGC AGA 47
Met Val Thr Pro Glu Gly Val Pro Val Gly Glu Arg Tyr Cys Arg 1 5 10 15
TCC TCG GGT GTC CTA ACA ACT AGC GCG AGC AAC TGC TTG ACC TGC TAC 95 Ser Ser Gly Val Leu Thr Thr Ser Ala Ser Asn Cys Leu Thr Cys Tyr 20 25 30
ATC AAG GTG AAA GCC GCC TGT GAG AGG GTG GGG CTG AAA AAT GTC TCT 143 Ile Lys Val Lys Ala Ala Cys Glu Arg Val Gly Leu Lys Asn Val Ser 35 40 45
CTT CTC ATA GCC GGC GAT GAC TGC TTG ATC ATA TGT GAG CGG CCA GTG 191 Leu Leu Ile Ala Gly Asp Asp Cys Leu Ile Ile Cys Glu Arg Pro Val 50 55 60
TGC GAC CCA AGC GAC GCT TTG GGC AGA GCC CTA GCG AGC TAT GGG TAC 239 Cys Asp Pro Ser Asp Ala Leu Gly Arg Ala Leu Ala Ser Tyr Gly Tyr 65 70 75
GCG TGC GAG CCC TCA TAT TAT GCA TGC TCG GAC ACG GCC CCC TTC TGC 287 Ala Cys Glu Pro Ser Tyr Tyr Ala Cys Ser Asp Thr Ala Pro Phe Cys 80 85 90 95
TCC 290
Ser
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
Met Val Thr Pro Glu Gly Val Pro Val Gly Glu Arg Tyr Cys Arg Ser 1 5 10 15
Ser Gly Val Leu Thr Thr Ser Ala Ser Asn Cys Leu Thr Cys Tyr Ile 20 25 30
Lys Val Lys Ala Ala Cys Glu Arg Val Gly Leu Lys Asn Val Ser Leu 35 40 45
Leu Ile Ala Gly Asp Asp Cys Leu Ile Ile Cys Glu Arg Pro Val Cys 50 55 60
Asp Pro Ser Asp Ala Leu Gly Arg Ala Leu Ala Ser Tyr Gly Tyr Ala 65 70 75 80
Cys Glu Pro Ser Tyr Tyr Ala Cys Ser Asp Thr Ala Pro Phe Cys Ser 85 90 95
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: JML-A SISPA Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
AGGAATTCAG CGGCCGCGAG 20
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: JML-B SISPA Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
CTCGCGGCCG CTGAATTCCT TT 22
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-f1 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
GCGAATTCGC CATGGCGGGG AGACTTTCAT CA 32
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-R1 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
GCGAATTCGG ATCCAGGGCC ATAGACCATC GCGGG 35
(2) INFORMATION FOR SEQ ID NO:42:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-f2 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:
GCGAATTCCG TGCGCCCGCC ATGGTC 26
(2) INFORMATION FOR SEQ ID NO:43:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-R3 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
GCGAATTCGG ATCCCAAGGT TTCTTGCCTA GC 32
(2) INFORMATION FOR SEQ ID NO:44:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-f4 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
GCGAATTCAA GTGTGAGGCT AGGCAA 26
(2) INFORMATION FOR SEQ ID NO:45:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: 470ep-R4 Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:
GCGAATTCGG ATCCCCACAC AGATGGCGCA AGGGG 35
(2) INFORMATION FOR SEQ ID NO:46:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: KL-1 SISPA Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:
GCAGGATCCG AATTCGCATC TAGAGAT 27
(2) INFORMATION FOR SEQ ID NO:47:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: KL-2 SISPA Primer
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:
ATCTCTAGAT GCGAATTCGG ATCCTGCGA 29
(2) INFORMATION FOR SEQ ID NO:48:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 186 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-10
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..186
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:
CGT GCG CCC GCC ATG GTC TAT GGC CCT GGG CAA AGT GTT GCC ATT GAC 48
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Ala Ile Asp 1 5 10 15
GGG GAG CGC TAC ACC TTG CCT CAT CAA CTG AGG CTC AGG AAT GTG GCA 96
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
CCC TCT GAG GTT TCA TCC GAG GTG TCC ATT GAC ATT GGG ACG GAG GCT 144
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Ala 35 40 45
GAA AAC TCA GAA CTG ACT GAG GCC GAT CTG CCG CCG GCG GCT 186
Glu Asn Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala 50 55 60
(2) INFORMATION FOR SEQ ID NO:49:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Ala Ile Asp 1 5 10 15
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Ala 35 40 45
Glu Asn Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala 50 55 60
(2) INFORMATION FOR SEQ ID NO:50:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-12
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..282
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
CGT GCG CCC GCC ATG GTC TAT GGC CCT GGG CAA AGT GTT ACC ATT GAC 48
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Thr Ile Asp 1 5 10 15
GGG GAG CGC TAC ACC TTG CCT CAT CAA CTG AGG CTC AGG AAT GTG GCA 96
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
CCC TCT GAG GTT TCA TCC GAG GTG TCC ATT GAC ATT GGG ACG GAG ACT 144
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Thr 35 40 45
GAA GAC TCA GAA CTG ACT GAG GCC GAT CTG CCG CCG GCG GCT GCT GCT 192
Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala 50 55 60
CTC CAA GCG ATC GAG AAT GCT GCG AGG ATT CTT GAA CCG CAC ATT GAT 240
Leu Gln Ala Ile Glu Asn Ala Ala Arg Ile Leu Glu Pro His Ile Asp 65 70 75 80
GTC ATC ATG GAG GAC TGC AGT ACA CCC TCT CTT TGT GGT AGT 282
Val Ile Met Glu Asp Cys Ser Thr Pro Ser Leu Cys Gly Ser 85 90
(2) INFORMATION FOR SEQ ID NO:51:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Thr Ile Asp 1 5 10 15
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Thr 35 40 45
Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala 50 55 60
Leu Gln Ala Ile Glu Asn Ala Ala Arg Ile Leu Glu Pro His Ile Asp 65 70 75 80
Val Ile Met Glu Asp Cys Ser Thr Pro Ser Leu Cys Gly Ser 85 90
(2) INFORMATION FOR SEQ ID NO:52:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 279 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-26
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..279
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
CGT GCG CCC GCC ATG GTC TAT GGC CCT GGG CAA AGT GTT TCC ATT GAC 48
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Ser Ile Asp 1 5 10 15
GGG GAG CGC TAC ACC TTG CCT CAT CAA CTG AGG CTC AGG AAT GTG GCA 96
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
CCC TCT GAG GTT TCA TCC GAG GTG TCC ATT GAC ATT GGG ACG GAG ACT 144
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Thr 35 40 45
GAA GAC TCA GAA CTG ACT GAG GCC GAC CTG CCG CCG GCG GCT GCT GCT 192
Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala 50 55 60
CTC CAA GCG ATC GAG AAT GCT GCG AGG ATT CTT GAA CCG CAC ATC GAT 240
Leu Gln Ala Ile Glu Asn Ala Ala Arg Ile Leu Glu Pro His Ile Asp 65 70 75 80
GTC ATC ATG GAG GAC TGC AGT ACA CCC TCT CTT TGT GGT 279
Val Ile Met Glu Asp Cys Ser Thr Pro Ser Leu Cys Gly 85 90
(2) INFORMATION FOR SEQ ID NO:53:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Ser Ile Asp 1 5 10 15
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 20 25 30
Pro Ser Glu Val Ser Ser Glu Val Ser Ile Asp Ile Gly Thr Glu Thr 35 40 45
Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala 50 55 60
Leu Gln Ala Ile Glu Asn Ala Ala Arg Ile Leu Glu Pro His Ile Asp 65 70 75 80
Val Ile Met Glu Asp Cys Ser Thr Pro Ser Leu Cys Gly 85 90
(2) INFORMATION FOR SEQ ID NO:54:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 108 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-5
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..108
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:
GCC TAT TGT GAC AAG GTG CGC ACT CCG CTT GAA TTG CAG GTT GGG TGC 48
Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val Gly Cys 1 5 10 15
TTG GTG GGC AAT GAA CTT ACC TTT GAA TGT GAC AAG TGT GAG GCT AGG 96
Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg 20 25 30
CAA GAA ACC TTG 108
Gln Glu Thr Leu 35
(2) INFORMATION FOR SEQ ID NO:55:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val Gly Cys 1 5 10 15
Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg 20 25 30
Gln Glu Thr Leu 35
(2) INFORMATION FOR SEQ ID NO:56:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 132 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-3
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..132
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
GAG ATG GAA ATC CAG AAC CAT ACA GCC TAT TGT GAC AAG GTG CGC ACT 48
Glu Met Glu Ile Gln Asn His Thr Ala Tyr Cys Asp Lys Val Arg Thr 1 5 10 15
CCG CTT GAA TTG CAG GTT GGG TGC TTG GTG GGC AAT GAA CTT ACC TTT 96
Pro Leu Glu Leu Gln Val Gly Cys Leu Val Gly Asn Glu Leu Thr Phe 20 25 30
GAA TGT GAC AAG TGT GAG GCT AGG CAA GAA ACC TTG 132
Glu Cys Asp Lys Cys Glu Ala Arg Gln Glu Thr Leu 35 40
(2) INFORMATION FOR SEQ ID NO:57:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
Glu Met Glu Ile Gln Asn His Thr Ala Tyr Cys Asp Lys Val Arg Thr 1 5 10 15
Pro Leu Glu Leu Gln Val Gly Cys Leu Val Gly Asn Glu Leu Thr Phe 20 25 30
Glu Cys Asp Lys Cys Glu Ala Arg Gln Glu Thr Leu 35 40
(2) INFORMATION FOR SEQ ID NO:58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 258 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-27
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..258
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
AAA GCC TTA TTT CCA CAG AGC GAC GCG ACC AGG AAG CTT ACC GTC AAG 48
Lys Ala Leu Phe Pro Gln Ser Asp Ala Thr Arg Lys Leu Thr Val Lys 1 5 10 15
ATG TCA TGC TGC GTT GAA AAG AGC GTC ACG CGC TTT TTC TCA TTG GGG 96
Met Ser Cys Cys Val Glu Lys Ser Val Thr Arg Phe Phe Ser Leu Gly 20 25 30
TTG ACG GTG GCT GAT GTT GCT AGC CTG TGT GAG ATG GAA ATC CAG AAC 144
Leu Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu Ile Gln Asn 35 40 45
CAT ATA GCC TAT TGT GAC AAG GTG CGC ACT CCG CTT GAA TTG CAG GTT 192
His Ile Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val 50 55 60
GGG TGC TTG GTG GGC AAT GAA CTC ACC TTT GAA TGT GAC AAG TGT GAG 240
Gly Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu 65 70 75 80
GCT AGG CAA GAA ACC TTG 258
Ala Arg Gln Glu Thr Leu 85
(2) INFORMATION FOR SEQ ID NO:59:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
Lys Ala Leu Phe Pro Gln Ser Asp Ala Thr Arg Lys Leu Thr Val Lys 1 5 10 15
Met Ser Cys Cys Val Glu Lys Ser Val Thr Arg Phe Phe Ser Leu Gly 20 25 30
Leu Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu Ile Gln Asn 35 40 45
His Ile Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val 50 55 60
Gly Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu 65 70 75 80
Ala Arg Gln Glu Thr Leu 85
(2) INFORMATION FOR SEQ ID NO:60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 108 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-25
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..108
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
ACC TAT TGT GAC AAG GTG CGC ACT CCG CTT GAA TTG CAG GTT GGG TGC 48
Thr Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val Gly Cys 1 5 10 15
TTG GTG GGC AAT GAA CTT ACC TTT GAA TGT GAC AAG TGT GAG GCT AGG 96
Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg 20 25 30
CAA GAA ACC TTG 108
Gln Glu Thr Leu 35
(2) INFORMATION FOR SEQ ID NO:61:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
Thr Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gln Val Gly Cys 1 5 10 15
Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg 20 25 30
Gln Glu Thr Leu 35
(2) INFORMATION FOR SEQ ID NO:62:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 108 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-20
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 52..108
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
GCCGACACTA CTAAGGTGTA TGTTACCAAT CCAGACAATG TGGGACGAAG G GTG GGC 57
Val Gly 1
AAT GAA CTT ACC TTT GAA TGT GAC AAG TGT GAG GCT AGG CAA GAA ACC 105
Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg Gln Glu Thr 5 10 15
TTG 108
Leu
(2) INFORMATION FOR SEQ ID NO:63:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg Gln 1 5 10 15
Glu Thr Leu
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 168 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-16
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..168
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
TTG GGG TTG ACG GTG GCT GAT GTT GCT AGC CTG TGT GAG ATG GAA ATC 48
Leu Gly Leu Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu Ile 1 5 10 15
CAG AAC CAT ACA GCC TAT TGT GAC AAG GTG CGC ACT CCG CTT GAA TTG 96
Gln Asn His Thr Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu 20 25 30
CAG GTT GGG TGC TTG GTG GGC AAT GAA CTT ACC TTT GAA TGT GAC AAG 144
Gln Val Gly Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys 35 40 45
TGT GAG GCT AGG CAA GAA ACC TTG 168
Cys Glu Ala Arg Gln Glu Thr Leu 50 55
(2) INFORMATION FOR SEQ ID NO:65:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 56 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
Leu Gly Leu Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu Ile 1 5 10 15
Gln Asn His Thr Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu 20 25 30
Gln Val Gly Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys 35 40 45
Cys Glu Ala Arg Gln Glu Thr Leu 50 55
(2) INFORMATION FOR SEQ ID NO:66:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 313 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-50
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..313
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:
ATC ACC GTC AAC CCC AAT GAG AAA AAG CGC GTG ACG CTC TTT TCA ACG 48
Ile Thr Val Asn Pro Asn Glu Lys Lys Arg Val Thr Leu Phe Ser Thr 1 5 10 15
CAG CAC GAC ATC TTG ACG GTA AGC TTC CTG GTC GCG TCG CTC TGT GGA 96
Gln His Asp Ile Leu Thr Val Ser Phe Leu Val Ala Ser Leu Cys Gly 20 25 30
AAT AAG GCT TTT AAT ACG GAA AGA GCC ACG TTG AAG ACA CTT TCC TCC 144
Asn Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser 35 40 45
CCT TCG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC 192
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 50 55 60
GGG GTA TCC TCC TGC GAG GAG GAC ACC GAC GGG GTC TTC TCA TCT GAG 240
Gly Val Ser Ser Cys Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu 65 70 75 80
CTG CTC TCA GTA ACC GAG ATA AGT GCT GGC GAT GGA GTA CGG GGG ATG 288
Leu Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met 85 90 95
TCT TCT CCC CAT ACA GGC ATC TCT C 313
Ser Ser Pro His Thr Gly Ile Ser 100
(2) INFORMATION FOR SEQ ID NO:67:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 104 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
Ile Thr Val Asn Pro Asn Glu Lys Lys Arg Val Thr Leu Phe Ser Thr 1 5 10 15
Gln His Asp Ile Leu Thr Val Ser Phe Leu Val Ala Ser Leu Cys Gly 20 25 30
Asn Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser 35 40 45
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 50 55 60
Gly Val Ser Ser Cys Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu 65 70 75 80
Leu Leu Ser Val Thr Glu Ile Ser Ala Gly Asp Gly Val Arg Gly Met 85 90 95
Ser Ser Pro His Thr Gly Ile Ser 100
(2) INFORMATION FOR SEQ ID NO:68:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 89 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-52
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 28..87
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:
ACTGAGAGCA GCTCAGATGA GAAGACC CCT TCG GCT GTC TCG GAC TCT TGG 51
Pro Ser Ala Val Ser Asp Ser Trp 1 5
ATG ACC TCG AAT GAG TCA GAG GAC GGG GTA TCC TCG CA 89
Met Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser 10 15 20
(2) INFORMATION FOR SEQ ID NO:69:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 1 5 10 15
Gly Val Ser Ser 20
(2) INFORMATION FOR SEQ ID NO:70:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 214 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-53
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..100
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:
AAT AAG GCT TTT AAT ACG GAA AGA GCC ACG TTG AAG ACA CTT TCC TCC 48
Asn Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser 1 5 10 15
CCT TCG GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC 96
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 20 25 30
GGG G ATCTCTAGAT GCGAATTCAA GTGTGAGGCT AGGCAAGAAA CCTTGGCCTC 150
Gly
CTTCTCTTAC ATTTGGTCTG GAGTGCCGCT GACTAGGGCC ACGCCGGCCA AGCCTCCCGT 210
GGTG 214
(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
Asn Lys Ala Phe Asn Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser 1 5 10 15
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 20 25 30
Gly
(2) INFORMATION FOR SEQ ID NO:72:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 113 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-55
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 52..113
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:
CCATCGCCAG CACTTATCTC GGTTACTGAG AGCAGCTCAG ATCAGAAGAC C CCT TCG 57
Pro Ser 1
GCT GTC TCG GAC TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC GGG GTA 105
Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly Val 5 10 15
TCC TCG CA 113
Ser Ser 20
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp 1 5 10 15
Gly Val Ser Ser 20
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 330 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-56
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..330
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
ACG TTG AAG ACA CTT TCC TCC CCT TCG GCT GTC TCG GAC TCT TGG ATG 48
Thr Leu Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp Ser Trp Met 1 5 10 15
ACC TCG AAT GAG TCA GAG GAC GGG GTA TCC TCC TGC GAG GAG GAC ACC 96
Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr 20 25 30
GAC GGG GTC TTC TCA TCT GAG CTG CTC TCA GTA ACC GAG ATA AGT GCT 144
Asp Gly Val Phe Ser Ser Glu Leu Leu Ser Val Thr Glu Ile Ser Ala 35 40 45
GGC GAT GGA GTA CGG GGG ATG TCT TCT CCC CAT ACA GGC ATC TCT CGG 192
Gly Asp Gly Val Arg Gly Met Ser Ser Pro His Thr Gly Ile Ser Arg 50 55 60
CTA CTA CCA CAA AGA GAG GGT GTA CTG CAG TCC TCC ATG ATG ACA TCA 240
Leu Leu Pro Gln Arg Glu Gly Val Leu Gln Ser Ser Met Met Thr Ser 65 70 75 80
ATG TGC GGT TCA AGA ATC CTC GCA GCA TTC TCG ATC GCT TGG AGA GCA 288
Met Cys Gly Ser Arg Ile Leu Ala Ala Phe Ser Ile Ala Trp Arg Ala 85 90 95
GCA GCC GCC GGC GGC AGA TCG GCC TCA GTC AGT TCT GAG TCT 330
Ala Ala Ala Gly Gly Arg Ser Ala Ser Val Ser Ser Glu Ser 100 105 110
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 110 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
Thr Leu Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp Ser Trp Met 1 5 10 15
Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr 20 25 30
Asp Gly Val Phe Ser Ser Glu Leu Leu Ser Val Thr Glu Ile Ser Ala 35 40 45
Gly Asp Gly Val Arg Gly Met Ser Ser Pro His Thr Gly Ile Ser Arg 50 55 60
Leu Leu Pro Gln Arg Glu Gly Val Leu Gln Ser Ser Met Met Thr Ser 65 70 75 80
Met Cys Gly Ser Arg Ile Leu Ala Ala Phe Ser Ile Ala Trp Arg Ala 85 90 95
Ala Ala Ala Gly Gly Arg Ser Ala Ser Val Ser Ser Glu Ser 100 105 110
(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 195 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-57
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..195
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
ACG GAA AGA GCC ACG TTG AAG ACA CTT TCC TCC CCT TCG GCT GCC TCG 48
Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser Pro Ser Ala Ala Ser 1 5 10 15
GAC TCT TGG ATG ACC TCG AAT GAG TCG GAG GAC GGG GTA TCC TCC TGC 96
Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser Cys 20 25 30
GAA GAG GAC ACC GAC GGG GTC TTC TCA TCT GAG CTG CTC TCA GTA ACC 144
Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu Leu Leu Ser Val Thr 35 40 45
GAG ATA AGT GCT GGC GGT GGA GTA CGG GGG ATG TCT TCT CCC CAT ACG 192
Glu Ile Ser Ala Gly Gly Gly Val Arg Gly Met Ser Ser Pro His Thr 50 55 60
GGC 195
Gly 65
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 65 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
Thr Glu Arg Ala Thr Leu Lys Thr Leu Ser Ser Pro Ser Ala Ala Ser 1 5 10 15
Asp Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser Cys 20 25 30
Glu Glu Asp Thr Asp Gly Val Phe Ser Ser Glu Leu Leu Ser Val Thr 35 40 45
Glu Ile Ser Ala Gly Gly Gly Val Arg Gly Met Ser Ser Pro His Thr 50 55 60
Gly 65
(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 115 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-60
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..115
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
AAG ACA CTT TCC TCC CCT TCG GCT GTC TCG GAC TCT TGG ATG ACC TCG 48
Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser 1 5 10 15
AAT GAG TCA GAG GAC GGG GTA TCC TCC TGC GAG GAG GAC ACC GAC TGG 96
Asn Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr Asp Trp 20 25 30
GTC TTC TCA TCT GAG CTG C 115
Val Phe Ser Ser Glu Leu 35
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser 1 5 10 15
Asn Glu Ser Glu Asp Gly Val Ser Ser Cys Glu Glu Asp Thr Asp Trp 20 25 30
Val Phe Ser Ser Glu Leu 35
(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone Y5-63
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 19..93
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
GAGAGCAGCT CAGATGAG AAG ACA CTT TCC TCC CCT TCG GCT GTC TCG GAC 51
Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp 1 5 10
TCT TGG ATG ACC TCG AAT GAG TCA GAG GAC GGG GTA TCC TCG 93
Ser Trp Met Thr Ser Asn Glu Ser Glu Asp Gly Val Ser Ser 15 20 25
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
Lys Thr Leu Ser Ser Pro Ser Ala Val Ser Asp Ser Trp Met Thr Ser 1 5 10 15
Asn Glu Ser Glu Asp Gly Val Ser Ser 20 25
(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer Y5-10-F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
TCAGCCATGG CTCGTGCGCC CGCGATGGTC 30
(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer Y5-10-R1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
CGAGGATCCA GCCGCCGGCG GCAGATC 27
(2) INFORMATION FOR SEQ ID NO:84:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer Y5-16F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:
GATTCCATGG GTTTGGGGTT GACGGTGGCT GA 32
(2) INFORMATION FOR SEQ ID NO:85:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470EP-R3
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
GCGAATTCGG ATCCCAAGGT TTCTTGCCTA GC 32
(2) INFORMATION FOR SEQ ID NO:86:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer Y5-5-F1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
GAGGCCATGG CCTATTGTGA CAAGGTG 27
(2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer PGEX-R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
GACCGTCTCC GGGAGCT 17
(2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone GE15-1
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3..326
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
CC ATG GAG GTC TCT GAC TTC CGT GGC TCG TCT GGC TCA CCG GTC CTA 47
Met Glu Val Ser Asp Phe Arg Gly Ser Ser Gly Ser Pro Val Leu 1 5 10 15
TGT GAC GAA GGG CAC GCA GTA GGA ATG CTC GTG TCT GTG CTT CAC TCC 95
Cys Asp Glu Gly His Ala Val Gly Met Leu Val Ser Val Leu His Ser 20 25 30
GGT GGT AGG GTC ACC GCG GCA CGG TTC ACT AGG CCG TGG ACC CAA GTG 143
Gly Gly Arg Val Thr Ala Ala Arg Phe Thr Arg Pro Trp Thr Gln Val 35 40 45
CCA ACA GAT GCC AAA ACC ACC ACT GAA CCC CCT CCG GTG CCG GCC AAA 191
Pro Thr Asp Ala Lys Thr Thr Thr Glu Pro Pro Pro Val Pro Ala Lys 50 55 60
GGA GTT TTC AAA GAG GCC CCG TTG TTT ATG CCT ACG GGA GCG GGA AAG 239
Gly Val Phe Lys Glu Ala Pro Leu Phe Met Pro Thr Gly Ala Gly Lys 65 70 75
AGC ACT CGC GTC CCG TTG GAG TAC GGC AAC ATG GGG CAC AAG GTC TTA 287
Ser Thr Arg Val Pro Leu Glu Tyr Gly Asn Met Gly His Lys Val Leu 80 85 90 95
ATC TTG AAC CCC TCA GTG GCC ACT GTG CGG GCG ATG GGC 326
Ile Leu Asn Pro Ser Val Ala Thr Val Arg Ala Met Gly 100 105
(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 108 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
Met Glu Val Ser Asp Phe Arg Gly Ser Ser Gly Ser Pro Val Leu Cys 1 5 10 15
Asp Glu Gly His Ala Val Gly Met Leu Val Ser Val Leu His Ser Gly 20 25 30
Gly Arg Val Thr Ala Ala Arg Phe Thr Arg Pro Trp Thr Gln Val Pro 35 40 45
Thr Asp Ala Lys Thr Thr Thr Glu Pro Pro Pro Val Pro Ala Lys Gly 50 55 60
Val Phe Lys Glu Ala Pro Leu Phe Met Pro Thr Gly Ala Gly Lys Ser 65 70 75 80
Thr Arg Val Pro Leu Glu Tyr Gly Asn Met Gly His Lys Val Leu Ile 85 90 95
Leu Asn Pro Ser Val Ala Thr Val Arg Ala Met Gly 100 105
(2) INFORMATION FOR SEQ ID NO:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Clone GE17-2
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..138
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
GGT GAT GAG GTT CTC ATC GGC GTC TTC CAG GAT GTG AAT CAT TTG CCT 48
Gly Asp Glu Val Leu Ile Gly Val Phe Gln Asp Val Asn His Leu Pro 1 5 10 15
CCC GGG TTT GTT CCG ACC GCG CCT GTT GTC ATC CGA CGG TGC GGA AAG 96
Pro Gly Phe Val Pro Thr Ala Pro Val Val Ile Arg Arg Cys Gly Lys 20 25 30
GGC TTC TTG GGG GTC ACA AAG GCT GCC TTG ACA GGT CGG GAT 138
Gly Phe Leu Gly Val Thr Lys Ala Ala Leu Thr Gly Arg Asp 35 40 45
(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
Gly Asp Glu Val Leu Ile Gly Val Phe Gln Asp Val Asn His Leu Pro 1 5 10 15
Pro Gly Phe Val Pro Thr Ala Pro Val Val Ile Arg Arg Cys Gly Lys 20 25 30
Gly Phe Leu Gly Val Thr Lys Ala Ala Leu Thr Gly Arg Asp 35 40 45
(2) INFORMATION FOR SEQ ID NO:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE15F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:
GCCGCCATGG AGGTCTCTGA CTTCCGTG 28
(2) INFORMATION FOR SEQ ID NO:93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE15R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:
GCGCGGATCC GCCCATCGCC CGCACAGTGG C 31
(2) INFORMATION FOR SEQ ID NO:94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE17F
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
CGCTCCATGG GTGATGAGGT TCTCATCGGC G 31
(2) INFORMATION FOR SEQ ID NO:95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GE17R
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:
GTAAGTCAGG ATCCCGACCT GTCAAGGC 28
(2) INFORMATION FOR SEQ ID NO:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 452 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: NcoI/EcoRI-containing fragment of pGEX-HISb-GE3-s HGV plasmid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:
CAAAATCGGA TCTGGTTCCG CGTGGTTCCA TGGTCTCATG GGACGCGGAC GCTCGTGCGC 60
CCGCGATGGT CTATGGCCCT GGGCAAAGTG TTACCATTGA CGGGGAGCGC TACACCTTGC 120
CTCATCAACT GAGGCTCAGG AATGTGGCAC CCTCTGAGGT TTCATCCGAG GTGTCCATTG 180
ACATTGGGAC GGAGACTGAA GACTCAGAAC TGACTGAGGC CGATCTGCCG CCGGCGGCTG 240
CTGCTCTCCA AGCGATCGAG AATGCTGCGA GGATTCTTGA ACCGCACATT GATGTCATCA 300
TGGAGGACTG CAGTACACCC TCTCTTTGTG GTAGTAGCCG AGAGATGCCT GTATGGGGAG 360
AAGACATCCC CCGTACTCCA TCGCCAGCAC TTATCGGATC CCACCATCAC CATCACCATT 420
AGAATTCATC GTGACTGACT GACGATCTAC CT 452
(2) INFORMATION FOR SEQ ID NO:97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470EP-F8
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:
GCTGAATTCG CCATGGCGAC GTGCGCATTC AGGGGTGGA 39
(2) INFORMATION FOR SEQ ID NO:98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470EP-F9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
GCTAGATCTG GCAACATGGG GCACAAGGTC 30
(2) INFORMATION FOR SEQ ID NO:99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 470EP-R9
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:
CACAGATCTC GCGTAGTAGT AGCGTCCAGA 30
(2) INFORMATION FOR SEQ ID NO:100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer 9E3-REV
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:
GCTGGCTGAG GCACGGTTGG TC 22
(2) INFORMATION FOR SEQ ID NO:101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer E39-94PR
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:
CACCATCATC ACAGCATCTG GC 22
(2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-F12
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
GCAACCATGG AACCTGCCAA ACCCCTGACC TT 32
(2) INFORMATION FOR SEQ ID NO:103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-F14
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
TTGGGATCCC TCGTGTTCCG CCATTCTAAG 30
(2) INFORMATION FOR SEQ ID NO:104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-F15
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
GCGGCCATGG TGCCCTTCGT CAATAGGACA 30
(2) INFORMATION FOR SEQ ID NO:105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-R16
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:
TGCGAATCCT CGGCCCTGGT TGCCCAG 27
(2) INFORMATION FOR SEQ ID NO:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-R12
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:
AGCCCCATGG AAGGTCGTGA A 21
(2) INFORMATION FOR SEQ ID NO:107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-R13
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:
TATGGATCCT GGTAAATCAT TGCCCCACCT 30
(2) INFORMATION FOR SEQ ID NO:108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-R14
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
GGAGGATCCG CGACCCGCCA CCGAAGT 27
(2) INFORMATION FOR SEQ ID NO:109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-R15
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
CTTGCCATGG CCAGCTGGTT CACCCACCA 29
(2) INFORMATION FOR SEQ ID NO:110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Primer GEP-F17
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
GCAGGATCCC CTCTGGAAGG TCCCATTTGA 30
(2) INFORMATION FOR SEQ ID NO:111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 138 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K1-2-3A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..138
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
AGC CTT AGA ATG GCA GAA CAC GAG GTG CCT TCC GGT TCG CAT CCG CTC 48
Ser Leu Arg Met Ala Glu His Glu Val Pro Ser Gly Ser His Pro Leu 1 5 10 15
GAG GGG TAT TCC ATG TCC ATA AAA GGG AAT CTC GCC CAC GTC CAA TTT 96
Glu Gly Tyr Ser Met Ser Ile Lys Gly Asn Leu Ala His Val Gln Phe 20 25 30
TGT CTC AAT TAT GGA AGG GTG CTG CGT CAT AGG GGA GAA TTC 138
Cys Leu Asn Tyr Gly Arg Val Leu Arg His Arg Gly Glu Phe 35 40 45
(2) INFORMATION FOR SEQ ID NO:112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:
Ser Leu Arg Met Ala Glu His Glu Val Pro Ser Gly Ser His Pro Leu 1 5 10 15
Glu Gly Tyr Ser Met Ser Ile Lys Gly Asn Leu Ala His Val Gln Phe 20 25 30
Cys Leu Asn Tyr Gly Arg Val Leu Arg His Arg Gly Glu Phe 35 40 45
(2) INFORMATION FOR SEQ ID NO:113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-10-1D
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT CGC CTG ATA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Arg Leu Ile Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCC GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCC ATC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Arg Leu Ile Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 318 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-11-1A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..318
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCT ACC ACA CTG GCC ACC ACC CAG CCC AAC ACC GAA GTG 240
Met Ala Gln Ala Thr Thr Leu Ala Thr Thr Gln Pro Asn Thr Glu Val 65 70 75 80
TCC ACC TCG AAT GTA GCA TCG AAG CAG AAC TCG GCC CCG AGC ACT GAG 288
Ser Thr Ser Asn Val Ala Ser Lys Gln Asn Ser Ala Pro Ser Thr Glu 85 90 95
GTG CGC CCG CGG GTC GCC GAA ATC CCC ATC 318
Val Arg Pro Arg Val Ala Glu Ile Pro Ile 100 105
(2) INFORMATION FOR SEQ ID NO:116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 106 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Thr Thr Leu Ala Thr Thr Gln Pro Asn Thr Glu Val 65 70 75 80
Ser Thr Ser Asn Val Ala Ser Lys Gln Asn Ser Ala Pro Ser Thr Glu 85 90 95
Val Arg Pro Arg Val Ala Glu Ile Pro Ile 100 105
(2) INFORMATION FOR SEQ ID NO:117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-14-2A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:
CAC CAT CAT CAC AGC ATC TGG CCA GAC GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCT ACC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-14-3A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCT ACC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-14-5A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCG AGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Ser 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CAC CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro His Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCC ATC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Ser 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro His Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-14-6A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC GGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Gly Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCC GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCT ACC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Gly Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 243 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-17-1A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..243
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCC CCT GCG CTC ATC GAG CTC AGG AGC 192
His Gly Phe Val Pro Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser 50 55 60
GCA ATG GCC CAA GCT ACC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC 240
Ala Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala 65 70 75 80
AGC 243
Ser
(2) INFORMATION FOR SEQ ID NO:126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 81 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser 50 55 60
Ala Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala 65 70 75 80
Ser
(2) INFORMATION FOR SEQ ID NO:127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 156 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-8-3A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..156
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC CCT GCG CTC ATC GAG 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Pro Ala Leu Ile Glu 20 25 30
CTC AGG AGC GCA ATG GCC CAA GCC ATC ACA CAG CAC TCG ACC AAC CGT 144
Leu Arg Ser Ala Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg 35 40 45
GCC TCA GCC AGC 156
Ala Ser Ala Ser 50
(2) INFORMATION FOR SEQ ID NO:128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Pro Ala Leu Ile Glu 20 25 30
Leu Arg Ser Ala Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg 35 40 45
Ala Ser Ala Ser 50
(2) INFORMATION FOR SEQ ID NO:129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-8-4C
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA TCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Ser Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCC ATC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Ser Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 239 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-8-5A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..239
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTA GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Leu Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ATG GCC CAA GCT ACC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AG 239
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala 65 70 75
(2) INFORMATION FOR SEQ ID NO:132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 79 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Leu Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Met Ala Gln Ala Thr Thr Gln His Ser Thr Asn Arg Ala Ser Ala 65 70 75
(2) INFORMATION FOR SEQ ID NO:133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 427 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-8-6A
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..427
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
CAC CAC CCT TTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA 48
His His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile 1 5 10 15
GAT CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC 96
Asp His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser 20 25 30
GCA ATG GCC CAA GCT ACC ACA CTG GCC ACC ACC CAG CCC AAC ACC GAA 144
Ala Met Ala Gln Ala Thr Thr Leu Ala Thr Thr Gln Pro Asn Thr Glu 35 40 45
GTG TCC ACC TCG AAT GTA GCA TCG AAG CAG AAC TCG ACC CCG AGC ACT 192
Val Ser Thr Ser Asn Val Ala Ser Lys Gln Asn Ser Thr Pro Ser Thr 50 55 60
GAG GTG CGC CCG CGG GTC GCC GAA ATC CCC ATC AAG AGG GCC AGC GGG 240
Glu Val Arg Pro Arg Val Ala Glu Ile Pro Ile Lys Arg Ala Ser Gly 65 70 75 80
AAA GCT CCC CGA GCA AGC TTC CAC AAC ACG AGG AAC ATC AGG CGT TGG 288
Lys Ala Pro Arg Ala Ser Phe His Asn Thr Arg Asn Ile Arg Arg Trp 85 90 95
GGT CCC AAC CAT CTA AAG TAC AGC ACT AGG TTT GCC AAA CCC AAT ATC 336
Gly Pro Asn His Leu Lys Tyr Ser Thr Arg Phe Ala Lys Pro Asn Ile 100 105 110
ATA CTG ACG ACC GGG AGT CCC AGA CAC CAG GAC AGG GCA GGG CCC GCG 384
Ile Leu Thr Thr Gly Ser Pro Arg His Gln Asp Arg Ala Gly Pro Ala 115 120 125
GAG ACC TCA CCT GCC GCG GCG GCT TCC ACA GCC GGC AGC CCT A 427
Glu Thr Ser Pro Ala Ala Ala Ala Ser Thr Ala Gly Ser Pro 130 135 140
(2) INFORMATION FOR SEQ ID NO:134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 142 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
His His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile 1 5 10 15
Asp His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser 20 25 30
Ala Met Ala Gln Ala Thr Thr Leu Ala Thr Thr Gln Pro Asn Thr Glu 35 40 45
Val Ser Thr Ser Asn Val Ala Ser Lys Gln Asn Ser Thr Pro Ser Thr 50 55 60
Glu Val Arg Pro Arg Val Ala Glu Ile Pro Ile Lys Arg Ala Ser Gly 65 70 75 80
Lys Ala Pro Arg Ala Ser Phe His Asn Thr Arg Asn Ile Arg Arg Trp 85 90 95
Gly Pro Asn His Leu Lys Tyr Ser Thr Arg Phe Ala Lys Pro Asn Ile 100 105 110
Ile Leu Thr Thr Gly Ser Pro Arg His Gln Asp Arg Ala Gly Pro Ala 115 120 125
Glu Thr Ser Pro Ala Ala Ala Ala Ser Thr Ala Gly Ser Pro 130 135 140
(2) INFORMATION FOR SEQ ID NO:135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 240 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen K3-8-7C
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..240
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
CAC CAT CAT CAC AGC ATC TGG CCA GAT GTA CGA GGC CAA GCA CCA GGC 48
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
AAA GGT CAG GGG TTT GGC AGG CCG CCC CTC CCC GAG GGG GCT CCT CAC 96
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
CAC CCT CTG ACG GAT TGC CTG GTA CCC CTT ACA CCA CGT CCT ATA GAT 144
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
CAC GGC TTT GTG CCT CCA CCC CCT GCG CTC ATC GAG CTC AGG AGC GCA 192
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
ACG GCC CAA GCC ATC ACA CAG CAC TCG ACC AAC CGT GCC TCA GCC AGC 240
Thr Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 80 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Gly 1 5 10 15
Lys Gly Gln Gly Phe Gly Arg Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Cys Leu Val Pro Leu Thr Pro Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Pro Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Thr Ala Gln Ala Ile Thr Gln His Ser Thr Asn Arg Ala Ser Ala Ser 65 70 75 80
(2) INFORMATION FOR SEQ ID NO:137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen Y10-13-1
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..235
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:
GTC AGC CGA GGC CCC ACG CCG CAC CGA TGG AAT GGG AAC CTA ACC GAC 48
Val Ser Arg Gly Pro Thr Pro His Arg Trp Asn Gly Asn Leu Thr Asp 1 5 10 15
CCG GTC TCG GGT CAG CAG TCC CTC ACA CAG GTG CCG CGG GAG GCA GGC 96
Pro Val Ser Gly Gln Gln Ser Leu Thr Gln Val Pro Arg Glu Ala Gly 20 25 30
CGA CGG TCC AGA ACA CAC GTC ACG CAC GGG ATT CTC CAC TCG GAG AGC 144
Arg Arg Ser Arg Thr His Val Thr His Gly Ile Leu His Ser Glu Ser 35 40 45
CCA GGC ACC GTG TCG CGA TCC GAT GAT CCA AGT GCG GCT ATG GTG CAG 192
Pro Gly Thr Val Ser Arg Ser Asp Asp Pro Ser Ala Ala Met Val Gln 50 55 60
GTG GCA GAG CCA ACC GGC ACT AAA CTC CAC ACA TCT ATC TTC G 235
Val Ala Glu Pro Thr Gly Thr Lys Leu His Thr Ser Ile Phe 65 70 75
(2) INFORMATION FOR SEQ ID NO:138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 78 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:138:
Val Ser Arg Gly Pro Thr Pro His Arg Trp Asn Gly Asn Leu Thr Asp 1 5 10 15
Pro Val Ser Gly Gln Gln Ser Leu Thr Gln Val Pro Arg Glu Ala Gly 20 25 30
Arg Arg Ser Arg Thr His Val Thr His Gly Ile Leu His Ser Glu Ser 35 40 45
Pro Gly Thr Val Ser Arg Ser Asp Asp Pro Ser Ala Ala Met Val Gln 50 55 60
Val Ala Glu Pro Thr Gly Thr Lys Leu His Thr Ser Ile Phe 65 70 75
(2) INFORMATION FOR SEQ ID NO:139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 181 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Reverse-Frame Antigen Y10-13-2
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..181
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:
TCG GGC CAG CAG TCC CTC ACA CAG GTG CCG CAG GAG GCA GGC CGA CGG 48
Ser Gly Gln Gln Ser Leu Thr Gln Val Pro Gln Glu Ala Gly Arg Arg 1 5 10 15
TCC AGA ACA CAC GTC ACG CAC GGG ATT CCC CAC TCG GAG AGC TCA GGC 96
Ser Arg Thr His Val Thr His Gly Ile Pro His Ser Glu Ser Ser Gly 20 25 30
ACC GTG TCG CGA TCC GAT GAT CCA AGT GCG GTT ATG GTG CAG GTG GCA 144
Thr Val Ser Arg Ser Asp Asp Pro Ser Ala Val Met Val Gln Val Ala 35 40 45
GAG CCA ACT GGC ACT AAA CTC CAC ACA TCT ATC TTC G 181
Glu Pro Thr Gly Thr Lys Leu His Thr Ser Ile Phe 50 55 60
(2) INFORMATION FOR SEQ ID NO:140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 60 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:
Ser Gly Gln Gln Ser Leu Thr Gln Val Pro Gln Glu Ala Gly Arg Arg 1 5 10 15
Ser Arg Thr His Val Thr His Gly Ile Pro His Ser Glu Ser Ser Gly 20 25 30
Thr Val Ser Arg Ser Asp Asp Pro Ser Ala Val Met Val Gln Val Ala 35 40 45
Glu Pro Thr Gly Thr Lys Leu His Thr Ser Ile Phe 50 55 60
(2) INFORMATION FOR SEQ ID NO:141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 128 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: M62321 ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:
Met Gly Lys Val Pro Leu His Met Phe Leu Gln Val Leu Gly Pro Thr 1 5 10 15
Ile Leu Ile Val Pro Phe Leu Thr Cys Pro Val Ile Ser Ala Pro Gln 20 25 30
Trp Gln Arg Val Cys Met Met Pro Ser Thr Arg Gln Thr Pro Leu Tyr 35 40 45
Pro Arg Trp Gln Asp Thr Lys Gly Ile Pro Gly Ser Cys Gly Met Ser 50 55 60
Leu Ala Phe Ser Gln Val Leu Lys Ser Leu Asn Thr Ser His Ile Gln 65 70 75 80
Ser Gln Met Ser Leu Ser Gln Glu Pro Glu His Gly Val Val His Ser 85 90 95
Glu Leu Ile His Trp Cys Ser Arg Leu Arg Ser Trp Val Thr Val Arg 100 105 110
Leu Leu Ser Met Ala Val Thr Arg Ala Ala Ala Ser Leu Ser Gly Thr 115 120 125
(2) INFORMATION FOR SEQ ID NO:142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 144 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: M62321 ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:
Met Ala Gly Ser Arg Leu Thr Arg Ser Ser Val Glu Gly Thr Ser Pro 1 5 10 15
Leu Met Ile Leu Asn Ala Thr Arg Ala Pro Ala Thr Pro Ala Pro Tyr 20 25 30
Pro Ala Arg Met Ser Met Arg Thr Phe Pro Ser Pro Thr Leu Pro Met 35 40 45
Ala Ala Pro Ala Lys Pro Ala Pro Thr Lys Ala Val Ala Ala Pro Gly 50 55 60
Ala Ala Ser Trp Ala Ala Thr His Pro Pro Asn Met Leu Lys Arg Arg 65 70 75 80
Val Trp Leu Val Val Ser Gly Leu Val Thr Ala Ala Val Lys Ala Ile 85 90 95
Asn Glu Ala Met Ala Gly Leu Pro Gly Ser Val Asp Lys Pro Ala Lys 100 105 110
Tyr Cys Ile Pro Leu Met Lys Phe His Ile Cys Phe Ala Gln Lys Val 115 120 125
Ser Ser Phe Cys Gln Leu Val Trp Thr Ala Gly Ala Ile Thr Ser Ala 130 135 140
(2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 107 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: M58335, ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:
Met Gly Asn Val Pro Cys His Val Leu Leu Gln Val Leu Gly Pro Thr 1 5 10 15
Ile Leu Met Glu Pro Phe Leu Thr Cys Pro Val Ile Cys Ala Pro His 20 25 30
Gly Gln Val Val Cys Met Met Pro Ser Pro Arg Gln Thr Pro Leu Tyr 35 40 45
Pro Arg Trp His Glu Lys Lys Gly Thr Pro Gly Ser Cys Gly Arg Ser 50 55 60
Leu Asp Trp Ser Gln Val Leu Lys Ser Val Asn Thr Val His Ile Gln 65 70 75 80
Ser Gln Thr Ser Leu Ser His Glu Pro Glu His Gly Val Glu Gln Ser 85 90 95
Ser Leu Ile His Trp Trp Ser Leu Phe Ser Ser 100 105
(2) INFORMATION FOR SEQ ID NO:144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 134 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: M58335, ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:
Met Ser Thr Ser Thr Phe Pro Arg Pro Met Leu Pro Thr Ala Ala Pro 1 5 10 15
Ala Met Pro Ala Pro Thr Lys Ala Glu Ala Ala Leu Gly Gly Ala Ser 20 25 30
Trp Ala Ala Thr His Pro Pro Lys Met Leu Asn Arg Arg Val Leu Trp 35 40 45
Val Val Ser Gly Leu Val Ile Glu Ala Val Asn Ala Ile Asn Asp Ala 50 55 60
Ile Ala Gly Phe Pro Gly Arg Val Asp Lys Pro Ala Lys Tyr Cys Ile 65 70 75 80
Pro Leu Met Lys Phe His Met Cys Phe Ala Gln Asn Val Ser Arg Ala 85 90 95
Arg His Leu Asp Ser Thr Thr Gly Ala Ala Ala Ser Ala Cys Leu Val 100 105 110
Ala Val Cys Ser Asn Pro Ser Ala Phe Cys Leu Asn Cys Ser Ala Ser 115 120 125
Cys Ile Pro Cys Ser Met 130
(2) INFORMATION FOR SEQ ID NO:145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 100 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: D90208, ORF1
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:
Met Gly Asn Val Pro Cys His Val Leu Leu Gln Val Phe Gly Pro Thr 1 5 10 15
Ile Leu Met Glu Pro Phe Leu Thr Cys Pro Val Ile Cys Ala Pro His 20 25 30
Gly Gln Val Val Cys Met Met Pro Ser Pro Arg Gln Thr Pro Leu Tyr 35 40 45
Pro Arg Trp His Asp Arg Lys Gly Ser Pro Gly Asn Arg Gly Arg Ser 50 55 60
Leu Asp Trp Ser Gln Val Leu Lys Ser Leu Asn Thr Val His Ile Gln 65 70 75 80
Ser Gln Thr Ser Phe Ser His Glu Pro Glu Gln Gly Val Glu Gln Ser 85 90 95
Ser Leu Ile His 100
(2) INFORMATION FOR SEQ ID NO:146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 134 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: D90208, ORF2
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
Met Ser Thr Ser Thr Phe Pro Arg Pro Met Leu Pro Thr Ala Ala Pro 1 5 10 15
Ala Met Pro Ala Pro Thr Lys Ala Glu Ala Ala Leu Gly Gly Ala Ser 20 25 30
Trp Ala Ala Thr His Pro Pro Lys Met Leu Asn Arg Arg Val Phe Trp 35 40 45
Val Val Ser Gly Leu Val Ile Glu Ala Val Lys Ala Ile Asn Asp Ala 50 55 60
Ile Ala Gly Phe Pro Gly Arg Val Asp Arg Pro Ala Lys Tyr Cys Ile 65 70 75 80
Pro Leu Met Lys Phe His Met Cys Phe Ala Gln Lys Thr Ser Arg Ala 85 90 95
Arg His Leu Asp Ser Thr Thr Gly Ala Ala Ala Ser Ala Cys Leu Val 100 105 110
Ala Val Cys Ser Asn Pro Ser Ala Phe Cys Leu Asn Cys Ser Ala Ser 115 120 125
Cys Ile Pro Cys Ser Met 130
(2) INFORMATION FOR SEQ ID NO:147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 175 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Long Consensus Sequence, Fig. 11
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 16
(D) OTHER INFORMATION: /note= "where X is G or S"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 23
(D) OTHER INFORMATION: /note= "where X is R or G"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 38
(D) OTHER INFORMATION: /note= "where X is C or R"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 40
(D) OTHER INFORMATION: /note= "where X is V or I"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 44
(D) OTHER INFORMATION: /note= "where X is P or S"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 54
(D) OTHER INFORMATION: /note= "where X is P or H"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 65
(D) OTHER INFORMATION: /note= "where X is M or T"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 69
(D) OTHER INFORMATION: /note= "where X is T or I"
(ix) FEATURE:
(A) NAME/KEY: Modified-site
(B) LOCATION: 92
(D) OTHER INFORMATION: /note= "where X is T or A"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:
His His His His Ser Ile Trp Pro Asp Val Arg Gly Gln Ala Pro Xaa 1 5 10 15
Lys Gly Gln Gly Phe Gly Xaa Pro Pro Leu Pro Glu Gly Ala Pro His 20 25 30
His Pro Leu Thr Asp Xaa Leu Xaa Pro Leu Thr Xaa Arg Pro Ile Asp 35 40 45
His Gly Phe Val Pro Xaa Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala 50 55 60
Xaa Ala Gln Ala Xaa Thr Leu Ala Thr Thr Gln Pro Asn Thr Glu Val 65 70 75 80
Ser Thr Ser Asn Val Ala Ser Lys Gln Asn Ser Xaa Pro Ser Thr Glu 85 90 95
Val Arg Pro Arg Val Ala Glu Ile Pro Ile Lys Arg Ala Ser Gly Lys 100 105 110
Ala Pro Arg Ala Ser Phe His Asn Thr Arg Asn Ile Arg Arg Trp Gly 115 120 125
Pro Asn His Leu Lys Tyr Ser Thr Arg Phe Ala Lys Pro Asn Ile Ile 130 135 140
Leu Thr Thr Gly Ser Pro Arg His Gln Asp Arg Ala Gly Pro Ala Glu 145 150 155 160
Thr Ser Pro Ala Ala Ala Ala Ser Thr Ala Gly Ser Pro Asn Leu 165 170 175
(2) INFORMATION FOR SEQ ID NO:148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Short Consensus Sequence, Fig. 11
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
Thr Asn Arg Ala Ser Ala Ser
1 5
(2) INFORMATION FOR SEQ ID NO:149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: Frame Shift Fragment
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
Pro Pro Ala Leu Ile Glu Leu Arg Ser Ala Met Ala Gln Ala Thr Thr 1 5 10 15
(2) INFORMATION FOR SEQ ID NO:150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 688 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HGV Variant BG34
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 272..688
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:150:
GACTCGGCGC CGACTCGGCG ACCGGCCAAA AGGTGGTGGA TGGGTGATGA CAGGGTTGGT 60
AGGTCGTAAA TCCCGGTCAC CTTGGTAGCC ACTATAGGTG GGTCTTAAGA GAAGGTTAAG 120
ATTCCTCTTG TGCCTGCGGC GAGACCGCGC ACGGTCCACA GGTGTTGGCC CTACCGGTGT 180
GAATAAGGGC CCGACGTCAG GCTCGTCGTT AAACCGAGCC CGTCACCCAC CTGGGCAAAC 240
GACGCCCACG TACGGTCCAC GTCGCCCTTC A ATG CCT CTC TTG GCC AAT AGG 292
Met Pro Leu Leu Ala Asn Arg 1 5
AGT ATC CGG CGA GTT GAC AAG GAC CAG TGG GGG CCG GGA GTC ACG GGG 340 Ser Ile Arg Arg Val Asp Lys Asp Gln Trp Gly Pro Gly Val Thr Gly 10 15 20
ATG GAC CCC GGG CTC TGC CCT TCC CGG TGG AAC GGG AAA CGC ATG GGG 388 Met Asp Pro Gly Leu Cys Pro Ser Arg Trp Asn Gly Lys Arg Met Gly 25 30 35
CCA CCC AGC TCC GCG GCG GCC TGC AGC CGG GGT AGC CCA AGA ACC CTT 436 Pro Pro Ser Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Arg Thr Leu 40 45 50 55
CGG GTG AGG GCG GGT GGC ATT TCT CTT TTC TGT ATC ATC ATG GCA GTC 484 Arg Val Arg Ala Gly Gly Ile Ser Leu Phe Cys Ile Ile Met Ala Val 60 65 70
CTC CTG CTC CTT CTC GTG GTT GAG GCC GGG GCC ATT CTG GCC CCG GCC 532 Leu Leu Leu Leu Leu Val Val Glu Ala Gly Ala Ile Leu Ala Pro Ala 75 80 85
ACC CAC GCT TGT CGA GCG AAT GGA CAA TAT TTC CTC ACA AAC TGT TGC 580 Thr His Ala Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn Cys Cys 90 95 100
GCC CTC GAG GAC ATC GGG TTC TGC CTG GAA GGC GGG TGC CTG GTG GCC 628 Ala Leu Glu Asp Ile Gly Phe Cys Leu Glu Gly Gly Cys Leu Val Ala 105 110 115
TTA GGG TGC ACC ATT TGC ACT GAC CGT TGC TGG CCA CTG TAT CAG GCG 676 Leu Gly Cys Thr Ile Cys Thr Asp Arg Cys Trp Pro Leu Tyr Gln Ala 120 125 130 135
GGT TTG GCT GTG 688
Gly Leu Ala Val
(2) INFORMATION FOR SEQ ID NO:151:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 139 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:151:
Met Pro Leu Leu Ala Asn Arg Ser Ile Arg Arg Val Asp Lys Asp Gln 1 5 10 15
Trp Gly Pro Gly Val Thr Gly Met Asp Pro Gly Leu Cys Pro Ser Arg 20 25 30
Trp Asn Gly Lys Arg Met Gly Pro Pro Ser Ser Ala Ala Ala Cys Ser 35 40 45
Arg Gly Ser Pro Arg Thr Leu Arg Val Arg Ala Gly Gly Ile Ser Leu 50 55 60
Phe Cys Ile Ile Met Ala Val Leu Leu Leu Leu Leu Val Val Glu Ala 65 70 75 80
Gly Ala Ile Leu Ala Pro Ala Thr His Ala Cys Arg Ala Asn Gly Gln 85 90 95
Tyr Phe Leu Thr Asn Cys Cys Ala Leu Glu Asp Ile Gly Phe Cys Leu 100 105 110
Glu Gly Gly Cys Leu Val Ala Leu Gly Cys Thr Ile Cys Thr Asp Arg 115 120 125
Cys Trp Pro Leu Tyr Gln Ala Gly Leu Ala Val 130 135
(2) INFORMATION FOR SEQ ID NO:152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 663 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HGV Variant T55806
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 271..663
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:152:
GACTCGGCGC CGACTCGGCG ACCGGCCAAA AGGTGGTGGA TGGGTGATGC CAGGGTTGGT 60
AGGTCGTAAA TCCCGGTCAT CTTGGTAGCC ACTATAGGTG GGTCTTAAGA GAAGGTTAAG 120
ATTCCTCTTG TGCCTGCGGC GAGACCGCGC ACGGTCCACA GGTGTTGGCC CTACCGGTGG 180
AATAAGGGCC CGACGTCAGG CTCGTCGTTA AACCGAGCCC GTCACCCACC TGGGCAAACG 240
ACGCTCACGT ACGGTCCACG TCGCCCTTCA ATG TCT CTC TTG ACC AAT AGG TTT 294
Met Ser Leu Leu Thr Asn Arg Phe 1 5
ATC CGG CGA GTT GAC AAG GAC CAG TGG GGG CCG GGG GTT ACG GGG ACG 342 Ile Arg Arg Val Asp Lys Asp Gln Trp Gly Pro Gly Val Thr Gly Thr 10 15 20
GAC CCC GAA CCC TGC CCT TCC CGG TGG GCC GGG AAA TGC ATG GGG CCA 390 Asp Pro Glu Pro Cys Pro Ser Arg Trp Ala Gly Lys Cys Met Gly Pro 25 30 35 40
CCC AGC TCC GCG GCG GCC TGC AGC CGG GGT AGC CCA AGA ATC CTT CGG 438 Pro Ser Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Arg Ile Leu Arg 45 50 55
GTG AGG GCG GGT GGC ATT TCT CTT TTC TAT ACC ATC ATG GCA GTC CTT 486 Val Arg Ala Gly Gly Ile Ser Leu Phe Tyr Thr Ile Met Ala Val Leu 60 65 70
CTG CTC TTC TTC GTG GTT GAG GCC GGG GCG ATT CTC GCC CCG GCC ACC 534 Leu Leu Phe Phe Val Val Glu Ala Gly Ala Ile Leu Ala Pro Ala Thr 75 80 85
CAC GCT TGT CGG GCG AAT GGG CAA TAT TTC CTC ACA AAT TGT TGC GCC 582 His Ala Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn Cys Cys Ala 90 95 100
CCA GAG GAT GTT GGG TTC TGC CTG GAG GGC GGA TGC CTG GTG GCT CTG 630 Pro Glu Asp Val Gly Phe Cys Leu Glu Gly Gly Cys Leu Val Ala Leu 105 110 115 120
GGG TGT ACG ATT TGC ACT GAC CGT TGC TGG CCA 663
Gly Cys Thr Ile Cys Thr Asp Arg Cys Trp Pro 125 130
(2) INFORMATION FOR SEQ ID NO:153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 131 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153:
Met Ser Leu Leu Thr Asn Arg Phe Ile Arg Arg Val Asp Lys Asp Gln 1 5 10 15
Trp Gly Pro Gly Val Thr Gly Thr Asp Pro Glu Pro Cys Pro Ser Arg 20 25 30
Trp Ala Gly Lys Cys Met Gly Pro Pro Ser Ser Ala Ala Ala Cys Ser 35 40 45
Arg Gly Ser Pro Arg Ile Leu Arg Val Arg Ala Gly Gly Ile Ser Leu 50 55 60
Phe Tyr Thr Ile Met Ala Val Leu Leu Leu Phe Phe Val Val Glu Ala 65 70 75 80
Gly Ala Ile Leu Ala Pro Ala Thr His Ala Cys Arg Ala Asn Gly Gln 85 90 95
Tyr Phe Leu Thr Asn Cys Cys Ala Pro Glu Asp Val Gly Phe Cys Leu 100 105 110
Glu Gly Gly Cys Leu Val Ala Leu Gly Cys Thr Ile Cys Thr Asp Arg 115 120 125
Cys Trp Pro 130
(2) INFORMATION FOR SEQ ID NO:154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 632 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HGV Variant EB20-2
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 271..632
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:154:
GACTCGGCGC CGACTCGGCG ACCGGCCAAA AGGTGGTGGA TGGGTGATGC CAGGGTTGGT 60
AGGTCGTAAA TCCCGGTCAT CTTGGTAGCC ACTATAGGTG GGTCTTAAGA GAAGGTTAAG 120
ATTCCTCTTG TGCCTGCGGC GAGACCGCGC ACGGTCCACA GGTGTTGGCC CTACCGGTGT 180
AATAAGGGCC CGACGTCAGG CTCGTCGTTA AACCGAGCCC GTCACCCACC TGGGCAAACG 240
ACGCCCACGT ACGGTCCACG TCGCCCTTCA ATG CCT CTC TTG GCC AAT AGG AGT 294
Met Pro Leu Leu Ala Asn Arg Ser
1 5
TAT CTC CGG CGA GTT GGC AAG GAC CAG TGG GGG CCG GGG GTT ACG GGG 342 Tyr Leu Arg Arg Val Gly Lys Asp Gln Trp Gly Pro Gly Val Thr Gly 10 15 20
AAG GAC CCC GAA CCC TGC CCT TCC CGG TGG GCC GGG AAA TGC ATG GGG 390
Lys Asp Pro Glu Pro Cys Pro Ser Arg Trp Ala Gly Lys Cys Met Gly
25 30 35 40
CCA CCC AGC TCC GCG GCG GCC TGC AGC CGG GGT AGC CCA AAA AAC CTT 438
Pro Pro Ser Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Lys Asn Leu
45 50 55
CGG GTG AGG GCG GGT GGC ATT TTC TTT TCC TAT ACC ATC ATG GCA GTC 486
Arg Val Arg Ala Gly Gly Ile Phe Phe Ser Tyr Thr Ile Met Ala Val 60 65 70
CTT CTG CTC CTT CTC GTG GTT GAG GCC GGG GCC ATT TTG GCC CCG GCC 534
Leu Leu Leu Leu Leu Val Val Glu Ala Gly Ala Ile Leu Ala Pro Ala 75 80 85
ACC CAC GCT TGC AGA GCT AAT GGG CAA TAT TTC CTC ACA AAC TGT TGT 582
Thr His Ala Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn Cys Cys 90 95 100
GCC TTG GAG GAC ATC GGG TTC TGC CTG GAA GGC GGA TGC TTG GTG GCG CT 632
Ala Leu Glu Asp Ile Gly Phe Cys Leu Glu Gly Gly Cys Leu Val Ala
105 110 115 120
(2) INFORMATION FOR SEQ ID NO:155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 120 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:155:
Met Pro Leu Leu Ala Asn Arg Ser Tyr Leu Arg Arg Val Gly Lys Asp 1 5 10 15
Gln Trp Gly Pro Gly Val Thr Gly Lys Asp Pro Glu Pro Cys Pro Ser 20 25 30
Arg Trp Ala Gly Lys Cys Met Gly Pro Pro Ser Ser Ala Ala Ala Cys 35 40 45
Ser Arg Gly Ser Pro Lys Asn Leu Arg Val Arg Ala Gly Gly Ile Phe 50 55 60
Phe Ser Tyr Thr Ile Met Ala Val Leu Leu Leu Leu Leu Val Val Glu 65 70 75 80
Ala Gly Ala Ile Leu Ala Pro Ala Thr His Ala Cys Arg Ala Asn Gly 85 90 95
Gln Tyr Phe Leu Thr Asn Cys Cys Ala Leu Glu Asp Ile Gly Phe Cys 100 105 110
Leu Glu Gly Gly Cys Leu Val Ala 115 120
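The CDS LOCATION fields of the HGV variant entries can be cross-checked against the stated protein lengths with simple arithmetic: a span start..end covers end - start + 1 bases, and each complete codon takes three. The sketch below is illustrative only and not part of the original listing; the function name `cds_codons` is hypothetical, and the location for SEQ ID NO:150 is read as 272..688.

```python
def cds_codons(location: str) -> tuple[int, int]:
    """Return (complete codons, leftover bases) for a 'start..end' CDS span."""
    start, end = (int(part) for part in location.split(".."))
    length = end - start + 1  # both endpoints are inclusive
    return divmod(length, 3)


# CDS spans as given in the FEATURE blocks of the listing.
spans = {
    "SEQ ID NO:150": "272..688",
    "SEQ ID NO:152": "271..663",
    "SEQ ID NO:154": "271..632",
    "SEQ ID NO:156": "276..9005",
}

for seq_id, location in spans.items():
    codons, leftover = cds_codons(location)
    print(f"{seq_id}: {codons} codons, {leftover} leftover base(s)")
```

Under these assumptions the counts agree with the listed lengths of 139, 131 and 120 amino acids for SEQ ID NOS:151, 153 and 155; the two leftover bases for SEQ ID NO:154 correspond to the incomplete final codon (CT) at positions 631 to 632.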
(2) INFORMATION FOR SEQ ID NO:156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9103 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to mRNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(C) INDIVIDUAL ISOLATE: HGV-JC Variant
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 276..9005
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:156:
CAATGACTCG GCGCCGACTC GGCGACCGGC CAAAAGGTGG TGGATGGGTG ATGACAGGGT 60
TGGTAGGTCG TAAATCCCGG TCACCTTGGT AGCCACTATA GGTGGGTCTT AAGAGAAGGT 120
TAAGATTCCT CTTGTGCCTG CGGCGAGACC GCGCACGGTC CACAGGTGTT GGCCCTACCG 180
GTGGGAATAA GGGCCCGACG TCAGGCTCGT CGTTAAACCG AGCCCGTAAC CCGCCTGGGC 240
AAACGACGCC CACGTACGGT CCACGTCGCC CTTCA ATG TCG CTC TTG ACC AAT 293
Met Ser Leu Leu Thr Asn 1 5
AGG CTT AGC CGG CGA GTT GAC AAG GAC CAG TGG GGG CCG GGG TTT ATG 341 Arg Leu Ser Arg Arg Val Asp Lys Asp Gln Trp Gly Pro Gly Phe Met 10 15 20
GGG AAG GAC CCC AAA CCC TGC CCT TCC CGG CGG ACC GGG AAA TGC ATG 389 Gly Lys Asp Pro Lys Pro Cys Pro Ser Arg Arg Thr Gly Lys Cys Met 25 30 35
GGG CCA CCC AGC TCC GCG GCG GCC TGC AGC CGG GGT AGC CCA AGA ATC 437 Gly Pro Pro Ser Ser Ala Ala Ala Cys Ser Arg Gly Ser Pro Arg Ile 40 45 50
CTT CGG GTG AGG GCG GGT GGC ATT TCT CTT CCT TAT ACC ATC ATG GAA 485 Leu Arg Val Arg Ala Gly Gly Ile Ser Leu Pro Tyr Thr Ile Met Glu 55 60 65 70
GCC CTC CTG TTC CTC CTC GGG GTG GAG GCC GGG GCC ATT CTG GCC CCG 533 Ala Leu Leu Phe Leu Leu Gly Val Glu Ala Gly Ala Ile Leu Ala Pro 75 80 85
GCC ACC CAC GCT TGT CGA GCG AAT GGG CAA TAT TTC CTC ACA AAC TGT 581 Ala Thr His Ala Cys Arg Ala Asn Gly Gln Tyr Phe Leu Thr Asn Cys 90 95 100
TGT GCT CCA GAG GAC ATT GGG TTC TGC CTC GAA GGC GGT TGC CTT GTG 629 Cys Ala Pro Glu Asp Ile Gly Phe Cys Leu Glu Gly Gly Cys Leu Val 105 110 115
GCC CTG GGG TGC ACA GTT TGC ACT GAC CGA TGC TGG CCG CTG TAT CAG 677 Ala Leu Gly Cys Thr Val Cys Thr Asp Arg Cys Trp Pro Leu Tyr Gln 120 125 130
GCG GGC TTG GCT GTG CGG CCT GGC AAG TCC GCA GCC CAG CTG GTG GGG 725 Ala Gly Leu Ala Val Arg Pro Gly Lys Ser Ala Ala Gln Leu Val Gly 135 140 145 150
CAA CTG GGT GGC CTC TAC GGG CCC TTG TCG GTG TCG GCC TAC GTG GCC 773 Gln Leu Gly Gly Leu Tyr Gly Pro Leu Ser Val Ser Ala Tyr Val Ala 155 160 165
GGC ATC CTG GGC CTG GGT GAG GTG TAC TCG GGT GTC CTA ACA GTT GGT 821 Gly Ile Leu Gly Leu Gly Glu Val Tyr Ser Gly Val Leu Thr Val Gly 170 175 180
GTT GCG TTG ACG CGC CGG GTC TAC CCG ATG CCC AAC CTG ACG TGT GCA 869 Val Ala Leu Thr Arg Arg Val Tyr Pro Met Pro Asn Leu Thr Cys Ala 185 190 195
GTA GAG TGT GAG CTT AAG TGG GAA AGT GAG TTT TGG AGA TGG ACT GAG 917 Val Glu Cys Glu Leu Lys Trp Glu Ser Glu Phe Trp Arg Trp Thr Glu 200 205 210
CAG CTG GCC TCC AAT TAC TGG ATT CTG GAA TAC CTT TGG AAG GTC CCG 965 Gln Leu Ala Ser Asn Tyr Trp Ile Leu Glu Tyr Leu Trp Lys Val Pro 215 220 225 230
TTT GAC TTC TGG AGA GGC GTG CTA AGC CTG ACT CCC TTG CTG GTT TGC 1013 Phe Asp Phe Trp Arg Gly Val Leu Ser Leu Thr Pro Leu Leu Val Cys 235 240 245
GTG GCC GCG TTG CTG CTG CTG GAG CAA CGG ATT GTC ATG GTC TTC CTG 1061 Val Ala Ala Leu Leu Leu Leu Glu Gln Arg Ile Val Met Val Phe Leu 250 255 260
TTG GTG ACG ATG GCC GGG ATG TCG CAA GGC GCT CCG GCC TCC GTT TTG 1109 Leu Val Thr Met Ala Gly Met Ser Gln Gly Ala Pro Ala Ser Val Leu 265 270 275
GGG TCT CGC CCC TTT GAC TAC GGG TTG ACA TGG CAG TCT TGT TCC TGC 1157 Gly Ser Arg Pro Phe Asp Tyr Gly Leu Thr Trp Gln Ser Cys Ser Cys 280 285 290
AGG GCT AAT GGG TCG CGC TAT ACT ACT GGG GAG AAG GTG TGG GAC CGT 1205 Arg Ala Asn Gly Ser Arg Tyr Thr Thr Gly Glu Lys Val Trp Asp Arg 295 300 305 310
GGG AAC GTC ACG CTC CTG TGT GAC TGC CCC AAC GGC CCC TGG GTG TGG 1253 Gly Asn Val Thr Leu Leu Cys Asp Cys Pro Asn Gly Pro Trp Val Trp 315 320 325
TTG CCG GCC TTT TGC CAA GCA ATC GGC TGG GGC GAT CCC ATC ACT CAT 1301 Leu Pro Ala Phe Cys Gln Ala Ile Gly Trp Gly Asp Pro Ile Thr His 330 335 340
TGG AGC CAC GGC CAA AAT CGG TGG CCC CTC TCA TGC CCC CAG TAT GTC 1349 Trp Ser His Gly Gln Asn Arg Trp Pro Leu Ser Cys Pro Gln Tyr Val 345 350 355
TAT GGG TCT GTT TCA GTC ACT TGC GTG TGG GGT TCC GTC TCT TGG TTT 1397 Tyr Gly Ser Val Ser Val Thr Cys Val Trp Gly Ser Val Ser Trp Phe 360 365 370
GCC TCG ACT GGC GGT CGC GAC TCG AAG ATC GAT GTG TGG AGT CTG GTG 1445 Ala Ser Thr Gly Gly Arg Asp Ser Lys Ile Asp Val Trp Ser Leu Val 375 380 385 390
CCG GTT GGT TCC GCC AGC TGC ACC ATA GCC GCT CTT GGA TCG TCG GAT 1493 Pro Val Gly Ser Ala Ser Cys Thr Ile Ala Ala Leu Gly Ser Ser Asp 395 400 405
CGG GAC ACG GTA GTT GAG CTC TCC GAG TGG GGA GTC CCG TGC GCA ACG 1541 Arg Asp Thr Val Val Glu Leu Ser Glu Trp Gly Val Pro Cys Ala Thr 410 415 420
TGC ATT CTG GAT CGT CGG CCG GCC TCG TGC GGC ACC TGT GTG AGA GAC 1589 Cys Ile Leu Asp Arg Arg Pro Ala Ser Cys Gly Thr Cys Val Arg Asp 425 430 435
TGC TGG CCC GAA ACC GGG TCG GTT AGG TTT CCA TTC CAT CGG TGC GGC 1637 Cys Trp Pro Glu Thr Gly Ser Val Arg Phe Pro Phe His Arg Cys Gly 440 445 450
GCG GGG CCT AAG CTG ACA AAG GAC TTG GAA GCT GTG CCC TTC GTC AAT 1685 Ala Gly Pro Lys Leu Thr Lys Asp Leu Glu Ala Val Pro Phe Val Asn 455 460 465 470
AGG ACA ACT CCC TTC ACC ATA AGG GGC CCC CTG GGC AAC CAG GGG AGA 1733 Arg Thr Thr Pro Phe Thr Ile Arg Gly Pro Leu Gly Asn Gln Gly Arg 475 480 485
GGC AAC CCG GTG CGG TCG CCC TTG GGT TTT GGG TCC TAC GCC ATG ACC 1781 Gly Asn Pro Val Arg Ser Pro Leu Gly Phe Gly Ser Tyr Ala Met Thr 490 495 500
AAG ATC CGA GAC TCC TTA CAT TTG GTG AAA TGT CCC ACA CCA GCC ATT 1829 Lys Ile Arg Asp Ser Leu His Leu Val Lys Cys Pro Thr Pro Ala Ile 505 510 515
GAG CCT CCC ACC GGG ACG TTT GGG TTC TTC CCC GGA GTG CCG CCT CTT 1877 Glu Pro Pro Thr Gly Thr Phe Gly Phe Phe Pro Gly Val Pro Pro Leu 520 525 530
AAC AAC TGC CTG CTG TTG GGC ACG GAA GTG TCC GAA GCG CTG GGC GGG 1925 Asn Asn Cys Leu Leu Leu Gly Thr Glu Val Ser Glu Ala Leu Gly Gly 535 540 545 550
GCC GGC CTC ACG GGG GGG TTC TAT GAA CCC CTG GTG CGC AGG CGT TCG 1973 Ala Gly Leu Thr Gly Gly Phe Tyr Glu Pro Leu Val Arg Arg Arg Ser 555 560 565
GAG CTG ATG GGG CGC CGA AAT CCG GTT TGC CCG GGG TTT GCA TGG CTG 2021 Glu Leu Met Gly Arg Arg Asn Pro Val Cys Pro Gly Phe Ala Trp Leu 570 575 580
TCC TCG GGT CGA CCT GAC GGG TTT ATA CAC GTC CAG GGC CAC TTG CAG 2069 Ser Ser Gly Arg Pro Asp Gly Phe Ile His Val Gln Gly His Leu Gln 585 590 595
GAG GTC GAT GCT GGC AAC TTC ATC CCT CCA CCT CGC TGG TTG CTC TTG 2117 Glu Val Asp Ala Gly Asn Phe Ile Pro Pro Pro Arg Trp Leu Leu Leu 600 605 610
GAC TTT GTG TTT GTC CTG TTA TAC CTG ATG AAG CTG GCT GAG GCA CGG 2165 Asp Phe Val Phe Val Leu Leu Tyr Leu Met Lys Leu Ala Glu Ala Arg 615 620 625 630
CTG GTC CCG TTG ATC TTG CTT CTG CTG TGG TGG TGG GTG AAC CAG TTG 2213 Leu Val Pro Leu Ile Leu Leu Leu Leu Trp Trp Trp Val Asn Gln Leu 635 640 645
GCA GTC CTT GGA CTG CCG GCT GTG GAC GCC GCC GTG GCT GGT GAG GTC 2261 Ala Val Leu Gly Leu Pro Ala Val Asp Ala Ala Val Ala Gly Glu Val 650 655 660
TTC GCG GGC CCG GCC CTG TCG TGG TGT CTG GGC CTC CCC ACC GTT AGT 2309 Phe Ala Gly Pro Ala Leu Ser Trp Cys Leu Gly Leu Pro Thr Val Ser 665 670 675
ATG ATC CTG GGC TTA GCA AAC CTG GTG TTG TAT TTC CGG TGG ATG GGT 2357 Met Ile Leu Gly Leu Ala Asn Leu Val Leu Tyr Phe Arg Trp Met Gly 680 685 690
CCC CAA CGC CTC ATG TTC CTC GTG TTG TGG AAG CTC GCT CGG GGA GCC 2405 Pro Gln Arg Leu Met Phe Leu Val Leu Trp Lys Leu Ala Arg Gly Ala 695 700 705 710
TTC CCG CTG GCA CTT CTG ATG GGG ATC TCG GCA ACC CGC GGG CGC ACC 2453 Phe Pro Leu Ala Leu Leu Met Gly Ile Ser Ala Thr Arg Gly Arg Thr 715 720 725
TCG GTG CTC GGG GCC GAG TTC TGC TTC GAT GTC ACA TTC GAG GTG GAC 2501 Ser Val Leu Gly Ala Glu Phe Cys Phe Asp Val Thr Phe Glu Val Asp 730 735 740
ACG TCG GTT TTG GGC TGG GTG GTG GCC AGT GTG GTA GCC TGG GCC ATT 2549 Thr Ser Val Leu Gly Trp Val Val Ala Ser Val Val Ala Trp Ala Ile 745 750 755
GCG CTC CTG AGC TCG ATG AGC GCG GGA GGG TGG AGG CAC AAG GCC GTG 2597 Ala Leu Leu Ser Ser Met Ser Ala Gly Gly Trp Arg His Lys Ala Val 760 765 770
ATC TAT AGG ACG TGG TGT AAG GGG TAC CAG GCA ATA CGC CAA CGG GTG 2645 Ile Tyr Arg Thr Trp Cys Lys Gly Tyr Gln Ala Ile Arg Gln Arg Val 775 780 785 790
GTG CGG AGC CCC CTC GGG GAG GGG CGG CCC ACC AAA CCC TTG ACG TTT 2693 Val Arg Ser Pro Leu Gly Glu Gly Arg Pro Thr Lys Pro Leu Thr Phe 795 800 805
GCT TGG TGC TTG GCC TCA TAC ATC TGG CCG GAT GCT GTG ATG ATG GTG 2741 Ala Trp Cys Leu Ala Ser Tyr Ile Trp Pro Asp Ala Val Met Met Val 810 815 820
GTG GTA GCC TTG GTG CTC CTC TTT GGC CTG TTC GAC GCG TTG GAC TGG 2789 Val Val Ala Leu Val Leu Leu Phe Gly Leu Phe Asp Ala Leu Asp Trp 825 830 835
GCT TTG GAG GAG CTC TTG GTG TCC CGG CCC TCG TTA CGG CGT CTG GCC 2837 Ala Leu Glu Glu Leu Leu Val Ser Arg Pro Ser Leu Arg Arg Leu Ala 840 845 850
CGG GTG GTT GAG TGC TGT GTG ATG GCG GGA GAG AAG GCC ACA ACC GTC 2885 Arg Val Val Glu Cys Cys Val Met Ala Gly Glu Lys Ala Thr Thr Val 855 860 865 870
CGG CTG GTC TCC AAG ATG TGC GCG AGA GGG GCC TAT TTG TTT GAC CAT 2933 Arg Leu Val Ser Lys Met Cys Ala Arg Gly Ala Tyr Leu Phe Asp His 875 880 885
ATG GGC TCT TTT TCG CGC GCT GTC AAG GAG CGC CTG CTG GAG TGG GAC 2981 Met Gly Ser Phe Ser Arg Ala Val Lys Glu Arg Leu Leu Glu Trp Asp 890 895 900
GCG GCT TTG GAA CCC CTG TCA TTC ACT AGG ACG GAC TGT CGC ATC ATT 3029 Ala Ala Leu Glu Pro Leu Ser Phe Thr Arg Thr Asp Cys Arg Ile Ile 905 910 915
AGA GAT GCT GCG AGG ACC TTG GCC TGC GGG CAG TGC GTC ATG GGC TTG 3077 Arg Asp Ala Ala Arg Thr Leu Ala Cys Gly Gln Cys Val Met Gly Leu 920 925 930
CCT GTG GTA GCG CGC CGT GGT GAC GAG GTT CTT ATC GGT GTC TTT CAG 3125 Pro Val Val Ala Arg Arg Gly Asp Glu Val Leu Ile Gly Val Phe Gln 935 940 945 950
GAT GTG AAC CAT TTG CCT CCC GGA TTC GTC CCG ACC GCA CCC GTT GTC 3173 Asp Val Asn His Leu Pro Pro Gly Phe Val Pro Thr Ala Pro Val Val 955 960 965
ATC CGG CGG TGC GGG AAG GGG TTT CTG GGG GTC ACT AAG GCT GCC TTG 3221 Ile Arg Arg Cys Gly Lys Gly Phe Leu Gly Val Thr Lys Ala Ala Leu 970 975 980
ACT GGT CGG GAT CCT GAC TTA CAT CCA GGG AAC GTC ATG GTG TTG GGG 3269 Thr Gly Arg Asp Pro Asp Leu His Pro Gly Asn Val Met Val Leu Gly 985 990 995
ACG GCT ACG TCG CGA AGC ATG GGG ACA TGC CTG AAC GGC CTG CTG TTC 3317 Thr Ala Thr Ser Arg Ser Met Gly Thr Cys Leu Asn Gly Leu Leu Phe 1000 1005 1010
ACG ACT TTC CAT GGG GCT TCA TCC CGA ACC ATC GCC ACG CCC GTG GGG 3365 Thr Thr Phe His Gly Ala Ser Ser Arg Thr Ile Ala Thr Pro Val Gly 1015 1020 1025 1030
GCC CTT AAT CCC AGG TGG TGG TCC GCC AGT GAT GAC GTC ACG GTG TAC 3413 Ala Leu Asn Pro Arg Trp Trp Ser Ala Ser Asp Asp Val Thr Val Tyr 1035 1040 1045
CCG CTC CCG GAT GGG GCA ACC TCG TTG ACG CCC TGC ACT TGC CAG GCT 3461 Pro Leu Pro Asp Gly Ala Thr Ser Leu Thr Pro Cys Thr Cys Gln Ala 1050 1055 1060
GAG TCC TGT TGG GTC ATA CGG TCC GAC GGG GCT TTG TGC CAT GGC TTG 3509 Glu Ser Cys Trp Val Ile Arg Ser Asp Gly Ala Leu Cys His Gly Leu 1065 1070 1075
AGT AAG GGA GAC AAG GTG GAG CTA GAT GTG GCC ATG GAG GTC TCA GAT 3557 Ser Lys Gly Asp Lys Val Glu Leu Asp Val Ala Met Glu Val Ser Asp 1080 1085 1090
TTC CGT GGC TCG TCC GGC TCA CCT GTC CTG TGC GAC GAG GGG CAC GCA 3605 Phe Arg Gly Ser Ser Gly Ser Pro Val Leu Cys Asp Glu Gly His Ala 1095 1100 1105 1110
GTA GGA ATG CTC GTG TCG GTG CTC CAC TCG GGT GGT CGG GTC ACC GCG 3653 Val Gly Met Leu Val Ser Val Leu His Ser Gly Gly Arg Val Thr Ala 1115 1120 1125
GCT CGA TTC ACC AGG CCG TGG ACC CAG GTC CCA ACA GAT GCT AAG ACC 3701 Ala Arg Phe Thr Arg Pro Trp Thr Gln Val Pro Thr Asp Ala Lys Thr 1130 1135 1140
ACC ACT GAA CCC CCT CCG GTG CCG GCA AAG GGA GTT TTC AAG GAA GCC 3749 Thr Thr Glu Pro Pro Pro Val Pro Ala Lys Gly Val Phe Lys Glu Ala 1145 1150 1155
CCA CTG TTT ATG CCC ACG GGC GCA GGA AAG AGC ACG CGC GTC CCG TTG 3797 Pro Leu Phe Met Pro Thr Gly Ala Gly Lys Ser Thr Arg Val Pro Leu 1160 1165 1170
GAG TAT GGC AAC ATG GGG CAC AAG GTC CTG ATT TTG AAC CCC TCG GTG 3845 Glu Tyr Gly Asn Met Gly His Lys Val Leu Ile Leu Asn Pro Ser Val 1175 1180 1185 1190
GCG ACA GTG AGG GCC ATG GGC CCT TAC ATG GAG CGA CTG GCG GGA AAA 3893 Ala Thr Val Arg Ala Met Gly Pro Tyr Met Glu Arg Leu Ala Gly Lys 1195 1200 1205
CAT CCA AGT ATC TAC TGT GGC CAT GAC ACC ACT GCC TTC ACA AGG ATC 3941 His Pro Ser Ile Tyr Cys Gly His Asp Thr Thr Ala Phe Thr Arg Ile 1210 1215 1220
ACT GAT TCC CCC TTA ACG TAC TCT ACC TAT GGG AGG TTT CTG GCC AAC 3989 Thr Asp Ser Pro Leu Thr Tyr Ser Thr Tyr Gly Arg Phe Leu Ala Asn 1225 1230 1235
CCT AGG CAG ATG CTG CGA GGT GTG TCG GTG GTC ATT TGC GAT GAA TGC 4037 Pro Arg Gln Met Leu Arg Gly Val Ser Val Val Ile Cys Asp Glu Cys 1240 1245 1250
CAC AGT CAT GAT TCC ACT GTG TTG TTG GGG ATT GGA CGG GTC CGG GAG 4085 His Ser His Asp Ser Thr Val Leu Leu Gly Ile Gly Arg Val Arg Glu 1255 1260 1265 1270
CTG GCA CGA GAG TGT GGG GTG CAG CTT GTG CTC TAC GCC ACT GCC ACG 4133 Leu Ala Arg Glu Cys Gly Val Gln Leu Val Leu Tyr Ala Thr Ala Thr 1275 1280 1285
CCT CCT GGG TCC CCC ATG ACT CAG CAT CCG TCA ATC ATT GAG ACC AAA 4181 Pro Pro Gly Ser Pro Met Thr Gln His Pro Ser Ile Ile Glu Thr Lys 1290 1295 1300
TTG GAT GTG GGT GAG ATT CCC TTC TAT GGG CAT GGC ATA CCC CTC GAG 4229 Leu Asp Val Gly Glu Ile Pro Phe Tyr Gly His Gly Ile Pro Leu Glu 1305 1310 1315
CGG ATG CGG ACC GGT AGG CAC CTC GTA TTC TGC TAC TCT AAG GCA GAG 4277 Arg Met Arg Thr Gly Arg His Leu Val Phe Cys Tyr Ser Lys Ala Glu 1320 1325 1330
TGT GAG CGG CTA GCC GGT CAG TTT TCT GCT AGG GGA GTT AAC GCC ATA 4325 Cys Glu Arg Leu Ala Gly Gln Phe Ser Ala Arg Gly Val Asn Ala Ile 1335 1340 1345 1350
GCC TAT TAC AGG GGA AAA GAC AGT TCT ATC ATC AAG GAC GGA GAT CTG 4373 Ala Tyr Tyr Arg Gly Lys Asp Ser Ser Ile Ile Lys Asp Gly Asp Leu 1355 1360 1365
GTG GTG TGC GCG ACC GAC GCG CTA TCC ACT GGA TAC ACT GGG AAC TTC 4421 Val Val Cys Ala Thr Asp Ala Leu Ser Thr Gly Tyr Thr Gly Asn Phe 1370 1375 1380
GAT TCT GTC ACC GAC TGT GGG TTA GTG GTG GAG GAG GTC GTC GAG GTG 4469 Asp Ser Val Thr Asp Cys Gly Leu Val Val Glu Glu Val Val Glu Val 1385 1390 1395
ACC CTT GAT CCC ACC ATT ACC ATC TCC CTG CGG ACA GTG CCC GCG TCG 4517 Thr Leu Asp Pro Thr Ile Thr Ile Ser Leu Arg Thr Val Pro Ala Ser 1400 1405 1410
GCA GAA CTG TCG ATG CAG AGA CGA GGA CGC ACG GGT AGA GGC AGG TCT 4565 Ala Glu Leu Ser Met Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Ser 1415 1420 1425 1430
GGG CGC TAC TAC TAC GCC GGG GTC GGA AAG GCC CCC GCG GGT GTG GTG 4613 Gly Arg Tyr Tyr Tyr Ala Gly Val Gly Lys Ala Pro Ala Gly Val Val 1435 1440 1445
CGC TCG GGT CCT GTC TGG TCG GCG GTG GAG GCC GGA GTG ACC TGG TAT 4661 Arg Ser Gly Pro Val Trp Ser Ala Val Glu Ala Gly Val Thr Trp Tyr 1450 1455 1460
GGA ATG GAA CCT GAC TTG ACA GCT AAC CTA TTG AGA CTT TAC GAC GAC 4709 Gly Met Glu Pro Asp Leu Thr Ala Asn Leu Leu Arg Leu Tyr Asp Asp 1465 1470 1475
TGC CCT TAC ACC GCA GCC GTC GCA GCT GAC ATC GGT GAA GCC GCG GTG 4757 Cys Pro Tyr Thr Ala Ala Val Ala Ala Asp Ile Gly Glu Ala Ala Val 1480 1485 1490
TTT TTC TCC GGG CTA GCC CCG TTG AGG ATG CAT CCC GAT GTT AGC TGG 4805 Phe Phe Ser Gly Leu Ala Pro Leu Arg Met His Pro Asp Val Ser Trp 1495 1500 1505 1510
GCA AAA GTG CGC GGC GTC AAC TGG CCC CTC TTG GTG GGT GTT CAG CGG 4853 Ala Lys Val Arg Gly Val Asn Trp Pro Leu Leu Val Gly Val Gln Arg 1515 1520 1525
ACC ATG TGC CGG GAA ACA CTG TCT CCC GGA CCA TCG GAC GAC CCC CAA 4901 Thr Met Cys Arg Glu Thr Leu Ser Pro Gly Pro Ser Asp Asp Pro Gln 1530 1535 1540
TGG GCA GGT CTG AAG GGC CCG AAT CCT GTT CCA CTA CTG CTG AGG TGG 4949 Trp Ala Gly Leu Lys Gly Pro Asn Pro Val Pro Leu Leu Leu Arg Trp 1545 1550 1555
GGC AAT GAT TTA CCA TCA AAA GTG GCC GGC CAC CAC ATT GTT GAC GAC 4997 Gly Asn Asp Leu Pro Ser Lys Val Ala Gly His His Ile Val Asp Asp 1560 1565 1570
CTG GTT CGT AGG CTT GGT GTG GCG GAG GGT TAT GTC CGC TGC GAT GCG 5045 Leu Val Arg Arg Leu Gly Val Ala Glu Gly Tyr Val Arg Cys Asp Ala 1575 1580 1585 1590
GGG CCG ATC TTA ATG GTC GGC CTC GCT ATC GCG GGG GGG ATG ATC TAC 5093 Gly Pro Ile Leu Met Val Gly Leu Ala Ile Ala Gly Gly Met Ile Tyr 1595 1600 1605
GCA TCT TAC ACC GGG TCT TTA GTG GTG GTG ACA GAC TGG GAT GTA AAG 5141 Ala Ser Tyr Thr Gly Ser Leu Val Val Val Thr Asp Trp Asp Val Lys 1610 1615 1620
GGG GGT GGC AGC CCT CTT TAT CGG CAT GGA GAC CAG GCC ACG CCA CAG 5189 Gly Gly Gly Ser Pro Leu Tyr Arg His Gly Asp Gin Ala Thr Pro Gin 1625 1630 1635
CCG GTT GTG CAG GTC CCC CCG GTA GAC CAT CGG CCG GGG GGG GAG TCT 5237 Pro Val Val Gin Val Pro Pro Val Asp His Arg Pro Gly Gly Glu Ser 1640 1645 1650
GCG CCT TCG GAT GCC AAG ACA GTG ACA GAT GCG GTG GCG GCC ATC CAG 5285 Ala Pro Ser Asp Ala Lys Thr Val Thr Asp Ala Val Ala Ala He Gin 1655 1660 1665 1670
GTG GAT TGC GAT TGG TCA GTC ATG ACC CTG TCG ATC GGG GAA GTG CTG 5333 Val Asp Cys Asp Trp Ser Val Met Thr Leu Ser He Gly Glu Val Leu 1675 1680 1685
TCC TTG GCT CAG GCT AAA ACA GCT GAG GCC TAC ACG GCA ACC GCC AAG 5381 Ser Leu Ala Gin Ala Lys Thr Ala Glu Ala Tyr Thr Ala Thr Ala Lys 1690 1695 1700
TGG CTC GCT GGC TGC TAC ACG GGG ACG CGG GCC GTT CCC ACT GTT TCA 5429 Trp Leu Ala Gly Cys Tyr Thr Gly Thr Arg Ala Val Pro Thr Val Ser 1705 1710 1715
ATT GTT GAC AAG CTC TTT GCC GGA GGG TGG GCG GCT GTG GTT GGC CAC 5477 Ile Val Asp Lys Leu Phe Ala Gly Gly Trp Ala Ala Val Val Gly His 1720 1725 1730
TGT CAC AGC GTC ATA GCT GCG GCG GTG GCT GCC TAC GGG GCT TCC AGG 5525 Cys His Ser Val He Ala Ala Ala Val Ala Ala Tyr Gly Ala Ser Arg 1735 1740 1745 1750
AGT CCG CCG TTG GCA GCC GCG GCT TCC TAC CTG ATG GGA CTG GGC GTC 5573 Ser Pro Pro Leu Ala Ala Ala Ala Ser Tyr Leu Met Gly Leu Gly Val 1755 1760 1765
GGA GGC AAC GCT CAG ACG CGT TTG GCG TCT GCC CTC CTG TTG GGG GCC 5621 Gly Gly Asn Ala Gin Thr Arg Leu Ala Ser Ala Leu Leu Leu Gly Ala 1770 1775 1780
GCT GGC ACC GCC CTG GGC ACT CCC GTC GTG GGT TTA ACC ATG GCG GGG 5669 Ala Gly Thr Ala Leu Gly Thr Pro Val Val Gly Leu Thr Met Ala Gly 1785 1790 1795
GCG TTC ATG GGG GGT GCT AGC GTC TCT CCC TCC TTG GTC ACC ATC TTG 5717 Ala Phe Met Gly Gly Ala Ser Val Ser Pro Ser Leu Val Thr He Leu 1800 1805 1810
TTG GGG GCC GTG GGA GGC TGG GAG GGC GTC GTC AAC GCT GCT AGC CTT 5765 Leu Gly Ala Val Gly Gly Trp Glu Gly Val Val Asn Ala Ala Ser Leu 1815 1820 1825 1830
GTC TTT GAC TTC ATG GCG GGG AAA CTA TCG TCA GAA GAT CTG TGG TAC 5813 Val Phe Asp Phe Met Ala Gly Lys Leu Ser Ser Glu Asp Leu Trp Tyr 1835 1840 1845
GCC ATC CCA GTG CTC ACC AGC CCG GGG GCG GGC CTT GCG GGG ATC GCC 5861 Ala He Pro Val Leu Thr Ser Pro Gly Ala Gly Leu Ala Gly He Ala 1850 1855 1860
CTT GGG TTG GTG CTG TAC TCA GCT AAC AAC TCT GGT ACT ACC ACT TGG 5909 Leu Gly Leu Val Leu Tyr Ser Ala Asn Asn Ser Gly Thr Thr Thr Trp 1865 1870 1875
TTG AAC CGT CTG CTG ACT ACG TTA CCT AGG TCT TCT TGC ATC CCT GAC 5957 Leu Asn Arg Leu Leu Thr Thr Leu Pro Arg Ser Ser Cys Ile Pro Asp 1880 1885 1890
AGC TAT TTC CAA CAG GCC GAT TAC TGT GAC AAG GTC TCG GCC GTG CTT 6005 Ser Tyr Phe Gln Gln Ala Asp Tyr Cys Asp Lys Val Ser Ala Val Leu 1895 1900 1905 1910
CGC CGA CTG AGC CTC ACC CGC ACT GTG GTG GCC CTA GTC AAT AGG GAA 6053 Arg Arg Leu Ser Leu Thr Arg Thr Val Val Ala Leu Val Asn Arg Glu 1915 1920 1925
CCC AAG GTG GAC GAG GTA CAG GTG GGG TAC GTC TGG GAT CTC TGG GAG 6101 Pro Lys Val Asp Glu Val Gin Val Gly Tyr Val Trp Asp Leu Trp Glu 1930 1935 1940
TGG ATC ATG CGT CAA GTG CGC ATG GTC ATG GCC AGG CTC CGG GCT CTC 6149 Trp He Met Arg Gin Val Arg Met Val Met Ala Arg Leu Arg Ala Leu 1945 1950 1955
TGC CCC GTG GTG TCA CTG CCT TTG TGG CAC TGC GGG GAG GGG TGG TCC 6197 Cys Pro Val Val Ser Leu Pro Leu Trp His Cys Gly Glu Gly Trp Ser 1960 1965 1970
GGA GAG TGG TTG TTG GAC GGC CAT GTG GAG AGT CGC TGT CTT TGC GGG 6245 Gly Glu Trp Leu Leu Asp Gly His Val Glu Ser Arg Cys Leu Cys Gly 1975 1980 1985 1990
TGC GTG ATC ACC GGC GAT GTT TTC AAT GGG CAA CTC AAA GAG CCA GTT 6293 Cys Val He Thr Gly Asp Val Phe Asn Gly Gin Leu Lys Glu Pro Val 1995 2000 2005
TAC TCT ACA AAG TTG TGC CGG CAC TAT TGG ATG GGG ACC GTT CCT GTG 6341 Tyr Ser Thr Lys Leu Cys Arg His Tyr Trp Met Gly Thr Val Pro Val 2010 2015 2020
AAC ATG CTG GGT TAC GGC GAA ACA TCA CCC CTC TTG GCC TCT GAC ACC 6389 Asn Met Leu Gly Tyr Gly Glu Thr Ser Pro Leu Leu Ala Ser Asp Thr 2025 2030 2035
CCG AAG GTG GTG CCT TTT GGG ACG TCG GGC TGG GCT GAG GTG GTG GTG 6437 Pro Lys Val Val Pro Phe Gly Thr Ser Gly Trp Ala Glu Val Val Val 2040 2045 2050
ACC CCT ACC CAC GTG GTG ATC AGG AGA ACC TCT CCC TAC GAG TTG CTG 6485 Thr Pro Thr His Val Val Ile Arg Arg Thr Ser Pro Tyr Glu Leu Leu 2055 2060 2065 2070
CGC CAA CAA ATC CTA TCA GCT GCA GTT GCT GAG CCC TAT TAT GTC GAC 6533 Arg Gln Gln Ile Leu Ser Ala Ala Val Ala Glu Pro Tyr Tyr Val Asp 2075 2080 2085
GGC ATA CCG GTC TCA TGG GAC GCG GAC GCT CGT GCG CCT GCT ATG GTT 6581 Gly He Pro Val Ser Trp Asp Ala Asp Ala Arg Ala Pro Ala Met Val 2090 2095 2100
TAT GGC CCT GGG CAA AGT GTT ACC ATT GAC GGG GAG CGC TAC ACC CTG 6629 Tyr Gly Pro Gly Gin Ser Val Thr He Asp Gly Glu Arg Tyr Thr Leu 2105 2110 2115
CCG CAT CAA CTG CGG CTC AGG AAT GTA GCG CCC TCT GAG GTT TCA TCC 6677 Pro His Gin Leu Arg Leu Arg Asn Val Ala Pro Ser Glu Val Ser Ser 2120 2125 2130
GAG GTG TCC ATA GAC ATT GGG ACG GAG ACT GAA GAC TCA GAA CTG ACT 6725 Glu Val Ser He Asp He Gly Thr Glu Thr Glu Asp Ser Glu Leu Thr 2135 2140 2145 2150
GAG GCC GAC CTG CCG CCG GCA GCT GCA GCC CTC CAG GCT ATC GAG AAT 6773 Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala Leu Gin Ala He Glu Asn 2155 2160 2165
GCT GCG AGG ATT CTT GAG CCT CAT ATT GAT GTC ATC ATG GAG GAT TGC 6821 Ala Ala Arg He Leu Glu Pro His He Asp Val He Met Glu Asp Cys 2170 2175 2180
AGT ACA CCC TCT CTT TGT GGT AGT AGC CGA GAG ATG CCT GTG TGG GGA 6869 Ser Thr Pro Ser Leu Cys Gly Ser Ser Arg Glu Met Pro Val Trp Gly 2185 2190 2195
GAA GAC ATC CCC CGC ACT CCA TCG CCA GCA CTT ATC TCG GTT ACC GAG 6917 Glu Asp He Pro Arg Thr Pro Ser Pro Ala Leu He Ser Val Thr Glu 2200 2205 2210
AGC AGC TCA GAT GAG AAG ACC CCG TCG GTG TCC TCC TCG CAG GAG GAT 6965 Ser Ser Ser Asp Glu Lys Thr Pro Ser Val Ser Ser Ser Gin Glu Asp 2215 2220 2225 2230
ACC CCG TCC TCT GAC TCA TTC GAA GTC ATC CAA GAG TCT GAG ACA GCT 7013 Thr Pro Ser Ser Asp Ser Phe Glu Val Ile Gln Glu Ser Glu Thr Ala 2235 2240 2245
GAA GGA GAG GAA AGT GTC TTC AAC GTG GCT CTT TCC GTA CTA GAA GCC 7061 Glu Gly Glu Glu Ser Val Phe Asn Val Ala Leu Ser Val Leu Glu Ala 2250 2255 2260
TTG TTT CCA CAG AGT GAT GCC ACT AGA AAG CTT ACC GTC AGG ATG AAT 7109 Leu Phe Pro Gin Ser Asp Ala Thr Arg Lys Leu Thr Val Arg Met Asn 2265 2270 2275
TGC TGC GTT GAG AAG AGC GTC ACG CGC TTC TTT TCT TTG GGG CTG ACG 7157 Cys Cys Val Glu Lys Ser Val Thr Arg Phe Phe Ser Leu Gly Leu Thr 2280 2285 2290
GTG GCT GAT GTG GCC AGT CTG TGT GAG ATG GAG ATC CAG AAC CAT ACA 7205 Val Ala Asp Val Ala Ser Leu Cys Glu Met Glu He Gin Asn His Thr 2295 2300 2305 2310
GCC TAT TGT GAC AAG GTG CGC ACT CCG CTC GAA TTG CAA GTT GGG TGC 7253 Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu Glu Leu Gin Val Gly Cys 2315 2320 2325
TTG GTG GGC AAT GAA CTT ACC TTT GAA TGT GAT AAG TGT GAG GCT AGG 7301 Leu Val Gly Asn Glu Leu Thr Phe Glu Cys Asp Lys Cys Glu Ala Arg 2330 2335 2340
CAA GAG ACT TTG GCC TCC TTC TCC TAT ATT TGG TCT GGG GTG CCA TTG 7349 Gin Glu Thr Leu Ala Ser Phe Ser Tyr He Trp Ser Gly Val Pro Leu 2345 2350 2355
ACT AGG GCC ACA CCG GCT AAA CCA CCT GTG GTG AGG CCG GTG GGG TCC 7397 Thr Arg Ala Thr Pro Ala Lys Pro Pro Val Val Arg Pro Val Gly Ser 2360 2365 2370
TTG TTG GTG GCT GAC ACC ACG AAA GTG TAT GTC ACA AAC CCG GAC AAT 7445 Leu Leu Val Ala Asp Thr Thr Lys Val Tyr Val Thr Asn Pro Asp Asn 2375 2380 2385 2390
GTT GGG AGA AGA GTG GAC AAG GTG ACC TTC TGG CGC GCC CCC AGG GTC 7493 Val Gly Arg Arg Val Asp Lys Val Thr Phe Trp Arg Ala Pro Arg Val 2395 2400 2405
CAT GAC AAA TAT CTC GTG GAC TCC ATC GAG CGT GCC AGG AGG GCG GCT 7541 His Asp Lys Tyr Leu Val Asp Ser Ile Glu Arg Ala Arg Arg Ala Ala 2410 2415 2420
CAA GCC TGC CAA AGC ATG GGT TAC ACT TAT GAG GAA GCA ATA AGG ACT 7589 Gln Ala Cys Gln Ser Met Gly Tyr Thr Tyr Glu Glu Ala Ile Arg Thr 2425 2430 2435
GTT AGG CCA CAT GCT GCC ATG GGC TGG GGA TCT AAG GTG TCG GTC AAG 7637 Val Arg Pro His Ala Ala Met Gly Trp Gly Ser Lys Val Ser Val Lys 2440 2445 2450
GAC TTG GCC ACC CCT GCG GGG AAG ATG GCC GTC CAC GAC CGA CTT CAG 7685 Asp Leu Ala Thr Pro Ala Gly Lys Met Ala Val His Asp Arg Leu Gin 2455 2460 2465 2470
GAG ATA CTT GAG GGG ACT CCG GTC CCT TTT ACT CTT ACT GTG AAA AAG 7733 Glu He Leu Glu Gly Thr Pro Val Pro Phe Thr Leu Thr Val Lys Lys 2475 2480 2485
GAG GTG TTC TTC AAA GAC CGT AAG GAG GAG AAG GCC CCC CGC CTC ATT 7781 Glu Val Phe Phe Lys Asp Arg Lys Glu Glu Lys Ala Pro Arg Leu He 2490 2495 2500
GTG TTC CCC CCC CTG GAC TTC CGG ATA GCT GAG AAG CTT ATC CTG GGA 7829 Val Phe Pro Pro Leu Asp Phe Arg He Ala Glu Lys Leu He Leu Gly 2505 2510 2515
GAC CCG GGG CGG GTG GCC AAG GCG GTG TTG GGG GGG GCT TAC GCC TTC 7877 Asp Pro Gly Arg Val Ala Lys Ala Val Leu Gly Gly Ala Tyr Ala Phe 2520 2525 2530
CAG TAC ACC CCA AAT CAG CGA GTT AAG GAG ATG CTC AAA CTG TGG GAG 7925 Gin Tyr Thr Pro Asn Gin Arg Val Lys Glu Met Leu Lys Leu Trp Glu 2535 2540 2545 2550
TCA AAG AAA ACA CCT TGC GCC ATC TGT GTG GAC GCC ACT TGC TTC GAC 7973 Ser Lys Lys Thr Pro Cys Ala He Cys Val Asp Ala Thr Cys Phe Asp 2555 2560 2565
AGT AGC ATT ACT GAA GAG GAC GTG GCG CTG GAG ACA GAG CTG TAC GCT 8021 Ser Ser He Thr Glu Glu Asp Val Ala Leu Glu Thr Glu Leu Tyr Ala 2570 2575 2580
CTG GCC TCT GAC CAT CCA GAG TGG GTG CGA GCT TTG GGG AAG TAC TAT 8069 Leu Ala Ser Asp His Pro Glu Trp Val Arg Ala Leu Gly Lys Tyr Tyr 2585 2590 2595
GCC TCA GGA ACC ATG GTC ACC CCT GAG GGG GTT CCC GTA GGT GAG AGG 8117 Ala Ser Gly Thr Met Val Thr Pro Glu Gly Val Pro Val Gly Glu Arg 2600 2605 2610
TAT TGT AGA TCC TCA GGC GTT TTG ACT ACC AGC GCG AGT AAC TGC CTG 8165 Tyr Cys Arg Ser Ser Gly Val Leu Thr Thr Ser Ala Ser Asn Cys Leu 2615 2620 2625 2630
ACC TGC TAC ATC AAG GTG AAA GCC GCT TGT GAG AGA GTG GGG CTG AAA 8213 Thr Cys Tyr He Lys Val Lys Ala Ala Cys Glu Arg Val Gly Leu Lys 2635 2640 2645
AAT GTC TCG CTT CTC ATA GCC GGC GAT GAC TGT TTG ATC ATA TGC GAA 8261 Asn Val Ser Leu Leu He Ala Gly Asp Asp Cys Leu He He Cys Glu 2650 2655 2660
CGG CCA GTG TGC GAC CCT TGT GAC GCC TTG GGC AGA GCC CTG GCG AGC 8309 Arg Pro Val Cys Asp Pro Cys Asp Ala Leu Gly Arg Ala Leu Ala Ser 2665 2670 2675
TAT GGG TAT GCT TGC GAG CCT TCG TAT CAT GCA TCA CTG GAC ACG GCC 8357 Tyr Gly Tyr Ala Cys Glu Pro Ser Tyr His Ala Ser Leu Asp Thr Ala 2680 2685 2690
CCC TTC TGC TCC ACT TGG CTC GCT GAG TGC AAC GCA GAT GGG AAA CGC 8405 Pro Phe Cys Ser Thr Trp Leu Ala Glu Cys Asn Ala Asp Gly Lys Arg 2695 2700 2705 2710
CAT TTC TTC CTG ACC ACG GAC TTT CGG AGG CCG CTT GCT CGC ATG TCG 8453 His Phe Phe Leu Thr Thr Asp Phe Arg Arg Pro Leu Ala Arg Met Ser 2715 2720 2725
AGC GAG TAT AGT GAC CCA ATG GCT TCG GCC ATA GGT TAC ATC CTC CTG 8501 Ser Glu Tyr Ser Asp Pro Met Ala Ser Ala He Gly Tyr He Leu Leu 2730 2735 2740
TAT CCC TGG CAT CCC ATC ACA CGG TGG GTC ATC ATC CCT CAT GTG CTA 8549 Tyr Pro Trp His Pro He Thr Arg Trp Val He He Pro His Val Leu 2745 2750 2755
ACG TGC GCA TTC AGG GGT GGT GGT ACA CCG TCT GAT CCG GTT TGG TGT 8597 Thr Cys Ala Phe Arg Gly Gly Gly Thr Pro Ser Asp Pro Val Trp Cys 2760 2765 2770
CAG GTG CAT GGT AAC TAC TAC AAG TTT CCA CTG GAC AAA CTG CCT AAC 8645 Gln Val His Gly Asn Tyr Tyr Lys Phe Pro Leu Asp Lys Leu Pro Asn 2775 2780 2785 2790
ATC ATC GTG GCC CTC CAC GGA CCA GCA GCG TTG AGG GTT ACC GCA GAC 8693 He He Val Ala Leu His Gly Pro Ala Ala Leu Arg Val Thr Ala Asp 2795 2800 2805
ACA ACT AAG ACA AAA ATG GAA GCT GGG AAG GTG CTG AGT GAC CTC AAG 8741 Thr Thr Lys Thr Lys Met Glu Ala Gly Lys Val Leu Ser Asp Leu Lys 2810 2815 2820
CTC CCT GGC CTA GCG GTC CAC CGA AAG AAG GCC GGA GCA CTG CGA ACA 8789 Leu Pro Gly Leu Ala Val His Arg Lys Lys Ala Gly Ala Leu Arg Thr 2825 2830 2835
CGC ATG CTT CGG TCG CGC GGT TGG GCC GAG TTG GCG AGG GGC CTG TTG 8837 Arg Met Leu Arg Ser Arg Gly Trp Ala Glu Leu Ala Arg Gly Leu Leu 2840 2845 2850
TGG CAT CCA GGC CTC CGG CTC CCT CCC CCT GAG ATT GCT GGT ATC CCG 8885 Trp His Pro Gly Leu Arg Leu Pro Pro Pro Glu He Ala Gly He Pro 2855 2860 2865 2870
GGG GGT TTC CCC CTC TCC CCC CCC TAC ATG GGG GTG GTG CAT CAA TTG 8933 Gly Gly Phe Pro Leu Ser Pro Pro Tyr Met Gly Val Val His Gin Leu 2875 2880 2885
GAT TTT ACA AGC CAG AGG AGT CGC TGG CGG TGG CTG GGG TTC TTA GCC 8981 Asp Phe Thr Ser Gin Arg Ser Arg Trp Arg Trp Leu Gly Phe Leu Ala 2890 2895 2900
CTG CTC ATC GTA GCC CTC TTC GGG TGAACTAAAT TCATCTGTTG CGGCAAGGTC 9035 Leu Leu He Val Ala Leu Phe Gly 2905 2910
CAGTGACTGA TCATCACTGG AGGAGGTTCC CGCCCTCCCC GCCCCAGGGG TCTCCCCGCT 9095
GGGTAAAA 9103
(2) INFORMATION FOR SEQ ID NO:157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2910 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:157:
Met Ser Leu Leu Thr Asn Arg Leu Ser Arg Arg Val Asp Lys Asp Gln 1 5 10 15
Trp Gly Pro Gly Phe Met Gly Lys Asp Pro Lys Pro Cys Pro Ser Arg 20 25 30
Arg Thr Gly Lys Cys Met Gly Pro Pro Ser Ser Ala Ala Ala Cys Ser 35 40 45
Arg Gly Ser Pro Arg He Leu Arg Val Arg Ala Gly Gly He Ser Leu 50 55 60
Pro Tyr Thr He Met Glu Ala Leu Leu Phe Leu Leu Gly Val Glu Ala 65 70 75 80
Gly Ala He Leu Ala Pro Ala Thr His Ala Cys Arg Ala Asn Gly Gin 85 90 95
Tyr Phe Leu Thr Asn Cys Cys Ala Pro Glu Asp He Gly Phe Cys Leu 100 105 110
Glu Gly Gly Cys Leu Val Ala Leu Gly Cys Thr Val Cys Thr Asp Arg 115 120 125
Cys Trp Pro Leu Tyr Gin Ala Gly Leu Ala Val Arg Pro Gly Lys Ser 130 135 140
Ala Ala Gin Leu Val Gly Gin Leu Gly Gly Leu Tyr Gly Pro Leu Ser 145 150 155 160
Val Ser Ala Tyr Val Ala Gly He Leu Gly Leu Gly Glu Val Tyr Ser 165 170 175
Gly Val Leu Thr Val Gly Val Ala Leu Thr Arg Arg Val Tyr Pro Met 180 185 190
Pro Asn Leu Thr Cys Ala Val Glu Cys Glu Leu Lys Trp Glu Ser Glu 195 200 205
Phe Trp Arg Trp Thr Glu Gin Leu Ala Ser Asn Tyr Trp He Leu Glu 210 215 220
Tyr Leu Trp Lys Val Pro Phe Asp Phe Trp Arg Gly Val Leu Ser Leu 225 230 235 240
Thr Pro Leu Leu Val Cys Val Ala Ala Leu Leu Leu Leu Glu Gin Arg 245 250 255
He Val Met Val Phe Leu Leu Val Thr Met Ala Gly Met Ser Gin Gly 260 265 270
Ala Pro Ala Ser Val Leu Gly Ser Arg Pro Phe Asp Tyr Gly Leu Thr 275 280 285
Trp Gin Ser Cys Ser Cys Arg Ala Asn Gly Ser Arg Tyr Thr Thr Gly 290 295 300
Glu Lys Val Trp Asp Arg Gly Asn Val Thr Leu Leu Cys Asp Cys Pro 305 310 315 320
Asn Gly Pro Trp Val Trp Leu Pro Ala Phe Cys Gin Ala He Gly Trp 325 330 335
Gly Asp Pro He Thr His Trp Ser His Gly Gin Asn Arg Trp Pro Leu 340 345 350
Ser Cys Pro Gin Tyr Val Tyr Gly Ser Val Ser Val Thr Cys Val Trp 355 360 365
Gly Ser Val Ser Trp Phe Ala Ser Thr Gly Gly Arg Asp Ser Lys He 370 375 380
Asp Val Trp Ser Leu Val Pro Val Gly Ser Ala Ser Cys Thr He Ala 385 390 395 400
Ala Leu Gly Ser Ser Asp Arg Asp Thr Val Val Glu Leu Ser Glu Trp 405 410 415
Gly Val Pro Cys Ala Thr Cys Ile Leu Asp Arg Arg Pro Ala Ser Cys 420 425 430
Gly Thr Cys Val Arg Asp Cys Trp Pro Glu Thr Gly Ser Val Arg Phe 435 440 445
Pro Phe His Arg Cys Gly Ala Gly Pro Lys Leu Thr Lys Asp Leu Glu 450 455 460
Ala Val Pro Phe Val Asn Arg Thr Thr Pro Phe Thr He Arg Gly Pro 465 470 475 480
Leu Gly Asn Gin Gly Arg Gly Asn Pro Val Arg Ser Pro Leu Gly Phe 485 490 495
Gly Ser Tyr Ala Met Thr Lys He Arg Asp Ser Leu His Leu Val Lys 500 505 510
Cys Pro Thr Pro Ala He Glu Pro Pro Thr Gly Thr Phe Gly Phe Phe 515 520 525
Pro Gly Val Pro Pro Leu Asn Asn Cys Leu Leu Leu Gly Thr Glu Val 530 535 540
Ser Glu Ala Leu Gly Gly Ala Gly Leu Thr Gly Gly Phe Tyr Glu Pro 545 550 555 560
Leu Val Arg Arg Arg Ser Glu Leu Met Gly Arg Arg Asn Pro Val Cys 565 570 575
Pro Gly Phe Ala Trp Leu Ser Ser Gly Arg Pro Asp Gly Phe He His 580 585 590
Val Gin Gly His Leu Gin Glu Val Asp Ala Gly Asn Phe He Pro Pro 595 600 605
Pro Arg Trp Leu Leu Leu Asp Phe Val Phe Val Leu Leu Tyr Leu Met 610 615 620
Lys Leu Ala Glu Ala Arg Leu Val Pro Leu He Leu Leu Leu Leu Trp 625 630 635 640
Trp Trp Val Asn Gin Leu Ala Val Leu Gly Leu Pro Ala Val Asp Ala 645 650 655
Ala Val Ala Gly Glu Val Phe Ala Gly Pro Ala Leu Ser Trp Cys Leu 660 665 670
Gly Leu Pro Thr Val Ser Met Ile Leu Gly Leu Ala Asn Leu Val Leu 675 680 685
Tyr Phe Arg Trp Met Gly Pro Gin Arg Leu Met Phe Leu Val Leu Trp 690 695 700
Lys Leu Ala Arg Gly Ala Phe Pro Leu Ala Leu Leu Met Gly He Ser 705 710 715 720
Ala Thr Arg Gly Arg Thr Ser Val Leu Gly Ala Glu Phe Cys Phe Asp 725 730 735
Val Thr Phe Glu Val Asp Thr Ser Val Leu Gly Trp Val Val Ala Ser 740 745 750
Val Val Ala Trp Ala He Ala Leu Leu Ser Ser Met Ser Ala Gly Gly 755 760 765
Trp Arg His Lys Ala Val He Tyr Arg Thr Trp Cys Lys Gly Tyr Gin 770 775 780
Ala He Arg Gin Arg Val Val Arg Ser Pro Leu Gly Glu Gly Arg Pro 785 790 795 800
Thr Lys Pro Leu Thr Phe Ala Trp Cys Leu Ala Ser Tyr He Trp Pro 805 810 815
Asp Ala Val Met Met Val Val Val Ala Leu Val Leu Leu Phe Gly Leu 820 825 830
Phe Asp Ala Leu Asp Trp Ala Leu Glu Glu Leu Leu Val Ser Arg Pro 835 840 845
Ser Leu Arg Arg Leu Ala Arg Val Val Glu Cys Cys Val Met Ala Gly 850 855 860
Glu Lys Ala Thr Thr Val Arg Leu Val Ser Lys Met Cys Ala Arg Gly 865 870 875 880
Ala Tyr Leu Phe Asp His Met Gly Ser Phe Ser Arg Ala Val Lys Glu 885 890 895
Arg Leu Leu Glu Trp Asp Ala Ala Leu Glu Pro Leu Ser Phe Thr Arg 900 905 910
Thr Asp Cys Arg Ile Ile Arg Asp Ala Ala Arg Thr Leu Ala Cys Gly 915 920 925
Gin Cys Val Met Gly Leu Pro Val Val Ala Arg Arg Gly Asp Glu Val 930 935 940
Leu He Gly Val Phe Gin Asp Val Asn His Leu Pro Pro Gly Phe Val 945 950 955 960
Pro Thr Ala Pro Val Val He Arg Arg Cys Gly Lys Gly Phe Leu Gly 965 970 975
Val Thr Lys Ala Ala Leu Thr Gly Arg Asp Pro Asp Leu His Pro Gly 980 985 990
Asn Val Met Val Leu Gly Thr Ala Thr Ser Arg Ser Met Gly Thr Cys 995 1000 1005
Leu Asn Gly Leu Leu Phe Thr Thr Phe His Gly Ala Ser Ser Arg Thr 1010 1015 1020
He Ala Thr Pro Val Gly Ala Leu Asn Pro Arg Trp Trp Ser Ala Ser 1025 1030 1035 1040
Asp Asp Val Thr Val Tyr Pro Leu Pro Asp Gly Ala Thr Ser Leu Thr 1045 1050 1055
Pro Cys Thr Cys Gin Ala Glu Ser Cys Trp Val He Arg Ser Asp Gly 1060 1065 1070
Ala Leu Cys His Gly Leu Ser Lys Gly Asp Lys Val Glu Leu Asp Val 1075 1080 1085
Ala Met Glu Val Ser Asp Phe Arg Gly Ser Ser Gly Ser Pro Val Leu 1090 1095 1100
Cys Asp Glu Gly His Ala Val Gly Met Leu Val Ser Val Leu His Ser 1105 1110 1115 1120
Gly Gly Arg Val Thr Ala Ala Arg Phe Thr Arg Pro Trp Thr Gin Val 1125 1130 1135
Pro Thr Asp Ala Lys Thr Thr Thr Glu Pro Pro Pro Val Pro Ala Lys 1140 1145 1150
Gly Val Phe Lys Glu Ala Pro Leu Phe Met Pro Thr Gly Ala Gly Lys 1155 1160 1165
Ser Thr Arg Val Pro Leu Glu Tyr Gly Asn Met Gly His Lys Val Leu 1170 1175 1180
He Leu Asn Pro Ser Val Ala Thr Val Arg Ala Met Gly Pro Tyr Met 1185 1190 1195 1200
Glu Arg Leu Ala Gly Lys His Pro Ser He Tyr Cys Gly His Asp Thr 1205 1210 1215
Thr Ala Phe Thr Arg He Thr Asp Ser Pro Leu Thr Tyr Ser Thr Tyr 1220 1225 1230
Gly Arg Phe Leu Ala Asn Pro Arg Gin Met Leu Arg Gly Val Ser Val 1235 1240 1245
Val He Cys Asp Glu Cys His Ser His Asp Ser Thr Val Leu Leu Gly 1250 1255 1260
He Gly Arg Val Arg Glu Leu Ala Arg Glu Cys Gly Val Gin Leu Val 1265 1270 1275 1280
Leu Tyr Ala Thr Ala Thr Pro Pro Gly Ser Pro Met Thr Gin His Pro 1285 1290 1295
Ser He He Glu Thr Lys Leu Asp Val Gly Glu He Pro Phe Tyr Gly 1300 1305 1310
His Gly He Pro Leu Glu Arg Met Arg Thr Gly Arg His Leu Val Phe 1315 1320 1325
Cys Tyr Ser Lys Ala Glu Cys Glu Arg Leu Ala Gly Gin Phe Ser Ala 1330 1335 1340
Arg Gly Val Asn Ala He Ala Tyr Tyr Arg Gly Lys Asp Ser Ser He 1345 1350 1355 1360
He Lys Asp Gly Asp Leu Val Val Cys Ala Thr Asp Ala Leu Ser Thr 1365 1370 1375
Gly Tyr Thr Gly Asn Phe Asp Ser Val Thr Asp Cys Gly Leu Val Val 1380 1385 1390
Glu Glu Val Val Glu Val Thr Leu Asp Pro Thr Ile Thr Ile Ser Leu 1395 1400 1405
Arg Thr Val Pro Ala Ser Ala Glu Leu Ser Met Gin Arg Arg Gly Arg 1410 1415 1420
Thr Gly Arg Gly Arg Ser Gly Arg Tyr Tyr Tyr Ala Gly Val Gly Lys 1425 1430 1435 1440
Ala Pro Ala Gly Val Val Arg Ser Gly Pro Val Trp Ser Ala Val Glu 1445 1450 1455
Ala Gly Val Thr Trp Tyr Gly Met Glu Pro Asp Leu Thr Ala Asn Leu 1460 1465 1470
Leu Arg Leu Tyr Asp Asp Cys Pro Tyr Thr Ala Ala Val Ala Ala Asp 1475 1480 1485
He Gly Glu Ala Ala Val Phe Phe Ser Gly Leu Ala Pro Leu Arg Met 1490 1495 1500
His Pro Asp Val Ser Trp Ala Lys Val Arg Gly Val Asn Trp Pro Leu 1505 1510 1515 1520
Leu Val Gly Val Gin Arg Thr Met Cys Arg Glu Thr Leu Ser Pro Gly 1525 1530 1535
Pro Ser Asp Asp Pro Gin Trp Ala Gly Leu Lys Gly Pro Asn Pro Val 1540 1545 1550
Pro Leu Leu Leu Arg Trp Gly Asn Asp Leu Pro Ser Lys Val Ala Gly 1555 1560 1565
His His He Val Asp Asp Leu Val Arg Arg Leu Gly Val Ala Glu Gly 1570 1575 1580
Tyr Val Arg Cys Asp Ala Gly Pro He Leu Met Val Gly Leu Ala He 1585 1590 1595 1600
Ala Gly Gly Met He Tyr Ala Ser Tyr Thr Gly Ser Leu Val Val Val 1605 1610 1615
Thr Asp Trp Asp Val Lys Gly Gly Gly Ser Pro Leu Tyr Arg His Gly 1620 1625 1630
Asp Gln Ala Thr Pro Gln Pro Val Val Gln Val Pro Pro Val Asp His 1635 1640 1645
Arg Pro Gly Gly Glu Ser Ala Pro Ser Asp Ala Lys Thr Val Thr Asp 1650 1655 1660
Ala Val Ala Ala He Gin Val Asp Cys Asp Trp Ser Val Met Thr Leu 1665 1670 1675 1680
Ser He Gly Glu Val Leu Ser Leu Ala Gin Ala Lys Thr Ala Glu Ala 1685 1690 1695
Tyr Thr Ala Thr Ala Lys Trp Leu Ala Gly Cys Tyr Thr Gly Thr Arg 1700 1705 1710
Ala Val Pro Thr Val Ser He Val Asp Lys Leu Phe Ala Gly Gly Trp 1715 1720 1725
Ala Ala Val Val Gly His Cys His Ser Val He Ala Ala Ala Val Ala 1730 1735 1740
Ala Tyr Gly Ala Ser Arg Ser Pro Pro Leu Ala Ala Ala Ala Ser Tyr 1745 1750 1755 1760
Leu Met Gly Leu Gly Val Gly Gly Asn Ala Gin Thr Arg Leu Ala Ser 1765 1770 1775
Ala Leu Leu Leu Gly Ala Ala Gly Thr Ala Leu Gly Thr Pro Val Val 1780 1785 1790
Gly Leu Thr Met Ala Gly Ala Phe Met Gly Gly Ala Ser Val Ser Pro 1795 1800 1805
Ser Leu Val Thr He Leu Leu Gly Ala Val Gly Gly Trp Glu Gly Val 1810 1815 1820
Val Asn Ala Ala Ser Leu Val Phe Asp Phe Met Ala Gly Lys Leu Ser 1825 1830 1835 1840
Ser Glu Asp Leu Trp Tyr Ala He Pro Val Leu Thr Ser Pro Gly Ala 1845 1850 1855
Gly Leu Ala Gly Ile Ala Leu Gly Leu Val Leu Tyr Ser Ala Asn Asn 1860 1865 1870
Ser Gly Thr Thr Thr Trp Leu Asn Arg Leu Leu Thr Thr Leu Pro Arg 1875 1880 1885
Ser Ser Cys He Pro Asp Ser Tyr Phe Gin Gin Ala Asp Tyr Cys Asp 1890 1895 1900
Lys Val Ser Ala Val Leu Arg Arg Leu Ser Leu Thr Arg Thr Val Val 1905 1910 1915 1920
Ala Leu Val Asn Arg Glu Pro Lys Val Asp Glu Val Gin Val Gly Tyr 1925 1930 1935
Val Trp Asp Leu Trp Glu Trp He Met Arg Gin Val Arg Met Val Met 1940 1945 1950
Ala Arg Leu Arg Ala Leu Cys Pro Val Val Ser Leu Pro Leu Trp His 1955 1960 1965
Cys Gly Glu Gly Trp Ser Gly Glu Trp Leu Leu Asp Gly His Val Glu 1970 1975 1980
Ser Arg Cys Leu Cys Gly Cys Val He Thr Gly Asp Val Phe Asn Gly 1985 1990 1995 2000
Gin Leu Lys Glu Pro Val Tyr Ser Thr Lys Leu Cys Arg His Tyr Trp 2005 2010 2015
Met Gly Thr Val Pro Val Asn Met Leu Gly Tyr Gly Glu Thr Ser Pro 2020 2025 2030
Leu Leu Ala Ser Asp Thr Pro Lys Val Val Pro Phe Gly Thr Ser Gly 2035 2040 2045
Trp Ala Glu Val Val Val Thr Pro Thr His Val Val He Arg Arg Thr 2050 2055 2060
Ser Pro Tyr Glu Leu Leu Arg Gin Gin He Leu Ser Ala Ala Val Ala 2065 2070 2075 2080
Glu Pro Tyr Tyr Val Asp Gly He Pro Val Ser Trp Asp Ala Asp Ala 2085 2090 2095
Arg Ala Pro Ala Met Val Tyr Gly Pro Gly Gln Ser Val Thr Ile Asp 2100 2105 2110
Gly Glu Arg Tyr Thr Leu Pro His Gln Leu Arg Leu Arg Asn Val Ala 2115 2120 2125
Pro Ser Glu Val Ser Ser Glu Val Ser He Asp He Gly Thr Glu Thr 2130 2135 2140
Glu Asp Ser Glu Leu Thr Glu Ala Asp Leu Pro Pro Ala Ala Ala Ala 2145 2150 2155 2160
Leu Gin Ala He Glu Asn Ala Ala Arg He Leu Glu Pro His He Asp 2165 2170 2175
Val He Met Glu Asp Cys Ser Thr Pro Ser Leu Cys Gly Ser Ser Arg 2180 2185 2190
Glu Met Pro Val Trp Gly Glu Asp He Pro Arg Thr Pro Ser Pro Ala 2195 2200 2205
Leu He Ser Val Thr Glu Ser Ser Ser Asp Glu Lys Thr Pro Ser Val 2210 2215 2220
Ser Ser Ser Gin Glu Asp Thr Pro Ser Ser Asp Ser Phe Glu Val He 2225 2230 2235 2240
Gin Glu Ser Glu Thr Ala Glu Gly Glu Glu Ser Val Phe Asn Val Ala 2245 2250 2255
Leu Ser Val Leu Glu Ala Leu Phe Pro Gln Ser Asp Ala Thr Arg Lys 2260 2265 2270
Leu Thr Val Arg Met Asn Cys Cys Val Glu Lys Ser Val Thr Arg Phe 2275 2280 2285
Phe Ser Leu Gly Leu Thr Val Ala Asp Val Ala Ser Leu Cys Glu Met 2290 2295 2300
Glu He Gin Asn His Thr Ala Tyr Cys Asp Lys Val Arg Thr Pro Leu 2305 2310 2315 2320
Glu Leu Gin Val Gly Cys Leu Val Gly Asn Glu Leu Thr Phe Glu Cys 2325 2330 2335
Asp Lys Cys Glu Ala Arg Gln Glu Thr Leu Ala Ser Phe Ser Tyr Ile 2340 2345 2350
Trp Ser Gly Val Pro Leu Thr Arg Ala Thr Pro Ala Lys Pro Pro Val 2355 2360 2365
Val Arg Pro Val Gly Ser Leu Leu Val Ala Asp Thr Thr Lys Val Tyr 2370 2375 2380
Val Thr Asn Pro Asp Asn Val Gly Arg Arg Val Asp Lys Val Thr Phe 2385 2390 2395 2400
Trp Arg Ala Pro Arg Val His Asp Lys Tyr Leu Val Asp Ser He Glu 2405 2410 2415
Arg Ala Arg Arg Ala Ala Gin Ala Cys Gin Ser Met Gly Tyr Thr Tyr 2420 2425 2430
Glu Glu Ala He Arg Thr Val Arg Pro His Ala Ala Met Gly Trp Gly 2435 2440 2445
Ser Lys Val Ser Val Lys Asp Leu Ala Thr Pro Ala Gly Lys Met Ala 2450 2455 2460
Val His Asp Arg Leu Gin Glu He Leu Glu Gly Thr Pro Val Pro Phe 2465 2470 2475 2480
Thr Leu Thr Val Lys Lys Glu Val Phe Phe Lys Asp Arg Lys Glu Glu 2485 2490 2495
Lys Ala Pro Arg Leu He Val Phe Pro Pro Leu Asp Phe Arg He Ala 2500 2505 2510
Glu Lys Leu He Leu Gly Asp Pro Gly Arg Val Ala Lys Ala Val Leu 2515 2520 2525
Gly Gly Ala Tyr Ala Phe Gin Tyr Thr Pro Asn Gin Arg Val Lys Glu 2530 2535 2540
Met Leu Lys Leu Trp Glu Ser Lys Lys Thr Pro Cys Ala He Cys Val 2545 2550 2555 2560
Asp Ala Thr Cys Phe Asp Ser Ser Ile Thr Glu Glu Asp Val Ala Leu 2565 2570 2575
Glu Thr Glu Leu Tyr Ala Leu Ala Ser Asp His Pro Glu Trp Val Arg 2580 2585 2590
Ala Leu Gly Lys Tyr Tyr Ala Ser Gly Thr Met Val Thr Pro Glu Gly 2595 2600 2605
Val Pro Val Gly Glu Arg Tyr Cys Arg Ser Ser Gly Val Leu Thr Thr 2610 2615 2620
Ser Ala Ser Asn Cys Leu Thr Cys Tyr He Lys Val Lys Ala Ala Cys 2625 2630 2635 2640
Glu Arg Val Gly Leu Lys Asn Val Ser Leu Leu He Ala Gly Asp Asp 2645 2650 2655
Cys Leu He He Cys Glu Arg Pro Val Cys Asp Pro Cys Asp Ala Leu 2660 2665 2670
Gly Arg Ala Leu Ala Ser Tyr Gly Tyr Ala Cys Glu Pro Ser Tyr His 2675 2680 2685
Ala Ser Leu Asp Thr Ala Pro Phe Cys Ser Thr Trp Leu Ala Glu Cys 2690 2695 2700
Asn Ala Asp Gly Lys Arg His Phe Phe Leu Thr Thr Asp Phe Arg Arg 2705 2710 2715 2720
Pro Leu Ala Arg Met Ser Ser Glu Tyr Ser Asp Pro Met Ala Ser Ala 2725 2730 2735
He Gly Tyr He Leu Leu Tyr Pro Trp His Pro He Thr Arg Trp Val 2740 2745 2750
He He Pro His Val Leu Thr Cys Ala Phe Arg Gly Gly Gly Thr Pro 2755 2760 2765
Ser Asp Pro Val Trp Cys Gin Val His Gly Asn Tyr Tyr Lys Phe Pro 2770 2775 2780
Leu Asp Lys Leu Pro Asn He He Val Ala Leu His Gly Pro Ala Ala 2785 2790 2795 2800
Leu Arg Val Thr Ala Asp Thr Thr Lys Thr Lys Met Glu Ala Gly Lys 2805 2810 2815
Val Leu Ser Asp Leu Lys Leu Pro Gly Leu Ala Val His Arg Lys Lys 2820 2825 2830
Ala Gly Ala Leu Arg Thr Arg Met Leu Arg Ser Arg Gly Trp Ala Glu 2835 2840 2845
Leu Ala Arg Gly Leu Leu Trp His Pro Gly Leu Arg Leu Pro Pro Pro 2850 2855 2860
Glu He Ala Gly He Pro Gly Gly Phe Pro Leu Ser Pro Pro Tyr Met 2865 2870 2875 2880
Gly Val Val His Gin Leu Asp Phe Thr Ser Gin Arg Ser Arg Trp Arg 2885 2890 2895
Trp Leu Gly Phe Leu Ala Leu Leu He Val Ala Leu Phe Gly 2900 2905 2910

Claims

IT IS CLAIMED:
1. A purified polypeptide antigen encoded by the reverse-frame of a virus having an RNA genome, where said polypeptide antigen is specifically immunoreactive with serum infected with said RNA virus.
2. A polypeptide antigen of claim 1, where said virus is a single, positive strand RNA virus.
3. A polypeptide antigen of claim 2, where said virus is Hepatitis G Virus (HGV) or Hepatitis C Virus (HCV).
4. A polypeptide antigen of claim 3, where said virus is HGV and said polypeptide antigen or a polypeptide antigen containing fragment is encoded by the sequence presented as SEQ ID NO:19 or SEQ ID NO:28.
5. A polypeptide antigen of claim 3, where said virus is HCV and said polypeptide antigen or a polypeptide antigen containing fragment is derived from a sequence selected from the group consisting of SEQ ID NO:141, SEQ ID NO:142, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:145 and SEQ ID NO:146.
6. A method of detecting serum infected with a virus having an RNA genome, comprising reacting serum with a substantially isolated polypeptide antigen of claim 1, and examining the polypeptide antigen for the presence of bound antibody.
7. A method of claim 6, wherein the polypeptide antigen is attached to a solid support, said reacting includes reacting the serum with the support, and subsequently reacting the support with a reporter-labelled anti-human antibody, and said examining includes detecting the presence of reporter-labelled antibody on the solid support.
8. A monoclonal antibody specifically immunoreactive with a polypeptide antigen of claim 1.
9. A substantially isolated preparation of polyclonal antibodies specifically immunoreactive with a polypeptide antigen of claim 1.
10. A preparation of polyclonal antibodies of claim 9, where said polyclonal antibodies are prepared by affinity.
11. A method of identifying a polypeptide antigen that is specifically immunoreactive with antibodies against a selected virus having an RNA genome, comprising determining a first polynucleotide sequence corresponding to coding sequences for identifiable viral proteins for the selected virus, generating a second polynucleotide sequence complementary to the first polynucleotide encoding said identifiable viral proteins, examining said second polynucleotide for the presence of an open reading frame (ORF), and identifying a polypeptide antigen encoded by said ORF that is specifically immunoreactive with antibodies against said virus.
12. A method of claim 11, where said first polynucleotide is the genomic strand of a single, positive strand RNA virus that encodes a polyprotein.
13. A method of claim 11, where said identifying includes producing said polypeptide antigen and screening said polypeptide antigen against sera infected with said virus.
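The computational part of the method of claims 11 to 13 (take the coding strand, generate its complement, and examine the complement for open reading frames) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the applicants: the function names, the `min_codons` threshold, and the choice to treat any stop-free stretch as a candidate ORF (without requiring an ATG start) are all hypothetical, and the standard genetic code is assumed. Immunoreactivity of any encoded polypeptide would still have to be established by screening against sera, as claim 13 recites.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA (or RNA, U read as T) sequence."""
    seq = seq.upper().replace("U", "T")
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons in frame 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def reverse_frame_orfs(genome, min_codons=50):
    """List stop-free stretches on the strand complementary to `genome`.

    Hypothetical helper: returns (frame, codon_offset, peptide) tuples for
    every stretch of at least `min_codons` residues between stop codons.
    """
    rc = reverse_complement(genome)
    hits = []
    for frame in range(3):
        protein = translate(rc[frame:])
        offset = 0
        for segment in protein.split("*"):
            if len(segment) >= min_codons:
                hits.append((frame, offset, segment))
            offset += len(segment) + 1  # skip past the stop codon
    return hits
```

A caller would pass the cDNA form of the genomic strand and then express or synthesize the returned peptides for serum screening; the length threshold is a tuning choice, not a value taken from the specification.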
PCT/US1995/006266 1994-05-20 1995-05-17 Detection of viral antigens coded by reverse-reading frames WO1995032292A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU25941/95A AU2594195A (en) 1994-05-20 1995-05-17 Detection of viral antigens coded by reverse-reading frames

Applications Claiming Priority (12)

Application Number Priority Date Filing Date Title
US24698594A 1994-05-20 1994-05-20
US08/246,985 1994-05-20
US28556194A 1994-08-03 1994-08-03
US08/285,561 1994-08-03
US32972994A 1994-10-26 1994-10-26
US08/329,729 1994-10-26
US34427194A 1994-11-23 1994-11-23
US08/344,271 1994-11-23
US35750994A 1994-12-16 1994-12-16
US08/357,509 1994-12-16
US38988695A 1995-02-15 1995-02-15
US08/389,886 1995-02-15

Publications (2)

Publication Number Publication Date
WO1995032292A2 true WO1995032292A2 (en) 1995-11-30
WO1995032292A3 WO1995032292A3 (en) 1996-01-25

Family

ID=27559345

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1995/006266 WO1995032292A2 (en) 1994-05-20 1995-05-17 Detection of viral antigens coded by reverse-reading frames

Country Status (2)

Country Link
AU (1) AU2594195A (en)
WO (1) WO1995032292A2 (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997039129A1 (en) * 1996-04-17 1997-10-23 Wabco B.V. Non-a-non-e hepatitis virus having a translatable core region, reagents and methods for their use
US5709997A (en) * 1995-08-14 1998-01-20 Abbott Laboratories Nucleic acid detection of hepatitis GB virus
EP0832901A1 (en) * 1996-09-18 1998-04-01 Roche Diagnostics GmbH Antibodies to hepatitis G virus, and their diagnostic use to detect HGV, and as therapeutic agent
US5807670A (en) * 1995-08-14 1998-09-15 Abbott Laboratories Detection of hepatitis GB virus genotypes
US5843450A (en) * 1994-02-14 1998-12-01 Abbott Laboratories Hepatitis GB Virus synthetic peptides and uses thereof
US5955318A (en) * 1995-08-14 1999-09-21 Abbott Laboratories Reagents and methods useful for controlling the translation of hepatitis GBV proteins
US5981172A (en) * 1994-02-14 1999-11-09 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E Hepatitis reagents and methods for their use
US6051374A (en) * 1994-02-14 2000-04-18 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
CN1058751C (en) * 1998-06-15 2000-11-22 中国人民解放军第二军医大学 Full length CDNA clone of hepatitis virus G genome and its construction method
US6156495A (en) * 1994-02-14 2000-12-05 Abbott Laboratories Hepatitis GB virus recombinant proteins and uses thereof
US6451578B1 (en) 1994-02-14 2002-09-17 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US6558898B1 (en) 1994-02-14 2003-05-06 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US6586568B1 (en) 1994-02-14 2003-07-01 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US6720166B2 (en) 1994-02-14 2004-04-13 Abbott Laboratories Non-a, non-b, non-c, non-c, non-d, non-e hepatitis reagents and methods for their use

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
J. GEN. VIROL., vol. 60, 1982 pages 381-384, J. STRUTHERS ET AL. 'Identification of a major non-structural protein in the nuclei of Rift Valley Fever virus infected cells' *
J. MED. VIROL., vol. 46, 1995 pages 81-90, G. SCHLAUDER ET AL. 'Molecular and serological analysis in the transmission of the GB hepatitis agents' *
J. VIROL., vol. 52, no. 2, 1984 pages 897-904, D. AUPERIN ET AL. 'Sequencing studies of Pichinde arenavirus S RNA indicate a novel coding strategy, an ambisense viral S RNA' *
J. VIROL., vol. 64, no. 1, 1990 pages 247-255, J. SIMONS ET AL. 'Uukuniemivirus S segment RNA' *
PROC. NATL. ACAD. SCI. USA, vol. 92, 1995 pages 3401-3405, J. SIMONS ET AL. 'Identification of two flavivirus-like genomes in the GB agent' *
RES. VIROL., vol. 140, 1989 pages 155-164, J. SALUZZO ET AL. 'Antigenic and biological properties of Rift Valley fever virus isolated during the 1987 Mauritanian epidemic' *
VIROLOGY, vol. 157, 1987 pages 338-350, H. OVERTON ET AL. 'Identification of the N and NSs proteins coded by the ambisense S RNA of Punta Toro phlebovirus using monospecific antisera raised to baculovirus expressed N and NSs proteins' *
VIROLOGY, vol. 169, 1989 pages 341-345, A. MARRIOTT ET AL. 'The S segment of sandfly fever Sicilian virus: evidence for an ambisense genome' *
VIROLOGY, vol. 180, 1991 pages 738-753, C. GIORGI ET AL. 'Sequences and coding strategies of the S RNAs of Toscana and Rift Valley fever viruses compared to those of Punta Toro, Sicilian Sandfly fever and Uukuniemi viruses' *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6156495A (en) * 1994-02-14 2000-12-05 Abbott Laboratories Hepatitis GB virus recombinant proteins and uses thereof
US5981172A (en) * 1994-02-14 1999-11-09 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E Hepatitis reagents and methods for their use
US6720166B2 (en) 1994-02-14 2004-04-13 Abbott Laboratories Non-a, non-b, non-c, non-c, non-d, non-e hepatitis reagents and methods for their use
US6586568B1 (en) 1994-02-14 2003-07-01 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US5843450A (en) * 1994-02-14 1998-12-01 Abbott Laboratories Hepatitis GB Virus synthetic peptides and uses thereof
US6558898B1 (en) 1994-02-14 2003-05-06 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US6451578B1 (en) 1994-02-14 2002-09-17 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US6051374A (en) * 1994-02-14 2000-04-18 Abbott Laboratories Non-A, non-B, non-C, non-D, non-E hepatitis reagents and methods for their use
US5709997A (en) * 1995-08-14 1998-01-20 Abbott Laboratories Nucleic acid detection of hepatitis GB virus
US5955318A (en) * 1995-08-14 1999-09-21 Abbott Laboratories Reagents and methods useful for controlling the translation of hepatitis GBV proteins
US5807670A (en) * 1995-08-14 1998-09-15 Abbott Laboratories Detection of hepatitis GB virus genotypes
WO1997039129A1 (en) * 1996-04-17 1997-10-23 Wabco B.V. Non-a-non-e hepatitis virus having a translatable core region, reagents and methods for their use
EP0832901A1 (en) * 1996-09-18 1998-04-01 Roche Diagnostics GmbH Antibodies to hepatitis G virus, and their diagnostic use to detect HGV, and as therapeutic agent
CN1058751C (en) * 1998-06-15 2000-11-22 中国人民解放军第二军医大学 Full length cDNA clone of hepatitis virus G genome and its construction method

Also Published As

Publication number Publication date
AU2594195A (en) 1995-12-18
WO1995032292A3 (en) 1996-01-25

Similar Documents

Publication Publication Date Title
JP4296174B2 (en) Hepatitis G virus and its molecular cloning
US5766840A (en) Hepatitis G virus and molecular cloning thereof
Tsukiyama-Kohara et al. Antigenicities of group I and II hepatitis C virus polypeptides—molecular basis of diagnosis
Bartenschlager et al. Nonstructural protein 3 of the hepatitis C virus encodes a serine-type proteinase required for cleavage at the NS3/4 and NS4/5 junctions
Farci et al. Lack of protective immunity against reinfection with hepatitis C virus
Warrener et al. Pestivirus NS3 (p80) protein possesses RNA helicase activity
US6071693A (en) HCV genomic sequences for diagnostics and therapeutics
CZ237793A3 (en) Hcv genom sequences for diagnosis and therapy
WO1995032292A2 (en) Detection of viral antigens coded by reverse-reading frames
EP1227323A1 (en) Non-A, non-B hepatitis virus antigen, diagnostic methods and vaccines
WO1995021922A2 (en) Non-a, non-b, non-c, non-d, non-e hepatitis reagents and methods for their use
US20020081574A1 (en) Methods for identifying inhibitors of helicase C virus
US5874563A (en) Hepatitis G virus and molecular cloning thereof
Yildiz et al. Molecular characterization of a full genome Turkish hepatitis C virus 1b isolate (HCV-TR1): a predominant viral form in Turkey
WO1995032290A2 (en) Non-a/non-b/non-c/non-d/non-e hepatitis agents and molecular cloning thereof
US5859230A (en) Non-A/non-B/non-C/non-D/non-E hepatitis agents and molecular cloning thereof
Hotta et al. Analysis of the core and E1 envelope region sequences of a novel variant of hepatitis C virus obtained in Indonesia
US6110465A (en) Nucleotide and deduced amino acid sequences of hypervariable region 1 of the envelope 2 gene of isolates of hepatitis C virus and the use of reagents derived from these hypervariable sequences in diagnostic methods and vaccines
AU684177C (en) Hepatitis G virus and molecular cloning thereof
KR0120928B1 (en) Novel hcv gene which is separated in korea
JPH08504421A (en) Peptide derived from C33 region of HCV, antibody against the peptide, and method for detecting HCV
Long 2 C. BRÉCHOT
Fitzpatrick Evolution of the Predominant Sequence of the Hypervariable Region in the Putative Envelope Gene E2/NS1 of Hepatitis C Virus in Patients on Haemodialysis
EP0745129A1 (en) Non-a, non-b, non-c, non-d, non-e hepatitis reagents and methods for their use

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SI SK TJ TT UA US US US US US US UZ VN

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

AK Designated states

Kind code of ref document: A3

Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SI SK TJ TT UA US US US US US US UZ VN

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)