WO1993017110A2 - A recombinant hepatitis c virus polypeptide - Google Patents

A recombinant hepatitis C virus polypeptide

Info

Publication number
WO1993017110A2
Authority
WO
WIPO (PCT)
Application number
PCT/GB1993/000345
Other languages
French (fr)
Other versions
WO1993017110A3 (en)
Inventor
David Parker
Brian Colin Rodgers
Original Assignee
The Wellcome Foundation Limited
Priority date
Filing date
Publication date
Application filed by The Wellcome Foundation Limited
Publication of WO1993017110A2
Publication of WO1993017110A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/40 Fusion polypeptide containing a tag for immunodetection, or an epitope for immunisation
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/24011 Flaviviridae
    • C12N2770/24211 Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24222 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes

Definitions

  • This invention relates to recombinant polypeptide or polypeptides for screening for parenterally transmitted non A non B hepatitis (PT-NANBH), DNA sequences encoding such polypeptide or polypeptides, expression vectors containing such DNA sequences and host cells transformed by such expression vectors.
  • the present invention also relates to the use of the recombinant polypeptide or polypeptides in an immuno assay for the diagnosis of PT-NANBH or in a vaccine for its prevention.
  • Non A non B hepatitis is by definition a diagnosis of exclusion and has generally been employed to describe cases of viral hepatitis infection in human beings that are not due to hepatitis A or B viruses.
  • the PT-NANBH virus has also been referred to as Hepatitis C Virus (HCV).
  • the cause of the infection has not been identified although, on clinical and epidemiological grounds, a number of agents have been thought to be responsible as reviewed in Shih et al (Prog Liver Dis., 1986, 8, 433-452). In the USA alone, up to 10% of blood transfusions can result in NANBH, which makes it a significant problem.
  • GB-A-2239245 discloses nucleotide and peptide sequences of a viral agent responsible for PT-NANBH.
  • GB-A-2 239 245 discloses a recombinant polypeptide BHC-11 which comprises an antigen obtained from the non-structural coding region (NS) (the 3'-end) and one antigen from the structural coding region (S) (the 5'-end) of the NANBH virus.
  • BHC-11 (SEQ ID NO: 1 and 2) contains a portion of the non-structural region of the virus, called NS5, (putative replicase) at the amino terminus joined via a synthetic linker (Val, Lys, Lys, Lys, Lys, Lys) to a portion of the structural region which contains almost all the core protein sequence (9 amino acids from the amino terminus are not present) and a part of a sequence from the structural region called E1. It is disclosed that BHC-11 may be used in diagnosis of PT-NANBH. Other workers have used antigens from other regions of the HCV genome in order to screen for PT-NANBH, in particular C-100-3 which contains part of the NS4 region and 33c which comes from NS3 (Review by Alberti A. (1991) J. Hepatol 12: 279-282). C-100-3 and 33c are commercially available from Ortho Diagnostic Systems, Raritan, New Jersey.
  • PT-NANBH can be detected in samples from donors with either NS5 alone or core alone.
  • the present invention provides a recombinant PT-NANBH polypeptide or recombinant PT-NANBH polypeptides which polypeptide comprises or which polypeptides together comprise (i) at least one antigen from the structural coding region of the viral genome, (ii) at least one antigen from the non-structural coding region of the viral genome; and (iii) at least one further antigen from either the structural or non-structural coding region of the viral genome and which is different from the antigens referred to in (i) and (ii).
  • the polypeptide or polypeptides comprise an antigen from the structural coding region having an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 3 or 4 or an antigenic fragment thereof, an antigen from the non-structural coding region having an amino acid sequence that is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 5 or 6 or an antigenic fragment thereof, and an antigen from the non-structural coding region having an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 7 or 8 or an antigenic fragment thereof.
  • An antigenic fragment is a fragment which is capable of being bound by the antigen binding portion of antibodies produced by humans naturally infected by hepatitis C virus.
  • the antigens may be fused to form a single recombinant PT-NANBH polypeptide.
  • the provision of a single polypeptide greatly simplifies the production and purification over that which would be required if each antigen were individually expressed.
  • the antigens may also however be used together as individual recombinant antigens.
  • Preferably the antigens are fused to form a single recombinant polypeptide.
  • polypeptides of the present invention may be fused to a heterologous polypeptide.
  • PT-NANBH polypeptides whether as a single polypeptide or together as individual recombinant polypeptides advantageously provide maximum sensitivity by combining multiple regions, and in particular non-structural and structural regions, from the PT-NANBH genome for screening for PT-NANBH.
  • a recombinant PT-NANBH polypeptide or polypeptides comprising additional antigenic regions and thus include a total of four or more antigenic regions or fragments thereof.
  • the four or more recombinant PT-NANBH polypeptides may be used as a single recombinant polypeptide or together as individual recombinant polypeptides. It is also possible to replace antigenic regions in the polypeptide or polypeptides with other, yet unidentified, PT-NANBH antigenic regions.
  • a PT-NANBH recombinant polypeptide comprising multiple polypeptides
  • the fusion should of course be carried out in such a manner that the antigenic activity of each polypeptide is not significantly compromised by its position relative to another polypeptide.
  • the methods by which such single polypeptides can be obtained are well known in the art.
  • the order of the PT-NANBH polypeptides in the recombinant polypeptide and the sequence used to link them together can be varied.
  • the present invention also provides two novel antigenic regions of the PT-NANBH genome and in particular provides a recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 14 or 15 or an antigenic fragment thereof, and a recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 19 or 20 or 7 or 8 or an antigenic fragment thereof.
  • the PT-NANBH recombinant polypeptide or polypeptides and antigens of the present invention may be obtained using an amino acid synthesizer, if it is a polypeptide having no more than about thirty residues, or by recombinant DNA technology.
  • the present invention also provides a DNA sequence encoding the PT-NANBH recombinant polypeptide or polypeptides as herein defined.
  • the DNA sequences may be synthetic or cloned.
  • the DNA sequences are as set forth in SEQ ID NO: 3, 5, 7, 14, 19 or 26.
  • the present invention also provides expression vectors containing the DNA sequences as herein defined, which vectors being capable, in an appropriate host, of expressing the DNA sequences to produce the PT-NANBH recombinant polypeptide or polypeptides of the invention.
  • the expression vector normally contains control elements of DNA that effect expression of the DNA sequence in an appropriate host. These elements may vary according to the host but usually include a promoter, ribosome binding site, translational start and stop sites, and a transcriptional termination site. Examples of such vectors include plasmids and viruses.
  • Expression vectors of the present invention encompass both extrachromosomal vectors and vectors that are integrated into the host cell's chromosome.
  • the expression vector may contain the DNA sequence of the present invention optionally as a fusion linked to either the 5'- or 3'-end of the DNA sequence encoding, for example, β-galactosidase or to the 3'-end of the DNA sequence encoding, for example, the trp E gene.
  • the DNA sequence is optionally fused to the polyhedrin coding sequence.
  • the present invention also provides a host cell transformed with expression vectors as herein defined.
  • host cells of use with the present invention include prokaryotic and eukaryotic cells, such as bacterial, yeast, mammalian and insect cells. Particular examples of such cells are E. coli, S. cerevisiae, P. pastoris, Chinese hamster ovary and mouse cells, and Spodoptera frugiperda and Trichoplusia ni.
  • the choice of host cell may depend on a number of factors but, if post-translational modification of the PT-NANBH viral polypeptide is important, then a eukaryotic host would be preferred.
  • the present invention also provides a process for preparing a recombinant PT-NANBH polypeptide or PT-NANBH polypeptides which comprises isolating the DNA sequences, as herein defined, from the PT-NANBH genome, preferably by an amplification process, or synthesising DNA sequences encoding the antigens of PT-NANBH recombinant polypeptide or polypeptides, as herein defined, inserting the DNA sequences into one or more expression vectors such that it is or they are capable, in an appropriate host, of being expressed, transforming host cells with the one or more expression vectors, culturing the transformed host cells, and isolating the recombinant polypeptide or polypeptides.
  • Amplification is preferably carried out by the polymerase chain reaction (PCR) technique (Saiki et al, Science, 1985, 230, 1350-4).
  • DNA sequence encoding PT-NANBH recombinant polypeptide and the two PT-NANBH antigens may be synthesised using standard procedures (Gait, Oligonucleotide Synthesis: A Practical Approach, 1984, Oxford, IRL Press).
  • the desired DNA sequence obtained as described above may be inserted into an expression vector using known and standard techniques.
  • the expression vector is normally cut using restriction enzymes and the DNA sequence inserted using blunt-end or staggered-end ligation.
  • the cut is usually made at a restriction site in a convenient position in the expression vector such that, once inserted, the DNA sequences are under the control of the functional elements of DNA that effect its expression.
  • Transformation of a host cell may be carried out using standard techniques. Some phenotypic marker is usually employed to distinguish between the transformants that have successfully taken up the expression vector and those that have not. Culturing of the transformed host cell and isolation of the PT-NANBH recombinant polypeptide or polypeptides as required may also be carried out using standard techniques. Diagnostic assays based upon the present invention may be used to determine the presence or absence of PT-NANBH infection. They may also be used to monitor treatment of such infection, for example in interferon therapy. In an assay for the diagnosis of viral infection, there are basically three distinct approaches that can be adopted involving the detection of viral nucleic acid, viral antigen or viral antibody as discussed in GB-A-2 239 245.
  • the method may comprise contacting a test sample with a PT-NANBH recombinant polypeptide or polypeptides of the present invention and determining whether there is any antigen-antibody binding contained within the test sample.
  • a test kit may be provided comprising a PT-NANBH recombinant polypeptide or polypeptides, as defined herein, and means for determining whether there is any binding with antibody contained in the test sample.
  • the test sample may be taken from any of the appropriate tissues and physiological fluids mentioned above for the detection of viral nucleic acid. If a physiological fluid is obtained, it may optionally be concentrated for any viral antibody present.
  • the PT-NANBH recombinant polypeptide or polypeptides can be used to capture selectively antibody against PT-NANBH from solution, to label selectively the antibody already captured, or both to capture and label the antibody.
  • the recombinant polypeptide or polypeptides may be used in a variety of homogeneous assay formats in which the antibody reactive with the antigen is detected in solution with no separation of phases.
  • the types of assay in which the PT-NANBH recombinant polypeptide or polypeptides are used to capture antibody from solution involve immobilization of the polypeptide or polypeptides on to a solid surface. This surface should be capable of being washed in some way.
  • suitable surfaces include polymers of various types (moulded into microtitre wells; beads; dipsticks of various types; aspiration tips; electrodes; and optical devices), particles (for example latex; stabilized red blood cells; bacterial or fungal cells; spores; gold or other metallic or metal-containing sols; and proteinaceous colloids) with the usual size of the particle being from 0.02 to 5 microns, membranes (for example of nitrocellulose; paper; cellulose acetate; and high porosity/high surface area membranes of an organic or inorganic material).
  • the attachment of the PT-NANBH recombinant polypeptide or polypeptides to the surface can be by passive adsorption from a solution of optimum composition which may include surfactants, solvents, salts and/or chaotropes; or by active chemical bonding.
  • Active bonding may be through a variety of reactive or activatible functional groups which may be exposed on the surface (for example condensing agents; active acid esters, halides and anhydrides; amino, hydroxyl, or carboxyl groups; sulphydryl groups; carbonyl groups; diazo groups; or unsaturated groups).
  • the active bonding may be through a protein (itself attached to the surface passively or through active bonding), such as albumin or casein, to which the viral polypeptide may be chemically bonded by any of a variety of methods.
  • a protein in this way may confer advantages because of isoelectric point, charge, hydrophilicity or other physico-chemical property.
  • the viral polypeptide may also be attached to the surface (usually but not necessarily a membrane) following electrophoretic separation of a reaction mixture, such as immune precipitation.
  • After contacting (reacting) the surface bearing the PT-NANBH recombinant polypeptide or polypeptides with a test sample, allowing time for reaction, and, where necessary, removing the excess of the sample by any of a variety of means (such as washing, centrifugation, filtration, magnetism or capillary action), the captured antibody is detected by any means which will give a detectable signal.
  • this may be achieved by use of a labelled molecule or particle as described above which will react with the captured antibody (for example protein A or protein G and the like; anti-species or anti-immunoglobulin-sub-type; rheumatoid factor; or antibody to the antigen, used in a competitive or blocking fashion), or any molecule containing an epitope contained in the polypeptide.
  • the detectable signal may be optical or radioactive or physico-chemical and may be provided directly by labelling the molecule or particle with, for example, a dye, radiolabel, electroactive species, magnetically resonant species or fluorophore, or indirectly by labelling the molecule or particle with an enzyme itself capable of giving rise to a measurable change of any sort.
  • the detectable signal may be obtained using, for example, agglutination, or through a diffraction or birefringent effect if the surface is in the form of particles.
  • Assays in which a PT-NANBH recombinant polypeptide or polypeptides itself is used to label an already captured antibody require some form of labelling of the antigen which will allow it to be detected.
  • the labelling may be direct by chemically or passively attaching for example a radio label, magnetic resonant species, particle or enzyme label to the polypeptide; or indirect by attaching any form of label to a molecule which will itself react with the polypeptide or polypeptides.
  • the chemistry of bonding a label to the PT-NANBH recombinant polypeptide or polypeptides can be directly through a moiety already present in the polypeptide or polypeptides, such as an amino group, or through an intermediate moiety, such as a maleimide group.
  • Capture of the antibody may be on any of the surfaces already mentioned by any reagent including passive or activated adsorption which will result in specific antibody or immune complexes being bound. In particular, capture of the antibody could be by anti-species or anti-immunoglobulin-sub-type, by rheumatoid factor, proteins A, G and the like, or by any molecule containing an epitope contained in the polypeptide or polypeptides.
  • the labelled PT-NANBH recombinant polypeptide or polypeptides may be used in a competitive binding fashion in which its binding to any specific molecule on any of the surfaces exemplified above is blocked by antigen in the sample. Alternatively, it may be used in a non-competitive fashion in which antigen in the sample is bound specifically or non-specifically to any of the surfaces above and is also bound to a specific bi- or poly-valent molecule (e.g. an antibody) with the remaining valencies being used to capture the labelled polypeptide or polypeptides.
  • the PT-NANBH recombinant polypeptide or polypeptides and an antibody are separately labelled so that, when the antibody reacts with the recombinant polypeptide or polypeptides in free solution, the two labels interact to allow, for example, non-radiative transfer of energy captured by one label to the other label with appropriate detection of the excited second label or quenched first label (e.g. by fluorimetry, magnetic resonance or enzyme measurement).
  • Addition of either viral polypeptide or antibody in a sample results in restriction of the interaction of the labelled pair and thus in a different level of signal in the detector.
  • a suitable assay format for detecting PT-NANBH antibody is the direct sandwich enzyme immunoassay (EIA) format.
  • a PT-NANBH recombinant polypeptide or polypeptides is coated onto microtitre wells.
  • a test sample and a PT-NANBH recombinant polypeptide or polypeptides to which an enzyme is coupled are added simultaneously.
  • Any PT-NANBH antibody present in the test sample binds both to the recombinant polypeptide or polypeptides coating the well and to the enzyme-coupled recombinant polypeptide or polypeptides.
  • the same recombinant polypeptide or polypeptides are used on both sides of the sandwich. After washing, bound enzyme is detected using a specific substrate involving a colour change.
  • a test kit for use in such an EIA comprises:
  • (3) means providing a surface on which a PT-NANBH recombinant polypeptide or polypeptides is immobilised; and (4) optionally, washing solutions and/or buffers.
  • the recombinant polypeptide or polypeptides of the present invention may be incorporated into a vaccine formulation for inducing immunity to PT-NANBH in man.
  • the recombinant polypeptide or polypeptides may be presented in association with a pharmaceutically acceptable carrier.
  • the recombinant polypeptide or polypeptides may optionally be presented as part of a hepatitis B core fusion particle, as described in Clarke et al (Nature, 1987, 330, 381-384), or a polylysine based polymer, as described in Tam (PNAS, 1988, 85, 5409-5413).
  • the recombinant polypeptide or polypeptides may optionally be attached to a particulate structure, such as liposomes or ISCOMS.
  • Pharmaceutically acceptable carriers include liquid media suitable for use as vehicles to introduce the recombinant polypeptide or polypeptides into a patient. An example of such liquid media is saline solution.
  • the recombinant polypeptide or polypeptides may be dissolved or suspended as a solid in the carrier.
  • the vaccine formulation may also contain an adjuvant for stimulating the immune response and thereby enhancing the effect of the vaccine.
  • adjuvants include aluminium hydroxide and aluminium phosphate.
  • the vaccine formulation may contain a final concentration of recombinant polypeptide or polypeptides in the range from 0.01 to 5 mg/ml, preferably from 0.03 to 2 mg/ml.
  • the vaccine formulation may be incorporated into a sterile container, which is then sealed and stored at a low temperature, for example 4°C, or may be freeze-dried.
  • one or more doses of the vaccine formulation may be administered. Each dose may be 0.1 to 2 ml, preferably 0.2 to 1 ml.
  • a method for inducing immunity to PT-NANBH in man comprises the administration of an effective amount of a vaccine formulation, as hereinbefore defined.
  • the present invention also provides the use of a PT-NANBH recombinant polypeptide or polypeptides as herein defined in the preparation of a vaccine for use in the induction of immunity to PT-NANBH in man.
  • Vaccines of the present invention may be administered by any convenient method for the administration of vaccines including oral and parenteral (e.g. intravenous, subcutaneous or intramuscular) injection.
  • the treatment may consist of a single dose of vaccine or a plurality of doses over a period of time.
  • Figure 1 shows a schematic representation of the NANBH genome including the structural and non-structural coding regions together with the recombinant polypeptide or polypeptides of the present invention.
  • Example 1 Production and Expression of BHC-19
  • the NS4 region (2258-3347 bp) from a 6587 bp contiguous sequence from the 3' region of the PT-NANBH genome, SEQ ID NO: 9, is amplified by PCR using primers D224 and D226 (SEQ ID NOs: 11 and 13), and the 1119 bp fragment (SEQ ID NO: 14) is cloned into a vector and expressed in infected insect cells as in the method described in Example 7 of GB-A-2 239 245.
  • the recombinant virus (BHC-19) was able to express the NS4 specific recombinant protein (60 KDa apparent MW) at low levels in the infected insect cells.
  • Example 2 Production of DX200 and expression in E. coli
  • the NS3 region (1125-2090 bp) from the above-mentioned 6587 bp contiguous sequence is amplified by PCR using primers D344 and D308 (SEQ ID NOs: 16 and 18), and the resulting 993 bp fragment (SEQ ID NO: 19) is cloned into the E. coli expression vector pDEV107, which fuses the sequence onto the C-terminus of β-galactosidase; the resulting fusion protein (DX200, 125 kDa apparent MW) is expressed in E. coli in accordance with the method of Example 6 in GB-A-2 239 245.
  • Example 3 Performance of BHC-19, DX200, BHC-4 and BHC-7 in an EIA
  • BHC-19, DX200, BHC-4 (a core construct fusing the first 135 amino acids of the core protein onto the 3' of the polyhedrin gene, 35 kDa apparent MW), SEQ ID NO: 21 or 22, and BHC-7 (NS5, 100 kDa apparent MW, as disclosed in GB-A-2239245) can be resolved from each other when co-electrophoresed on SDS-PAGE.
  • the NS3 specific region of pDX200 was amplified by PCR using the primers D360 and D361 (SEQ ID NO: 23 and 24) to produce a 600 bp fragment (SEQ ID NO: 7) with PstI ends.
  • the fragment was cloned into PstI-digested pDX136 (the transfer vector used to produce BHC-11 in GB-A-2 239 245).
  • the resulting transformants were analyzed by restriction enzyme mapping to identify those which contained the NS3 sequence inserted in the correct orientation between the NS5 and core parts of pDX136; this was called pDX208, SEQ ID NO: 26 or 27, see Figure 1.
  • a stock of the recombinant baculovirus BHC-28 was produced.
  • Insect cells infected with BHC-28 produce a new band of 121 kDa apparent molecular weight as judged by SDS-PAGE of cell extracts. This band reacts with sera which contain antibodies to each of its constituent PT-NANBH antigens and appears to be the expected tripartite fusion of NS5-NS3-core in that order.
  • BHC-28 was partially purified from infected cells and used to coat microtitre wells. These wells were then used in an EIA to determine whether or not BHC-28 is better than BHC-11 at detecting NS3-only sera.
  • Table 2 presents the comparative data using sera having a predominant reaction with NS3, some of which are identified in Table 1.
  • a vaccine formulation may be prepared by conventional techniques using the following constituents in the indicated amounts.
  • MOLECULE TYPE cDNA to genomic RNA
  • FEATURE
  • GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG 2016
  • GGT AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTC GCG GCC 2592 Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
  • MOLECULE TYPE cDNA to genomic RNA
  • MOLECULE TYPE cDNA to genomic RNA
  • AAG CGC AGG CTG GCC AGG GGG TCT CCC CCC TCC TTG GCC AGC TCT TCA 240 Lys Arg Arg Leu Ala Arg Gly Ser Pro Ser Leu Ala Ser Ser Ser Ser
  • GGC ACG GCA ACC GCC CCT CCT GAC CAA TCC TCC GAC GAC GGC GGA GCA 768 Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser Asp Asp Gly Gly Ala
  • GGT CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC GGC TAT CGC CGG TGC 1728 Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys
  • MOLECULE TYPE cDNA to genomic RNA
  • baculovirus BHC-28 (e.g.4) which expresses a recombinant protein containing NS5 - NS3 - core sequences in infected cells
  • MOLECULE TYPE cDNA to genomic RNA
  • GGT TGC ATC ATT ACC AGC CTC ACA GGT CGG GAC AAG AAC CAA GTC GAG 720 Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
  • GGC AAA GCC ATC CCT ATT GAG ACC ATC AAG GGG GGG AGG CAC CTC ATT 1728 Gly Lys Ala Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile
  • GCT GCT TCA GCT TTC GTA
  • GGC GCC
  • GGC ATT GCT GGT GCG
  • TTC TCC ATC CTT CTA GCC CAG GAG CAA CTT GAA AAA GCC CTA GAT TGT 6624 Phe Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys
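The sequence-listing fragments above pair each row of codons with its three-letter amino acid translation. As a hedged illustration, the sketch below translates a row of space-separated codons with the standard genetic code so that a row can be checked against its listed residues. The function name `translate` and the table-building approach are choices made for this example and are not part of the patent.

```python
from typing import List

# Standard genetic code, laid out in the conventional T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(codon_row: str) -> List[str]:
    """Translate space-separated DNA codons into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[codon.upper()]] for codon in codon_row.split()]


row = "GGT AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTC GCG GCC"
print(" ".join(translate(row)))
# Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
```

Comparing the output with the residues printed beside a row is a quick way to spot transcription or extraction damage in either the codons or the translation.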

Abstract

The present invention relates to a recombinant PT-NANBH polypeptide or recombinant PT-NANBH polypeptides which polypeptide comprises or which polypeptides together comprise (i) at least one antigen from the structural coding region of the viral genome, (ii) at least one antigen from the non-structural coding region of the viral genome; and (iii) at least one further antigen from either the structural or non-structural coding region of the viral genome and which is different from the antigens referred to in (i) and (ii), and the use of the polypeptide in an immuno assay for the diagnosis of PT-NANBH or in a vaccine for its prevention.

Description

A RECOMBINANT HEPATITIS C VIRUS POLYPEPTIDE
This invention relates to recombinant polypeptide or polypeptides for screening for parenterally transmitted non A non B hepatitis (PT-NANBH), DNA sequences encoding such polypeptide or polypeptides, expression vectors containing such DNA sequences and host cells transformed by such expression vectors. The present invention also relates to the use of the recombinant polypeptide or polypeptides in an immuno assay for the diagnosis of PT-NANBH or in a vaccine for its prevention.
Non A non B hepatitis (NANBH) is by definition a diagnosis of exclusion and has generally been employed to describe cases of viral hepatitis infection in human beings that are not due to hepatitis A or B viruses. The PT-NANBH virus has also been referred to as Hepatitis C Virus (HCV). In the majority of such cases, the cause of the infection has not been identified although, on clinical and epidemiological grounds, a number of agents have been thought to be responsible as reviewed in Shih et al (Prog Liver Dis., 1986, 8, 433-452). In the USA alone, up to 10% of blood transfusions can result in NANBH, which makes it a significant problem.
GB-A-2239245 discloses nucleotide and peptide sequences of a viral agent responsible for PT-NANBH. In particular, GB-A-2 239 245 discloses a recombinant polypeptide BHC-11 which comprises an antigen obtained from the non-structural coding region (NS) (the 3'-end) and one antigen from the structural coding region (S) (the 5'-end) of the NANBH virus. Specifically BHC-11 (SEQ ID NO: 1 and 2) contains a portion of the non-structural region of the virus, called NS5, (putative replicase) at the amino terminus joined via a synthetic linker (Val, Lys, Lys, Lys, Lys, Lys) to a portion of the structural region which contains almost all the core protein sequence (9 amino acids from the amino terminus are not present) and a part of a sequence from the structural region called E1. It is disclosed that BHC-11 may be used in diagnosis of PT-NANBH. Other workers have used antigens from other regions of the HCV genome in order to screen for PT-NANBH, in particular C-100-3 which contains part of the NS4 region and 33c which comes from NS3 (Review by Alberti A. (1991) J. Hepatol 12: 279-282). C-100-3 and 33c are commercially available from Ortho Diagnostic Systems, Raritan, New Jersey.
It has been found that the immune response to PT-NANBH infection can be varied and can be restricted to particular viral antigens; for example, GB-A-2 239 245 discloses that PT-NANBH can be detected in samples from donors with either NS5 alone or core alone.
It has now surprisingly been found that if at least three different PT-NANBH antigens are used to screen for PT-NANBH the screening is much more sensitive for PT-NANBH, as compared to the use of only two PT-NANBH antigens.
Accordingly, the present invention provides a recombinant PT-NANBH polypeptide or recombinant PT-NANBH polypeptides which polypeptide comprises or which polypeptides together comprise (i) at least one antigen from the structural coding region of the viral genome, (ii) at least one antigen from the non-structural coding region of the viral genome; and (iii) at least one further antigen from either the structural or non-structural coding region of the viral genome and which is different from the antigens referred to in (i) and (ii).
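The combination requirement set out above can be expressed as a simple check. The following is a minimal sketch only and is not part of the disclosed method; the function name `satisfies_combination` and the region table are illustrative assumptions, with core and E1 treated as structural and NS3, NS4 and NS5 as non-structural, as in the description.

```python
# Illustrative sketch: the names and the region table are assumptions,
# not terms defined in the specification.
STRUCTURAL = {"core", "E1"}
NON_STRUCTURAL = {"NS3", "NS4", "NS5"}


def satisfies_combination(antigens):
    """Return True if the antigens include (i) a structural antigen,
    (ii) a non-structural antigen and (iii) at least one further,
    different antigen from either region."""
    distinct = set(antigens)
    unknown = distinct - STRUCTURAL - NON_STRUCTURAL
    if unknown:
        raise ValueError(f"unclassified antigen(s): {sorted(unknown)}")
    has_structural = bool(distinct & STRUCTURAL)
    has_non_structural = bool(distinct & NON_STRUCTURAL)
    # Requirement (iii) is met when a third distinct antigen is present.
    return has_structural and has_non_structural and len(distinct) >= 3


print(satisfies_combination(["NS5", "core"]))         # two antigens only
print(satisfies_combination(["NS5", "NS3", "core"]))  # three different antigens
```

Duplicated entries are collapsed before counting, so listing the same antigen twice does not satisfy requirement (iii).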
Preferably, the polypeptide or polypeptides comprise an antigen from the structural coding region having an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 3 or 4 or an antigenic fragment thereof, an antigen from the non-structural coding region having an amino acid sequence that is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 5 or 6 or an antigenic fragment thereof, and an antigen from the non-structural coding region having an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 7 or 8 or an antigenic fragment thereof.
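The description does not state how percentage homology is to be calculated. As one possible reading, the sketch below computes simple percent identity between two pre-aligned sequences of equal length (one-letter codes, `-` for a gap) and reports the highest of the stated thresholds that is met. The function names, the identity-based definition and the variant sequence are assumptions for illustration; the reference string is merely the first twenty residues translated in SEQ ID NO: 1, used as sample text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns.

    Columns where both sequences have a gap are ignored; a column where
    only one sequence has a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def highest_threshold_met(pct, thresholds=(95, 90, 80, 70)):
    """Return the highest stated threshold that pct reaches, or None."""
    for t in thresholds:
        if pct >= t:
            return t
    return None


reference = "MPDYSYRPTIGPDPPSLSAE"   # sample text from SEQ ID NO: 1
variant = "MPDYSYKPTIGPDPPSLTAE"     # hypothetical variant, two substitutions
pct = percent_identity(reference, variant)
print(pct, highest_threshold_met(pct))
```

With two substitutions in twenty residues the identity is 90%, so the "at least 90%" level is met but the "at least 95%" level is not.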
An antigenic fragment is a fragment which is capable of being bound by the antigen binding portion of antibodies produced by humans naturally infected by hepatitis C virus.
The antigens may be fused to form a single recombinant PT-NANBH polypeptide. The provision of a single polypeptide greatly simplifies the production and purification over that which would be required if each antigen were individually expressed. The antigens may also however be used together as individual recombinant antigens. Preferably the antigens are fused to form a single recombinant polypeptide.
The polypeptides of the present invention may be fused to a heterologous polypeptide.
The recombinant PT-NANBH polypeptides whether as a single polypeptide or together as individual recombinant polypeptides advantageously provide maximum sensitivity by combining multiple regions, and in particular non-structural and structural regions, from the PT-NANBH genome for screening for PT-NANBH.
It may of course be possible to provide a recombinant PT-NANBH polypeptide or polypeptides comprising additional antigenic regions and thus include a total of four or more antigenic regions or fragments thereof. The four or more recombinant PT-NANBH polypeptides may be used as a single recombinant polypeptide or together as individual recombinant polypeptides. It is also possible to replace antigenic regions in the polypeptide or polypeptides with other, as yet unidentified, PT-NANBH antigenic regions.
To obtain a PT-NANBH recombinant polypeptide comprising multiple polypeptides, it is preferred to fuse the individual coding sequences into a single open reading frame. The fusion should of course be carried out in such a manner that the antigenic activity of each polypeptide is not significantly compromised by its position relative to another polypeptide. Particular regard should be had for the nature of the sequences at the actual junction between the polypeptides. The methods by which such single polypeptides can be obtained are well known in the art. The order in which the PT-NANBH polypeptides appear in the recombinant polypeptide, and the sequence used to link them together, can be varied.
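The requirement that the coding sequences form a single open reading frame can be illustrated with a short check. This is a sketch under stated assumptions, not the method of the specification: the function names are hypothetical, the two fragments are short stretches of codons copied from SEQ ID NO: 1 purely as sample input, and the linker is the Val-(Lys)5 coding stretch that appears in the same listing.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a coding sequence into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def check_in_frame(seq, label):
    """Raise ValueError if seq is not a whole number of codons or
    contains an in-frame stop codon."""
    if len(seq) % 3 != 0:
        raise ValueError(f"{label}: length {len(seq)} is not a multiple of 3")
    if STOP_CODONS & set(codons(seq)):
        raise ValueError(f"{label}: contains an in-frame stop codon")


def fuse_in_frame(fragments, linker=""):
    """Join coding fragments with a linker so that the reading frame is
    preserved across every junction."""
    linker = linker.upper()
    if linker:
        check_in_frame(linker, "linker")
    parts = []
    for idx, frag in enumerate(fragments):
        frag = frag.upper()
        check_in_frame(frag, f"fragment {idx}")
        parts.append(frag)
    return linker.join(parts)


fragment_a = "ACAGAAGTGGATGGG"     # Thr Glu Val Asp Gly (sample codons)
fragment_b = "ACCAAACGTAACACC"     # Thr Lys Arg Asn Thr (sample codons)
linker = "GTAAAGAAGAAGAAGAAG"      # Val Lys Lys Lys Lys Lys
fused = fuse_in_frame([fragment_a, fragment_b], linker)
print(len(fused) // 3, "codons:", " ".join(codons(fused)))
```

A check of this kind only confirms that the frame is maintained; it says nothing about whether antigenic activity is preserved at the junctions, which has to be established experimentally.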
The present invention also provides two novel antigenic regions of the PT-NANBH genome and in particular provides a recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 14 or 15 or an antigenic fragment thereof, and a recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70%, preferably at least 80%, more preferably at least 90% and even more preferably at least 95%, homologous with the amino acid sequence as set forth in SEQ ID NO: 19 or 20 or 7 or 8 or an antigenic fragment thereof.
The PT-NANBH recombinant polypeptide or polypeptides and antigens of the present invention may be obtained using an amino acid synthesizer, if it is a polypeptide having no more than about thirty residues, or by recombinant DNA technology.
The present invention also provides a DNA sequence encoding the PT-NANBH recombinant polypeptide or polypeptides as herein defined. The DNA sequences may be synthetic or cloned. Preferably the DNA sequences are as set forth in SEQ ID NO: 3, 5, 7, 14, 19 or 26.
The present invention also provides expression vectors containing the DNA sequences as herein defined, which vectors are capable, in an appropriate host, of expressing the DNA sequences to produce the PT-NANBH recombinant polypeptide or polypeptides of the invention.
The expression vector normally contains control elements of DNA that effect expression of the DNA sequence in an appropriate host. These elements may vary according to the host but usually include a promoter, ribosome binding site, translational start and stop sites, and a transcriptional termination site. Examples of such vectors include plasmids and viruses. Expression vectors of the present invention encompass both extrachromosomal vectors and vectors that are integrated into the host cell's chromosome. For use in E. coli, the expression vector may contain the DNA sequence of the present invention optionally as a fusion linked to either the 5'- or 3'-end of the DNA sequence encoding, for example, β-galactosidase or to the 3'-end of the DNA sequence encoding, for example, the trp E gene. For use in the insect baculovirus (AcNPV) system, the DNA sequence is optionally fused to the polyhedrin coding sequence.
The present invention also provides a host cell transformed with expression vectors as herein defined.
Examples of host cells of use with the present invention include prokaryotic and eukaryotic cells, such as bacterial, yeast, mammalian and insect cells. Particular examples of such cells are E. coli, S. cerevisiae, P. pastoris, Chinese hamster ovary and mouse cells, and Spodoptera frugiperda and Trichoplusia ni. The choice of host cell may depend on a number of factors but, if post-translational modification of the PT-NANBH viral polypeptide is important, then a eukaryotic host would be preferred.
The present invention also provides a process for preparing a recombinant PT-NANBH polypeptide or PT-NANBH polypeptides which comprises isolating the DNA sequences, as herein defined, from the PT-NANBH genome, preferably by an amplification process, or synthesising DNA sequences encoding the antigens of the PT-NANBH recombinant polypeptide or polypeptides, as herein defined, inserting the DNA sequences into one or more expression vectors such that it is or they are capable, in an appropriate host, of being expressed, transforming host cells with the one or more expression vectors, culturing the transformed host cells, and isolating the recombinant polypeptide or polypeptides. Amplification is preferably carried out by the polymerase chain reaction (PCR) technique (Saiki et al, Science, 1985, 230, 1350-4).
The DNA sequence encoding PT-NANBH recombinant polypeptide and the two PT-NANBH antigens may be synthesised using standard procedures (Gait, Oligonucleotide Synthesis: A Practical Approach, 1984, Oxford, IRL Press).
The desired DNA sequence obtained as described above may be inserted into an expression vector using known and standard techniques. The expression vector is normally cut using restriction enzymes and the DNA sequence inserted using blunt-end or staggered-end ligation. The cut is usually made at a restriction site in a convenient position in the expression vector such that, once inserted, the DNA sequences are under the control of the functional elements of DNA that effect their expression.
Transformation of a host cell may be carried out using standard techniques. Some phenotypic marker is usually employed to distinguish between the transformants that have successfully taken up the expression vector and those that have not. Culturing of the transformed host cell and isolation of the PT-NANBH recombinant polypeptide or polypeptides as required may also be carried out using standard techniques.
Diagnostic assays based upon the present invention may be used to determine the presence or absence of PT-NANBH infection. They may also be used to monitor treatment of such infection, for example in interferon therapy.
In an assay for the diagnosis of viral infection, there are basically three distinct approaches that can be adopted involving the detection of viral nucleic acid, viral antigen or viral antibody as discussed in GB-A-2 239 245. In an assay for the diagnosis of PT-NANBH involving detection of viral antibody, the method may comprise contacting a test sample with a PT-NANBH recombinant polypeptide or polypeptides of the present invention and determining whether there is any antigen-antibody binding contained within the test sample. For this purpose, a test kit may be provided comprising a PT-NANBH recombinant polypeptide or polypeptides, as defined herein, and means for determining whether there is any binding with antibody contained in the test sample. The test sample may be taken from any of the appropriate tissues and physiological fluids mentioned above for the detection of viral nucleic acid. If a physiological fluid is obtained, it may optionally be concentrated for any viral antibody present.
A variety of assay formats may be employed. The PT-NANBH recombinant polypeptide or polypeptides can be used to capture selectively antibody against PT-NANBH from solution, to label selectively the antibody already captured, or both to capture and label the antibody. In addition, the recombinant polypeptide or polypeptides may be used in a variety of homogeneous assay formats in which the antibody reactive with the antigen is detected in solution with no separation of phases. The types of assay in which the PT-NANBH recombinant polypeptide or polypeptides are used to capture antibody from solution involve immobilization of the polypeptide or polypeptides on to a solid surface. This surface should be capable of being washed in some way. Examples of suitable surfaces include polymers of various types (moulded into microtitre wells; beads; dipsticks of various types; aspiration tips; electrodes; and optical devices), particles (for example latex; stabilized red blood cells; bacterial or fungal cells; spores; gold or other metallic or metal-containing sols; and proteinaceous colloids) with the usual size of the particle being from 0.02 to 5 microns, membranes (for example of nitrocellulose; paper; cellulose acetate; and high porosity/high surface area membranes of an organic or inorganic material).
The attachment of the PT-NANBH recombinant polypeptide or polypeptides to the surface can be by passive adsorption from a solution of optimum composition which may include surfactants, solvents, salts and/or chaotropes; or by active chemical bonding. Active bonding may be through a variety of reactive or activatible functional groups which may be exposed on the surface (for example condensing agents; active acid esters, halides and anhydrides; amino, hydroxyl, or carboxyl groups; sulphydryl groups; carbonyl groups; diazo groups; or unsaturated groups). Optionally, the active bonding may be through a protein (itself attached to the surface passively or through active bonding), such as albumin or casein, to which the viral polypeptide may be chemically bonded by any of a variety of methods. The use of a protein in this way may confer advantages because of isoelectric point, charge, hydrophilicity or other physico-chemical property. The viral polypeptide may also be attached to the surface (usually but not necessarily a membrane) following electrophoretic separation of a reaction mixture, such as immune precipitation.
After contacting (reacting) the surface bearing the PT-NANBH recombinant polypeptide or polypeptides with a test sample, allowing time for reaction, and, where necessary, removing the excess of the sample by any of a variety of means (such as washing, centrifugation, filtration, magnetism or capillary action), the captured antibody is detected by any means which will give a detectable signal. For example, this may be achieved by use of a labelled molecule or particle as described above which will react with the captured antibody (for example protein A or protein G and the like; anti-species or anti-immunoglobulin-sub-type; rheumatoid factor; or antibody to the antigen, used in a competitive or blocking fashion), or any molecule containing an epitope contained in the polypeptide. The detectable signal may be optical or radioactive or physico-chemical and may be provided directly by labelling the molecule or particle with, for example, a dye, radiolabel, electroactive species, magnetically resonant species or fluorophore, or indirectly by labelling the molecule or particle with an enzyme itself capable of giving rise to a measurable change of any sort. Alternatively the detectable signal may be obtained using, for example, agglutination, or through a diffraction or birefringent effect if the surface is in the form of particles.
Assays in which a PT-NANBH recombinant polypeptide or polypeptides itself is used to label an already captured antibody require some form of labelling of the antigen which will allow it to be detected. The labelling may be direct, by chemically or passively attaching, for example, a radiolabel, magnetic resonant species, particle or enzyme label to the polypeptide; or indirect, by attaching any form of label to a molecule which will itself react with the polypeptide or polypeptides. The chemistry of bonding a label to the PT-NANBH recombinant polypeptide or polypeptides can be directly through a moiety already present in the polypeptide or polypeptides, such as an amino group, or through an intermediate moiety, such as a maleimide group. Capture of the antibody may be on any of the surfaces already mentioned by any reagent including passive or activated adsorption which will result in specific antibody or immune complexes being bound. In particular, capture of the antibody could be by anti-species or anti-immunoglobulin-sub-type, by rheumatoid factor, proteins A, G and the like, or by any molecule containing an epitope contained in the polypeptide or polypeptides. The labelled PT-NANBH recombinant polypeptide or polypeptides may be used in a competitive binding fashion in which its binding to any specific molecule on any of the surfaces exemplified above is blocked by antigen in the sample. Alternatively, it may be used in a non-competitive fashion in which antigen in the sample is bound specifically or non-specifically to any of the surfaces above and is also bound to a specific bi- or poly-valent molecule (e.g. an antibody) with the remaining valencies being used to capture the labelled polypeptide or polypeptides.
Often in homogeneous assays the PT-NANBH recombinant polypeptide or polypeptides and an antibody are separately labelled so that, when the antibody reacts with the recombinant polypeptide or polypeptides in free solution, the two labels interact to allow, for example, non-radiative transfer of energy captured by one label to the other label with appropriate detection of the excited second label or quenched first label (e.g. by fluorimetry, magnetic resonance or enzyme measurement). Addition of either viral polypeptide or antibody in a sample results in restriction of the interaction of the labelled pair and thus in a different level of signal in the detector.
A suitable assay format for detecting PT-NANBH antibody is the direct sandwich enzyme immunoassay (EIA) format. A PT-NANBH recombinant polypeptide or polypeptides is coated onto microtitre wells. A test sample and a PT-NANBH recombinant polypeptide or polypeptides to which an enzyme is coupled are added simultaneously. Any PT-NANBH antibody present in the test sample binds both to the recombinant polypeptide or polypeptides coating the well and to the enzyme-coupled recombinant polypeptide or polypeptides. Typically, the same recombinant polypeptide or polypeptides are used on both sides of the sandwich. After washing, bound enzyme is detected using a specific substrate involving a colour change. A test kit for use in such an EIA comprises:
(1) a PT-NANBH recombinant polypeptide or polypeptides labelled with an enzyme;
(2) a substrate for the enzyme;
(3) means providing a surface on which a PT-NANBH recombinant polypeptide or polypeptides is immobilised; and
(4) optionally, washing solutions and/or buffers.
The recombinant polypeptide or polypeptides of the present invention may be incorporated into a vaccine formulation for inducing immunity to PT-NANBH in man. For this purpose the recombinant polypeptide or polypeptides may be presented in association with a pharmaceutically acceptable carrier.
For use in a vaccine formulation, the recombinant polypeptide or polypeptides may optionally be presented as part of a hepatitis B core fusion particle, as described in Clarke et al (Nature, 1987, 330, 381-384), or a polylysine-based polymer, as described in Tam (PNAS, 1988, 85, 5409-5413). Alternatively, the recombinant polypeptide or polypeptides may optionally be attached to a particulate structure, such as liposomes or ISCOMS. Pharmaceutically acceptable carriers include liquid media suitable for use as vehicles to introduce the recombinant polypeptide or polypeptides into a patient. An example of such liquid media is saline solution. The recombinant polypeptide or polypeptides may be dissolved or suspended as a solid in the carrier.
The vaccine formulation may also contain an adjuvant for stimulating the immune response and thereby enhancing the effect of the vaccine. Examples of adjuvants include aluminium hydroxide and aluminium phosphate.
The vaccine formulation may contain a final concentration of recombinant polypeptide or polypeptides in the range from 0.01 to 5 mg/ml, preferably from 0.03 to 2 mg/ml. The vaccine formulation may be incorporated into a sterile container, which is then sealed and stored at a low temperature, for example 4°C, or may be freeze-dried. In order to induce immunity in man to PT-NANBH, one or more doses of the vaccine formulation may be administered. Each dose may be 0.1 to 2 ml, preferably 0.2 to 1 ml. A method for inducing immunity to PT-NANBH in man, comprises the administration of an effective amount of a vaccine formulation, as hereinbefore defined.
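The concentration and dose-volume ranges above imply a range of polypeptide mass per dose. The following sketch simply multiplies the stated endpoints; the function name is hypothetical, and pairing the lowest concentration with the smallest volume (and the highest with the largest) is an assumption made only to show the extremes, since the text does not link the two ranges.

```python
def mass_per_dose_mg(concentration_mg_per_ml, dose_volume_ml):
    """Mass of polypeptide delivered in one dose, in mg."""
    return concentration_mg_per_ml * dose_volume_ml


# Endpoints taken from the ranges stated in the description.
broad = (mass_per_dose_mg(0.01, 0.1), mass_per_dose_mg(5.0, 2.0))
preferred = (mass_per_dose_mg(0.03, 0.2), mass_per_dose_mg(2.0, 1.0))

print(f"broad range:     {broad[0]:g} to {broad[1]:g} mg per dose")
print(f"preferred range: {preferred[0]:g} to {preferred[1]:g} mg per dose")
```

On these assumptions the broad ranges span roughly 0.001 to 10 mg of polypeptide per dose and the preferred ranges roughly 0.006 to 2 mg per dose.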
The present invention also provides the use of a PT-NANBH recombinant polypeptide or polypeptides as herein defined in the preparation of a vaccine for use in the induction of immunity to PT-NANBH in man.
Vaccines of the present invention may be administered by any convenient method for the administration of vaccines, including oral administration and parenteral (e.g. intravenous, subcutaneous or intramuscular) injection. The treatment may consist of a single dose of vaccine or a plurality of doses over a period of time.
In the Figures:
Figure 1 shows a schematic representation of the NANBH genome including the structural and non-structural coding regions together with the recombinant polypeptide or polypeptides of the present invention.
In the sequence listing, there are listed SEQ ID NO: 1 to 27 to which reference is made in the description and claims.
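As an aid to reading the sequence listing, the feature annotations given for SEQ ID NO: 1 can be checked for consistency with the single 2790 bp coding sequence. The sketch below is illustrative only; the function name and the short feature labels are assumptions, while the coordinates are copied from the listing. It verifies that the annotated segments are contiguous, each span a whole number of codons, and together cover positions 1 to 2790.

```python
# (label, start, end) with 1-based inclusive coordinates from SEQ ID NO: 1.
FEATURES = [
    ("AcNPV polyhedrin amino terminus", 1, 63),
    ("putative NS5", 64, 1851),
    ("synthetic linker region", 1852, 1875),
    ("core and E1", 1876, 2706),
    ("polyhedrin gene sequence", 2707, 2790),
]
TOTAL_LENGTH = 2790


def check_tiling(features, total_length):
    """Return (label, length_bp, codons) per feature, raising ValueError
    if the features leave gaps, overlap, or break the reading frame."""
    expected_start = 1
    report = []
    for label, start, end in features:
        if start != expected_start:
            raise ValueError(f"{label}: starts at {start}, expected {expected_start}")
        length = end - start + 1
        if length % 3 != 0:
            raise ValueError(f"{label}: {length} bp is not a whole number of codons")
        report.append((label, length, length // 3))
        expected_start = end + 1
    if expected_start - 1 != total_length:
        raise ValueError("features do not cover the full sequence")
    return report


for label, length, n_codons in check_tiling(FEATURES, TOTAL_LENGTH):
    print(f"{label:35s} {length:5d} bp {n_codons:4d} codons")
```

The five segments are 63, 1788, 24, 831 and 84 bp long, which together account for all 930 codons of the annotated coding sequence.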
The embodiments of the invention will now be illustrated.
Example 1: Production and Expression of BHC-19
The NS4 region (2258-3347 bp) from a 6587 bp contiguous sequence from the 3' region of the PT-NANBH genome, SEQ ID NO: 9, is amplified by PCR using primers D224 and D226 (SEQ ID NOs: 11 and 13) and the 1119 bp fragment, SEQ ID NO: 14, is cloned into a vector and expressed in infected insect cells as in the method described in Example 7 of GB-A-2 239 245. The recombinant virus (BHC-19) was able to express the NS4-specific recombinant protein (60 kDa apparent MW) at low levels in the infected insect cells.
Example 2: Production of DX200 and expression in E. coli
The NS3 region (1125-2090 bp) from the above-mentioned 6587 bp contiguous sequence is amplified by PCR using primers D344 and D308 (SEQ ID NOs: 16 and 18) and the resulting 993 bp fragment, SEQ ID NO: 19, is cloned into the E. coli expression vector pDEV107, which fuses the sequence onto the C-terminus of β-galactosidase, and the resulting fusion protein (DX200, 125 kDa apparent MW) is expressed in E. coli in accordance with the method of Example 6 in GB-A-2 239 245.
Example 3: Performance of BHC-19, DX200, BHC-4 and BHC-7 in an EIA
BHC-19, DX200, BHC-4 (a core construct fusing the first 135 amino acids of the core protein onto the 3' end of the polyhedrin gene, 35 kDa apparent MW; SEQ ID NO: 21 or 22) and BHC-7 (NS5, 100 kDa apparent MW, as disclosed in GB-A-2 239 245) can be resolved from each other when co-electrophoresed on SDS-PAGE.
Western blot strips were produced by co-electrophoresis of suitable amounts of each recombinant across the width of a 12.5% gel followed by transfer to PVDF membranes (Millipore). Commercially available anti-HCV antibody detection EIAs (Ortho Diagnostic Systems, Raritan, New Jersey (HCV ELISA test system) and Abbott Diagnostics, Chicago) use a combination of HCV recombinant antigens from putative core, NS3 and NS4 regions of the NANBH genome. These commercially available anti-HCV antibody detection EIAs were used together with an EIA based upon BHC-11, which combines core with NS5 (SEQ ID NO: 1 or 2).
The Western blot strips, which contain the recombinant antigens core, NS3, NS4 and NS5 as described earlier, were used to analyze those sera from a variety of different patient groups which gave discordant reactions in the two types of EIA. It was shown that those sera (BBI4.6, BD1) which reacted with BHC-11 and yet were negative in the commercial assays blotted against the NS5-specific band. However, there were some sera which did not react with BHC-11 but did react in the commercial assay. When these were analysed by Western blots they reacted with the NS3-specific recombinant on the blots. Table 1 presents these data.
[Table 1: provided as image imgf000017_0001 in the original publication]
It was thus surprisingly determined that some sera can react predominantly if not exclusively with individual PT-NANBH antigens.
Example 4: Production of an improved PT-NANBH recombinant polypeptide
The NS3-specific region of pDX200 was amplified by PCR using the primers D360 and D361 (SEQ ID NO: 23 and 24) to produce a 600 bp fragment, SEQ ID NO: 7, with PstI ends. The fragment was cloned into PstI-digested pDX136 (the transfer vector used to produce BHC-11 in GB-A-2 239 245). The resulting transformants were analyzed by restriction enzyme mapping to identify those which contained the NS3 sequence inserted in the correct orientation between the NS5 and core parts of pDX136; this was called pDX208, SEQ ID NO: 26 or 27, see Figure 1. Following transfection and selection as described above, a stock of the recombinant baculovirus BHC-28 was produced. Insect cells infected with BHC-28 produce a new band of 121 kDa apparent molecular weight as judged by SDS-PAGE of cell extracts. This band reacts with sera which contain antibodies to each of its constituent PT-NANBH antigens and appears to be the expected tripartite fusion of NS5-NS3-core in that order.
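Locating a restriction site in a sequence is a simple string search. The sketch below is an illustration, not part of the described procedure: the function name is hypothetical, the recognition sequence CTGCAG for PstI is standard, and the snippet is nucleotides 1825 to 1872 of SEQ ID NO: 1 as printed in the sequence listing. The text does not state which site in pDX136 was used for the insertion, so the result should be read only as showing where the motif occurs in that snippet.

```python
PSTI_SITE = "CTGCAG"  # PstI recognition sequence


def find_sites(seq, site, first_position=1):
    """Return the 1-based positions at which `site` starts in `seq`,
    numbering the first base of `seq` as `first_position`."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + first_position)
        i = seq.find(site, i + 1)
    return positions


# Nucleotides 1825..1872 of SEQ ID NO: 1, copied from the listing.
SNIPPET_START = 1825
SNIPPET = "AATACCCTCACATGTTACTTGAAGGCCTCTGCAGTAAAGAAGAAGAAG"

print(find_sites(SNIPPET, PSTI_SITE, SNIPPET_START))
```

The motif occurs once in this snippet, starting at position 1853, which lies within the region annotated as the synthetic linker (1852..1875) between the NS5 and structural sequences.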
Example 5: Comparison of BHC-28 and BHC-11 in an EIA
BHC-28 was partially purified from infected cells and used to coat microtitre wells. These wells were then used in an EIA to determine whether or not BHC-28 is better than BHC-11 at detecting NS3-only sera. Table 2 presents the comparative data using sera having a predominant reaction with NS3, some of which are identified in Table 1.
[Table 2: provided as image imgf000019_0001 in the original publication]
It was surprising that one Italian patient (Q1) was positive only for NS3 for a period of time (samples 1 to 7) before exhibiting an additional response to NS4 (as shown in Table 1). A total of 12 samples from this patient (1 to 7 NS3 only, 8 to 12 NS3 and NS4) were tested with BHC-11 and BHC-28; all were negative with BHC-11 but clearly positive with BHC-28. Samples from other donors/patients with different backgrounds were all detected by BHC-28, whereas only 2 (TR16 and T.Bh.) showed any suggestion of reactivity with BHC-11. It has thus been shown that a number of different PT-NANBH antigens are required for maximum sensitivity for screening purposes. In particular the core, NS3 and NS5 regions are all required. The BHC-28 recombinant represents an improvement over BHC-11. By fusing all the required antigens into a single polypeptide, the production and purification process has been greatly simplified over that which would be required if each antigen were individually expressed.
Example 6: Vaccine Formulation
A vaccine formulation may be prepared by conventional techniques using the following constituents in the indicated amounts.
PT-NANBH viral polypeptide (individually or as a single fusion)    > 0.36 mg
Thiomersal                                                         0.04 - 0.2 mg
Sodium Chloride                                                    < 8.5 mg
Water                                                              < 1 ml
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: The Wellcome Foundation Limited
(B) STREET: Unicorn House, 160 Euston Road
(C) CITY: London
(E) COUNTRY: Great Britain
(F) POSTAL CODE (ZIP): NW1 2BP
(ii) TITLE OF INVENTION: A Recombinant Polypeptide
(iii) NUMBER OF SEQUENCES: 27
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2790 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2790
(D) OTHER INFORMATION: /note= "Fusion of non-structural and structural protein sequences of PT-NANBH"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..63
(D) OTHER INFORMATION: /note= "Amino terminal sequence of the Autographa californica Nuclear Polyhedrosis Virus (AcNPV) polyhedrin."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 64..1851
(D) OTHER INFORMATION: /note= "Putative NS5 coding sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1852..1875
(D) OTHER INFORMATION: /note= "Synthetic linker region"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1876..2706
(D) OTHER INFORMATION: /note= "Structural protein coding sequence containing the core and E1 regions"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2707..2790
(D) OTHER INFORMATION: /note= "Polyhedrin gene sequence read out-of-frame"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ATG CCG GAT TAT TCA TAC CGT CCC ACC ATC GGG CCG GAT CCC CCG TCA 48
Met Pro Asp Tyr Ser Tyr Arg Pro Thr Ile Gly Pro Asp Pro Pro Ser
1 5 10 15
CTA TCG GCG GAA TTC ACA GAA GTG GAT GGG GTG CGG CTG CAC AGG TAC 96
Leu Ser Ala Glu Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr
20 25 30
GCT CCG GCG TGC AAA CCT CTC CTA CGG GAG GAG GTC ACA TTC CAG GTC 144
Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val
35 40 45
GGG CTC AAC CAA TAC CTG GTT GGG TCG CAG CTC CCA TGC GAG CCC GAA 192
Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu
50 55 60
CCG GAT GTA GCA GTG CTC ACT TCC ATG CTC ACC GAC CCC TCC CAC ATC 240
Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile
65 70 75 80
ACA GCA GAG ACG GCT AAG CGC AGG CTG GCC AGG GGG TCT CCC CCC TCC 288
Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser
85 90 95
TTG GCC AGC TCT TCA GCT AGC CAG TTG TCT GGC CCT TCC TCG AAG GCG 336
Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala
100 105 110
ACA TAC ATT ACC CAA AAT GAC TTC CCA GAC GCT GAC CTC ATC GAG GCC 384
Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala
115 120 125
AAC CTC CTG TGG CGG CAT GAG ATG GGC GGG GAC ATT ACC CGC GTG GAG 432
Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp Ile Thr Arg Val Glu
130 135 140
TCA GAG AAC AAG GTA GTA ATC CTG GAC TCT TTC GAC CCG CTC CGA GCG 480
Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala
145 150 155 160
GAG GAG GAT GAG CGG GAA GTG TCC GTC CCG GCG GAG ATC CTG CGG AAA 528
Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys
165 170 175
TCC AAG AAA TTC CCA CCA GCG ATG CCC GCA TGG GCA CGC CCG GAT TAC 576
Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr
180 185 190
AAC CCT CCG CTG CTG GAG TCC TGG AAG GCC CCG GAC TAC GTC CCT CCA 624
Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro
195 200 205
GTG GTA CAT GGG TGC CCA CTG CCA CCT ACT AAG ACC CCT CCT ATA CCA 672
Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro
210 215 220
CCT CCA CGG AGA AAG AGG ACA GTT GTT CTG ACA GAA TCC ACC GTG TCT 720
Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser
225 230 235 240
TCT GCC CTG GCG GAG CTT GCC ACA AAG GCT TTT GGT AGC TCC GGA CCG 768
Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro
245 250 255
TCG GCC GTC GAC AGC GGC ACG GCA ACC GCC CCT CCT GAC CAA TCC TCC 816
Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser
260 265 270
GAC GAC GGC GGA GCA GGA TCT GAC GTT GAG TCG TAT TCC TCC ATG CCC 864
Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro
275 280 285
CCC CTT GAG GGG GAG CCG GGG GAC CCC GAT CTC AGC GAC GGG TCT TGG 912
Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp
290 295 300
TCT ACC GTG AGT GAG GAG GCC GGT GAG GAC GTC GTC TGC TGC TCG ATG 960
Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met
305 310 315 320
TCC TAC ACA TGG ACA GGC GCT CTG ATC ACG CCA TGC GCT GCG GAG GAA 1008
Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu
325 330 335
AGC AAG CTG CCC ATC AAC GCG TTG AGC AAC TCT TTG CTG CGT CAC CAC 1056
Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
340 345 350
AAC ATG GTC TAC GCT ACC ACA TCC CGC AGC GCA AGC CAG CGG CAG AAG 1104
Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys
355 360 365
AAG GTC ACC TTT GAC AGA CTG CAA ATC CTG GAC GAT CAC TAC CAG GAC 1152
Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp Asp His Tyr Gln Asp
370 375 380 GTG CTC AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT AAG GCT AAG CTT 1200 Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu
385 390 395 400
CTA TCA GTA GAG GAA GCC TGC AAG CTG ACG CCC CCA CAT TCG GCC AAA 1248 Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys
405 410 415
TCT AAA TTT GGC TAT GGG GCA AAG GAC GTC CGG AAC CTA TCC AGC AAG 1296 Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys
420 425 430
GCC ATT AAC CAC ATC CGC TCC GTG TGG GAG GAC TTG TTG GAA GAC ACT 1344 Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr
435 440 445
GAA ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT GAG GTT TTC TGC 1392 Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys
450 455 460
GTC CAA CCA GAG AGA GGA GGC CGC AAG CCA GCT CGC CTT ATC GTG TTC 1440 Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
465 470 475 480
CCA GAC TTG GGG GTC CGT GTG TGC GAG AAA ATG GCC CTC TAT GAC GTG 1488 Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
485 490 495
GTC TCC ACC CTC CCT CAG GCT GTG ATG GGC TCC TCG TAC GGA TTC CAG 1536 Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
500 505 510
TAT TCT CCT GGA CAG CGG GTC GAG TTC CTG GTG AAC GCC TGG AAA TCA 1584 Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser
515 520 525
AAG AAG ACC CCT ATG GGC TTT GCA TAT GAC ACC CGC TGT TTT GAC TCA 1632
Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser
530 535 540
ACA GTC ACT GAG AAT GAC ATC CGT GTA GAG GAG TCA ATT TAT CAA TGT 1680
Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys
545 550 555 560
TGT GAC TTG GCC CCC GAA GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG 1728 Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu
565 570 575
CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC 1776 Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys
580 585 590
GGC TAT CGC CGG TGC CGC GCG AGC GGC GTG CTG ACG ACT AGC TGC GGT 1824
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
595 600 605
AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT GCA GTA AAG AAG AAG AAG 1872
Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Val Lys Lys Lys Lys
610 615 620
AAG AAA ACC AAA CGT AAC ACC AAC CTC CGC CCA CAG GAC GTC AGG TTC 1920
Lys Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val Arg Phe
625 630 635 640
CCG GGC GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG 1968 Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg
645 650 655
GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG 2016
Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
660 665 670
CAA CCT CGT GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAG CCC GAG 2064 Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu
675 680 685
GGC AGG GCC TGG GCT CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAC 2112
Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn
690 695 700
GAG GGC ATG GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG 2160
Glu Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg
705 710 715 720
CCT AGT TGG GGC CCC ACT GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT 2208 Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly
725 730 735
AAA GTC ATC GAT ACC CTC ACA TGC GGC TTC GCC GAC CTC ATG GGG TAC 2256
Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr
740 745 750
ATT CCG CTC GTC GGC GCT CCC TTA GGG GGC GCT GCC AGG GCC CTG GCG 2304 Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala
755 760 765
CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT 2352
His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
770 775 780
TTA CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG TCC TGT 2400
Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys
785 790 795 800
TTG ACC ATT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC GGG ATC 2448 Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Ile
805 810 815
TAC CAT GTC ACG AAC GAT TGC TCC AAC TCA AGC ATC GTG TAC GAG ACA 2496
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr
820 825 830
GCG GAC ATG ATC ATG CAC ACC CCC GGG TGT GTG CCC TGT GTC CGG GAG 2544
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
835 840 845
GGT AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTC GCG GCC 2592 Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
850 855 860
AAG GAC GCC AGC ATC CCC ACT GCG ACA ATA CGA CGC CAC GTC GAT TTG 2640 Lys Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val Asp Leu
865 870 875 880
CTC GTT GGG GCG GCT GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC 2688 Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu
885 890 895
TGC GGA TCT GTT TTC CCG GAA TTC CAG CTG AGC GCC GGT CGC TAC GGA 2736
Cys Gly Ser Val Phe Pro Glu Phe Gln Leu Ser Ala Gly Arg Tyr Gly
900 905 910
TCC TTT CCT GGG ACC CGG CAA GAA CCA AAA ACT CAC TCT CTT CAA GGA 2784
Ser Phe Pro Gly Thr Arg Gln Glu Pro Lys Thr His Ser Leu Gln Gly
915 920 925
AAT CCG 2790
Asn Pro
930
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 930 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Pro Asp Tyr Ser Tyr Arg Pro Thr Ile Gly Pro Asp Pro Pro Ser
1 5 10 15
Leu Ser Ala Glu Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr
20 25 30
Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val
35 40 45
Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu 50 55 60
Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile 65 70 75 80
Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser
85 90 95
Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala
100 105 110
Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala
115 120 125
Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp Ile Thr Arg Val Glu 130 135 140
Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala 145 150 155 160
Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys
165 170 175
Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr
180 185 190
Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro
195 200 205
Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro 210 215 220
Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser
225 230 235 240
Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro
245 250 255
Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser
260 265 270
Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro
275 280 285
Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp 290 295 300
Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met 305 310 315 320
Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu
325 330 335
Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
340 345 350
Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys
355 360 365
Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp Asp His Tyr Gln Asp 370 375 380
Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu 385 390 395 400
Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys
405 410 415
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys
420 425 430
Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr
435 440 445
Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys 450 455 460
Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
465 470 475 480
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
485 490 495
Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
500 505 510
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser
515 520 525
Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser
530 535 540
Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys
545 550 555 560
Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu
565 570 575
Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys
580 585 590
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
595 600 605
Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Val Lys Lys Lys Lys 610 615 620
Lys Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val Arg Phe 625 630 635 640
Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg
645 650 655
Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
660 665 670
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu
675 680 685
Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn 690 695 700
Glu Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg 705 710 715 720
Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly
725 730 735
Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr
740 745 750
Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala
755 760 765
His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn 770 775 780
Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys 785 790 795 800
Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Ile
805 810 815
Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr
820 825 830
Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu
835 840 845
Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala
850 855 860
Lys Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val Asp Leu 865 870 875 880
Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu
885 890 895
Cys Gly Ser Val Phe Pro Glu Phe Gln Leu Ser Ala Gly Arg Tyr Gly
900 905 910
Ser Phe Pro Gly Thr Arg Gln Glu Pro Lys Thr His Ser Leu Gln Gly
915 920 925
Asn Pro
930
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 831 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..831
(D) OTHER INFORMATION: /product= "PT-NANBH coding
sequence"
/note= "Encodes PT-NANBH structural protein
region"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AAA ACC AAA CGT AAC ACC AAC CTC CGC CCA CAG GAC GTC AGG TTC CCG 48
Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val Arg Phe Pro
1 5 10 15
GGC GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC AGG GGC 96
Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly
20 25 30
CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG CGG TCG CAA 144
Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln
35 40 45
CCT CGT GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAG CCC GAG GGC 192
Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly
50 55 60
AGG GCC TGG GCT CAG CCC GGG TAC CCT TGG CCC CTC TAT GGC AAC GAG 240
Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu
65 70 75 80
GGC ATG GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG CCT 288
Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
85 90 95
AGT TGG GGC CCC ACT GAC CCC CGG CGT AGG TCG CGT AAT TTG GGT AAA 336
Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys
100 105 110
GTC ATC GAT ACC CTC ACA TGC GGC TTC GCC GAC CTC ATG GGG TAC ATT 384
Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile
115 120 125
CCG CTC GTC GGC GCT CCC TTA GGG GGC GCT GCC AGG GCC CTG GCG CAT 432
Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His 130 135 140
GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT TTA 480 Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu
145 150 155 160
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG TCC TGT TTG 528
Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu
165 170 175
ACC ATT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC GGG ATC TAC 576
Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr
180 185 190
CAT GTC ACG AAC GAT TGC TCC AAC TCA AGC ATC GTG TAC GAG ACA GCG 624 His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala
195 200 205
GAC ATG ATC ATG CAC ACC CCC GGG TGT GTG CCC TGT GTC CGG GAG GGT 672 Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly
210 215 220
AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTC GCG GCC AAG 720 Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Lys
225 230 235 240
GAC GCC AGC ATC CCC ACT GCG ACA ATA CGA CGC CAC GTC GAT TTG CTC 768
Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val Asp Leu Leu
245 250 255
GTT GGG GCG GCT GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TCG 816
Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Ser
260 265 270
GGA TCT GTT TTC CCG 831
Gly Ser Val Phe Pro
275
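The codon rows and three-letter residue rows of this listing can be cross-checked mechanically. The following is a minimal sketch in Python, assuming the standard genetic code; the names `translate` and `expected_residues` are illustrative helpers and are not part of the original filing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one-letter residues for codons in the order
# TTT, TTC, TTA, TTG, TCT, ... GGG ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(codon_row: str) -> str:
    """Translate one codon row of the listing into three-letter residues.

    Tokens that are not three-base codons (for example the trailing
    nucleotide position number) are ignored.
    """
    codons = [t for t in codon_row.split()
              if len(t) == 3 and set(t) <= set(BASES)]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


def expected_residues(n_bases: int) -> int:
    """Number of residues encoded by a CDS spanning the whole sequence."""
    if n_bases % 3:
        raise ValueError("CDS length is not a multiple of three")
    return n_bases // 3


# Last codon row of SEQ ID NO: 3 and its stated length.
print(translate("GGA TCT GTT TTC CCG 831"))
print(expected_residues(831))
```

Applied row by row, a check of this kind flags residue rows that disagree with their codons, which is useful for a listing that has passed through optical character recognition.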
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val Arg Phe Pro
1 5 10 15
Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly
20 25 30
Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln
35 40 45
Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln Pro Glu Gly
50 55 60
Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu 65 70 75 80
Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro
85 90 95
Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys
100 105 110
Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile
115 120 125
Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His
130 135 140
Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu 145 150 155 160
Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu
165 170 175
Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr
180 185 190
His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala
195 200 205
Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly 210 215 220
Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Lys
225 230 235 240
Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val Asp Leu Leu
245 250 255
Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Ser
260 265 270
Gly Ser Val Phe Pro
275
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1788 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1788
(D) OTHER INFORMATION: /product= "PT-NANBH coding
sequence"
/note= "Encodes putative PT-NANBH NS5 protein"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ACA GAA GTG GAT GGG GTG CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA 48
Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys
1 5 10 15
CCT CTC CTA CGG GAG GAG GTC ACA TTC CAG GTC GGG CTC AAC CAA TAC 96 Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr
20 25 30
CTG GTT GGG TCG CAG CTC CCA TGC GAG CCC GAA CCG GAT GTA GCA GTG 144 Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val
35 40 45
CTC ACT TCC ATG CTC ACC GAC CCC TCC CAC ATC ACA GCA GAG ACG GCT 192 Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala
50 55 60
AAG CGC AGG CTG GCC AGG GGG TCT CCC CCC TCC TTG GCC AGC TCT TCA 240 Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser
65 70 75 80
GCT AGC CAG TTG TCT GGC CCT TCC TCG AAG GCG ACA TAC ATT ACC CAA 288
Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala Thr Tyr Ile Thr Gln
85 90 95
AAT GAC TTC CCA GAC GCT GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG 336
Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg
100 105 110
CAT GAG ATG GGC GGG GAC ATT ACC CGC GTG GAG TCA GAG AAC AAG GTA 384 His Glu Met Gly Gly Asp Ile Thr Arg Val Glu Ser Glu Asn Lys Val
115 120 125
GTA ATC CTG GAC TCT TTC GAC CCG CTC CGA GCG GAG GAG GAT GAG CGG 432
Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala Glu Glu Asp Glu Arg
130 135 140
GAA GTG TCC GTC CCG GCG GAG ATC CTG CGG AAA TCC AAG AAA TTC CCA 480
Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Lys Lys Phe Pro
145 150 155 160
CCA GCG ATG CCC GCA TGG GCA CGC CCG GAT TAC AAC CCT CCG CTG CTG 528 Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu
165 170 175
GAG TCC TGG AAG GCC CCG GAC TAC GTC CCT CCA GTG GTA CAT GGG TGC 576 Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro Val Val His Gly Cys
180 185 190
CCA CTG CCA CCT ACT AAG ACC CCT CCT ATA CCA CCT CCA CGG AGA AAG 624 Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro Pro Pro Arg Arg Lys
195 200 205
AGG ACA GTT GTT CTG ACA GAA TCC ACC GTG TCT TCT GCC CTG GCG GAG 672
Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser Ser Ala Leu Ala Glu
210 215 220
CTT GCC ACA AAG GCT TTT GGT AGC TCC GGA CCG TCG GCC GTC GAC AGC 720
Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro Ser Ala Val Asp Ser
225 230 235 240
GGC ACG GCA ACC GCC CCT CCT GAC CAA TCC TCC GAC GAC GGC GGA GCA 768 Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser Asp Asp Gly Gly Ala
245 250 255
GGA TCT GAC GTT GAG TCG TAT TCC TCC ATG CCC CCC CTT GAG GGG GAG 816 Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu
260 265 270
CCG GGG GAC CCC GAT CTC AGC GAC GGG TCT TGG TCT ACC GTG AGT GAG 864
Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Glu
275 280 285
GAG GCC GGT GAG GAC GTC GTC TGC TGC TCG ATG TCC TAC ACA TGG ACA 912 Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr
290 295 300
GGC GCT CTG ATC ACG CCA TGC GCT GCG GAG GAA AGC AAG CTG CCC ATC 960 Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile
305 310 315 320
AAC GCG TTG AGC AAC TCT TTG CTG CGT CAC CAC AAC ATG GTC TAC GCT 1008 Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Met Val Tyr Ala
325 330 335
ACC ACA TCC CGC AGC GCA AGC CAG CGG CAG AAG AAG GTC ACC TTT GAC 1056
Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe Asp
340 345 350
AGA CTG CAA ATC CTG GAC GAT CAC TAC CAG GAC GTG CTC AAG GAG ATG 1104
Arg Leu Gln Ile Leu Asp Asp His Tyr Gln Asp Val Leu Lys Glu Met
355 360 365
AAG GCG AAG GCG TCC ACA GTT AAG GCT AAG CTT CTA TCA GTA GAG GAA 1152
Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu Glu
370 375 380
GCC TGC AAG CTG ACG CCC CCA CAT TCG GCC AAA TCT AAA TTT GGC TAT 1200 Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr
385 390 395 400
GGG GCA AAG GAC GTC CGG AAC CTA TCC AGC AAG GCC ATT AAC CAC ATC 1248 Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Ile Asn His Ile
405 410 415
CGC TCC GTG TGG GAG GAC TTG TTG GAA GAC ACT GAA ACA CCA ATT GAC 1296 Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp
420 425 430
ACC ACC ATC ATG GCA AAA AAT GAG GTT TTC TGC GTC CAA CCA GAG AGA 1344 Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Arg
435 440 445
GGA GGC CGC AAG CCA GCT CGC CTT ATC GTG TTC CCA GAC TTG GGG GTC 1392 Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val
450 455 460
CGT GTG TGC GAG AAA ATG GCC CTC TAT GAC GTG GTC TCC ACC CTC CCT 1440 Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu Pro
465 470 475 480
CAG GCT GTG ATG GGC TCC TCG TAC GGA TTC CAG TAT TCT CCT GGA CAG 1488 Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln
485 490 495
CGG GTC GAG TTC CTG GTG AAC GCC TGG AAA TCA AAG AAG ACC CCT ATG 1536 Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser Lys Lys Thr Pro Met
500 505 510
GGC TTT GCA TAT GAC ACC CGC TGT TTT GAC TCA ACA GTC ACT GAG AAT 1584
Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Asn
515 520 525
GAC ATC CGT GTA GAG GAG TCA ATT TAT CAA TGT TGT GAC TTG GCC CCC 1632
Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala Pro
530 535 540
GAA GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG CGG CTT TAT ATC GGG 1680 Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly
545 550 555 560
GGT CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC GGC TAT CGC CGG TGC 1728 Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys
565 570 575
CGC GCG AGC GGC GTG CTG ACG ACT AGC TGC GGT AAT ACC CTC ACA TGT 1776 Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys
580 585 590
TAC TTG AAG GCC 1788
Tyr Leu Lys Ala
595
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 596 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr Ala Pro Ala Cys Lys
1 5 10 15
Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val Gly Leu Asn Gln Tyr
20 25 30
Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val
35 40 45
Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala 50 55 60
Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser 65 70 75 80
Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala Thr Tyr Ile Thr Gln
85 90 95
Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg
100 105 110
His Glu Met Gly Gly Asp Ile Thr Arg Val Glu Ser Glu Asn Lys Val
115 120 125
Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala Glu Glu Asp Glu Arg 130 135 140
Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Lys Lys Phe Pro 145 150 155 160
Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu
165 170 175
Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro Val Val His Gly Cys
180 185 190
Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro Pro Pro Arg Arg Lys
195 200 205
Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser Ser Ala Leu Ala Glu 210 215 220
Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro Ser Ala Val Asp Ser
225 230 235 240
Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser Asp Asp Gly Gly Ala
245 250 255
Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu
260 265 270
Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Glu
275 280 285
Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr 290 295 300
Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile 305 310 315 320
Asn Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Met Val Tyr Ala
325 330 335
Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys Lys Val Thr Phe Asp
340 345 350
Arg Leu Gln Ile Leu Asp Asp His Tyr Gln Asp Val Leu Lys Glu Met
355 360 365
Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu Leu Ser Val Glu Glu 370 375 380
Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr 385 390 395 400
Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys Ala Ile Asn His Ile
405 410 415
Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp
420 425 430
Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Arg
435 440 445
Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val 450 455 460
Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Thr Leu Pro
465 470 475 480
Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln
485 490 495
Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser Lys Lys Thr Pro Met
500 505 510
Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Asn
515 520 525
Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys Cys Asp Leu Ala Pro
530 535 540
Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly
545 550 555 560
Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys
565 570 575
Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys
580 585 590
Tyr Leu Lys Ala
595
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 600 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 7..600
(D) OTHER INFORMATION: /product= "Codes for part of the putative NS3 region of PT-NANBH"
/note= "Restriction enzyme sites facilitate
subsequent cloning; used to produce recombinant baculovirus BHC-28 (e.g.4), which expresses a recombinant protein containing NS5 - NS3 - core sequences in infected cells"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..15
(D) OTHER INFORMATION: /note= "Synthetic DNA containing BamHI and PstI restriction enzyme recognition
sites"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 16..600
(D) OTHER INFORMATION: /note= "PT-NANBH coding sequence
with PstI restriction enzyme recognition site
introduced at 590 to 595."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGATCC TCT GCA GGG ATC ATA ATA TGT GAT GAA TGC CAC TCA ACT GAC 48
Ser Ala Gly Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp
1 5 10
TCG ACT TCC ATC CTG GGC ATT GGC ACA GTC CTA GAC CAA GCG GAG ACG 96
Ser Thr Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr
15 20 25 30
GCT GGA GCG CGC CTT GTC GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG 144 Ala Gly Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser
35 40 45
GTC ACC GTG CCG CAC CCC AAT ATC GAG GAG GTG GCT CTG TCC AAC ACT 192 Val Thr Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr
50 55 60
GGA GAG ATC CCC TTC TAT GGC AAA GCC ATC CCT ATT GAG ACC ATC AAG 240 Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile Lys
65 70 75
GGG GGG AGG CAC CTC ATT TTC TGC CAC TCC AAG AAG AAG TGT GAC GAA 288 Gly Gly Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu
80 85 90
CTC GCT GCA AAA CTG GTG GGC CTC GGA ATC AAT GCT GTA GCG TAT TAC 336 Leu Ala Ala Lys Leu Val Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr
95 100 105 110
CGG GGC CTT GAT GTG TCC GTC ATA CCG GCC AGC GGA GAC GTC GTT GTT 384
Arg Gly Leu Asp Val Ser Val Ile Pro Ala Ser Gly Asp Val Val Val
115 120 125
GTA GCA ACA GAC GCT CTA ATG ACG GGC TTT ACC GGC GAC TTT GAC TCA 432
Val Ala Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser
130 135 140
GTG ATC GAC TGT AAT ACA TGT GTC ACC CAG ACG GTC GAT TTC AGC TTG 480 Val Ile Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu
145 150 155
GAC CCT ACC TTT ACC ATT GAG ACG ACG ACC GTG CCC CAA GAC GCG GTG 528 Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val
160 165 170
TCG CGC TCA CAA CGG CGA GGC AGG ACT GGT AGG GGC AGG AGA GGC ATC 576 Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile
175 180 185 190
TAC AGG TTT GTG GCT GCA GGA GAA 600
Tyr Arg Phe Val Ala Ala Gly Glu
195
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 198 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
Ser Ala Gly Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr
1 5 10 15
Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
20 25 30
Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
35 40 45
Val Pro His Pro Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu 50 55 60
Ile Pro Phe Tyr Gly Lys Ala Ile Pro Ile Glu Thr Ile Lys Gly Gly 65 70 75 80
Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
85 90 95
Ala Lys Leu Val Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
100 105 110
Leu Asp Val Ser Val Ile Pro Ala Ser Gly Asp Val Val Val Val Ala
115 120 125
Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile 130 135 140
Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
145 150 155 160
Thr Phe Thr Ile Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg
165 170 175
Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg
180 185 190
Phe Val Ala Ala Gly Glu
195
(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7065 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..7065
(D) OTHER INFORMATION: /product= "PT-NANBH coding
sequence"
/note= "Encodes putative PT-NANBH non-structural proteins"
(ix) FEATURE:
(A) NAME/KEY: misc_signal
(B) LOCATION: 7063..7065
(D) OTHER INFORMATION: /function= "Stop codon"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TCG TGC GGA GGC GCG GTT TTC ATA GGT CTG ATG CCC TTG ACC CTG TCA 48 Ser Cys Gly Gly Ala Val Phe Ile Gly Leu Met Pro Leu Thr Leu Ser
1 5 10 15
CCA TAC TAT AAG GTG TTC CTC GCT AAG CTC ATA TGG TGG TTA CAA TAC 96 Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile Trp Trp Leu Gln Tyr
20 25 30
TTC ATC ACC AGA GCC GAG GCG CAC TTG CAA GTG TGG GTC CCC CCC CTT 144 Phe Ile Thr Arg Ala Glu Ala His Leu Gln Val Trp Val Pro Pro Leu
35 40 45
AAT GTT CGG GGG GGC CGC GAT GCC ATC ATC CTC CTC GCG TGC GTG GTC 192 Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Ala Cys Val Val
50 55 60
CAC CCA GAG CTG ACC TTT GAC ATC TCC AAG ATC TTG CTC GCC ATA CTC 240
His Pro Glu Leu Thr Phe Asp Ile Ser Lys Ile Leu Leu Ala Ile Leu
65 70 75 80
GGT CCG CTC ATG TTG CTC CAG GCT GGC ATA ACT AGA GTG CCG TAC TTT 288
Gly Pro Leu Met Leu Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe
85 90 95
GTG CGC GCT CAA GGG CTC ATT CGT GCA TGC ATG CTA GTG CGG AAA GTC 336 Val Arg Ala Gln Gly Leu Ile Arg Ala Cys Met Leu Val Arg Lys Val
100 105 110
GCT GGG GGC CAT TAT GTC CAA ATG GCT CTC ATG AAA CTG GCC GCT CTG 384 Ala Gly Gly His Tyr Val Gln Met Ala Leu Met Lys Leu Ala Ala Leu
115 120 125
ACA GGT ACG TAC GTT TAT GAC CAT CTT ACT CCG CTG CGG GAC TGG GCC 432
Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro Leu Arg Asp Trp Ala
130 135 140
CAC GCG GGC CTA CGA GAC CTC GCG GTG GCA GTT GAG CCC GTT GTC TTC 480 His Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe
145 150 155 160
TCT GAC ATG GAG ACC AAG GTC ATC ACC TGG GGG ACG GAC ACC GCG GCG 528 Ser Asp Met Glu Thr Lys Val Ile Thr Trp Gly Thr Asp Thr Ala Ala
165 170 175
TGT GGG GAC ATC ATC TTG GGC CTA CCC GTC TCC GCC CGA AGG GGG GAT 576 Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Asp
180 185 190
GAG ATA TTT CTG GGA CCG GCT GAC AGT CTC GAA GGG CAG GGG TGG CGA 624
Glu Ile Phe Leu Gly Pro Ala Asp Ser Leu Glu Gly Gln Gly Trp Arg
195 200 205
CTC CTT GCG CCT ATC ACG GCC TAC TCC CAA CAG ACA CGG GGC CTA CTT 672
Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
210 215 220
GGT TGC ATC ATT ACC AGC CTC ACA GGT CGG GAC AAG AAC CAA GTC GAG 720 Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
225 230 235 240
GGG GAA GTG CAA GTG GTC TCC ACC GCA ACA CAA TCT TTC CTG GCG ACC 768 Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
245 250 255
TGT GTC AAC GGC GTG TGT TGG ACT GTC TAC CAT GGC GCC GGC TCT AAG 816 Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
260 265 270
ACC TTG GCC GGC CCA AAA GGC CCA GTC ATC CAA ATG TAC ACC AAT GTA 864
Thr Leu Ala Gly Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
275 280 285
GAC CAG GAC CTC GTC GGC TGG CCA GCG CCC CCC GGG GCG CGT TCC TTA 912
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Pro Gly Ala Arg Ser Leu
290 295 300
ACA CCA TGC ACC TGC GGC AGC TCA GAC CTT TAC TTG GTC ACG AGG CAT 960 Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
305 310 315 320
GCT GAT GTC ATT CCG GTG CGC CGG CGG GGT GAC AGC AGG GGA AGC CTA 1008 Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
325 330 335
CTC TCC CCC AGG CCC ATC TCC TAT TTG AAG GGC TCC TCG GGT GGT CCA 1056 Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
340 345 350
TTG CTC TGC CCC TCG GGG CAC GCC GTG GGC ATC TTC CGG GCT GCC GTG 1104
Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
355 360 365
TGC ACC CGG GGG GTC GCG AAG GCG GTG GAC TTT ATA CCC GTT GAG TCT 1152
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Ser
370 375 380
ATG GAA ACC ACT GTG CGG TCC CCG GTC TTT ACG GAC AAC TCA TCT CCT 1200 Met Glu Thr Thr Val Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro
385 390 395 400
CCG GCC GTA CCG CAG TCA TTC CAA GTG GCC CAT CTA CAC GCC CCC ACT 1248
Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
405 410 415
GGC AGC GGC AAG AGC ACC AGG GTG CCG GCT GCG TAT GCA GCC CAA GGG 1296
Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln Gly
420 425 430
TAC AAG GTA CTT GTC CTG AAC CCG TCC GTT GCC GCC ACC CTA GGC TTT 1344
Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
435 440 445
GGG GCG TAT ATG TCT AAG GCA CAC GGT GTC GAC CCT AAC ATC AGA TCT 1392
Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Ser
450 455 460
GGG GTA AGG ACC ATC ACC ACG GGC GCC CCC ATC ACG TAC TCC ACC TAT 1440 Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr
465 470 475 480
GGC AAG TTC CTT GCC GAC GGT GGT TGC TCT GGG GGC GCC TAT GAC ATC 1488
Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
485 490 495
ATA ATA TGT GAT GAA TGC CAC TCA ACT GAC TCG ACT TCC ATC CTG GGC 1536 Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly
500 505 510
ATT GGC ACA GTC CTA GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT GTC 1584 Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
515 520 525
GTG CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAC CCC 1632
Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro
530 535 540
AAT ATC GAG GAG GTG GCT CTG TCC AAC ACT GGA GAG ATC CCC TTC TAT 1680 Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr
545 550 555 560
GGC AAA GCC ATC CCT ATT GAG ACC ATC AAG GGG GGG AGG CAC CTC ATT 1728 Gly Lys Ala Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile
565 570 575
TTC TGC CAC TCC AAG AAG AAG TGT GAC GAA CTC GCT GCA AAA CTG GTG 1776
Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val
580 585 590
GGC CTC GGA ATC AAT GCT GTA GCG TAT TAC CGG GGC CTT GAT GTG TCC 1824
Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
595 600 605
GTC ATA CCG GCC AGC GGA GAC GTC GTT GTT GTA GCA ACA GAC GCT CTA 1872 Val Ile Pro Ala Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu
610 615 620
ATG ACG GGC TTT ACC GGC GAC TTT GAC TCA GTG ATC GAC TGT AAT ACA 1920
Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
625 630 635 640
TGT GTC ACC CAG ACG GTC GAT TTC AGC TTG GAC CCT ACC TTT ACC ATT 1968
Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
645 650 655
GAG ACG ACG ACC GTG CCC CAA GAC GCG GTG TCG CGC TCA CAA CGG CGA 2016
Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg
660 665 670
GGC AGG ACT GGT AGG GGC AGG AGA GGC ATC TAC AGG TTT GTG GCT CCA 2064
Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Pro
675 680 685
GGA GAA CGG CCC TCG GGC ATG TTC GAT TCC TCG GTC CTG TGT GAG TGT 2112 Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys
690 695 700
TAT GAC GCG GGT TGC GCT TGG TAC GAG CTC ACG CCC GCC GAG ACC TCG 2160
Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser
705 710 715 720
GTT AGG TTG CGG GCT TAC CTG AAT ACA CCA GGG TTG CCC GTT TGC CAG 2208
Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln
725 730 735
GAC CAT CTG GAG TTC TGG GAG AGC GTC TTC ACG GGC CTC ACC CAC GTG 2256
Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Val
740 745 750
GAT GCC CAC TTC TTG TCC CAA ACA AAG CAG GCA GGA GAC AAC TTC CCC 2304
Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro
755 760 765
TAC CTG GTG GCG TAC CAG GCT ACT GTG TGC GCT AGG GCC CAG GCC CCA 2352 Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
770 775 780
CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG CTA AAG CCT 2400
Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
785 790 795 800
ACT CTG CGC GGG CCA ACA CCC TTG CTG TAT AGG CTG GGA GCC GTC CAA 2448
Thr Leu Arg Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
805 810 815
AAC GAG GTC ACC CTC ACA CAC CCC ATA ACC AAA TTC ATC ATG GCA TGC 2496
Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala Cys
820 825 830
ATG TCA GCC GAC CTG GAG GTC GTC ACG AGC ACC TGG GTG CTG GTG GGC 2544 Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly
835 840 845
GGG GTC CTT GCA GCT CTG GCT GCG TAT TGC TTG ACA ACA GGC AGC GTG 2592
Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val
850 855 860
GTC ATT GTG GGT AGG ATC ATC TTG TCC GGG CGG CCG GCT ATT GTT CCC 2640
Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro
865 870 875 880
GAC AGG GAA GTC CTC TAC CAG GAG TTC GAT GAG ATG GAA GAG TGC GCG 2688
Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala
885 890 895
TCG CAC CTC CCT TAC ATC GAG CAG GGA ATG CAG CTC GCC GAG CAG TTC 2736
Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe
900 905 910
AAG CAA AAA GCG CTC GGG TTG CTG CAG ACA GCC ACC AAG CAA GCG GAG 2784 Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu
915 920 925
GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT GAG ACC TTC 2832
Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe
930 935 940
TGG GCG AAA CAC ATG TGG AAC TTC ATC AGC GGG ATA CAG TAC TTA GCA 2880
Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
945 950 955 960
GGC TTG TCT ACT CTG CCT GGG AAT CCC GCG ATT GCA TCA CTG ATG GCG 2928
Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
965 970 975
TTC ACA GCC TCT GTC ACT AGC CCG CTC ACC ACC CAA TCT ACC CTC CTG 2976
Phe Thr Ala Ser Val Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu
980 985 990
CTT AAC ATC CTG GGG GGA TGG GTA GCC GCC CAA CTC GCT CCC CCC AGT 3024 Leu Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser
995 1000 1005
GCT GCT TCA GCT TTC GTA GGC GCC GGC ATT GCT GGT GCG GCT GTT GGC 3072
Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly
1010 1015 1020
AGC ATA GGC CTT GGG AAG GTG CTT GTG GAC ATC TTG GCG GGC TAT GGA 3120
Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1025 1030 1035 1040
GCA GGA GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG AGC GGC GAA 3168 Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu
1045 1050 1055
ATG CCC TCC ACC GAG GAC CTG GTT AAC TTA CTC CCT GCC ATC CTC TCT 3216 Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1060 1065 1070
CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCG ATA CTG CGT CGG 3264 Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1075 1080 1085
CAC GTG GGT CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC CGG CTG ATA 3312
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1090 1095 1100
GCG TTC GCC TCG CGG GGT AAC CAT GTT TCC CCC ACG CAC TAT GTG CCA 3360
Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
1105 1110 1115 1120
GAG AGC GAC GCC GCA GCA CGT GTC ACT CAG ATC CTC TCC GAC CTT ACT 3408
Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asp Leu Thr
1125 1130 1135
ATC ACC CAA CTG TTG AAG AGG CTC CAC CAG TGG ATT AAC GAG GAC TGC 3456 Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys
1140 1145 1150
TCC ACG CCC TGC TCC GGC TCG TGG CTA AGG GAT GTT TGG GAC TGG ATA 3504 Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile
1155 1160 1165
TGC ACA GTT TTG GCT GAC TTC AAG ACC TGG CTC CAG TCC AAG CTC CTG 3552
Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu
1170 1175 1180
CCG CGA TTA CCG GGA GTC CCC TTT TTC TCA TGC CAA CGT GGG TAC AAG 3600
Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys
1185 1190 1195 1200
GGG GTC TGG CGG GGA GAC GGC ATC ATG CAG ACC ACC TGC TCA TGT GGA 3648 Gly Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Ser Cys Gly
1205 1210 1215
GCA CAG ATC ACC GGA CAT GTC AAA AAC GGT TCC ATG AGG ATC GTT GGG 3696 Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly
1220 1225 1230
CCT AAG ACC TGT AGT AAC ATG TGG CAT GGA ACA TTC CCC ATC AAC GCA 3744 Pro Lys Thr Cys Ser Asn Met Trp His Gly Thr Phe Pro Ile Asn Ala
1235 1240 1245
TAC ACC ACG GGC CCC TGC ACG CCC TCC CCA GCG CCA AAC TAT TCC AGG 3792
Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg
1250 1255 1260
GCG CTG TGG CGG GTG GCT GCT GAG GAG TAC GTG GAG GTT ACG CGG GTG 3840 Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val
1265 1270 1275 1280
GGG GAT TTC CAC TAC GTG ACG AGC ATG ACC ACT GAC AAC GTA AAA TGC 3888 Gly Asp Phe His Tyr Val Thr Ser Met Thr Thr Asp Asn Val Lys Cys
1285 1290 1295
CCG TGC CAG GTT CCA GCC CCC GAA TTC TTC ACA GAA GTG GAT GGG GTG 3936 Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val
1300 1305 1310
CGG CTG CAC AGG TAC GCT CCG GCG TGC AAA CCT CTC CTA CGG GAG GAG 3984 Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu
1315 1320 1325
GTC ACA TTC CAG GTC GGG CTC AAC CAA TAC CTG GTT GGG TCG CAG CTC 4032 Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu
1330 1335 1340
CCA TGC GAG CCC GAA CCG GAT GTA GCA GTG CTC ACT TCC ATG CTC ACC 4080 Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr
1345 1350 1355 1360
GAC CCC TCC CAC ATC ACA GCA GAG ACG GCT AAG CGC AGG CTG GCC AGG 4128 Asp Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg
1365 1370 1375
GGG TCT CCC CCC TCC TTG GCC AGC TCT TCA GCT AGC CAG TTG TCT GCG 4176 Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
1380 1385 1390
CCT TCC TCG AAG GCG ACA TAC ATT ACC CAA AAT GAC TTC CCA GAC GCT 4224
Pro Ser Ser Lys Ala Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala
1395 1400 1405
GAC CTC ATC GAG GCC AAC CTC CTG TGG CGG CAT GAG ATG GGC GGG GAC 4272
Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp
1410 1415 1420
ATT ACC CGC GTG GAG TCA GAG AAC AAG GTA GTA ATC CTG GAC TCT TTC 4320 Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
1425 1430 1435 1440
GAC CCG CTC CGA GCG GAG GAG GAT GAG CGG GAA GTG TCC GTC CCG GCG 4368 Asp Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala
1445 1450 1455
GAG ATC CTG CGG AAA TCC AAG AAA TTC CCA CCA GCG ATG CCC GCA TGG 4416 Glu Ile Leu Arg Lys Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp
1460 1465 1470
GCA CGC CCG GAT TAC AAC CCT CCG CTG CTG GAG TCC TGG AAG GCC CCG 4464
Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro
1475 1480 1485
GAC TAC GTC CCT CCA GTG GTA CAT GGG TGC CCA CTG CCA CCT ACT AAG 4512 Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Lys
1490 1495 1500
ACC CCT CCT ATA CCA CCT CCA CGG AGG AAG AGG ACA GTT GTT CTG ACA 4560 Thr Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr
1505 1510 1515 1520
GAA TCC ACC GTG TCT TCT GCC CTG GCG GAG CTT GCC ACA AAG GCT TTC 4608 Glu Ser Thr Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe
1525 1530 1535
GGT AGC TCC GAA CCG TCG GCC GTC GAC AGC GGC ACG GCA ACC GCC CCT 4656
Gly Ser Ser Glu Pro Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro
1540 1545 1550
CCT GAC CAA CCC TCC GAC GAC GGC GGA GCA GGA TCT GAC GTT GAG TCG 4704
Pro Asp Gln Pro Ser Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser
1555 1560 1565
TAT TCC TCC ATG CCC CCC CTT GAG GGG GAG CCG GGG GAC CCC GAT CTC 4752 Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu
1570 1575 1580
AGC GAC GGG TCT TGG TCT ACC GTG AGT GAG GAG GCC GGT GAG GAC GTC 4800 Ser Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val
1585 1590 1595 1600
GTC TGC TGC TCG ATG TCC TAC ACA TGG ACA GGC GCT CTG ATC ACG CCA 4848 Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro
1605 1610 1615
TGC GCT GCG GAG GAA AGC AAG CTG CCC ATC AAC GCG TTG AGC AAC TCT 4896 Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser
1620 1625 1630
TTG CTG CGT CAC CAC AAC ATG GTC TAC GCT ACC ACA TCC CGC AGC GCA 4944 Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala
1635 1640 1645
AGC CAG CGG CAG AAG AAG GTC ACC TTT GAC AGA CTG CAA ATC CTG GAC 4992
Ser Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp
1650 1655 1660
GAT CAC TAC CAG GAC GTG CTC AAG GAG ATG AAG GCG AAG GCG TCC ACA 5040 Asp His Tyr Gln Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr
1665 1670 1675 1680
GTT AAG GCT AAG CTT CTA TCA GTA GAG GAA GCC TGC AAG CTG ACG CCC 5088 Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro
1685 1690 1695
CCA CAT TCG GCC AAA TCT AAA TTT GGC TAT GGG GCA AAG GAC GTC CGG 5136
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg
1700 1705 1710
AAC CTA TCC AGC AAG GCC ATT AAC CAC ATC CGC TCC GTG TGG GAG GAC 5184 Asn Leu Ser Ser Lys Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp
1715 1720 1725
TTG TTG GAA GAC ACT GAA ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA 5232 Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
1730 1735 1740
AAT GAG GTT TTC TGC GTC CAA CCA GAG AGA GGA GGC CGC AAG CCA GCT 5280 Asn Glu Val Phe Cys Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala
1745 1750 1755 1760
CGC CTT ATC GTG TTC CCA GAC TTG GGG GTC CGT GTG TGC GAG AAA ATG 5328 Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
1765 1770 1775
GCC CTC TAT GAC GTG GTC TCC ACC CTC CCT CAG GCT GTG ATG GGC TCC 5376 Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser
1780 1785 1790
TCG TAC GGA TTC CAG TAT TCT CCT GGA CAG CGG GTC GAG TTC CTG GTG 5424 Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
1795 1800 1805
AAC GCC TGG AAA TCA AAG AAG ACC CCT ATG GGC TTT GCA TAT GAC ACC 5472 Asn Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr
1810 1815 1820
CGC TGT TTT GAC TCA ACA GTC ACT GAG AAT GAC ATC CGT GTA GAG GAG 5520 Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu
1825 1830 1835 1840
GTT AAG GCT AAG CTT CTA TCA GTA GAG GAA GCC TGC AAG CTG ACG CCC 5568
Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro
1845 1850 1855
CCA CAT TCG GCC AAA TCT AAA TTT GGC TAT GGG GCA AAG GAC GTC CGG 5616
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg
1860 1865 1870
AAC CTA TCC AGC AAG GCC ATT AAC CAC ATC CGC TCC GTG TGG GAG GAC 5664
Asn Leu Ser Ser Lys Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp
1875 1880 1885
TTG TTG GAA GAC ACT GAA ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA 5712 Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys
1890 1895 1900
AAT GAG GTT TTC TGC GTC CAA CCA GAG AGA GGA GGC CGC AAG CCA GCT 5760 Asn Glu Val Phe Cys Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala
1905 1910 1915 1920
CGC CTT ATC GTG TTC CCA GAC TTG GGG GTC CGT GTG TGC GAG AAA ATG 5808
Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
1925 1930 1935
GCC CTC TAT GAC GTG GTC TCC ACC CTC CCT CAG GCT GTG ATG GGC TCC 5856 Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser
1940 1945 1950
TCG TAC GGA TTC CAG TAT TCT CCT GGA CAG CGG GTC GAG TTC CTG GTG 5904 Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
1955 1960 1965
AAC GCC TGG AAA TCA AAG AAG ACC CCT ATG GGC TTT GCA TAT GAC ACC 5952 Asn Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr
1970 1975 1980
CGC TGT TTT GAC TCA ACA GTC ACT GAG AAT GAC ATC CGT GTA GAG GAG 6000
Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu
1985 1990 1995 2000
TCA ATT TAT CAA TGT TGT GAC TTG GCC CCC GAA GCC AGA CAG GCC ATA 6048
Ser Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile
2005 2010 2015
AGG TCG CTC ACA GAG CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA 6096
Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser
2020 2025 2030
AAA GGG CAG AAC TGC GGC TAT CGC CGG TGC CGC GCG AGC GGC GTG CTG 6144 Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
2035 2040 2045
ACG ACT AGC TGC GGT AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT GCA 6192 Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala
2050 2055 2060
GCC TGT CGA GCT GCA AAG CTC CAG GAC TGC ACG ATG CTC GTG TGC GGA 6240 Ala Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly
2065 2070 2075 2080
GAC GAC CTT GTC GTT ATC TGT GAG AGC GCG GGA ACC CAG GAG GAC GCG 6288 Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala
2085 2090 2095
GCG AGC CTA CGA GTC TTC ACG GAG GCT ATG ACT AGG TAC TCT GCC CCC 6336 Ala Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro
2100 2105 2110
CCC GGG GAC CCG CCC CAA CCA GAA TAC GAC CTG GAG TTG ATA ACA TCA 6384 Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
2115 2120 2125
TGC TCC TCC AAT GTG TCG GTC GCG CAC GAT GCA TCT GGC AAA AGG GTA 6432 Cys Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val
2130 2135 2140
TAC TAC CTC ACC CGT GAC CCC ACC ACC CCC CTT GCA CGG GCT GCG TGG 6480
Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp
2145 2150 2155 2160
GAG ACA GCT AGA CAC ACT CCA GTT AAC TCC TGG CTA GGC AAC ATC ATC 6528 Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile
2165 2170 2175
ATG TAT GCG CCC ACC ATA TGG GCA AGG ATG ATT CTG ATG ACT CAC TTC 6576 Met Tyr Ala Pro Thr Ile Trp Ala Arg Met Ile Leu Met Thr His Phe
2180 2185 2190
TTC TCC ATC CTT CTA GCC CAG GAG CAA CTT GAA AAA GCC CTA GAT TGT 6624 Phe Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys
2195 2200 2205
CAG ATC TAC GGG GCC TGT TAC TCC ATT GAA CCA CTT GAC CTA CCT CAG 6672 Gln Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln
2210 2215 2220
ATC ATT CAG CGA CTC CAT GGT CTT AGC GCG TTT TCA CTC CAT AGT TAC 6720
Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr
2225 2230 2235 2240
TCT CCG GGT GAG ATC AAT AGG GTG GCT TCA TGC CTC AGG AAA CTT GGG 6768 Ser Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly
2245 2250 2255
GTA CCG CCC TTG CGA GTC TGG AGA CAT CGG GCC AGA GCT GTC CGC GCT 6816 Val Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ala Val Arg Ala
2260 2265 2270
AAG CTA CTG TCC CAG GGG GGG AGG GCT GCC ATT TGT GGC AGG TAC CTT 6864
Lys Leu Leu Ser Gln Gly Gly Arg Ala Ala Ile Cys Gly Arg Tyr Leu
2275 2280 2285
TTC AAC TGG GCA GTA AAG ACC AAG CTC AAA CTC ACT CCA ATC CCG GCT 6912
Phe Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala
2290 2295 2300
GCG TCC CGG CTG GAC TTG TCC GGC TGG TTC GTT GCT GGT TAC AGC GGG 6960
Ala Ser Arg Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly
2305 2310 2315 2320
GGA GAC ATA TAT CAC AGC GTG TCT CGT GCC CGA CCC CGC TGG TTC ATG 7008 Gly Asp Ile Tyr His Ser Val Ser Arg Ala Arg Pro Arg Trp Phe Met
2325 2330 2335
TGG TGC CTA CTC CTA CTT TCT GTA GGA GTA GGA ATC TAC CTG CTC CCC 7056 Trp Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro
2340 2345 2350
AAC CGA TGA 7065
Asn Arg
2355
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2354 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Ser Cys Gly Gly Ala Val Phe Ile Gly Leu Met Pro Leu Thr Leu Ser
1 5 10 15
Pro Tyr Tyr Lys Val Phe Leu Ala Lys Leu Ile Trp Trp Leu Gln Tyr
20 25 30
Phe Ile Thr Arg Ala Glu Ala His Leu Gln Val Trp Val Pro Pro Leu
35 40 45
Asn Val Arg Gly Gly Arg Asp Ala Ile Ile Leu Leu Ala Cys Val Val 50 55 60
His Pro Glu Leu Thr Phe Asp Ile Ser Lys Ile Leu Leu Ala Ile Leu
65 70 75 80
Gly Pro Leu Met Leu Leu Gln Ala Gly Ile Thr Arg Val Pro Tyr Phe
85 90 95
Val Arg Ala Gln Gly Leu Ile Arg Ala Cys Met Leu Val Arg Lys Val
100 105 110
Ala Gly Gly His Tyr Val Gln Met Ala Leu Met Lys Leu Ala Ala Leu
115 120 125
Thr Gly Thr Tyr Val Tyr Asp His Leu Thr Pro Leu Arg Asp Trp Ala 130 135 140
His Ala Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe 145 150 155 160
Ser Asp Met Glu Thr Lys Val Ile Thr Trp Gly Thr Asp Thr Ala Ala
165 170 175
Cys Gly Asp Ile Ile Leu Gly Leu Pro Val Ser Ala Arg Arg Gly Asp
180 185 190
Glu Ile Phe Leu Gly Pro Ala Asp Ser Leu Glu Gly Gln Gly Trp Arg
195 200 205
Leu Leu Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu
210 215 220
Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
225 230 235 240
Gly Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr
245 250 255
Cys Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys
260 265 270
Thr Leu Ala Gly Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
275 280 285
Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Pro Gly Ala Arg Ser Leu 290 295 300
Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His 305 310 315 320
Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
325 330 335
Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
340 345 350
Leu Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val
355 360 365
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Ser 370 375 380
Met Glu Thr Thr Val Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro 385 390 395 400
Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
405 410 415
Gly Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln Gly
420 425 430
Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
435 440 445
Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Ser 450 455 460
Gly Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr 465 470 475 480
Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
485 490 495
Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly
500 505 510
Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
515 520 525
Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro
530 535 540
Asn Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr
545 550 555 560
Gly Lys Ala Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile
565 570 575
Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val
580 585 590
Gly Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
595 600 605
Val Ile Pro Ala Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu 610 615 620
Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr 625 630 635 640
Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
645 650 655
Glu Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg
660 665 670
Gly Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Pro
675 680 685
Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys 690 695 700
Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser 705 710 715 720
Val Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln
725 730 735
Asp His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Val
740 745 750
Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro
755 760 765
Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro 770 775 780
Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro 785 790 795 800
Thr Leu Arg Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
805 810 815
Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile Met Ala Cys
820 825 830
Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly
835 840 845
Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val
850 855 860
Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro 865 870 875 880
Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala
885 890 895
Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe
900 905 910
Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu
915 920 925
Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe 930 935 940
Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala 945 950 955 960
Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
965 970 975
Phe Thr Ala Ser Val Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu
980 985 990
Leu Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser
995 1000 1005
Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly 1010 1015 1020
Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly 1025 1030 1035 1040
Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu
1045 1050 1055
Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1060 1065 1070
Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1075 1080 1085
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile 1090 1095 1100
Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro 1105 1110 1115 1120
Glu Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Asp Leu Thr
1125 1130 1135
Ile Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys
1140 1145 1150
Ser Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile
1155 1160 1165
Cys Thr Val Leu Ala Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu 1170 1175 1180
Pro Arg Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys
1185 1190 1195 1200
Gly Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Ser Cys Gly
1205 1210 1215
Ala Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly
1220 1225 1230
Pro Lys Thr Cys Ser Asn Met Trp His Gly Thr Phe Pro Ile Asn Ala
1235 1240 1245
Tyr Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg 1250 1255 1260
Ala Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val
1265 1270 1275 1280
Gly Asp Phe His Tyr Val Thr Ser Met Thr Thr Asp Asn Val Lys Cys
1285 1290 1295
Pro Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val
1300 1305 1310
Arg Leu His Arg Tyr Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu
1315 1320 1325
Val Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu 1330 1335 1340
Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr
1345 1350 1355 1360
Asp Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg
1365 1370 1375
Gly Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
1380 1385 1390
Pro Ser Ser Lys Ala Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala
1395 1400 1405
Asp Leu Ile Glu Ala Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp 1410 1415 1420
Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
1425 1430 1435 1440
Asp Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala
1445 1450 1455
Glu Ile Leu Arg Lys Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp
1460 1465 1470
Ala Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro
1475 1480 1485
Asp Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Thr Lys 1490 1495 1500
Thr Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr 1505 1510 1515 1520
Glu Ser Thr Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe
1525 1530 1535
Gly Ser Ser Glu Pro Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro
1540 1545 1550
Pro Asp Gln Pro Ser Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser
1555 1560 1565
Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu 1570 1575 1580
Ser Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val 1585 1590 1595 1600
Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro
1605 1610 1615
Cys Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser
1620 1625 1630
Leu Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala
1635 1640 1645
Ser Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp 1650 1655 1660
Asp His Tyr Gln Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr 1665 1670 1675 1680
Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro
1685 1690 1695
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg
1700 1705 1710
Asn Leu Ser Ser Lys Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp
1715 1720 1725
Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys 1730 1735 1740
Asn Glu Val Phe Cys Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala
1745 1750 1755 1760
Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
1765 1770 1775
Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser
1780 1785 1790
Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
1795 1800 1805
Asn Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr 1810 1815 1820
Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu 1825 1830 1835 1840
Val Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro
1845 1850 1855
Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg
1860 1865 1870
Asn Leu Ser Ser Lys Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp
1875 1880 1885
Leu Leu Glu Asp Thr Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys 1890 1895 1900
Asn Glu Val Phe Cys Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala 1905 1910 1915 1920
Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met
1925 1930 1935
Ala Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser
1940 1945 1950
Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val
1955 1960 1965
Asn Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr 1970 1975 1980
Arg Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu 1985 1990 1995 2000
Ser Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile
2005 2010 2015
Arg Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser
2020 2025 2030
Lys Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu
2035 2040 2045
Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala
2050 2055 2060
Ala Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Cys Gly
2065 2070 2075 2080
Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala
2085 2090 2095
Ala Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro
2100 2105 2110
Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser
2115 2120 2125
Cys Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val 2130 2135 2140
Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp 2145 2150 2155 2160
Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile
2165 2170 2175
Met Tyr Ala Pro Thr Ile Trp Ala Arg Met Ile Leu Met Thr His Phe
2180 2185 2190
Phe Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys
2195 2200 2205
Gln Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln 2210 2215 2220
Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr 2225 2230 2235 2240
Ser Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly
2245 2250 2255
Val Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ala Val Arg Ala
2260 2265 2270
Lys Leu Leu Ser Gln Gly Gly Arg Ala Ala Ile Cys Gly Arg Tyr Leu
2275 2280 2285
Phe Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala 2290 2295 2300
Ala Ser Arg Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly 2305 2310 2315 2320
Gly Asp Ile Tyr His Ser Val Ser Arg Ala Arg Pro Arg Trp Phe Met
2325 2330 2335
Trp Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro
2340 2345 2350
Asn Arg
(2) INFORMATION FOR SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /note= "Primes synthesis at the 5'
end of the NS4 region and introduces a BamHI site to facilitate subsequent cloning steps."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /note= "Homologous to bases 2251 to 2277 of SEQ ID NO. 4 except that position 2263 is changed from G to C to introduce a BamHI recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 9..14
(D) OTHER INFORMATION: /note= "BamHI recognition site"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ACC CAC GTG GAT CCC CAC TTC TTG TCC 27
Thr His Val Asp Pro His Phe Leu Ser
1 5
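The annotations for SEQ ID NO: 11 can be checked mechanically. The following is a minimal sketch in Python, not part of the original listing: it translates the 27-base primer codon by codon and reports the 1-based location of the BamHI recognition sequence GGATCC. The names CODON_TO_RESIDUE, translate and site_location are illustrative assumptions, and the codon table holds only the standard genetic code entries for the codons that occur in this primer.

```python
# Standard genetic code entries for only the codons present in SEQ ID NO: 11.
CODON_TO_RESIDUE = {
    "ACC": "Thr", "CAC": "His", "GTG": "Val", "GAT": "Asp",
    "CCC": "Pro", "TTC": "Phe", "TTG": "Leu", "TCC": "Ser",
}

# Primer copied from the listing with the codon spacing removed.
SEQ_ID_11 = "ACCCACGTGGATCCCCACTTCTTGTCC"
BAMHI_SITE = "GGATCC"


def translate(dna):
    """Translate a DNA string in frame 1 into three-letter residue names."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return [CODON_TO_RESIDUE[codon] for codon in codons]


def site_location(dna, site):
    """Return the 1-based (start, end) of the first occurrence of site, or None."""
    start = dna.find(site)
    if start < 0:
        return None
    return (start + 1, start + len(site))


peptide = translate(SEQ_ID_11)
location = site_location(SEQ_ID_11, BAMHI_SITE)
print(" ".join(peptide))
print(location)
```

Under these assumptions the translation reproduces the nine residues printed under the primer, and the site is found at the location given in the misc_feature entry (9..14).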
(2) INFORMATION FOR SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
Thr His Val Asp Pro His Phe Leu Ser
1 5
(2) INFORMATION FOR SEQ ID NO: 13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..33
(D) OTHER INFORMATION: /function= "Primes synthesis at 3'
end of the NS4 region of PT-NANBH"
/note= "Complementary to bases 3334 to 3336 of SEQ ID No.4; positions 3349, 3351, 3352 and 3353 have been changed to incorporate a BamHI recognition site to facilitate subsequent cloning"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 14..19
(D) OTHER INFORMATION: /note= "BamHI site"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GCTCTCTGGC ACAGGATCCG TGGGGGAAAC CTG 33
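Because SEQ ID NO: 13 is described as complementary to the target strand, it can help to view it as its reverse complement. The sketch below is an illustration only, not part of the original listing: it confirms the primer length, locates GGATCC on the primer as written, and shows that the site also appears on the reverse complement because GGATCC is its own reverse complement. The helper names reverse_complement and site_location are assumptions introduced here.

```python
# Primer copied from the listing with the block spacing removed.
SEQ_ID_13 = "GCTCTCTGGCACAGGATCCGTGGGGGAAACCTG"
BAMHI_SITE = "GGATCC"

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


def site_location(dna, site):
    """Return the 1-based (start, end) of the first occurrence of site, or None."""
    start = dna.find(site)
    if start < 0:
        return None
    return (start + 1, start + len(site))


primer_length = len(SEQ_ID_13)
as_written = site_location(SEQ_ID_13, BAMHI_SITE)
on_opposite_strand = site_location(reverse_complement(SEQ_ID_13), BAMHI_SITE)
site_is_palindromic = reverse_complement(BAMHI_SITE) == BAMHI_SITE

print(primer_length, as_written, on_opposite_strand, site_is_palindromic)
```

On the primer as written the site falls at 14..19, matching the misc_feature entry; on the reverse complement the same six bases sit at 15..20, counted from the other end of the 33-mer.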
(2) INFORMATION FOR SEQ ID NO: 14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1119 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1119
(D) OTHER INFORMATION: /function= "Encodes putative NS4 region protein of PT-NANBH"
/product= "Coding sequence of PT-NANBH"
/note= "Used to produce the recombinant baculovirus BHC-19 which expresses NS4-specific recombinant protein in infected cells (e.g. 1)."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 9..14
(D) OTHER INFORMATION: /note= "BamHI recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1101..1106
(D) OTHER INFORMATION: /note= "BamHI recognition site"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
ACC CAC GTG GAT CCC CAC TTC TTG TCC CAA ACA AAG CAG GCA GGA GAC 48 Thr His Val Asp Pro His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp
1 5 10 15
AAC TTC CCC TAC CTG GTG GCG TAC CAG GCT ACT GTG TGC GCT AGG GCC 96
Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala
20 25 30
CAG GCC CCA CCT CCA TCA TGG GAT CAA ATG TGG AAG TGT CTC ATA CGG 144 Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg
35 40 45
CTA AAG CCT ACT CTG CGC GGG CCA ACA CCC TTG CTG TAT AGG CTG GGA 192
Leu Lys Pro Thr Leu Arg Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly
50 55 60
GCC GTC CAA AAC GAG GTC ACC CTC ACA CAC CCC ATA ACC AAA TTC ATC 240
Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile
65 70 75 80
ATG GCA TGC ATG TCA GCC GAC CTG GAG GTC GTC ACG AGC ACC TGG GTG 288
Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val
85 90 95
CTG GTG GGC GGG GTC CTT GCA GCT CTG GCT GCG TAT TGC TTG ACA ACA 336 Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr
100 105 110
GGC AGC GTG GTC ATT GTG GGT AGG ATC ATC TTG TCC GGG CGG CCG GCT 384
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala
115 120 125
ATT GTT CCC GAC AGG GAA GTC CTC TAC CAG GAG TTC GAT GAG ATG GAA 432
Ile Val Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu
130 135 140
GAG TGC GCG TCG CAC CTC CCT TAC ATC GAG CAG GGA ATG CAG CTC GCC 480 Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala
145 150 155 160
GAG CAG TTC AAG CAA AAA GCG CTC GGG TTG CTG CAG ACA GCC ACC AAG 528 Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys
165 170 175
CAA GCG GAG GCT GCT GCT CCC GTG GTG GAG TCC AAG TGG CGA GCC CTT 576 Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu
180 185 190
GAG ACC TTC TGG GCG AAA CAC ATG TGG AAC TTC ATC AGC GGG ATA CAG 624 Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln
195 200 205
TAC TTA GCA GGC TTG TCT ACT CTG CCT GGG AAT CCC GCG ATT GCA TCA 672 Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser
210 215 220
CTG ATG GCG TTC ACA GCC TCT GTC ACT AGC CCG CTC ACC ACC CAA TCT 720 Leu Met Ala Phe Thr Ala Ser Val Thr Ser Pro Leu Thr Thr Gln Ser
225 230 235 240
ACC CTC CTG CTT AAC ATC CTG GGG GGA TGG GTA GCC GCC CAA CTC GCT 768 Thr Leu Leu Leu Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala
245 250 255
CCC CCC AGT GCT GCT TCA GCT TTC GTA GGC GCC GGC ATT GCT GGT GCG 816 Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala
260 265 270
GCT GTT GGC AGC ATA GGC CTT GGG AAG GTG CTT GTG GAC ATC TTG GCG 864
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala
275 280 285
GGC TAT GGA GCA GGA GTG GCA GGC GCG CTC GTG GCC TTT AAG GTC ATG 912
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met
290 295 300
AGC GGC GAA ATG CCC TCC ACC GAG GAC CTG GTT AAC TTA CTC CCT GCC 960
Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala
305 310 315 320
ATC CTC TCT CCT GGT GCC CTG GTC GTC GGG GTC GTG TGC GCA GCG ATA 1008 Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile
325 330 335
CTG CGT CGG CAC GTG GGT CCA GGG GAG GGG GCT GTG CAG TGG ATG AAC 1056 Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn
340 345 350
CGG CTG ATA GCG TTC GCC TCG CGG GGT AAC CAT GTT TCC CCC ACG GAT 1104 Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr Asp
355 360 365
CCT GTG CCA GAG AGC 1119
Pro Val Pro Glu Ser
370
(2) INFORMATION FOR SEQ ID NO: 15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 373 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
Thr His Val Asp Pro His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp
1 5 10 15
Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala
20 25 30
Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg
35 40 45
Leu Lys Pro Thr Leu Arg Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly
50 55 60
Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Phe Ile
65 70 75 80
Met Ala Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val
85 90 95
Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr
100 105 110
Gly Ser Val Val Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala
115 120 125
Ile Val Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu
130 135 140
Glu Cys Ala Ser His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala
145 150 155 160
Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys
165 170 175
Gln Ala Glu Ala Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu
180 185 190
Glu Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln
195 200 205
Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser
210 215 220
Leu Met Ala Phe Thr Ala Ser Val Thr Ser Pro Leu Thr Thr Gln Ser
225 230 235 240
Thr Leu Leu Leu Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala
245 250 255
Pro Pro Ser Ala Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala
260 265 270
Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala
275 280 285
Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met
290 295 300
Ser Gly Glu Met Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala
305 310 315 320
Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile
325 330 335
Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn
340 345 350
Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr Asp
355 360 365
Pro Val Pro Glu Ser
370
(2) INFORMATION FOR SEQ ID NO: 16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..9
(D) OTHER INFORMATION: /note= "Dummy sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 9..14
(D) OTHER INFORMATION: /note= "EcoRI restriction enzyme
recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 14..33
(D) OTHER INFORMATION: /note= "Sequence corresponding to
nucleotides 1121 to 1140 of SEQ ID No. 5"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..33
(D) OTHER INFORMATION: /note= "Primer for 5' end of putative NS3 region of PT-NANBH which introduces an EcoRI site for subsequent cloning."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TCAGTATGG AAT TCG AAG GCG GTG GAC TTC ATA 33
Asn Ser Lys Ala Val Asp Phe Ile
1 5
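The feature table of SEQ ID NO: 16 can be checked mechanically. The following Python sketch is an illustration only and is not part of the original listing; the helper names `find_site` and `translate` are hypothetical. It locates the EcoRI recognition sequence (GAATTC) in the primer and translates the CDS at positions 10..33 using the standard genetic code.

```python
# Illustrative check of SEQ ID NO: 16; helper names are hypothetical.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = dict(zip(
    [a + b + c for a in BASES for b in BASES for c in BASES], AMINO))
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def find_site(seq, site):
    """Return the 1-based (start, end) of the first occurrence, or None."""
    i = seq.find(site)
    return None if i < 0 else (i + 1, i + len(site))


def translate(seq, start, end):
    """Translate the 1-based inclusive range start..end into residues."""
    cds = seq[start - 1:end]
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return [THREE_LETTER[CODON_TABLE[cds[i:i + 3]]]
            for i in range(0, usable, 3)]


primer_16 = "TCAGTATGGAATTCGAAGGCGGTGGACTTCATA"
ecori = find_site(primer_16, "GAATTC")
peptide = translate(primer_16, 10, 33)
print(len(primer_16), ecori, " ".join(peptide))
```

Under these assumptions the site is found at positions 9..14 and the translation is the eight-residue peptide given as SEQ ID NO: 17.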
(2) INFORMATION FOR SEQ ID NO: 17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
Asn Ser Lys Ala Val Asp Phe Ile
1 5
(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /function= "Primes 3' end putative
NS3 region PT-NANBH; introduces BamHI"
/note= "Sequence complementary to nucleotides 2074 to 2100 of SEQ ID No. 4 with nucleotide 9 changed to G and nucleotide 13 changed to C in order to introduce a BamHI restriction enzyme recognition site"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GACCGAGGGA TCCAACATGC CCGAGGG 27
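Because SEQ ID NO: 18 is an antisense primer, its relationship to the coding strand is easiest to see after reverse-complementing it. The sketch below is illustrative only: the function name `reverse_complement` is an assumption, and the comparison string is the final printed row (nucleotides 961 to 993) of SEQ ID NO: 19 in this listing.

```python
# Illustrative check of SEQ ID NO: 18; names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


primer_18 = "GACCGAGGGATCCAACATGCCCGAGGG"
# Final printed row of SEQ ID NO: 19 (nucleotides 961..993).
seq19_last_row = "GAACGGCCCTCGGGCATGTTGGATCCCTCGGTC"

sense_18 = reverse_complement(primer_18)
bamhi_in_primer = primer_18.find("GGATCC") + 1         # 1-based start
bamhi_in_seq19 = 961 + seq19_last_row.find("GGATCC")   # 1-based start

print(sense_18)
print(bamhi_in_primer, bamhi_in_seq19)
```

With the sequences as printed, the reverse complement coincides with the last 27 nucleotides of SEQ ID NO: 19, and the BamHI site starts at position 8 of the primer and at position 981 of SEQ ID NO: 19, in agreement with the feature at 981..986.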
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 993 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 14..993
(D) OTHER INFORMATION: /product= "PT-NANBH coding
sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..9
(D) OTHER INFORMATION: /note= "Dummy sequence present in
the amplifying primer."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 9..14
(D) OTHER INFORMATION: /note= "EcoRI restriction enzyme recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 981..986
(D) OTHER INFORMATION: /note= "BamHI restriction enzyme recognition sequence."
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 10..993
(D) OTHER INFORMATION: /product= "Putative NS3 region PT-NANBH; restriction sites each end"
/note= "Used to produce pDX200 which expresses
Beta-galactosidase-NS3 fusion protein in E. coli
(e.g. 2)."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
TCAGTATGG AAT TCG AAG GCG GTG GAC TTT ATA CCC GTT GAG TCT ATG 48
Asn Ser Lys Ala Val Asp Phe Ile Pro Val Glu Ser Met
1 5 10
GAA ACC ACT GTG CGG TCC CCG GTC TTT ACG GAC AAC TCA TCT CCT CCG 96 Glu Thr Thr Val Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
15 20 25
GCC GTA CCG CAG TCA TTC CAA GTG GCC CAT CTA CAC GCC CCC ACT GGC 144 Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
30 35 40 45
AGC GGC AAG AGC ACC AGG GTG CCG GCT GCG TAT GCA GCC CAA GGG TAC 192 Ser Gly Lys Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
50 55 60
AAG GTA CTT GTC CTG AAC CCG TCC GTT GCC GCC ACC CTA GGC TTT GGG 240 Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
65 70 75
GCG TAT ATG TCT AAG GCA CAC GGT GTC GAC CCT AAC ATC AGA TCT GGG 288 Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Ser Gly
80 85 90
GTA AGG ACC ATC ACC ACG GGC GCC CCC ATC ACG TAC TCC ACC TAT GGC 336 Val Arg Thr Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly
95 100 105
AAG TTC CTT GCC GAC GGT GGT TGC TCT GGG GGC GCC TAT GAC ATC ATA 384 Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
110 115 120 125
ATA TGT GAT GAA TGC CAC TCA ACT GAC TCG ACT TCC ATC CTG GGC ATT 432 Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly Ile
130 135 140
GGC ACA GTC CTA GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT GTC GTG 480 Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
145 150 155
CTC GCC ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAC CCC AAT 528
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
160 165 170
ATC GAG GAG GTG GCT CTG TCC AAC ACT GGA GAG ATC CCC TTC TAT GGC 576 Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
175 180 185
AAA GCC ATC CCT ATT GAG ACC ATC AAG GGG GGG AGG CAC CTC ATT TTC 624 Lys Ala Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile Phe
190 195 200 205
TGC CAC TCC AAG AAG AAG TGT GAC GAA CTC GCT GCA AAA CTG GTG GGC 672 Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Gly
210 215 220
CTC GGA ATC AAT GCT GTA GCG TAT TAC CGG GGC CTT GAT GTG TCC GTC 720 Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
225 230 235
ATA CCG GCC AGC GGA GAC GTC GTT GTT GTA GCA ACA GAC GCT CTA ATG 768 Ile Pro Ala Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
240 245 250
ACG GGC TTT ACC GGC GAC TTT GAC TCA GTG ATC GAC TGT AAT ACA TGT 816 Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
255 260 265
GTC ACC CAG ACG GTC GAT TTC AGC TTG GAC CCT ACC TTT ACC ATT GAG 864 Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
270 275 280 285
ACG ACG ACC GTG CCC CAA GAC GCG GTG TCG CGC TCA CAA CGG CGA GGC 912 Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
290 295 300
AGG ACT GGT AGG GGC AGG AGA GGC ATC TAC AGG TTT GTG GCT CCA GGA 960 Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Pro Gly
305 310 315
GAA CGG CCC TCG GGC ATG TTG GAT CCC TCG GTC 993 Glu Arg Pro Ser Gly Met Leu Asp Pro Ser Val
320 325
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 328 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Asn Ser Lys Ala Val Asp Phe Ile Pro Val Glu Ser Met Glu Thr Thr
1 5 10 15
Val Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala Val Pro
20 25 30
Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys
35 40 45
Ser Thr Arg Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu
50 55 60
Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met
65 70 75 80
Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Ser Gly Val Arg Thr
85 90 95
Ile Thr Thr Gly Ala Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu
100 105 110
Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp
115 120 125
Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly Ile Gly Thr Val
130 135 140
Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr
145 150 155 160
Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu
165 170 175
Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile
180 185 190
Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser
195 200 205
Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Gly Leu Gly Ile
210 215 220
Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Ala
225 230 235 240
Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Phe
245 250 255
Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr Gln
260 265 270
Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr Thr
275 280 285
Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr Gly
290 295 300
Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro
305 310 315 320
Ser Gly Met Leu Asp Pro Ser Val
325
(2) INFORMATION FOR SEQ ID NO: 21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 972 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: fusion of cDNA to genomic RNA with genomic DNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..405
(D) OTHER INFORMATION: /product= "PT-NANBH sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 406..972
(D) OTHER INFORMATION: /product= "AcNPV polyhedrin gene
sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_signal
(B) LOCATION: 970..972
(D) OTHER INFORMATION: /standard_name= "Stop codon"
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..972
(D) OTHER INFORMATION: /note= "Used to produce recombinant
baculovirus BHC-15 which expresses a fusion of
PT-NANBH putative core sequence with the carboxy-terminal portion of AcNPV polyhedrin protein (e.g. 3)"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT AAC ACC AAC 48 Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15
CTC CGC CCA CAG GAC GTC AGG TTC CCG GGC GGT GGT CAG ATC GTT GGT 96 Leu Arg Pro Gln Asp Val Arg Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30
GGA GTT TAC CTG TTG CCG CGC AGG GGC CCC AGG TTG GGT GTG CGC GCG 144 Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45
ACT AGG AAG ACT TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT 192 Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60
ATC CCC AAG GCT CGC CAG CCC GAG GGC AGG GCC TGG GCT CAG CCC GGG 240 Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80
TAC CCT TGG CCC CTC TAT GGC AAC GAG GGC ATG GGG TGG GCA GGA TGG 288 Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp
85 90 95
CTC CTG TCA CCC CGT GGC TCC CGG CCT AGT TGG GGC CCC ACT GAC CCC 336 Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110
CGG CGT AGG TCG CGT AAT TTG GGT AAA GTC ATC GAT ACC CTC ACA TGC 384
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125
GGC TTC GCC GAC CTC ATG GGG GAT CCT TTC CTG GGA CCC GGC AAG AAC 432
Gly Phe Ala Asp Leu Met Gly Asp Pro Phe Leu Gly Pro Gly Lys Asn
130 135 140
CAA AAA CTC ACT CTC TTC AAG GAA ATC CGT AAT GTT AAA CCC GAC ACG 480 Gln Lys Leu Thr Leu Phe Lys Glu Ile Arg Asn Val Lys Pro Asp Thr
145 150 155 160
ATG AAG CTT GTC GTT GGA TGG AAA GGA AAA GAG TTC TAC AGG GAA ACT 528 Met Lys Leu Val Val Gly Trp Lys Gly Lys Glu Phe Tyr Arg Glu Thr
165 170 175
TGG ACC CGC TTC ATG GAA GAC AGC TTC CCC ATT GTT AAC GAC CAA GAA 576 Trp Thr Arg Phe Met Glu Asp Ser Phe Pro Ile Val Asn Asp Gln Glu
180 185 190
GTG ATG GAT GTT TTC CTT GTT GTC AAC ATG CGT CCC ACT AGA CCC AAC 624
Val Met Asp Val Phe Leu Val Val Asn Met Arg Pro Thr Arg Pro Asn
195 200 205
CGT TGT TAC AAA TTC CTG GCC CAA CAC GCT CTG CGT TGC GAC CCC GAC 672
Arg Cys Tyr Lys Phe Leu Ala Gln His Ala Leu Arg Cys Asp Pro Asp
210 215 220
TAT GTA CCT CAT GAC GTG ATT AGG ATC GTC GAG CCT TCA TGG GTG GGC 720 Tyr Val Pro His Asp Val Ile Arg Ile Val Glu Pro Ser Trp Val Gly
225 230 235 240
AGC AAC AAC GAG TAC CGC ATC AGC CTG GCT AAG AAG GGC GGC GGC TGC 768 Ser Asn Asn Glu Tyr Arg Ile Ser Leu Ala Lys Lys Gly Gly Gly Cys
245 250 255
CCA ATA ATG AAC CTT CAC TCT GAG TAC ACC AAC TCG TTC GAA CAG TTC 816 Pro Ile Met Asn Leu His Ser Glu Tyr Thr Asn Ser Phe Glu Gln Phe
260 265 270
ATC GAT CGT GTC ATC TGG GAG AAC TTC TAC AAG CCC ATC GTT TAC ATC 864 Ile Asp Arg Val Ile Trp Glu Asn Phe Tyr Lys Pro Ile Val Tyr Ile
275 280 285
GGT ACC GAC TCT GCT GAA GAG GAG GAA ATT CTC CTT GAA GTT TCC CTG 912 Gly Thr Asp Ser Ala Glu Glu Glu Glu Ile Leu Leu Glu Val Ser Leu
290 295 300
GTG TTC AAA GTA AAG GAG TTT GCA CCA GAC GCA CCT CTG TTC ACT GGT 960 Val Phe Lys Val Lys Glu Phe Ala Pro Asp Ala Pro Leu Phe Thr Gly
305 310 315 320
CCG GCG TAT TAA 972
Pro Ala Tyr
(2) INFORMATION FOR SEQ ID NO: 22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 323 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15
Leu Arg Pro Gln Asp Val Arg Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30
Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45
Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60
Ile Pro Lys Ala Arg Gln Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly
65 70 75 80
Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp
85 90 95
Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125
Gly Phe Ala Asp Leu Met Gly Asp Pro Phe Leu Gly Pro Gly Lys Asn
130 135 140
Gln Lys Leu Thr Leu Phe Lys Glu Ile Arg Asn Val Lys Pro Asp Thr
145 150 155 160
Met Lys Leu Val Val Gly Trp Lys Gly Lys Glu Phe Tyr Arg Glu Thr
165 170 175
Trp Thr Arg Phe Met Glu Asp Ser Phe Pro Ile Val Asn Asp Gln Glu
180 185 190
Val Met Asp Val Phe Leu Val Val Asn Met Arg Pro Thr Arg Pro Asn
195 200 205
Arg Cys Tyr Lys Phe Leu Ala Gln His Ala Leu Arg Cys Asp Pro Asp
210 215 220
Tyr Val Pro His Asp Val Ile Arg Ile Val Glu Pro Ser Trp Val Gly
225 230 235 240
Ser Asn Asn Glu Tyr Arg Ile Ser Leu Ala Lys Lys Gly Gly Gly Cys
245 250 255
Pro Ile Met Asn Leu His Ser Glu Tyr Thr Asn Ser Phe Glu Gln Phe
260 265 270
Ile Asp Arg Val Ile Trp Glu Asn Phe Tyr Lys Pro Ile Val Tyr Ile
275 280 285
Gly Thr Asp Ser Ala Glu Glu Glu Glu Ile Leu Leu Glu Val Ser Leu
290 295 300
Val Phe Lys Val Lys Glu Phe Ala Pro Asp Ala Pro Leu Phe Thr Gly
305 310 315 320
Pro Ala Tyr
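The stated protein lengths can be cross-checked against the CDS locations given in the preceding records. The sketch below is a simple arithmetic illustration, not part of the original listing; the function name `residues_from_cds` is hypothetical, and the `has_stop` flag reflects the stop codon annotated at 970..972 of SEQ ID NO: 21.

```python
# Illustrative length check for CDS/protein pairs; names are hypothetical.
def residues_from_cds(start, end, has_stop=False):
    """Number of residues encoded by a 1-based inclusive CDS range."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nt // 3 - (1 if has_stop else 0)


# (label, CDS start, CDS end, includes stop codon, stated protein length)
RECORDS = [
    ("SEQ ID NO: 16 -> 17", 10, 33, False, 8),
    ("SEQ ID NO: 19 -> 20", 10, 993, False, 328),
    ("SEQ ID NO: 21 -> 22", 1, 972, True, 323),
]

results = {label: residues_from_cds(s, e, stop)
           for label, s, e, stop, _ in RECORDS}
for label, _, _, _, stated in RECORDS:
    print(label, results[label], "stated:", stated)
```

For the three pairs shown, the computed residue counts agree with the LENGTH fields of SEQ ID NOs: 17, 20 and 22.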
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /note= "Sequence complementary to
nucleotides 2044 to 2070 of SEQ ID No. 4 with position 9 changed G to C to create a PstI restriction enzyme recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 6..11
(D) OTHER INFORMATION: /note= "PstI recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..27
(D) OTHER INFORMATION: /note= "Primes synthesis at 3' end
of part of the putative NS3 region of PT-NANBH and introduces a PstI recognition site to facilitate subsequent cloning"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
TTCTCCTGCA GCCACAAACC TGTAGAT 27
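The single-base change described for SEQ ID NO: 23 can be illustrated by comparing the reverse complement of the primer with the corresponding coding-strand segment. In the sketch below, which is illustrative only, the segment `native_segment` is read from the printed rows of SEQ ID NO: 19 in this listing (nucleotides 937 to 963), and the function name is hypothetical.

```python
# Illustrative check of SEQ ID NO: 23; names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


primer_23 = "TTCTCCTGCAGCCACAAACCTGTAGAT"
# Coding-strand segment as printed in SEQ ID NO: 19 (nucleotides 937..963).
native_segment = "ATCTACAGGTTTGTGGCTCCAGGAGAA"

sense_23 = reverse_complement(primer_23)
psti_start = primer_23.find("CTGCAG") + 1  # 1-based start in the primer

# 1-based positions (on the coding strand) where the primer differs.
mismatches = [i + 1 for i, (a, b) in enumerate(zip(sense_23, native_segment))
              if a != b]
# Convert coding-strand positions back to primer positions.
primer_positions = [len(primer_23) - p + 1 for p in mismatches]

print(psti_start, mismatches, primer_positions)
```

Under these assumptions the PstI site occupies positions 6..11 of the primer, and the only difference from the printed coding strand falls at primer position 9, consistent with the note above.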
(2) INFORMATION FOR SEQ ID NO: 24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 16..33
(D) OTHER INFORMATION: /note= "Coding sequence of PT-NANBH corresponding to nucleotides 1486 to 1503 of SEQ ID No. 4."
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..6
(D) OTHER INFORMATION: /note= "BamHI restriction enzyme
recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..33
(D) OTHER INFORMATION: /note= "Primes synthesis at 5' end
of a portion of the putative NS3 region of
PT-NANBH and introduces both BamHI and PstI recognition sites to facilitate subsequent cloning"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 8..13
(D) OTHER INFORMATION: /note= "PstI restriction enzyme recognition site."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GGATCCTCTG CAGGG ATC ATA ATA TGT GAT GAA 33
Ile Ile Ile Cys Asp Glu
1 5
(2) INFORMATION FOR SEQ ID NO: 25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
Ile Ile Ile Cys Asp Glu
1 5
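The two restriction sites and the short coding region annotated for SEQ ID NO: 24 can be confirmed with a few lines of Python. This is a sketch for illustration only; `PARTIAL_CODE` is a deliberately partial codon table covering just the codons that occur in this primer, and all names are hypothetical.

```python
# Illustrative check of SEQ ID NO: 24 and SEQ ID NO: 25; names are hypothetical.
primer_24 = "GGATCCTCTGCAGGGATCATAATATGTGATGAA"

SITES = {"BamHI": "GGATCC", "PstI": "CTGCAG"}
# 1-based inclusive (start, end) of the first occurrence of each site.
located = {name: (primer_24.find(site) + 1, primer_24.find(site) + len(site))
           for name, site in SITES.items()}

# CDS feature at 16..33 (1-based, inclusive).
cds = primer_24[15:33]
codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]

# Partial codon table: only the codons present in this primer.
PARTIAL_CODE = {"ATC": "Ile", "ATA": "Ile", "TGT": "Cys",
                "GAT": "Asp", "GAA": "Glu"}
peptide = [PARTIAL_CODE[c] for c in codons]

print(located)
print(" ".join(peptide))
```

With the primer as printed, BamHI is found at 1..6 and PstI at 8..13, and the translation of positions 16..33 gives the six residues of SEQ ID NO: 25.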
(2) INFORMATION FOR SEQ ID NO: 26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3372 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA to genomic RNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..3372
(D) OTHER INFORMATION: /note= "Encodes a fusion protein which can be expressed in insect cells infected with the baculovirus BHC-28"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..63
(D) OTHER INFORMATION: /note= "amino terminal sequence of the Autographa californica Nuclear Polyhedrosis Virus (AcNPV) polyhedrin"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 64..1852
(D) OTHER INFORMATION: /note= "putative NS5 coding
sequence"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1853..1858
(D) OTHER INFORMATION: /note= "PstI restriction enzyme recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1859..2434
(D) OTHER INFORMATION: /note= "Putative NS3 coding region"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2435..2457
(D) OTHER INFORMATION: /note= "Synthetic linker region containing a PstI restriction enzyme recognition site"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2458..3288
(D) OTHER INFORMATION: /note= "Structural protein coding sequence containing the core and E1 regions"
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 3289..3372
(D) OTHER INFORMATION: /note= "Polyhedrin gene sequence
read out-of-frame"
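The misc_feature segments listed for SEQ ID NO: 26 appear to tile the whole 3372 base pair sequence. The sketch below, an illustration with hypothetical names, checks that each segment begins immediately after the previous one ends and reports the segment lengths; the short labels are paraphrases of the notes above.

```python
# Illustrative tiling check for the SEQ ID NO: 26 features; names are hypothetical.
FEATURES = [
    (1, 63, "AcNPV polyhedrin amino terminus"),
    (64, 1852, "putative NS5 coding sequence"),
    (1853, 1858, "PstI site"),
    (1859, 2434, "putative NS3 coding region"),
    (2435, 2457, "synthetic linker"),
    (2458, 3288, "core and E1 regions"),
    (3289, 3372, "polyhedrin sequence read out-of-frame"),
]
TOTAL_LENGTH = 3372


def is_contiguous(features, total):
    """True if the 1-based inclusive ranges cover 1..total with no gap or overlap."""
    expected_start = 1
    for start, end, _ in features:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return expected_start == total + 1


lengths = [end - start + 1 for start, end, _ in FEATURES]
contiguous = is_contiguous(FEATURES, TOTAL_LENGTH)
print(contiguous, lengths, sum(lengths))
```

A check of this kind is a quick way to catch a transposed coordinate in a hand-assembled feature table.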
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
ATG CCG GAT TAT TCA TAC CGT CCC ACC ATC GGG CCG GAT CCC CCG TCA 48 Met Pro Asp Tyr Ser Tyr Arg Pro Thr Ile Gly Pro Asp Pro Pro Ser
1 5 10 15
CTA TCG GCG GAA TTC ACA GAA GTG GAT GGG GTG CGG CTG CAC AGG TAC 96 Leu Ser Ala Glu Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr
20 25 30
GCT CCG GCG TGC AAA CCT CTC CTA CGG GAG GAG GTC ACA TTC CAG GTC 144
Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val
35 40 45
GGG CTC AAC CAA TAC CTG GTT GGG TCG CAG CTC CCA TGC GAG CCC GAA 192
Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu
50 55 60
CCG GAT GTA GCA GTG CTC ACT TCC ATG CTC ACC GAC CCC TCC CAC ATC 240 Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile
65 70 75 80
ACA GCA GAG ACG GCT AAG CGC AGG CTG GCC AGG GGG TCT CCC CCC TCC 288 Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser
85 90 95
TTG GCC AGC TCT TCA GCT AGC CAG TTG TCT GGC CCT TCC TCG AAG GCG 336 Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala
100 105 110
ACA TAC ATT ACC CAA AAT GAC TTC CCA GAC GCT GAC CTC ATC GAG GCC 384
Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala
115 120 125
AAC CTC CTG TGG CGG CAT GAG ATG GGC GGG GAC ATT ACC CGC GTG GAG 432
Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp Ile Thr Arg Val Glu
130 135 140
TCA GAG AAC AAG GTA GTA ATC CTG GAC TCT TTC GAC CCG CTC CGA GCG 480 Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala
145 150 155 160
GAG GAG GAT GAG CGG GAA GTG TCC GTC CCG GCG GAG ATC CTG CGG AAA 528 Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys
165 170 175
TCC AAG AAA TTC CCA CCA GCG ATG CCC GCA TGG GCA CGC CCG GAT TAC 576 Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr
180 185 190
AAC CCT CCG CTG CTG GAG TCC TGG AAG GCC CCG GAC TAC GTC CCT CCA 624
Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro
195 200 205
GTG GTA CAT GGG TGC CCA CTG CCA CCT ACT AAG ACC CCT CCT ATA CCA 672 Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro
210 215 220
CCT CCA CGG AGA AAG AGG ACA GTT GTT CTG ACA GAA TCC ACC GTG TCT 720 Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser
225 230 235 240
TCT GCC CTG GCG GAG CTT GCC ACA AAG GCT TTT GGT AGC TCC GGA CCG 768 Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro
245 250 255
TCG GCC GTC GAC AGC GGC ACG GCA ACC GCC CCT CCT GAC CAA TCC TCC 816
Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser
260 265 270
GAC GAC GGC GGA GCA GGA TCT GAC GTT GAG TCG TAT TCC TCC ATG CCC 864
Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro
275 280 285
CCC CTT GAG GGG GAG CCG GGG GAC CCC GAT CTC AGC GAC GGG TCT TGG 912 Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp
290 295 300
TCT ACC GTG AGT GAG GAG GCC GGT GAG GAC GTC GTC TGC TGC TCG ATG 960 Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met
305 310 315 320
TCC TAC ACA TGG ACA GGC GCT CTG ATC ACG CCA TGC GCT GCG GAG GAA 1008 Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu
325 330 335
AGC AAG CTG CCC ATC AAC GCG TTG AGC AAC TCT TTG CTG CGT CAC CAC 1056 Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
340 345 350
AAC ATG GTC TAC GCT ACC ACA TCC CGC AGC GCA AGC CAG CGG CAG AAG 1104 Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys
355 360 365
AAG GTC ACC TTT GAC AGA CTG CAA ATC CTG GAC GAT CAC TAC CAG GAC 1152 Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp Asp His Tyr Gln Asp
370 375 380
GTG CTC AAG GAG ATG AAG GCG AAG GCG TCC ACA GTT AAG GCT AAG CTT 1200 Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu
385 390 395 400
CTA TCA GTA GAG GAA GCC TGC AAG CTG ACG CCC CCA CAT TCG GCC AAA 1248 Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys
405 410 415
TCT AAA TTT GGC TAT GGG GCA AAG GAC GTC CGG AAC CTA TCC AGC AAG 1296
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys
420 425 430
GCC ATT AAC CAC ATC CGC TCC GTG TGG GAG GAC TTG TTG GAA GAC ACT 1344
Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr
435 440 445
GAA ACA CCA ATT GAC ACC ACC ATC ATG GCA AAA AAT GAG GTT TTC TGC 1392 Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys
450 455 460
GTC CAA CCA GAG AGA GGA GGC CGC AAG CCA GCT CGC CTT ATC GTG TTC 1440
Val Gln Pro Glu Arg Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe
465 470 475 480
CCA GAC TTG GGG GTC CGT GTG TGC GAG AAA ATG GCC CTC TAT GAC GTG 1488
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
485 490 495
GTC TCC ACC CTC CCT CAG GCT GTG ATG GGC TCC TCG TAC GGA TTC CAG 1536
Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
500 505 510
TAT TCT CCT GGA CAG CGG GTC GAG TTC CTG GTG AAC GCC TGG AAA TCA 1584
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser
515 520 525
AAG AAG ACC CCT ATG GGC TTT GCA TAT GAC ACC CGC TGT TTT GAC TCA 1632 Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser
530 535 540
ACA GTC ACT GAG AAT GAC ATC CGT GTA GAG GAG TCA ATT TAT CAA TGT 1680
Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys
545 550 555 560
TGT GAC TTG GCC CCC GAA GCC AGA CAG GCC ATA AGG TCG CTC ACA GAG 1728
Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu
565 570 575
CGG CTT TAT ATC GGG GGT CCC CTG ACT AAT TCA AAA GGG CAG AAC TGC 1776
Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys
580 585 590
GGC TAT CGC CGG TGC CGC GCG AGC GGC GTG CTG ACG ACT AGC TGC GGT 1824
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
595 600 605
AAT ACC CTC ACA TGT TAC TTG AAG GCC TCT GCA GGG ATC ATA ATA TGT 1872 Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Gly Ile Ile Ile Cys
610 615 620
GAT GAA TGC CAC TCA ACT GAC TCG ACT TCC ATC CTG GGC ATT GGC ACA 1920
Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly Ile Gly Thr
625 630 635 640
GTC CTA GAC CAA GCG GAG ACG GCT GGA GCG CGC CTT GTC GTG CTC GCC 1968
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
645 650 655
ACC GCT ACG CCT CCG GGA TCG GTC ACC GTG CCG CAC CCC AAT ATC GAG 2016
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
660 665 670
GAG GTG GCT CTG TCC AAC ACT GGA GAG ATC CCC TTC TAT GGC AAA GCC 2064
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
675 680 685
ATC CCT ATT GAG ACC ATC AAG GGG GGG AGG CAC CTC ATT TTC TGC CAC 2112 Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile Phe Cys His
690 695 700
TCC AAG AAG AAG TGT GAC GAA CTC GCT GCA AAA CTG GTG GGC CTC GGA 2160
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Gly Leu Gly
705 710 715 720
ATC AAT GCT GTA GCG TAT TAC CGG GGC CTT GAT GTG TCC GTC ATA CCG 2208 Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
725 730 735
GCC AGC GGA GAC GTC GTT GTT GTA GCA ACA GAC GCT CTA ATG ACG GGC 2256 Ala Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
740 745 750
TTT ACC GGC GAC TTT GAC TCA GTG ATC GAC TGT AAT ACA TGT GTC ACC 2304 Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
755 760 765
CAG ACG GTC GAT TTC AGC TTG GAC CCT ACC TTT ACC ATT GAG ACG ACG 2352 Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr
770 775 780
ACC GTG CCC CAA GAC GCG GTG TCG CGC TCA CAA CGG CGA GGC AGG ACT 2400
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr
785 790 795 800
GGT AGG GGC AGG AGA GGC ATC TAC AGG TTT GTG GCT GCA GTA AAG AAG 2448
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Ala Val Lys Lys
805 810 815
AAG AAG AAG AAA ACC AAA CGT AAC ACC AAC CTC CGC CCA CAG GAC GTC 2496 Lys Lys Lys Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val
820 825 830
AGG TTC CCG GGC GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG 2544 Arg Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro
835 840 845
CGC AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT TCC GAG 2592 Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu
850 855 860
CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC CCC AAG GCT CGC CAG 2640
Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
865 870 875 880
CCC GAG GGC AGG GCC TGG GCT CAG CCC GGG TAC CCT TGG CCC CTC TAT 2688 Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr
885 890 895
GGC AAC GAG GGC ATG GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC 2736 Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly
900 905 910
TCC CGG CCT AGT TGG GGC CCC ACT GAC CCC CGG CGT AGG TCG CGT AAT 2784 Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn
915 920 925
TTG GGT AAA GTC ATC GAT ACC CTC ACA TGC GGC TTC GCC GAC CTC ATG 2832 Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met
930 935 940
GGG TAC ATT CCG CTC GTC GGC GCT CCC TTA GGG GGC GCT GCC AGG GCC 2880 Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala
945 950 955 960
CTG GCG CAT GGC GTC CGG GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA 2928 Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr
965 970 975
GGG AAT TTA CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 2976 Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
980 985 990
TCC TGT TTG ACC ATT CCA GCT TCC GCT TAT GAA GTG CGC AAC GTG TCC 3024 Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser
995 1000 1005
GGG ATC TAC CAT GTC ACG AAC GAT TGC TCC AAC TCA AGC ATC GTG TAC 3072
Gly Ile Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr
1010 1015 1020
GAG ACA GCG GAC ATG ATC ATG CAC ACC CCC GGG TGT GTG CCC TGT GTC 3120
Glu Thr Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val
1025 1030 1035 1040
CGG GAG GGT AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT CCC ACG CTC 3168 Arg Glu Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu
1045 1050 1055
GCG GCC AAG GAC GCC AGC ATC CCC ACT GCG ACA ATA CGA CGC CAC GTC 3216 Ala Ala Lys Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val
1060 1065 1070
GAT TTG CTC GTT GGG GCG GCT GCC TTC TGC TCC GCT ATG TAC GTG GGG 3264 Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly
1075 1080 1085
GAT CTC TGC GGA TCT GTT TTC CCG GAA TTC CAG CTG AGC GCC GGT CGC 3312
Asp Leu Cys Gly Ser Val Phe Pro Glu Phe Gln Leu Ser Ala Gly Arg
1090 1095 1100
TAC GGA TCC TTT CCT GGG ACC CGG CAA GAA CCA AAA ACT CAC TCT CTT 3360 Tyr Gly Ser Phe Pro Gly Thr Arg Gln Glu Pro Lys Thr His Ser Leu
1105 1110 1115 1120
CAA GGA AAT CCG 3372 Gln Gly Asn Pro
(2) INFORMATION FOR SEQ ID NO: 27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1124 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
Met Pro Asp Tyr Ser Tyr Arg Pro Thr Ile Gly Pro Asp Pro Pro Ser
1 5 10 15
Leu Ser Ala Glu Phe Thr Glu Val Asp Gly Val Arg Leu His Arg Tyr
20 25 30
Ala Pro Ala Cys Lys Pro Leu Leu Arg Glu Glu Val Thr Phe Gln Val
35 40 45
Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys Glu Pro Glu
50 55 60
Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser His Ile
65 70 75 80
Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser Pro Pro Ser
85 90 95
Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Gly Pro Ser Ser Lys Ala
100 105 110
Thr Tyr Ile Thr Gln Asn Asp Phe Pro Asp Ala Asp Leu Ile Glu Ala
115 120 125
Asn Leu Leu Trp Arg His Glu Met Gly Gly Asp Ile Thr Arg Val Glu
130 135 140
Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Arg Ala
145 150 155 160
Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile Leu Arg Lys
165 170 175
Ser Lys Lys Phe Pro Pro Ala Met Pro Ala Trp Ala Arg Pro Asp Tyr
180 185 190
Asn Pro Pro Leu Leu Glu Ser Trp Lys Ala Pro Asp Tyr Val Pro Pro
195 200 205
Val Val His Gly Cys Pro Leu Pro Pro Thr Lys Thr Pro Pro Ile Pro
210 215 220
Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser Thr Val Ser
225 230 235 240
Ser Ala Leu Ala Glu Leu Ala Thr Lys Ala Phe Gly Ser Ser Gly Pro
245 250 255
Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Pro Pro Asp Gln Ser Ser
260 265 270
Asp Asp Gly Gly Ala Gly Ser Asp Val Glu Ser Tyr Ser Ser Met Pro
275 280 285
Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser Trp
290 295 300
Ser Thr Val Ser Glu Glu Ala Gly Glu Asp Val Val Cys Cys Ser Met
305 310 315 320
Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala Ala Glu Glu
325 330 335
Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg His His
340 345 350
Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Ser Gln Arg Gln Lys
355 360 365
Lys Val Thr Phe Asp Arg Leu Gln Ile Leu Asp Asp His Tyr Gin Asp 370 375 380
Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys Ala Lys Leu 385 390 395 400
Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His Ser Ala Lys
405 410 415
Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu Ser Ser Lys
420 425 430
Ala Ile Asn His Ile Arg Ser Val Trp Glu Asp Leu Leu Glu Asp Thr
435 440 445
Glu Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys 450 455 460
Val Gin Pro Glu Arg Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe 465 470 475 480
Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val
485 490 495
Val Ser Thr Leu Pro Gln Ala Val Met Gly Ser Ser Tyr Gly Phe Gln
500 505 510
Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Ala Trp Lys Ser
515 520 525
Lys Lys Thr Pro Met Gly Phe Ala Tyr Asp Thr Arg Cys Phe Asp Ser 530 535 540 Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile Tyr Gln Cys 545 550 555 560
Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Arg Ser Leu Thr Glu
565 570 575
Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly Gln Asn Cys
580 585 590
Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly
595 600 605
Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Gly Ile Ile lie Cys 610 615 620
Asp Glu Cys His Ser Thr Asp Ser Thr Ser Ile Leu Gly Ile Gly Thr 625 630 635 640
Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala
645 650 655
Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu
660 665 670
Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala
675 680 685
Ile Pro Ile Glu Thr Ile Lys Gly Gly Arg His Leu Ile Phe Cys His 690 695 700
Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Gly Leu Gly 705 710 715 720 Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro
725 730 735
Ala Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly
740 745 750
Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val Thr
755 760 765
Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr Thr 770 775 780
Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg Thr 785 790 795 800
Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Ala Ala Val Lys Lys
805 810 815
Lys Lys Lvs Lys Thr Lys Arg Asn Thr Asn Leu Arg Pro Gln Asp Val
820 825 830
Arg Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro
835 840 845 Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu 850 855 860
Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln 865 870 875 880
Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr
885 890 895 Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly
900 905 910
Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn
915 920 925
Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met 930 935 940
Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala 945 950 955 960
Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr
965 970 975 Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
980 985 990
Ser Cys Leu Thr Ile Pro Ala Ser Ala Tyr Glu Val Arg Asn Val Ser
995 1000 1005
Gly Ile Tyr His Val Thr Asn Asp Cys Ser Asn Ser Ser Ile Val Tyr 1010 1015 1020
Glu Thr Ala Asp Met Ile Met His Thr Pro Gly Cys Val Pro Cys Val 1025 1030 1035 1040
Arg Glu Gly Asn Ser Ser Arg Cys Trp Val Ala Leu Thr Pro Thr Leu
1045 1050 1055 Ala Ala Lys Asp Ala Ser Ile Pro Thr Ala Thr Ile Arg Arg His Val
1060 1065 1070
Asp Leu Leu Val Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly
1075 1080 1085
Asp Leu Cys Gly Ser Val Phe Pro Glu Phe Gln Leu Ser Ala Gly Arg 1090 1095 1100
Tyr Gly Ser Phe Pro Gly Thr Arg Gln Glu Pro Lys Thr His Ser Leu 1105 1110 1115 1120 Gln Gly Asn Pro
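A listing of this kind can be checked mechanically against its stated LENGTH. The following is a minimal sketch, not part of the specification: it assumes the listing is available as plain text in which three-letter residue codes and position numbers are separated by whitespace, and the names `THREE_TO_ONE` and `listing_to_one_letter` are hypothetical helpers introduced here for illustration.

```python
# Map of the 20 standard three-letter amino acid codes to one-letter codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def listing_to_one_letter(text: str) -> str:
    """Convert a three-letter sequence listing to a one-letter string.

    Purely numeric tokens (the position markers such as 5, 10, 15) are
    skipped. Any other unrecognised token raises ValueError, so that
    transcription errors in the listing are reported rather than dropped.
    """
    residues = []
    for token in text.split():
        if token.isdigit():
            continue  # position marker, not a residue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognised residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


if __name__ == "__main__":
    # First row of SEQ ID NO: 27, with its position markers.
    row = ("Met Pro Asp Tyr Ser Tyr Arg Pro Thr Ile Gly Pro Asp Pro Pro Ser "
           "1 5 10 15")
    sequence = listing_to_one_letter(row)
    print(sequence, len(sequence))
```

Applied to the whole of SEQ ID NO: 27, the length of the returned string would be expected to equal the 1124 amino acids stated under (A) LENGTH. Rejecting unknown tokens is a deliberate choice in this sketch, since a misread code would otherwise shorten the sequence silently.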

Claims

CLAIMS:
1. A recombinant PT-NANBH polypeptide or recombinant PT-NANBH polypeptides which polypeptide comprises or which polypeptides together comprise (i) at least one antigen from the structural coding region of the viral genome, (ii) at least one antigen from the non-structural coding region of the viral genome; and (iii) at least one further antigen from either the structural or non-structural coding region of the viral genome and which is different from the antigens referred to in (i) and (ii).
2. A recombinant PT-NANBH polypeptide or recombinant PT-NANBH polypeptides as claimed in claim 1 which comprise an antigen from the structural coding region having an amino acid sequence which is at least 70% homologous with the amino acid sequence as set forth in SEQ ID NO: 3 or 4 or an antigenic fragment thereof, an antigen from the non-structural coding region having an amino acid sequence that is at least 70% homologous with the amino acid sequence as set forth in SEQ ID NO: 5 or 6 or an antigenic fragment thereof, and an antigen from the non-structural coding region having an amino acid sequence which is at least 70% homologous with the amino acid sequence as set forth in SEQ ID NO: 7 or 8 or an antigenic fragment thereof.
3. A recombinant polypeptide wherein the polypeptides as claimed in claim 1 or 2 are fused to form a single recombinant polypeptide.
4. A recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70% homologous with the amino acid sequence as set forth in SEQ ID NO: 14 or 15 or an antigenic fragment thereof.
5. A recombinant PT-NANBH polypeptide comprising an amino acid sequence which is at least 70% homologous with the amino acid sequence as set forth in SEQ ID NO: 19 or 20 or 7 or 8 or an antigenic fragment thereof.
6. A DNA sequence encoding a polypeptide or polypeptides as claimed in any one of claims 1 to 5.
7. A DNA sequence as claimed in claim 6 as set forth in SEQ ID NO: 3, 5, 7, 14, 19 or 26.
8. An expression vector containing a DNA sequence as claimed in claim 6 or 7 being capable in an appropriate host of expressing the DNA sequence to produce a PT-NANBH recombinant polypeptide or polypeptides.
9. A host cell transformed with an expression vector as claimed in claim 8.
10. A process for preparing a PT-NANBH recombinant polypeptide or polypeptides as claimed in any of claims 1 to 3 which comprises isolating the DNA sequences, as set forth in SEQ ID NO: 3, 5, 7, 14, 19 or 26, from the PT-NANBH genome, or synthesising DNA sequences encoding the antigens of PT-NANBH recombinant polypeptide or polypeptides, as set forth in SEQ ID NO: 3, 5, 7, 14, 19 or 26, inserting the DNA sequences into one or more expression vectors such that it or they are capable, in an appropriate host, of being expressed, transforming a host cell with the expression vector, culturing the transformed host cell, and isolating the recombinant polypeptide or polypeptides.
11. A method for the detection of PT-NANBH viral antibody which comprises contacting a test sample with a PT-NANBH recombinant polypeptide or polypeptides as claimed in any of claims 1 to 5 and determining whether there is any antigen-antibody binding contained within the test sample.
12. A test kit for the detection of PT-NANBH viral antibody which comprises the recombinant PT-NANBH polypeptide or polypeptides as claimed in any of claims 1 to 5 and means for determining whether there is any antigen-antibody binding contained in the test sample.
13. A vaccine formulation which comprises a recombinant PT-NANBH polypeptide or polypeptides as claimed in any of claims 1 to 5 in association with a pharmaceutically acceptable carrier.
14. A method for inducing immunity in man to PT-NANBH which comprises the administration of an effective amount of a vaccine formulation according to claim 13.
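The "at least 70% homologous" threshold recited in claims 2, 4 and 5 can be illustrated with a short calculation. The sketch below is illustrative only and rests on stated assumptions: the claims as reproduced here do not define how homology is to be measured, so the code treats it as simple percent identity between two sequences that have already been aligned to equal length, with `-` marking a gap. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumptions of this sketch: '-' denotes a gap; a column that is a gap
    in both sequences is ignored; a gap opposite a residue counts as a
    mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where at least one sequence has a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 70.0) -> bool:
    """True if percent identity is at least `threshold` (default 70)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MPDYSYRPTI"   # first ten residues of SEQ ID NO: 27
    candidate = "MPDYAYRPSI"   # made-up variant with two substitutions
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate))
```

In practice the sequences would first have to be aligned, and different alignment methods or definitions of homology (for example, counting conservative substitutions as matches) may give different percentages for the same pair.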
PCT/GB1993/000345 1992-02-21 1993-02-19 A recombinant hepatitis c virus polypeptide WO1993017110A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9203803.3 1992-02-21
GB929203803A GB9203803D0 (en) 1992-02-21 1992-02-21 A recombinant polypeptide

Publications (2)

Publication Number Publication Date
WO1993017110A2 (en) 1993-09-02
WO1993017110A3 (en) 1993-10-14

Family

ID=10710857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1993/000345 WO1993017110A2 (en) 1992-02-21 1993-02-19 A recombinant hepatitis c virus polypeptide

Country Status (4)

Country Link
AU (1) AU3509693A (en)
GB (1) GB9203803D0 (en)
WO (1) WO1993017110A2 (en)
ZA (1) ZA931203B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996037606A1 (en) * 1995-05-22 1996-11-28 Bionova Corporation Compositions and methods for the diagnosis of, and vaccination against, hepatitis c virus (hcv)
US5885771A (en) * 1993-10-29 1999-03-23 Srl, Inc. Antigenic peptide compound and immunoassay
US5910405A (en) * 1993-04-30 1999-06-08 Lucky Limited HCV diagnostic agents
US6153378A (en) * 1992-10-16 2000-11-28 Bionova Corporation Diagnosis of, and vaccination against, a positive stranded RNA virus using an isolated, unprocessed polypeptide encoded by a substantially complete genome of such virus
US7625569B2 (en) 2004-10-18 2009-12-01 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US7781567B2 (en) 2002-05-24 2010-08-24 Nps Pharmaceuticals, Inc. Method for enzymatic production of GLP-2(1-33) and GLP-2(1-34) peptides
WO2010096115A1 (en) * 2008-10-29 2010-08-26 Apath, Llc Compounds, compositions and methods for control of hepatitis c viral infections
US7829307B2 (en) 2003-11-21 2010-11-09 Nps Pharmaceuticals, Inc. Production of glucagon-like peptide 2
US8728489B2 (en) 2008-09-19 2014-05-20 Globeimmune, Inc. Immunotherapy for chronic hepatitis C virus infection
US9322025B2 (en) * 2002-05-24 2016-04-26 Medtronic, Inc. Methods and DNA constructs for high yield production of polypeptides

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2239245A (en) * 1989-12-18 1991-06-26 Wellcome Found Post-transfusional non-A non-B hepatitis viral polypeptides
EP0445423A2 (en) * 1989-12-22 1991-09-11 Abbott Laboratories Hepatitis C assay
EP0450931A1 (en) * 1990-04-04 1991-10-09 Chiron Corporation Combinations of hepatitis C virus (HCV) antigens for use in immunoassays for anti-HCV antibodies
EP0468527A2 (en) * 1990-07-26 1992-01-29 United Biomedical, Inc. Synthetic peptides specific for the detection of antibodies to HCV, diagnosis of HCV infection and prevention thereof as vaccines
EP0472207A2 (en) * 1990-08-24 1992-02-26 Abbott Laboratories Hepatitis C assay utilizing recombinant antigens
EP0521318A2 (en) * 1991-06-10 1993-01-07 Lucky Ltd. Hepatitis C diagnostics and vaccines

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SCAND. J. GASTROENTEROL. vol. 26, 1991, pages 1257 - 1262 L. MATTSSON ET AL. 'Antibodies to recombinant and synthetic peptides derived from hepatitis C virus genome in long-term studied patients with posttransfusion hepatitis C' *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6153378A (en) * 1992-10-16 2000-11-28 Bionova Corporation Diagnosis of, and vaccination against, a positive stranded RNA virus using an isolated, unprocessed polypeptide encoded by a substantially complete genome of such virus
US5910405A (en) * 1993-04-30 1999-06-08 Lucky Limited HCV diagnostic agents
US5885771A (en) * 1993-10-29 1999-03-23 Srl, Inc. Antigenic peptide compound and immunoassay
WO1996037606A1 (en) * 1995-05-22 1996-11-28 Bionova Corporation Compositions and methods for the diagnosis of, and vaccination against, hepatitis c virus (hcv)
US8148508B2 (en) 2002-05-24 2012-04-03 Nps Pharmaceuticals, Inc. Method for enzymatic production of GLP-2(1-33) and GLP-2(1-34) peptides
US9951368B2 (en) 2002-05-24 2018-04-24 Medtronic, Inc. Methods of DNA constructs for high yield production of polypeptides
US7781567B2 (en) 2002-05-24 2010-08-24 Nps Pharmaceuticals, Inc. Method for enzymatic production of GLP-2(1-33) and GLP-2(1-34) peptides
US9322025B2 (en) * 2002-05-24 2016-04-26 Medtronic, Inc. Methods and DNA constructs for high yield production of polypeptides
US7829307B2 (en) 2003-11-21 2010-11-09 Nps Pharmaceuticals, Inc. Production of glucagon-like peptide 2
US7625569B2 (en) 2004-10-18 2009-12-01 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US8007816B2 (en) 2004-10-18 2011-08-30 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US8388980B2 (en) 2004-10-18 2013-03-05 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US8821892B2 (en) 2004-10-18 2014-09-02 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US7632511B2 (en) 2004-10-18 2009-12-15 Globeimmune, Inc. Yeast-based therapeutic for chronic hepatitis C infection
US8728489B2 (en) 2008-09-19 2014-05-20 Globeimmune, Inc. Immunotherapy for chronic hepatitis C virus infection
US8809344B2 (en) 2008-10-29 2014-08-19 Apath, Llc Compounds, compositions, and methods for control of hepatitis C viral infections
WO2010096115A1 (en) * 2008-10-29 2010-08-26 Apath, Llc Compounds, compositions and methods for control of hepatitis c viral infections

Also Published As

Publication number Publication date
AU3509693A (en) 1993-09-13
WO1993017110A3 (en) 1993-10-14
GB9203803D0 (en) 1992-04-08
ZA931203B (en) 1993-10-04

Similar Documents

Publication Publication Date Title
US6210675B1 (en) PT-NANB hepatitis polypeptides
US6312889B1 (en) Combinations of hepatitis c virus (HCV) antigens for use in immunoassays for anti-HCV antibodies
US5683864A (en) Combinations of hepatitis C virus (HCV) antigens for use in immunoassays for anti-HCV antibodies
EP0693687B2 (en) Combinations of hepatitis C virus (HCV) antigens for use in immunoassays for anti-HCV antibodies
Bukh et al. Sequence analysis of the core gene of 14 hepatitis C virus genotypes.
AU695259B2 (en) Hepatitis-C virus type 4, 5 and 6
EP0593290B1 (en) Core antigen protein of hepatitis C virus, and diagnostic method and kit using the same
WO1993017110A2 (en) A recombinant hepatitis c virus polypeptide
AU679179B2 (en) CDNA sequence of dengue virus serotype 1 (singapore strain)
JPH05504143A (en) Immunologically active peptide or polypeptide of parvovirus B19
CA2147622A1 (en) Methods and compositions for detecting anti-hepatitis e virus activity
JPH06502542A (en) Polypeptide derived from hepatitis C virus protein, test kit containing the polypeptide, and vaccine for preventing hepatitis C virus infection
AU3572593A (en) Hepatitis C virus peptides
US20020150990A1 (en) Hepatitis C virus peptides
CA2115925A1 (en) Hepatitis c assay utilizing recombinant antigens from ns5 region
US7166287B1 (en) Viral agent
AU639560C (en) Combinations of hepatitis C virus (HCV) antigens for use in immunoassays for anti-HCV antibodies
WO1992000323A2 (en) HUMAN CYTOMEGALOVIRUS (HCMV) ANTIGEN pp150
JPH07198723A (en) Diagnosing medicine for hepatitis c-virus infectious disease
AU684478C (en) Methods and compositions for detecting anti-hepatitis E virus activity
WO1993002363A1 (en) Method to detect antibodies against hepatitis c virus and kits for the use thereof
JPH061799A (en) Antigen peptide of hepatitis virus of non-a non-b type, nucleic acid fragment encoding the same peptide and its use

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AU CA CZ FI HU JP KR NO NZ PL SK US

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AU CA CZ FI HU JP KR NO NZ PL SK US

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA