AU746826B2 - Production of mature proteins in plants - Google Patents

Info

Publication number
AU746826B2
Authority
AU
Australia
Prior art keywords
mature
ala
seq
ser
leu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU61716/98A
Other versions
AU6171698A (en
Inventor
Raymond L Rodriguez
Thomas D. Sutliff
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of California
Invitria Inc
Original Assignee
Applied Phytologics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Applied Phytologics Inc filed Critical Applied Phytologics Inc
Publication of AU6171698A publication Critical patent/AU6171698A/en
Application granted granted Critical
Publication of AU746826B2 publication Critical patent/AU746826B2/en
Assigned to REGENTS OF THE UNIVERSITY OF CALIFORNIA, THE, VENTRIA BIOSCIENCE reassignment REGENTS OF THE UNIVERSITY OF CALIFORNIA, THE Alteration of Name(s) in Register under S187 Assignors: APPLIED PHYTOLOGICS, INC.
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/81Protease inhibitors
    • C07K14/8107Endopeptidase (E.C. 3.4.21-99) inhibitors
    • C07K14/811Serine protease (E.C. 3.4.21) inhibitors
    • C07K14/8121Serpins
    • C07K14/8128Antithrombin III
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/76Albumins
    • C07K14/765Serum albumin, e.g. HSA
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/81Protease inhibitors
    • C07K14/8107Endopeptidase (E.C. 3.4.21-99) inhibitors
    • C07K14/811Serine protease (E.C. 3.4.21) inhibitors
    • C07K14/8121Serpins
    • C07K14/8125Alpha-1-antitrypsin
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8221Transit peptides
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8222Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
    • C12N15/823Reproductive tissue-specific promoters
    • C12N15/8234Seed-specific, e.g. embryo, endosperm
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8222Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
    • C12N15/823Reproductive tissue-specific promoters
    • C12N15/8235Fruit-specific
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8237Externally regulated expression systems
    • C12N15/8238Externally regulated expression systems chemically inducible, e.g. tetracycline
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8257Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits for the production of primary gene products, e.g. pharmaceutical products, interferon
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14Hydrolases (3)
    • C12N9/48Hydrolases (3) acting on peptide bonds (3.4)
    • C12N9/50Proteinases, e.g. Endopeptidases (3.4.21-3.4.25)
    • C12N9/52Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea
    • C12N9/54Proteinases, e.g. Endopeptidases (3.4.21-3.4.25) derived from bacteria or Archaea bacteria being Bacillus

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Medicinal Chemistry (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Cell Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Reproductive Health (AREA)
  • Pregnancy & Childbirth (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Developmental Biology & Embryology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Toxicology (AREA)
  • Peptides Or Proteins (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Fertilizers (AREA)

Description

WO 98/36085 PCT/US98/03068
Production of Mature Proteins in Plants
Field of the Invention
The present invention relates to the production of mature proteins in plant cells, and in particular, to the production of proteins in mature secreted form.
Background of the Invention
A major commercial focus of biotechnology is the recombinant production of proteins, including both industrial enzymes and proteins that have important therapeutic uses.
Therapeutic proteins are commonly produced recombinantly by microbial expression systems, such as in E. coli and the yeast system S. cerevisiae. To date, the cost of recombinant proteins produced in a microbial host has limited the availability of a variety of therapeutically important proteins, such as human serum albumin (HSA) and α1-antitrypsin (AAT), to the extent that the proteins are in short supply.
Some therapeutic proteins appear to rely on glycosylation for optimal activity or stability, and the general inability of microbial systems to glycosylate or properly glycosylate mammalian proteins has also limited the usefulness of these recombinant expression systems. In some cases, proper protein folding cannot take place, because of the need for mammalian-specific foldases or other folding conditions.
To some extent, protein expression in cultured mammalian cells or in transgenic animals may overcome the limitations of microbial expression systems. However, the cost per unit weight of the protein is still high in mammalian expression systems, and the risk of protein contamination by mammalian viruses may be a significant regulatory problem. Protein production by transgenic animals also carries the risk of genetic variation from one generation to another. The attendant risk is variation in the recombinant protein produced, for example, variation in protein processing to yield a mature active protein with a different N-terminal residue.
It would therefore be desirable to produce selected therapeutic and industrial proteins in a protein expression system that largely overcomes problems associated with microbial and mammalian-cell systems. In particular, production of the proteins should allow large volume production at low cost, and yield properly processed and glycosylated proteins. The production system should also have a relatively stable genotype from generation to generation. These aims are achieved, in the present invention, for the therapeutic proteins AAT, HSA, and antithrombin III (ATIII), and the industrial enzyme subtilisin BPN'.
SUBSTITUTE SHEET (RULE 26)
Human α1-antitrypsin
Human α1-antitrypsin (AAT) is a monomer with a molecular weight of about 52 kD.
Normal AAT contains 394 residues, with three complex oligosaccharide units exposed to the surface of the molecule, linked to asparagines 46, 83, and 247 (Carrell, et al., Nature (1982) 298:329).
AAT is the major plasma proteinase inhibitor whose primary function is to control the proteolytic activity of trypsin, elastase, and chymotrypsin in plasma. In particular, the protein is a potent inhibitor of neutrophil elastase, and a deficiency of AAT has been observed in a number of patients with chronic emphysema of the lungs. A proportion of individuals with serum deficiency of AAT may progress to cirrhosis and liver failure (Wu, et al., BioEssays 11(4):163 (1991)).
Because of the key role of AAT as an elastase inhibitor, and because of the prevalence of genetic diseases resulting in deficient serum levels of AAT, there has been an active interest in recombinant synthesis of AAT, for human therapeutic use. To date, this approach has not been satisfactory for AAT produced by recombinant methods, for the reasons discussed above.
Human Antithrombin III
Antithrombin III (ATIII) is the major inhibitor of thrombin and factor Xa, and to a lesser extent, other serine proteases generated during the coagulation process, factors IXa, XIa, and XIIa. The inhibitory effect of ATIII is accelerated dramatically by heparin. In patients with a history of deep vein thrombosis and pulmonary embolism, the prevalence of ATIII deficiency is 2-3%.
ATIII protein has been useful in treating hereditary ATIII deficiency and has wide clinical applications for the prevention of thrombosis in high risk situations, such as surgery and delivery, and for treating acute thrombotic episodes, when used in combination with heparin.
ATIII is a glycoprotein with a molecular weight of 58,200, having 432 amino acids and containing three disulfide linkages and four asparagine-linked biantennary carbohydrate chains.
Because of the key role of ATIII as an anti-thrombotic agent, and because of the broad clinical potential in anti-thrombosis therapy, there has been an active interest in recombinant synthesis of ATIII, for human therapeutic use. To date, this approach has not been satisfactory for ATIII produced by microbial or mammalian recombinant methods, for the reasons discussed above.
Human Serum Albumin
Serum albumin is the main protein component of plasma. Its main function is regulation of colloidal osmotic pressure in the bloodstream. Serum albumin binds numerous ions and small molecules, including Ca2+, Na+, fatty acids, hormones, bilirubin and certain drugs.
Human serum albumin (HSA) is expressed as a 609 amino acid prepro-protein which is further processed by removal of an amino-terminal peptide and an additional six amino acid residues to form the mature protein. The mature protein found in human serum is a monomeric, unglycosylated protein 585 amino acids in length (66 kDal), with a globular structure maintained by 17 disulfide bonds. The pattern of disulfide links forms a structural unit of one small and two large disulfide-linked double loops (Geisow, M.J. et al. (1977) Biochem. J. 163:477-484) which forms a high-affinity bilirubin binding site.
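The residue counts quoted above can be cross-checked with simple arithmetic. The sketch below is only a consistency check: the length of the amino-terminal peptide is not stated in the text and is inferred here from the other three figures, and all variable names are illustrative.

```python
# Lengths quoted in the text (amino acid residues).
PREPRO_HSA_LEN = 609    # prepro-protein as expressed
MATURE_HSA_LEN = 585    # mature protein found in serum
PRO_PEPTIDE_LEN = 6     # the "additional six amino acid residues"

# Total number of residues removed during processing.
removed_total = PREPRO_HSA_LEN - MATURE_HSA_LEN

# Inferred, not stated in the text: length of the amino-terminal peptide.
amino_terminal_peptide_len = removed_total - PRO_PEPTIDE_LEN

print(removed_total, amino_terminal_peptide_len)
```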
HSA is used to expand blood volume and raise low blood protein levels in cases of shock, trauma, and post-surgical recovery. HSA is often administered in emergency situations to stabilize blood pressure.
Because of the key role of HSA as an osmotic stabilizing agent, and because of its broad clinical potential in plasma replacement therapy, there has been an active interest in recombinant synthesis of HSA for human therapeutic use. This approach has not been satisfactory for HSA produced by microbial or mammalian recombinant methods, for the reasons discussed above.
Subtilisin BPN'
Subtilisin BPN' (BPN') is an important industrial enzyme, particularly for use as a detergent enzyme. Several groups have reported amino acid substitution modifications of the enzyme that are effective in enhancing the activity, pH optimum, stability and/or therapeutic use of the enzyme.
BPN' is expressed as a 381 amino acid preproenzyme, including a 35 amino acid sequence required for secretion and a 77 amino acid moiety which serves as a chaperone to facilitate folding.
Studies indicate that the pro moiety acts in trans outside of cells.
To date, large-scale production of BPN' is predominantly by microbial fermentation, which has relatively high costs associated with it. In addition, the enzyme tends to auto-degrade at optimal fermentation growth-medium conditions.
Summary of the Invention
In one aspect, the invention includes a method of producing, in monocot plant cells, a mature heterologous protein selected from the group consisting of (i) mature, glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT; (ii) mature, glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; (iii) mature human serum albumin (HSA) having the same N-terminal amino acid sequence as mature HSA produced in humans and having the folding pattern of native mature HSA as evidenced by its bilirubin-binding characteristics; and (iv) mature, active subtilisin BPN', glycosylated or non-glycosylated, having the same N-terminal amino acid sequence as BPN' produced in Bacillus.
The method includes obtaining monocot cells transformed with a chimeric gene having (i) a monocot transcriptional regulatory region, inducible by addition or removal of a small molecule, or during seed maturation, (ii) a first DNA sequence encoding the heterologous protein, and (iii) a second DNA sequence encoding a signal peptide. The second DNA sequence is operably linked to the transcriptional regulatory region and to the first DNA sequence. The first DNA sequence is in translation-frame with the second DNA sequence, and the two sequences encode a fusion protein.
The transformed cells are cultivated under conditions effective to induce the transcriptional regulatory region, thereby promoting expression of the fusion protein and secretion of the mature heterologous protein from the transformed cells. The mature heterologous protein produced by the transformed cells is then isolated.
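The arrangement of the chimeric gene described above (inducible regulatory region, signal peptide coding sequence, and heterologous protein coding sequence joined in translation-frame) can be sketched as a small data structure. This is a minimal illustration, not the construct of the invention: the class name, the frame checks, and the short placeholder sequences are all assumptions made for the example.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA sequence into consecutive 3-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


@dataclass
class ChimericGene:
    promoter: str      # name of the inducible transcriptional regulatory region
    signal_cds: str    # "second DNA sequence": signal peptide coding region
    protein_cds: str   # "first DNA sequence": heterologous protein coding region

    def fusion_cds(self):
        """Return the in-frame signal peptide/protein fusion coding sequence."""
        signal = self.signal_cds.upper()
        protein = self.protein_cds.upper()
        if len(signal) % 3 or len(protein) % 3:
            raise ValueError("each coding sequence must be a whole number of codons")
        if not signal.startswith("ATG"):
            raise ValueError("signal peptide coding sequence must start with ATG")
        # A stop codon in the signal region would end translation before the protein.
        if any(c in STOP_CODONS for c in codons(signal)):
            raise ValueError("stop codon inside the signal peptide coding sequence")
        # Only the final codon of the protein region may be a stop codon.
        if any(c in STOP_CODONS for c in codons(protein)[:-1]):
            raise ValueError("premature stop codon in the protein coding sequence")
        return signal + protein


# Placeholder sequences for illustration only; they are not SEQ ID NOs from the text.
gene = ChimericGene(promoter="RAmy3D", signal_cds="ATGGCTGCT", protein_cds="GAGGATCCTTAA")
print(gene.fusion_cds())
```

A signal sequence whose length is not a multiple of three would shift the reading frame of the downstream protein, which is why the sketch rejects it before joining the two parts.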
In one embodiment of the method, the first DNA sequence encodes pro-subtilisin BPN' (proBPN'), the cultivating includes cultivating the transformed cells at a pH between 5 and 6, and the isolating step includes incubating the proBPN' under conditions effective to allow its autoconversion to active mature BPN'. In another embodiment, the first DNA sequence encodes mature BPN', and the cells are transformed with a second chimeric gene containing (i) a transcriptional regulatory region inducible by addition or removal of a small molecule, (ii) a third DNA sequence encoding the pro-peptide moiety of BPN', and (iii) a fourth DNA sequence encoding a signal polypeptide. The fourth DNA sequence is operably linked to the transcriptional regulatory region and to the third DNA sequence, and the signal polypeptide is in translation-frame with the pro-peptide moiety and is effective to facilitate secretion of expressed pro-peptide moiety from the transformed cells. The cultivating step includes cultivating the transformed cells at a pH between and 6, and the isolating step includes incubating the mature BPN' and the pro-moiety under conditions effective to allow the conversion of BPN' by the pro-moiety to active mature BPN'.
In another embodiment of the method, the signal peptide is the RAmy3D signal peptide (SEQ ID NO:1) or the RAmy1A signal peptide (SEQ ID NO:4). The coding sequence of the signal peptide may be a codon-optimized sequence, such as the codon-optimized RAmy3D sequence identified as SEQ ID NO:3. The first DNA sequence may also be codon-optimized. Exemplary codon-optimized signal peptide-heterologous protein fusion protein coding sequences include 3D-AAT (SEQ ID NO:18), 3D-ATIII (SEQ ID NO:19), and 3D-HSA (SEQ ID NO:20). The first DNA sequence may further contain codon substitutions which eliminate one or more potential glycosylation sites present in the native amino acid sequence of the heterologous protein, such as the codon-optimized sequence encoding 3D-proBPN' (SEQ ID NO:21).
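Codon optimization, as referred to above, replaces codons with synonymous codons so that the encoded amino acid sequence is unchanged. The following sketch shows the idea using the standard genetic code. The preference table is a hypothetical placeholder chosen for illustration; it is not measured rice codon usage and is not the substitution scheme used to derive SEQ ID NO:3.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# HYPOTHETICAL preferred codons, for illustration only (not real usage data).
PREFERRED = {"A": "GCC", "L": "CTC", "S": "TCC"}


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence must be a whole number of codons")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonymous codon, if one is given."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence must be a whole number of codons")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Guard against a preference table that would change the protein.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)


native = "ATGGCTTTATCT"          # placeholder sequence encoding M-A-L-S
optimized = recode(native, PREFERRED)
print(optimized, translate(optimized))
```

The check inside `recode` makes the key property explicit: the recoded sequence must translate to the same protein as the native one.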
In other embodiments of the method, the transcriptional regulatory region may be a promoter derived from a rice or barley α-amylase gene, including RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, or HV18. The chimeric gene may further include, between the transcriptional regulatory region and the fusion protein coding sequence, the 5' untranslated region (5' UTR) of an inducible monocot gene such as one of the rice or barley α-amylase genes described above. One preferred 5' UTR is that from the RAmy1A gene, which is effective to enhance the stability of the gene transcript. The chimeric gene may further include, downstream of the coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the rice or barley α-amylase genes mentioned above. One preferred 3' UTR is from the RAmy1A gene.
Where the method is employed in protein production in a monocot cell culture, preferred promoters are the RAmy3D and RAmy3E gene promoters, which are upregulated by sugar depletion in cell culture. Where the gene is employed in protein production in germinating seeds, a preferred promoter is the RAmy1A gene promoter, which is upregulated by gibberellic acid during seed germination. Where the gene is upregulated during seed maturation, a preferred promoter is the barley endosperm-specific B1-hordein promoter.
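The pairing of production setting with preferred promoter described above can be summarized as a lookup table. This is only a restatement of the paragraph in code form; the dictionary layout, key names, and function name are illustrative assumptions.

```python
# Production setting -> preferred promoters and inducing condition, as paired in the text.
PROMOTER_GUIDE = {
    "cell culture": {
        "promoters": ("RAmy3D", "RAmy3E"),
        "induced_by": "sugar depletion",
    },
    "germinating seed": {
        "promoters": ("RAmy1A",),
        "induced_by": "gibberellic acid",
    },
    "seed maturation": {
        "promoters": ("barley B1-hordein",),
        "induced_by": "seed maturation (endosperm-specific)",
    },
}


def preferred_promoters(setting):
    """Return the preferred promoters named in the text for a production setting."""
    try:
        return PROMOTER_GUIDE[setting]["promoters"]
    except KeyError:
        raise ValueError(f"unknown production setting: {setting!r}") from None


for setting, entry in PROMOTER_GUIDE.items():
    print(setting, entry["promoters"], entry["induced_by"])
```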
The invention also includes a mature heterologous protein produced by the above method. The protein has a glycosylation pattern characteristic of the monocot plant in which the protein is produced. The glycosylated protein is selected from the group consisting of (i) mature glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and having a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT; (ii) mature glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; and (iii) mature glycosylated subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus.
The invention also includes plant cells and seeds capable of producing the mature heterologous proteins according to the above method.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
These and other features of the invention will be more fully understood when the following detailed description of the invention is read in conjunction with the accompanying drawings.
Brief Description of the Figures
Fig. 1 shows, in the lower row, the amino acid sequence of a RAmy3D signal sequence portion employed in the invention, identified as SEQ ID NO:1; in the middle row, the corresponding native coding sequence, identified as SEQ ID NO:2; and in the upper row, a corresponding codon-optimized sequence, identified as SEQ ID NO:3;
Fig. 2 illustrates the components of a chimeric gene constructed in accordance with an embodiment of the invention;
Figs. 3A and 3B illustrate the construction of an exemplary transformation vector for use in transforming a monocot plant, for production of a mature protein in cell culture in accordance with one embodiment of the invention (native mature AAT coding sequence under control of the RAmy3D promoter and signal sequence);
Fig. 4 illustrates factors in the metabolic regulation of AAT production in rice cell culture;
Fig. 5 shows immunodetection of AAT using antibody raised against the C-terminal region of AAT;
Fig. 6 shows Western blot analysis of AAT produced by transformed rice cell lines 18F, 11B, and 27F;
Fig. 7 shows the time course of elastase:AAT complex formation in human and rice-produced forms of AAT;
Fig. 8 shows an N-terminal sequence for mature α1-antitrypsin (AAT) produced in accordance with the invention, identified herein as SEQ ID NO:22;
Fig. 9 shows a Western blot of ATIII produced in accordance with the invention;
Fig. 10 shows a Western blot of plant-produced BPN', comparing expression from codon-optimized and native coding sequences;
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) vs. BPN' native (AP101) expression in rice callus cell culture; and
Fig. 12 shows a Western blot of HSA produced in germinating seeds in accordance with the invention.
Brief Description of the Sequences
SEQ ID NO:1 is the amino acid sequence of the RAmy3D signal peptide;
SEQ ID NO:2 is the native sequence encoding the RAmy3D signal peptide;
SEQ ID NO:3 is a codon-optimized sequence encoding the RAmy3D signal peptide;
SEQ ID NO:4 is the amino acid sequence of the RAmy1A signal peptide;
SEQ ID NO:5 is the 5' UTR derived from the RAmy1A gene;
SEQ ID NO:6 is the 3' UTR derived from the RAmy1A gene;
SEQ ID NO:7 is the amino acid sequence of mature α1-antitrypsin (AAT);
SEQ ID NO:8 is the native DNA coding sequence of mature AAT;
SEQ ID NO:9 is the amino acid sequence of mature antithrombin III (ATIII);
SEQ ID NO:10 is the native DNA coding sequence of mature ATIII;
SEQ ID NO:11 is the amino acid sequence of mature human serum albumin (HSA);
SEQ ID NO:12 is the native DNA coding sequence of mature HSA;
SEQ ID NO:13 is the amino acid sequence of native proBPN';
SEQ ID NO:14 is the native DNA coding sequence of proBPN';
SEQ ID NO:15 is the amino acid sequence of the "pro" moiety of BPN';
SEQ ID NO:16 is the amino acid sequence of native mature BPN';
SEQ ID NO:17 is the amino acid sequence of a mature BPN' variant in which all potential N-glycosylation sites are removed according to Table 2;
SEQ ID NO:18 is a codon-optimized sequence encoding the RAmy3D signal sequence/mature α1-antitrypsin fusion protein;
SEQ ID NO:19 is a sequence encoding the RAmy3D signal sequence/mature antithrombin III fusion protein, with a codon-optimized RAmy3D coding sequence fused to the native mature ATIII coding sequence;
SEQ ID NO:20 is a sequence encoding the RAmy3D signal sequence/mature human serum albumin fusion protein, with a codon-optimized RAmy3D coding sequence fused to the native mature HSA coding sequence;
SEQ ID NO:21 is a codon-optimized sequence encoding the RAmy3D signal sequence/prosubtilisin BPN' fusion protein;
SEQ ID NO:22 is the N-terminal sequence of mature α1-antitrypsin produced in accordance with the invention;
SEQ ID NO:23 is an oligonucleotide used to prepare the intermediate p3DProSig construct of Example 1;
SEQ ID NO:24 is the complement of SEQ ID NO:23;
SEQ ID NO:25 is an oligonucleotide used to prepare the intermediate p3DProSigENDlink construct of Example 1;
SEQ ID NO:26 is the complement of SEQ ID;
SEQ ID NO:27 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:28 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:29 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:30 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:31 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:32 is one of six oligonucleotides used to prepare the intermediate p1AProSig construct of Example 1;
SEQ ID NO:33 is the N-terminal primer used to PCR-amplify the AAT coding sequence according to Example 1; and
SEQ ID NO:34 is the C-terminal primer used to PCR-amplify the AAT coding sequence according to Example 1.
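The BPN' variant listed above has its potential N-glycosylation sites removed. The text in this section does not say how such sites were identified, so the sketch below assumes the commonly used sequon rule (Asn-X-Ser/Thr, where X is any residue except proline). The function name and the test peptide are hypothetical, and the peptide is not taken from any SEQ ID NO.

```python
import re

# Assumed rule: N-X-S/T with X != P. The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in potential N-glycosylation sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide for illustration only.
peptide = "MNGSANPTNNTS"
print(find_sequons(peptide))

# Replacing the Asn of a sequon (here with Gln) removes that potential site.
variant = peptide[:1] + "Q" + peptide[2:]
print(find_sequons(variant))
```

In the example peptide, the Asn at position 6 is followed by proline and is therefore not reported, while the adjacent Asn residues at positions 9 and 10 each begin a sequon.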
Detailed Description of the Invention
I. Definitions
The terms below have the following meaning, unless indicated otherwise in the specification.
"Cell culture" refers to cells and cell clusters, typically callus cells, growing on or suspended in a suitable growth medium.
"Germination" refers to the breaking of dormancy in a seed and the resumption of metabolic activity in the seed, including the production of enzymes effective to break down starches in the seed endosperm.
"Inducible" refers to a promoter that is upregulated by the presence or absence of a small molecule. It includes both indirect and direct inducement.
"Inducible during germination" refers to promoters which are substantially silent but not totally silent prior to germination but are turned on substantially (greater than 25%) during germination and development in the seed. Examples of promoters that are inducible during germination are presented below.
"Small molecules", in the context of promoter induction, are typically small organic or bioorganic molecules less than about 1 kDal. Examples of such small molecules include sugars, sugar-derivatives (including phosphate derivatives), and plant hormones (such as gibberellic or abscisic acid).
"Specifically regulatable" refers to the ability of a small molecule to preferentially affect transcription from one promoter or group of promoters (e.g., the α-amylase gene family), as opposed to non-specific effects, such as enhancement or reduction of global transcription within a cell by a small molecule.
"Seed maturation" or "grain development" refers to the period starting with fertilization in which metabolizable reserves, e.g., sugars, oligosaccharides, starch, phenolics, amino acids, and proteins, are deposited, with and without vacuole targeting, to various tissues in the seed (grain), e.g., endosperm, testa, aleurone layer, and scutellar epithelium, leading to grain enlargement, grain filling, and ending with grain desiccation.
"Inducible during seed maturation" refers to promoters which are turned on substantially (greater than 25%) during seed maturation.
"Heterologous DNA" or "foreign DNA" refers to DNA which has been introduced into plant cells from another source, or which is from a plant source, including the same plant source, but which is under the control of a promoter or terminator that does not normally regulate expression of the heterologous DNA.
"Heterologous protein" is a protein, including a polypeptide, encoded by a heterologous DNA. A "transcription regulatory region" or "promoter" refers to nucleic acid sequences that influence and/or promote initiation of transcription. Promoters are typically considered to include regulatory regions, such as enhancer or inducer elements.
A "chimeric gene," in the context of the present invention, typically comprises a promoter sequence operably linked to DNA sequence that encodes a heterologous gene product, a selectable marker gene or a fusion protein gene. A chimeric gene may also contain further transcription regulatory elements, such as transcription termination signals, as well as translation regulatory signals, such as, termination codons.
"Operably linked" refers to components of a chimeric gene or an expression cassette that function as a unit to express a heterologous protein. For example, a promoter operably linked to a heterologous DNA, which encodes a protein, promotes the production of functional mRNA corresponding to the heterologous DNA.
A "product" encoded by a DNA molecule includes, for example, RNA molecules and polypeptides.
"Removal" in the context of a metabolite includes both physical removal as by washing and the depletion of the metabolite through the absorption and metabolizing of the metabolite by the cells.
"Substantially isolated" is used in several contexts and typically refers to the at least partial purification of a protein or polypeptide away from unrelated or contaminating components.
Methods and procedures for the isolation or purification of proteins or polypeptides are known in the art.
"Stably transformed" as used herein refers to a cereal cell or plant that has foreign nucleic acid stably integrated into its genome which is transmitted through multiple generations.
"α1-antitrypsin" or "AAT" refers to the protease inhibitor which has an amino acid sequence substantially identical or homologous to the AAT protein identified by SEQ ID NO:7.
"Antithrombin III" or "ATIII" refers to the heparin-activated inhibitor of thrombin and factor Xa, and which has an amino acid sequence substantially identical or homologous to the ATIII protein identified by SEQ ID NO:9.
"Human serum albumin" or "HSA" refers to a protein which has an amino acid sequence substantially identical or homologous to the mature HSA protein identified by SEQ ID NO:11.
"Subtilisin" or "subtilisin BPN'" or "BPN'" refers to the protease enzyme produced naturally by B. amyloliquefaciens, and having the sequence of SEQ ID NO:16, or a sequence homologous therewith.
"proBPN'" refers to a form of BPN' having an approximately 78 amino-acid "pro" moiety that functions as a chaperon polypeptide to assist in folding and activation of the BPN', and having the sequence in SEQ ID NO:13, or a sequence homologous therewith.
"Codon optimization" refers to changes in the coding sequence of a gene to replace native codons with those corresponding to optimal codons in the host plant.
A DNA sequence is "derived from" a gene, such as a rice or barley α-amylase gene, if it corresponds in sequence to a segment or region of that gene. Segments of genes which may be derived from a gene include the promoter region, the 5' untranslated region, and the 3' untranslated region of the gene.
II. Transformed plant cells
The plants used in the process of the present invention are derived from monocots, particularly the members of the taxonomic family known as the Gramineae. This family includes all members of the grass family of which the edible varieties are known as cereals. The cereals include a wide variety of species such as wheat (Triticum sps.), rice (Oryza sps.), barley (Hordeum sps.), oats (Avena sps.), rye (Secale sps.), corn (Zea sps.) and millet (Pennisetum sps.). In the present invention, preferred family members are rice and barley.
Plant cells or tissues derived from the members of the family are transformed with expression constructs (e.g., plasmid DNA into which the gene of interest has been inserted) using a variety of standard techniques (e.g., electroporation, protoplast fusion or microparticle bombardment). The expression construct includes a transcription regulatory region (promoter) whose transcription is specifically upregulated by the presence or absence of a small molecule, such as the reduction or depletion of sugar, e.g., sucrose, in culture medium, or in plant tissues, e.g., germinating seeds. In the present invention, particle bombardment is the preferred transformation procedure.
The construct also includes a gene encoding a mature heterologous protein in a form suitable for secretion from plant cells. The gene encoding the recombinant heterologous protein is placed under the control of a metabolically regulated promoter. Metabolically regulated promoters are those in which mRNA synthesis, or transcription, is repressed or upregulated by a small metabolite or hormone molecule, such as the rice RAmy3D and RAmy3E promoters, which are upregulated by sugar-depletion in cell culture. For protein production in germinating seeds from regenerated transgenic plants, a preferred promoter is the RAmy1A promoter, which is up-regulated by gibberellic acid during seed germination. The expression construct also utilizes additional regulatory DNA sequences, e.g., preferred codons and termination sequences, to promote efficient translation of AAT, as will be described.
A. Plant Expression Vector
Expression vectors for use in the present invention comprise a chimeric gene (or expression cassette), designed for operation in plants, with companion sequences upstream and downstream from the expression cassette. The companion sequences will be of plasmid or viral origin and provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein. Suitable components of the expression vector, including an inducible promoter, coding sequence for a signal peptide, coding sequence for a mature heterologous protein, and suitable termination sequences are discussed below. One exemplary vector is the p3D(AAT)v1.0 vector illustrated in Figs. 3A and 3B.
A1. Promoters
The transcription regulatory or promoter region is chosen to be regulated in a manner allowing for induction under selected cultivation conditions, e.g., sugar depletion in culture or water uptake followed by gibberellic acid production in germinating seeds. Suitable promoters, and their method of selection, are detailed in above-cited PCT application WO 95/14099. Examples of such promoters include those that transcribe the cereal α-amylase genes and sucrose synthase genes, and are repressed or induced by small molecules, like sugars, sugar depletion or phytohormones such as gibberellic acid or abscisic acid. Representative promoters include the promoters from the rice α-amylase RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E genes, and from the pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 barley α-amylase genes. These promoters are described, for example, in ADVANCES IN PLANT BIOTECHNOLOGY, Ryu, et al., Eds., Elsevier, Amsterdam, 1994, p. 37, and references cited therein. Other suitable promoters include the sucrose synthase and sucrose-6-phosphate-synthetase (SPS) promoters from rice and barley.
Other suitable promoters include promoters which are regulated in a manner allowing for induction under seed-maturation conditions. Examples of such promoters include those associated with the following monocot storage proteins: rice glutelins, oryzins, and prolamines, barley hordeins, wheat gliadins and glutelins, maize zeins and glutelins, oat glutelins, and sorghum kafirins, millet pennisetins, and rye secalins.
A preferred promoter for expression in germinating seeds is the rice α-amylase RAmy1A promoter, which is upregulated by gibberellic acid. Preferred promoters for expression in cell culture are the rice α-amylase RAmy3D and RAmy3E promoters which are strongly upregulated by sugar depletion in the culture. These promoters are also active during seed germination. A preferred promoter for expression in maturing seeds is the barley endosperm-specific B1-hordein promoter (Brandt, et al., (1985) Carlsberg Res. Commun. 50:333-345).
The chimeric gene may further include, between the promoter and coding sequences, the 5' untranslated region (5' UTR) of an inducible monocot gene, such as the 5' UTR derived from one of the rice or barley α-amylase genes mentioned above. One preferred 5' UTR is that derived from the RAmy1A gene, which is effective to enhance the stability of the gene transcript. This 5' UTR has the sequence given by SEQ ID NO:5 herein.
A2. Signal Sequences
In addition to encoding the protein of interest, the chimeric gene encodes a signal sequence (or signal peptide) that allows processing and translocation of the protein, as appropriate. Suitable signal sequences are described in above-referenced PCT application WO 95/14099. One preferred signal sequence is identified as SEQ ID NO:1 and is derived from the RAmy3D promoter. Another preferred signal sequence is identified as SEQ ID NO:4 and is derived from the RAmy1A promoter.
The plant signal sequence is placed in frame with a heterologous nucleic acid encoding a mature protein, forming a construct which encodes a fusion protein having an N-terminal region corresponding to the signal peptide and, immediately adjacent to the C-terminal amino acid of the signal peptide, the N-terminal amino acid of the mature heterologous protein. The expressed fusion protein is subsequently secreted and processed by signal peptidase cleavage precisely at the junction of the signal peptide and the mature protein, to yield the mature heterologous protein.
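The in-frame arrangement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `split_codons` and `fuse_signal_to_mature` and the short DNA strings in the usage note are hypothetical, and none of them are sequences or methods taken from the sequence listing or the Examples.

```python
# Illustrative sketch only: names and sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into codons; reject lengths that break the frame."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_signal_to_mature(signal_cds, mature_cds):
    """Join a signal-peptide coding sequence directly to a mature-protein one.

    Returns the fused coding sequence and the 0-based codon index of the
    first mature-protein codon, i.e. the position immediately following the
    final (C-terminal) codon of the signal sequence.
    """
    signal = split_codons(signal_cds)
    mature = split_codons(mature_cds)
    # A stop codon inside the signal sequence would truncate the fusion.
    if any(codon in STOP_CODONS for codon in signal):
        raise ValueError("signal sequence must not contain a stop codon")
    return "".join(signal + mature), len(signal)
```

For example, `fuse_signal_to_mature("ATGAAGAACACC", "GAGGACCCGTAA")` joins a made-up four-codon signal to a made-up mature sequence, so the first mature codon sits at codon index 4 with no intervening linker codons.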
In another embodiment of the invention, the coding sequence in the fusion protein gene, in at least the coding region for the signal sequence, may be codon-optimized for optimal expression in plant cells, e.g., rice cells, as described below. The upper row in Fig. 1 shows one codon-optimized coding sequence for the RAmy3D signal sequence, identified herein as SEQ ID NO:3.
A3. Naturally-Occurring Heterologous Protein Coding Sequences
(i) α1-Antitrypsin: Mature human AAT is composed of 394 amino acids, having the sequence identified herein as SEQ ID NO:7. The protein has N-glycosylation sites at asparagines 46, 83 and 247. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:8.
(ii) Antithrombin III: Mature human ATIII is composed of 432 amino acids, having the sequence identified herein as SEQ ID NO:9. The protein has N-glycosylation sites at the four asparagine residues 96, 135, 155, and 192. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:10.
(iii) Human serum albumin: Mature HSA as found in human serum is composed of 585 amino acids, having the sequence identified herein as SEQ ID NO:11. The protein has no N-linked glycosylation sites. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:12.
(iv) Subtilisin BPN': Native proBPN' as produced in B. amyloliquefaciens is composed of 352 amino acids, having the sequence identified herein as SEQ ID NO:13. The corresponding native DNA coding sequence is identified herein as SEQ ID NO:14. The proBPN' polypeptide contains a 77 amino acid "pro" moiety which is identified herein as SEQ ID NO:15. The remainder of the polypeptide, which forms the mature active BPN', is a 275 amino acid sequence identified herein by SEQ ID NO:16. Native BPN' as produced in Bacillus is not glycosylated.
A4. Codon-Optimized Coding Sequences
In accordance with one aspect of the invention, it has been discovered that a severalfold enhancement of expression level can be achieved in plant cell culture by modifying the native coding sequence of a heterologous gene to contain predominantly or exclusively the highest-frequency codons found in the plant cell host.
The method will be illustrated for expression of a heterologous gene in rice plant cells, it being recognized that the method is generally applicable to any monocot. As a first step, a representative set of known gene coding sequences from rice is assembled. The sequences are then analyzed for codon frequency for each amino acid, and the most frequent codon is selected for each amino acid. This approach differs from earlier reported codon matching methods, in which more than one frequent codon is selected for at least some of the amino acids. The optimal codons selected in this manner for rice and barley are shown in Table 1.
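The codon-selection step just described (tally codon usage per amino acid across a set of host coding sequences, then keep the single most frequent codon) can be sketched as follows. This is a minimal illustration, not the procedure actually used to derive Table 1: the function names are hypothetical, the reference sequences in the usage note are made up, the standard genetic code is assumed, and codons are handled as DNA (T in place of U).

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds):
    """Normalise to upper-case DNA and split into complete codons."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def preferred_codons(reference_cds_list):
    """Map each amino acid ('*' = stop) to its most frequent codon."""
    usage = defaultdict(Counter)
    for cds in reference_cds_list:
        for codon in split_codons(cds):
            aa = CODON_TABLE.get(codon)
            if aa is not None:  # skip codons with ambiguous bases
                usage[aa][codon] += 1
    return {aa: counts.most_common(1)[0][0] for aa, counts in usage.items()}


def recode(cds, preferred):
    """Replace every codon with the preferred synonymous codon, if one is known."""
    out = []
    for codon in split_codons(cds):
        aa = CODON_TABLE[codon]
        out.append(preferred.get(aa, codon))
    return "".join(out)


def translate(cds):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))
```

With the toy reference set `["ATGGCCGCCGCTAAGTAA", "ATGGCCAAGAAAAAGTGA"]`, `preferred_codons` selects GCC for Ala and AAG for Lys, and `recode("ATGGCTAAA", ...)` then rewrites the sequence to use those codons while leaving the encoded peptide unchanged. Selecting exactly one codon per amino acid mirrors the single-codon choice described in the text; ties, which a real data set may contain, are not given any special handling in this sketch.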
Table 1
Amino Acid    Rice Preferred Codon    Barley Preferred Codon
Ala  A                    GCC
Arg  R                    CGC
Asn  N                    AAC
Asp  D                    GAC
Cys  C                    UGC
Gln  Q                    CAG
Glu  E                    GAG
Gly  G                    GGC
His  H                    CAC
Ile  I                    AUC
Leu  L                    CUC
Lys  K                    AAG
Phe  F                    UUC
Pro  P        CCG                     CCC
Ser  S        AGC                     UCC
Thr  T                    ACC
Tyr  Y                    UAC
Val  V        GUC                     GUG
stop          UAA                     UGA
As indicated above, the fusion protein coding sequence in the chimeric gene is constructed such that the final (C-terminal) codon in the signal sequence is immediately followed by the codon for the N-terminal amino acid in the mature form of the heterologous protein. Exemplary fusion protein genes, in accordance with the present invention, are identified herein as follows:
SEQ ID NO:18, corresponding to codon-optimized coding sequences of the fusion protein consisting of RAmy3D signal sequence/mature α1-antitrypsin;
SEQ ID NO:19, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature antithrombin III sequence;
SEQ ID NO:20, corresponding to the fusion protein coding sequence consisting of the codon-optimized RAmy3D signal sequence and the native mature human serum albumin sequence;
SEQ ID NO:21, corresponding to codon-optimized coding sequence of the fusion protein RAmy3D signal sequence/prosubtilisin BPN'. In this instance, prosubtilisin is considered the "mature" protein, in that secreted prosubtilisin can autocatalyze to active, mature subtilisin.
In a preferred embodiment, the BPN' coding sequence is further modified to eliminate potential N-glycosylation sites, as native BPN' is not glycosylated. Table 2 illustrates preferred codon substitutions, which eliminate all potential N-glycosylation sites in subtilisin BPN'. SEQ ID NO:17 corresponds to a mature BPN' amino acid sequence containing the substitutions presented in Table 2.
Table 2
N-Glycosylation Site    Location of Asn (in mature protein)    Amino Acid Substitution
Asn Asn Ser             61                                     Thr Asn Ser
Asn Asn Ser             76                                     Thr Asn Ser
Asn Met Ser             123                                    Thr Met Ser
Asn Gly Thr             218                                    Ser Gly Thr¹
Asn Trp Thr             240                                    Thr Trp Thr
¹ improved thermostability; Bryan, et al., Proteins: Structure, Function, and Genetics 1:326 (1986).
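The sites in Table 2 all follow the Asn-X-Ser/Thr pattern, so locating and removing such sites can be sketched programmatically. The snippet below is an illustration only: it assumes the general N-X-S/T consensus with X not proline (a widely used rule, not one stated in this specification), the function names are hypothetical, and the peptide in the usage note is invented rather than taken from SEQ ID NO:16 or SEQ ID NO:17.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width beyond the Asn, so overlapping
# sites (e.g. Asn-Asn-Ser-Thr) are each reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of Asn residues in potential N-glycosylation sites."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


def substitute(protein, changes):
    """Apply point substitutions given as {1-based position: new residue}."""
    residues = list(protein.upper())
    for position, new_residue in changes.items():
        residues[position - 1] = new_residue
    return "".join(residues)
```

On the made-up peptide `"AQNNSGMNGTW"`, `find_sequons` reports positions 3 and 8; replacing Asn 3 with Thr and Asn 8 with Ser (the same kinds of replacement shown in Table 2) yields `"AQTNSGMSGTW"`, in which no potential site remains.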
Transcription and Translation Terminators
The chimeric gene may also include, downstream of the coding sequence, the 3' untranslated region (3' UTR) from an inducible monocot gene, such as one of the rice or barley α-amylase genes mentioned above. One preferred 3' UTR is that derived from the RAmy1A gene, whose sequence is given by SEQ ID NO:6. This sequence includes non-coding sequence 5' to the polyadenylation site, the polyadenylation site, and the transcription termination sequence. The transcriptional termination region may be selected, particularly for stability of the mRNA to enhance expression. Polyadenylation tails (Alber and Kawasaki, 1982, Mol. and Appl. Genet. 1:419-434) are also commonly added to the expression cassette to optimize high levels of transcription and proper transcription termination, respectively. Polyadenylation sequences include but are not limited to the Agrobacterium octopine synthetase signal (Gielen, et al., EMBO J. 3:835-846 (1984)) or the nopaline synthase of the same species (Depicker, et al., Mol. Appl. Genet. 1:561-573 (1982)).
Since the ultimate expression of the heterologous protein will be in a eukaryotic cell (in this case, a member of the grass family), it is desirable to determine whether any portion of the cloned gene contains sequences which will be processed out as introns by the host's splicing machinery. If so, site-directed mutagenesis of the "intron" region may be conducted to prevent losing a portion of the genetic message as a false intron code (Reed and Maniatis, Cell 41:95-105 (1985)).
Fig. 2 shows the elements of one preferred chimeric gene constructed in accordance with the invention, and intended particularly for use in protein expression in a rice cell suspension culture. The gene includes, in a 5' to 3' direction, the promoter from the RAmy3D gene, which is inducible in cell culture with sugar depletion, the 5' UTR from the RAmy1A gene, which confers enhanced stability on the gene transcript, the RAmy3D signal sequence coding region, as identified above, the coding region of a heterologous protein to be produced, and a 3' UTR region from the RAmy1A gene.
III. Plant Transformation
For transformation of plants, the chimeric gene is placed in a suitable expression vector designed for operation in plants. The vector includes suitable elements of plasmid or viral origin that provide necessary characteristics to the vector to permit the vectors to move DNA from bacteria to the desired plant host. Suitable transformation vectors are described in related application PCT WO 95/14099, published May 25, 1995, which is incorporated by reference herein. Suitable components of the expression vector, including the chimeric gene described above, are discussed below. One exemplary vector is the p3Dv1.0 vector described in Example 1.
A. Transformation Vector
Vectors containing a chimeric gene of the present invention may also include selectable markers for use in plant cells (such as the nptII kanamycin resistance gene, for selection in kanamycin-containing medium, or the phosphinothricin acetyltransferase gene, for selection in medium containing phosphinothricin (PPT)).
The vectors may also include sequences that allow their selection and propagation in a secondary host, such as sequences containing an origin of replication and a selectable marker such as antibiotic or herbicide resistance genes, e.g., HPH (Hagio et al., Plant Cell Reports 14:329 (1995); van der Elzer, Plant Mol. Biol. 5:299-302 (1985)). Typical secondary hosts include bacteria and yeast. In one embodiment, the secondary host is Escherichia coli, the origin of replication is a colE1-type, and the selectable marker is a gene encoding ampicillin resistance. Such sequences are well known in the art and are commercially available as well (e.g., Clontech, Palo Alto, CA; Stratagene, La Jolla, CA).
The vectors of the present invention may also be modified to intermediate plant transformation plasmids that contain a region of homology to an Agrobacterium tumefaciens vector, a T-DNA border region from Agrobacterium tumefaciens, and chimeric genes or expression cassettes (described above). Further, the vectors of the invention may comprise a disarmed plant tumor inducing plasmid of Agrobacterium tumefaciens.
The vector described in Example 1, and having a promoter from the RAmy3D gene, is suitable for use in a method of mature protein production in cell culture, where the RAmy3D promoter is induced by sugar depletion in cell culture medium. Other promoters may be selected for other applications, as indicated above. For example, for mature protein expression in germinating seeds, the coding sequence may be placed under the control of the rice α-amylase RAmy1A promoter, which is inducible by gibberellic acid during seed germination.
B. Transformation of plant cells
Various methods for direct or vectored transformation of plant cells, e.g., plant protoplast cells, have been described in above-cited PCT application WO 95/14099. As noted in that reference, promoters directing expression of selectable markers used for plant transformation (e.g., nptII) should operate effectively in plant hosts. One such promoter is the nos promoter from native Ti plasmids (Herrera-Estrella, et al., Nature 303:209-213 (1983)). Others include the 35S and 19S promoters of cauliflower mosaic virus (Odell, et al., Nature 313:810-812 (1985)) and the 2' promoter (Velten, et al., EMBO J. 1:2723-2730 (1984)).
In one preferred embodiment, the embryo and endosperm of mature seeds are removed to expose scutellum tissue cells. The cells may be transformed by DNA bombardment or injection, or by vectored transformation, e.g., by Agrobacterium infection after bombarding the scutellar cells with microparticles to make them susceptible to Agrobacterium infection (Bidney et al., Plant Mol. Biol. 18:301-313, 1992).
One preferred transformation follows the methods detailed generally in Sivamani, E. et al., Plant Cell Reports 15:465 (1996); Zhang, et al., Plant Cell Reports 15:465 (1996); and Li, L., et al., Plant Cell Reports 12:250 (1993). Briefly, rice seeds are sterilized by standard methods, and callus induction from the seeds is carried out on NB media with 2,4-D. During a first incubation period, callus tissue forms around the embryo of the seed. By the end of the incubation period (e.g., 14 days at 28°C), the calli are about 0.25 to 0.5 cm in diameter. Callus mass is then detached from the seed, placed on fresh NB media, and incubated again for about 14 days at 28°C. After the second incubation period, satellite calli developed around the original "mother" callus mass. These satellite calli were slightly smaller, more compact and defined than the original tissue. It was these calli that were transferred to fresh media. The "mother" callus was not transferred. The goal was to select only the strongest, most vigorous growing tissue for further culture.
Calli to be bombarded are selected from 14-day-old subcultures. The size, shape, color and density are all important in selecting calli in the optimal physiological condition for transformation. The calli should be between 0.8 and 1.1 mm in diameter. The calli should appear as spherical masses with a rough exterior.
Transformation is by particle bombardment, as detailed in the references cited above. After the transformation steps, the cells are typically grown under conditions that permit expression of the selectable marker gene. In a preferred embodiment, the selectable marker gene is HPH. It is preferred to culture the transformed cells under multiple rounds of selection to produce a uniformly stable transformed cell line.
IV. Cell Culture Production of Mature Heterologous Protein
Transgenic cells, typically callus cells, are cultured under conditions that favor plant cell growth, until the cells reach a desired cell density, then under conditions that favor expression of the mature protein under the control of the given promoter. Preferred culture conditions are described below and in Example 2. Purification of the mature protein secreted into the medium is by standard techniques known by those of skill in the art.
Production of mature AAT: In a preferred embodiment, the culture medium contains a phosphate buffer, e.g., the 20 mM phosphate buffer, pH 6.8, described in Example 2, to reduce AAT degradation catalyzed by metals. Alternatively, or in addition, a metal chelating agent, such as EDTA, may be added to the medium.
Following the cell culture method described in Example 2, cell culture media was partially purified and the fraction containing AAT was analyzed by Western blot, as shown in Fig. 4. The first two lanes ("phosphate") show AAT bands both in the presence and absence of elastase and where the higher molecular weight bands in the presence of elastase correspond roughly to a 58-59 kdal AAT/elastase complex. Also as seen in the figure, expression was high in the absence of sucrose, but nearly undetectable in the presence of sucrose.
To ascertain the degree of glycosylation (as determined by apparent molecular weight by SDS-PAGE) the protein produced in culture was fractionated by SDS-PAGE and immunodetected with a labeled antibody raised against the C-terminal portion of AAT, as shown in Fig. 5. Lane 4 contains human AAT, and its migration position corresponds to about 52 kdal. In lane 3 is the plant-produced AAT, having an apparent molecular weight of about 49-50 kdal, indicating an extent of glycosylation of up to 60-80% of the glycosylation found in human AAT (non-glycosylated AAT has a molecular weight of 45 kdal).
Similar results are shown in the Western blots in Fig. 6. Lanes 1-3 in this figure correspond to decreasing amounts (15, 10, and 5 ng) of human AAT; lane 4, to 10 µl supernatant from a non-expressing plant cell line; lanes 5 and 6, to 10 µl supernatant from AAT-expressing plant cell lines 11B and 27F, respectively; and lane 7, to 10 µl supernatant from cell line 27F plus 250 ng trypsin. The upward mobility shift in lane 7 is indicative of association between trypsin and the plant-produced AAT.
The ability of plant-produced AAT to bind to elastase is demonstrated in Fig. 7, which shows the shift in molecular weight over a 30 minute binding interval for the 52 kdal human AAT (lanes 1-4) and the 49-50 kdal plant-produced AAT.
To demonstrate that the mature protein is produced in secreted form, with the desired N-terminus, a chimeric gene constructed as above, and having the coding sequence for mature α1-antitrypsin, was expressed and secreted in cell culture as described in Example 2. The isolated protein was then sequenced at its N-terminal region, yielding the N-terminal sequence shown in Fig. 8. This sequence, which is identified herein as SEQ ID NO:22, has the same N-terminal residues as native mature α1-antitrypsin.
Production of mature ATIII: In a preferred embodiment, the culture medium contains a MES buffer, pH 6.8. Western blot analysis of the ATIII protein produced, shown in lanes 4 and 6 in Fig. 9, shows a band corresponding to ATIII (lane 1) in cell lines 42 and 46, when grown in the absence (but not in the presence) of sucrose.
Production of mature BPN': In one embodiment of the invention, in which BPN' is secreted as the proBPN' form of the enzyme, the chaperon "pro" moiety of the enzyme facilitates enzyme folding and is cleaved from the enzyme, leaving the active mature form of BPN'. In another embodiment, the mature enzyme is co-expressed and co-secreted with the "pro" chaperon moiety, with conversion of the enzyme to active form occurring in the presence of the free chaperon (Eder et al., Biochem. (1993) 32:18-26; Eder et al., (1993) J. Mol. Biol. 223:293-304). In yet another embodiment of the invention, the BPN' is secreted in inactive form at a pH that may be in the 6-8 range, with subsequent activation of the inactive form, after enzyme isolation, by exposure to the "pro" chaperon moiety, immobilized to a solid support.
In both of these embodiments, the culture medium is maintained at a pH of between 5 and 6, preferably about 5.5 during the period of active expression and secretion of BPN', to keep the BPN', which is normally active at alkaline pH, at a pH below optimal activity.
Codon optimization to the host plant's most frequent codons yielded a severalfold enhancement in the level of expressed heterologous protein in cell culture as shown in Fig. 11. The extent of enhancement is seen from the Western blot analysis shown in Fig. 10 for two cell lines and further substantiated in Fig. 11. Lane 2 (second from left) in Fig. 10 shows a Western blot of BPN' obtained in culture from cells transformed with a native proBPN' coding sequence. Two bands are observed. The lower band corresponds to a protein whose approximately 35 kdal molecular weight matches that of proBPN'. The upper band corresponds to a somewhat higher molecular weight species, possibly glycosylated.
The first lane in the figure shows BPN' polypeptides produced in culture by plant cells transformed with the codon-optimized proBPN' sequence identified by SEQ ID NO:21. For comparative purposes, the same volume of culture medium, adjusted for cell density, was applied in both lanes 1 and 2. As seen, the amount of BPN' enzyme produced with a codon-optimized sequence was severalfold higher than for subtilisin BPN' produced with the native coding sequence. Further, a dark band or bands corresponding to mature peptide (molecular weight 27.5 kdal) was observed. However, it should be noted that directly above the band at 35 kdal is a more pronounced band which may be pro mature product yet to be cleaved into active form.
Fig. 11 compares the specific activity of BPN' codon-optimized (AP106) versus BPN' native (AP101) expression in rice callus cell culture, assayed using the chromogenic peptide substrate suc-Ala-Ala-Pro-Phe-pNA as described by DelMar, E.G. et al. (1979; Anal. Biochem. 99:316-320). As shown in Fig. 11, several of the cell lines transformed with codon-optimized chimeric genes produced levels of BPN', as evidenced by measured specific activity in culture medium, that were 2-5 times the highest levels observed for plant cells transformed with native proBPN' sequence.
In accordance with another aspect of the invention, it has been found that the transformed plant cell culture is able to express and secrete BPN' at a cell culture pH, e.g., pH 5.5, which largely inhibits self-degradation of mature, active BPN'. To assay for optimal pH conditions, the assay disclosed in DelMar, et al. (supra) is used to test the media derived from BPN' transformed cell lines under various pH conditions. Transformed rice callus cells are cultured in a MES medium under similar conditions as disclosed in Example 2, but where the pH of the medium is maintained at a selected pH between 5 and 8.0. At each pH, the total amount of expressed and secreted BPN' is determined by Western blot analysis. BPN' activity can be tested in the assay described by DelMar (supra).
V. Production of Mature Heterologous Protein in Germinating Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, seeds from the plants are harvested and then germinated, and the mature protein is isolated from the germinated seeds.
Plant regeneration from cultured protoplasts or callus tissue is carried out by standard methods, as described in Evans et al., HANDBOOK OF PLANT CELL CULTURES Vol. 1 (MacMillan Publishing Co., New York, 1983); and Vasil I.R., CELL CULTURE AND SOMATIC CELL GENETICS OF PLANTS, Acad. Press, Orlando, Vol. I, 1984, and Vol. III, 1986, and as described in the above-cited PCT application.
A. Seed Germination Conditions
The transgenic seeds obtained from the regenerated plants are harvested and prepared for germination by an initial steeping step, in which the seeds are immersed in or sprayed with water to increase the moisture content of the seed to between 35 and 45%. This initiates germination. Steeping typically takes place in a steep tank, which is typically fitted with a conical end to allow the seed to flow freely out. Compressed air may optionally be added to oxygenate the steeping process. The temperature is controlled at approximately 22°C, depending on the seed.
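As a rough illustration of the steeping target, the following Python sketch estimates how much water a seed lot must take up to reach a chosen moisture content. It assumes that moisture content is expressed on a wet-weight basis and uses a hypothetical starting moisture of 12%; neither assumption is stated in the text, and the function name is illustrative.

```python
def water_uptake_g(seed_mass_g, initial_moisture, target_moisture):
    """Water (g) a seed lot must absorb to reach the target moisture content.

    Moisture contents are fractions on a wet-weight basis (an assumption):
    moisture = water mass / total seed mass.
    """
    dry_matter = seed_mass_g * (1.0 - initial_moisture)
    final_mass = dry_matter / (1.0 - target_moisture)
    return final_mass - seed_mass_g


# Hypothetical lot: 1 kg of seed at 12% moisture, steeped to 40% moisture.
uptake = water_uptake_g(1000.0, 0.12, 0.40)
print(f"water absorbed: {uptake:.1f} g")
```

Because dry matter is conserved during steeping, the calculation only needs the starting mass and the two moisture fractions.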
After steeping, the seeds are transferred to a germination compartment which contains air saturated with water and is under controlled temperature and air flow. Typical temperatures are between 12 and 25°C, and germination is permitted to continue for 3 to 7 days.
Where the heterologous protein coding gene is operably linked to an inducible promoter requiring a metabolite such as sugar or plant hormone, e.g., 2 to 100 µM gibberellic acid, this metabolite is added, removed or depleted from the steeping water medium and/or is added to the water-saturated air used during germination. The seed absorbs the aqueous medium and begins to germinate, expressing the heterologous protein. The medium may then be withdrawn and the malting begun, by maintaining the seeds in a moist, temperature-controlled, aerated environment. In this way, the seeds may begin growth prior to expression, so that the expressed product is less likely to be partially degraded or denatured during the process.
More specifically, the temperature during the imbibition or steeping phase will be maintained in the range of about 15-25°C, while the temperature during the germination will usually be about 20°C. The time for the imbibition will usually be from about 1 to 4 days, while the germination time will usually be an additional 1 to 10 days, more usually 3 to 7 days. Usually, the time for the malting does not exceed about ten days. The period for the malting can be reduced by using plant hormones during the imbibition, particularly gibberellic acid.
To achieve maximum production of recombinant protein from malting, the malting procedure may be modified to accommodate de-hulled and de-embryonated seeds, as described in above-cited PCT application WO 95/14099. In the absence of sugars from the endosperm, there is expected to be a 5- to 10-fold increase in RAmy3D promoter activity and thus in expression of heterologous protein. Alternatively, when embryoless half-seeds are incubated in 10 mM CaCl2 and gibberellic acid at micromolar concentration, there is a 50-fold increase in RAmy1A promoter activity.
Production of mature HSA: Following the germination conditions as outlined above and further detailed in Example 3, supernatant was analyzed by Western blot. Western blot analysis shows production of HSA in germinating rice seeds, with seed samples taken 24, 72, and 120 hours after induction with gibberellin. HSA production was highest approximately 24 hours post-induction (lanes 3 and 4, Fig. 12). Bilirubin binding, a measure of correct folding of plant-produced HSA, is assayed according to the method presented in Example 3.
VI. Production of Mature Heterologous Protein in Maturing Seeds
In this embodiment, monocot cells transformed as above are used to regenerate plants, and seeds from the plants are allowed to mature, typically in the field, with consequent production of heterologous protein in the seeds.
Following seed maturation, the seeds and their heterologous proteins may be used directly, that is, without protein isolation, where for example, the heterologous protein is intended to confer a benefit on the seed as a whole, for example, to enrich the seed in the selected protein.
Alternatively, the seeds may be fractionated by standard methods to obtain the heterologous protein in enriched or purified form. In one general approach, the seed is first milled, then suspended in a suitable extraction medium, an aqueous or an organic solvent, to extract the protein or metabolite of interest. If desired the heterologous protein can be further fractionated and purified, using standard purification methods.
The following examples are provided by way of illustration only and not by way of limitation. Those of skill will readily recognize a variety of noncritical parameters which could be changed or modified to yield essentially similar results.
General Methods
Generally, the nomenclature and laboratory procedures with respect to standard recombinant DNA technology can be found in Sambrook, et al., MOLECULAR CLONING: A LABORATORY MANUAL, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 1989 and in S.B. Gelvin and R.A. Schilperoort, PLANT MOLECULAR BIOLOGY, 1988. Other general references are provided throughout this document. The procedures therein are known in the art and are provided for the convenience of the reader.
Example 1
Construction of a Transforming Vector Containing a Codon-Optimized α1-antitrypsin Sequence
A. Hygromycin Resistance Gene Insertion: The 3 kb BamHI fragment containing the 35S promoter-Hph-NOS was removed from the plasmid pMON410 (Monsanto, St. Louis, MO) and placed into a BglII site introduced by site-directed mutagenesis at position 1463 of pUC18, to form the plasmid pUCH18+.
B. Terminator Insertion: pOSg1ABK5 is a 5 kb BamHI-KpnI fragment from lambda clone λOSg1A (Huang, et al., (1990) Nuc. Acids Res. 18:7007) cloned into pBluescript KS- (Stratagene, San Diego, CA).
Plasmid pOSg1ABK5 was digested with MspI and blunted with T4 DNA polymerase, followed by SpeI digestion. The 350 bp terminator fragment was subcloned into pUC19 (New England BioLabs, Beverly, MA), which had been digested with BamHI, blunted with T4 DNA polymerase and digested with XbaI, to form pUC19/terminator.
C. RAmy3D Promoter Insertion: A 1.1 kb NheI-PstI fragment derived from p1AS1.5 (Huang, N. et al. (1993) Plant Mol. Biol. 23:737-747), was cloned into the vector pGEM5zf- [multiple cloning site (MCS) (Promega, Madison, WI): ApaI, AatII, SphI, NcoI, SstII, EcoRV, SpeI, NotI, PstI, SalI, NdeI, SacI, MluI, NsiI] at the SpeI and PstI sites to form pGEM5zf-(3D/NheI-PstI). pGEM5zf-(3D/NheI-PstI) was then digested with PstI and SacI, and two non-kinased 30mers having the complementary sequences 5' GCTTG ACCTG TAACT CGGGC CAGGC GAGCT 3' (SEQ ID NO:23) and 5' CGCCT AGCCC GAGTT ACAGG TCAAG CAGCT 3' (SEQ ID NO:24) were ligated in to form p3DProSig. The promoter fragment prepared by digesting p3DProSig with NcoI, blunting with T4 DNA polymerase, and digesting with SstI was subcloned into pUC19/terminator, which had been digested with EcoRI, blunted with T4 DNA polymerase and digested with SstI, to form p3DProSigEND.
D. Multiple Cloning Site Insertion: p3DProSigEND was digested with SstI and SmaI, followed by the ligation of a new synthetic linker fragment constructed with the non-kinased complementary oligonucleotides 5' AGCTC CATGG CCGTG GCTCG AGTCT AGACG CGTCC CC 3' (SEQ ID NO:25) and 5' GGGGA CGCGT CTAGA CTCGA GCCAC GGCCA TGG 3' (SEQ ID NO:26) to form p3DProSigENDlink.
E. p3DProSigENDlink Flanking Site Modification: p3DProSigENDlink was digested with SalI and blunted with T4 DNA polymerase, followed by EcoRV digestion. The blunt fragment was then inserted into pBluescript KS+ (Stratagene) in the EcoRV site so that the HindIII site is proximal to the promoter and the EcoRI site is proximal to the terminator sequence. The HindIII-EcoRI fragment was then moved into the polylinker of pUCH18+ to form the p3Dv1.0 expression vector.
F. RAmy1A Promoter Insertion: A 1.9 kb NheI-PstI fragment derived from subclone pOSG2CA2.3 from lambda clone λOSg2 (Huang et al. (1990) Plant Mol. Biol. 14:655-668), was cloned into the vector pGEM5zf- at the SpeI and PstI sites to form pGEM5zf-(1A/NheI-PstI). pGEM5zf-(1A/NheI-PstI) was digested with PstI and SacI, and two non-kinased 35mers and four kinased 32mers were ligated in, with the complementary sequences as follows: 5' GCATG CAGGT GCTGA ACACC ATGGT GAACA AACAC 3' (SEQ ID NO:27); 5' TTCTT GTCCC TTTCG GTCCT CATCG TCCTC CT 3' (SEQ ID NO:28); 5' TGGCC TCTCC TCCAA CTTGA CAGCC GGGAG CT 3' (SEQ ID NO:29); 5' TTCAC CATGG TGTTC AGCAC CTGCA TGCTG CA 3' (SEQ ID NO:30); 5' CGATG AGGAC CGAAA GGGAC AAGAA GTGTT TG 3' (SEQ ID NO:31); 5' CCCGG CTGTC AAGTT GGAGG AGAGG CCAAG GAGGA 3' (SEQ ID NO:32) to form p1AProSig. The 0.8 kb HindIII-SacI promoter fragment was subcloned from p1AProSig into the p3Dv1.0 vector digested with HindIII-SacI to yield the p1Av1.0 expression vector.
G. Construction of p3D-AAT Plasmid
Two PCR primers were used to amplify a fragment encoding AAT according to the sequence disclosed as Genbank Accession No. K01396: N-terminal primer 5' GAGGA TCCCC AGGGA GATGC TGCCC AGAA 3' (SEQ ID NO:33) and C-terminal primer 5' CGCGC TCGAG TTATT TTTGG GTGGG ATTCA CCAC 3' (SEQ ID NO:34). The N-terminal primer amplifies to a blunt site for in-frame insertion with the end of the p3D signal peptide, and the C-terminal primer contains an XhoI site for cloning the fragment into the vector as shown in Figs. 3A and 3B.
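The relationship between these two primers and the mature AAT coding sequence can be checked mechanically. The Python sketch below is an illustrative check, not part of the described procedure: it confirms that the N-terminal primer matches the 5' end of the native mature AAT coding sequence (SEQ ID NO:8 in the sequence listing) and that the C-terminal primer, after its XhoI-containing 5' extension, is the reverse complement of the 3' end including the stop codon. Only the two ends of SEQ ID NO:8 are transcribed here, and the variable and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Ends of the native mature AAT coding sequence (SEQ ID NO:8).
AAT_5PRIME = "GAGGATCCCCAGGGAGATGCTGCCCAGAAGACAGATACAT"  # first 40 nt
AAT_3PRIME = "AGTGGTGAATCCCACCCAAAAATAA"  # last 25 nt, ending in stop codon TAA

# Primers transcribed from the text with the spacing removed.
N_PRIMER = "GAGGATCCCCAGGGAGATGCTGCCCAGAA"  # SEQ ID NO:33
C_PRIMER = "CGCGCTCGAGTTATTTTTGGGTGGGATTCACCAC"  # SEQ ID NO:34
XHOI_SITE = "CTCGAG"

# The N-terminal primer should be identical to the start of the coding strand.
n_primer_matches = AAT_5PRIME.startswith(N_PRIMER)

# The C-terminal primer carries a 5' extension ending in the XhoI site; the
# rest should anneal to the 3' end of the coding strand.
site_end = C_PRIMER.index(XHOI_SITE) + len(XHOI_SITE)
annealing_part = C_PRIMER[site_end:]
c_primer_matches = AAT_3PRIME.endswith(reverse_complement(annealing_part))

print(n_primer_matches, c_primer_matches, len(annealing_part))
```

Placing the XhoI site immediately 5' of the annealing region in the C-terminal primer puts the restriction site directly after the termination codon in the amplified product.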
Alternatively, the sequence encoding mature AAT (SEQ ID NO:8) or codon-optimized AAT may be chemically synthesized using techniques known in the art, incorporating an XhoI restriction site 3' of the termination codon for insertion into the expression vector as described above.
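Codon optimization changes the nucleotide sequence without changing the encoded protein. As a small illustration, the Python sketch below translates the native and codon-optimized 3D signal peptide coding sequences (SEQ ID NO:2 and SEQ ID NO:3 in the sequence listing) with the standard genetic code and checks that both give the same 25-residue peptide. The helper names are hypothetical, and the same check could be applied to any native and codon-optimized pair.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence (length divisible by 3) to one-letter code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codons(dna):
    """Split a coding sequence into a list of codons."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


# SEQ ID NO:2, native 3D signal peptide DNA sequence (75 bp).
NATIVE = ("ATGAAGAACACCAGCAGCTTGTGTTTGCTGCTCCTCGTGG"
          "TGCTCTGCAGCTTGACCTGTAACTCGGGCCAGGCG")
# SEQ ID NO:3, codon-optimized 3D signal peptide DNA sequence (75 bp).
OPTIMIZED = ("ATGAAGAACACCTCCTCCCTCTGCCTCCTGCTGCTCGTGG"
             "TCCTCTGCTCCCTGACCTGCAACAGCGGCCAGGCC")

native_peptide = translate(NATIVE)
optimized_peptide = translate(OPTIMIZED)
changed = sum(a != b for a, b in zip(codons(NATIVE), codons(OPTIMIZED)))

print(native_peptide)
print(optimized_peptide == native_peptide, f"{changed} codons differ")
```

A check of this kind is a cheap safeguard before a synthetic gene is ordered, since a single mistyped base in a redesigned sequence would show up as a changed residue.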
Example 2
Production of mature α1-antitrypsin in cell culture
After selection of transgenic callus, callus cells were suspended in liquid culture containing AA2 media (Thompson, et al., Plant Science 47:123 (1986)), at 3% sucrose, pH 5.8. Thereafter, the cells were shifted to phosphate-buffered media (20 mM phosphate buffer, pH 6.8) using 10 mL multi-well tissue culture plates and shaken at 120 rpm in the dark for 48 hours. The supernatant was then removed and stored at -80°C prior to Western blot analysis.
Supernatants were concentrated using Centricon-10 filters (Amicon cat. #4207) and washed with induction media to remove substances interfering with electrophoretic migration. Samples were concentrated approximately 10-fold, and mature AAT was purified by SDS-PAGE. The purified protein was extracted from the electrophoresis medium and sequenced at its N-terminus, giving the sequence shown in Fig. 8, identified herein as SEQ ID NO:22.
Example 3
HSA Induction in Germinating Seeds
After selection of transgenic plants which tested positive for the presence of a codon-optimized HSA gene driven by the GA3-responsive RAmy1A promoter, seeds were harvested and imbibed for 24 hours with 100 rpm orbital shaking in the dark at 25°C. GA3 was added to a final concentration of 5 µM and the seeds were incubated for an additional 24-120 hours. Total soluble protein was isolated by double grinding each seed in 120 µl grinding buffer and centrifuging at 23,000 x g for 1 minute at 4°C. The clear supernatant was carefully removed from the pellet and transferred to a fresh tube.
Bilirubin binding assay
Bilirubin binding to its high-affinity site on mature HSA is assayed using the method described by Jacobsen, J. et al. (1974; Clin. Chem. 20:783) and Reed, R.G. et al. (1975; Biochemistry 14:4578-4583). Briefly, the concentration of free bilirubin in equilibrium with protein-bound bilirubin is determined by the rate of peroxide-peroxidase catalyzed oxidation of free bilirubin. Stock solutions of bilirubin (Nutritional Biochemicals Corp.) are prepared fresh daily in mM NaOH containing 1 mM EDTA, and the concentration is determined using a molar absorptivity of 47,500 M^-1 cm^-1 at 440 nm. An aliquot containing between 5 and 30 nmol bilirubin is added to a 1 cm cuvette containing 1 ml PBS and approximately 30 nmol HSA at 37°C. An absorbance spectrum between 500 and 350 nm is recorded. Aliquots of horseradish peroxidase (Sigma), 0.05 mg/ml in PBS, and 0.05% ethyl hydrogen peroxide (Ferrosan; Malmö, Sweden) are added, and the change in absorbance at λmax is recorded for 3-5 minutes. The concentrations of free and bound bilirubin, calculated from the oxidation rate observed using varying concentrations of total bilirubin, are used to construct a Scatchard plot from which the association constant for a single binding site is determined.
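The final step of this assay, extracting an association constant from a Scatchard plot, can be sketched in Python as follows. For a single class of sites the plot of ν/[free] against ν (ν = bound bilirubin per HSA) is a straight line with slope -K and x-intercept n. The data below are synthetic, generated from a hypothetical association constant of 1e8 M^-1 and one site per protein purely to show that the fit recovers the inputs; the function names are illustrative, and the 30 µM protein concentration simply restates 30 nmol in 1 ml.

```python
import math

BILIRUBIN_EPSILON = 47_500.0  # M^-1 cm^-1 at 440 nm, as given in the text


def bilirubin_molar(absorbance_440, path_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance_440 / (BILIRUBIN_EPSILON * path_cm)


def bound_at_equilibrium(total_ligand, total_protein, k_assoc):
    """Bound ligand (M) for one site per protein: K = B / ((P - B)(L - B))."""
    s = k_assoc * (total_ligand + total_protein) + 1.0
    disc = s * s - 4.0 * k_assoc ** 2 * total_ligand * total_protein
    return (s - math.sqrt(disc)) / (2.0 * k_assoc)


def scatchard_fit(free, bound, total_protein):
    """Least-squares line through (nu, nu/free); returns (K, n)."""
    xs = [b / total_protein for b in bound]
    ys = [x / f for x, f in zip(xs, free)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    k_assoc = -slope
    return k_assoc, intercept / k_assoc


K_TRUE = 1.0e8      # hypothetical association constant, M^-1
PROTEIN = 30e-6     # about 30 nmol HSA in 1 ml
totals = [5e-6, 10e-6, 15e-6, 20e-6, 25e-6, 30e-6]  # 5-30 nmol bilirubin in 1 ml

bound = [bound_at_equilibrium(t, PROTEIN, K_TRUE) for t in totals]
free = [t - b for t, b in zip(totals, bound)]
k_fit, n_fit = scatchard_fit(free, bound, PROTEIN)

print(f"K = {k_fit:.3e} M^-1, n = {n_fit:.3f}")
```

With real measurements, the free concentrations would come from the peroxidase oxidation rates and the stock concentration from `bilirubin_molar`; a fitted n near 1 is consistent with occupancy of a single high-affinity site.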
Although the invention has been described with reference to particular embodiments, it will be appreciated that a variety of changes and modifications can be made without departing from the invention.
SEQUENCE LISTING

GENERAL INFORMATION:
APPLICANT: Applied Phytologics, Inc.
(ii) TITLE OF THE INVENTION: Production of Mature Proteins in Plants
(iii) NUMBER OF SEQUENCES: 34
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Dehlinger Associates
STREET: P.O. Box 60850
CITY: Palo Alto
STATE: CA
COUNTRY: USA
ZIP: 94306
COMPUTER READABLE FORM:
MEDIUM TYPE: Diskette
COMPUTER: IBM Compatible
OPERATING SYSTEM: DOS
SOFTWARE: FastSEQ for Windows Version
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/US98/03068
FILING DATE: 13-FEB-1998
CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
APPLICATION NUMBER: 60/038,169
FILING DATE: 13-FEB-1997
APPLICATION NUMBER: 60/037,991
FILING DATE: 13-FEB-1997
APPLICATION NUMBER: 60/038,170
FILING DATE: 13-FEB-1997
APPLICATION NUMBER: 60/038,168
FILING DATE: 13-FEB-1997
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Petithory, Joanne R
REGISTRATION NUMBER: P42,995
REFERENCE/DOCKET NUMBER: 0665-0007.41
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 650-324-0880
TELEFAX: 650-324-0960

INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
LENGTH: 25 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
CLONE: 3D signal peptide sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Lys Asn Thr Ser Ser Leu Cys Leu Leu Leu Leu Val Val Leu Cys
Ser Leu Thr Cys Asn Ser Gly Gln Ala

INFORMATION FOR SEQ ID NO:2:
SEQUENCE CHARACTERISTICS:
LENGTH: 75 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
CLONE: native 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
ATGAAGAACA CCAGCAGCTT GTGTTTGCTG CTCCTCGTGG TGCTCTGCAG CTTGACCTGT 60
AACTCGGGCC AGGCG 75

INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
LENGTH: 75 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
CLONE: codon-optimized 3D signal peptide DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
ATGAAGAACA CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG TCCTCTGCTC CCTGACCTGC 60
AACAGCGGCC AGGCC 75

INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS:
LENGTH: 25 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE:
CLONE: RAmy1A signal peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Val Asn Lys His Phe Leu Ser Leu Ser Val Leu Ile Val Leu Leu
Gly Leu Ser Ser Asn Leu Thr Ala Gly

INFORMATION FOR SEQ ID NO:5:
SEQUENCE CHARACTERISTICS:
LENGTH: 51 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(vii) IMMEDIATE SOURCE: CLONE: RAmy 1A 5' untranslated region (UTR) (xi) SEQUENCE DESCRIPTION: SEQ ID 27 SUBSTITUTE SHEET (RULE 26) 1 1 ,t WO 98/36085 PCT/US98/03068 ATCAATCATC CATCTCCGAA GTGTGTCTGC AGCATGCAGG TGCTGAACAC C INFORMATION FOR SEQ ID NO:6: SEQUENCE CHARACTERISTICS: LENGTH: 321 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: RAmy 1A 3' untranslated region (UTR) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: GCGCACGATG ACGAGACTCT CAGTTTAGCA GATTTAACCT GCGATTTTTA CCCTGACCGG TATACGTATA TACGTGCCGG CAACGAGCTG TATCCGATCC GAATTACGGA TGCAATTGTC CACGAAGTAC TTCCTCCGTA AATAAAGTAG GATCAGGGAC ATACATTTGT ATGGTTTTAC GAATAATGCT ATGCAATAAA ATTTGCACTG CTTAATGCTT ATGCATTTTT GCTTGGTTCG ATTGTACTGG TGAATTATTG TTACTGTTCT TTTTACTTCT CGAGTGGCAG TATTGTTCTT CTACGAAAAT TTGATGCGTA G INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 394 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE: CLONE: mature AAT amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 51 120 180 240 300 321 Glu Asp Pro Gln Gly 1 Asp Phe Asn Ser 65 Asn Gin Thr Lys Val 145 Val Asp Lys His Gly 225 Leu Gln Asp Ala Phe Ile Phe Leu Gly Phe Asn Glu Leu Thr Gly 115 Phe Leu 130 Asn Phe Glu Lys Arg Asp Trp Glu 195 Val Asp 210 Met Phe Met Lys His Ser Phe Thr Leu Leu 100 Asn Glu Gly Gly Thr 180 Arg Gln Asn Tyr 5 Pro Leu Ser Lys Thr Arg Gly Asp Asp Thr 165 Val Pro Val Ile Leu 245 Asp Thr Tyr Pro Ala 70 Glu Thr Leu Val Thr 150 Gln Phe Phe Thr Gln 230 Gly Ala Ala Gln Lys Thr 10 Phe Asn Lys Ile Thr 25 Arg Gln Leu Ala His 40 Val Ser Ile Ala Thr 55 Asp Thr His Asp Glu 75 Ile Pro Glu Ala Gln 90 Leu Asn Gin Pro Asp 105 Phe Leu Ser.Glu Gly 120 Lys Lys Leu Tyr His 135 Glu Glu Ala Lys Lys 155 Gly Lys Ile Val Asp 170 Ala Leu Val Asn Tyr 185 Glu Val Lys Asp Thr 200 Thr Val Lys Val Pro 215 His Cys Lys Lys Leu 235 Asn Ala Thr Ala Ile 250 Asp Pro Gin Ala Ile Ile Ser Leu Ser 140 Gin Leu Ile Glu Met 220 Ser Thr Asn Ser Phe Leu His 
Gln Lys 125 Glu Ile Val Phe Glu 205 Met Ser Ser His Leu Ala Asn Ser Ala Met Glu Gly Glu Gly Leu Gln 110 Leu Val Ala Phe Asn Asp Lys Glu 175 Phe Lys 190 Glu Asp Lys Arg Trp Val His Glu Thr Leu Leu Phe Leu Asp Thr Tyr 160 Leu Gly Phe Leu Leu 240 Phe Phe Leu Pro Asp 255 SUBSTITUTE SHEET (RULE 26) WO 98/36085 Glu Gly Lys Thr Lys Phe 275 Pro Ls Leu Leu 260 Leu Gin His Leu Glu Asn G 265 Arg A lu Leu Thr His PCTIUS98/03068 Asp Ile Ile 270 Leu His Leu Val Leu Gly Glu Asn Glu Asp 280 Thr rg Ser Ala Ser 285 Ser Ser Ile Thr Gin 305 Val 290 Gly 295 Val Tyr Asp Leu Lys 300 Leu Gly Ile Thr Lys 310 Thr Glu Glu Ala Pro Phe Ser Asn Gly Leu Lys Val Leu Thr Leu Glu Ala 355 Ile 340 Ile Glu Lys Gly Leu Ser 330 Thr Glu 315 Lys Ala Ala Asp Leu Ser Gly 320 Ala Val His Lys Ala 335 Ala Gly Ala Met Phe 350 Val Lys Phe Asn Lys 345 Pro Pro Met Ser Ile 360 Glu Pro Glu Pro Phe 370 Met Gly 385 Val Phe Leu Met 365 Ser Ile 375 Pro Gin Asn Thr Pro Leu Phe Lys Val Val Asn 390 Thr Gin Lys INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 1185 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: native coding sequence of mature AAT (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GAGGATCCCC AGGGAGATGC TGCCCAGAAG ACAGATACAT CCCACCATGA TCAGGATCAC 60
CCAACCTTCA ACAAGATCAC CCCCAACCTG GCTGAGTTCG CCTTCAGCCT ATACCGCCAG 120
CTGGCACACC AGTCCAACAG CACCAATATC TTCTTCTCCC CAGTGAGCAT CGCTACAGCC 180
TTTGCAATGC TCTCCCTGGG GACCAAGGCT GACACTCACG ATGAAATCCT GGAGGGCCTG 240
AATTTCAACC TCACGGAGAT TCCGGAGGCT CAGATCCATG AAGGCTTCCA GGAACTCCTC 300
CGTACCCTCA ACCAGCCAGA CAGCCAGCTC CAGCTGACCA CCGGCAATGG CCTGTTCCTC 360
AGCGAGGGCC TGAAGCTAGT GGATAAGTTT TTGGAGGATG TTAAAAAGTT GTACCACTCA 420
GAAGCCTTCA CTGTCAACTT CGGGGACACC GAAGAGGCCA AGAAACAGAT CAACGATTAC 480
GTGGAGAAGG GTACTCAAGG GAAAATTGTG GATTTGGTCA AGGAGCTTGA CAGAGACACA 540
GTTTTTGCTC TGGTGAATTA CATCTTCTTT AAAGGCAAAT GGGAGAGACC CTTTGAAGTC 600
AAGGACACCG AGGAAGAGGA CTTCCACGTG GACCAGGTGA CCACCGTGAA GGTGCCTATG 660
ATGAAGCGTT TAGGCATGTT TAACATCCAG CACTGTAAGA AGCTGTCCAG CTGGGTGCTG 720
CTGATGAAAT ACCTGGGCAA TGCCACCGCC ATCTTCTTCC TGCCTGATGA GGGGAAACTA 780
CAGCACCTGG AAAATGAACT CACCCACGAT ATCATCACCA AGTTCCTGGA AAATGAAGAC 840
AGAAGGTCTG CCAGCTTACA TTTACCCAAA CTGTCCATTA CTGGAACCTA TGATCTGAAG 900
AGCGTCCTGG GTCAACTGGG CATCACTAAG GTCTTCAGCA ATGGGGCTGA CCTCTCCGGG 960
GTCACAGAGG AGGCACCCCT GAAGCTCTCC AAGGCCGTGC ATAAGGCTGT GCTGACCATC 1020
GACGAGAAAG GGACTGAAGC TGCTGGGGCC ATGTTTTTAG AGGCCATACC CATGTCTATC 1080
CCCCCCGAGG TCAAGTTCAA CAAACCCTTT GTCTTCTTAA TGATTGAACA AAATACCAAG 1140
TCTCCCCTCT TCATGGGAAA AGTGGTGAAT CCCACCCAAA AATAA 1185

INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS:
LENGTH: 432 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(vii) IMMEDIATE SOURCE:
CLONE: mature ATIII aa sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
His Gly Ser Pro Val Asp Ile Cys Thr Ala Lys Pro Arg Asp Ile Pro
Met Asn Pro Met Cys Ile Tyr Arg Ser Pro Glu Lys Lys Ala Thr Glu
Asp Glu Gly Ser Glu Gln Lys Ile Pro Glu Ala Thr Asn Arg Arg Val
Trp Glu Leu Ser Lys Ala Asn Ser Arg Phe Ala Thr Thr Phe Tyr Gln
His Leu Ala Asp Ser Lys Asn Asp Asn Asp Asn Ile Phe Leu Ser Pro
Leu Ser Ile Ser Thr Ala Phe Ala Met Thr Lys Leu Gly Ala Cys Asn
Asp Thr Leu Gln Gln Leu Met Glu Val Phe Lys Phe Asp Thr Ile Ser
Glu Lys Thr Ser Asp Gln Ile His Phe Phe Phe Ala Lys Leu Asn Cys
Arg Leu Tyr Arg Lys Ala Asn Lys Ser Ser Lys Leu Val Ser Ala Asn
Arg Leu Phe Gly Asp Lys Ser Leu Thr Phe Asn Glu Thr Tyr Gln Asp
Ile Ser Glu Leu Val Tyr Gly Ala Lys Leu Gln Pro Leu Asp Phe Lys
Glu Asn Ala Glu Gln Ser Arg Ala Ala Ile Asn Lys Trp Val Ser Asn
Lys Thr Glu Gly Arg Ile Thr Asp Val Ile Pro Ser Glu Ala Ile Asn
Glu Leu Thr Val Leu Val Leu Val Asn Thr Ile Tyr Phe Lys Gly Leu
Trp Lys Ser Lys Phe Ser Pro Glu Asn Thr Arg Lys Glu Leu Phe Tyr
Lys Ala Asp Gly Glu Ser Cys Ser Ala Ser Met Met Tyr Gln Glu Gly
Lys Phe Arg Tyr Arg Arg Val Ala Glu Gly Thr Gln Val Leu Glu Leu
Pro Phe Lys Gly Asp Asp Ile Thr Met Val Leu Ile Leu Pro Lys Pro
Glu Lys Ser Leu Ala Lys Val Glu Lys Glu Leu Thr Pro Glu Val Leu
Gln Glu Trp Leu Asp Glu Leu Glu Glu Met Met Leu Val Val His Met
Pro Arg Phe Arg Ile Glu Asp Gly Phe Ser Leu Lys Glu Gln Leu Gln
Asp Met Gly Leu Val Asp Leu Phe Ser Pro Glu Lys Ser Lys Leu Pro
Gly Ile Val Ala Glu Gly Arg Asp Asp Leu Tyr Val Ser Asp Ala Phe
His Lys Ala Phe Leu Glu Val Asn Glu Glu Gly Ser Glu Ala Ala Ala
Ser Thr Ala Val Val Ile Ala Gly Arg Ser Leu Asn Pro Asn Arg Val
Thr Phe Lys Ala Asn Arg Pro Phe Leu Val Phe Ile Arg Glu Val Pro
Leu Asn Thr Ile Ile Phe Met Gly Arg Val Ala Asn Pro Cys Val Lys

INFORMATION FOR SEQ ID NO:10:
SEQUENCE CHARACTERISTICS:
LENGTH: 1299 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(vii) IMMEDIATE SOURCE:
CLONE: native ATIII DNA sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CACGGAAGCC
TGCATTTACC
CCGGAGGCCA
ACTTTCTATC
CTGAGTATCT
CAACTGATGG
TTCTTCTTTG
GTATCAGCCA
ATCAGTGAGT
CAATCCAGAG
GTCATTCCCT
CTGTGGACAT CTGCACAGCC AAGCCGCGGG ACATTCCCAT GAATCCCATG GCTCCCCGGA GAAGAAGGCA ACTGAGGATG AGGGCTCAGA ACAGAAGATC CCAACCGGCG TGTCTGGGAA CTGTCCAAGG CCAATTCCCG CTTTGCTACC AGCACCTGGC AGATTCCAAG AATGACAATG ATAACATTTT CCTGTCACCC CCACGGCTTT TGCTATGACC AAGCTGGGTG CCTGTAATGA CACCCTCCAG AGGTATTTAA GTTTGACACC ATATCTGAGA AAACATCTGA TCAGATCCAC
CCAAACTGAA
ATCGCCTTTT
TGGTATATGG
CGGCCATCAA
CGGAAGCCAT
CTGCCGACTC
TGGAGACAAA
AGCCAAGCTC
CAAATGGGTG
TATCGAAAAG
TCCCTTACCT
CAGCCCCTGG
TCCAATAAGA
CAATGAGCTC ACTGTTCTGG
CCAACAAATC
TCAATGAGAC
ACTTCAAGGA
CCGAAGGCCG
TGCTGGTTAA
CAAGGAAGGA
AGGAAGGCAA
TCAAAGGTGA
CTCCAAGTTA
CTACCAGGAC
AAATGCAGAG
AATCACCGAT
CACCATTTAC
ACTGTTCTAC
GTTCCGTTAT
TGACATCACC
TTCAAGGGCC TGTGGAAGTC AAAGTTCAGC CCTGAGAACA AAGGCTGATG GAGAGTCGTG TTCAGCATCT ATGATGTACC CGGCGCGTGG CTGAAGGCAC CCAGGTGCTT GAGTTGCCCT 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1299
ATGGTCCTCA
CCAGAGGTGC
CCCCGCTTCC
GTCGATCTGT
GACCTCTATG
GAAGCAGCTG
ACTTTCAAGG
ATCTTCATGG
TCTTGCCCAA GCCTGAGAAG AGCCTGGCCA AGGTGGAGAA GGAACTCACC TGCAGGAGTG GCTGGATGAA TTGGAGGAGA TGATGCTGGT GGTTCACATG GCATTGAGGA CGGCTTCAGT TTGAAGGAGC AGCTGCAAGA CATGGGCCTT TCAGCCCTGA AAAGTCCAAA CTCCCAGGTA TTGTTGCAGA AGGCCGAGAT TCTCAGATGC ATTCCATAAG GCATTTCTTG AGGTAAATGA AGAAGGCAGT CAAGTACCGC TGTTGTGATT GCTGGCCGTT CGCTAAACCC CAACAGGGTG CCAACAGGCC CTTCCTGGTT TTTATAAGAG AAGTTCCTCT GAACACTATT GCAGAGTAGC CAACCCTTGT GTTAAGTAA INFORMATION FOR SEQ ID NO:11: Wi SEQUENCE CHARACTERISTICS: LENGTH: 585 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE: CLONE: mature HSA amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:ll: Asp 1 Glu Ala His Lys Asn Phe Lys Ser 5 Ala Glu Val Ala Leu Val Leu His Arg Phe Lys Asp Leu Gly Glu 10 Ile Ala Phe Ala Gln Tyr Leu Gln Gln Cys Pro Phe Ala Lys Ser Leu His Phe Glu Asp His Val 40 Thr Cys Val Ala Asp 55 Lys Leu Val Asn Glu Asn Val Thr Glu Cys Asp Lys Glu Ser Ala Glu Thr Thr Leu Arg Phe Gly Asp 70 Lys Leu Cys 75 Cys Val Ala Thr Leu Glu Thr Tyr Gly Cys Glu Phe Glu Arg Asn Pro Arg Leu 115 Asp Asn Glu 130 Glu 100 Val Met Ala Asp Cys 90 Leu Gln His Lys 105- Glu Val Asp Val Ala Lys Gln Glu Pro Asp Asp Asn Arg Pro Met Cys Glu Thr Phe 120 Lys Thr 125 Pro Asn Leu 110 Ala Phe His Ile Ala Arg Lys Tyr Leu Tyr Glu Arg His Pro 145 Tyr Lys Ala Tyr Phe Tyr 150 Ala Phe Thr Pro Glu Leu Leu 155 Ala Phe Ala Glu Cys 165 Cys Leu Ser Ala Arg Ala Leu Pro Lys Leu 180 Lys Gln Arg Leu 195 Phe Lys Ala Trp Glu Phe Ala Glu Asp Glu Lys Cys 200 Ala Val 215 Val Ser Cys Gln 170 Leu Arg Phe Ala Lys Arg 160 Asp Lys Ala Ala 175 Gly Lys Ala Ser 190 Lys Phe Gly Glu Asp Glu 185 Ala Ser Leu Gln 205 Gln Ala Arg Leu Arg Phe Pro 210 Ala Lys 225 Val Lys Leu His Thr Glu Cys 230 Cys Val 235 Leu Asp Leu Thr His Gly Asp Leu Glu Cys Ala Asp SUBSTITUTE SHEET (RULE WO 98/36085 PCT/US98/03068 Arg Ala Ser Lys Cys Ile 290 Leu Ala 305 Glu Ala Arg His Tyr Glu Cys Tyr 370 Gin Asn 385 Tyr Lys Gin Val Val Gly Ala Glu 450 Glu Lys 465 Leu Val Tyr Val Ile Cys 
Leu Val 530 Lys Ala 545 Ala Asp Ala Ala 245 Asp Leu Ala Lys Tyr Ile Leu 275 Ala Ala Lys Pro Thr 355 Ala Leu Phe Ser Ser 435 Asp Thr Asn Pro Thr 515 Glu Val Asp Ser 260 Lys Glu Asp Asp Asp 340 Thr Lys Ile Gin Thr 420 Lys Tyr Pro Arg Lys 500 Leu Leu Met Lys Gin 580 Glu Val Phe Val 325 Tyr Leu Val Lys Asn 405 Pro Cys Leu Val Arg 485 Glu Ser Val Asp Glu 565 Ala Cys Glu Val 310 Phe Ser Glu Phe Gin 390 Ala Thr Cys Ser Ser 470 Pro Phe Glu Lys Asp 550 Thr Ala Cys Asn 295 Glu Leu Val Lys Asp 375 Asn Leu Leu Lys Val 455 Asp Cys Asn Lys His 535 Phe Cys Leu Glu 280 Asp Ser Gly Val Cys 360 Glu Cys Leu Val His 440 Val Arg Phe Ala Glu 520 Lys Ala Phe Gly Cys 265 Lys Glu Lys Met Leu 345 Cys Phe Glu Val Glu 425 Pro Leu Val Ser Glu 505 Arg Pro Ala Ala Leu Pro Met Asp Phe 330 Leu Ala Lys Leu Arg 410 Val Glu Asn Thr Ala 490 Thr Gin Lys Phe Glu 570 Glu Asn Gin Asp Ser Ile Ser 250 255 Leu Pro Val 315 Leu Leu Ala Pro Phe 395 Tyr Ser Ala Gin Lys 475 Leu Phe Ile Ala Val 555 Glu Leu Ala 300 Cys Tyr Arg Ala Leu 380 Lys Thr Arg Lys Leu 460 Cys Glu Thr Lys Thr 540 Glu Gly Glu 285 Asp Lys Glu Leu Asp 365 Val Gin Lys Asn Arg 445 Cys Cys Val Phe Lys 525 Lys Lys Lys 270 Lys Leu Asn Tyr Ala 350 Pro Glu Leu Lys Leu 430 Met Val Thr Asp His 510 Gin Glu Cys Lys Ser Pro Tyr Ala 335 Lys His Glu Gly Val 415 Gly Pro Leu Glu Glu 495 Ala Thr Gin Cys Leu 575 His Ser Ala 320 Arg Thr Glu Pro Glu 400 Pro Lys Cys His Ser 480 Thr Asp Ala Leu Lys 560 Val INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 1865 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: native coding sequence of mature HSA (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: AGATGCACAC AAGAGTGAGG TTGCTCATCG GTTTAAAGAT TTGGGAGAAG AAAATTTCAA AGCCTTGGTG TTGATTGCCT TTGCTCAGTA TCTTCAGCAG TGTCCATTTG AAGATCATGT AAAATTAGTG AATGAAGTAA CTGAATTTGC AAAAACATGT GTAGCTGATG AGTCAGCTGA AAATTGTGAC AAATCACTTC ATACCCTTTT TGGAGACAAA TTATGCACAG TTGCAACTCT TCGTGAAACC TATGGTGAAA TGGCTGACTG CTGTGCAAAA CAAGAACCTG 
AGAGAAATGA ATGCTTCTTG CAACACAAAG ATGACAACCC AAACCTCCCC CGATTGGTGA GACCAGAGGT TGATGTGATG TGCACTGCTT TTCATGACAA TGAAGAGACA TTTTTGAAAA AATACTTATA TGAAATTGCC AGAAGACATC CTTACTTTTA TGCCCCGGAA CTCCTTTTCT TTGCTAAAAG GTATAAAGCT GCTTTTACAG AATGTTGCCA AGCTGCTGAT AAAGCTGCCT GCCTGTTGCC AAAGCTCGAT GAACTTCGGG ATGAAGGGAA GGCTTCGTCT GCCAAACAGA GACTCAAATG 32 120 180 240 300 360 420 480 540 600 SUBSTITUTE SHEET (RULE 26) WO 98/36085 WO 9836085PCT/US98/03068 TGCCAGTCTC CAAAAATTTG CCAGAGATTT CCCAAAGCTG AGTCCACACG GAATGCTGCC TGCCAAGTAT ATCTGTGAAA AAAACCTCTG TTGGAAAAAT TGACTTGCCT TCATTAGCTG TGAGGCAAAG GATGTCTTCC TTACTCTGTC GTGCTGCTGC CTGTGCCGCT GCAGATCCTC GAGAAAGAGC TTTCAAAGCA TGGGCAGTGG CTCGCCTGAG AGTTTGCAGA AGTTTCCAAG TTAGTGACAG ATCTTACCAA ATGGAGATCT GCTTGAATGT GCTGATGACA GGGCGGACCT ATCAGGATTC GATCTCCAGT AAACTGAAGG AATGCTGTGA CCCACTGCAT TGCCGAAGTG GAAAA.TGATG AGATGCCTGC CTGATTTTGT TGAAAGTAAG GATGTTTGCA AAAACTATGC TGGGCATGTT TTTGTATGAA TATGCAAGAA GGCATCCTGA TGAGACTTGC CAAGACATAT GAAACCACTC TAGAGAAGTG A.TGAATGCTA TGCCAAAGTG TTCGATGAAT TTAAACCTCT rAATCAALACA AAACTGTGAG CTTTTTAAGC AGCTTGGAGA
TGTGGAAGAG
GTACAAATTC
TCCAACTCTT
TCCTGAAGCA
ATGTGTGTTG
CTTGGTGAAC
AGAGTTTAAT
CCTCAGAATT'
CAGAATGCGC
GTAGAGGTCT
AAAAGAATGC
CATGAGAAAA
AGGCGACCAT
GCTGAAACAT
TATTAGTTCG TTACACCAAG AAAGTACCCC CAAGAAACCT AGGAAAAGTG GGCAGCAAAT CCTGTGCAGA AGACTATCTA TCCGTGGTCC CGCCAGTAAG TGACAGAGTC ACAAAATGCT GCTTTTCAGC TCTGGA.AGTC GATGAAACAT TCACCTTCCA TGCAGATATA TGCACACTTT CTGCACTTGT TGAGCTTGTG AAACACAAGC TTATGGATGA TTTCGCAGCT TTTGTAGAGA GCTTTGCCGA GGAGGGTAAA AAACTTGTTG
AAGTGTCAAC
GTTGTAAACA
TGAACCAGTT
GCACAGAGTC
ACGTTCCCAA
CTGAGAAGGA
CCAAGGCAAC
AGTGCTGCAA
CTGCAAGTCA
TGAGAATAAG
660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1865 GAGACAAATC AAGAAACAAA AAAAGAGCAA CTGAAAGCTG GGCTGACGAT AAGGAGACCT AGCTGCCTTA GGCTTATAAC ATCTACATTT AAAAGCATCT AGAAAGAAAA TGAAGATCAA AAGCTTATTC ATCTGTTTTC
AACAC
INFORMATION FOR SEQ ID NO: 13: Wi SEQUENCE CHARACTERISTICS: LENGTH: 352 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE:
CAGCCTACCA
TTTTTCGTTG GTGTAA.AGCC CLONE: native proBPN' amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: Ala Gly Lys Ser Asn 1 Thr Met Ser Thr Met Val Gly Giu Lys Lys Tyr Ile Val Gly Phe Lys Gin 10 Ser Ala Ala Lys Lys Lys Asp Val Ile Ser Giu 25 Gin Lys Gin Phe Lys Tyr Val Asp Ala Ala Ser Lys Gly Gly Ala Thr Leu Lys Asn Glu Lys Ala 55 Val Glu Giu Asp 70 Val Lys His Val Glu Ala Leu Lys His Ala Lys Asp Pro Ser Tyr Ala Gin Ser Val Val Ala Tyr Pro Tyr Tyr Thr Pro Gly Val Ser Gly Ser Gly Asn Gin Ile Lys Ala 90 Val Lys Vai Ala 105 Leu Lys Val Ala Ala Leu Asp Ser Pro Ser 130 Ser 115 Glu Pro Asp Val Ile Asp Gly Gly Ala 125 Asn Ser His His Ser Gin Ser Gly Ile 110 Ser Met Val Thr Asn Pro Phe 135 Al a Gin Leu Asp Asn Val 145 Val Ala Gly Thr Val Al a 150 Gly Gly Thr His Val Leu Gly 160 Ala Pro Ser Ser Leu Tyr Ser Trp Ile Asn Asn Ser 155 Ala Val Lys 170 Ile Asn Gly Gly Ser Gly Gin 180 Ala Asn Asn Met Val Leu Gly Ile Glu Trp 190 Gly Gly Pro Ala Asp 175 Ala Ile Ser Gly Asp Val Ile 185 Met Ser Ala 210 Ala Leu Lys Val Ala Ala Asn 200 205 Ala Ala Ala Val 215 Ala Gly Asn Ser Leu Val 225 Thr Val Asp Lys Ala Glu Gly Thr 235 Pro Ser Val Ser Gly Val Val 220 Ser Gly Ser Ser Val Gly Tyr Pro 230 Gly Ser 240 Ala Lys Tyr Ile Ala Val Gly SUBSTITUTE SHEET (RULE WO 98/36085 PCTIUS98/03068 Val Asp Ser Leu Asp Val 275 Asn Lys Tyr 245 Asn Ser 260 Met Gin Arg Ala 250 Phe Ser 265 Ser Ser Ser Val Gly 270 255 Pro Glu Ala Pro Gly Val 280 Gly Ile Gin Ser Thr 285 Ser Leu Pro Gly Pro His Val Gly Ala Tyr Thr Ser Met 290 Gly Ala 300 Pro Ala 305 Thr Ala Ala Ala Leu 310 Ser Leu Ser Lys His 315 Thr Asn Trp Thr Asn 320 Gin Val Arg Ser 325 Gly Leu Glu Asn Thr 330 Asn Thr Lys Leu Gly Asp 335 Ala Gin Ser Phe Tyr Tyr 340 Lys Gly Leu Ile 345 Val Gin Ala Ala 350 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 1056 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: native proBPN' coding sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: GCAGGGAAAT 
CAAACGGGGA AAAGAAATAT ATTGTCGGGT TTAAACAGAC AATGAGCACG ATGAGCGCCG CTAAGAAGAA AGATGTCATT TCTGAAAAAG GCGGGAAAGT GCAAAAGCAA TTCAAATATG TAGACGCAGC TTCAGCTACA TTAAACGAAA AAGCTGTAAA AGAATTGAAA AAAGACCCGA GCGTCGCTTA CGTTGAAGAA GATCACGTAG CACATGCGTA CGCGCAGTCC GTGCCTTACG GCGTATCACA AATTAAAGCC CCTGCTCTGC ACTCTCAAGG CTACACTGGA TCAAATGTTA AAGTAGCGGT TATCGACAGC GGTATCGATT CTTCTCATCC TGATTTAAAG GTAGCAGGCG GAGCCAGCAT GGTTCCTTCT GAAACAAATC CTTTCCAAGA CAACAACTCT CACGGAACTC ACGTTGCCGG CACAGTTGCG GCTCTTAATA ACTCAATCGG TGTATTAGGC GTTGCGCCAA GCGCATCACT TTACGCTGTA AAAGTTCTCG GTGCTGACGG TTCCGGCCAA TACAGCTGGA TCATTAACGG AATCGAGTGG GCGATCGCAA ACAATATGGA CGTTATTAAC ATGAGCCTCG GCGGACCTTC TGGTTCTGCT GCTTTAAAAG CGGCAGTTGA TAAAGCCGTT GCATCCGGCG TCGTAGTCGT TGCGGCAGCC GGTAACGAAG GCACTTCCGG CAGCTCAAGC ACAGTGGGCT ACCCTGGTAA ATACCCTTCT GTCATTGCAG TAGGCGCTGT TGACAGCAGC AACCAAAGAG CATCTTTCTC AAGCGTAGGA CCTGAGCTTG ATGTCATGGC ACCTGGCGTA TCTATCCAAA GCACGCTTCC TGGAAACAAA TACGGGGCGT ACAACGGTAC GTCAATGGCA TCTCCGCACG TTGCCGGAGC GGCTGCTTTG ATTCTTTCTA AGCACCCGAA CTGGACAAAC ACTCAAGTCC GCAGCAGTTT AGAAAACACC ACTACAAAAC TTGGTGATTC TTTCTACTAT GGAAAAGGGC TGATCAACGT ACAGGCGGCA GCTCAG 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1056 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 77 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (vii) IMMEDIATE SOURCE: CLONE: subtilisin BPN' pro-peptide (xi) SEQUENCE DESCRIPTION: SEQ ID Ala 1 Thr Gly Lys Ser Met Ser Thr Asn 5 Met Gly Glu Lys Ser Ala Ala Lys Tyr Ile Val Gly Phe Lys Gin 10 Lys Lys Lys Asp Val Ile Ser Glu 25 Phe Lys Tyr Val Asp Ala Ala Ser Lys Gly Gly 35 Ala Thr Leu Lys Val Gin Lys Gin Asn Glu Lys Ala 55 40 Val Lys Glu Leu Lys Asp Pro Ser 34-i SUBSTITUTE SHEET (RULE 26) WO 98/36085 PCT/US98/03068 Val Ala Tyr Val Glu Glu Asp His Val Ala His Ala Tyr 70 INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 275 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE: 
CLONE: native mature BPN' amino acid sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: Ala Gin Ser Val Pro Tyr Gly Val Ser Gin Ile Lys Ala Pro Ala Leu 1 5 10 His Ser Gin Gly Tyr Thr Gly Ser Asn Val Lys Val Ala Val Ile Asp 25 Ser Gly Ile Asp Ser Ser His Pro Asp Leu Lys Val Ala Gly Gly Ala 40 Ser Met Val Pro Ser Glu Thr Asn Pro Phe Gin Asp Asn Asn Ser His 50 55 Gly Thr His Val Ala Gly Thr Val Ala Ala Leu Asn Asn Ser Ile Gly 70 75 Val Leu Gly Val Ala Pro Ser Ala Ser Leu Tyr Ala Val Lys Val Leu 90 Gly Ala Asp Gly Ser Gly Gin Tyr Ser Trp Ile Ile Asn Gly Ile Glu 100 105 110 Trp Ala Ile Ala Asn Asn Met Asp Val Ile Asn Met Ser Leu Gly Gly 115 120 125 Pro Ser Gly Ser Ala Ala Leu Lys Ala Ala Val Asp Lys Ala Val Ala 130 135 140 Ser Gly Val Val Val Val Ala Ala Ala Gly Asn Glu Gly Thr Ser Gly 145 150 155 160 Ser Ser Ser Thr Val Gly Tyr Pro Gly Lys Tyr Pro Ser Val Ile Ala 165 170 175 Val Gly Ala Val Asp Ser Ser Asn Gin Arg Ala Ser Phe Ser Ser Val 180 185 190 Gly Pro Glu Leu Asp Val Met Ala Pro Gly Val Ser Ile Gin Ser Thr 195 200 205 Leu Pro Gly Asn Lys Tyr Gly Ala Tyr Asn Gly Thr Ser Met Ala Ser 210 215 220 Pro His Val Ala Gly Ala Ala Ala Leu Ile Leu Ser Lys His Pro Asn 225 230 235 240 Trp Thr Asn Thr Gin Val Arg Ser Ser Leu Glu Asn Thr Thr Thr Lys 245 250 255 Leu Gly Asp Ser Phe Tyr Tyr Gly Lys-Gly Leu Ile Asn Val Gin Ala 260 265 270 Ala Ala Gin 275 INFORMATION FOR SEQ ID NO:17: SEQUENCE CHARACTERISTICS: LENGTH: 275 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (vii) IMMEDIATE SOURCE: CLONE: amino acid sequence of mature BPN' variant 34-ii SUBSTITUTE SHEET (RULE 26) WO 98/36085 PCT/US98/03068 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: Ala Gin Ser Val 1 His Pro 5 Tyr Tyr Gly Val Ser Gin Ile Lys Ala Pro Ala Leu Ser Gin Gly Thr Gly Ser 10 Val Asn 25 Lys Val Ala Ser Gly Ile Ser Met Val Gly Thr His Asp Ser Ser His Pro 40 Asn Pro Ser Glu Thr 55 Thr Asp Leu Lys Val Pro Phe Gin Asp Ala Thr Val Ile Asp Gly Gly Ala Asn Ser His Val Ala Val Gly 70 Pro Val Ala Ala 
Leu Thr Asn 75 Ser Ile Gly Leu Gly Val Ala Ser Ser Ala Ser Leu 90 Trp Tyr Ile Gly Ala Asp Gly Gin Tyr Ser 105 Val Ala Val Lys Val Leu Ile Asn Gly Ile Glu 110 Met Ser Leu Gly Gly Trp Ala Ile 115 Pro Ser Gly 130 Ser Gly Val Asn Asn Met Asp 120 Lys Ile Thr Ser Ala Ala Val Val Val 150 Thr Val Gly Leu 135 Ala Ala Ala Val 125 Asp Lys 140 Glu Gly Pro Ser Ala Ala 145 Ser Ser Ser Tyr Pro Gly Gly Asn 155 Lys Tyr 170 Arg Ala Ala Val Ala Thr Ser Gly 160 Val Ile Ala 175 Val Gly Ala Gly Pro Glu 195 165 Val Asp 180 Leu Asp Ser Ser Asn Val Met Ala 200 Tyr Gly Ala Gin 185 Ser Phe Ser Ser Val 190 Ser Ile Gin Ser Thr Pro Gly Val Tyr Ser Gly 205 Ser Leu Pro 210 Pro His Gly Asn Lys Val Ala Gly Asn Thr Gin 225 Trp 215 Ala Ala 230 Val Arg Ala Leu Ile Leu 235 Glu Ser Lys Asn Thr Thr 220 Met Ala Ser His Pro Thr 240 Thr Thr Lys 255 Val Gin Ala 270 Thr Ser Ser Leu Gly Asp Ser 260 Ala Ala Gin 245 Phe Leu 250 Gly Tyr Tyr Gly Lys 265 Leu Ile Asn INFORMATION FOR SEQ ID NO:18: SEQUENCE CHARACTERISTICS: LENGTH: 1260 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: codon-optimized-3D signal peptide-AAT DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
ATGAAGAACA
AACAGCGGCC
CACGACCAGG
AGCCTGTACC
AGCATCGCCA
ATCCTCGAAG
TTCCAGGAGC
AACGGGCTCT
AAGCTCTACC
CAGATCAACG
TTGGACAGGG
CGCCCGTTCG
GTCAAGGTCC
TCCAGCTGGG
CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG TCCTCTGCTC CCTGACCTGC AGGCCGAGGA CCCGCAGGGC ACCACCCGAC GTTCAACAAG GCCAGCTCGC GCACCAGTCC CCGCCTTCGC CATGCTGTCC GGCTGAACTT CAACCTGACG TGCTCAGGAC GCTCAACCAG TCCTGTCCGA GGGCCTCAAG ACTCCGAGGC GTTCACCGTC ACTACGTCGA GAAGGGGACC ACACCGTCTT CGCGCTCGTC AGGTGAAGGA CACCGAGGAG CGATGATGAA GAGGCTCGGC TGCTCCTCAT GAAGTACCTG GACGCCGCCC AGAAGACCGA ATCACCCCGA ATTTGGCCGA AACTCCACCA ACATCTTCTT CTGGGTACCA AGGCGGACAC GAGATCCCGG AGGCGCAGAT CCGGACTCCC AGCTCCAGCT CTCGTCGATA AGTTCCTGGA AACTTCGGGG ACACCGAGGA CAGGGCAAGA TCGTGGACCT AACTACATCT TCTTCAAGGG GAGGACTTCC ACGTCGACCA ATGTTCAACA TCCAGCACTG GGGAACGCCA CCGCCATCTT
CACCAGCCAC
ATTCGCCTTC
CAGCCCGGTG
CCACGACGAG
CCACGAGGGC
CACCACCGGC
GGACGTGAAG
GGCCAAGAAG
GGTCAAGGAA
CAAGTGGGAG
GGTCACCACC
CAAGAAGCTC
CTTCCTGCCG
GACGAGGGCA
CTGGAGAACG
ACGTACGACC
GCGGACCTCT
GCGGTGCTCA
ATCCCCATGT
GAGCAGAACA
AGCTCCAGCA CCTGGAGAAC GAGCTGACGC ACGACATCAT CACGAAGTTC AGGACAGGCG CTCCGCTAGC CTCCACCTCC CGAAGCTGAG CATCACCGGC TGAAGAGCGT GCTGGGCCAG CTGGGCATCA CGAAGGTCTT CAGCAACGGC CCGGCGTGAC GGAGGAGGCC CCCCTGAAGC TCTCCAAGGC CGTGCACAAG CGATCGACGA GAAGGGGACG GAAGCTGCCG GGGCCATGTT CCTGGAGGCC CCATCCCGCC CGAGGTCAAG TTCAACAAGC CCTTCGTCTT CCTGATGATC CGAAGAGCCC CCTCTTCATG GGGAAGGTCG TCAACCCCAC GCAGAAGTGA 900 960 1020 1080 1140 1200 1260 INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 1382 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: codon-optimized 3D signal peptide-ATIII DNA sequen (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: ATGAAGAACA CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG TCCTCTGCTC CCTGACCTGC AACAGCGGCC AGGCCCACGG AAGCCCTGTG GACATCTGCA CCCATGAATC CCATGTGCAT TTACCGCTCC CCGGAGAAGA TCAGAACAGA AGATCCCGGA GGCCACCAAC CGGCGTGTCT TCCCGCTTTG CTACCACTTT CTATCAGCAC CTGGCAGATT ATTTTCCTGT CACCCCTGAG TATCTCCACG GCTTTTGCTA AATGACACCC TCCAGCAACT GATGGAGGTA TTTAAGTTTG TCTGATCAGA TCCACTTCTT CTTTGCCAAA CTGAACTGCC AAATCCTCCA AGTTAGTATC AGCCAATCGC CTTTTTGGAG GAGACCTACC AGGACATCAG TGAGTTGGTA TATGGAGCCA AAGGAAAATG CAGAGCAATC CAGAGCGGCC ATCAACAAAT GGCCGAATCA CCGATGTCAT TCCCTCGGAA GCCATCAATG CAGCCAAGCC GCGGGACATT AGGCAACTGA GGATGAGGGC GGGAACTGTC CAAGGCCAAT CCAAGAATGA CAATGATAAC TGACCAAGCT GGGTGCCTGT ACACCATATC TGAGAAAACA GACTCTATCG AAAAGCCAAC ACAAATCCCT TACCTTCAAT AGCTCCAGCC CCTGGACTTC GGGTGTCCAA TAAGACCGAA AGCTCACTGT TCTGGTGCTG TCAGCCCTGA GAACACAAGG CATCTATGAT GTACCAGGAA TGCTTGAGTT GCCCTTCAAA AGAAGAGCCT GGCCAAGGTG GTTAACACCA TTTACTTCAA GGGCCTGTGG AAGGAACTGT TCTACAAGGC TGATGGAGAG GGCAAGTTCC GTTATCGGCG CGTGGCTGAA GGTGATGACA TCACCATGGT CCTCATCTTG GAGAAGGAAC TCACCCCAGA GGTGCTGCAG CTGGTGGTTC ACATGCCCCG CTTCCGCATT CAAGACATGG GCCTTGTCGA TCTGTTCAGC GCAGAAGGCC GAGATGACCT CTATGTCTCA AATGAAGAAG GCAGTGAAGC AGCTGCAAGT
AAGTCAAAGT
TCGTGTTCAG
GGCACCCAGG
CCCAAGCCTG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1382 GAGTGGCTGG ATGAATTGGA GAGGACGGCT TCAGTTTGAA CCTGAAAAGT CCAAACTCCC GATGCATTCC ATAAGGCATT ACCGCTGTTG TGATTGCTGG
GGAGATGATG
GGAGCAGCTG
AGGTATTGTT
TCTTGAGGTA
CCGTTCGCTA
AACCCCAACA GGGTGACTTT CAAGGCCAAC AGGCCCTTCC CCTCTGAACA CTATTATCTT CATGGGCAGA GTAGCCAACC
CC
TGGTTTTTAT AAGAGAAGTT CTTGTGTTAA GTAACTCGAG
INFORMATION FOR SEQ ID NO:20: SEQUENCE CHARACTERISTICS: LENGTH: 1940 base pairs
TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: codon-optimized 3D signal peptide-HSA DNA sequence (xi) SEQUENCE DESCRIPTION: SEQ ID ATGAAGAACA CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG TCCTCTGCTC CCTGACCTGC AACAGCGGCC AGGCCAGATG CACACAAGAG TGAGGTTGCT CATCGGTTTA AAGATTTGGG AGAAGAAAAT TTCAAAGCCT TGGTGTTGAT TGCCTTTGCT CAGTATCTTC AGCAGTGTCC ATTTGAAGAT CATGTAAAAT TAGTGAATGA AGTAACTGAA TTTGCAAAAA CATGTGTAGC TGATGAGTCA GCTGAAAATT GTGACAAATC ACTTCATACC CTTTTTGGAG ACAAATTATG CACAGTTGCA ACTCTTCGTG AAACCTATGG TGAAATGGCT GACTGCTGTG CAAAACAAGA ACCTGAGAGA AATGAATGCT TCTTGCAACA CAAAGATGAC AACCCAAACC TCCCCCGATT GGTGAGACCA GAGGTTGATG TGATGTGCAC TGCTTTTCAT GACAATGAAG AGACATTTTT 120 180 240 300 360 420 480 34-iv SUBSTITUTE SHEET (RULE 26) WO 98/36085 WO 9836085PCT/US98/03068 GAAAAAATAC TTATATGAAA TTGCCAGAAG ACATCCTTAC TTTTATGCCC CGGAACTCCT 540 TTTCTTTGCT AAAAGGTATA AAGCTGCTTT TACAGAATGT TGCCAAGCTG CTGATAAAGC 600 TGCCTGCCTG TTGCCAAAGC TCGATGAACT TCGGGATGAA GGGAAGGCTT CGTCTGCCAA 660 ACAGAGACTC AAATGTGCCA GTCTCCAAAA ATTTGGAGAA AGAGCTTTCA AAGCATGGGC 720 AGTGGCTCGC CTGAGCCAGA GATTTCCCAA AGCTGAGTTT GCAGAAGTTT CCAAGTTAGT 780 GACAGATCTT ACCAAAGTCC ACACGGAATG CTGCCATGGA GATCTGCTTG AATGTGCTGA 840 TGACAGGGCG GACCTTGCCA AGTATATCTG TGAAAATCAG GATTCGATCT CCAGTAAACT 900 GAAGGAATGC TGTGAAAAAC CTCTGTTGGA AAAATCCCAC TGCATTGCCG AAGTGGAAAJA 960 TGATGAGATG CCTGCTGACT TGCCTTCATT AGCTGCTGAT TTTGTTGAAA GTAAGGATGT 1020 TTGCAAAAAC TATGCTGAGG CAAAGGATGT CTTCCTGGGC ATGTTTTTGT ATGAATATGC 1080 AAGAAGGCAT CCTGATTACT CTGTCGTGCT GCTGCTGAGA CTTGCCAAGA CATATGAC 1140 CACTCTAGAG AAGTGCTGTG CCGCTGCAGA TCCTCATGAA TGCTATGCCA AAGTGTTCGA 1200 TGAATTTAAA CCTCTTGTGG AAGAGCCTCA GAATTTAATC AAACAAAACT GTGAGCTTTT 1260 TAAGCAGCTT GGAGAGTACA AATTCCAGAA TGCGCTATTA GTTCGTTACA CCAAGAAAGT 1320 ACCCCAAGTG TCAACTCCAA CTCTTGTAGA GGTCTCAAGA AACCTAGGAA AAGTGGGCAG 1380 CAAATGTTGT AAACATCCTG AAGCAAAAAG AATGCCCTGT GCAGAAGACT ATCTATCCGT 1440 GGTCCTGAAC CAGTTATGTG TGTTGCATGA GAAAACGCCA GTAAGTGACA GAGTCACAAA 1500 
ATGCTGCACA GAGTCCTTGG TGAACAGGCG ACCATGCTTT TCAGCTCTGG AAGTCGATGA 1560 AACATACGTT CCCAAAGAGT TTAATGCTGA AACATTCACC TTCCATGCAG ATATATGCAC 1620 ACTTTCTGAG AAGGAGAGAC AAATCAAGAA ACAAACTGCA CTTGTTGAGC TTGTGAAACA 1680 CAAGCCCAAG GCAACAAAAG AGCAACTGAA AGCTGTTATG GATGATTTCG CAGCTTTTGT 1740 AGAGAAGTGC TGCAAGGCTG ACGATAAGGA GACCTGCTTT GCCGAGGAGG GTAAAAAACT 1800 TGTTGCTGCA AGTCAAGCTG CCTTAGGCTT ATAACATCTA CATTTAAAAG CATCTCAGCC 1860 TACCATGAGA ATAAGAGAAA GAAAATGAAG ATCAAAAGCT TATTCATCTG TTTTCTTTTT 1920 CGTTGGTGTA AAGCCAACAC 1940 INFORMATION FOR SEQ ID NO:21: Wi SEQUENCE CHARACTERISTICS: LENGTH: 1140 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (vii) IMMEDIATE SOURCE: CLONE: codon-optimized 3D signal peptide-BPN' DNA sequene (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: ATGAAGAACA CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG TCCTCTGCTC CCTGACCTGC AACAGCGGCC AGGCCGCTGG CAAGAGCAAC GGGGAGAAGA AGTACATCGT CGGCTTCAAG 120 CAGACCATGA GCACCATGAG CGCCGCCAAG AAGAAGGACG TCATCAGCGA GAAGGGCGGC 180 AAGGTACAGA AGCAGTTCAA GTACGTGGAC GCCGCCAGCG CCACCCTCAA CGAGAAGGCC 240 GTCAAGGAGC TGAAGAAGGA CCCGAGCGTC GCCTACGTCG AGGAGGACCA CGTCGCCCAC 300 GCATATGCAC AGAGCGTCCC GTACGGCGTC AGCCAGATCA AGGCCCCGGC CCTCCACAGC 360 CAGGGCTACA CCGGCAGCAA CGTCAAGGTC GCCGTCATCG ACAGCGGCAT CGACAGCAGC 420 CACCCGGACC TCAAGGTCGC CGGCGGAGCT AGCATGGTCC CGAGCGAGAC CAACCCGTTC 480 CAGGACACCA ACAGCCATGG CACCCACGTC GCCGGCAC!CG TCGCCGCCCT CACCAACAGC 540 ATCGGCGTCC TCGGCGTCGC CCCGAGCGCC AGCCTCTACG CCGTCAAGGT ACTCGGCGCC 600 GACGGCAGCG GCCAGTACAG CTGGATCATC AACGGCATCG AGTGGGCCAT CGCCAACAAC 660 ATGGACGTCA TCACCATGAG CCTCGGCGGC CCGAGCGGCA GCGCCGCCCT CAAGGCCGCC 720 GTCGACAAGG CCGTCGCCAG CGGCGTCGTC GTCGTCGCCG CCGCCGGCAA CGAGGGCACC 780 AGCGGCAGCA GCAGCACCGT CGGCTACCCG GGCAAGTACC CGAGCGTCAT CGCCGTCGGC 840 GCCGTGGACA GCAGCAACCA GCGCGCGAGC TTCAGCAGCG TCGGCCCGGA GCTGGACGTC 900 ATGGCCCCGG GCGTCAGCAT CCAGAGCACC CTCCCGGGCA ACAAGTACGG CGCCTACAGC 960 GGCACCAGCA TGGCCAGCCC GCACGTCGCC GGCGCCGCTG CACTCATCCT CAGCAAGCAC 1020 CCGACCTGGA CCAACACCCA 
GGTCCGCAGC AGCCTGGAGA ACACCACCAC CAAGCTCGGC 1080
GACAGCTTCT ACTACGGCAA GGGCCTCATC AACGTCCAGG CCGCCGCCCA GTGACTCGAG 1140

INFORMATION FOR SEQ ID NO:22:
SEQUENCE CHARACTERISTICS:
LENGTH: 13 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(vii) IMMEDIATE SOURCE: CLONE: N-terminus of mature AAT
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr

INFORMATION FOR SEQ ID NO:23:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GCTTGACCTG TAACTCGGGC CAGGCGAGCT 30

INFORMATION FOR SEQ ID NO:24:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CGCCTAGCCC GAGTTACAGG TCAAGCAGCT 30

INFORMATION FOR SEQ ID NO:25:
SEQUENCE CHARACTERISTICS:
LENGTH: 37 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
AGCTCCATGG CCGTGGCTCG AGTCTAGACG CGTCCCC 37

INFORMATION FOR SEQ ID NO:26:
SEQUENCE CHARACTERISTICS:
LENGTH: 33 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GGGGACGCGT CTAGACTCGA GCCACGGCCA TGG 33

INFORMATION FOR SEQ ID NO:27:
SEQUENCE CHARACTERISTICS:
LENGTH: 35 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
GCATGCAGGT GCTGAACACC ATGGTGAACA AACAC 35

INFORMATION FOR SEQ ID NO:28:
SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
TTCTTGTCCC TTTCGGTCCT CATCGTCCTC CT 32

INFORMATION FOR SEQ ID NO:29:
SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
TGGCCTCTCC TCCAACTTGA CAGCCGGGAG CT 32

INFORMATION FOR SEQ ID NO:30:
SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
TTCACCATGG TGTTCAGCAC CTGCATGCTG CA 32

INFORMATION FOR SEQ ID NO:31:
SEQUENCE CHARACTERISTICS:
LENGTH: 32 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
CGATGAGGAC CGAAAGGGAC AAGAAGTGTT TG 32

INFORMATION FOR SEQ ID NO:32:
SEQUENCE CHARACTERISTICS:
LENGTH: 35 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
CCCGGCTGTC AAGTTGGAGG AGAGGCCAAG GAGGA 35

INFORMATION FOR SEQ ID NO:33:
SEQUENCE CHARACTERISTICS:
LENGTH: 29 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
GAGGATCCCC AGGGAGATGC TGCCCAGAA 29

INFORMATION FOR SEQ ID NO:34:
SEQUENCE CHARACTERISTICS:
LENGTH: 34 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
CGCGCTCGAG TTATTTTTGG GTGGGATTCA CCAC 34
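
The relationship between the oligonucleotide of SEQ ID NO:33 and the mature AAT N-terminus of SEQ ID NO:22 can be checked by translation. The following is a minimal Python sketch, not part of the specification. It assumes the standard genetic code, and it assumes that the 75-nucleotide prefix shared by SEQ ID NOs:19-21 as printed above is the codon-optimized signal-peptide coding region. The function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna: str) -> str:
    """Translate the complete codons of a DNA string in frame 1.

    Spaces are ignored and a trailing partial codon is dropped.
    """
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Sequences copied from the listing above.
SEQ_22 = "Glu Asp Pro Gln Gly Asp Ala Ala Gln Lys Thr Asp Thr"
SEQ_33 = "GAGGATCCCC AGGGAGATGC TGCCCAGAA"

# Assumption: 75-nt prefix common to SEQ ID NOs:19-21 as printed.
SIGNAL_PREFIX = (
    "ATGAAGAACA CCTCCTCCCT CTGCCTCCTG CTGCTCGTGG "
    "TCCTCTGCTC CCTGACCTGC AACAGCGGCC AGGCC"
)

primer_peptide = translate(SEQ_33)  # EDPQGDAAQ
mature_n_terminus = "".join(THREE_TO_ONE[r] for r in SEQ_22.split())

signal_peptide = translate(SIGNAL_PREFIX)
prefix_length = len(SIGNAL_PREFIX.replace(" ", ""))
# The downstream coding sequence stays in frame only if the prefix is a
# whole number of codons and contains no stop codon.
in_frame = prefix_length % 3 == 0 and "*" not in signal_peptide

print(primer_peptide, mature_n_terminus.startswith(primer_peptide))
print(signal_peptide, in_frame)
```

Under these assumptions, the first nine codons of SEQ ID NO:33 encode the first nine residues of SEQ ID NO:22, and the 75-nucleotide prefix translates to a 25-residue peptide with no stop codon, so a coding sequence joined directly after it remains in translation frame.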

Claims (15)

1. A method of producing, in monocot plant cells, a mature heterologous protein selected from the group consisting of (i) mature, glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and a glycosylation pattern which increases serum half-life substantially over that of mature non-glycosylated AAT; (ii) mature, glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; (iii) mature human serum albumin (HSA) having the same N-terminal amino acid sequence as mature HSA produced in humans and having the folding pattern of native mature HSA as evidenced by its bilirubin-binding characteristics; and (iv) mature, active subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus; the method comprising: obtaining monocot cells transformed with a chimeric gene having (i) a monocot transcriptional regulatory region, inducible by addition or removal of a small molecule, or during seed maturation, (ii) a first DNA sequence encoding the heterologous protein, and (iii) a second DNA sequence encoding a signal peptide, said first and second DNA sequences in translation-frame and encoding a fusion protein, and wherein (i) the transcriptional regulatory region is operably linked to the second DNA sequence, and (ii) said signal peptide is effective to facilitate secretion of the mature heterologous protein from the transformed cells; cultivating the transformed cells under conditions effective to induce said transcriptional regulatory region, thereby promoting expression of the fusion protein and secretion of the mature heterologous protein from the transformed cells; and isolating said mature heterologous protein produced by the transformed cells.
2. The method of claim 1, wherein said first DNA sequence encodes proBPN', said cultivating includes cultivating said transformed cells at a pH between 5-6 to promote expression and secretion of proBPN' from the cells, and said isolating step includes incubating the proBPN' under conditions effective to allow the autoconversion of proBPN' to active mature BPN'.
3. The method of claim 1, wherein said first DNA sequence encodes mature BPN', and said method further includes: transforming said cells with a second chimeric gene containing (i) a transcriptional regulatory region inducible by addition or removal of a small molecule, or during seed maturation, (ii) a third DNA sequence encoding the pro-peptide moiety of BPN', and (iii) a fourth DNA sequence encoding a signal polypeptide, where said fourth DNA sequence is operably linked to said transcriptional regulatory region and said third DNA sequence, and where said signal polypeptide is in translation-frame with said pro-peptide moiety and is effective to facilitate secretion of expressed pro-peptide moiety from the transformed cells; said cultivating step includes cultivating the transformed cells at a pH between 5-6 to promote expression and secretion of BPN' and the pro-peptide moiety from the cells; and said isolating step includes incubating the BPN' and the pro-moiety under conditions effective to allow the conversion of BPN' to active mature BPN', and isolating the active mature BPN'.
4. The method according to any one of claims 1 to 3, wherein said signal peptide is the RAmy3D signal peptide having the amino acid sequence identified by SEQ ID NO:1.
5. The method according to any one of claims 1 to 3, wherein said second DNA sequence encodes the RAmy3D signal peptide (SEQ ID NO:1) and has the codon-optimized nucleotide sequence identified by SEQ ID NO:3.
6. The method according to any one of claims 1 to 3, wherein said signal peptide is the RAmy1A signal peptide having the amino acid sequence identified by SEQ ID NO:4.
7. The method according to any one of claims 1 to 3, wherein the second DNA sequence, the first DNA sequence, or both the second and the first DNA sequence, is codon-optimized for enhanced expression in said plant.
8. The method according to any one of claims 1 to 7, wherein said transcription regulatory region is a promoter derived from a rice or barley amylase gene selected from the group consisting of the RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 genes.
9. The method of claim 8, wherein the chimeric gene further comprises, between said transcriptional regulatory region and said second DNA coding sequence, the 5' untranslated region of an inducible monocot gene selected from the group consisting of RAmy1A, RAmy3B, RAmy3C, RAmy3D, HV18 and RAmy3E.
10. The method of claim 8, wherein said chimeric gene further comprises, downstream of the sequence encoding said fusion protein, the 3' untranslated region of an inducible monocot gene derived from a rice or barley α-amylase gene selected from the group consisting of the RAmy1A, RAmy1B, RAmy2A, RAmy3A, RAmy3B, RAmy3C, RAmy3D, and RAmy3E, pM/C, gKAmy141, gKAmy155, Amy32b, and HV18 genes.
11. The method of claim 1, wherein said cultivating includes culturing the transformed plant cells in a sugar-free or sugar-depleted medium, the transcriptional regulatory region is derived from the RAmy3E or RAmy3D gene, the 5' untranslated region is derived from the RAmy1A gene and has the sequence identified by SEQ ID NO:5, and the 3' untranslated region is derived from the RAmy1A gene.
12. The method according to any one of claims 1 to 7, wherein the transformed cells are aleurone cells of mature seeds, the transcriptional regulatory region is upregulated by addition of a small molecule to promote seed germination, and said cultivating includes germinating said seeds, either in embryonated or de-embryonated form.
13. The method of claim 12, wherein the transcriptional regulatory region is a rice α-amylase RAmy1A promoter or a barley HV18 promoter, and said small molecule is gibberellic acid.
14. A mature heterologous protein produced by the method of claim 1, wherein said protein is selected from the group consisting of: (i) mature glycosylated α1-antitrypsin (AAT) having the same N-terminal amino acid sequence as mature AAT produced in humans and having a glycosylation pattern which increases serum half-life substantially over that of non-glycosylated mature AAT; (ii) mature glycosylated antithrombin III (ATIII) having the same N-terminal amino acid sequence as mature ATIII produced in humans; and (iii) mature glycosylated subtilisin BPN' (BPN') having the same N-terminal amino acid sequence as BPN' produced in Bacillus; wherein said protein has a glycosylation pattern characteristic of proteins produced in said monocot plant.
15. The method according to any one of claims 1 to 13, wherein said monocot plant cells are transformed rice, barley, corn, wheat, oat, rye, sorghum, or millet cells.
16. The method according to any one of claims 1 to 13, wherein said monocot plant cells are transformed rice or barley cells.
17. Plant cells capable of producing the mature heterologous protein according to the method of claim 1, wherein said cultivating includes culturing the transformed plant cells in a sugar-free or sugar-depleted medium, the transcriptional regulatory region is derived from the RAmy3E or RAmy3D gene, the 5' untranslated region is derived from the RAmy1A gene and has the sequence identified by SEQ ID NO:5, and the 3' untranslated region is derived from the RAmy1A gene.
18. Seeds capable of producing the mature heterologous protein according to the method of claim 1, wherein said transformed cells are aleurone cells, the transcriptional regulatory region is upregulated by addition of a small molecule to promote seed germination, and said cultivating includes germinating said seeds, either in embryonated or de-embryonated form.
Dated this nineteenth day of February 2002
Applied Phytologics, Inc.
Patent Attorneys for the Applicant: F B RICE & CO
AU61716/98A 1997-02-13 1998-02-13 Production of mature proteins in plants Ceased AU746826B2 (en)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US3817097P 1997-02-13 1997-02-13
US3816897P 1997-02-13 1997-02-13
US3799197P 1997-02-13 1997-02-13
US3816997P 1997-02-13 1997-02-13
US60/037991 1997-02-13
US60/038169 1997-02-13
US60/038170 1997-02-13
US60/038168 1997-02-13
PCT/US1998/003068 WO1998036085A1 (en) 1997-02-13 1998-02-13 Production of mature proteins in plants

Publications (2)

Publication Number Publication Date
AU6171698A AU6171698A (en) 1998-09-08
AU746826B2 true AU746826B2 (en) 2002-05-02

Family

ID=27488492

Family Applications (1)

Application Number Title Priority Date Filing Date
AU61716/98A Ceased AU746826B2 (en) 1997-02-13 1998-02-13 Production of mature proteins in plants

Country Status (5)

Country Link
EP (1) EP0981635A1 (en)
JP (1) JP2001512318A (en)
AU (1) AU746826B2 (en)
CA (1) CA2280894A1 (en)
WO (1) WO1998036085A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2774379B1 (en) * 1998-01-30 2002-03-29 Groupe Limagrain Holding PROCESS FOR THE PRODUCTION OF ALPHA 1-ANTITRYPSIN AND ITS VARIANTS BY PLANT CELLS, AND PRODUCTS CONTAINING THE ALPHA-ANTITRYPSIN OBTAINED THEREBY
US6087558A (en) * 1998-07-22 2000-07-11 Prodigene, Inc. Commercial production of proteases in plants
DE19947290A1 (en) * 1999-10-01 2001-04-19 Greenovation Pflanzenbiotechno Process for the production of proteinaceous substances
US8022270B2 (en) 2000-07-31 2011-09-20 Biolex Therapeutics, Inc. Expression of biologically active polypeptides in duckweed
NZ523912A (en) 2000-07-31 2005-03-24 Biolex Inc Expression of biologically active polypeptides in duckweed
US7632983B2 (en) 2000-07-31 2009-12-15 Biolex Therapeutics, Inc. Expression of monoclonal antibodies in duckweed
EP2261250B1 (en) 2001-12-21 2015-07-01 Human Genome Sciences, Inc. GCSF-Albumin fusion proteins
GB0314856D0 (en) * 2003-06-25 2003-07-30 Unitargeting Res As Protein expression system
WO2006108830A2 (en) * 2005-04-13 2006-10-19 Bayer Cropscience Sa TRANSPLASTOMIC PLANTS EXPRESSING α 1-ANTITRYPSIN
WO2007002762A2 (en) * 2005-06-28 2007-01-04 Ventria Bioscience Components of cell culture media produced from plant cells
JP2007151435A (en) * 2005-12-02 2007-06-21 Niigata Univ Transformed plant having high starch-accumulating ability and method for producing the same
JP5158639B2 (en) 2008-04-11 2013-03-06 独立行政法人農業生物資源研究所 Genes specifically expressed in the endosperm of plants, promoters of the genes, and use thereof
KR20120018827A (en) 2009-02-20 2012-03-05 벤트리아 바이오사이언스 Cell culture media containing combinations of proteins
CN102532254B (en) * 2010-12-24 2015-06-24 武汉禾元生物科技股份有限公司 Method for separating and purifying recombinant human serum albumin (rHSA) from rice seeds
KR102435211B1 (en) * 2021-06-29 2022-08-23 (주)진셀바이오텍 Plant cell lines producing recombinant albumin with high yield and uses thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1992001042A1 (en) * 1990-07-06 1992-01-23 Novo Nordisk A/S Transgenic plants expressing industrial enzymes
WO1995014099A2 (en) * 1993-11-16 1995-05-26 The Regents Of The University Of California Process for protein production in plants

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0348348B1 (en) * 1988-06-20 2000-08-09 Novartis AG Process for controlling plant pests with the help of non-plant-derived proteinase-inhibitors
EP0428572A1 (en) * 1988-07-29 1991-05-29 Washington University School Of Medicine Producing commercially valuable polypeptides with genetically transformed endosperm tissue
NL8901932A (en) * 1989-07-26 1991-02-18 Mogen Int PRODUCTION OF heterologous PROTEINS IN PLANTS OR PLANTS.
US5460952A (en) * 1992-11-04 1995-10-24 National Science Counsil Of R.O.C. Gene expression system comprising the promoter region of the α-amylase genes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JENSEN L.G. ET AL., 1996, PROC. NATL. ACAD. SCI. 93:3487-3491 *

Also Published As

Publication number Publication date
EP0981635A1 (en) 2000-03-01
JP2001512318A (en) 2001-08-21
WO1998036085A1 (en) 1998-08-20
CA2280894A1 (en) 1998-08-20
AU6171698A (en) 1998-09-08

Similar Documents

Publication Publication Date Title
EP0871749B1 (en) Oil body proteins as carriers of high value proteins
US6753167B2 (en) Preparation of heterologous proteins on oil bodies
Zhong et al. Commercial production of aprotinin in transgenic maize seeds
US5948682A (en) Preparation of heterologous proteins on oil bodies
AU746826B2 (en) Production of mature proteins in plants
US6777591B1 (en) Legume-like storage protein promoter isolated from flax and methods of expressing proteins in plant seeds using the promoter
US6359196B1 (en) Germination-specific plant promoters
US8158857B2 (en) Monocot seed product comprising a human serum albumin protein
JP4570327B2 (en) Methods for the production of multimeric proteins and related compositions
JPH04502861A (en) Production of heterologous proteins in plants and plant cells
US20030167531A1 (en) Expression and purification of bioactive, authentic polypeptides from plants
KR100719629B1 (en) Flax seed specific promoters
US5824870A (en) Commercial production of aprotinin in plants
US6127145A (en) Production of α1 -antitrypsin in plants
JP2008521767A (en) Protein isolation and purification
US6066781A (en) Production of mature proteins in plants
WO1997017453A9 (en) Commercial production of aprotinin in plants
AU2003218396B2 (en) Human blood proteins expressed in monocot seeds
WO2002064750A2 (en) Expression system for seed proteins
US6750046B2 (en) Preparation of thioredoxin and thioredoxin reductase proteins on oil bodies
US20080010697A1 (en) Methods of Expressing Heterologous Protein in Plant Seeds Using Monocot Non Seed-Storage Protein Promoters
Kervinen et al. Structure and possible function of aspartic proteinases in barley and other plants
AU750980B2 (en) Rice beta-glucanase enzymes and genes
AU2023201738A1 (en) Nepenthesin-1 derived resistance to fungal pathogens in major crop plants
Sutliff et al. Production of α 1-antitrypsin in plants

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)
PC Assignment registered

Owner name: VENTRIA BIOSCIENCE, THE REGENTS OF THE UNIVERSITY

Free format text: FORMER OWNER WAS: APPLIED PHYTOLOGICS, INC.