WO1999055890A1 - A method for increasing the protein content of plants - Google Patents

A method for increasing the protein content of plants

Info

Publication number
WO1999055890A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
gene
equals
unit
Prior art date
Application number
PCT/US1999/009067
Other languages
French (fr)
Inventor
Jesse M. Jaynes
Original Assignee
Demegen, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Demegen, Inc.
Priority to CA002325463A (patent CA2325463A1)
Priority to AU38681/99A (patent AU3868199A)
Publication of WO1999055890A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8243 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits involving biosynthetic or metabolic pathways, i.e. metabolic engineering, e.g. nicotine, caffeine
    • C12N15/8251 Amino acid content, e.g. synthetic storage proteins, altering amino acid biosynthesis
    • C12N15/8253 Methionine or cysteine
    • C12N15/8254 Tryptophan or lysine
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A40/00 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production
    • Y02A40/10 Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture
    • Y02A40/146 Genetically Modified [GMO] plants, e.g. transgenic plants

Definitions

  • The composition of plant storage proteins determines the nutritional value of plants and grains when they are used as foods for man and domestic animals.
  • The amount of protein varies with genotype or cultivar, but in general, cereals contain 10% of the dry weight of the seed as protein, while in legumes, the protein content varies between 20% and 30% of the dry weight. In many seeds, the storage proteins account for 50% or more of the total protein and thus determine the protein quality of seeds.
  • Each year the total world cereal harvest amounts to some 1,700 million tons of grain (Keris et al. 1985). This yields about 85 million tons of cereal storage proteins harvested each year and contributes a majority of the total protein intake of humans and animals.
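The 85 million ton figure agrees with the percentages quoted above. A minimal sketch of that arithmetic follows, assuming the harvest tonnage is treated as dry weight, cereal protein is taken at 10%, and storage proteins are taken at the 50% lower bound. The variable and function names are illustrative and do not come from the source.

```python
# Figures quoted in the text; treating harvest tonnage as dry weight is an assumption.
CEREAL_HARVEST_MT = 1700.0   # million tons of grain per year
PROTEIN_FRACTION = 0.10      # cereals: about 10% of seed dry weight is protein
STORAGE_FRACTION = 0.50      # storage proteins: about 50% of total seed protein


def storage_protein_mt(harvest_mt, protein_fraction, storage_fraction):
    """Return the storage protein content of a harvest, in million tons."""
    return harvest_mt * protein_fraction * storage_fraction


estimate = storage_protein_mt(CEREAL_HARVEST_MT, PROTEIN_FRACTION, STORAGE_FRACTION)
print(f"Estimated cereal storage protein: {estimate:.0f} million tons per year")
```

Because the text gives "50% or more" for the storage fraction, this is a lower-bound estimate rather than an exact reconstruction of how the cited figure was obtained.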
  • AMINO ACID REQUIREMENTS. The biosynthesis of amino acids from simpler precursors is a process vital to all forms of life, as these amino acids are the building blocks of proteins. Organisms differ markedly with respect to their ability to synthesize amino acids. In fact, virtually all members of the animal kingdom are incapable of manufacturing some amino acids. There are twenty common amino acids which are utilized in the fabrication of proteins, and essential amino acids are those protein building blocks which cannot be synthesized by the animal. It is generally agreed that humans require eight of the twenty common amino acids in their diet. Protein deficiencies can usually be ascribed to a diet which is deficient in one or more of the essential amino acids. A nutritionally adequate diet must include a minimum daily consumption of these amino acids.
  • Seed storage proteins can be characterized by several main features (Pernollet and Mosse 1983): 1) their main function is to provide amino acids or nitrogen to the young seedling; 2) the general absence of any other known function; 3) their peculiar amino acid composition in cereal and legume seeds; and 4) their localization within storage organelles called protein bodies, at least during seed development.
  • Several classes of storage proteins are generally recognized based on their solubilities in different solvents. Proteins soluble in water are called “albumins”; proteins soluble in 5% saline, “globulins”; and proteins soluble in 70% ethanol, “prolamins”. The proteins that remain following these extractions are treated further with dilute acid or alkali, and are named “glutelins”.
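The solubility scheme above can be read as a sequential extraction, in which a protein is assigned to the first solvent that dissolves it. The sketch below encodes that reading. The function name and the dictionary input format are hypothetical, and real fractions overlap more than this simple rule suggests.

```python
# Extraction series in the order given in the text: (solvent, protein class).
EXTRACTION_SERIES = [
    ("water", "albumin"),
    ("5% saline", "globulin"),
    ("70% ethanol", "prolamin"),
]


def solubility_class(soluble_in):
    """Classify a protein from a mapping of solvent name to True/False.

    The first solvent in the series that dissolves the protein decides the
    class; anything left after all three extractions is called a glutelin.
    """
    for solvent, protein_class in EXTRACTION_SERIES:
        if soluble_in.get(solvent, False):
            return protein_class
    return "glutelin"


# Hypothetical example: a protein insoluble in water and saline, soluble in ethanol.
print(solubility_class({"water": False, "5% saline": False, "70% ethanol": True}))
```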
  • Most cereals contain primarily prolamin type proteins and can be classified into different groups on the basis of the relative proportions of prolamins, glutelins, and globulins, and the subcellular location of these proteins in the mature seed.
  • The first group corresponds to the Panicoideae sub-family, the second group to the Triticeae tribe, and the last one to oat and rice storage proteins.
  • The principal members of the Panicoideae sub-family are maize, sorghum, and millet. Their major storage proteins are prolamins (50 to 60% of seed protein) and glutelins (35 to 40% of seed protein) (Pernollet and Mosse 1983). Prolamins are stored within protein bodies, but glutelins are located both inside and outside these organelles.
  • The Triticeae tribe, which includes wheat, barley, and rye, differs from the Panicoideae mainly in storage protein localization and structure. In the starchy endosperm of the seeds belonging to this tribe, no protein bodies are left at maturity. Clusters of proteins are then deposited between starch granules, but are no longer surrounded by a membrane.
  • The major storage proteins are salt-soluble globulins (80%) and prolamins (10-15%). Globulins can be divided into vicilins and legumins (Agros, 1985), based on their sedimentation coefficient (7S/11S), oligomeric organization (trimeric/hexameric), and polypeptide chain structure (single chain/disulphide-linked pair of chains).
  • protein bodies are embedded between starch granules
  • phaseolin tetramer of trimer
  • Glycinin: the soybean legumin
  • The soybean legumin has a quaternary structure that was suggested by Badley et al. (1975) to be twelve subunits packed in two identical hexagons.
  • The legumin molecule is a polymer formed by the association of six monomers. Each monomer consists of two subunits, acidic and basic. Sometimes, these subunits are associated by disulfide linkages.
  • Arachin, the peanut legumin, was found to consist of different kinds of subunits. The arachin hexamer association does not need different kinds of subunits, which suggests that the subunits have a very similar structure.
  • The most studied storage proteins, in terms of structure, are the corn prolamins called zeins. These proteins perform no known enzymatic function.
  • Three types of zeins (α, β and γ) (Esen 1986) are synthesized on rough endoplasmic reticulum and aggregate within this membrane as protein bodies. The zein protein readily self-associates to form protein bodies and is insoluble in water even in low concentrations of salt.
  • The presence of all types of zeins is not necessary for the formation of a protein body as a single type of zein can aggregate into a dense structure and is generally found at the surface of protein bodies (Lending et al. 1988; Wallace et al. 1988).
  • Circular dichroic measurements, amino acid sequence analysis, and electron microscopy of a zein protein suggest that zein secondary structure is primarily helical with nine adjacent, topologically antiparallel helices clustered within a distorted cylinder (Agros et al., 1982; Larkins, 1983; Larkins et al., 1984).
  • Polar and hydrophobic residues are appropriately distributed along the helical surfaces allowing intra- and intermolecular hydrogen bonds and van der Waals interactions among neighboring helices, such that rod-shaped zein molecules can aggregate and then stack through glutamate interactions at the cylindrical caps. Because of this structure, zein is much less soluble under physiological conditions than the globulin phaseolin, and precipitation of insoluble zein in the tightly packed protein body may make it less available for proteolytic degradation (Greenwood and Chrispeels 1985).
  • The storage protein structures are adapted to a maximal packing within protein bodies (Pernollet and Mosse 1983). Maximal packing is achieved in at least one of two ways. The folding of the polypeptide chain may favor the maximal packing of amino acids within the protein molecule, or the compacting of proteins is increased by the formation of closely packed quaternary structure. High degrees of polymerization can be observed in pearl millet pennisetin (Pernollet and Mosse 1983) or zein (Lending et al. 1988; Wallace et al. 1988). Also, wheat prolamins and glutelin associate into aggregates, resulting in the formation of insoluble gluten. These insoluble forms of protein deposits are osmotically inactive and stable during the long period of storage between the time of seed maturation and germination.
  • Cis-acting DNA sequences involved in developmental and/or tissue-specific regulation of gene expression can be defined by introducing plant storage protein gene regulatory regions coupled to bacterial reporter genes (Twell and Ooms 1987; Wenzler et al. 1989; Marries et al. 1988; Chen et al. 1988), or by introducing entire or dissected genes (Colot et al. 1987; Chen et al. 1986) into a transgenic environment.
  • A transformation system for the nutritionally important cereal species has not yet been well established. Therefore, most regulation mechanisms have been studied with transgenic dicot plants.
  • Gene expression is controlled, at least partly, by the interaction between regulatory molecules and short sequences that are present in the 5' flanking region of the gene.
  • The regulatory sequences of potato storage protein were investigated using transgenic potato plants.
  • A 2.5 kb 5' flanking DNA fragment containing the promoter and the patatin gene was used to construct a transcriptional fusion gene with chloramphenicol acetyl transferase.
  • CAT: chloramphenicol acetyl transferase
  • GUS: β-glucuronidase
  • the expression pattern of storage protein genes of cereals is retained in tobacco, not only with respect to tissue, but also to temporal expression.
  • the 5' upstream regions of wheat glutenin genes possess regulatory sequences that determine endosperm-specific expression in transgenic tobacco (Colot et al. 1987).
  • Deletion analysis of the low molecular weight (LMW) glutenin sequence indicated that sequences present between 326 bp and 160 bp upstream of the transcriptional start point are necessary to confer endosperm-specific expression.
  • cis-acting elements determining the regulation of each gene in the cluster are recognized by the tobacco trans-acting factor but also that cis-acting elements directing expression of one gene do not affect expression of neighboring genes.
  • VSP: vegetative storage protein
  • The Agrobacterium tumefaciens Ti plasmid as a vector system for the transformation of plants.
  • A. tumefaciens infects most dicotyledonous and some monocotyledonous plants by entry through wound sites.
  • The bacteria bind to cells in the wound and are stimulated by phenolic compounds released from these cells to transfer a portion of their endogenous, 200 kb Ti plasmid into the plant cell (Weiler and Schroder 1987).
  • T-DNA: the transferred portion of the Ti plasmid
  • The transferred portion of the Ti plasmid (T-DNA) becomes covalently integrated into the plant genome, where it directs the biosynthesis of phytohormones using enzymes which it encodes.
  • The vir gene in the bacterial genome is known to be responsible for this process.
  • Directly repeating sequences of 25 bases called "border" sequences are essential, but only the right terminus has been shown to be used for T-DNA transfer and integration.
  • Ti plasmids from which these disease-producing genes have been removed or replaced are referred to as "disarmed" and can be used for the introduction of foreign genes into plants.
  • intermediate vectors such as pMON237 or pBI121 can be used to introduce genes into the Ti plasmid.
  • cointegrating vectors and binary vectors are available as intermediate vectors.
  • a cointegrating transformation vector must include a region of homology between the vector plasmid and the Ti plasmid. Once recombination occurs, the cointegrated plasmid is replicated by the Ti plasmid origin of replication.
  • The cointegrate system, while more difficult to use, does offer advantages. Once the cointegrate has been formed, the plasmid is stable in Agrobacterium.
  • A binary vector contains an origin of replication from a broad host-range plasmid instead of a region of homology with the Ti plasmid. Since the plasmid does not need to form a cointegrate, these plasmids are considerably easier to introduce into Agrobacterium.
  • The other advantage of binary vectors is that such a vector can be introduced into any Agrobacterium host containing any Ti or Ri plasmid, as long as the vir helper function is provided. Using these systems, the gene regulation mechanism of storage proteins has been elucidated.
  • The amino acid composition of the cereal endosperm protein is characterized by a high content of proline and glutamine, while the amount of essential amino acids, lysine and tryptophan in particular, is a limiting factor (Pernollet and Mosse 1983).
  • Sulfur-containing amino acids such as methionine and cysteine are the major limiting essential amino acids for the efficient utilization of plant protein as animal or human food, while roots and tubers are deficient in almost all of the essential amino acids.
  • The modified proteins are typically susceptible to proteolytic attack in the plant. Because a natural storage protein is a highly evolved structure, artificial modifications to it are likely to destabilize it. For example, a stabilizing glutamic acid-lysine salt bridge might be broken.
  • (3) Multiple copies of genes. Naturally occurring storage proteins are typically encoded by multiple gene copies. A mutation in just one of the copies of the gene will likely have only a limited effect.
  • Lysine and tryptophan-encoding oligonucleotides were introduced at several positions into a 19 kD α-type zein complementary DNA by oligonucleotide-mediated mutagenesis (Wallace et al. 1988).
  • Messenger RNA for the modified zein was synthesized in vitro and injected into Xenopus laevis oocytes.
  • The modified zein aggregated into structures similar to membrane-bound protein bodies. This experiment suggested the possibility of creating high-lysine corn by genetic engineering.
  • The maize 15 kD zein structural gene was placed under the regulation of French bean β-phaseolin gene flanking regions and expressed in tobacco (Hoffmann et al. 1987). Zein accumulation was obtained as high as 1.6% of the total seed protein. Zein was found in roots, hypocotyls, and cotyledons of the germinating transgenic tobacco seeds. Zein was deposited and accumulated in the vacuolar protein bodies of the tobacco embryo and endosperm. The storage proteins of legume seeds such as the common bean (Phaseolus vulgaris) and soybean (Glycine max) are deficient in sulfur-containing amino acids. The nutritional quality of soybean could be improved by introducing and expressing the gene encoding methionine-rich 15 kD zein.
  • HAAE I High Essential Amino Acid Encoding
  • Protein design has two components: the design of activity and the design of structure. This review will concentrate on the design of structurally stable storage protein-like proteins.
  • the stability of this peptide is 1000-fold greater than the value calculated from the Zimm-Bragg equation.
  • Specific side-chain interactions, factors that are not considered in the Zimm-Bragg model, are responsible, at least in part, for the fact that the C-peptide is much more helical than predicted (Scheraga, 1985).
  • Medium-range interactions are responsible for the additional stabilization of secondary structures (DeGrado et al. 1989). Interactions between side-chains are regarded as important medium-range interactions (Shoemaker et al. 1987; Marqusee and Baldwin 1987). These include electrostatic interactions, hydrogen bonding, and the perpendicular stacking of aromatic residues (Blundell et al. 1986).
  • An α-helix possesses a dipole moment as a result of the alignment of its peptide bonds. The positive and negative ends of the amide group dipole point toward the helix NH2-terminus and COOH-terminus, respectively, giving rise to a significant macrodipole.
  • Protein structures contain several long-range stabilizing interactions which include hydrophobic and packing interactions, and hydrogen bonds. Among these, the hydrophobic effect is a prime contributor to the folding and stabilizing of protein structures.
  • The driving force for helix formation in RNase A arises from long-range interactions between C-peptide and S-protein, a large fragment of the protein from which C-peptide was excised (Komoriya and Chaiken 1985).
  • Hydrophobic residues often repeat every three to four residues in an α-helix and form an amphiphilic structure (DeGrado et al. 1989). Amphiphilicity is important for the stabilization of the secondary structures of peptides and proteins which bind in aqueous solution to extrinsic apolar surfaces, including phospholipid membranes, air, and the hydrophobic binding sites of regulatory proteins (DeGrado and Lear 1985). This amphiphilic secondary structure can be stabilized relative to other conformations by self-association. Therefore, short peptides often form the α-helix in water only because the helix is amphiphilic and is stabilized by peptide aggregation along the hydrophobic surface.
  • Natural globular proteins are folded by a similar mechanism, involving hydrophobic interaction between neighboring segments of secondary structure (Presnell and Cohen 1989).
  • DeGrado and coworkers have successfully built peptide-hormone analogs with minimal homology to the native sequences. These peptides, like the native ones, are not helical in solution but do form helices at the hydrophobic surfaces of membranes.
  • Designed synthetic peptides have been used to show how hydrophobic periodicity in a protein sequence stabilizes the formation of simple secondary structures such as an amphiphilic α-helix (Ho and DeGrado 1987).
  • The strategies used in the design of the helices in the four-helix bundles are: 1) the helices should be composed of strong helix forming amino acids and 2) the helices should be amphiphilic; i.e., they should have an apolar face to interact with neighboring helices and a polar face to maintain water solubility of the ensuing aggregates.
  • The results show that hydrophobic periodicity can determine the structure of a peptide. Therefore, the peptides tend to have random conformations in very dilute solution, but form secondary structures when they self-associate (at high concentration) or bind to the air-water surface.
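Hydrophobic periodicity of the kind described above can be checked numerically with a helical-wheel projection. The sketch below is an illustration only: it assumes an ideal α-helix with 3.6 residues per turn (100° per residue), uses a crude two-level hydrophobicity score instead of a published scale, and tests two hypothetical Leu/Lys sequences that are not taken from the patent.

```python
import cmath
import math

# Assumed hydrophobic residue set (one-letter codes); a real analysis would
# use a graded hydrophobicity scale instead of this binary score.
HYDROPHOBIC = set("AVLIMFW")
DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per alpha-helical turn


def hydrophobic_moment(sequence):
    """Return the mean helical hydrophobic moment of a peptide (0 to 1).

    Each residue is placed on a helical wheel at k * 100 degrees and scored
    +1 (hydrophobic) or -1 (hydrophilic). A large value means the two kinds
    of residue sit on opposite faces of the helix, i.e. it is amphiphilic.
    """
    total = 0j
    for k, residue in enumerate(sequence):
        score = 1.0 if residue in HYDROPHOBIC else -1.0
        angle = math.radians(k * DEGREES_PER_RESIDUE)
        total += score * cmath.exp(1j * angle)
    return abs(total) / len(sequence)


# Hypothetical peptides with identical composition (8 Leu, 6 Lys).
periodic = "LKKLLKLLKKLLKL"   # hydrophobic residues recur every 3 to 4 positions
blocked = "LLLLLLLLKKKKKK"    # same residues, no helical periodicity

print(f"periodic: {hydrophobic_moment(periodic):.2f}")
print(f"blocked:  {hydrophobic_moment(blocked):.2f}")
```

The periodic sequence scores far higher than the blocked one even though the composition is the same, which is the point made in the text: the spacing of hydrophobic residues, not just their number, decides whether a helix has an apolar face.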
  • The free energy associated with dimerization or tetramerization of the designed peptides could be experimentally determined from the concentration dependence of the CD spectra for the peptides (DeGrado et al. 1989; Lear et al. 1988; DeGrado and Lear 1985).
  • the peptides were found to be monomeric and have low helical contents, whereas at high concentration they could self-associate and stabilize the secondary structure. Therefore, possible hairpin loops between helices can affect the stability of the secondary structure by enhancing the self-association between the helical monomers.
  • a strong helix breaker (Chou and Fasman 1978; Kabsch and Sander 1983, Sueki et al.
  • proline residue was included as the first and last residue to set the stage for adding a hairpin loop between the helices.
  • A single proline residue appeared capable of serving as a suitable link if the C- and N-terminal glycine residues are slightly unwound.
  • Glycine lacks a β-carbon, which is essential for the reverse turn where positive dihedral angles are required.
  • The pyrrolidine ring of proline constrains its φ dihedral angle to -60°.
  • Proline should be destabilizing at positions where significantly different backbone torsion angles are required.
  • This amino acid, as well as glycine, has a high tendency to break helices and occurs frequently at turns (Creighton 1987).
  • Structural stability of proteins is directly related to in vivo proteolysis (Parasell and Sauer 1989). Proteolysis depends on the accessibility of the scissile peptide bonds to the attacking protease. The sites of proteolytic processing are generally in relatively flexible interdomain segments or on the surface of the loops, in contrast to the less accessible intradomain peptide bonds (Neurath 1989). This suggests that the stability of the folded state of the protein is the most important determinant for its proteolytic degradation rate. The effect of a folded structure on the proteolytic degradation has been proven by several experiments. First, proteins that contain amino acid analogs or are prematurely terminated are often degraded rapidly in the cells (Goldberg and St. John 1976).
  • Metabolic stability is another factor influencing the in vivo stability of proteins.
  • damaged and abnormal proteins are metabolically unstable in vivo (Finley and Varshavsky 1985; Pontremoli and Melloni 1986).
  • Covalent conjugation of ubiquitin with proteins is essential for the selective degradation of short-lived proteins (Finley and Varshavsky 1985).
  • The amino acid at the amino-terminus of the protein determined the rate of ubiquitination (Bachmair et al. 1986).
  • Both prokaryotic and eukaryotic long-lived proteins have stabilizing amino acids such as methionine, serine, alanine, glycine, threonine, and valine at the amino terminus.
  • Amino acids such as leucine, phenylalanine, aspartic acid, lysine, and arginine destabilize the target proteins.
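The amino-terminal rule described above can be expressed as a simple lookup. The sketch below uses only the residues named in the text; the function name and the "not listed" category are illustrative choices, and the text does not say how the remaining amino acids behave.

```python
# Residues named in the text, as one-letter codes.
STABILIZING = {"M", "S", "A", "G", "T", "V"}    # Met, Ser, Ala, Gly, Thr, Val
DESTABILIZING = {"L", "F", "D", "K", "R"}       # Leu, Phe, Asp, Lys, Arg


def n_terminal_effect(protein_sequence):
    """Report whether the first residue is listed as stabilizing or destabilizing."""
    if not protein_sequence:
        raise ValueError("empty sequence")
    first = protein_sequence[0].upper()
    if first in STABILIZING:
        return "stabilizing"
    if first in DESTABILIZING:
        return "destabilizing"
    return "not listed"


# Hypothetical sequences, for illustration only.
for seq in ("MKTL", "LKTM", "QKTM"):
    print(seq, n_terminal_effect(seq))
```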
  • transgenic plants which produce higher levels of essential amino acids.
  • plants were made transgenic with a synthetic nucleic acid construct which encoded a protein containing high levels of essential amino acids. Resulting transgenic plants produced not only higher levels of essential amino acids, but unexpectedly these plants also produced higher levels of protein in general.
  • This increase in total protein content ranged from approximately 2-fold to 5-fold.
  • One aspect of the invention is a transgenic plant comprising a gene which encodes a protein which causes the transgenic plant to overproduce total protein as compared to a nontransgenic plant.
  • A second aspect of the invention is a gene encoding a protein wherein plants which are transgenic for this gene overproduce total protein as compared to a nontransgenic plant.
  • A third aspect of the invention is a protein wherein if a plant is made transgenic for a gene encoding said protein said transgenic plant will overproduce total plant protein as compared to the plant when it is not transgenic.
  • This protein may comprise an amphiphilic α-helical sequence, a β-pleated sheet sequence, or a combination of α-helix and β-pleated sheet.
  • Another aspect of the invention is a transgenic plant cell which contains a gene encoding a protein which causes the plant cell to overproduce total protein as compared to a nontransgenic cell.
  • Yet another aspect of the invention is a method for increasing the production of a specific protein in a plant or plant cell by transforming the plant or plant cell with a gene which encodes a protein which causes the overproduction of total protein in the transgenic plant or plant cell.
  • Still another aspect of the invention is a method for increasing the production of a nonprotein product in a plant or plant cell by transforming the plant or plant cell with a gene encoding a protein which causes the overproduction of total protein in the transgenic plant or plant cell and thereby results in the increased synthesis of nonproteinaceous material.
  • Yet another aspect of the invention is a method for enhancing the production of a specific protein or nonprotein in a plant or plant cell by cotransforming the plant or plant cell with 1) a gene encoding the specific protein or a protein involved as an enzyme in the synthetic pathway of the nonprotein product and 2) a gene encoding a protein which results in the generalized overproduction of total plant or plant cell protein.
  • Figure 1 shows the average essential amino acid requirement for both children and adults in mg per kg body weight.
  • Figure 2 shows the amounts of foodstuffs which must be consumed in grams per day in order to meet the minimum daily requirement of all essential amino acids.
  • Figure 3 illustrates how the amino acid composition of the ASP1 monomer was chosen.
  • Figure 4 shows the percentage of essential amino acids (EAA) and percentage of most limiting essential amino acids (MLEAA) in ASP1 tetramer compared with natural proteins.
  • Figure 5 is a depiction of the amphiphilicity of the ASP1 monomer where hydrophobic amino acids are in the white rectangle and hydrophilic amino acids are in the shaded rectangle.
  • Figure 6 shows the amino acid sequence of the ASP1 tetramer (SEQ ID NO:2). Hydrophilic amino acids are underlined and β-turns are indicated.
  • Figures 7A-7B show the protein content of plants.
  • Figure 7A shows the overall protein content determined by amino acid analysis.
  • P-2 TC is a control plant and P-7 T, P-11 T, P-17 T and P-29 T are plants transformed with ASP1 tetramer.
  • Figure 7B shows the % increase of protein content in the transformed plants as compared to the control plant. These data were derived from seedlings obtained from transformed mother plants. A minimum of four separate assays were used and the variation was no more than 30%.
  • Figure 8A depicts the overall protein content of leaves from control and ASP1 tetramer seedlings. The plants are labeled as for Figures 7A-7B.
  • Figure 8B shows the % increase in protein content for the transformed plants as compared to the control plant.
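The fold changes and "% increase" values referred to above are related by simple arithmetic. The sketch below shows the conversion; the protein values are hypothetical placeholders, not data from Figures 7 or 8, and the function names are illustrative.

```python
def percent_increase(control, transformed):
    """Percent increase of a transformed plant's protein content over the control."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (transformed - control) / control * 100.0


def fold_change(control, transformed):
    """Ratio of transformed to control protein content."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return transformed / control


# Hypothetical protein contents in arbitrary units (NOT values from the figures).
control_protein = 10.0
for transformed_protein in (20.0, 50.0):
    print(
        f"{fold_change(control_protein, transformed_protein):.0f}-fold = "
        f"{percent_increase(control_protein, transformed_protein):.0f}% increase"
    )
```

On this definition, the 2-fold to 5-fold range stated in the summary corresponds to increases of 100% to 400% over the control.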
  • the present invention uses quite a different approach. Rather than mutate or transfer a gene for a naturally occurring protein, an artificial protein has been constructed de novo. This de novo protein has nutritionally balanced proportions of the essential amino acids, is stable following expression in a plant, and shares some of the characteristics of naturally occurring plant storage proteins. Transgenic plants have been produced which contain such a gene. These plants not only produce more essential amino acids compared to controls, but surprisingly the total amount of protein produced by these plants is also increased. Furthermore, the total amount of nonproteinaceous components can also be increased via these methods.
  • Val 1.23 1.31 1.19 1.01 1.36
  • these values and derived a set of numbers we call the 'Average Ratio for All Crops Idealized to the DNP 1 Monomer' (Figure 4). This set of numbers represents the ratio of essential amino acids necessary to complement the deficiencies found in all 10 crops for all human age groups.
  • ASP1 nutritional protein for humans
  • the amino acid sequence for ASP1 is shown in Figure 6 and is SEQ ID NO:2.
  • the DNA sequence used to encode this protein is shown as SEQ ID NO:1. It has 1.8 times more of the essential amino acids compared to zein or phaseolin. The difference in MLEAA is much higher, containing 3 times more than phaseolin and 6.5 times more than zein.
  • the helical region of ASP1 is amphipathic (hydrophobic residues clustered on one face of the helix while hydrophilic residues are found on the other face) and is stabilized by several Glu-Lys salt bridges (Figure 6).
  • the helix breaker Gly-Pro-Gly-Arg (SEQ ID NO: 8) has been used as a turn sequence.
  • the design results in an antiparallel tetramer which achieves an extraordinarily stable secondary and tertiary structure even at low concentration.
  • ASP1 has been designed to have a stable storage protein-like structure in plants. Its design is based on the structurally well-studied corn storage zein proteins (Z19 and Z22), which are comprised of 9 repeated helical units (Agros et al. 1982).
  • Each helical unit, 16 to 26 amino acids long, of zein is flanked by turn regions and forms an antiparallel helical bundle. Most of the amino acids in the helices are hydrophobic residues.
  • ASP1 is comprised of 4 helical repeating units, each 20 amino acids long (Figure 6). Increasing the gene copy number by concatenation can increase the protein yield. At the same time, gene concatenation increases the molecular mass of the encoded protein.
  • the gene encoding this novel peptide was chemically synthesized and cloned into an E. coli expression vector.
  • This gene contains plant consensus sequences at the 5' end of the translation initiation site to optimize the expression of proteins in vivo. It was placed under the control of the 35S cauliflower mosaic virus (CaMV) promoter in order to permit the constitutive expression of this gene in tobacco.
  • CaMV 35S cauliflower mosaic virus
  • the gene can also be cloned into other microorganisms, such as yeasts, through standard means known in the art.
  • ASP1 is intended to encompass any one or more of the following: (1) the peptide whose sequence is SEQ ID NO:3; (2) the peptide whose sequence is SEQ ID NO:4; (3) any polymer, copolymer, oligomer or co-oligomer of one or both of SEQ ID NO:3 and SEQ ID NO:4, such as the tetrameric ASP1 whose sequence is SEQ ID NO:2; or (4) any peptide or protein having substantially the same amino acid sequence as any of the above, and substantially the same stability upon expression in at least one plant, but whose amino acid sequence has been modified in a manner which will naturally occur to one of skill in the art, such as by insertions, deletions, and/or transpositions which are not substantially detrimental to the stability of, or to the nutritionally balanced essential amino acid composition of, the protein.
  • the protein should also be designed for ready digestibility by the proteases of the intended consumer. For example, frequent lysine (or arginine) sites will promote proteolytic attack by trypsin. Frequent phenylalanine (or tyrosine) sites will promote proteolytic attack by chymotrypsin.
  • an artificial storage protein to be expressed in maize might have one composition if the maize is intended for human consumption, and a somewhat different composition if the maize is intended for feeding pigs.
  • amphipathic peptide or protein is one in which the hydrophobic amino acid residues are predominantly on one side, while the hydrophilic amino acid residues are predominantly on the opposite side, resulting in a peptide or protein which is predominantly hydrophobic on one face, and predominantly hydrophilic on the opposite face.
  • PREDICT-SECONDARY in ⁇ -SYBYL.
  • the α-helix content predicted by the information-theory method was higher than that predicted by the other two methods (Bayes-statistic and neural-net) in PREDICT-SECONDARY.
  • the secondary structures predicted by information-theory gave 100% helical content for the monomer and 74% for the tetramer.
  • CD spectra of the ASP1-monomer showed the typical pattern of alpha helical proteins with double minima at 208 and 222 nm in aqueous solution (data not shown).
  • the stability of the secondary structure can be induced by the inter-molecular interaction between the helical chains (DeGrado et al. 1989). Therefore, stable aggregation between monomers, presumably through hydrophobic interactions, could stabilize the helical structure.
  • β-turn Gly-Pro-Gly-Arg (SEQ ID NO:8) sequences were inserted between four monomers for the ASP1-tetramer construction.
  • the β-turn could play an important role for structural stability of the ASP1-tetramer when it is expressed in vivo. It can also help stabilize tertiary structure formation.
  • the interactions between the helical monomers might be much faster due to the proximate effect when they are connected. This proximate effect might be critical for folding at the low concentrations of ASP1-tetramer that are possible when they are expressed in vivo.
  • the stability of the secondary structure is increased by the hydrophobic interactions between helical monomers.
  • this β-turn sequence has a tryptic digestion site (Gly-Arg) which can increase the digestibility of this protein when it is consumed by animals.
  • the stability of the folded structure of a protein has a close relation to its proteolytic degradation rate (Pace and Barret 1984; Pakula and Sauer 1986; Parasell and Sauer 1989; Pakula and Sauer 1989).
  • Stable quaternary structure is essential for the formation of protein bodies of storage proteins in zein or phaseolin (Lawrence et al. 1990). These higher order structures can be achieved through the interaction and close packing of the stable tertiary structures.
  • the major driving force for this quaternary structure formation is also hydrophobic interaction between the tertiary structures.
  • Leaf discs transformed with LBA4404 carrying the ASP1 gene gave about 5 to 7 shoots two to three weeks after infection. A total of 565 kanamycin-resistant shoots were regenerated from 120 leaf discs. These shoots were excised from the leaf discs and transferred to new media to grow several more weeks, and then transferred to rooting media. After three weeks in rooting medium, 126 rooted shoots were analyzed for β-glucuronidase (GUS). Root tips of 56 out of 126 plants showed various levels of GUS activity. Not all of the kanamycin-resistant shoots were GUS positive.
  • GUS β-glucuronidase
  • Although kanamycin resistance was due to the expression of neomycin phosphotransferase (NPT II gene), regeneration of nontransgenic shoots in the presence of kanamycin has been reported. Therefore, escapes from the screening based on kanamycin sensitivity might have occurred in the nontransformed plants, making them kanamycin resistant.
  • NPT II gene neomycin phosphotransferase
  • nuclease hypersensitive sites correlate to active transcription (Gross and Garrard 1987).
  • the degree of methylation of DNA is inversely related to gene expression.
  • If the gene is located near the plant's endogenous promoter or enhancer sites, the level of expression of this gene will be increased by these nearby enhancing factors. Therefore, the difference in the levels of GUS activity between the transformed plants might be due to this positional effect, which was determined by the sites of incorporation of this gene into the tobacco genome.
  • Efficient transcription of inserted ASP1 genes in the tobacco plants was tested by Northern blot analysis.
  • the polyA RNA was analyzed using the ASP1-tetramer probe.
  • the correct gene size transcribed was about 490 bases, which consisted of 30 bases upstream and 170 bases downstream of the ASP1-tetramer gene.
  • eukaryotic mRNA contains different sizes of polyA. Therefore, the expected size of the ASP1-tetramer message should be around 600 ± 100 bases long. Bands were observed which corresponded to this expected size from all the samples which were analyzed. However, the levels of transcription of the ASP1 genes were dramatically different among the different transformed plants.
  • Transformed plant #17 accumulated 5- to 50-fold more transcripts than the other transformed plants. Such differences in accumulation could be explained by the effect of position, or by the effect of multiple copy insertion.
  • the expression levels of the ASP1 gene and its neighboring GUS gene correlated with each other in some transformed plants (such as in plant #17), but not in all. These results suggested that the level of expression of two closely connected genes can be dramatically different. Multiple transcripts with different sized bands (500-700 bases) were observed from several transformed plants. This result might be due to multiple insertion of the ASP1 gene into the tobacco genome. These inserted genes may be rearranged, but still produce transcripts. Another possibility might be strong secondary structure which could be formed due to the four directly repeated sequences of the tetrameric ASP1 transcripts. Different mobilities could result, depending on the secondary structure.
  • Expression of ASP1
  • Standard means known in the art were used to raise polyclonal antibody against synthetic ASP1 monomer. This antibody was used to detect the production of stable ASP1 protein in tobacco. If desired, standard means known in the art can also be used to prepare monoclonal antibodies against ASP1. High levels of the tetrameric form (11.2 kD) of the ASP1 protein were detected from plant #17 by Western blot analysis (data not shown). Therefore, direct correlation was found between gene copy number, number of genetic NPTII loci, GUS expression, accumulation of ASP1 transcript and protein expression level in the case of plant #17. Some heterologous seed proteins undergo specific degradation when expressed in transgenic plants.
  • in addition to being a very stable protein in a plant cell, ASP1 must function as a general 'protein-stabilizer' and reduce overall protein turnover without apparent deleterious effects to the plants, since there is no observable difference in growth characteristics in the plants producing high amounts of ASP1 as compared to control plants.
  • Table 7 lists the percentage of total protein, as a function of dry weight, of the transformed controls and ASP1 transformants of sweet potato. The numbers are the average of 5 separate assays.
  • Table 8 indicates the amount of essential amino acid in mg/100 grams edible portion of the sweet potato and the numbers are the average of 3 separate assays.
  • Table 9 illustrates the percentage of these essential amino acids compared to the transformed control, the numbers being the average of 3 separate assays.
  • Table 10 shows data for a repeat of experiments as done in Table 8 but with the content of more of the amino acids determined.
  • The numbers in Table 10 are the average of 3 separate assays.
  • Table 11 shows the increase in transformant #5.
  • Table 12 shows the % protein (wet weight basis) of roots and leaves, with the numbers being the average of at least 3 separate assays, while Table 13 depicts the overall protein content of the roots of transformed plants on a dry weight basis and percent dry matter and overall moisture content.
  • Methionine 30 143 190 173 165 197
  • HDNP1 which has the following monomeric amino acid sequence:
  • MLEEIFKKMTEWIEKVLKTM (SEQ ID NO:6) hhHHhhHHhHHhhHhh (SEQ ID NO:31)
  • Hydrophobic amino acids comprise: isoleucine, methionine, phenylalanine, tryptophan, valine, leucine, alanine and cysteine.
  • Hydrophilic amino acids comprise: arginine, glutamic acid, histidine, lysine, asparagine, aspartic acid, glutamine, tyrosine and proline. Glycine, threonine and serine can act as either hydrophilic or hydrophobic amino acid residues depending upon their immediate environment.
  • the HDNP1 monomer is composed of 20 amino acids in the structural motif to render an amphiphilic α-helix.
  • the tetrameric form is: MLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTE
  • This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • HDNP1 is quite similar to ASP1 except that the Leu in position 5 of the monomer has been changed to Ile and also the Ile in position 17 of the monomer for ASP1 has been changed to a Leu in HDNP1.
  • these changes are made throughout the protein as can be seen by comparing the amino acid sequences.
  • This monomer is composed of 20 amino acids in the structural motif to render an amphiphilic β-pleated sheet.
  • the tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the sequence of the tetrameric form is: MTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKF
  • the tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • the proteins ASP1, HDNP1 and HDNP2 were designed to yield high levels of essential amino acids especially suitable for humans. Each type of animal has its own set of required essential amino acids and these sets of essential amino acids, while usually overlapping, are different from each other. Other proteins can be designed which yield higher levels of essential amino acids more suitable for organisms other than humans. For example, pigs have one set of essential amino acids, chickens have a different set, and fish have yet a different set. Transgenic plants can be engineered specifically as feed for one particular species of animal.
  • transgenic corn plants can be produced wherein one transgenic form is most suitable for humans, a second transgenic form will produce a high level of those essential amino acids suited for pigs, and a third transgenic form can be made which is most suited for chickens.
  • This monomer is composed of 41 amino acids in the structural motif to render an amphiphilic α-helix.
  • the tetrameric form is: MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM (SEQ ID NO:12).
  • This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • a second protein for swine is SDNP2 and has the monomeric amino acid sequence MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM (SEQ ID NO:13) hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH
  • sequence of the tetrameric form is: MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM
  • the tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • a protein directed to use with poultry is PDNP1 which has the amino acid sequence: MFEGLVKIMEEVLRHWTEVFGKIFE GTRFLEGFTKM (SEQ ID NO:15) hhHHhhHhhHHhhHHhhHHhhhHhhHHhhHHhHHh (SEQ ID NO:33)
  • This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic α-helix.
  • the tetrameric form is:
  • This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • a second protein for poultry is PDNP2 and has the monomeric amino acid sequence MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:17) hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH
  • This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic β-pleated sheet.
  • the tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the sequence of the tetrameric form is: MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMH
  • the tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • a protein directed to use with fish is FDNP1 which has the amino acid sequence: MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK (SEQ ID NO:19) hhHHhhHhhHHhHHhHHhhhHhhHHhHHhHHhHHhhHH (SEQ ID NO:34)
  • This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic α-helix.
  • the tetrameric form is: MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMK
  • This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • a second protein for fish is FDNP2 and has the monomeric amino acid sequence
  • MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK (SEQIDNO:21) hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH
  • This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic β-pleated sheet.
  • the tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the sequence of the tetrameric form is:
  • the tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • A protein directed to use with dogs is DDNP1 which has the amino acid sequence:
  • the tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • a second protein for dogs is DDNP2 and has the monomeric amino acid sequence
  • This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet.
  • the tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the sequence of the tetrameric form is:
  • the tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • A protein directed to use with cats is CDNP1 which has the amino acid sequence:
  • This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic α-helix.
  • the tetrameric form is:
  • This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
  • a second protein for cats is CDNP2 and has the monomeric amino acid sequence MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:29) hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH
  • This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet.
  • the tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8).
  • the sequence of the tetrameric form is: MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIK
  • the tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
  • the generally enhanced levels of protein production can be useful in expressing other valuable proteins. For example, if a gene coding for insulin were cloned into a plant expressing the ASP1 gene, it is expected that levels of insulin production will be higher, as compared to control plants having the insulin gene, but lacking the ASP1 gene. Therefore plants which are transgenic both for the ASP1 gene, or a similar gene which also results in increased total protein production, and for a second gene which encodes a protein of interest will make more of the protein of interest than if the plant were transformed solely with the gene encoding the protein of interest and not transformed with the ASP1 or similar gene. It is irrelevant whether the plant is first transformed with ASP1 or a similar gene and later transformed with a gene of interest, or whether the plant is first transformed with the gene of interest and then is later transformed with ASP1 or a similar gene; a transformation can also be performed using both genes simultaneously.
  • Plants and plant cells which have been made transgenic for ASP1 or similar amphipathic proteins produce greater amounts of all protein than do nontransgenic plants or cells. As a result of this generally higher level of protein, higher levels of nonprotein products will also be made. This result is expected because there will be an increase in the levels of enzymes which are used in the synthesis of such products. For example, taxol is naturally synthesized by certain plants and the synthesis of taxol is dependent on enzymes. Increased levels of those enzymes will lead to increased levels of taxol. Similarly, many plants produce sugars, e.g., sugarcane. Again, the synthesis of sugars is dependent on enzymes within the plant. Increased levels of these enzymes will yield increased levels of the sugars.
  • Sweetpotato was transformed with ASP1 and two transformed lines were assayed for sugar content and overall amount of dry matter versus moisture content. Results are shown in Table 14.
  • Transformant 1 had an increased production of sucrose but normal production of both glucose and fructose whereas Transformant 2 had increased production of all 3 sugars as compared to the control plant.
  • Table 13 indicates that the overall amount of dry matter is increased from 17% in the control to roughly 22% in the transformants. This is approximately a 30% increase in dry matter as a percent of the total weight of the plant.
  • Transformation with ASP1 or related genes can also be performed, in a manner generally analogous to that described above for tobacco and sweet potato, in certain economically important plants such as rice, wheat, barley, sorghum, maize, potato, plantain, cassava, taro, soybean, alfalfa, or a forage grass. It is desirable to incorporate suitable promoters or other regulatory sequences to encourage expression (preferably constitutive expression) primarily in the part of the plant intended as a foodstuff. For example, in rice or maize, expression is desired primarily in the seeds; while in potato or sweet potato, expression is desired primarily in the tuber.
  • transformation protocols known in the art other than the Agrobacterium protocol will be used, such as transformation through DNA particle gun or via plant protoplasts. See, e.g., Klein et al. (1987) and Croughan et al. (1989). These plants can be transformed with vectors encoding not only ASP1 but also any similar protein, including any of the proteins disclosed above.
  • Plant cells can be made transgenic with a gene encoding ASP1 or other amphipathic protein and these transgenic cells can be grown in culture or in a bioreactor. This avoids the necessity of having to regenerate a plant. These transgenic cells will produce enhanced levels of protein and other products as was seen in the transgenic plants. These cells can be cotransformed with any genes of interest, for example a gene encoding insulin. The desired product will be overproduced as compared to a nontransgenic plant cell or a cell not transformed with a gene encoding ASP1 or other amphipathic protein. The desired product can be purified from the cultured cells.
  • the term “higher plant” is intended to encompass gymnosperms, monocotyledons, and dicotyledons; as well as any cells, tissues, or organs taken or derived from any of the above, including without limitation any seeds, leaves, stems, flowers, roots, tubers, single cells, gametes, or protoplasts taken or derived from any gymnosperm, monocotyledon, or dicotyledon.
  • protein is meant to include peptides such as dipeptides or any longer peptide as well as proteins.
  • Bacteriophage 1 Cro mutation effect on activity and intracellular degradation. Proc. Natl. Acad. Sci. U.S.A. 82: 8829-8833.
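
The tetramer sizes quoted in the excerpts above (92, 176, 160, 172 and 164 residues) are what one gets by joining four copies of a monomer with three gpgr β-turns. The following is a minimal Python sketch of that arithmetic, not the inventors' procedure. The function names are hypothetical, and the HDNP1 monomer is taken as it appears inside the tetramer sequence, with extraction spacing removed.

```python
# Sketch only: helper names are illustrative and do not come from the patent.
BETA_TURN = "gpgr"  # Gly-Pro-Gly-Arg turn (SEQ ID NO:8), lowercase as in the text


def build_tetramer(monomer: str, turn: str = BETA_TURN, copies: int = 4) -> str:
    """Join `copies` monomers, placing one turn between each adjacent pair."""
    return turn.join([monomer] * copies)


def tetramer_length(monomer_length: int, turn_length: int = 4, copies: int = 4) -> int:
    """Residue count of the concatenated protein: monomers plus (copies - 1) turns."""
    return copies * monomer_length + (copies - 1) * turn_length


# HDNP1 monomer as it appears within the tetramer sequence (spacing removed).
HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"

hdnp1_tetramer = build_tetramer(HDNP1_MONOMER)
print(len(HDNP1_MONOMER), len(hdnp1_tetramer))  # 20 92

# Monomer length stated in the text -> tetramer length stated in the text.
stated_lengths = {20: 92, 41: 176, 37: 160, 40: 172, 38: 164}
for monomer_len, tetramer_len in stated_lengths.items():
    assert tetramer_length(monomer_len) == tetramer_len
```

Every stated monomer/tetramer pair satisfies 4 × monomer + 12, which is a quick consistency check when a sequence has lost characters during extraction.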

Abstract

A de novo designed, artificial storage protein has been stably expressed in plants. This protein was designed to have high levels of all the essential amino acids needed for human nutrition. Expressing the gene coding for this protein in crop plants will greatly improve the nutritional quality of the resulting crops. The gene has also been observed to increase the overall level of protein production in a plant. This property will allow enhanced levels of production of other valuable proteins by a plant. For example, a transgenic plant with a gene encoding insulin may produce higher levels of insulin when the plant also expresses a gene for an artificial storage protein. The method will also allow enhanced production of nonprotein products. Cotransformation with a gene of interest can result in enhanced levels of the protein product of the gene of interest or a product synthesized thereby.

Description

TITLE OF THE INVENTION
A METHOD FOR INCREASING THE PROTEIN CONTENT OF PLANTS
BACKGROUND OF THE INVENTION
The composition of plant storage proteins, a major food reservoir for the developing seeds, determines the nutritional value of plants and grains when they are used as foods for man and domestic animals. The amount of protein varies with genotype or cultivar, but in general, cereals contain 10% of the dry weight of the seed as protein, while in legumes, the protein content varies between 20% and 30% of the dry weight. In many seeds, the storage proteins account for 50% or more of the total protein and thus determine the protein quality of seeds. Each year the total world cereal harvest amounts to some 1,700 million tons of grain (Keris et al. 1985). This yields about 85 million tons of cereal storage proteins harvested each year and contributes a majority of the total protein intake of humans and animals. With respect to human and animal nutrition, most seeds do not provide a balanced source of protein because of deficiencies in one or more of the essential amino acids in the storage proteins. For example, humans require from foods eight amino acids: isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan and valine, to maintain a balanced diet. Consumption of proteins of unbalanced composition of amino acids can lead to a malnourished state which is most often found in children in developing countries where plants are the major source of protein intake. Therefore, the development of nutritionally-balanced proteins for introduction into plants is of extreme importance.
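
One way to arrive at the 85 million ton figure is to multiply the harvest by the cereal protein fraction and by the storage-protein share. The Python sketch below assumes that reading; the text does not state how the figure was derived. It also treats harvested grain weight as dry weight and uses the lower bound of "50% or more", both of which are simplifying assumptions.

```python
# Figures are taken from the paragraph above; how they combine is an assumption.
WORLD_CEREAL_HARVEST_MT = 1700   # million tons of grain per year
CEREAL_PROTEIN_FRACTION = 0.10   # protein as a fraction of seed dry weight
STORAGE_PROTEIN_SHARE = 0.50     # storage proteins as a share of total protein


def storage_protein_yield(harvest_mt: float, protein_fraction: float, storage_share: float) -> float:
    """Estimated storage protein harvested, in the same units as `harvest_mt`."""
    return harvest_mt * protein_fraction * storage_share


total_mt = storage_protein_yield(
    WORLD_CEREAL_HARVEST_MT, CEREAL_PROTEIN_FRACTION, STORAGE_PROTEIN_SHARE
)
print(round(total_mt))  # 85
```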
AMINO ACID REQUIREMENTS
The biosynthesis of amino acids from simpler precursors is a process vital to all forms of life as these amino acids are the building blocks of proteins. Organisms differ markedly with respect to their ability to synthesize amino acids. In fact, virtually all members of the animal kingdom are incapable of manufacturing some amino acids. There are twenty common amino acids which are utilized in the fabrication of proteins and essential amino acids are those protein building blocks which cannot be synthesized by the animal. It is generally agreed that humans require eight of the twenty common amino acids in their diet. Protein deficiencies can usually be ascribed to a diet which is deficient in one or more of the essential amino acids. A nutritionally adequate diet must include a minimum daily consumption of these amino acids (Figure 1).
When diets are high in carbohydrates and low in protein, over a protracted period, essential amino acid deficiencies result. The name given to this undernourished condition in humans is "Kwashiorkor" which is an African word meaning "deposed child" (deposed from the mother's breast by a newborn sibling). This debilitating and malnourished state, characterized by a bloated stomach and reddish-orange discolored hair, is more often found in children than adults because of their great need for essential amino acids during growth and development. In order for normal physical and mental maturation to occur, the above mentioned daily source of essential amino acids is a requisite. Essential amino acid content, or protein quality, is as important a feature of the diet as total protein quantity or total calorie intake.
Some foods, such as milk, eggs, and meat, have very high nutritional values because they contain a disproportionately high level of essential amino acids. On the other hand, most foodstuffs obtained from plants possess a poor nutritional value because of their relatively low content of some or, in a few cases, all of the essential amino acids. Generally, the essential amino acids which are found to be most limiting in plants are isoleucine, lysine, methionine, threonine, and tryptophan (MLEAA) (Figure 2).
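
Essential and most-limiting amino acid content can be expressed as a simple residue fraction. The Python sketch below computes it for the 20-residue HDNP1 monomer listed earlier in this document, taken as it appears inside the tetramer sequence. The function name is hypothetical, and the one-letter codes are the standard ones for the eight essential amino acids named in the Background and the five MLEAA named above.

```python
# Standard one-letter codes; the sets mirror the amino acids named in the text.
ESSENTIAL = set("ILKMFTWV")   # Ile, Leu, Lys, Met, Phe, Thr, Trp, Val
MOST_LIMITING = set("IKMTW")  # MLEAA: Ile, Lys, Met, Thr, Trp


def residue_fraction(sequence: str, residues: set) -> float:
    """Fraction of positions in `sequence` occupied by any residue in `residues`."""
    sequence = sequence.upper()
    return sum(aa in residues for aa in sequence) / len(sequence)


# HDNP1 monomer as it appears within the tetramer sequence (illustration only).
HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"

print(f"{residue_fraction(HDNP1_MONOMER, ESSENTIAL):.0%}")      # 80%
print(f"{residue_fraction(HDNP1_MONOMER, MOST_LIMITING):.0%}")  # 60%
```

This counts residues only. The comparisons with zein and phaseolin quoted in the patent are not reproduced by this sketch.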
It has been difficult to produce significant increases in the essential amino acid content of crop plants utilizing classical plant breeding approaches. This is primarily due to the fact that the genetics of plant breeding is complex and that an increase in essential amino acid content may be offset by a loss in other agronomically important characters. Also, it is probable that the storage proteins are very conserved in their structure and their essential amino acid composition would be little modified by these conventional techniques.
STRUCTURE AND CLASSIFICATION OF NATURAL STORAGE PROTEINS
Seed storage proteins can be characterized by several main features (Pernollet and Mosse 1983): 1) their main function is to provide amino acids or nitrogen to the young seedling; 2) the general absence of any other known function; 3) their peculiar amino acid composition in cereal and legume seeds; and 4) their localization within storage organelles called protein bodies, at least during seed development. Several classes of storage proteins are generally recognized based on their solubilities in different solvents. Proteins soluble in water are called "albumins"; proteins soluble in 5% saline, "globulins"; and proteins soluble in 70% ethanol, "prolamins". The proteins that remain following these extractions are treated further with dilute acid or alkali, and are named "glutelins". Most cereals contain primarily prolamin type proteins and can be classified into different groups on the basis of the relative proportions of prolamins, glutelins, and globulins, and the subcellular location of these proteins in the mature seed. The first group corresponds to the Panicoideae sub-family, the second group to the Triticeae tribe, and the last one to oat and rice storage proteins.
The principal members of the Panicoideae sub-family are maize, sorghum, and millet. Their major storage proteins are prolamins (50 to 60% of seed protein) and glutelins (35 to 40% of seed protein) (Pernollet and Mosse 1983). Prolamins are stored within protein bodies, but glutelins are located both inside and outside these organelles. The Triticeae tribe, which includes wheat, barley, and rye, differs from the Panicoideae mainly in storage protein localization and structure. In the starchy endosperm of the seeds belonging to this tribe, no protein bodies are left at maturity. Clusters of proteins are then deposited between starch granules, but are no longer surrounded by a membrane.
In legumes and most other dicots, the major storage proteins are salt-soluble globulins (80%) and prolamins (10-15%). Globulins can be divided into vicilins and legumins (Agros, 1985), based on their sedimentation coefficient (7S/11S), oligomeric organization (trimeric/hexameric), and polypeptide chain structure (single chain/disulphide-linked pair of chains). In the legume seed cotyledon, protein bodies are embedded between starch granules (Pernollet and Mosse 1983). They are membrane-bound organelles, a few microns in diameter, mainly filled with storage proteins and phytates. Besides storage proteins, protein bodies also contain other proteins, such as enzymes or lectins, although in lesser amounts.
The structure of the soluble globulins has been studied more than that of the insoluble prolamins and glutelins. Vicilin appears as a homo- or heterotrimer, sometimes able to associate into a hexameric form. Soybean β-conglycinin and French bean phaseolin (Bollini and Chrispeels 1978) are the structurally best known vicilins. Recently, the three-dimensional structure of phaseolin was determined by X-ray crystallographic analysis (Lawrence et al. 1990). However, unlike other vicilins, the phaseolin trimer can associate into a dodecamer (tetramer of trimers) below pH 4.5. Each polypeptide of the trimeric form comprises two structurally similar units, each made up of a β-barrel and an α-helical domain.
Glycinin, the soybean legumin, has a quaternary structure that was suggested by Badley et al. (1975) to be twelve subunits packed in two identical hexagons. In general, the legumin molecule is a polymer formed by the association of six monomers. Each monomer consists of two subunits, acidic and basic. Sometimes, these subunits are associated by disulfide linkages. On the other hand, arachin, the peanut legumin, was found to consist of different kinds of subunits. The arachin hexamer association does not need different kinds of subunits, which suggests that the subunits have a very similar structure.
The most studied storage proteins, in terms of structure, are the corn prolamins called zeins. These proteins perform no known enzymatic function. Three types of zeins (α, β and γ) (Esen 1986) are synthesized on rough endoplasmic reticulum and aggregate within this membrane as protein bodies. The zein protein readily self-associates to form protein bodies and is insoluble in water even in low concentrations of salt. The presence of all types of zeins is not necessary for the formation of a protein body as a single type of zein can aggregate into a dense structure and is generally found at the surface of protein bodies (Lending et al. 1988; Wallace et al. 1988). The mechanism responsible for protein body formation is thought to involve hydrophobic and weak polar interactions between individual zein molecules (Wallace et al. 1988; Agros et al. 1982), while they require a high amount of ethanol in aqueous systems to maintain their strict molecular conformation (Agros et al. 1982).
Circular dichroic measurements, amino acid sequence analysis, and electron microscopy of a zein protein suggest that zein secondary structure is primarily helical, with nine adjacent, topologically antiparallel helices clustered within a distorted cylinder (Agros et al., 1982; Larkins, 1983; Larkins et al., 1984). Polar and hydrophobic residues are appropriately distributed along the helical surfaces, allowing intra- and intermolecular hydrogen bonds and van der Waals interactions among neighboring helices, such that rod-shaped zein molecules can aggregate and then stack through glutamate interactions at the cylindrical caps. Because of this structure, zein is much less soluble under physiological conditions than the globulin phaseolin, and precipitation of insoluble zein in the tightly packed protein body may make it less available for proteolytic degradation (Greenwood and Chrispeels 1985).
The storage protein structures are adapted to a maximal packing within protein bodies (Pernollet and Mosse 1983). Maximal packing is achieved in at least one of two ways. The folding of the polypeptide chain may favor the maximal packing of amino acids within the protein molecule, or the compacting of proteins is increased by the formation of closely packed quaternary structure. High degrees of polymerization can be observed in pearl millet pennisetin (Pernollet and Mosse 1983) or zein (Lending et al. 1988; Wallace et al. 1988). Also, wheat prolamins and glutelins associate into aggregates, resulting in the formation of insoluble gluten. These insoluble forms of protein deposits are osmotically inactive and stable during the long period of storage between the time of seed maturation and germination.
REGULATION OF STORAGE PROTEIN GENES
All storage proteins which have been investigated are encoded by multigene families (Barrels and Tompson 1983; Crouch et al. 1983; Forde et al. 1985; Kasarda et al. 1984; Lycett et al. 1985; Rafalski et al. 1984; Slightom et al. 1983). The structure of these families varies. In some cases, as in wheat or barley, two major subgroups can be noted: the α- and γ-gliadins and the B- and C-hordeins, respectively (Forde et al. 1985; Kasarda et al. 1984; Rafalski et al. 1984). Within each subgroup, several subfamilies can be distinguished. Often short repeats account for at least part of the structure of the polypeptides. These repeats constitute links through which different subfamilies within the same species are related.
Storage protein genes, like most other plant genes characterized to date, are transcribed in a regulated rather than a constitutive fashion. Expression is frequently tissue-specific and/or temporally regulated. Cis-acting DNA sequences involved in developmental and/or tissue-specific regulation of gene expression can be defined by introducing plant storage protein gene regulatory regions coupled to bacterial reporter genes (Twell and Ooms 1987; Wenzler et al. 1989; Marries et al. 1988; Chen et al. 1988), or by introducing entire or dissected genes (Colot et al. 1987; Chen et al. 1986) into a transgenic environment. Unfortunately, a transformation system for the nutritionally important cereal species has not yet been well established. Therefore, most regulation mechanisms have been studied with transgenic dicot plants. However, there is increasing evidence that gene expression is controlled, at least partly, by the interaction between regulatory molecules and short sequences that are present in the 5' flanking region of the gene.
The regulatory sequences of potato storage protein were investigated using transgenic potato plants. A 2.5 kb 5' flanking DNA fragment containing the promoter and the patatin gene was used to construct a transcriptional fusion gene with chloramphenicol acetyl transferase (CAT) or the β-glucuronidase (GUS) gene (Twell and Ooms 1987; Wenzler et al. 1989). When reintroduced into potato, these chimeric genes were expressed in tubers, but not in leaves, stems or roots.
The expression pattern of storage protein genes of cereals is retained in tobacco, not only with respect to tissue, but also to temporal expression. The 5' upstream regions of wheat glutenin genes possess regulatory sequences that determine endosperm-specific expression in transgenic tobacco (Colot et al. 1987). Deletion analysis of the low molecular weight (LMW) glutenin sequence indicated that sequences present between 326 bp and 160 bp upstream of the transcriptional start point are necessary to confer endosperm-specific expression. Furthermore, the cis-acting elements that determine the regulation of each gene in a cluster are recognized by tobacco trans-acting factors, and the cis-acting elements directing expression of one gene do not affect expression of neighboring genes. This was demonstrated by the transfer of a 17.1 kb soybean DNA fragment containing a seed lectin gene and at least four nonseed protein genes to transgenic tobacco plants (Okamuro, 1986). The genes in this cluster were expressed in a manner similar to that in soybean; i.e., the lectin gene products accumulated in seeds, and the other genes were expressed in tobacco leaves, stems, and roots.
The expression of several DNA deletion mutants with a 257 bp 5' flanking sequence of the α'-conglycinin gene indicates that this region contains enhancer-like elements (Chen et al. 1986). Only a low level of expression of the α' gene occurred in developing seeds of transgenic plants that contained the α' gene flanked by 159 nucleotides 5' of the transcriptional start site. However, a 20-fold increase in expression occurred when an additional 98 nucleotides of upstream sequence were included. The DNA sequence between 143 and 257 contained five repeats of the sequence AA(G)CCCA, and played a role in conferring tissue-specific and developmental regulation. The 35S promoter containing this sequence in different positions and different orientations is able to enhance the expression of the CAT gene by 25- to 40-fold (Chen et al. 1988).
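As an illustration of how repeats such as AA(G)CCCA can be located in a flanking sequence, the following minimal Python sketch scans a DNA string for the motif. It assumes that the parentheses denote an optional G (so that both AACCCA and AAGCCCA count as matches); the function name and the demonstration sequence are hypothetical and are not taken from the conglycinin gene.

```python
import re

# Assumption: "AA(G)CCCA" is read as AACCCA with an optional G after the two A's.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(AAG?CCCA))")


def find_repeats(seq):
    """Return (0-based position, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical sequence for illustration only; not the actual promoter sequence.
demo = "ttAACCCAggcAAGCCCAtt"
for position, text in find_repeats(demo):
    print(position, text)
```

Counting the hits in a series of deletion constructs would show which constructs retain all of the repeats and which have lost some of them.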
Trans-acting factors directly involved in storage protein gene regulation have not yet been reported. However, in some cases, the level of amino acids can control the expression of storage protein. Vegetative storage protein (VSP) gene expression in leaves, stems and seed pods is closely related to whether these organs are currently a sink for nitrogen or a source for mobilized nitrogen for other organs (Staswick 1989). The leaves have a sensitive mechanism for detecting changes in sink demand of mobilizing reserves, and VSP gene expression can be rapidly adjusted accordingly. Sequestering excess amino acids in this way may prevent their accumulation to toxic levels.
GENETIC ENGINEERING USING AGROBACTERIUM TUMEFACIENS
One of the most significant recent advances in the area of plant molecular biology has been the development of the Agrobacterium tumefaciens Ti plasmid as a vector system for the transformation of plants. In nature, A. tumefaciens infects most dicotyledonous and some monocotyledonous plants by entry through wound sites. The bacteria bind to cells in the wound and are stimulated by phenolic compounds released from these cells to transfer a portion of their endogenous, 200 kb Ti plasmid into the plant cell (Weiler and Schroder 1987). The transferred portion of the Ti plasmid (T-DNA) becomes covalently integrated into the plant genome, where it directs the biosynthesis of phytohormones using enzymes which it encodes. The vir gene in the bacterial genome is known to be responsible for this process. In addition to vir gene products, directly repeating sequences of 25 bases called "border" sequences are essential, but only the right terminus has been shown to be used for T-DNA transfer and integration.
Expression of the T-DNA gene inside the plants results in the uncontrolled growth of these and surrounding cells, leading to formation of a gall (Weiler and Schroder 1987). Ti plasmids, from which these disease-producing genes have been removed or replaced, are referred to as "disarmed" and can be used for the introduction of foreign genes into plants. The great size of the disarmed Ti plasmid and lack of unique restriction endonuclease sites prohibit direct cloning into the T-DNA. Instead, intermediate vectors such as pMON237 or pBI121 can be used to introduce genes into the Ti plasmid. Currently, two kinds of vector systems are available as intermediate vectors: cointegrating vectors and binary vectors. A cointegrating transformation vector must include a region of homology between the vector plasmid and the Ti plasmid. Once recombination occurs, the cointegrated plasmid is replicated by the Ti plasmid origin of replication. The cointegrate system, while more difficult to use, does offer advantages. Once the cointegrate has been formed, the plasmid is stable in Agrobacterium.
A binary vector contains an origin of replication from a broad host-range plasmid instead of a region of homology with the Ti plasmid. Since the plasmid does not need to form a cointegrate, these plasmids are considerably easier to introduce into Agrobacterium. The other advantage of binary vectors is that the vector can be introduced into any Agrobacterium host containing any Ti or Ri plasmid, as long as the vir helper function is provided. Using these systems, the gene regulation mechanism of storage proteins has been elucidated.
IMPROVEMENT OF NUTRITIONAL QUALITIES OF PLANTS
The amino acid composition of the cereal endosperm protein is characterized by a high content of proline and glutamine, while the amount of essential amino acids, lysine and tryptophan in particular, is a limiting factor (Pernollet and Mosse 1983). In legumes, sulfur-containing amino acids such as methionine and cysteine are the major limiting essential amino acids for the efficient utilization of plant protein as animal or human food, while roots and tubers are deficient in almost all of the essential amino acids.
There has been a great deal of effort to overcome these amino acid limitations by breeding and selecting for more nutritionally balanced varieties. Plants have been mutated in hopes of recovering individuals with more nutritious storage proteins. Neither of these approaches has been very successful, although some naturally occurring and artificially produced mutants of cereals were shown to contain a more nutritionally balanced amino acid composition.
These mutations cause a significant reduction in the amount of storage protein synthesized and thereby result in a higher percentage of lysine in the seed; however, the softer kernels and low yield of such strains have limited their usefulness (Pernollet and Mosse 1983). The reduction in storage protein also causes the seeds to become more brittle; as a result, these seeds shatter more easily during storage. The lower levels of prolamin also result in flours with unfavorable functional properties which cause brittleness in the baked products (Pernollet et al. 1983). Thus, no satisfactory solution has yet been found for improving the amino acid composition of storage proteins.
One direct approach to this problem is to modify the nucleotide sequence of genes encoding storage proteins so that they contain high levels of essential amino acids. To achieve this aim, several laboratories have tried to modify and express storage proteins in the host plants. Modified natural storage proteins have been created by inserting into the natural storage protein genes exogenous DNA sequences coding for essential amino acids. The basic idea is to produce modified proteins which are similar to the naturally occurring proteins, but which have inserted into them sequences of essential amino acid residues. There are at least three problems encountered with this approach: (1) Dilution. Even if this approach is successful, the modified protein will still have high levels of non-essential amino acids, effectively "diluting" the net concentrations of the encoded essential amino acids. (2) Instability. The modified proteins are typically susceptible to proteolytic attack in the plant. Because a natural storage protein is a highly evolved structure, artificial modifications to it are likely to destabilize it. For example, a stabilizing glutamic acid-lysine salt bridge might be broken. (3) Multiple copies of genes. Naturally occurring storage proteins are typically encoded by multiple gene copies. A mutation in just one of the copies of the gene will likely have only a limited effect.
In vitro mutagenesis was used to supplement the sulfur amino acid codon content of a gene encoding β-phaseolin, a Phaseolus vulgaris storage protein (Hoffmann et al. 1988). The nutritional quality of β-phaseolin was increased by the insertion of 15 amino acids, six of which were methionine. The inserted peptide was essentially a duplication of a naturally occurring sequence found in the maize 15 kD zein storage protein (Pederson et al. 1986). However, this modified phaseolin achieved less than 1% of the expression level of normal phaseolin in transformed seeds. Recently it has been found that this insertion was made in part of a major structural element of the phaseolin trimer (Lawrence et al. 1990). Therefore, an inclusion of 15 residues at this site could distort the structure at the tertiary and/or quaternary level.
Lysine- and tryptophan-encoding oligonucleotides were introduced at several positions into a 19 kD α-type zein complementary DNA by oligonucleotide-mediated mutagenesis (Wallace et al. 1988). Messenger RNA for the modified zein was synthesized in vitro and injected into Xenopus laevis oocytes. The modified zein aggregated into structures similar to membrane-bound protein bodies. This experiment suggested the possibility of creating high-lysine corn by genetic engineering.
There are alternative approaches that might be more practical. One of these is to transfer heterologous storage protein genes that encode storage proteins with higher levels of the desired amino acids. For this purpose, a chimeric gene encoding a Brazil nut methionine-rich protein which contains 18% methionine has been transferred to tobacco and expressed in the developing seeds (Altenbach et al. 1989). The remarkably high level of accumulation of the methionine-rich protein in the seed of tobacco results in a significant increase in methionine levels of ~30%.
The maize 15 kD zein structural gene was placed under the regulation of French bean β-phaseolin gene flanking regions and expressed in tobacco (Hoffmann et al. 1987). Zein accumulation was obtained as high as 1.6% of the total seed protein. Zein was found in roots, hypocotyls, and cotyledons of the germinating transgenic tobacco seeds. Zein was deposited and accumulated in the vacuolar protein bodies of the tobacco embryo and endosperm. The storage proteins of legume seeds such as the common bean (Phaseolus vulgaris) and soybean (Glycine max) are deficient in sulfur-containing amino acids. The nutritional quality of soybean could be improved by introducing and expressing the gene encoding methionine-rich 15 kD zein (Pederson et al. 1986).
A synthetic gene (HEAAE I = High Essential Amino Acid Encoding) which encoded a protein domain high in essential amino acids was expressed as a CAT-HEAAE I fusion protein in potato (Jaynes et al. 1986; Yang et al. 1989). However, structural instability limited the high level expression of this fusion protein in the potato system. Also, the content of essential amino acids was diluted to less than 40% of the original encoded protein by constructing this fusion.
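The dilution effect described above can be made concrete with a short calculation. The Python sketch below computes the fraction of essential residues in a protein domain before and after fusion to a carrier protein. The set of eight essential amino acids is the conventional human list and is an assumption here, since Figure 1 is not reproduced in this section; the two sequences are hypothetical and are not the HEAAE I or CAT sequences.

```python
# Assumed set of the eight amino acids conventionally counted as essential for
# humans, in one-letter code: Ile, Leu, Lys, Met, Phe, Thr, Trp, Val.
ESSENTIAL = set("ILKMFTWV")


def essential_fraction(seq):
    """Fraction of residues in seq that belong to the essential set."""
    seq = seq.upper()
    return sum(aa in ESSENTIAL for aa in seq) / len(seq)


def fused_fraction(domain, carrier):
    """Essential fraction of a fusion protein made of domain plus carrier."""
    return essential_fraction(domain + carrier)


# Hypothetical sequences for illustration only.
domain = "MKTWIKLV" * 5        # 40 residues, all essential
carrier = "GASDEPNQGS" * 10    # 100 residues, none essential

print(essential_fraction(domain))
print(fused_fraction(domain, carrier))
```

With these made-up inputs the essential fraction falls from 40 of 40 residues in the domain to 40 of 140 residues in the fusion, which shows how a carrier protein dilutes the encoded essential amino acids.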
There are several precautions that should be considered in engineering storage proteins (Larkins, 1983). First, in vitro mutational change must not be in regions of the protein that perturb the normal protein structure; otherwise, the proteins might be unstable. Second, when attempting to increase nutritional quality by introducing a gene encoding a heterologous protein in crop plants, it is important that the protein encoded by an introduced gene does not produce any adverse effects in humans or livestock, the ultimate consumers of the engineered seed proteins (Altenbach et al. 1989). Finally, it is critical that the amino acids present in the introduced protein are able to be utilized by the animal for growth and development.
DE NOVO DESIGN OF PROTEINS
Recently, a new field in protein research, de novo design of proteins, has made remarkable progress due to a better understanding of the rules which govern protein folding and topology. Protein design has two components: the design of activity and the design of structure. This review will concentrate on the design of structurally stable storage protein-like proteins.
The usual approach for the design of helical bundle proteins consists of linking sequences with a propensity for forming an α-helix via short loop sequences to get linear polypeptide chains. This chain can fold into the predetermined 'globular type' tertiary structure in aqueous solution (Mutter 1988; DeGrado et al. 1989). α-helical secondary structures are stabilized by interatomic interactions that can be classified according to the distance between interacting atoms in the sequence of the protein (DeGrado et al. 1989).
Short range interactions account for different amino acids having different conformational preferences. Both statistical (Chou and Fasman 1978) and experimental (Sueki et al. 1984) methods show that residues such as Glu, Ala and Met tend to stabilize helices, whereas residues such as Gly and Pro are destabilizing. However, these intrinsic preferences are not sufficient to determine the stability of helices in globular proteins.
Analysis of the free-energy requirements for helix initiation and propagation indicates that peptides of 10 to 20 residues should show little helix formation in water (Bierzynski et al. 1982) when the Zimm-Bragg equation (Zimm and Bragg 1959) is used, with parameters (s and S) determined by host-guest experiments, where s is the helix nucleation constant, n is the number of H-bonded residues in the helix and S is an average stability constant for one residue. sSn- (S-l) Nevertheless, the 13 amino acid C-peptide obtained from RNase A does show measurable helicity (~25%) at low temperature (Bierzynski et al. 1982; Brown and Klee 1981). The stability of this peptide is 1000-fold greater than the value calculated from the Zimm-Bragg equation. Specific side-chain interactions, factors that are not considered in the Zimm-Bragg model, are responsible, at least in part, for the fact that the C-peptide is much more helical than predicted (Scheraga, 1985).
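The qualitative prediction of little helix formation in short peptides can be illustrated numerically. The Python sketch below evaluates the mean helical fraction in the standard two-state Zimm-Bragg model by summing statistical weights over all conformations of a chain: each coil residue carries weight 1, each helical residue carries the propagation weight (S in the text), and each new helical stretch carries the additional nucleation factor (s in the text). This is the textbook form of the model rather than the specific expression used in the source, and the parameter values in the example are illustrative assumptions, not values from the cited host-guest experiments.

```python
def zimm_bragg_helicity(n_res, nucleation, propagation):
    """Mean fraction of helical residues for a chain of n_res residues.

    nucleation  -- weight paid once at the start of each helical stretch
    propagation -- weight of each helical residue
    """
    # For chains ending in helix (h) or coil (c), track the summed statistical
    # weight (w) and the summed weight multiplied by the helical count (k).
    w_h, k_h = 0.0, 0.0
    w_c, k_c = 1.0, 0.0  # the empty chain is treated as ending in coil
    for _ in range(n_res):
        # Append a helical residue: continue a helix or nucleate after coil.
        new_w_h = propagation * w_h + nucleation * propagation * w_c
        new_k_h = (propagation * (k_h + w_h)
                   + nucleation * propagation * (k_c + w_c))
        # Append a coil residue: weight 1 regardless of the previous state.
        new_w_c = w_h + w_c
        new_k_c = k_h + k_c
        w_h, k_h, w_c, k_c = new_w_h, new_k_h, new_w_c, new_k_c
    return (k_h + k_c) / (n_res * (w_h + w_c))


# Illustrative (assumed) parameters for a 13-residue peptide.
print(zimm_bragg_helicity(13, nucleation=1e-4, propagation=1.05))
```

With a small nucleation constant and a propagation constant close to 1, the computed helicity of a 13-residue chain is far below the roughly 25% reported for the C-peptide, which is the discrepancy the paragraph above attributes to specific side-chain interactions.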
Medium-range interactions are responsible for the additional stabilization of secondary structures (DeGrado et al. 1989). Interactions between side chains are regarded as important medium-range interactions (Shoemaker et al. 1987; Marqusee and Baldwin 1987). These include electrostatic interactions, hydrogen bonding, and the perpendicular stacking of aromatic residues (Blundell et al. 1986). An α-helix possesses a dipole moment as a result of the alignment of its peptide bonds. The positive and negative ends of the amide group dipole point toward the helix NH2-terminus and COOH-terminus, respectively, giving rise to a significant macrodipole. Appropriately charged residues near the ends of the helix can favorably interact with the helical dipole and stabilize helix formation. It was estimated that the electrostatic interaction energy of a pair of antiparallel α-helices is about 20 kcal/mol lower than that of a pair of parallel α-helices (Hoi and Sanders 1981). Hydrogen bonds between side chains and terminal helical N-H and C=O groups also participate in the stabilization of helical structure (Richardson and Richardson 1988; Presta and Rose 1988; Richardson and Richardson 1989).
Protein structures contain several long-range stabilizing interactions which include hydrophobic and packing interactions, and hydrogen bonds. Among these, the hydrophobic effect is a prime contributor to the folding and stabilizing of protein structures. The driving force for helix formation in RNase A arises from long-range interactions between C-peptide and S-protein, a large fragment of the protein from which C-peptide was excised (Komoriya and Chaiken 1985).
The role of hydrophobic interactions in determining secondary structures was studied for a series of peptides containing only Glu and Lys in their sequence (DeGrado and Lear 1985). Glu and Lys residues were chosen as charged residues for the solvent-accessible exterior of the protein to help stabilize helix formation by electrostatic interaction.
STABILITY OF DESIGNED PROTEINS
Hydrophobic residues often repeat every three to four residues in an α-helix and form an amphiphilic structure (DeGrado et al. 1989). Amphiphilicity is important for the stabilization of the secondary structures of peptides and proteins which bind in aqueous solution to extrinsic apolar surfaces, including phospholipid membranes, air, and the hydrophobic binding sites of regulatory proteins (DeGrado and Lear 1985). This amphiphilic secondary structure can be stabilized relative to other conformations by self-association. Therefore, short peptides often form the α-helix in water only because the helix is amphiphilic and is stabilized by peptide aggregation along the hydrophobic surface. Natural globular proteins are folded by a similar mechanism, involving hydrophobic interaction between neighboring segments of secondary structure (Presnell and Cohen 1989). Using the concept of an amphiphilic helix, DeGrado and coworkers have successfully built peptide-hormone analogs with minimal homology to the native sequences. These peptides, like the native ones, are not helical in solution but do form helices at the hydrophobic surfaces of membranes. Designed synthetic peptides have been used to show how hydrophobic periodicity in a protein sequence stabilizes the formation of simple secondary structures such as an amphiphilic α-helix (Ho and DeGrado 1987). The strategies used in the design of the helices in the four-helix bundles are: 1) the helices should be composed of strong helix-forming amino acids and 2) the helices should be amphiphilic; i.e., they should have an apolar face to interact with neighboring helices and a polar face to maintain water solubility of the ensuing aggregates. The results show that hydrophobic periodicity can determine the structure of a peptide. Therefore, the peptides tend to have random conformations in very dilute solution, but form secondary structures when they self-associate (at high concentration) or bind to the air-water surface.
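One common way to quantify the hydrophobic periodicity discussed above is the helical hydrophobic moment, a measure that is not used in the source and is introduced here only for illustration. The Python sketch below sums per-residue hydropathy values as vectors rotated by 100 degrees per residue, the angular spacing of an ideal α-helix. The Kyte-Doolittle scale is an assumed choice, and both peptide sequences are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment per residue for a given turn angle per residue."""
    x = y = 0.0
    for i, aa in enumerate(seq.upper()):
        theta = math.radians(angle_deg * i)
        x += KD[aa] * math.cos(theta)
        y += KD[aa] * math.sin(theta)
    return math.hypot(x, y) / len(seq)


# Hypothetical peptides with identical composition (8 Leu, 8 Lys).
amphiphilic = "LKKLLKKLLKKLLKKL"  # apolar residues recur every 3 to 4 positions
segregated = "LLLLLLLLKKKKKKKK"   # same residues without helical periodicity

print(hydrophobic_moment(amphiphilic))
print(hydrophobic_moment(segregated))
```

The peptide whose leucines recur every three to four positions gives a much larger moment than the peptide with the same composition arranged in two blocks, reflecting the separate apolar and polar faces that the design strategy calls for.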
The free energy associated with dimerization or tetramerization of the designed peptides could be experimentally determined from the concentration dependence of the CD spectra for the peptides (DeGrado et al. 1989; Lear et al. 1988; DeGrado and Lear 1985). At low concentrations, the peptides were found to be monomeric and have low helical contents, whereas at high concentration they could self-associate and stabilize the secondary structure. Therefore, possible hairpin loops between helices can affect the stability of the secondary structure by enhancing the self-association between the helical monomers. A strong helix breaker (Chou and Fasman 1978; Kabsch and Sander 1983; Sueki et al. 1984; Scheraga 1978) was included as the first and last residue to set the stage for adding a hairpin loop between the helices. A single proline residue appeared capable of serving as a suitable link if the C- and N-terminal glycine residues are slightly unwound. Glycine lacks a β-carbon, a feature that is essential for the reverse turn, where positive dihedral angles are required. The pyrrolidine ring of proline constrains its φ dihedral angle to about -60°. Thus, proline should be destabilizing at positions where significantly different backbone torsion angles are required. This amino acid, as well as glycine, has a high tendency to break helices and occurs frequently at turns (Creighton 1987).
The direct evidence for stabilization of protein structure by adding the linking sequence was observed by comparing the guanidine denaturation curves for a monomer, dimer and tetramer (DeGrado et al. 1989). The gene encoding the tetrameric protein was expressed in E. coli and purified to homogeneity. In the series of mono-, di-, and tetramer, the stability toward guanidine denaturation increases concomitantly with the increase in covalent cross-links between helical monomers. At equivalent peptide concentrations, the midpoints of the denaturation curves occurred at 0.55, 4.5 and 6.5 M guanidine for the mono-, di-, and tetramer. Furthermore, as the number of covalent cross-links was increased, the curves became increasingly cooperative. Thus, the linker sequence stabilized the formation of the four-helix structures at low concentration of the peptides (less than 1 mg/ml).
Structural stability of proteins is directly related to in vivo proteolysis (Parasell and Sauer 1989). Proteolysis depends on the accessibility of the scissile peptide bonds to the attacking protease. The sites of proteolytic processing are generally in relatively flexible interdomain segments or on the surface of the loops, in contrast to the less accessible intradomain peptide bonds (Neurath 1989). This suggests that the stability of the folded state of the protein is the most important determinant for its proteolytic degradation rate. The effect of a folded structure on proteolytic degradation has been demonstrated by several experiments. First, proteins that contain amino acid analogs or are prematurely terminated are often degraded rapidly in the cells (Goldberg and St. John 1976). Second, there are good correlations between the thermal stabilities of specific mutant proteins and their rates of degradation in E. coli (Pakula and Sauer 1986; Parasell and Sauer 1989). Finally, second-site suppressor mutations that increase the thermodynamic stability of unstable mutant proteins have also been shown to increase resistance to intracellular proteolysis (Pakula and Sauer 1989). The solubility of proteins could also affect their proteolytic resistance, as some proteins aggregate to form inclusion bodies that escape proteolytic attack (Kane and Hartley 1988).
Metabolic stability is another factor influencing the in vivo stability of proteins. Usually, damaged and abnormal proteins are metabolically unstable in vivo (Finley and Varshavsky 1985; Pontremoli and Melloni 1986). In eukaryotes, covalent conjugation of ubiquitin with proteins is essential for the selective degradation of short-lived proteins (Finley and Varshavsky 1985). It was found that the amino acid at the amino-terminus of the protein determined the rate of ubiquitination (Bachmair et al. 1986). Both prokaryotic and eukaryotic long-lived proteins have stabilizing amino acids such as methionine, serine, alanine, glycine, threonine, and valine at the amino terminus. On the other hand, amino acids such as leucine, phenylalanine, aspartic acid, lysine, and arginine destabilize the target proteins.
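The amino-terminal rule summarized above lends itself to a simple lookup when a designed protein sequence is screened. The Python sketch below classifies a sequence by its first residue using only the two lists of residues named in the preceding paragraph; the function name is hypothetical, and residues not mentioned in the text are reported as unlisted rather than assigned to either class.

```python
# Residues named in the text, in one-letter code.
STABILIZING = set("MSAGTV")    # Met, Ser, Ala, Gly, Thr, Val
DESTABILIZING = set("LFDKR")   # Leu, Phe, Asp, Lys, Arg


def n_terminal_class(seq):
    """Classify a protein sequence by its amino-terminal residue."""
    first = seq[0].upper()
    if first in STABILIZING:
        return "stabilizing"
    if first in DESTABILIZING:
        return "destabilizing"
    return "not listed"


# Hypothetical sequences for illustration only.
for candidate in ("MKTLLV", "KTLLVM", "QTLLVM"):
    print(candidate, n_terminal_class(candidate))
```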
Recently, many laboratories have attempted to improve the nutritional quality of plant storage proteins by transferring heterologous storage protein genes from other plants (Pederson et al. 1986). The development of recombinant DNA technology and the Agrobacterium-based vector system has made this approach possible. However, genes encoding storage proteins containing a more favorable amino acid balance do not exist in the genomes of major crop plants.
Furthermore, modification of native storage proteins has met with difficulty because of their instability, low level of expression, and limited host range. One possible alternative is the de novo design of a more nutritionally balanced protein which retains certain characteristics of the natural storage proteins of plants. Our initial work described the use of small fragments of DNA which encoded spans of protein high in essential amino acids (Jaynes et al. 1985; Yang et al. 1989). Subsequently, the genes encoding these protein domains were cloned into an existing protein and the expression level of this modified protein was determined in transgenic potato plants. However, because of some of the problems mentioned above, the results were somewhat less than desirable (Yang et al. 1989).
The publications and other materials used herein to illuminate the background of the invention or provide additional details respecting the practice, are incorporated by reference, and for convenience are respectively grouped in the appended List of References.
SUMMARY OF THE INVENTION
Experiments were performed which were designed to produce transgenic plants which produce higher levels of essential amino acids. For this purpose, plants were made transgenic with a synthetic nucleic acid construct which encoded a protein containing high levels of essential amino acids. Resulting transgenic plants produced not only higher levels of essential amino acids, but unexpectedly these plants also produced higher levels of protein in general.
This increase in total protein content ranged from approximately 2-fold to 5-fold.
One aspect of the invention is a transgenic plant comprising a gene which encodes a protein which causes the transgenic plant to overproduce total protein as compared to a nontransgenic plant. A second aspect of the invention is a gene encoding a protein wherein plants which are transgenic for this gene overproduce total protein as compared to a nontransgenic plant.
A third aspect of the invention is a protein wherein if a plant is made transgenic for a gene encoding said protein said transgenic plant will overproduce total plant protein as compared to the plant when it is not transgenic. This protein may comprise an amphiphilic α-helical sequence, a β-pleated sheet sequence, or a combination of α-helix and β-pleated sheet.
Another aspect of the invention is a transgenic plant cell which contains a gene encoding a protein which causes the plant cell to overproduce total protein as compared to a nontransgenic cell.
Yet another aspect of the invention is a method for increasing the production of a specific protein in a plant or plant cell by transforming the plant or plant cell with a gene which encodes a protein which causes the overproduction of total protein in the transgenic plant or plant cell.
Still another aspect of the invention is a method for increasing the production of a nonprotein product in a plant or plant cell by transforming the plant or plant cell with a gene encoding a protein which causes the overproduction of total protein in the transgenic plant or plant cell and thereby results in the increased synthesis of nonproteinaceous material.
Yet another aspect of the invention is a method for enhancing the production of a specific protein or nonprotein in a plant or plant cell by cotransforming the plant or plant cell with 1) a gene encoding the specific protein or a protein involved as an enzyme in the synthetic pathway of the nonprotein product and 2) a gene encoding a protein which results in the generalized overproduction of total plant or plant cell protein.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the average essential amino acid requirement for both children and adults in mg per kg body weight.
Figure 2 shows the amounts of foodstuffs which must be consumed in grams per day in order to meet the minimum daily requirement of all essential amino acids.
Figure 3 illustrates how the amino acid composition of the ASP1 monomer was chosen.
Figure 4 shows the percentage of essential amino acids (EAA) and percentage of most limiting essential amino acids (MLEAA) in ASP1 tetramer compared with natural proteins.
Figure 5 is a depiction of the amphiphilicity of the ASP1 monomer, where hydrophobic amino acids are in the white rectangle and hydrophilic amino acids are in the shaded rectangle. Interactions between the Glu (E) and Lys (K) residues are shown as dark lines depicting salt bridges.
Figure 6 shows the amino acid sequence of the ASP1 tetramer (SEQ ID NO:2). Hydrophilic amino acids are underlined and β-turns are indicated.
Figures 7A-7B show the protein content of plants. Figure 7A shows the overall protein content determined by amino acid analysis. P-2 TC is a control plant and P-7 T, P-11 T, P-17 T and P-29 T are plants transformed with ASP1 tetramer. Figure 7B shows the % increase of protein content in the transformed plants as compared to the control plant. These data were derived from seedlings obtained from transformed mother plants. A minimum of four separate assays were used and the variation was no more than 30%.
Figure 8A depicts the overall protein content of leaves from control and ASP1 tetramer seedlings. The plants are labeled as for Figures 7A-7B. Figure 8B shows the % increase in protein content for the transformed plants as compared to the control plant.
DETAILED DESCRIPTION OF THE INVENTION
The present invention uses quite a different approach. Rather than mutate or transfer a gene for a naturally occurring protein, an artificial protein has been constructed de novo. This de novo protein has nutritionally balanced proportions of the essential amino acids, is stable following expression in a plant, and shares some of the characteristics of naturally occurring plant storage proteins. Transgenic plants have been produced which contain such a gene. These plants not only produce more essential amino acids compared to controls, but surprisingly the total amount of protein produced by these plants is also increased. Furthermore, the total amount of nonproteinaceous components can also be increased via these methods.
There are at least two fundamental difficulties in achieving efficient expression of designed proteins. First, it is not yet known what stabilizes a protein against proteolytic breakdown and second, the mechanisms for folding of an amino acid sequence into a biologically stable tertiary structure have not yet been fully delineated. For the construction of DNP 1 (Designed Nutritional Protein), we focused on the design of a physiologically stable as well as a highly nutritious, storage protein-like, artificial protein.
DESIGNED NUTRITIONAL PROTEINS
We designed the synthetic protein DNP 1 to contain a high content of those amino acids which are essential to the diet of animals. The optimized content of essential amino acids for this new protein was obtained empirically by determining the amounts of essential amino acids necessary for normal metabolism of the animal. See Table 1, which gives essential amino acid requirement (grams/day) (in the following order) for children at 3 months, children at 5 years, children at 10 years, average for children at these three ages, adults at 25 years, adults at 75 years, average for adults at these two ages, and overall average. We also determined the 'deficiency values' or the ratios of deficient essential amino acids for the 10 primary crops animals consume throughout the world (Figure 3). See Table 2, which gives essential amino acid deficiency ratios for the ten major crop plants consumed by humans. From these data, we then found the ratio of essential amino acids needed to totally complement each particular plant foodstuff. We averaged these values and derived a set of numbers we call the 'Average Ratio for All Crops Idealized to the DNP 1 Monomer' (Figure 4). This set of numbers represents the ratio of essential amino acids necessary to complement the deficiencies found in all 10 crops for all human age groups.

Table 1

E.A.A.     Infant   Child    Child    Child    Adult    Adult    Adult    Overall
           3 mo     5 yr     10 yr    Ave      25 yr    75 yr    Ave      Ave
Ile        0.258    0.524    0.879    0.554    0.754    0.754    0.754    0.654
Leu        0.521    1.234    1.382    1.046    1.102    1.102    1.102    1.074
Lys        0.370    1.150    1.382    0.967    1.020    1.020    1.020    0.994
Met+Cys    0.235    0.468    0.691    0.465    0.986    0.986    0.986    0.725
Phe+Tyr    0.403    1.178    1.124    0.902    1.102    1.102    1.102    1.002
Thr        0.241    0.636    0.879    0.585    0.522    0.522    0.522    0.554
Trp        0.095    0.185    0.240    0.173    0.260    0.260    0.260    0.217
Val        0.308    0.655    0.785    0.583    0.754    0.754    0.754    0.668

Table 2

E.A.A.       Wheat      Corn    Rice            Barley    Sorghum
Ile          1.72       2.23    1.71            1.94      1.51
Leu          1.85       1.25    1.57            1.98      0.85
Lys          4.08       3.36    3.03            3.68      4.54
Met + Cys    1.73       2.41    3.86            2.53      2.55
Phe + Tyr    1.20       1.30    1.10            1.32      1.49
Thr          2.14       1.67    1.75            2.01      1.89
Trp          1.61       2.20    1.78            1.37      1.69
Val          1.68       1.39    1.20            1.18      1.50

E.A.A.       Cassava    Taro    Sweet Potato    Potato    Plantain
Ile          1.88       1.69    1.92            1.68      1.33
Leu          2.13       1.64    2.70            2.45      2.10
Lys          2.83       2.29    2.96            2.08      2.24
Met + Cys    3.18       3.52    2.84            3.53      3.74
Phe + Tyr    1.58       2.37    1.29            1.67      1.71
Thr          1.58       1.55    1.62            1.54      2.28
Trp          1.02       1.41    1.37            1.60      1.39
Val          1.23       1.31    1.19            1.01      1.36
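The averaging step described above can be illustrated with the deficiency ratios of Table 2. The text does not spell out exactly how the 'Average Ratio for All Crops' was computed, so the following Python sketch assumes a plain arithmetic mean over the ten crops for each essential amino acid; the variable and function names are illustrative only.

```python
from statistics import mean

# Deficiency ratios from Table 2, in the order: wheat, corn, rice, barley,
# sorghum, cassava, taro, sweet potato, potato, plantain.
DEFICIENCY = {
    "Ile":     [1.72, 2.23, 1.71, 1.94, 1.51, 1.88, 1.69, 1.92, 1.68, 1.33],
    "Leu":     [1.85, 1.25, 1.57, 1.98, 0.85, 2.13, 1.64, 2.70, 2.45, 2.10],
    "Lys":     [4.08, 3.36, 3.03, 3.68, 4.54, 2.83, 2.29, 2.96, 2.08, 2.24],
    "Met+Cys": [1.73, 2.41, 3.86, 2.53, 2.55, 3.18, 3.52, 2.84, 3.53, 3.74],
    "Phe+Tyr": [1.20, 1.30, 1.10, 1.32, 1.49, 1.58, 2.37, 1.29, 1.67, 1.71],
    "Thr":     [2.14, 1.67, 1.75, 2.01, 1.89, 1.58, 1.55, 1.62, 1.54, 2.28],
    "Trp":     [1.61, 2.20, 1.78, 1.37, 1.69, 1.02, 1.41, 1.37, 1.60, 1.39],
    "Val":     [1.68, 1.39, 1.20, 1.18, 1.50, 1.23, 1.31, 1.19, 1.01, 1.36],
}


def average_deficiency(table):
    """Arithmetic mean of the deficiency ratio over all crops (an assumption)."""
    return {amino_acid: mean(ratios) for amino_acid, ratios in table.items()}


averages = average_deficiency(DEFICIENCY)
# The amino acid with the largest mean ratio is the most limiting on average.
most_limiting = max(averages, key=averages.get)
```

Under this assumption, lysine and the sulfur amino acids (Met + Cys) have the largest mean deficiency ratios, which agrees with the emphasis the design places on the most limiting essential amino acids.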
From the above set of numbers, we designed a nutritional protein for humans (ASP1). The amino acid sequence for ASP1 is shown in Figure 6 and is SEQ ID NO:2. The DNA sequence used to encode this protein is shown as SEQ ID NO:1. It has 1.8 times more of the essential amino acids than zein or phaseolin. The difference in MLEAA is much greater: ASP1 contains 3 times more than phaseolin and 6.5 times more than zein. The helical region of ASP1 is amphipathic (hydrophobic residues clustered on one face of the helix while hydrophilic residues are found on the other face) and is stabilized by several Glu-Lys salt bridges (Figure 5). The helix breaker Gly-Pro-Gly-Arg (SEQ ID NO:8) has been used as a turn sequence. The design results in an antiparallel tetramer which achieves an extraordinarily stable secondary and tertiary structure even at low concentration.
The structural stability of a protein is important in determining its susceptibility to proteolysis. Most native proteins are relatively resistant to cleavage by proteolytic enzymes, whereas denatured proteins are much more sensitive (Pace and Barret 1984). Several findings suggest that the stability of a folded protein is an important determinant of its rate of degradation. Therefore, in addition to improved nutritional quality, ASP1 has been designed to have a stable storage protein-like structure in plants. Its design is based on the structurally well-studied corn storage zein proteins (Z19 and Z22), which are comprised of 9 repeated helical units (Agros et al. 1982). Each helical unit of zein, 16 to 26 amino acids long, is flanked by turn regions and forms an antiparallel helical bundle. Most of the amino acids in the helices are hydrophobic residues. ASP1, by comparison, is comprised of 4 helical repeating units, each 20 amino acids long (Figure 6). Increasing gene copy number by concatenation can increase protein yields. At the same time, gene concatenation increases the molecular mass of the encoded protein. Such an increase in size can significantly stabilize an otherwise unstable product (Shen 1984).
The gene encoding this novel peptide was chemically synthesized and cloned into an E. coli expression vector. This gene contains plant consensus sequences at the 5' end of the translation initiation site to optimize the expression of proteins in vivo. It was placed under the control of the 35S cauliflower mosaic virus (CaMV) promoter in order to permit the constitutive expression of this gene in tobacco. The gene can also be cloned into other microorganisms, such as yeasts, through standard means known in the art.
Unless otherwise clearly indicated by context, the term "ASP1" is intended to encompass any one or more of the following: (1) the peptide whose sequence is SEQ ID NO:3; (2) the peptide whose sequence is SEQ ID NO:4; (3) any polymer, copolymer, oligomer or co-oligomer of one or both of SEQ ID NO:3 and SEQ ID NO:4, such as the tetrameric ASP1 whose sequence is SEQ ID NO:2; or (4) any peptide or protein having substantially the same amino acid sequence as any of the above, and substantially the same stability upon expression in at least one plant, but whose amino acid sequence has been modified in a manner which will naturally occur to one of skill in the art, such as by insertions, deletions, and/or transpositions which are not substantially detrimental to the stability of, or to the nutritionally balanced essential amino acid composition of, the protein. By way of example, numerous transpositions, insertions, or deletions in the amino acid residue sequence of ASP1 or other proteins of the invention will occur to those of skill in the art. It will be desirable to maintain overall amphipathy of the structure to promote stability; and it will also be desirable to have as internal sequences glu-X-X-X-lys (SEQ ID NO:5), to promote salt bridges in the α-helix, which also promote protein stability. While other acid-X-X-X-base sequences may also serve this function, glu-X-X-X-lys (SEQ ID NO:5) is preferred: lysine is preferred as the base because it is an essential amino acid; and glutamic acid is preferred as the acid because it has been observed to stabilize an α-helix better than does aspartic acid. This same type of definition as set out for ASP1 also applies to all other polypeptides or proteins which are disclosed herein.
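The glu-X-X-X-lys spacing discussed above can be located in a candidate sequence with a short pattern search. The following Python sketch is illustrative only: the function name is hypothetical, sequences are assumed to be written in one-letter code, and the example sequences in the usage note are made up rather than taken from the sequence listing.

```python
import re


def find_glu_x3_lys(sequence: str) -> list[int]:
    """Return 0-based start positions of every Glu-X-X-X-Lys motif.

    A lookahead is used so that overlapping motifs are all reported.
    """
    return [m.start() for m in re.finditer(r"(?=E.{3}K)", sequence.upper())]
```

For a hypothetical sequence such as "EAAAKEAAAK" the function reports positions 0 and 5, one for each Glu residue that has a Lys four positions downstream.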
The protein should also be designed for ready digestibility by the proteases of the intended consumer. For example, frequent lysine (or arginine) sites will promote proteolytic attack by trypsin. Frequent phenylalanine (or tyrosine) sites will promote proteolytic attack by chymotrypsin.
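The digestibility rule above can be checked on a candidate design by counting potential cleavage sites. The following Python sketch is a deliberate simplification based only on the residues named in the text (Lys/Arg for trypsin, Phe/Tyr for chymotrypsin); it ignores refinements such as the influence of neighboring residues, and the function name and example sequence are hypothetical.

```python
TRYPSIN_RESIDUES = "KR"        # lysine, arginine
CHYMOTRYPSIN_RESIDUES = "FY"   # phenylalanine, tyrosine


def cleavage_sites(sequence: str, residues: str) -> list[int]:
    """Return 1-based positions after which the protease is assumed to cut.

    The final residue is skipped because there is no peptide bond after it.
    """
    seq = sequence.upper()
    return [
        position
        for position, amino_acid in enumerate(seq, start=1)
        if amino_acid in residues and position < len(seq)
    ]
```

With the made-up sequence "GPGRAKFLY", the sketch reports tryptic sites after positions 4 and 6 and a chymotryptic site after position 7.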
It may be desirable to tailor the essential amino acid content of the protein specifically to complement the essential amino acid content of a particular crop of interest, rather than an average for several crops. It may also be desirable to tailor the essential amino acid content to match the nutritional requirements of the intended consumer species. For example, an artificial storage protein to be expressed in maize might have one composition if the maize is intended for human consumption, and a somewhat different composition if the maize is intended for feeding pigs.
An amphipathic peptide or protein is one in which the hydrophobic amino acid residues are predominantly on one side, while the hydrophilic amino acid residues are predominantly on the opposite side, resulting in a peptide or protein which is predominantly hydrophobic on one face, and predominantly hydrophilic on the opposite face.
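One common way to put a number on the amphipathy defined above is the helical hydrophobic moment, which sums per-residue hydrophobicity as vectors spaced 100 degrees apart around an ideal α-helix. This is not a method described in this document; the following Python sketch is offered only as an illustration, and it assumes the Kyte-Doolittle hydropathy scale and one-letter sequence codes.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale; others could be substituted).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydrophobic_moment(sequence: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment per residue for an ideal alpha-helix.

    Each residue contributes a vector whose length is its hydropathy and
    whose direction advances by angle_deg per residue (3.6 residues/turn).
    """
    x = y = 0.0
    for index, amino_acid in enumerate(sequence.upper()):
        hydropathy = KYTE_DOOLITTLE[amino_acid]
        theta = math.radians(angle_deg * index)
        x += hydropathy * math.cos(theta)
        y += hydropathy * math.sin(theta)
    return math.hypot(x, y) / len(sequence)
```

A sequence with hydrophobic residues on one face and charged residues on the other gives a large moment, whereas a sequence with the same residue at every position gives a moment near zero over a whole number of turns.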
PREDICTION OF THE STRUCTURE OF ASP1
Without wishing to be bound by the following discussion of inferences regarding ASP1's structure, the following gives the inventor's best current information and inferences regarding that structure. The secondary structures of the ASP1 monomer and tetramer were predicted by PREDICT-SECONDARY in β-SYBYL. The information-theory method predicted a higher α-helix content than the other two prediction methods (Bayes-statistic and neural-net) in PREDICT-SECONDARY. The secondary structures predicted by information-theory gave 100% helical content for the monomer and 74% for the tetramer.
However, the accuracy of the three widely used prediction methods ranged from 49% to 56% for prediction of three states: helix, sheet, and coil (Kabsch and Sander 1983). This inaccuracy might be due to the small size of the data base and/or the fact that secondary structure is determined by tertiary interactions which are not included in the local sequences. For further predictions of structure, the structures predicted by information-theory were energy minimized using SYBYL MAXIMIN2.
A perfect amphiphilic α-helical conformation was predicted for the ASP1 monomer after minimization. The tertiary structure of the ASP1 tetramer after minimization showed the antiparallel conformation as was designed. These minimization results suggested a high probability of stable secondary structure (α-helix and β-turn) formation by the ASP1 monomer and tetramer.
STRUCTURAL ANALYSIS OF ASP1 PROTEIN
The structural stability of the ASP1 monomer and tetramer could not be determined by minimization alone. Therefore, the stability of the α-helical secondary structure of the ASP1 monomer was investigated experimentally. HPLC analysis of the gel-filtered synthetic ASP1 monomer showed that purity was more than 90%, and amino acid analysis of the purified fraction gave the expected molar ratios. This fraction was also analyzed by mass spectrometry, and the molecular weight peak corresponding to the ASP1 monomer (2896.5) was present. The secondary structure was then examined by circular dichroism (CD) analysis. CD spectra of the ASP1 monomer showed the typical pattern of alpha-helical proteins, with double minima at 208 and 222 nm in aqueous solution (data not shown).
The stability of the secondary structure can be induced by the intermolecular interaction between the helical chains (DeGrado et al. 1989). Therefore, stable aggregation between monomers, presumably through hydrophobic interactions, could stabilize the helical structure. In addition, proper packing of the apolar side chains and proper electrostatic interaction might play important roles in stabilizing the secondary structure of ASP1. The stable interaction among the monomeric ASP1 molecules is an important determinant for the proper folding into the tertiary structure of the ASP1 tetramer. Therefore, the self-association capability of the ASP1 monomers was investigated by size exclusion chromatography. The hydrodynamic behavior of this peptide showed that it was aggregated into a hexamer form with an apparent molecular weight of about 17 kD. This hexameric aggregate could be maintained in either low or high ionic strength solutions. This result provides evidence for stable, globular-type tertiary structure formation by tetrameric ASP1.
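The hexamer assignment above follows from dividing the apparent molecular weight by the monomer mass. A minimal Python sketch of that arithmetic is given below; the names are illustrative, and the apparent weight is taken as exactly 17,000 Da although the text gives it only as "about 17 kD".

```python
MONOMER_MASS_DA = 2896.5      # monomer peak reported by mass spectrometry
APPARENT_MASS_DA = 17_000.0   # "about 17 kD" from size exclusion chromatography


def estimated_subunits(apparent_mass: float, monomer_mass: float) -> int:
    """Nearest whole number of monomers consistent with the apparent mass."""
    return round(apparent_mass / monomer_mass)


subunits = estimated_subunits(APPARENT_MASS_DA, MONOMER_MASS_DA)
```

The ratio is roughly 5.9, which rounds to six subunits.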
Three potential β-turn (Gly-Pro-Gly-Arg (SEQ ID NO:8)) sequences were inserted between four monomers for the ASP1 tetramer construction. The β-turn could play an important role in the structural stability of the ASP1 tetramer when it is expressed in vivo. It can also help stabilize tertiary structure formation. The interactions between the helical monomers might be much faster due to the proximity effect when they are connected. This proximity effect might be critical for folding at the low concentrations of ASP1 tetramer that are possible when it is expressed in vivo. At the same time, the stability of the secondary structure is increased by the hydrophobic interactions between helical monomers. In addition, this β-turn sequence has a tryptic digestion site (Gly-Arg) which can increase the digestibility of this protein when it is consumed by animals.
The stability of the folded structure of a protein has a close relation to its proteolytic degradation rate (Pace and Barret 1984; Pakula and Sauer 1986; Parasell and Sauer 1989; Pakula and Sauer 1989). In this respect, we expected high stability of folded ASP1 tetramer against proteolytic degradation when it is expressed in vivo. Stable quaternary structure is essential for the formation of protein bodies of storage proteins such as zein or phaseolin (Lawrence et al. 1990). These higher order structures can be achieved through the interaction and close packing of the stable tertiary structures. The major driving force for this quaternary structure formation is also hydrophobic interaction between the tertiary structures.
INTRODUCTION OF ASP1 GENE INTO TOBACCO
The correct insertion and orientation of the pBI derivative containing the ASP1 tetramer was screened for by EcoRI and HindIII digestion (it was found in E. coli that the most stable form of the gene was the tetramer form). The EcoRI digestion gave a fragment of the expected size, 3.2 kb, which consisted of 3'NOS of ASP1 and the GUS gene (data not shown). Also, the ASP1 gene with its 35S promoter and 3'NOS sequences was detected as a 1.4 kb band by HindIII digestion. Stable transformation of the ASP1 gene into A. tumefaciens LBA4404 was confirmed by HindIII digestion of isolated plasmid DNA. The plasmid could be isolated from Agrobacterium and detected by enzyme digestion because pBI121 is a binary vector. Leaf discs, transformed with LBA4404 carrying the ASP1 gene, gave about 5 to 7 shoots two to three weeks after infection. A total of 565 kanamycin-resistant shoots were regenerated from 120 leaf discs. These shoots were excised from the leaf discs and transferred to new media to grow several more weeks, and then transferred to rooting media. After three weeks in rooting medium, 126 rooted shoots were analyzed for β-glucuronidase (GUS). Root tips of 56 out of 126 plants showed various levels of GUS activity. Not all the kanamycin-resistant shoots were GUS positive. Although kanamycin resistance was due to the expression of neomycin phosphotransferase (NPT II gene), regeneration of nontransgenic shoots in the presence of kanamycin has been reported. Therefore, some nontransformed shoots might have escaped the kanamycin selection and appeared kanamycin resistant.
Thirty-six plantlets which showed high levels of β-glucuronidase activity were transplanted into jiffy pots. After establishment of the plants, a more accurate fluorogenic assay for GUS activity was done to quantify the expression level of this gene (Table 3). GUS activity was measured as pmole 4-methylumbelliferone produced per mg protein per minute, all at an excess of 4-methylumbelliferyl glucuronide. Some of these transformed tobacco plants showed higher levels of β-glucuronidase activity compared to other plants. The level of expression might be primarily affected by whether the gene is incorporated into an active or inactive site of chromatin. Activity of chromatin, methylation of DNA and nuclease hypersensitivity are closely related to each other. It has been found that the nuclease hypersensitive sites correlate with active transcription (Gross and Garrard 1987). The degree of methylation of DNA is inversely related to gene expression. Furthermore, if the gene is located near the plant's endogenous promoter or enhancer sites, the level of expression of this gene will be increased by these nearby enhancing factors. Therefore, the difference in the levels of GUS activity between the transformed plants might be due to this positional effect, which was determined by the sites of incorporation of this gene into the tobacco genome.
Table 3

Transgenic Plant    GUS Activity (pmole 4-MU/mg protein/min)
ASP1 #1             200
ASP1 #9             315
ASP1 #11            3,790
ASP1 #13            360
ASP1 #17            2,400
ASP1 #29            200
pBI 121 #2          320
Wildtype #37        10
ANALYSIS OF TRANSFORMED PLANTS
DNA Analysis
Although GUS activity and kanamycin resistance are good indicators of transformation, rearrangement in the T-DNA after incorporation in the plant genome can inactivate or silence the other genes transferred. Correct incorporation of the ASP1 gene into the tobacco genome was therefore determined by Southern blotting using the ASP1 tetrameric fragment as a probe. A distinct 1.4 kb HindIII band appeared in 7 out of 9 tobacco genomic DNA samples analyzed, but did not appear in negative control samples. As a positive control, and to check the copy number, HindIII-digested plasmid pBI ASP1 tetramer was also loaded, corresponding to 1 and 5 copy number of the inserted gene in tobacco DNA. Multiple positive bands were observed from most of the transformed plants, with the expected size of 1.4 kb. Extra bands appeared which were bigger than 1.4 kb, and which showed different patterns between the individual plants. These results suggested that the ASP1 gene, alone or with neighboring genes, might be inserted into several sites in the chromosomes, with or without rearrangement. The copy number of the correct band varied among the plants, and ranged from 1 to 5 by densitometric measurement. The copy number of a gene can affect its expression level; a gene with a high copy number can give a higher level of expression. The impact of copy number on the extent of expression varies from one system to another. In some cases there are positive correlations to expression, but not always. It should not be expected that all the copies of the gene are equally active, because the position of a copy in the genome can affect its level of transcription. However, as the number of chromosomal sites containing foreign DNA increases, the likelihood that at least one of the pieces of DNA will integrate into a transcriptionally active region also increases.
Kanamycin Gene Segregation Test
First generation progeny from self-fertilized, transformed parents were tested for kanamycin gene segregation. Because the integrated T-DNA is inherited as a dominant Mendelian trait, the copy number of the ASP1 gene can be determined by the kanamycin segregation pattern of the progeny. The results showed that most transformed plants had multiple copies of the ASP1 gene. See Table 4, in which Kn(r) is the number showing kanamycin resistance, and Kn(s) is the number showing kanamycin sensitivity. The progenies of transformed plants carrying single, double, or triple genetic NPT II loci (the gene bestowing kanamycin resistance) are expected to segregate in 3:1, 15:1 or 63:1 ratios, respectively. Therefore, plants pBI 121 #2 and ASP1 #13 have one NPT II locus; plants #1, #11 and #29 have two NPT II loci; and plant #17 has more than three loci of the gene encoding kanamycin resistance.
Table 4

Transgenic Plant    Kn(r)    Kn(s)    Kn(r)/Kn(s)
ASP1 #1             136      8        15:1
ASP1 #11            127      8        15:1
ASP1 #13            112      37       3:1
ASP1 #17            175      1        175:1
ASP1 #29            131      9        15:1
pBI 121 #2          107      34       3:1
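The segregation reasoning above can be checked numerically. For n independent, unlinked, dominant loci in a selfed parent, the expected resistant-to-sensitive ratio is (4^n - 1):1, which reproduces the 3:1, 15:1 and 63:1 ratios quoted in the text. The following Python sketch picks the locus number whose expected ratio best fits the observed counts using a chi-square statistic; the function names and the upper limit of four loci are illustrative assumptions, and the statistic is only a rough guide when the expected number of sensitive seedlings is very small.

```python
def expected_ratio(n_loci: int) -> int:
    """Expected resistant:sensitive ratio, returned as R in R:1."""
    return 4 ** n_loci - 1


def chi_square(resistant: int, sensitive: int, n_loci: int) -> float:
    """Chi-square statistic of observed counts against the n-locus expectation."""
    total = resistant + sensitive
    expected_sensitive = total / 4 ** n_loci
    expected_resistant = total - expected_sensitive
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def best_fit_loci(resistant: int, sensitive: int, max_loci: int = 4) -> int:
    """Number of loci (1..max_loci) giving the smallest chi-square value."""
    return min(range(1, max_loci + 1),
               key=lambda n: chi_square(resistant, sensitive, n))
```

Applied to the counts in Table 4, the sketch assigns one locus to plant #13 (112:37), two loci to plant #1 (136:8), and more than three loci to plant #17 (175:1), in line with the interpretation given in the text.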
RNA Analysis
Efficient transcription of inserted ASP1 genes in the tobacco plants was tested by Northern blot analysis. The polyA RNA was analyzed using the ASP1 tetramer probe. The correct gene size transcribed was about 490 bases, which included 30 bases upstream and 170 bases downstream of the ASP1 tetramer gene. In addition to this message, eukaryotic mRNA contains polyA tails of different sizes. Therefore, the expected size of the ASP1 tetramer message should be around 600 ± 100 bases. Bands were observed which corresponded to this expected size from all the samples which were analyzed. However, the levels of transcription of the ASP1 genes were dramatically different among the different transformed plants. Transformed plant #17 accumulated 5- to 50-fold more transcripts than the other transformed plants. Such differences in accumulation could be explained by the effect of position, or by the effect of multiple copy insertion. The expression levels of the ASP1 gene and its neighboring GUS gene correlated with each other in some transformed plants (such as in plant #17), but not in all. These results suggested that the level of expression of two closely connected genes can be dramatically different. Multiple transcripts with different sized bands (500-700 bases) were observed from several transformed plants. This result might be due to multiple insertion of the ASP1 gene into the tobacco genome. These inserted genes may be rearranged, but still produce transcripts. Another possibility might be strong secondary structure which could be formed due to the four directly repeated sequences of the tetrameric ASP1 transcripts. Different mobilities could result, depending on the secondary structure.
Expression of ASP1
Standard means known in the art were used to raise polyclonal antibody against synthetic ASP1 monomer. This antibody was used to detect the production of stable ASP1 protein in tobacco. If desired, standard means known in the art can also be used to prepare monoclonal antibodies against ASP1. High levels of the tetrameric form (11.2 kD) of the ASP1 protein were detected from plant #17 by Western blot analysis (data not shown). Therefore, direct correlation was found between gene copy number, number of genetic NPT II loci, GUS expression, accumulation of ASP1 transcript and protein expression level in the case of plant #17.
Some heterologous seed proteins undergo specific degradation when expressed in transgenic plants. A significant amount of the immunoreactive protein accumulated in tobacco seed expressing the phaseolin gene is smaller than the final processed protein (Sengupta et al. 1985). A similar result was found when β-conglycinin was expressed in transgenic petunia (Beachy et al. 1985). In contrast to these results, the ASP1 protein appears to be quite stable in transgenic tobacco plants.
Amino acid and total protein analyses were conducted on leaf tissue from several of the transgenic plants which produced detectable levels of ASP1. Surprisingly, we found that the overall levels of all amino acids were increased, with some of the plants being remarkably high (Table 5, Figures 7A-7B). These data were derived from seedlings from transformed mother plants. A minimum of four separate assays was used; variation was no more than 30%. The data shown in Table 5 were determined from the dry weight of the whole plant. This rather disconcerting result has been repeated numerous times and the overall levels of all amino acids in the transgenic plants remain significantly elevated. See Table 6 and Figures 8A-8B. Table 6 gives percentages of the amino acids above that of the control (pBI 121 #2) for protein isolated from various ASP1 tetramer seedlings. These values were derived from amino acid analysis. Figures 8A-8B depict overall protein content from equivalent samples (by weight) taken from leaves of control (P-2TC) and various ASP1 tetramer seedlings. Other methods of determining overall protein content have been used with similar trends observed. For example, comparison of total protein densitometric values derived from SDS-PAGE of equivalent samples (on a weight basis) yields the same results (data not shown). Therefore, in addition to being a very stable protein in a plant cell, ASP1 must function as a general 'protein-stabilizer' and reduce overall protein turnover without apparent deleterious effects to the plants, since there is no observable difference in growth characteristics in the plants producing high amounts of ASP1 as compared to control plants.
Table 5

Plant                  % Total Protein    % Above Control
Transformed Control    12                 0
ASP1 #7                19                 58
ASP1 #11               28                 133
ASP1 #17               24                 100
ASP1 #29               17                 42
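The "% Above Control" column of Table 5 follows from the total protein values by a simple relative-difference calculation. The following Python sketch reproduces that arithmetic; the function and variable names are illustrative, and rounding to the nearest whole percent is an assumption about how the table was prepared.

```python
def percent_above_control(value: float, control: float) -> float:
    """Relative increase of value over control, in percent."""
    return (value - control) / control * 100


CONTROL_PROTEIN = 12  # % total protein, transformed control (Table 5)
TOTAL_PROTEIN = {"ASP1 #7": 19, "ASP1 #11": 28, "ASP1 #17": 24, "ASP1 #29": 17}

increases = {
    plant: round(percent_above_control(protein, CONTROL_PROTEIN))
    for plant, protein in TOTAL_PROTEIN.items()
}
```

The computed increases of 58, 133, 100 and 42 percent match the values listed in Table 5.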
Table 6

Amino Acid    % of 7 Above C    % of 11 Above C    % of 17 Above C    % of 29 Above C
Asx           60.55             80.59              47.00              35.00
Glx           65.26             56.18              46.42              20.68
Ser           30.00             109.67             73                 28.00
Gly           14.46             115.96             78.01              23.69
His           31.30             94.27              63.74              27.23
Arg           39.06             86.5               58.28              23.24
Thr           31.00             106.55             79.48              23.44
Ala           6.21              123.91             76.55              27.54
Pro           11.68             114.53             73.65              19.66
Tyr           254.95            261.26             236.49             121.02
Val           14.45             80.06              53.32              23.70
Met           ND                ND                 ND                 ND
Cys           ND                ND                 ND                 ND
Ile           19.86             69.51              45.3               22.65
Leu           3.95              99.81              68.17              22.54
Phe           2.65              101.77             72.42              14.45
Lys           -25.90            119.51             79.34              3.61
Trp           ND                ND                 ND                 ND
Expression of ASP1 in Sweet Potato
The above results indicate the surprising overall increase in protein production in tobacco plants which were transformed with ASP1. Similar results have also been found in sweet potato. These results indicate that the increase in total protein content is a general phenomenon which is applicable to at least most plants. Table 7 lists the percentage of total protein, as a function of dry weight, of the transformed controls and ASP1 transformants of sweet potato. The numbers are the average of 5 separate assays. Table 8 indicates the amount of essential amino acid in mg/100 grams edible portion of the sweet potato and the numbers are the average of 3 separate assays. Table 9 illustrates the percentage of these essential amino acids compared to the transformed control, the numbers being the average of 3 separate assays. Table 10 shows data for a repeat of experiments as done in Table 8 but with the content of more of the amino acids determined. The numbers in Table 10 are the average of 3 separate assays. Table 11 shows the increase in transformant #5. Table 12 shows the % protein (wet weight basis) of roots and leaves, with the numbers being the average of at least 3 separate assays, while Table 13 depicts the overall protein content of the roots of transformed plants on a dry weight basis and percent dry matter and overall moisture content.
Table 7 Overall Protein Content and Percentage of the Control Transformed Plant
% Total Protein % of Control
Transformed Control 3.3 ± 0.31 100
ASP1 Transformant 1 6.3 ± 0.46 191
ASP1 Transformant 2 5.2 ± 0.14 158
ASP1 Transformant 3 4.8 ± 0.06 146
ASP1 Transformant 4 9.1 ± 0.16 276
ASP1 Transformant 5 9.6 ± 0.19 291
Table 8 Essential Amino Acid Content in mg/100 Grams Edible Portion
Essential AA T-Control ASP1 T1 ASP1 T2 ASP1 T3 ASP1 T4 ASP1 T5
Isoleucine 90 225 255 290 315 270
Leucine 175 360 415 465 455 430
Lysine 148 275 315 350 395 365
Methionine 55 115 135 135 155 135
Phenylalanine 135 275 340 375 430 350
Threonine 135 225 305 350 385 325
Table 9 Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA T-Control ASP1 T1 ASP1 T2 ASP1 T3 ASP1 T4 ASP1 T5
Isoleucine 100 250 283 322 350 300
Leucine 100 206 237 266 260 246
Lysine 100 186 213 237 267 247
Methionine 100 209 246 246 282 246
Phenylalanine 100 204 252 277 319 259
Threonine 100 167 226 259 285 241
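The relationship between Table 8 and Table 9 is a simple ratio to the transformed control. The following is a minimal sketch of that arithmetic for three of the rows, with the values transcribed from the two tables; the function names are illustrative assumptions. The computed percentages agree with the reported ones to within about one percentage point, which presumably reflects rounding of the underlying assay averages.

```python
# Values transcribed from Table 8 (mg per 100 g edible portion).
# Each row is ordered: transformed control, then ASP1 T1..T5.
table8 = {
    "Isoleucine": [90, 225, 255, 290, 315, 270],
    "Leucine": [175, 360, 415, 465, 455, 430],
    "Lysine": [148, 275, 315, 350, 395, 365],
}

# Corresponding rows of Table 9 (percent of transformed control).
table9 = {
    "Isoleucine": [100, 250, 283, 322, 350, 300],
    "Leucine": [100, 206, 237, 266, 260, 246],
    "Lysine": [100, 186, 213, 237, 267, 247],
}

def percent_of_control(row):
    """Express every entry of a row as a percentage of its first entry."""
    control = row[0]
    return [100.0 * value / control for value in row]

def max_discrepancy(name):
    """Largest absolute gap between computed and reported percentages."""
    computed = percent_of_control(table8[name])
    return max(abs(c - r) for c, r in zip(computed, table9[name]))

for name in table8:
    rounded = [round(p, 1) for p in percent_of_control(table8[name])]
    print(name, rounded, "max gap:", round(max_discrepancy(name), 2))
```

The same calculation applies to any other row of Table 8, or to Table 10 when comparing a transformant with its control.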
Table 10 Essential Amino Acid Content in mg/100 grams Edible Portion (Sweet Potato)
Essential AA T-Control ASP1 T1 ASP1 T2 ASP1 T3 ASP1 T4 ASP1 T5
Isoleucine 80 320 420 383 388 433
Leucine 155 567 680 633 625 687
Lysine 125 450 510 493 493 537
Methionine 30 143 190 173 165 197
Phenylalanine 90 497 600 560 540 617
Threonine 105 423 480 430 445 487
Tryptophan 0.5 83 65 51 57 61
Valine 110 473 610 573 588 637
Nonessential AA
Aspartic Acid 230 2,260 1,267 1,533 2,395 2,567
Serine 95 513 450 547 558 660
Glutamic Acid 245 1,100 913 993 1,000 1,210
Proline 110 223 270 333 358 400
Glycine 100 387 333 397 393 443
Alanine 105 473 367 397 480 487
Tyrosine 55 327 310 367 358 403
Histidine 40 197 153 160 198 210
Arginine 95 417 357 590 450 507
Ammonium 45 250 140 160 260 277
% Protein 2.300 10.150 7.593 9.473 10.403 11.083
Table 11 Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA T-Control ASP1 T5
Isoleucine 100 542
Leucine 100 443
Lysine 100 429
Methionine 100 656
Phenylalanine 100 685
Threonine 100 464
Tryptophan 100 1,230
Valine 100 579
Table 12 Overall Protein Content (Fresh Weight) in Storage Roots and Leaves
% Protein in Roots % Protein in Leaves
Transformed Control 0.36 0.94
ASP1 Transformant 1 2.42 1.93
ASP1 Transformant 2 1.60 1.21
ASP1 Transformant 3 2.11 1.26
ASP1 Transformant 4 2.23 1.60
ASP1 Transformant 5 2.46 2.03
Table 13 Various Composition of Roots Content on a Dry Weight Basis
% Protein % Dry Matter % Moisture
Transformed Control 2.3 17.0 83
ASP1 Transformant 1 10.2 23.7 76
ASP1 Transformant 2 8.0 21.7 79
ASP1 Transformant 3 9.5 23.0 78
ASP1 Transformant 4 10.6 20.7 80
ASP1 Transformant 5 11.7 21.9 79
In a field study of 5 separate lines of sweet potato transformed with ASP1, 3 of the 5 lines grew more slowly than the control plants and produced fewer storage roots, while the remaining two transformed lines, which had the highest protein levels, grew normally. Application of nitrogenous fertilizer did not make any significant difference in the yield of these two lines.
Other Gene Constructs Which Can Yield Increased Overall Protein Content
Proteins similar to ASP1 will also cause plants to give elevated total protein yields when the plant is transformed with a gene construct which expresses such a protein. One such protein is HDNP1, which has the following monomeric amino acid sequence:
MLEEIFKKMTEWIEKVLKTM (SEQ ID NO:6)
hhHHhhHHhHHhhHHhhHhh (SEQ ID NO:31)
The "h" and "H" below the amino acid sequence refer to "hydrophobic" and "hydrophilic", respectively. Hydrophobic amino acids comprise: isoleucine, methionine, phenylalanine, tryptophan, valine, leucine, alanine and cysteine. Hydrophilic amino acids comprise: arginine, glutamic acid, histidine, lysine, asparagine, aspartic acid, glutamine, tyrosine and proline. Glycine, threonine and serine can act as either hydrophilic or hydrophobic amino acid residues depending upon their immediate environment. The HDNP1 monomer is composed of 20 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTM (SEQ ID NO:7).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
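The residue classes defined above can be used to check that an h/H pattern is compatible with a given sequence. The following is a minimal sketch under stated assumptions: the one-letter sets are transcribed from the hydrophobic and hydrophilic lists in the text, glycine, threonine and serine are allowed to take either mark, and the HDNP1 monomer is taken as it appears within the tetrameric sequence (SEQ ID NO:7). The function name is hypothetical and is not part of the disclosure.

```python
# One-letter codes for the residue classes named in the text.
HYDROPHOBIC = frozenset("IMFWVLAC")   # Ile, Met, Phe, Trp, Val, Leu, Ala, Cys
HYDROPHILIC = frozenset("REHKNDQYP")  # Arg, Glu, His, Lys, Asn, Asp, Gln, Tyr, Pro
AMBIGUOUS = frozenset("GTS")          # Gly, Thr, Ser may act as either class

def pattern_is_consistent(sequence, pattern):
    """Return True if every residue is compatible with its h/H mark."""
    if len(sequence) != len(pattern):
        return False
    for residue, mark in zip(sequence, pattern):
        if mark not in ("h", "H"):
            return False
        if residue in HYDROPHOBIC:
            if mark != "h":
                return False
        elif residue in HYDROPHILIC:
            if mark != "H":
                return False
        elif residue not in AMBIGUOUS:
            return False  # not one of the 20 standard residues
    return True

hdnp1_monomer = "MLEEIFKKMTEWIEKVLKTM"
hdnp1_pattern = "hhHHhhHHhHHhhHHhhHhh"
print(pattern_is_consistent(hdnp1_monomer, hdnp1_pattern))  # True
```

Note that the two threonines in this monomer carry different marks (H at position 10, h at position 19), which is why the ambiguous class has to accept either one.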
HDNP1 is quite similar to ASP1 except that the Leu in position 5 of the ASP1 monomer has been changed to Ile and the Ile in position 17 of the ASP1 monomer has been changed to Leu in HDNP1. For the tetramer, these changes are made throughout the protein, as can be seen by comparing the amino acid sequences. Yet another protein which can yield similarly elevated protein levels when plants are transformed with a gene construct expressing it is HDNP2, which has the following monomeric sequence:
MTIEWKVELKFEMKIELKMT (SEQ ID NO:9)
hHhHhHhHhHhHhHhHhHhH (SEQ ID NO:36)
This monomer is composed of 20 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMT (SEQ ID NO:10).
The tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
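The tetramers described here follow one construction rule: four copies of the monomer joined by three gpgr turns, giving 4n + 12 residues for a monomer of n residues. The following is a minimal sketch of that rule, using the HDNP2 monomer as it appears in the tetrameric sequence (SEQ ID NO:10); the function names and default parameters are illustrative assumptions.

```python
def build_tetramer(monomer, turn="gpgr", copies=4):
    """Join `copies` monomer units with the turn sequence between them."""
    return turn.join([monomer] * copies)

def expected_length(monomer_length, turn_length=4, copies=4):
    """Residue count of the assembled protein, turns included."""
    return copies * monomer_length + (copies - 1) * turn_length

hdnp2_monomer = "MTIEWKVELKFEMKIELKMT"
hdnp2_tetramer = build_tetramer(hdnp2_monomer)

print(hdnp2_tetramer)
print(len(hdnp2_tetramer), expected_length(len(hdnp2_monomer)))  # 92 92
```

The same length formula reproduces the tetramer sizes stated for the longer monomers in this document: 41 residues gives 176, 37 gives 160, 40 gives 172, and 38 gives 164.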
Protein Designs Useful for Organisms Other than Humans
The proteins ASP1, HDNP1 and HDNP2 were designed to yield high levels of essential amino acids especially suitable for humans. Each type of animal has its own set of required essential amino acids and these sets, while usually overlapping, differ from each other. Other proteins can be designed which yield higher levels of the essential amino acids more suitable for organisms other than humans. For example, pigs have one set of essential amino acids, chickens have a different set, and fish have yet a different set. Transgenic plants can be engineered specifically as feed for one particular species of animal. For example, various transgenic corn plants can be produced wherein one transgenic form is most suitable for humans, a second transgenic form will produce a high level of those essential amino acids suited for pigs, and a third transgenic form can be made which is most suited for chickens.
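One way to quantify how well a designed sequence serves this goal is to count the share of its residues that are essential for the target species. The following is a minimal sketch, assuming the eight essential amino acids listed in Table 10 as the human set and using the HDNP1 monomer as it appears in the tetrameric sequence (SEQ ID NO:7). The function name and the idea of passing a different set for another species are illustrative assumptions, not part of the disclosure.

```python
from collections import Counter

# Essential amino acids as listed in Table 10 (one-letter codes):
# Ile, Leu, Lys, Met, Phe, Thr, Trp, Val.
HUMAN_ESSENTIAL = frozenset("ILKMFTWV")

def essential_fraction(sequence, essential=HUMAN_ESSENTIAL):
    """Return the fraction of residues that belong to the essential set."""
    residues = sequence.upper()  # turn residues are written in lower case
    counts = Counter(residues)
    hits = sum(n for aa, n in counts.items() if aa in essential)
    return hits / len(residues)

hdnp1_monomer = "MLEEIFKKMTEWIEKVLKTM"
hdnp1_tetramer = "gpgr".join([hdnp1_monomer] * 4)

print(essential_fraction(hdnp1_monomer))             # 16 of 20 residues
print(round(essential_fraction(hdnp1_tetramer), 3))  # 64 of 92 residues
```

A species-specific design could be scored the same way by supplying that species' essential set as the second argument.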
The design of such proteins can be based on the design of ASP1, HDNP1 and HDNP2. One of skill in the art knows how to prepare DNA which will encode each desired protein. The following are examples of the monomeric and tetrameric forms of proteins which may be used for specific species of animals. DNAs encoding these proteins are easily designed and used to make transgenic plants as described above. A protein directed to use with swine is SDNP1, which has the amino acid sequence:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM (SEQ ID NO:11)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHHh (SEQ ID NO:32)
This monomer is composed of 41 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM (SEQ ID NO:12).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
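The text notes that DNA encoding these proteins can readily be designed. As a rough illustration of that step, the sketch below reverse-translates the SDNP1 monomer (taken as it appears in the tetrameric sequence, SEQ ID NO:12) using one codon per amino acid from the standard genetic code. The particular codon chosen for each residue and the TAA stop codon are arbitrary assumptions made for the example; this is not the gene sequence used in the disclosure, and a real design would normally weigh codon usage in the host plant.

```python
# One arbitrarily chosen codon per amino acid (standard genetic code).
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}
STOP_CODON = "TAA"

# Inverse table, used only to check the round trip.
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_FOR.items()}

def back_translate(protein):
    """Return a coding DNA sequence for `protein`, ending in a stop codon."""
    codons = [CODON_FOR[residue] for residue in protein.upper()]
    return "".join(codons) + STOP_CODON

def translate(dna):
    """Translate DNA produced by back_translate, stopping at the stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == STOP_CODON:
            break
        protein.append(AMINO_ACID_FOR[codon])
    return "".join(protein)

sdnp1_monomer = "MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM"
sdnp1_dna = back_translate(sdnp1_monomer)

print(sdnp1_dna)
print(len(sdnp1_dna))  # 3 nucleotides per residue plus the stop codon
```

Because the monomer begins with methionine, the resulting sequence starts with the ATG initiation codon without any further adjustment.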
A second protein for swine is SDNP2, which has the monomeric amino acid sequence:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM (SEQ ID NO:13)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:37)
This monomer is composed of 41 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM (SEQ ID NO:14).
The tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with poultry is PDNP1, which has the amino acid sequence:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:15)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHh (SEQ ID NO:33)
This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:16).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for poultry is PDNP2, which has the monomeric amino acid sequence:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:17)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:38)
This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:18).
The tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with fish is FDNP1, which has the amino acid sequence:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK (SEQ ID NO:19)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHH (SEQ ID NO:34)
This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK (SEQ ID NO:20).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for fish is FDNP2, which has the monomeric amino acid sequence:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK (SEQ ID NO:21)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:39)
This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK (SEQ ID NO:22).
The tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with dogs is DDNP1, which has the amino acid sequence:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:23)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:24).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for dogs is DDNP2, which has the monomeric amino acid sequence:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:25)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhh (SEQ ID NO:40)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:26).
The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with cats is CDNP1, which has the amino acid sequence:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:27)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:28).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for cats is CDNP2, which has the monomeric amino acid sequence:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:29)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:41)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:30).
The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
Use of Vectors to Increase the Production of a Second Protein for which the Plant is Transformed
The generally enhanced levels of protein production can be useful in expressing other valuable proteins. For example, if a gene coding for insulin were cloned into a plant expressing the ASP1 gene, it is expected that levels of insulin production will be higher, as compared to control plants having the insulin gene but lacking the ASP1 gene. Therefore plants which are transgenic both for the ASP1 gene (or a similar gene which also results in increased total protein production) and for a second gene which encodes a protein of interest will make more of the protein of interest than if the plant were transformed solely with the gene encoding the protein of interest. It is irrelevant whether the plant is first transformed with ASP1 or a similar gene and later transformed with a gene of interest, or whether the plant is first transformed with the gene of interest and then later transformed with ASP1 or a similar gene. Also, a transformation can be performed using both genes simultaneously.
Use of Vectors to Increase the Production of Nonprotein Products
Plants and plant cells which have been made transgenic for ASP1 or similar amphipathic proteins produce greater amounts of total protein than do nontransgenic plants or cells. As a result of this generally higher level of protein, higher levels of nonprotein products will also be made. This result is expected because there will be an increase in the levels of enzymes which are used in the synthesis of such products. For example, taxol is naturally synthesized by certain plants and the synthesis of taxol is dependent on enzymes. Increased levels of those enzymes will lead to increased levels of taxol. Similarly, many plants, e.g., sugarcane, produce sugars. Again, the synthesis of sugars is dependent on enzymes within the plant. Increased levels of these enzymes will yield increased levels of the sugars. Therefore simply making a plant or plant cell transgenic for ASP1 or a similar amphipathic protein will result in the plant or cell producing more product, wherein said product need not be a protein but is synthesized by protein (enzyme) action. Similarly, if one knows the enzymes involved in the synthetic pathway of a desired product, e.g., taxol or sugar, one can co-transform a plant or plant cell with a gene encoding ASP1 or a similar amphipathic protein and with a gene encoding the enzyme which is utilized in synthesizing the desired product. In this way one can further enhance the production of the desired product. This can be especially useful if the pathway has one limiting enzyme and the gene for this limiting enzyme is used.
Sweet potato was transformed with ASP1 and two transformed lines were assayed for sugar content and overall amount of dry matter versus moisture content. Results are shown in Table 14.
Table 14
% Sucrose % Glucose % Fructose
Control 0.54 0.06 0.06
Transformant 1 0.95 0.06 0.06
Transformant 2 1.57 0.15 0.13
Table 14 shows that Transformant 1 had an increased production of sucrose but normal production of both glucose and fructose whereas Transformant 2 had increased production of all 3 sugars as compared to the control plant.
Table 13 indicates that the overall amount of dry matter is increased from 17% in the control to roughly 22% in the transformants. This is approximately a 30% relative increase in dry matter as a percent of the total weight of the plant.
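The figures quoted for dry matter can be reproduced directly from the % Dry Matter column of Table 13. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and the unweighted mean over the five transformants is an assumption about how the "roughly 22%" figure was obtained.

```python
# % Dry Matter values transcribed from Table 13.
control_dry_matter = 17.0
transformant_dry_matter = [23.7, 21.7, 23.0, 20.7, 21.9]

# Unweighted mean over the five ASP1 transformants.
mean_dry_matter = sum(transformant_dry_matter) / len(transformant_dry_matter)

# Relative increase over the control, in percent.
relative_increase = 100.0 * (mean_dry_matter - control_dry_matter) / control_dry_matter

print(round(mean_dry_matter, 1))    # about 22.2
print(round(relative_increase, 1))  # about 30.6
```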
Use in Plants Other than Tobacco and Sweet Potato
High-level, tissue-specific expression of ASP1 or related genes can also be performed, in a manner generally analogous to that described above for tobacco and sweet potato, in certain economically important plants such as rice, wheat, barley, sorghum, maize, potato, plantain, cassava, taro, soybean, alfalfa, or a forage grass. It is desirable to incorporate suitable promoters or other regulatory sequences to encourage expression (preferably constitutive expression) primarily in the part of the plant intended as a foodstuff. For example, in rice or maize, expression is desired primarily in the seeds, while in potato or sweet potato, expression is desired primarily in the tuber. Where necessary, transformation protocols known in the art other than the Agrobacterium protocol will be used, such as transformation by DNA particle gun or via plant protoplasts. See, e.g., Klein et al. (1987) and Croughan et al. (1989). These plants can be transformed with vectors encoding not only ASP1 but also any similar protein, including any of the proteins disclosed above.
Cell Culture of Transgenic Plant Cells
It is not necessary to make transgenic plants to perform the invention. Plant cells can be made transgenic with a gene encoding ASP1 or another amphipathic protein, and these transgenic cells can be grown in culture or in a bioreactor. This avoids the necessity of having to regenerate a plant. These transgenic cells will produce enhanced levels of protein and other products, as was seen in the transgenic plants. These cells can be cotransformed with any genes of interest, for example a gene encoding insulin. The desired product will be overproduced as compared to a nontransgenic plant cell or a cell not transformed with a gene encoding ASP1 or another amphipathic protein. The desired product can be purified from the cultured cells.
As used in the claims below, unless otherwise clearly indicated by context, the term "higher plant" is intended to encompass gymnosperms, monocotyledons, and dicotyledons; as well as any cells, tissues, or organs taken or derived from any of the above, including without limitation any seeds, leaves, stems, flowers, roots, tubers, single cells, gametes, or protoplasts taken or derived from any gymnosperm, monocotyledon, or dicotyledon. Also, the term "protein" is meant to include peptides such as dipeptides or any longer peptide as well as proteins.
Although the bulk of the above discussion regarding this invention has focused on de novo proteins having an α-helical structure, the same basic approach can work in designing de novo proteins having a β-sheet structure. To generate amphipathic β-sheets (which are not believed to have been reported in nature), amino acid residues will alternate between being hydrophobic and being hydrophilic, so that one side of the structure is hydrophobic and the other side is hydrophilic. This structure was seen in the sequences disclosed above. Salt bridges to promote stability can be formed with internal sequences glu-X-lys. Other acid-X-base sequences may also serve this function. Lysine is preferred as the base because it is an essential amino acid. It may be possible to substitute aspartic acid for glutamic acid, however, to give the internal sequence asp-X-lys. Turns between adjacent monomer units may be promoted, for example, by the internal sequence gly-asn, to form oligomers or polymers of the main peptide structure.
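The acid-X-base motifs described above can be located in a sequence with a short pattern search. The following is a minimal sketch, assuming one-letter codes (E for glutamic acid, D for aspartic acid, K for lysine) and using the HDNP2 monomer as it appears in the tetrameric sequence (SEQ ID NO:10). The function name and its parameters are illustrative assumptions; finding a motif only shows that the residues are present, not that a salt bridge actually forms.

```python
import re

def acid_x_base_sites(sequence, acids="E", bases="K"):
    """Return (position, motif) pairs for every acid-X-base triplet.

    Positions are 0-based. A lookahead is used so that overlapping
    triplets are all reported.
    """
    pattern = re.compile("(?=([%s].[%s]))" % (acids, bases))
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

hdnp2_monomer = "MTIEWKVELKFEMKIELKMT"

# glu-X-lys sites only.
print(acid_x_base_sites(hdnp2_monomer))
# [(3, 'EWK'), (7, 'ELK'), (11, 'EMK'), (15, 'ELK')]

# Allowing aspartic acid as the acid (asp-X-lys) as well.
print(acid_x_base_sites("MDAKEVK", acids="ED"))
# [(1, 'DAK'), (4, 'EVK')]
```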
While the invention has been disclosed in this patent application by reference to the details of preferred embodiments of the invention, it is to be understood that the disclosure is intended in an illustrative rather than in a limiting sense, as it is contemplated that modifications will readily occur to those skilled in the art, within the spirit of the invention and the scope of the appended claims.
LITERATURE CITED
Agros, P., Pederson, K., Marks, D. and Larkins, B. A. 1982. A structural model for maize zein proteins. J. Biol. Chem. 257: 9984-9990.
Agros, P., Naravana, S. V. L., and Nielsen, N. C. 1985. Structural similarity between legumin and vicillin storage proteins from legumes. The EMBO J. 4: 1111-1117.
Altenbach, S. B., Pederson, K. W., Meeker, G., Staraci, L. C., and Sun, S. S. M. 1989. Enhancement of the methionine content of seed proteins by the expression of a chimeric gene encoding a methionine-rich protein in transgenic plants. Plant Mol. Biol. 13: 513-522.
Bachmair, A., Finley, D., and Varshavsky, A. 1986. In vivo halflife of a protein is a function of its amino-terminal residue. Science 234: 179-186.
Badley, R.A., Atkinson, D., Hauser, H., Oldani, D., Green, J.P., and Stubbs, J.M. 1975. The structure, physical and chemical properties of the soybean protein glycinin. Biochim. Biophys. Acta. 412: 214-228.
Bartels, D., and Tompson, R.D. 1983. The characterization of cDNA clones coding for wheat storage proteins. Nucleic Acids Res. 11: 2961-2977.
Beachy R.N., Chen Z.L., Horsch R.B., Rogers S.G., Hoffman N.J., and Fraley R.T. 1985. Accumulation and assembly of soybean β -conglycinin in seeds of transformed petunia plants. EMBO J. 4: 3047-3053.
Bierzynski, A., Kim, P. S., and Baldwin, R. L. 1982. A salt bridge stabilizes the helix formed by isolated c-peptide of RNAse A. Proc. Natl. Acad. Sci. U. S. A. 79: 2470-2474.
Blundell, T.L., Thornton, S. J., Burley, S. K., and Petsco, G. A. 1986. Atomic interactions. Science 234: 1005-1009.
Bollini, R. and Chrispeels, M.J. 1978. Characterization and subcellular localization of vicillin and phyto-hemagglutinin, the two major reserve proteins of Phaseolus vulgaris. Planta 142: 291-298.
Brown, J. E. and Klee, W. A. 1971. Helix-coil transition of the isolated amino terminus of Ribonuclease. Biochemistry 10: 470-476.
Chen, Z. L., Pan, N. S., and Beachy, R. N. 1988. A DNA sequence element that confers seed-specific enhancement of a constitutive promoter. The EMBO J. 7: 297-302.
Chen, Z. L., Schuler, M. A. and Beachy, R. N. 1986. Functional analysis of regulatory elements in a plant embryo-specific gene. Proc. Natl. Acad. Sci. U. S.A. 83:8560-8564.
Chou, P. Y. and Fasman, G. D. 1978. Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. 47: 45-148.
Colot, V., Robert, L. S., Kavanagh, T. A., Beavan, M. W. and Tompson, R. D. 1987. Localization of sequences in wheat endosperm protein genes which confer tissue-specific expression in tobacco. The EMBO J. 6: 3559-3564.
Creighton, T.E. 1984. Proteins. New York: Freeman.
Crouch, M., Tenberge, K., Simone, N.E., and Ferl, R. 1983. Sequence of the 1.7K storage protein of Brassica napus. Mol. Appl. Genet. 2: 273-283.
Croughan et al. 1989. Advances in Plant Biotechnology, pp. 107-114.
Degrado, W. F., and Lear, J. D. 1985. Induction of peptide conformation at apolar/water interfaces. J. Am. Chem. Soc. 107: 7684-7689.
Degrado, W. F., Wasserman, Z. R., and Lear, J.D. 1989. Protein design, a minimalist approach. Science 241: 622-628.
Esen, E. 1986. Separation of alcohol-soluble proteins (zeins) from maize into three fractions by differential solubility. Plant Physiol. 80: 623-627.
Finley, D. and Varshavsky, A. 1985. The ubiquitin system: functions and mechanisms. Trends Biochem. Sci. 10: 343-346.
Forde, B.G., Kreis, M., Williamson, M.S., Fry, R.P. and Pywell, J. 1985. Short tandem repeats shared by B- and C-hordein cDNAs suggest a common evolutionary origin for two groups of cereal storage protein genes. The EMBO J. 4: 9-15.
Goldberg, A.L., and St John, A.C. 1976. Intracellular protein degradation in mammalian and bacterial cells: part 2. Annu. Rev. of Biochem. 45: 747-803.
Greenwood, J. S., and Chrispeels, M. J. 1985. Correct targeting of the bean storage protein phaseolin in the seeds of transformed tobacco. Plant Physiol. 79: 65-71.
Gross, D.S., and Garrard, W.T. 1987. Poising chromatin for transcription. Trends in Biochem. 12: 293-296.
Ho, S. P., and Degrado, W. F. 1987. Design of a 4-helix bundle protein: synthesis of peptides which self-associate into helical protein. J. Am. Chem. Soc. 109: 6751-6758.
Hoffmann, L.E., Donaldson, D. D., and Herman, E. M. 1988. A modified storage protein is synthesized, processed, and degraded in the seeds of transgenic plants. Plant Mol. Biol. 11: 717-729.
Hoffmann, L. E., Donaldson, D. D., Bookland, R., Rashka, K. and Herman, E. M. 1987. Synthesis and protein body deposition of maize 15-kd zein in transgenic tobacco seeds. The EMBO J. 6: 3213-3221.
Hol, W. G. and Sander, H. C. 1981. Dipole of the α-helix and β-sheet: their role in protein folding. Nature 294: 532-536.
Horsch, R.B., Fry, J., Hoffmann, N., Neidermeyer, J., Rogers, S.G. and Fraley, R.T. 1988. In Plant Mol. Biol. Manual ed. S. B. Gelvin and R. A. Schilperoort, Dordrecht: Kluwer Academic.
Jaynes, J. M., Nagpala, P., Destefano, L., Denny, T., Clark, C, and Kim, J-H. 1992. Expression of a de novo designed peptide in transgenic tobacco plants confers enhanced resistance to Pseudomonas solanacearum infection. Plant Science 89: 43-53.
Jaynes, J. M., Yang, M. S., Espinoza, N. O., and Dodds, J. H. 1986. Plant protein improvement by genetic engineering: use of synthetic genes. Trends in Biotechnol. 4: 314-320.
Jones, J.D.G., and Gilbert, D.E. 1987. T-DNA structure and gene expression in petunia plants transformed by Agrobacterium tumefaciens C58 derivatives. Mol. Gen. Genet. 207: 478-485.
Kabsch, W., and Sander, C. 1983. How good are predictions of protein structure? FEBS lett. 155: 179-182.
Kane, J.F. and Hartley, D.L. 1988. Formation of recombinant protein inclusion bodies in Escherichia coli. Trends in Biotechnol. 6: 95-101.
Kasarda, D.D., Okita, T.W., Bernardin, J.E., Baecker, P.A., and Nimmo, C.C. 1984. DNA and amino acid sequences of alpha and gamma gliadins. Proc. Natl. Acad. Sci. U.S.A. 81: 4712-4716.
Keris, M., Shewry, P. R., Forde, B. G., Forde, G. and Miflin, J. 1985. Structure and evolution of seed storage proteins and their genes with particular reference to those of wheat, barley and rye. Oxford Survey of Plant Mol. and Cell Biol. 2:253-317.
Klein, T.M., Wolf, E.D., Wu, R. and Sanford, J.C. 1987. High-velocity microprojectiles for delivering nucleic acids into living cells. Nature 327:70-73.
Komoriya, A., and Chaiken, J. M. 1982. Sequence modeling using semisynthetic Ribonuclease S. J. Biol. Chem. 257: 2599-2604.
Larkins, B.A. 1983. Genetic engineering of seed storage protein. In Genetic Engineering of Plants ed. B. A. Larkins, pp. 93-120. New York: Plenum.
Larkins, B.A., Pederson, K., Mark, M.D., and Wilson, D.R. 1984. The zein protein of maize endosperm. Trends in Biochem. 9: 306-308.
Lawrence, M.C., Suzuki, E., Varghes, J.N., Davis, P.C., Van Donkelaar, A., Tulloch, P.A. and Collman, P.M. 1990. The three-dimensional structure of the seed storage protein phaseolin at 3 Å resolution. The EMBO J. 9: 9-15.
Lear, J. D., Wasserman, Z. R. and Degrado, W. F. 1988. Synthetic amphiphilic peptide model for protein ion channels. Science 240: 1177-1181.
Lending, C. R., Kriz, A., Larkins, B. A. and Bracker, C. E. 1988. Structure of maize protein bodies and immunocytochemical localization of zeins. Protoplasma 143: 51-62.
Lycett, G. W., Cory, R.D., Shirsat, A. H., Richards, D. M., and Boulter, D. 1985. The 5'-flanking regions of three pea legumin genes: comparison of DNA sequences. Nucleic Acids Res. 13: 6733-6743.
Marqusee, S. and Baldwin, R. 1987. Helix stabilization by GLU-LYS salt bridges in short peptides of de novo design. Proc. Natl. Acad. Sci. U. S. A. 84: 8898-8902.
Marries, C., Gallois, P., Copley, J. and Keris, M. 1988. The 5' flanking region of a barley B hordein gene controls tissue and developmental specific CAT expression in tobacco plants. Plant Mol. Biol. 10: 359-366.
Mutter, M. 1988. Nature's rules and chemist's tools: a way for creating novel proteins. Trends in Biochem. 13: 260-264.
Pace, C.N. and Barret, A.J. 1984. Kinetics of tryptic hydrolysis of the arginine-valine bond in folded and unfolded ribonuclease T1. Biochem. J. 219: 411-417.
Pakula, A.A. and Sauer, R.T. 1986. Bacteriophage λ Cro mutation: effect on activity and intracellular degradation. Proc. Natl. Acad. Sci. U.S.A. 82: 8829-8833.
Pakula, A.A. and Sauer, R.T. 1989. Amino acid substitutions that increase the thermal stability of the λ Cro protein. Proteins 5: 202-210.
Parasell, D.A. and Sauer, R.T. 1989. The structural stability of a protein is an important determinant of its proteolytic susceptibility in Escherichia coli. J. Biol. Chem. 264: 7590-7595.
Pederson, K., Agros, P., Naravana, S. V. L., and Larkins, B. A. 1986. Sequence analysis and characterization of a maize gene encoding a high-sulfur zein protein of Mw 15,000. J. Biol. Chem. 261: 6279-6284.
Pernollet, J. C. and Mosse, J. 1983. Structure and location of legume and cereal seed storage protein. Seed Proteins. Phytochemical Soc. of Eur. Sym. Series 20: 155-187.
Pontremoli, S. and Melloni, E. 1986. Extralysosomal protein degradation. Annu. Rev. Biochem. 55: 455-481.
Presnell, S.R., and Cohen, F.E. 1989. Topological distribution of a four-α-helix bundle. Proc. Natl. Acad. Sci. U.S.A. 86: 6592-6596.
Presta, L. G. and Rose, G. D. 1988. Helix signals in proteins. Science 240: 1632-1641.
Rafalski, J.A., Scheets, K., Metzler, M., and Peterson, D.M. 1984. Developmentally regulated plant genes: the nucleotide sequence of a wheat gliadin genomic clone. The EMBO J. 3: 1409-1415.
Richardson, J. S. and Richardson, D. C. 1988. Amino acid preferences for specific locations at the ends of α-helices. Science 240: 1648-1652.
Richardson, J. S. and Richardson, D. C. 1989. The de novo design of protein structures. Trends in Biochem. 14: 304-309.
Sanders, P.R., Winter, J.A., Barnason, A.R. and Rogers, S.G. 1987. Comparison of cauliflower mosaic virus 35S and nopaline synthetase promoters in transgenic plants. Nucleic Acids Res. 15: 1543-1558.
Scheraga, H. 1978. Use of random copolymers to determine helix-coil stability constants of the naturally occurring amino acids. Pure Appl. Chem. 50: 315-324.
Scheraga, H. A. 1985. Effect of side chain-backbone electrostatic interaction on the stability of α-helices. Proc. Natl. Acad. Sci. U. S. A. 82: 5585-5587.
Scott, R.J., and Draper, J. 1987. Transformation of carrot tissue derived from proembryogenic suspension cells: a useful model system for gene expression studies in plants. Plant Mol. Biol. 8: 265-274.
Sengupta, G. C., Reichert, N. A., Baker, R. F., Hall, T. C. and Kemp, J. D. 1985. Developmentally regulated expression of the bean β-phaseolin gene in tobacco seed. Proc. Natl. Acad. Sci. U. S. A. 82: 3320-3324.
Shen, S-H. 1984. Multiple joined genes prevent product degradation in E. coli. Proc. Natl. Acad. Sci. U. S. A. 81: 4627-4631.
Slightom, J.L., Sun, S.M., and Hall, T.C. 1983. Complete sequence of French bean storage protein gene: phaseolin. Proc. Natl. Acad. Sci. U.S.A. 80: 1897-1901.
Staswick, P. E. 1989. Preferential loss of an abundant storage protein from soybean pods during seed development. Plant Physiol. 90: 1251-1255.
Stockhaus, J., Eckes, P., Blau, A., Schell, J., and Willmitzer, L. 1987. Organ-specific and dosage-dependent expression of a leaf/stem specific gene from potato after tagging and transfer into potato and tobacco plants. Nucleic Acids Res. 15: 3479-3491.
Sueki, M., Lee, S., Power, S. P., Denton, J. B., Konishi, Y., and Scheraga, H. 1984. Helix-coil stability constants for the naturally occurring amino acids in water. Macromolecules 17: 148-155.
Twell, D. and Ooms, G. 1987. The 5' flanking DNA of a patatin gene directs tuber specific expression of a chimeric gene in potato. Plant Mol. Biol. 9: 365-375.
Wallace, J. C., Galili, G., Kawata, E. E., Cuellar, R. E., Shotwell, M. A., and Larkins, B. A. 1988. Aggregation of lysine containing zeins into protein bodies in Xenopus oocytes. Science 240: 662-664.
Wenzler, H. C., Mignery, G. A., Fisher, L. M., and Park, W. D. 1989. Analysis of a chimeric class I patatin-GUS gene in transgenic potato plants: high level expression in tubers and sucrose-inducible expression in cultured leaf and stem explants. Plant Mol. Biol. 12: 41-50.
Yang, M.S., Espinoza, N. O., Dodds, J. H., and Jaynes, J. M. 1989. Expression of a synthetic gene for improved protein quality in transformed potato plants. Plant Science. 64: 99-111.
Zimm, B. H. and Bragg, J. R. 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J. Chem. Phys. 31: 526-535.

Claims

WHAT IS CLAIMED IS:
1. A plant wherein said plant is a transgenic plant comprising a heterologous gene selected from the group consisting of (i) a gene which encodes a protein comprising an amphiphilic α-helix and (ii) a gene which encodes a protein comprising a β-pleated sheet, wherein said transgenic plant produces more protein per tissue weight than does said plant when said plant is not a transgenic plant and wherein said tissue is root, tuber, seed, leaf, stem, edible portion, flower or whole plant.
2. The plant of claim 1 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
3. The plant of claim 2 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
4. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
5. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
6. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
7. The plant of claim 6 wherein said helix breaker is SEQ ID NO:8.
8. The plant of claim 6 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic α-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of β-pleated sheet.
9. The plant of claim 7 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
10. A plant wherein said plant is a transgenic plant comprising a heterologous gene which encodes a protein comprising a combination of amphiphilic α-helix and β-pleated sheet and wherein said transgenic plant produces more protein per tissue weight than does said plant when said plant is not a transgenic plant wherein said tissue is root, tuber, seed, leaf, stem, edible portion, flower or whole plant.
11. The plant of claim 10 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
12. The plant of claim 11 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
13. The plant of claim 11 wherein X is SEQ ID NO:8.
14. A gene encoding a protein selected from the group consisting of (i) a protein comprising an amphiphilic α-helical sequence wherein said sequence comprises (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and (ii) a protein comprising a β-pleated sheet sequence wherein said sequence comprises (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
15. The gene of claim 14 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
16. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
17. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
18. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic α-helical sequence wherein each unit of amphiphilic α-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helical sequence is separated from any neighboring unit of amphiphilic α-helical sequence by a helix breaker and wherein any unit of amphiphilic α-helical sequence can be different from any other unit of amphiphilic α-helical sequence and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said β-pleated sheet sequence wherein each unit of β-pleated sheet sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet sequence is separated from any neighboring unit of β-pleated sheet sequence by a helix breaker and wherein each unit of β-pleated sheet sequence can be different from any other unit of β-pleated sheet sequence and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
19. The gene of claim 18 wherein said helix breaker is SEQ ID NO:8.
20. The gene of claim 18 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic α-helical sequence and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of β-pleated sheet sequence.
21. The gene of claim 19 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
22. A gene which encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
23. The gene of claim 22 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
24. The gene of claim 22 wherein X is SEQ ID NO:8.
25. A protein selected from the group consisting of (i) a protein comprising an amphiphilic α-helical sequence wherein said sequence comprises (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and (ii) a protein comprising a β-pleated sheet sequence wherein said sequence comprises (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
26. The protein of claim 25 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
27. The protein of claim 25 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
28. The protein of claim 25 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
29. The protein of claim 25 wherein if said protein is selected from (i) then said gene encodes multiple units of said amphiphilic α-helical sequence wherein each unit of amphiphilic α-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helical sequence is separated from any neighboring unit of amphiphilic α-helical sequence by a helix breaker and wherein any unit of amphiphilic α-helical sequence can be different from any other unit of amphiphilic α-helical sequence and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and if said protein is selected from (ii) then said gene encodes multiple units of said β-pleated sheet sequence wherein each unit of β-pleated sheet sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet sequence is separated from any neighboring unit of β-pleated sheet sequence by a helix breaker and wherein each unit of β-pleated sheet sequence can be different from any other unit of β-pleated sheet sequence and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
30. The protein of claim 29 wherein said helix breaker is SEQ ID NO:8.
31. The protein of claim 29 wherein if said protein is selected from (i) then said protein comprises 4 to 8 units of amphiphilic α-helical sequence and if said protein is selected from (ii) then said protein comprises from 4 to 8 units of β-pleated sheet sequence.
32. The protein of claim 30 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
33. A protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
34. The protein of claim 33 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
35. The protein of claim 33 wherein X is SEQ ID NO:8.
36. A plant cell wherein said plant cell is a transgenic plant cell comprising a heterologous gene selected from the group consisting of (i) a gene which encodes a protein comprising an amphiphilic α-helix and wherein said transgenic plant cell produces more protein per gram of plant cell than does said plant cell when said plant cell is not a transgenic plant cell and (ii) a gene which encodes a protein comprising a β-pleated sheet and wherein said transgenic plant cell produces more protein per gram of cells than does said plant cell when said plant cell is not a transgenic plant cell.
37. The plant cell of claim 36 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
38. The plant cell of claim 37 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
39. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
40. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
41. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and if said gene is selected from (ii) then said gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
42. The plant cell of claim 41 wherein said helix breaker is SEQ ID NO:8.
43. The plant cell of claim 41 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic α-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of β-pleated sheet.
44. The plant cell of claim 42 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
45. A plant cell wherein said plant cell is a transgenic plant cell comprising a heterologous gene which encodes a protein comprising a combination of amphiphilic α-helix and β-pleated sheet and wherein said transgenic plant cell produces more protein per gram of cells than does said plant cell when said plant cell is not a transgenic plant cell.
46. The plant cell of claim 45 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
47. The plant cell of claim 46 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
48. The plant cell of claim 46 wherein X is SEQ ID NO:8.
49. A method for increasing production of a protein or a nonprotein product in a plant of a specified species wherein said method comprises: a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic α-helical sequence and (ii) a gene which encodes a protein which comprises a β-pleated sheet sequence; b) growing said transgenic cell or cells to produce a transgenic plant; and c) growing said transgenic plant.
50. The method of claim 49 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
51. The method of claim 50 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
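For illustration only, the two unit patterns recited in claim 50, restricted to the residue groups of claim 51, can be expressed as regular expressions. The following Python sketch is not part of the claims. The names `HYDROPHOBIC`, `HYDROPHILIC`, `is_alpha_unit` and `is_beta_unit` are hypothetical, and the standard one-letter amino acid codes are assumed. Because glycine and threonine appear in both groups, a given sequence may satisfy both patterns.

```python
import re

# Residue groups of claim 51 in one-letter codes (assumed notation).
HYDROPHOBIC = "GILMFTWV"  # Gly, Ile, Leu, Met, Phe, Thr, Trp, Val
HYDROPHILIC = "REGHKT"    # Arg, Glu, Gly, His, Lys, Thr

# (H)u((h)x(H)y)z(h)w with u = 0 or 1, x and y = 1 or 2 (free to vary
# from one repeat to the next), z >= 5, and w = 0, 1 or 2.
ALPHA_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?"
    rf"(?:[{HYDROPHOBIC}]{{1,2}}[{HYDROPHILIC}]{{1,2}}){{5,}}"
    rf"[{HYDROPHOBIC}]{{0,2}}"
)

# (H)r(hH)s(h)t with r = 0 or 1, s >= 1, and t = 0, 1 or 2.
BETA_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?"
    rf"(?:[{HYDROPHOBIC}][{HYDROPHILIC}])+"
    rf"[{HYDROPHOBIC}]{{0,2}}"
)


def is_alpha_unit(seq: str) -> bool:
    """True if the whole sequence fits the amphiphilic alpha-helix pattern."""
    return bool(ALPHA_UNIT.fullmatch(seq))


def is_beta_unit(seq: str) -> bool:
    """True if the whole sequence fits the beta-pleated sheet pattern."""
    return bool(BETA_UNIT.fullmatch(seq))
```

Under these assumptions, a strictly alternating sequence such as `LKLKLKLKLK` fits both patterns, whereas `LLKKLLKKLLKKLLKKLLKK` fits only the α-helix pattern, since the β-pleated sheet pattern requires single hydrophobic and hydrophilic residues to alternate.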
52. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
53. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
54. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
55. The method of claim 54 wherein said helix breaker is SEQ ID NO:8.
56. The method of claim 54 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic α-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of β-pleated sheet.
57. The method of claim 55 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
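As a further illustration, and not as part of the claims, the multi-unit arrangement of claims 54 and 56 (units separated by a helix breaker, with 4 to 8 units per protein) can be checked by splitting a sequence at the breaker and testing each segment. The sketch below is in Python under stated assumptions: the residue groups are those of claim 51 in one-letter codes, the function names are hypothetical, and `BREAKER = "PP"` is only a stand-in chosen so the example runs. It is not SEQ ID NO:8, which is not reproduced in this section.

```python
import re
from typing import Optional

# Residue groups of claim 51 in one-letter codes (assumed notation).
HYDROPHOBIC = "GILMFTWV"
HYDROPHILIC = "REGHKT"

# Unit patterns (H)u((h)x(H)y)z(h)w and (H)r(hH)s(h)t from claim 54.
ALPHA_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?"
    rf"(?:[{HYDROPHOBIC}]{{1,2}}[{HYDROPHILIC}]{{1,2}}){{5,}}"
    rf"[{HYDROPHOBIC}]{{0,2}}"
)
BETA_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?(?:[{HYDROPHOBIC}][{HYDROPHILIC}])+[{HYDROPHOBIC}]{{0,2}}"
)

# Hypothetical placeholder breaker, NOT the sequence of SEQ ID NO:8.
BREAKER = "PP"


def count_units(protein: str, unit: "re.Pattern[str]",
                breaker: str = BREAKER) -> Optional[int]:
    """Return the number of breaker-separated units, or None if any
    segment fails to match the given unit pattern in full."""
    segments = protein.split(breaker)
    if all(unit.fullmatch(segment) for segment in segments):
        return len(segments)
    return None


def has_4_to_8_units(protein: str, unit: "re.Pattern[str]",
                     breaker: str = BREAKER) -> bool:
    """True if the protein consists of 4 to 8 valid units (claim 56)."""
    n = count_units(protein, unit, breaker)
    return n is not None and 4 <= n <= 8
```

For example, `has_4_to_8_units(BREAKER.join(["LKLKLKLKLK"] * 4), ALPHA_UNIT)` evaluates to true, while the same construction with three units does not. Splitting on the breaker is unambiguous here only because the placeholder contains no residue from either group; a real breaker sequence might require a different parsing strategy.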
58. A method for increasing production of a protein or a nonprotein product in a plant of a specified species wherein said method comprises: a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene encodes a protein which comprises a combination of amphiphilic α-helical sequence and β-pleated sheet sequence; b) growing said transgenic cell or cells to produce a transgenic plant; and c) growing said transgenic plant.
59. The method of claim 58 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
60. The method of claim 59 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
61. The method of claim 59 wherein X is SEQ ID NO:8.
62. A method for increasing protein production in plant cells of a specified species wherein said method comprises: a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic α-helical sequence and (ii) a gene which encodes a protein which comprises a β-pleated sheet sequence; and b) growing said transgenic cell or cells in culture or in a bioreactor.
63. The method of claim 62 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
64. The method of claim 63 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
65. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
66. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
67. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
68. The method of claim 67 wherein said helix breaker is SEQ ID NO:8.
69. The method of claim 67 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic α-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of β-pleated sheet.
70. The method of claim 68 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
71. A method for increasing production of a protein or a nonprotein product in plant cells of a specified species wherein said method comprises: a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene encodes a protein which comprises a combination of amphiphilic α-helical sequence and β-pleated sheet sequence; and b) growing said transgenic cell or cells in culture or in a bioreactor.
72. The method of claim 71 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
73. The method of claim 72 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
74. The method of claim 72 wherein X is SEQ ID NO:8.
75. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant of a specified species wherein said first protein is encoded by a first gene with which said plant is transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene is selected from the group consisting of (i) a heterologous gene which encodes a protein which comprises an amphiphilic α-helical sequence and (ii) a heterologous gene which encodes a protein which comprises a β-pleated sheet sequence;
(d) growing said transgenic cell or cells to produce a transgenic plant comprising both said first gene and said second gene; and
(e) growing said transgenic plant, wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.
76. The method of claim 75 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
77. The method of claim 76 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
78. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
79. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
80. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y), and wherein if said second gene is selected from (ii) then said second gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
81. The method of claim 80 wherein said helix breaker is SEQ ID NO:8.
82. The method of claim 80 wherein if said second gene is selected from (i) then said second gene encodes from 4 to 8 units of amphiphilic α-helix and if said second gene is selected from (ii) then said second gene encodes from 4 to 8 units of β-pleated sheet.
83. The method of claim 81 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
84. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant of a specified species wherein said protein is encoded by a first gene with which said plant is transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene encodes a protein which comprises a combination of amphiphilic α-helical sequence and β-pleated sheet sequence and wherein said second gene does not naturally occur in said plant;
(d) growing said transgenic cell or cells to produce a transgenic plant comprising both said first gene and said second gene; and
(e) growing said transgenic plant, wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.
85. The method of claim 84 wherein said second gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
86. The method of claim 85 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
87. The method of claim 85 wherein X is SEQ ID NO:8.
88. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant cell or plant cells of a specified species wherein said first protein is encoded by a first gene with which said plant cell or plant cells are transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic α-helical sequence and wherein said gene does not naturally occur in said plant and (ii) a gene which encodes a protein which comprises a β-pleated sheet sequence and wherein said gene does not naturally occur in said plant; and
(d) growing said transgenic cell or cells in culture or in a bioreactor, wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.
89. The method of claim 88 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
90. The method of claim 89 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
91. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
92. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
93. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes multiple units of said amphiphilic α-helix wherein each unit of amphiphilic α-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix is separated from any neighboring unit of amphiphilic α-helix by a helix breaker and wherein any unit of amphiphilic α-helix can be different from any other unit of amphiphilic α-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y), and wherein if said second gene is selected from (ii) then said second gene encodes multiple units of said β-pleated sheet wherein each unit of β-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of β-pleated sheet is separated from any neighboring unit of β-pleated sheet by a helix breaker and wherein any unit of β-pleated sheet can be different from any other unit of β-pleated sheet and wherein a value of r in one unit of β-pleated sheet need not be the same as a value of r in another unit of β-pleated sheet and wherein a value of s in one unit of β-pleated sheet need not be the same as a value of s in another unit of β-pleated sheet and wherein a value of t in one unit of β-pleated sheet need not be the same as a value of t in another unit of β-pleated sheet.
94. The method of claim 93 wherein said helix breaker is SEQ ID NO:8.
95. The method of claim 93 wherein if said second gene is selected from (i) then said second gene encodes from 4 to 8 units of amphiphilic α-helix and if said second gene is selected from (ii) then said second gene encodes from 4 to 8 units of β-pleated sheet.
96. The method of claim 94 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
97. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant cell or plant cells of a specified species wherein said protein is encoded by a first gene with which said plant cell or plant cells are transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said plant cell or plant cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene encodes a protein which comprises a combination of amphiphilic α-helical sequence and β-pleated sheet sequence and wherein said gene does not naturally occur in said plant; and
(d) growing said transgenic cell or cells in culture or in a bioreactor, wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.
98. The method of claim 97 wherein said second gene encodes a protein comprising a sequence of units of ((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm)p or ((H)r(hH)s(h)tXm)p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).
99. The method of claim 98 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
100. The method of claim 98 wherein X is SEQ ID NO:8.
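
The unit definitions recited in claims 93 and 98, (H)u((h)x(H)y)z(h)w for an amphiphilic α-helix unit and (H)r(hH)s(h)t for a β-pleated sheet unit, can be read as regular expressions over residue classes. The following is a minimal illustrative sketch, not part of the claims, assuming one-letter amino acid codes and the h and H groups listed in claim 99. The function names and the test peptides are hypothetical and are not taken from the sequence listing. The check only tests the hydrophobic/hydrophilic pattern; it says nothing about whether a peptide actually folds as a helix or sheet.

```python
import re

# Residue classes from claim 99, written as one-letter codes (an assumption
# of this sketch). Glycine (G) and threonine (T) appear in both lists, so
# they may be counted as either class.
HYDROPHOBIC = "GILMFTWV"  # h
HYDROPHILIC = "REGHKT"    # H

h = f"[{HYDROPHOBIC}]"
H = f"[{HYDROPHILIC}]"

# (H)u((h)x(H)y)z(h)w with u = 0 or 1, x and y = 1 or 2 (free to vary per
# repeat), z >= 5, and w = 0, 1 or 2.
ALPHA_UNIT = re.compile(rf"{H}?(?:{h}{{1,2}}{H}{{1,2}}){{5,}}{h}{{0,2}}")

# (H)r(hH)s(h)t with r = 0 or 1, s >= 1, and t = 0, 1 or 2.
BETA_UNIT = re.compile(rf"{H}?(?:{h}{H})+{h}{{0,2}}")


def is_alpha_unit(seq: str) -> bool:
    """True if seq fits the amphiphilic alpha-helix unit pattern."""
    return ALPHA_UNIT.fullmatch(seq.upper()) is not None


def is_beta_unit(seq: str) -> bool:
    """True if seq fits the beta-pleated sheet unit pattern."""
    return BETA_UNIT.fullmatch(seq.upper()) is not None


if __name__ == "__main__":
    # Hypothetical peptides chosen only to exercise the patterns.
    for peptide in ("LLKKLLKKLLKKLLKKLLKK", "LKLK", "AAAA"):
        print(peptide, is_alpha_unit(peptide), is_beta_unit(peptide))
```

Because x and y may each be 1, a strictly alternating peptide with at least five hH pairs satisfies both patterns; the two definitions overlap by construction. Splitting a full protein into units at the helix breaker or at the X spacers is not shown, since the sequence of SEQ ID NO:8 is not reproduced in this section.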
PCT/US1999/009067 1998-04-27 1999-04-27 A method for increasing the protein content of plants WO1999055890A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CA002325463A CA2325463A1 (en) 1998-04-27 1999-04-27 A method for increasing the protein content of plants
AU38681/99A AU3868199A (en) 1998-04-27 1999-04-27 A method for increasing the protein content of plants

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US6605698A 1998-04-27 1998-04-27
US09/066,056 1998-04-27

Publications (1)

Publication Number Publication Date
WO1999055890A1 (en) 1999-11-04

Family

ID=22066979

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/009067 WO1999055890A1 (en) 1998-04-27 1999-04-27 A method for increasing the protein content of plants

Country Status (4)

Country Link
AR (1) AR016229A1 (en)
AU (1) AU3868199A (en)
CA (1) CA2325463A1 (en)
WO (1) WO1999055890A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004111244A2 (en) * 2003-06-17 2004-12-23 Sembiosys Genetics Inc. Methods for the production of insulin in plants

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1989004371A1 (en) * 1987-11-02 1989-05-18 Louisiana State University Agricultural And Mechan Plants genetically enhanced for disease resistance
WO1993003160A1 (en) * 1991-08-09 1993-02-18 E.I. Du Pont De Nemours And Company Synthetic storage proteins with defined structure containing programmable levels of essential amino acids for improvement of the nutritional value of plants
WO1997028247A2 (en) * 1996-01-29 1997-08-07 Biocem AMINO ACID-ENRICHED PLANT PROTEIN RESERVES, PARTICULARLY LYSINE-ENRICHED MAIZE γ-ZEIN, AND PLANTS EXPRESSING SUCH PROTEINS

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JAYNES, J M: "De novo designed synthetic plant storage proteins: Enhancing protein quality of plants for improved human and animal nutrition", BIOTECHNOL. FEED IND., PROC. ALLTECH'S ANNU. SYMP., 10TH (1994), 129-53. EDITORS: LYONS, T. P.; JACQUES, K. A.; PUBLISHER: NOTTINGHAM UNIVERSITY PRESS, LOUGHBOROUGH, UK., XP002115146 *
KIM, JAE HO ET AL: "Enhancing the nutritional quality of crop plants: design, construction, and expression of an artificial plant storage protein gene.", MOL. APPROACHES IMPROV. FOOD QUAL. SAF. (1992), 1-36. EDITORS: BHATNAGAR, D.; CLEVELAND, T. E.; PUBLISHER: VAN NOSTRAND REINHOLD, NEW YORK, USA, XP002115145 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004111244A2 (en) * 2003-06-17 2004-12-23 Sembiosys Genetics Inc. Methods for the production of insulin in plants
WO2004111244A3 (en) * 2003-06-17 2005-02-03 Sembiosys Genetics Inc Methods for the production of insulin in plants
EP1959017A1 (en) * 2003-06-17 2008-08-20 SemBioSys Genetics Inc. Methods for the production of insulin in plants
US7547821B2 (en) 2003-06-17 2009-06-16 Sembiosys Genetics Inc. Methods for the production of insulin in plants
EA014887B1 (en) * 2003-06-17 2011-02-28 Сембайосиз Джинетикс Инк. Method for the expression of insulin in plant seeds, a method for obtaining plant seeds comprising insulin and a plant capable of setting seeds comprising insulin

Also Published As

Publication number Publication date
AR016229A1 (en) 2001-06-20
CA2325463A1 (en) 1999-11-04
AU3868199A (en) 1999-11-16

Similar Documents

Publication Publication Date Title
Mandal et al. Seed storage proteins and approaches for improvement of their nutritional quality by genetic engineering
AU704510B2 (en) Chimeric genes and methods for increasing the lysine content of the seeds of corn, soybean and rapeseed plants
US7365242B2 (en) Suppression of specific classes of soybean seed protein genes
US5559223A (en) Synthetic storage proteins with defined structure containing programmable levels of essential amino acids for improvement of the nutritional value of plants
US5850016A (en) Alteration of amino acid compositions in seeds
CA2242903C (en) Amino acid-enriched plant protein reserves, particularly lysine-enriched maize γ-zein, and plants expressing such proteins
WO1998045458A1 (en) An engineered seed protein having a higher percentage of essential amino acids
Keeler et al. Expression of de novo high-lysine α-helical coiled-coil proteins may significantly increase the accumulated levels of lysine in mature seeds of transgenic tobacco plants
Kim et al. Enhancing the nutritional quality of crop plants: design, construction, and expression of an artificial plant storage protein gene
JPH08507219A (en) Improved forage crops rich in sulfur-containing amino acids and improved methods
Dennis et al. Molecular approaches to crop improvement
US6548744B1 (en) Reduction of bowman-birk protease inhibitor levels in plants
WO1999055890A1 (en) A method for increasing the protein content of plants
Jaynes De novo designed synthetic plant storage proteins: enhancing protein quality of plants for improved human and animal nutrition.
Shotwell et al. Improvement of the protein quality of seeds by genetic engineering
WO2002086077A2 (en) Co-expression of zein proteins
WO2007044886A2 (en) Methods to produce desired proteins in plants
Kawakatsu et al. 1. Seed storage proteins
Kim The design and expression of a synthetic gene encoding a novel polypeptide to enhance the quality of plant protein
Kim et al. Expression of de novo designed high nutritional peptide (HEAAE) in tobacco
EP1847613A2 (en) Suppression of specific classes of soybean seed protein genes
Larkins Transgenic Plants for Improving Seed Storage Proteins
Lin Testing the Cruciferin Deficient Mutant, ssp-1, of Arabidopsis thaliana, as a Vehicle for Overexpression of Foreign Proteins
MXPA00007707A (en) Alteration of amino acid compositions in seeds
AU2004202195A1 (en) Reduction of Bowman-Birk protease inhibitor levels in plants

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A1; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW
AL Designated countries for regional patents; Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase; Ref document number: 2325463; Country of ref document: CA; Ref country code: CA; Ref document number: 2325463; Kind code of ref document: A; Format of ref document f/p: F
NENP Non-entry into the national phase; Ref country code: KR
WWE Wipo information: entry into national phase; Ref document number: 38681/99; Country of ref document: AU
REG Reference to national code; Ref country code: DE; Ref legal event code: 8642
122 Ep: pct application non-entry in european phase