CA2325463A1

CA2325463A1 - A method for increasing the protein content of plants

Info

Publication number: CA2325463A1
Application number: CA002325463A
Authority: CA
Inventors: Jesse M. Jaynes
Original assignee: Individual
Current assignee: Demegen Inc
Priority date: 1998-04-27
Filing date: 1999-04-27
Publication date: 1999-11-04
Also published as: WO1999055890A1; AR016229A1; AU3868199A

Abstract

A de novo designed, artificial storage protein has been stably expressed in plants. This protein was designed to have high levels of all the essential amino acids needed for human nutrition. Expressing the gene coding for this protein in crop plants will greatly improve the nutritional quality of the resulting crops. The gene has also been observed to increase the overall level of protein production in a plant. This property will allow enhanced levels of production of other valuable proteins by a plant. For example, a transgenic plant with a gene encoding for insulin may produce higher levels of insulin when the plant also expresses a gene for an artificial storage protein. The method will also allow enhanced production of nonprotein products.
Cotransformation with a gene of interest can result in enhanced levels of the protein product of the gene of interest or a product synthesized thereby.

Description

TITLE OF THE INVENTION
A METHOD FOR INCREASING THE PROTEIN CONTENT OF PLANTS
BACKGROUND OF THE INVENTION
The composition of plant storage proteins, a major food reservoir for the developing seeds, determines the nutritional value of plants and grains when they are used as foods for man and domestic animals. The amount of protein varies with genotype or cultivar, but in general, cereals contain 10% of the dry weight of the seed as protein, while in legumes, the protein content varies between 20% and 30% of the dry weight. In many seeds, the storage proteins account for 50% or more of the total protein and thus determine the protein quality of seeds.
Each year the total world cereal harvest amounts to some 1,700 million tons of grain (Kreis et al. 1985). This yields about 85 million tons of cereal storage proteins each year, which contribute a majority of the total protein intake of humans and animals.
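The 85 million ton figure is consistent with the percentages quoted in the preceding paragraph. The short sketch below reproduces that arithmetic; it assumes, as an illustration only, that the figure was obtained from a 10% protein content and a 50% storage protein share, which the text does not state explicitly.

```python
# Rough check of the storage protein tonnage quoted in the text.
# Assumption: the 85 million ton figure follows from the percentages
# given for cereals (10% protein, about 50% of it storage protein).
harvest_mt = 1700.0        # world cereal harvest, million tons
protein_fraction = 0.10    # protein as a share of seed dry weight
storage_share = 0.50       # storage proteins as a share of total protein

storage_protein_mt = harvest_mt * protein_fraction * storage_share
print(f"Estimated storage protein: {storage_protein_mt:.0f} million tons")
```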
With respect to human and animal nutrition, most seeds do not provide a balanced source of protein because of deficiencies in one or more of the essential amino acids in the storage proteins. For example, humans require eight amino acids from foods (isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan and valine) to maintain a balanced diet.
Consumption of proteins of unbalanced composition of amino acids can lead to a malnourished state which is most often found in children in developing countries where plants are the major source of protein intake. Therefore, the development of nutritionally-balanced proteins for introduction into plants is of extreme importance.
AMINO ACID REQUIREMENTS
The biosynthesis of amino acids from simpler precursors is a process vital to all forms of life, as these amino acids are the building blocks of proteins. Organisms differ markedly with respect to their ability to synthesize amino acids. In fact, virtually all members of the animal kingdom are incapable of manufacturing some amino acids. There are twenty common amino acids which are utilized in the fabrication of proteins, and essential amino acids are those protein building blocks which cannot be synthesized by the animal. It is generally agreed that humans require eight of the twenty common amino acids in their diet. Protein deficiencies can usually be ascribed to a diet which is deficient in one or more of the essential amino acids. A nutritionally adequate diet must include a minimum daily consumption of these amino acids (Figure 1).
When diets are high in carbohydrates and low in protein, over a protracted period, essential amino acid deficiencies result. The name given to this undernourished condition in humans is "Kwashiorkor", which is an African word meaning "deposed child" (deposed from the mother's breast by a newborn sibling). This debilitating and malnourished state, characterized by a bloated stomach and reddish-orange discolored hair, is more often found in children than adults because of their great need for essential amino acids during growth and development. In order for normal physical and mental maturation to occur, the above mentioned daily source of essential amino acids is a requisite. Essential amino acid content, or protein quality, is as important a feature of the diet as total protein quantity or total calorie intake.
Some foods, such as milk, eggs, and meat, have very high nutritional values because they contain a disproportionately high level of essential amino acids. On the other hand, most foodstuffs obtained from plants possess a poor nutritional value because of their relatively low content of some or, in a few cases, all of the essential amino acids.
Generally, the essential amino acids which are found to be most limiting in plants are isoleucine, lysine, methionine, threonine, and tryptophan (MLEAA) (Figure 2).
It has been difficult to produce significant increases in the essential amino acid content of crop plants utilizing classical plant breeding approaches. This is primarily due to the fact that the genetics of plant breeding is complex and that an increase in essential amino acid content may be offset by a loss in other agronomically important characters. Also, it is probable that the storage proteins are very conserved in their structure and their essential amino acid composition would be little modified by these conventional techniques.
STRUCTURE AND CLASSIFICATION OF NATURAL STORAGE PROTEINS
Seed storage proteins can be characterized by several main features (Pernollet and Mosse 1983): 1) their main function is to provide amino acids or nitrogen to the young seedling; 2) the general absence of any other known function; 3) their peculiar amino acid composition in cereal and legume seeds; and 4) their localization within storage organelles called protein bodies, at least during seed development. Several classes of storage proteins are generally recognized based on their solubilities in different solvents. Proteins soluble in water are called "albumins"; proteins soluble in 5% saline, "globulins"; and proteins soluble in 70% ethanol, "prolamins". The proteins that remain following these extractions are treated further with dilute acid or alkali, and are named "glutelins". Most cereals contain primarily prolamin type proteins and can be classified into different groups on the basis of the relative proportions of prolamins, glutelins, and globulins, and the subcellular location of these proteins in the mature seed.
The first group corresponds to the Panicoideae sub-family, the second group the Triticeae tribe, and the last one to oat and rice storage proteins.
The principal members of the Panicoideae sub-family are maize, sorghum, and millet.
Their major storage proteins are prolamins (50 to 60% of seed protein) and glutelins (35 to 40% of seed protein) (Pernollet and Mosse 1983). Prolamins are stored within protein bodies, but glutelins are located both inside and outside these organelles. The Triticeae tribe, which includes wheat, barley, and rye, differs from the Panicoideae mainly in storage protein localization and structure. In the starchy endosperm of the seeds belonging to this tribe, no protein bodies are left at maturity. Clusters of proteins are then deposited between starch granules, but are no longer surrounded by a membrane.
In legumes and most other dicots, the major storage proteins are salt-soluble globulins (80%) and prolamins (10-15%). Globulins can be divided into vicilins and legumins (Argos, 1985), based on their sedimentation coefficient (7S/11S), oligomeric organization (trimeric/hexameric), and polypeptide chain structure (single chain/disulphide-linked pair of chains). In the legume seed cotyledon, protein bodies are embedded between starch granules (Pernollet and Mosse 1983). They are membrane-bound organelles, a few microns in diameter, mainly filled with storage proteins and phytates. Besides storage proteins, protein bodies also contain other proteins, such as enzymes or lectins, although in lesser amounts.
The structures of the soluble globulins have been studied more than those of the insoluble prolamins and glutelins. Vicilin appears as a homo- or heterotrimer, sometimes able to associate into hexameric form. Soybean β-conglycinin and French bean phaseolin (Bollini and Chrispeels 1978) are the structurally best known vicilins. Recently, the three-dimensional structure of phaseolin was determined by X-ray crystallographic analysis (Lawrence et al. 1990). However, unlike other vicilins, the phaseolin trimer can associate into a dodecamer (tetramer of trimers) below pH 4.5. Each polypeptide of the trimeric form comprises two structurally similar units, each made up of a β-barrel and an α-helical domain.

Glycinin, the soybean legumin, has a quaternary structure that was suggested by Badley et al. (1975) to be twelve subunits packed in two identical hexagons. In general, the legumin molecule is a polymer formed by the association of six monomers. Each monomer consists of two subunits, acidic and basic. Sometimes, these subunits are associated by disulfide linkages.
On the other hand, arachin, the peanut legumin, was found to consist of different kinds of subunits. The arachin hexamer association does not need different kinds of subunits, which suggests that the subunits have a very similar structure.
The most studied storage proteins, in terms of structure, are the corn prolamins called zeins. These proteins perform no known enzymatic function. Three types of zeins (α, β and γ) (Esen 1986) are synthesized on rough endoplasmic reticulum and aggregate within this membrane as protein bodies. The zein protein readily self-associates to form protein bodies and is insoluble in water even in low concentrations of salt. The presence of all types of zeins is not necessary for the formation of a protein body, as a single type of zein can aggregate into a dense structure and is generally found at the surface of protein bodies (Lending et al. 1988; Wallace et al. 1988). The mechanism responsible for protein body formation is thought to involve hydrophobic and weak polar interactions between individual zein molecules (Wallace et al. 1988; Argos et al. 1982), while they require a high amount of ethanol in aqueous systems to maintain their strict molecular conformation (Argos et al. 1982).
Circular dichroic measurements, amino acid sequence analysis, and electron microscopy of a zein protein suggest that zein secondary structure is primarily helical, with nine adjacent, topologically antiparallel helices clustered within a distorted cylinder (Argos et al., 1982; Larkins, 1983; Larkins et al., 1984). Polar and hydrophobic residues are appropriately distributed along the helical surfaces, allowing intra- and intermolecular hydrogen bonds and van der Waals interactions among neighboring helices, such that rod-shaped zein molecules can aggregate and then stack through glutamate interactions at the cylindrical caps. Because of this structure, zein is much less soluble under physiological conditions than the globulin phaseolin, and precipitation of insoluble zein in the tightly packed protein body may make it less available for proteolytic degradation (Greenwood and Chrispeels 1985).
The storage protein structures are adapted to a maximal packing within protein bodies (Pernollet and Mosse 1983). Maximal packing is achieved in at least one of two ways. The folding of the polypeptide chain may favor the maximal packing of amino acids within the protein molecule, or the compacting of proteins is increased by the formation of closely packed quaternary structure. High degrees of polymerization can be observed in pearl millet pennisetin (Pernollet and Mosse 1983) or zein (Lending et al. 1988; Wallace et al. 1988).
Also, wheat prolamins and glutelins associate into aggregates, giving rise to insoluble gluten.
These insoluble forms of protein deposits are osmotically inactive and stable during the long period of storage between the time of seed maturation and germination.
REGULATION OF STORAGE PROTEIN GENES
All storage proteins which have been investigated are encoded by multigene families (Bartels and Thompson 1983; Crouch et al. 1983; Forde et al. 1985; Kasarda et al. 1984; Lycett et al. 1985; Rafalski et al. 1984; Slightom et al. 1983). The structure of these families varies. In some cases, as in wheat or barley, two major subgroups can be noted: the α- and γ-gliadins and the B- and C-hordeins, respectively (Forde et al. 1985; Kasarda et al. 1984; Rafalski et al. 1984).
Within each subgroup, several subfamilies can be distinguished. Often short repeats account for at least part of the structure of the polypeptides. These repeats constitute links through which different subfamilies within the same species are related.
Storage protein genes, like most other plant genes characterized to date, are transcribed in a regulated rather than a constitutive fashion. Expression is frequently tissue-specific and/or temporally regulated. Cis-acting DNA sequences involved in developmental and/or tissue-specific regulation of gene expression can be defined by introducing plant storage protein gene regulatory regions coupled to bacterial reporter genes (Twell and Ooms 1987; Wenzler et al. 1989; Marnes et al. 1988; Chen et al. 1988), or by introducing entire or dissected genes (Colot et al. 1987; Chen et al. 1986) into a transgenic environment. Unfortunately, a transformation system for the nutritionally important cereal species has not yet been well established. Therefore, most regulation mechanisms have been studied with transgenic dicot plants.
However, there is increasing evidence that gene expression is controlled, at least partly, by the interaction between regulatory molecules and short sequences that are present in the 5' flanking region of the gene.
The regulatory sequences of potato storage protein were investigated using transgenic potato plants. A 2.5 kb 5' flanking DNA fragment containing the promoter and the patatin gene was used to construct a transcriptional fusion gene with chloramphenicol acetyl transferase (CAT) or the β-glucuronidase (GUS) gene (Twell and Ooms 1987; Wenzler et al. 1989). When reintroduced into potato, these chimeric genes were expressed in tubers, but not in leaves, stems or roots.
The expression pattern of storage protein genes of cereals is retained in tobacco, not only with respect to tissue, but also to temporal expression. The 5' upstream regions of wheat glutenin genes possess regulatory sequences that determine endosperm-specific expression in transgenic tobacco (Colot et al. 1987). Deletion analysis of the low molecular weight (LMW) glutenin sequence indicated that sequences present between 326 bp and 160 bp upstream of the transcriptional start point are necessary to confer endosperm-specific expression. Furthermore, not only are the cis-acting elements determining the regulation of each gene in a cluster recognized by tobacco trans-acting factors, but the cis-acting elements directing expression of one gene also do not affect expression of neighboring genes. This was demonstrated by the transfer of a 17.2 kb soybean DNA containing a seed lectin gene with at least four nonseed protein genes to transgenic tobacco plants (Okamuro, 1986). The genes in this cluster were expressed in a manner similar to that in soybean; i.e., the lectin gene products accumulated in seeds, and the other genes were expressed in tobacco leaves, stems, and roots.
The expression of several DNA deletion mutants with a 257 bp 5' flanking sequence of the α'-conglycinin gene indicates that this region contained enhancer-like elements (Chen et al. 1986). Only a low level of expression of the α' gene occurred in developing seeds of transgenic plants that contain the α' gene flanked by 159 nucleotides 5' of the transcriptional start site. However, a 20-fold increase in expression occurred when an additional 98 nucleotides of upstream sequence were included. The DNA sequence between 143 and 257 contained five repeats of the sequence AA(G)CCCA, and played a role in conferring tissue-specific and developmental regulation. The 35S promoter containing this sequence in different positions and different orientations is able to enhance the expression of the CAT gene by 25- to 40-fold (Chen et al. 1988).
Trans-acting factors directly involved in storage protein gene regulation have not yet been reported. However, in some cases, the level of amino acids can control the expression of storage protein. Vegetative storage protein (VSP) gene expression in leaves, stems and seed pods is closely related to whether these organs are currently a sink for nitrogen or a source for mobilized nitrogen for other organs (Staswick 1989). The leaves have a sensitive mechanism for detecting changes in sink demand of mobilizing reserves, and VSP gene expression can be rapidly adjusted accordingly. Sequestering excess amino acids in this way may prevent their accumulation to toxic levels.
GENETIC ENGINEERING USING AGROBACTERIUM TUMEFACIENS
One of the most significant recent advances in the area of plant molecular biology has been the development of the Agrobacterium tumefaciens Ti plasmid as a vector system for the transformation of plants. In nature, A. tumefaciens infects most dicotyledonous and some monocotyledonous plants by entry through wound sites. The bacteria bind to cells in the wound and are stimulated by phenolic compounds released from these cells to transfer a portion of their endogenous, 200 kb Ti plasmid into the plant cell (Weiler and Schroder 1987).
The transferred portion of the Ti plasmid, the T-DNA, becomes covalently integrated into the plant genome, where it directs the biosynthesis of phytohormones using enzymes which it encodes.
The vir gene in the bacterial genome is known to be responsible for this process. In addition to vir gene products, directly repeating sequences of 25 bases called "border" sequences are essential, but only the right terminus has been shown to be used for T-DNA transfer and integration.
Expression of the T-DNA gene inside the plants results in the uncontrolled growth of these and surrounding cells, leading to formation of a gall (Weiler and Schroder 1987). Ti plasmids, from which these disease-producing genes have been removed or replaced, are referred to as "disarmed" and can be used for the introduction of foreign genes into plants. The great size of the disarmed Ti plasmid and lack of unique restriction endonuclease sites prohibit direct cloning into the T-DNA. Instead, intermediate vectors such as pMON237 or pBI121 can be used to introduce genes into the Ti plasmid. Currently, two kinds of vector systems are available as intermediate vectors: cointegrating vectors and binary vectors. A cointegrating transformation vector must include a region of homology between the vector plasmid and the Ti plasmid. Once recombination occurs, the cointegrated plasmid is replicated by the Ti plasmid origin of replication. The cointegrate system, while more difficult to use, does offer advantages. Once the cointegrate has been formed, the plasmid is stable in Agrobacterium.
A binary vector contains an origin of replication from a broad host-range plasmid instead of a region of homology with the Ti plasmid. Since the plasmid does not need to form a cointegrate, these plasmids are considerably easier to introduce into Agrobacterium. The other advantage to binary vectors is that this vector can be introduced into any Agrobacterium host containing any Ti or Ri plasmid, as long as the vir helper function is provided. Using these systems, the gene regulation mechanism of storage proteins has been elucidated.
IMPROVEMENT OF NUTRITIONAL QUALITIES OF PLANTS
The amino acid composition of the cereal endosperm protein is characterized by a high content of proline and glutamine while the amount of essential amino acids, lysine and tryptophan in particular, is a limiting factor (Pernollet and Mosse 1983). In legumes, sulfur containing amino acids such as methionine and cysteine are the major limiting essential amino acids for the efficient utilization of plant protein as animal or human food while roots and tubers are deficient in almost all of the essential amino acids.
There has been a great deal of effort to overcome these amino acid limitations by breeding and selecting for more nutritionally balanced varieties. Plants have been mutated in hopes of recovering individuals with more nutritious storage proteins. Neither of these approaches has been very successful, although some naturally occurring and artificially produced mutants of cereals were shown to contain a more nutritionally balanced amino acid composition.
These mutations cause a significant reduction in the amount of storage protein synthesized and thereby result in a higher percentage of lysine in the seed; however, the softer kernels and low yield of such strains have limited their usefulness (Pernollet and Mosse 1983). The reduction in storage protein also causes the seeds to become more brittle; as a result, these seeds shatter more easily during storage. The lower levels of prolamin also result in flours with unfavorable functional properties which cause brittleness in the baked products (Pernollet et al. 1983). Thus, no satisfactory solution has yet been found for improving the amino acid composition of storage proteins.
One direct approach to this problem is to modify the nucleotide sequence of genes encoding storage proteins so that they contain high levels of essential amino acids. To achieve this aim, several laboratories have tried to modify and express storage proteins in the host plants.
Modified natural storage proteins have been created by inserting into the natural storage protein genes exogenous DNA sequences coding for essential amino acids. The basic idea is to produce modified proteins which are similar to the naturally occurring proteins, but which have inserted into them sequences of essential amino acid residues. There are at least three problems encountered with this approach: (1) Dilution. Even if this approach is successful, the modified protein will still have high levels of non-essential amino acids, effectively "diluting" the net concentrations of the encoded essential amino acids. (2) Instability. The modified proteins are typically susceptible to proteolytic attack in the plant. Because a natural storage protein is a highly evolved structure, artificial modifications to it are likely to destabilize it. For example, a stabilizing glutamic acid-lysine salt bridge might be broken. (3) Multiple copies of genes. Naturally occurring storage proteins are typically encoded by multiple gene copies. A mutation in just one of the copies of the gene will likely have only a limited effect.
In vitro mutagenesis was used to supplement the sulfur amino acid codon content of a gene encoding β-phaseolin, a Phaseolus vulgaris storage protein (Hoffmann et al. 1988). The nutritional quality of β-phaseolin was increased by the insertion of 15 amino acids, six of which were methionine. The inserted peptide was essentially a duplication of a naturally occurring sequence found in the maize 15 kD zein storage protein (Pederson et al. 1986).
However, this modified phaseolin achieved less than 1% of the expression level of normal phaseolin in transformed seeds. Recently it has been found that this insertion was made in part of a major structural element of the phaseolin trimer (Lawrence et al. 1990). Therefore, an inclusion of 15 residues at this site could distort the structure at the tertiary and/or quaternary level.
Lysine- and tryptophan-encoding oligonucleotides were introduced at several positions into a 19 kD α-type zein complementary DNA by oligonucleotide-mediated mutagenesis (Wallace et al. 1988). Messenger RNA for the modified zein was synthesized in vitro and injected into Xenopus laevis oocytes. The modified zein aggregated into structures similar to membrane-bound protein bodies. This experiment suggested the possibility of creating high-lysine corn by genetic engineering.
There are alternative approaches that might be more practical. One of these is to transfer heterologous storage protein genes that encode storage proteins with higher levels of the desired amino acids. For this purpose, a chimeric gene encoding a Brazil nut methionine-rich protein which contains 18% methionine has been transferred to tobacco and expressed in the developing seeds (Altenbach et al. 1989). The remarkably high level of accumulation of the methionine-rich protein in the seed of tobacco results in a significant increase in methionine levels of ~30%.
The maize 15 kD zein structural gene was placed under the regulation of French bean β-phaseolin gene flanking regions and expressed in tobacco (Hoffmann et al. 1987). Zein accumulation was obtained as high as 1.6% of the total seed protein. Zein was found in roots, hypocotyls, and cotyledons of the germinating transgenic tobacco seeds. Zein was deposited and accumulated in the vacuolar protein bodies of the tobacco embryo and endosperm. The storage proteins of legume seeds such as the common bean (Phaseolus vulgaris) and soybean (Glycine max) are deficient in sulfur-containing amino acids. The nutritional quality of soybean could be improved by introducing and expressing the gene encoding methionine-rich 15 kD zein (Pederson et al. 1986).
A synthetic gene (HEAAE I = High Essential Amino Acid Encoding) which encoded a protein domain high in essential amino acids was expressed as a CAT-HEAAE I fusion protein in potato (Jaynes et al. 1986; Yang et al. 1989). However, structural instability limited the high-level expression of this fusion protein in the potato system. Also, the content of essential amino acids was diluted to less than 40% of the original encoded protein by constructing this fusion.
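The dilution effect described above can be illustrated with a simple composition count. The sketch below is a hypothetical example: the two sequences are invented for illustration and are not the HEAAE I domain or the CAT carrier, and the only fact taken from the text is the list of eight essential amino acids (isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan and valine).

```python
# One-letter codes for the eight essential amino acids named in the text:
# Ile, Leu, Lys, Met, Phe, Thr, Trp, Val.
ESSENTIAL = set("ILKMFTWV")

def essential_fraction(sequence: str) -> float:
    """Return the fraction of residues that are essential amino acids."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(residue in ESSENTIAL for residue in sequence) / len(sequence)

# Hypothetical sequences, chosen only to show the arithmetic.
designed_domain = "KMTWIKLE"      # 7 of 8 residues are essential
carrier_protein = "GSAGSPDNQGAS"  # no essential residues
fusion = carrier_protein + designed_domain

print(f"domain: {essential_fraction(designed_domain):.3f}")
print(f"fusion: {essential_fraction(fusion):.3f}")
```

In this invented case the fusion carries the same seven essential residues spread over twenty positions, so the essential fraction falls from 0.875 to 0.35 even though nothing was removed from the designed domain.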
There are several precautions that should be considered in engineering storage proteins (Larkins, 1983). First, in vitro mutational change must not be in regions of the protein that perturb the normal protein structure; otherwise, the proteins might be unstable. Second, when attempting to increase nutritional quality by introducing a gene encoding a heterologous protein in crop plants, it is important that the protein encoded by an introduced gene does not produce any adverse effects in humans or livestock, the ultimate consumers of the engineered seed proteins (Altenbach et al. 1989). Finally, it is critical that the amino acids present in the introduced protein are able to be utilized by the animal for growth and development.
DE NOVO DESIGN OF PROTEINS
Recently, a new field in protein research, de novo design of proteins, has made remarkable progress due to a better understanding of the rules which govern protein folding and topology. Protein design has two components: the design of activity and the design of structure.
This review will concentrate on the design of structurally stable storage protein-like proteins.
The usual approach for the design of helical bundle proteins consists of linking sequences with a propensity for forming an α-helix via short loop sequences to get linear polypeptide chains. This chain can fold into the predetermined 'globular type' tertiary structure in aqueous solution (Mutter 1988; DeGrado et al. 1989). α-Helical secondary structures are stabilized by interatomic interactions that can be classified according to the distance between interacting atoms in the sequence of the protein (DeGrado et al. 1989).

Short-range interactions account for different amino acids having different conformational preferences. Both statistical (Chou and Fasman 1978) and experimental (Sueki et al. 1984) methods show that residues such as Glu, Ala and Met tend to stabilize helices, whereas residues such as Gly and Pro are destabilizing. However, these intrinsic preferences are not sufficient to determine the stability of helices in globular proteins.
Analysis of the free-energy requirements for helix initiation and propagation indicates that peptides of 10 to 20 residues should show little helix formation in water (Bierzynski et al. 1982) when the Zimm-Bragg equation (Zimm and Bragg 1959) is used, with parameters (s and S) determined by host-guest experiments, where s is the helix nucleation constant, n is the number of H-bonded residues in the helix and S is an average stability constant for one residue:
$$\frac{s\,S^{n-1}}{S-1}$$
Nevertheless, the 13 amino acid C-peptide obtained from RNase A does show measurable helicity (~25%) at low temperature (Bierzynski et al. 1982; Brown and Klee 1981). The stability of this peptide is 1000-fold greater than the value calculated from the Zimm-Bragg equation.
Specific side-chain interactions, factors that are not considered in the Zimm-Bragg model, are responsible, at least in part, for the fact that the C-peptide is much more helical than predicted (Scheraga, 1985).
Medium-range interactions are responsible for the additional stabilization of secondary structures (DeGrado et al. 1989). Interactions between side chains are regarded as important medium-range interactions (Shoemaker et al. 1987; Marqusee and Baldwin 1987).
These include electrostatic interactions, hydrogen bonding, and the perpendicular stacking of aromatic residues (Blundell et al. 1986). An α-helix possesses a dipole moment as a result of the alignment of its peptide bonds. The positive and negative ends of the amide group dipole point toward the helix NH2-terminus and COOH-terminus, respectively, giving rise to a significant macrodipole.
Appropriately charged residues near the ends of the helix can favorably interact with the helical dipole and stabilize helix formation. It was estimated that the electrostatic interaction energy of a pair of antiparallel α-helices is about 20 kcal/mol lower than that of a pair of parallel α-helices (Hol and Sanders 1981). Hydrogen bonds between side chains and terminal helical N-H and C=O groups also participate in the stabilization of helical structure (Richardson and Richardson 1988; Presta and Rose 1988; Richardson and Richardson 1989).

Protein structures contain several long-range stabilizing interactions which include hydrophobic and packing interactions, and hydrogen bonds. Among these, the hydrophobic effect is a prime contributor to the folding and stabilizing of protein structures.
The driving force for helix formation in RNase A arises from long-range interactions between C-peptide and S-protein, a large fragment of the protein from which C-peptide was excised (Komoriya and Chaiken 1985).
The role of hydrophobic interactions in determining secondary structures was studied for a series of peptides containing only Glu and Lys in their sequence (DeGrado and Lear 1985). Glu and Lys residues were chosen as charged residues for the solvent-accessible exterior of the protein to help stabilize helix formation by electrostatic interaction.
STABILITY OF DESIGNED PROTEINS
Hydrophobic residues often repeat every three to four residues in an α-helix and form an amphiphilic structure (DeGrado et al. 1989). Amphiphilicity is important for the stabilization of the secondary structures of peptides and proteins which bind in aqueous solution to extrinsic apolar surfaces, including phospholipid membranes, air, and the hydrophobic binding sites of regulatory proteins (DeGrado and Lear 1985). This amphiphilic secondary structure can be stabilized relative to other conformations by self-association. Therefore, short peptides often form the α-helix in water only because the helix is amphiphilic and is stabilized by peptide aggregation along the hydrophobic surface. Natural globular proteins are folded by a similar mechanism, involving hydrophobic interaction between neighboring segments of secondary structure (Presnell and Cohen 1989). Using the concept of an amphiphilic helix, DeGrado and coworkers have successfully built peptide-hormone analogs with minimal homology to the native sequences. These peptides, like the native ones, are not helical in solution but do form helices at the hydrophobic surfaces of membranes.
Designed synthetic peptides have been used to show how hydrophobic periodicity in a protein sequence stabilizes the formation of simple secondary structures such as an amphiphilic α-helix (Ho and DeGrado 1987). The strategies used in the design of the helices in the four-helix bundles are: 1) the helices should be composed of strong helix-forming amino acids and 2) the helices should be amphiphilic; i.e., they should have an apolar face to interact with neighboring helices and a polar face to maintain water solubility of the ensuing aggregates. The results show that hydrophobic periodicity can determine the structure of a peptide.
Therefore, the peptides tend to have random conformations in very dilute solution, but form secondary structures when they self associate (at high concentration) or bind to the air-water surface.
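The notion of hydrophobic periodicity can be illustrated numerically. The following is a minimal sketch, not a method described in this specification: it computes a mean hydrophobic moment for an ideal α-helix (100 degrees of rotation per residue). The binary hydrophobicity scale, the function names, and the Leu/Lys test sequence are all assumptions made for illustration only.

```python
import cmath
import math

# Assumption: a crude binary scale (+1 apolar, -1 polar), not a published scale.
APOLAR = set("AVLIMFW")
HELIX_STEP_DEG = 100.0  # ideal alpha-helix, 3.6 residues per turn


def hydrophobic_moment(seq: str) -> float:
    """Mean hydrophobic moment per residue, assuming an ideal alpha-helix."""
    total = 0j
    for n, aa in enumerate(seq):
        h = 1.0 if aa in APOLAR else -1.0
        # Each residue points outward at an angle of 100 degrees times its index.
        total += h * cmath.exp(1j * math.radians(HELIX_STEP_DEG * n))
    return abs(total) / len(seq)


def design_amphiphilic(length: int) -> str:
    """Hypothetical sequence: Leu on one face of the helix, Lys on the other."""
    return "".join(
        "L" if math.cos(math.radians(HELIX_STEP_DEG * n)) > 0 else "K"
        for n in range(length)
    )


amphiphilic = design_amphiphilic(18)
uniform = "L" * 18  # hydrophobic everywhere, so no preferred face

print(amphiphilic, round(hydrophobic_moment(amphiphilic), 2))
print(uniform, round(hydrophobic_moment(uniform), 2))
```

In the generated sequence the apolar residues recur every three to four positions, and the moment is large; a uniformly apolar sequence of the same length has a moment near zero because no face of the helix is favored.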
The free energy associated with dimerization or tetramerization of the designed peptides could be experimentally determined from the concentration dependence of the CD spectra for the peptides (DeGrado et al. 1989; Lear et al. 1988; DeGrado and Lear 1985). At low concentrations, the peptides were found to be monomeric and have low helical contents, whereas at high concentration they could self associate and stabilize the secondary structure.
Therefore, possible hairpin loops between helices can affect the stability of the secondary structure by enhancing the self association between the helical monomers. A strong helix breaker (Chou and Fasman 1978; Kabsch and Sander 1983; Sueki et al. 1984; Scheraga 1978) was included as the first and last residue to set the stage for adding a hairpin loop between the helices. A single proline residue appeared capable of serving as a suitable link if the C- and N-terminal glycine residues are slightly unwound. Glycine lacks a β-carbon, a property that is essential for the reverse turn, where positive dihedral angles are required. The pyrrolidine ring of proline constrains its φ dihedral angle to about -60°. Thus, proline should be destabilizing at positions where significantly different backbone torsion angles are required. This amino acid, as well as glycine, has a high tendency to break helices and occurs frequently at turns (Creighton 1987).
Direct evidence for stabilization of protein structure by adding the linking sequence was obtained by comparing the guanidine denaturation curves for a monomer, dimer and tetramer (DeGrado et al. 1989). The gene encoding the tetrameric protein was expressed in E. coli and the protein was purified to homogeneity. In the series of mono-, di-, and tetramer, the stability toward guanidine denaturation increases concomitantly with the increase in covalent cross-links between helical monomers. At equivalent peptide concentrations, the midpoints of the denaturation curves occurred at 0.55, 4.5 and 6.5 M guanidine for the mono-, di-, and tetramer, respectively.
Furthermore, as the number of covalent cross-links was increased, the curves became increasingly cooperative. Thus, the linker sequence stabilized the formation of the four helix structures at low concentration of the peptides (<1 mg/ml).
Structural stability of proteins is directly related to in vivo proteolysis (Parsell and Sauer 1989). Proteolysis depends on the accessibility of the scissile peptide bonds to the attacking protease. The sites of proteolytic processing are generally in relatively flexible interdomain segments or on the surface of the loops, in contrast to the less accessible intradomain peptide bonds (Neurath 1989). This suggests that the stability of the folded state of the protein is the most important determinant for its proteolytic degradation rate. The effect of a folded structure on proteolytic degradation is supported by several experiments. First, proteins that contain amino acid analogs or are prematurely terminated are often degraded rapidly in the cells (Goldberg and St. John 1976). Second, there are good correlations between the thermal stabilities of specific mutant proteins and their rates of degradation in E. coli (Pakula and Sauer 1986; Parsell and Sauer 1989). Finally, second-site suppressor mutations that increase the thermodynamic stability of unstable mutant proteins have also been shown to increase resistance to intracellular proteolysis (Pakula and Sauer 1989). The solubility of proteins could also affect their proteolytic resistance, as some proteins aggregate to form inclusion bodies that escape proteolytic attack (Kane and Hartley 1988).
Metabolic stability is another factor influencing the in vivo stability of proteins. Usually, damaged and abnormal proteins are metabolically unstable in vivo (Finley and Varshavsky 1985; Pontremoli and Melloni 1986). In eukaryotes, covalent conjugation of ubiquitin with proteins is essential for the selective degradation of short-lived proteins (Finley and Varshavsky 1985). It was found that the amino acid at the amino-terminus of the protein determined the rate of ubiquitination (Bachmair et al. 1986). Both prokaryotic and eukaryotic long-lived proteins have stabilizing amino acids such as methionine, serine, alanine, glycine, threonine, and valine at the amino terminus. On the other hand, amino acids such as leucine, phenylalanine, aspartic acid, lysine, and arginine destabilize the target proteins.
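The amino-terminal rule summarized above lends itself to a simple lookup. The sketch below is illustrative only: the two residue sets are taken directly from the lists in the preceding paragraph, while the function name and the example sequences are hypothetical. Residues not named in the paragraph are reported as "not listed" rather than guessed.

```python
# Residue sets copied from the paragraph above (one-letter codes).
STABILIZING = {"M", "S", "A", "G", "T", "V"}
DESTABILIZING = {"L", "F", "D", "K", "R"}


def n_terminal_class(seq: str) -> str:
    """Classify a protein by its amino-terminal residue."""
    if not seq:
        raise ValueError("empty sequence")
    first = seq[0].upper()
    if first in STABILIZING:
        return "stabilizing"
    if first in DESTABILIZING:
        return "destabilizing"
    return "not listed"


# Hypothetical example sequences, for illustration only.
for example in ("MKELLKK", "LKELLKK", "QKELLKK"):
    print(example, n_terminal_class(example))
```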
Recently, many laboratories have attempted to improve the nutritional quality of plant storage proteins by transferring heterologous storage protein genes from other plants (Pederson et al. 1986). The development of recombinant DNA technology and the Agrobacterium-based vector system has made this approach possible. However, genes encoding storage proteins containing a more favorable amino acid balance do not exist in the genomes of major crop plants.
Furthermore, modification of native storage proteins has met with difficulty because of their instability, low level of expression, and limited host range. One possible alternative is the de novo design of a more nutritionally-balanced protein which retains certain characteristics of the natural storage proteins of plants.
Our initial work described the use of small fragments of DNA which encoded spans of protein high in essential amino acids (Jaynes et al. 1985; Yang et al. 1989). Subsequently, the genes encoding these protein domains were cloned into an existing protein and the expression level of this modified protein was determined in transgenic potato plants. However, because of some of the problems mentioned above, the results were somewhat less than desirable (Yang et al. 1989).

The publications and other materials used herein to illuminate the background of the invention or provide additional details respecting the practice, are incorporated by reference, and for convenience are respectively grouped in the appended List of References.

Experiments were performed which were designed to produce transgenic plants which produce higher levels of essential amino acids. For this purpose, plants were made transgenic with a synthetic nucleic acid construct which encoded a protein containing high levels of essential amino acids. The resulting transgenic plants produced not only higher levels of essential amino acids, but unexpectedly also produced higher levels of protein in general. This increase in total protein content ranged from approximately 2-fold to 5-fold.
One aspect of the invention is a transgenic plant comprising a gene which encodes a protein which causes the transgenic plant to overproduce total protein as compared to a nontransgenic plant.
A second aspect of the invention is a gene encoding a protein wherein plants which are transgenic for this gene overproduce total protein as compared to a nontransgenic plant.
A third aspect of the invention is a protein wherein if a plant is made transgenic for a gene encoding said protein said transgenic plant will overproduce total plant protein as compared to the plant when it is not transgenic. This protein may comprise an amphiphilic α-helical sequence, a β-pleated sheet sequence, or a combination of α-helix and β-pleated sheet.
Another aspect of the invention is a transgenic plant cell which contains a gene encoding a protein which causes the plant cell to overproduce total protein as compared to a nontransgenic cell.
Yet another aspect of the invention is a method for increasing the production of a specific protein in a plant or plant cell by transforming the plant or plant cell with a gene which encodes a protein which causes the overproduction of total protein in the transgenic plant or plant cell.

Still another aspect of the invention is a method for increasing the production of a nonprotein product in a plant or plant cell by transforming the plant or plant cell with a gene encoding a protein which causes the overproduction of total protein in the transgenic plant or plant cell and thereby results in the increased synthesis of nonproteinaceous material.
Yet another aspect of the invention is a method for enhancing the production of a specific protein or nonprotein in a plant or plant cell by cotransforming the plant or plant cell with 1) a gene encoding the specific protein or a protein involved as an enzyme in the synthetic pathway of the nonprotein product and 2) a gene encoding a protein which results in the generalized overproduction of total plant or plant cell protein.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the average essential amino acid requirement for both children and adults in mg per kg body weight.
Figure 2 shows the amounts of foodstuffs which must be consumed in grams per day in order to meet the minimum daily requirement of all essential amino acids.
Figure 3 illustrates how the amino acid composition of the ASP1 monomer was chosen.
Figure 4 shows the percentage of essential amino acids (EAA) and percentage of most limiting essential amino acids (MLEAA) in ASP1 tetramer compared with natural proteins.
Figure 5 is a depiction of the amphiphilicity of the ASP1 monomer where hydrophobic amino acids are in the white rectangle and hydrophilic amino acids are in the shaded rectangle. There are interactions between the Glu (E) and Lys (K) residues which are shown as dark lines depicting salt bridges.
Figure 6 shows the amino acid sequence of the ASP1 tetramer (SEQ ID NO:2). Hydrophilic amino acids are underlined and β-turns are indicated.
Figures 7A-7B show the protein content of plants. Figure 7A shows the overall protein content determined by amino acid analysis. P-2 TC is a control plant and P-7 T, P-11 T, P-17 T and P-29 T are plants transformed with ASP1 tetramer. Figure 7B shows the % increase of protein content in the transformed plants as compared to the control plant.
These data were derived from seedlings obtained from transformed mother plants. A minimum of four separate assays were used and the variation was no more than 30%.

Figure 8A depicts the overall protein content of leaves from control and ASP1 tetramer seedlings. The plants are labeled as for Figures 7A-7B. Figure 8B shows the % increase in protein content for the transformed plants as compared to the control plant.
DETAILED DESCRIPTION OF THE INVENTION
The present invention uses quite a different approach. Rather than mutate or transfer a gene for a naturally occurring protein, an artificial protein has been constructed de novo. This de novo protein has nutritionally balanced proportions of the essential amino acids, is stable following expression in a plant, and shares some of the characteristics of naturally occurring plant storage proteins. Transgenic plants have been produced which contain such a gene. These plants not only produce more essential amino acids compared to controls, but surprisingly the total amount of protein produced by these plants is also increased.
Furthermore, the total amount of nonproteinaceous components can also be increased via these methods.
There are at least two fundamental difficulties in achieving efficient expression of designed proteins. First, it is not yet known what stabilizes a protein against proteolytic breakdown and second, the mechanisms for folding of an amino acid sequence into a biologically-stable tertiary structure have not yet been fully delineated. For the construction of DNP 1 (Designed Nutritional Protein), we focused on the design of a physiologically-stable as well as a highly nutritious, storage protein-like, artificial protein.
DESIGNED NUTRITIONAL PROTEINS
We designed the synthetic protein DNP 1 to contain a high content of those amino acids which are essential to the diet of animals. The optimized content of essential amino acids for this new protein was obtained empirically by determining the amounts of essential amino acids necessary for normal metabolism of the animal. See Table 1, which gives the essential amino acid requirement (grams/day), in the following order, for children at 3 months, children at 5 years, children at 10 years, the average for children at these three ages, adults at 25 years, adults at 75 years, the average for adults at these two ages, and the overall average. We also determined the 'deficiency values', or the ratios of deficient essential amino acids, for the 10 primary crops animals consume throughout the world (Figure 3). See Table 2, which gives essential amino acid deficiency ratios for the ten major crop plants consumed by humans. From these data, we then found the ratio of essential amino acids needed to totally complement each particular plant foodstuff. We averaged these values and derived a set of numbers we call the 'Average Ratio for All Crops Idealized to the DNP 1 Monomer' (Figure 4). This set of numbers represents the ratio of essential amino acids necessary to complement the deficiencies found in all 10 crops for all human age groups.

Table 1
           Infant   Child    Child    Child    Adult    Adult    Adult    Overall
           3 mo     5 yr     10 yr    Ave      25 yr    75 yr    Ave      Ave
Ile        0.258    0.524    0.879    0.554    0.754    0.754    0.754    0.654
Leu        0.521    1.234    1.382    1.046    1.102    1.102    1.102    1.074
Lys        0.370    1.150    1.382    0.967    1.020    1.020    1.020    0.994
Met+Cys    0.235    0.468    0.691    0.465    0.986    0.986    0.986    0.725
Phe+Tyr    0.403    1.178    1.124    0.902    1.102    1.102    1.102    1.002
Thr        0.241    0.636    0.879    0.585    0.522    0.522    0.522    0.554
Trp        0.095    0.185    0.240    0.173    0.260    0.260    0.260    0.217
Val        0.308    0.655    0.785    0.583    0.754    0.754    0.754    0.668

Table 2
E.A.A.       Wheat    Corn     Rice     Barley   Sorghum
Ile          1.72     2.23     1.71     1.94     1.51
Leu          1.85     1.25     1.57     1.98     0.85
Lys          4.08     3.36     3.03     3.68     4.54
Met + Cys    1.73     2.41     3.86     2.53     2.55
Phe + Tyr    1.20     1.30     1.10     1.32     1.49
Thr          2.14     1.67     1.75     2.01     1.89
Trp          1.61     2.20     1.78     1.37     1.69
Val          1.68     1.39     1.20     1.18     1.50

Table 2 (continued)
E.A.A.       Cassava  Taro     Sweet Potato  Potato   Plantain
Ile          1.88     1.69     1.92          1.68     1.33
Leu          2.13     1.64     2.70          2.45     2.10
Lys          2.83     2.29     2.96          2.08     2.24
Met + Cys    3.18     3.52     2.84          3.53     3.74
Phe + Tyr    1.58     2.37     1.29          1.67     1.71
Thr          1.58     1.55     1.62          1.54     2.28
Trp          1.02     1.41     1.37          1.60     1.39
Val          1.23     1.31     1.19          1.01     1.36
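The averaging step can be sketched as follows. This is a minimal illustration that takes a plain arithmetic mean of the Table 2 deficiency ratios across the ten crops; it is an assumption that a simple mean is a fair stand-in, and the result is not claimed to reproduce the 'Average Ratio for All Crops Idealized to the DNP 1 Monomer' of Figure 4, whose exact derivation is not given here. Variable names are hypothetical.

```python
from statistics import mean

# Column order: Wheat, Corn, Rice, Barley, Sorghum,
#               Cassava, Taro, Sweet Potato, Potato, Plantain (Table 2).
DEFICIENCY = {
    "Ile":     [1.72, 2.23, 1.71, 1.94, 1.51, 1.88, 1.69, 1.92, 1.68, 1.33],
    "Leu":     [1.85, 1.25, 1.57, 1.98, 0.85, 2.13, 1.64, 2.70, 2.45, 2.10],
    "Lys":     [4.08, 3.36, 3.03, 3.68, 4.54, 2.83, 2.29, 2.96, 2.08, 2.24],
    "Met+Cys": [1.73, 2.41, 3.86, 2.53, 2.55, 3.18, 3.52, 2.84, 3.53, 3.74],
    "Phe+Tyr": [1.20, 1.30, 1.10, 1.32, 1.49, 1.58, 2.37, 1.29, 1.67, 1.71],
    "Thr":     [2.14, 1.67, 1.75, 2.01, 1.89, 1.58, 1.55, 1.62, 1.54, 2.28],
    "Trp":     [1.61, 2.20, 1.78, 1.37, 1.69, 1.02, 1.41, 1.37, 1.60, 1.39],
    "Val":     [1.68, 1.39, 1.20, 1.18, 1.50, 1.23, 1.31, 1.19, 1.01, 1.36],
}

# Simple mean of each amino acid's deficiency ratio over the ten crops.
average_ratio = {aa: mean(values) for aa, values in DEFICIENCY.items()}

# Rank from most to least deficient on average.
ranked = sorted(average_ratio, key=average_ratio.get, reverse=True)

for aa in ranked:
    print(f"{aa:8s} {average_ratio[aa]:.3f}")
```

Under this simple mean, lysine and the sulfur amino acids (Met+Cys) show the largest average deficiency across the ten crops.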
From the above set of numbers, we designed a nutritional protein for humans (ASP1). The amino acid sequence for ASP1 is shown in Figure 6 and is SEQ ID NO:2. The DNA sequence used to encode this protein is shown as SEQ ID NO:1. It has 1.8 times more of the essential amino acids compared to zein or phaseolin. The difference in MLEAA is much higher: ASP1 contains 3 times more than phaseolin and 6.5 times more than zein. The helical region of ASP1 is amphipathic (hydrophobic residues clustered on one face of the helix while hydrophilic residues are found on the other face) and is stabilized by several Glu-Lys salt bridges (Figure 5). The helix breaker Gly-Pro-Gly-Arg (SEQ ID NO:8) has been used as a turn sequence. The design results in an antiparallel tetramer which achieves an extraordinarily stable secondary and tertiary structure even at low concentration.
The structural stability of a protein is important in determining its susceptibility to proteolysis. Most native proteins are relatively resistant to cleavage by proteolytic enzymes, whereas denatured proteins are much more sensitive (Pace and Barret 1984). Several findings suggest that the stability of a folded protein is an important determinant of its rate of degradation.
Therefore, in addition to improved nutritional quality, ASP1 has been designed to have a stable storage protein-like structure in plants. Its design is based on the structurally well-studied corn storage zein proteins (Z19 and Z22), which are comprised of 9 repeated helical units (Argos et al. 1982). Each helical unit of zein, 16 to 26 amino acids long, is flanked by turn regions and forms an antiparallel helical bundle. Most of the amino acids in the helices are hydrophobic residues. ASP1, on the other hand, is comprised of 4 helical repeating units, each 20 amino acids long (Figure 6). Increasing gene copy number by concatenation can increase protein yields. At the same time, gene concatenation increases the molecular mass of the encoded protein. Such an increase in size and concatenation can significantly stabilize an otherwise unstable product (Shen 1984).
The gene encoding this novel peptide was chemically synthesized and cloned into an E. coli expression vector. This gene contains plant consensus sequences at the 5' end of the translation initiation site to optimize the expression of proteins in vivo. It was placed under the control of the 35S cauliflower mosaic virus (CaMV) promoter in order to permit the constitutive expression of this gene in tobacco. The gene can also be cloned into other microorganisms, such as yeasts, through standard means known in the art.
Unless otherwise clearly indicated by context, the term "ASP1" is intended to encompass any one or more of the following: (1) the peptide whose sequence is SEQ ID NO:3; (2) the peptide whose sequence is SEQ ID NO:4; (3) any polymer, copolymer, oligomer or co-oligomer of one or both of SEQ ID NO:3 and SEQ ID NO:4, such as the tetrameric ASP1 whose sequence is SEQ ID NO:2; or (4) any peptide or protein having substantially the same amino acid sequence as any of the above, and substantially the same stability upon expression in at least one plant, but whose amino acid sequence has been modified in a manner which will naturally occur to one of skill in the art, such as by insertions, deletions, and/or transpositions which are not substantially detrimental to the stability of, or to the nutritionally balanced essential amino acid composition of, the protein. By way of example, numerous transpositions, insertions, or deletions in the amino acid residue sequence of ASP1 or other proteins of the invention will occur to those of skill in the art. It will be desirable to maintain overall amphipathy of the structure to promote stability; and it will also be desirable to have as internal sequences glu-X-X-X-lys (SEQ ID NO:5), to promote salt bridges in the α-helix, which also promote protein stability. While other acid-X-X-X-base sequences may also serve this function, glu-X-X-X-lys (SEQ ID NO:5) is preferred: lysine is preferred as the base because it is an essential amino acid; and glutamic acid is preferred as the acid because it has been observed to stabilize an α-helix better than does aspartic acid. This same type of definition as set out for ASP1 also applies to all other polypeptides or proteins which are disclosed herein.
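A candidate sequence can be scanned for the glu-X-X-X-lys spacing with a short pattern search. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the example sequence is invented for demonstration and is not any sequence of the invention. A lookahead is used so that overlapping motifs are all reported.

```python
import re

# Glu, any three residues, then Lys (i to i+4 spacing along the helix).
# The lookahead lets overlapping occurrences be found.
SALT_BRIDGE_MOTIF = re.compile(r"(?=(E...K))")


def find_salt_bridge_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start index, motif) for every Glu-X-X-X-Lys in seq."""
    return [(m.start(), m.group(1)) for m in SALT_BRIDGE_MOTIF.finditer(seq)]


# Hypothetical sequence, for illustration only.
example = "MEELLKKAEELLKR"
for start, motif in find_salt_bridge_motifs(example):
    print(start, motif)
```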
The protein should also be designed for ready digestibility by the proteases of the intended consumer. For example, frequent lysine (or arginine) sites will promote proteolytic attack by trypsin. Frequent phenylalanine (or tyrosine) sites will promote proteolytic attack by chymotrypsin.
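The digestibility criterion can be checked by counting candidate cleavage sites. The following is a simplified sketch: it treats trypsin as cleaving after Lys or Arg and chymotrypsin as cleaving after Phe or Tyr, as stated above, and deliberately ignores other specificity rules (for example, the effect of a following proline). The function name and the example sequence are hypothetical.

```python
TRYPSIN_SITES = set("KR")       # cleavage after Lys or Arg
CHYMOTRYPSIN_SITES = set("FY")  # cleavage after Phe or Tyr


def cleavage_sites(seq: str, residues: set[str]) -> list[int]:
    """1-based positions after which cleavage is expected.

    The C-terminal residue is skipped because there is no bond after it.
    """
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa in residues]


# Hypothetical sequence, for illustration only.
example = "GPGRMKEFLKY"
print("trypsin:", cleavage_sites(example, TRYPSIN_SITES))
print("chymotrypsin:", cleavage_sites(example, CHYMOTRYPSIN_SITES))
```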
It may be desirable to tailor the essential amino acid content of the protein specifically to complement the essential amino acid content of a particular crop of interest, rather than an average for several crops. It may also be desirable to tailor the essential amino acid content to match the nutritional requirements of the intended consumer species. For example, an artificial storage protein to be expressed in maize might have one composition if the maize is intended for human consumption, and a somewhat different composition if the maize is intended for feeding pigs.
An amphipathic peptide or protein is one in which the hydrophobic amino acid residues are predominantly on one side, while the hydrophilic amino acid residues are predominantly on the opposite side, resulting in a peptide or protein which is predominantly hydrophobic on one face, and predominantly hydrophilic on the opposite face.

Without wishing to be bound by the following discussion of inferences regarding ASP1's structure, the following gives the inventor's best current information and inferences regarding that structure. The secondary structures of the ASP1 monomer and tetramer were predicted by PREDICT-SECONDARY in β-SYBYL. The information-theory method predicted a higher α-helix content than the other two prediction methods (Bayes-statistic and neural-net) in PREDICT-SECONDARY. The secondary structures predicted by information-theory gave 100% helical content for the monomer and 74% for the tetramer.
However, the accuracy of the three widely used prediction methods ranged from 49% to 56% for prediction of three states: helix, sheet, and coil (Kabsch and Sander 1983). This inaccuracy might be due to the small size of the data base and/or the fact that secondary structure is determined by tertiary interactions which are not included in the local sequences. For further predictions of structure, the structures predicted by information-theory were energy minimized using SYBYL MAXIMIN2.
A perfect amphiphilic α-helical conformation was predicted for the ASP1-monomer after minimization. The tertiary structure of the ASP1-tetramer after minimization showed the antiparallel conformation as was designed. These minimization results suggested a high probability of stable secondary structure (α-helix and β-turn) formation by the ASP1-monomer and -tetramer.

The structural stability of the ASP1-monomer and tetramer could not be determined by minimization alone. Therefore, the stability of the α-helical secondary structure of the ASP1-monomer was investigated experimentally. HPLC analysis of the gel filtered synthetic ASP1-monomer showed that purity was more than 90%, and amino acid analysis of the purified fraction gave the expected molar ratios. This fraction was also analyzed by mass spectrometry, and the molecular weight peak corresponding to the ASP1-monomer (2896.5) was present. The secondary structure was then examined by circular dichroism (CD) analysis. CD spectra of the ASP1-monomer showed the typical pattern of α-helical proteins, with double minima at 208 and 222 nm in aqueous solution (data not shown). The stability of the secondary structure can be induced by the inter-molecular interaction between the helical chains (DeGrado et al. 1989). Therefore, stable aggregation between monomers, presumably through hydrophobic interactions, could stabilize the helical structure. In addition, proper packing of the apolar side chains and proper electrostatic interaction might play important roles in stabilizing the secondary structure of ASP1. The stable interaction among the monomeric ASP1 molecules is an important determinant for the proper folding into the tertiary structure of the ASP1-tetramer.
Therefore, the self association capability of the ASP1-monomers was investigated by using size exclusion chromatography. The hydrodynamic behavior of this peptide showed that it was aggregated into a hexamer form with an apparent molecular weight of about 17 kD. This hexameric aggregate could be maintained in either low or high ionic strength solutions. This result provides evidence of the stable globular type tertiary structure formation of tetrameric ASP1.
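The hexamer assignment follows from simple arithmetic on the two masses reported above. The sketch below uses only those two figures (the 2896.5 monomer mass from mass spectrometry and the apparent mass of about 17 kD from size exclusion chromatography); the variable names are hypothetical.

```python
MONOMER_MASS_DA = 2896.5     # ASP1-monomer, from mass spectrometry
APPARENT_MASS_DA = 17_000.0  # aggregate, about 17 kD by size exclusion

# Number of monomers per aggregate, rounded to the nearest whole number.
ratio = APPARENT_MASS_DA / MONOMER_MASS_DA
oligomer_state = round(ratio)
predicted_mass = oligomer_state * MONOMER_MASS_DA

print(f"ratio = {ratio:.2f}, nearest oligomer = {oligomer_state}")
print(f"predicted aggregate mass = {predicted_mass:.1f} Da")
```

The ratio is a little under six, so six monomers (about 17.4 kD) is the whole-number assembly closest to the observed apparent mass.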
Three potential β-turn (Gly-Pro-Gly-Arg (SEQ ID NO:8)) sequences were inserted between four monomers for the ASP1-tetramer construction. The β-turn could play an important role in the structural stability of the ASP1-tetramer when it is expressed in vivo. It can also help stabilize tertiary structure formation. The interactions between the helical monomers might be much faster due to the proximity effect when they are connected. This proximity effect might be critical for folding at the low concentrations of ASP1-tetramer that are possible when it is expressed in vivo. At the same time, the stability of the secondary structure is increased by the hydrophobic interactions between helical monomers. In addition, this β-turn sequence has a tryptic digestion site (Gly-Arg) which can increase the digestibility of this protein when it is consumed by animals.
The stability of the folded structure of a protein has a close relation to its proteolytic degradation rate (Pace and Barret 1984; Pakula and Sauer 1986; Parsell and Sauer 1989; Pakula and Sauer 1989). In this respect, we expected high stability of the folded ASP1-tetramer against proteolytic degradation when it is expressed in vivo. Stable quaternary structure is essential for the formation of protein bodies of storage proteins such as zein or phaseolin (Lawrence et al. 1990). These higher order structures can be achieved through the interaction and close packing of the stable tertiary structures. The major driving force for this quaternary structure formation is also hydrophobic interaction between the tertiary structures.

The correct insertion and orientation of the pBI derivative containing the ASP1 tetramer was screened for by EcoRI and HindIII digestion (it was found in E. coli that the most stable form of the gene was the tetramer form). The EcoRI digestion gave a fragment of the expected size, 3.2 kb, which consisted of the 3' NOS of ASP1 and the GUS gene (data not shown). Also, the ASP1 gene with its 35S promoter and 3' NOS sequences was detected as a 1.4 kb band by HindIII digestion. Stable transformation of the ASP1 gene into A. tumefaciens LBA4404 was confirmed by HindIII digestion of isolated plasmid DNA. The plasmid could be isolated from Agrobacterium and detected by enzyme digestion because pBI121 is a binary vector. Leaf discs, transformed with LBA4404 carrying the ASP1 gene, gave about 5 to 7 shoots two to three weeks after infection.
A total of 565 kanamycin-resistant shoots were regenerated from 120 leaf discs. These shoots were excised from the leaf discs and transferred to new media to grow several more weeks, and then transferred to rooting media. After three weeks in rooting medium, 126 rooted shoots were analyzed for β-glucuronidase (GUS). Root tips of 56 out of 126 plants showed various levels of GUS activity. Not all the kanamycin-resistant shoots showed the GUS positive result. Although kanamycin resistance was due to the expression of neomycin phosphotransferase (NPT II gene), regeneration of nontransgenic shoots in the presence of kanamycin has been reported. Therefore, some nontransformed shoots might have escaped the kanamycin selection and appeared to be kanamycin resistant.
Thirty six plantlets which showed high levels of β-glucuronidase activity were transplanted into jiffy pots. After establishment of the plants, a more accurate fluorogenic assay for GUS activity was done to quantify the expression level of this gene (Table 3). GUS activity was measured as pmole 4-methyl umbelliferone produced per mg protein per minute, all at an excess of 4-methyl umbelliferone glucuronide. Some of these transformed tobacco plants showed higher levels of β-glucuronidase activity compared to other plants. The level of expression might be primarily affected by whether the gene is incorporated into an active or inactive site of chromatin. Activity of chromatin, methylation of DNA and nuclease hypersensitivity are closely related to each other. It has been found that the nuclease hypersensitive sites correlate to active transcription (Gross and Garrard 1987). The degree of methylation of DNA is inversely related to gene expression. Furthermore, if the gene is located near the plant's endogenous promoter or enhancer sites, the level of expression of this gene will be increased by these nearby enhancing factors. Therefore, the difference in the levels of GUS activity between the transformed plants might be due to this positional effect, which was determined by the sites of incorporation of this gene into the tobacco genome.
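The spread in GUS activity among the plants of Table 3 can be expressed relative to the wild-type background. The sketch below is illustrative arithmetic on the Table 3 values only; normalizing to the wild-type plant is an assumption made here for comparison and is not an analysis described in this specification.

```python
# GUS activity, pmole 4-methyl umbelliferone per mg protein per minute (Table 3).
GUS_ACTIVITY = {
    "ASP1 #1": 200,
    "ASP1 #9": 315,
    "ASP1 #11": 3790,
    "ASP1 #13": 360,
    "ASP1 #17": 2400,
    "ASP1 #29": 200,
    "pBI121 #2": 320,
    "Wildtype #37": 10,
}

background = GUS_ACTIVITY["Wildtype #37"]

# Fold activity over the wild-type background for each plant.
fold_over_wildtype = {
    plant: activity / background for plant, activity in GUS_ACTIVITY.items()
}

highest = max(fold_over_wildtype, key=fold_over_wildtype.get)
for plant, fold in fold_over_wildtype.items():
    print(f"{plant:13s} {fold:6.1f}x")
print("highest:", highest)
```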
Table 3
Transgenic Plant    GUS Activity
ASP1 #1             200
ASP1 #9             315
ASP1 #11            3,790
ASP1 #13            360
ASP1 #17            2,400
ASP1 #29            200
pBI121 #2           320
Wildtype #37        10

ANALYSIS OF TRANSFORMED PLANTS
DNA Analysis
Although GUS activity and kanamycin resistance are good indicators of transformation, rearrangement in the T-DNA after incorporation in the plant genome can inactivate or silence the other genes transferred. Correct incorporation of the ASP1 gene into the tobacco genome was therefore determined by Southern blotting using the ASP1 tetrameric fragment as a probe. A distinct 1.4 kb HindIII band appeared in 7 out of 9 tobacco genomic DNA samples analyzed, but did not appear in negative control samples. As a positive control, and to check the copy number, HindIII-digested plasmid pBI ASP1-tetramer was also loaded in amounts corresponding to 1 and 5 copies of the inserted gene in tobacco DNA. Multiple positive bands were observed from most of the transformed plants, with the expected size of 1.4 kb. Extra bands appeared which were bigger than 1.4 kb, and which showed different patterns between the individual plants. These results suggested that the ASP1 gene, alone or with neighboring genes, might be inserted into several sites in the chromosomes, with or without rearrangement. The copy number of the correct band varied among the plants, and ranged from 1 to 5 by densitometric measurement.
The copy number of a gene can affect its expression level; a gene with a high copy number can give a higher level of expression. The impact of copy number on the extent of expression varies from one system to another. In some cases there are positive correlations to expression, but not always. It should not be expected that all the copies of the gene are equally active, because the position of a copy in the genome can affect its level of transcription. However, as the number of chromosomal sites containing foreign DNA increases, the likelihood that at least one of the pieces of DNA will integrate into a transcriptionally active region also increases.
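The last point can be made concrete with a small probability sketch. It assumes, purely for illustration, that each insertion lands in a transcriptionally active region independently and with the same probability; the value 0.3 used below is hypothetical and is not a measured quantity from this work.

```python
def p_at_least_one_active(p_active: float, n_sites: int) -> float:
    """Probability that at least one of n independent insertions is in
    a transcriptionally active region."""
    return 1.0 - (1.0 - p_active) ** n_sites


P_ACTIVE = 0.3  # hypothetical per-site probability, for illustration only

for n in range(1, 6):
    print(n, round(p_at_least_one_active(P_ACTIVE, n), 3))
```

Whatever per-site probability is assumed, the chance of at least one active copy rises with the number of insertion sites, which is the qualitative point made above.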
Kanamycin Gene Segregation Test
First generation progeny from self fertilized, transformed parents were tested for kanamycin gene segregation. Because the integrated T-DNA is inherited as a dominant Mendelian trait, the copy number of the ASP1 gene can be determined by the kanamycin segregation pattern of the progeny. The results showed that most transformed plants had multiple copies of the ASP1 gene. See Table 4, in which Kn(r) is the number showing kanamycin resistance, and Kn(s) is the number showing kanamycin sensitivity. The progenies of transformed plants carrying single, double, or triple genetic NPT II loci (the gene bestowing kanamycin resistance) are expected to segregate in 3:1, 15:1 or 63:1 ratios, respectively. Therefore, plants #2 and #13 have one NPT II locus; plants #1, #11 and #29 have two NPT II loci; and plant #17 has more than three loci of the gene encoding kanamycin resistance.
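The expected ratios follow from Mendelian segregation: with n independent dominant loci in a selfed plant, a fraction (1/4)^n of the progeny lacks every locus, giving a resistant to sensitive ratio of (4^n - 1):1. The sketch below applies this to the counts of Table 4 and picks the locus number with the smallest chi-square statistic. The goodness-of-fit step and the function names are additions for illustration, not part of the reported analysis, and the statistic is unreliable where the expected sensitive count is very small (as for plant #17).

```python
def expected_ratio(n_loci: int) -> int:
    """Resistant progeny per sensitive progeny for n independent loci."""
    return 4 ** n_loci - 1


def chi_square(resistant: int, sensitive: int, n_loci: int) -> float:
    """Chi-square statistic of observed counts against the n-locus model."""
    total = resistant + sensitive
    exp_sensitive = total / 4 ** n_loci
    exp_resistant = total - exp_sensitive
    return ((resistant - exp_resistant) ** 2 / exp_resistant
            + (sensitive - exp_sensitive) ** 2 / exp_sensitive)


def best_fit_loci(resistant: int, sensitive: int, max_loci: int = 5) -> int:
    """Locus number (1..max_loci) giving the smallest chi-square."""
    return min(range(1, max_loci + 1),
               key=lambda n: chi_square(resistant, sensitive, n))


# (Kn(r), Kn(s)) counts from Table 4.
TABLE_4 = {
    "ASP1 #1": (136, 8),
    "ASP1 #11": (127, 8),
    "ASP1 #13": (112, 37),
    "ASP1 #17": (175, 1),
    "ASP1 #29": (131, 9),
    "pBI121 #2": (107, 34),
}

for plant, (kn_r, kn_s) in TABLE_4.items():
    n = best_fit_loci(kn_r, kn_s)
    print(f"{plant:10s} best fit: {n} locus/loci ({expected_ratio(n)}:1)")
```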
Table 4
Transgenic Plant   Kn(r)   Kn(s)   Kn(r)/Kn(s)
ASP1 #1            136     8       15:1
ASP1 #11           127     8       15:1
ASP1 #13           112     37      3:1
ASP1 #17           175     1       175:1
ASP1 #29           131     9       15:1
pBI 121 #2         107     34      3:1

RNA Analysis
Efficient transcription of the inserted ASP1 genes in the tobacco plants was tested by Northern blot analysis. The polyA RNA was analyzed using the ASP1-tetramer probe. The correct gene size transcribed was about 490 bases, which consisted of 30 bases upstream and 170 bases downstream of the ASP1-tetramer gene. In addition to this message, eukaryotic mRNA contains polyA tails of different sizes. Therefore, the expected size of the ASP1-tetramer message should be around 600 plus 100 bases long. Bands were observed which corresponded to this expected size from all the samples which were analyzed. However, the levels of transcription of the ASP1 genes were dramatically different among the different transformed plants.
Transformed plant #17 accumulated 5- to 50-fold more transcripts than the other transformed plants. Such differences in accumulation could be explained by the effect of position, or by the effect of multiple copy insertion. The expression levels of the ASP1 gene and its neighboring GUS gene correlated with each other in some transformed plants (such as plant #17), but not in all. These results suggested that the levels of expression of two closely connected genes can be dramatically different. Multiple transcripts with bands of different sizes (500-700 bases) were observed from several transformed plants. This result might be due to multiple insertions of the ASP1 gene into the tobacco genome. These inserted genes may be rearranged, but still produce transcripts. Another possibility is strong secondary structure, which could form because of the four directly repeated sequences in the tetrameric ASP1 transcripts. Different mobilities could result, depending on the secondary structure.
Expression of ASP1
Standard means known in the art were used to raise polyclonal antibody against synthetic ASP1 monomer. This antibody was used to detect the production of stable ASP1 protein in tobacco. If desired, standard means known in the art can also be used to prepare monoclonal antibodies against ASP1. High levels of the tetrameric form (11.2 kD) of the ASP1 protein were detected from plant #17 by Western blot analysis (data not shown). Therefore, a direct correlation was found between gene copy number, number of genetic NPT II loci, GUS expression, accumulation of ASP1 transcript and protein expression level in the case of plant #17. Some heterologous seed proteins undergo specific degradation when expressed in transgenic plants. A significant amount of the immunoreactive protein accumulated in tobacco seed expressing the phaseolin gene is smaller than the final processed protein (Sengupta et al. 1985). A similar result was found when β-conglycinin was expressed in transgenic petunia (Beachy et al. 1985). In contrast to these results, the ASP1 protein appears to be quite stable in transgenic tobacco plants.
Amino acid and total protein analyses were conducted on leaf tissue from several of the transgenic plants which produced detectable levels of ASP1. Surprisingly, we found that the overall levels of all amino acids were increased, with some of the plants being remarkably high (Table 5, Figures 7A-7B). These data were derived from seedlings from transformed mother plants. A minimum of four separate assays was used; variation was no more than 30%. The data shown in Table 5 were determined from the dry weight of the whole plant. This rather disconcerting result has been repeated numerous times and the overall levels of all amino acids in the transgenic plants remain significantly elevated. See Table 6 and Figures 8A-8B. Table 6 gives percentages of the amino acids above that of the control (pBI 121 #2) for protein isolated from various ASP1 tetramer seedlings. These values were derived from amino acid analysis. Figures 8A-8B depict overall protein content from equivalent samples (by weight) taken from leaves of control (P-2TC) and various ASP1 tetramer seedlings. Other methods of determining overall protein content have been used, with similar trends observed. For example, comparison of total protein densitometric values derived from SDS-PAGE of equivalent samples (on a weight basis) yields the same results (data not shown). Therefore, in addition to being a very stable protein in a plant cell, ASP1 must function as a general 'protein-stabilizer' and reduce overall protein turnover without apparent deleterious effects to the plants, since there is no observable difference in growth characteristics in the plants producing high amounts of ASP1 as compared to control plants.
Table 5
Plant                 Total Protein   % Above Control
Transformed Control   12              0
ASP1 #7               19              58
ASP1 #11              28              133
ASP1 #17              24              100
ASP1 #29              17              42

Table 6
Amino Acid   % of 7 Above C   % of 11 Above C   % of 17 Above C   % of 29 Above C
Asx          60.55            80.59             47.00             35.00
Glx          65.26            56.18             46.42             20.68
Ser          30.00            109.67            73                28.00
Gly          14.46            115.96            78.01             23.69
His          31.30            94.27             63.74             27.23
Arg          39.06            86.5              58.28             23.24
Thr          31.00            106.55            79.48             23.44
Ala          6.21             123.91            76.55             27.54
Pro          11.68            114.53            73.65             19.66
Tyr          254.95           261.26            236.49            121.02
Val          14.45            80.06             53.32             23.70
Met          ND               ND                ND                ND
Cys          ND               ND                ND                ND
Ile          19.86            69.51             45.3              22.65
Leu          3.95             99.81             68.17             22.54
Phe          2.65             101.77            72.42             14.45
Lys          -25.90           119.51            79.34             3.61
Trp          ND               ND                ND                ND
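The "% Above Control" column of Table 5 can be recomputed from the total protein values as (value - control) / control x 100. The short Python sketch below is offered only as an arithmetic check; the function name is hypothetical and the rounding to whole percentages is an assumption about how the table was prepared.

```python
def percent_above_control(value, control):
    """Relative increase over the control, expressed as a percentage."""
    return (value - control) / control * 100.0


# Total protein values as listed in Table 5.
control = 12
transformants = {"ASP1 #7": 19, "ASP1 #11": 28, "ASP1 #17": 24, "ASP1 #29": 17}

# Rounding to the nearest whole percent is assumed here.
percent_above = {
    plant: round(percent_above_control(value, control))
    for plant, value in transformants.items()
}
print(percent_above)
```

Under that rounding assumption the computed values agree with the percentages listed in Table 5.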

Expression of ASP1 in Sweet Potato
The above results indicate the surprising overall increase in protein production in tobacco plants which were transformed with ASP1. Similar results have also been found in sweet potato. These results indicate that the increase in total protein content is a general phenomenon which is applicable to at least most plants. Table 7 lists the percentage of total protein, as a function of dry weight, of the transformed controls and ASP1 transformants of sweet potato. The numbers are the average of 5 separate assays. Table 8 indicates the amount of essential amino acid in mg/100 grams edible portion of the sweet potato; the numbers are the average of 3 separate assays. Table 9 illustrates the percentage of these essential amino acids compared to the transformed control, the numbers being the average of 3 separate assays. Table 10 shows data for a repeat of the experiments of Table 8 but with the content of more of the amino acids determined. The numbers in Table 10 are the average of 3 separate assays. Table 11 shows the increase in transformant #5. Table 12 shows the % protein (wet weight basis) of roots and leaves, with the numbers being the average of at least 3 separate assays, while Table 13 depicts the overall protein content of the roots of transformed plants on a dry weight basis together with percent dry matter and overall moisture content.
Table 7
Overall Protein Content and Percentage of the Control
Transformed Plant     % Total Protein   % of Control
Transformed Control   3.3 ± 0.31        100
ASP1 Transformant     6.3 ± 0.46        191
ASP1 Transformant     5.2 ± 0.14        158
ASP1 Transformant     4.8 ± 0.06        146
ASP1 Transformant     9.1 ± 0.16        276
ASP1 Transformant     9.6 ± 0.19        291

Table 8
Essential Amino Acid Content in mg/100 Grams Edible Portion
Essential AA    T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine      90          225       255       290       315       270
Leucine         175         360       415       465       455       430
Lysine          148         275       315       350       395       365
Methionine      55          115       135       135       155       135
Phenylalanine   135         275       340       375       430       350
Threonine       135         225       305       350       385       325

Table 9
Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA    T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine      100         250       283       322       350       300
Leucine         100         206       237       266       260       246
Lysine          100         186       213       237       267       247
Methionine      100         209       246       246       282       246
Phenylalanine   100         204       252       277       319       259
Threonine       100         167       226       259       285       241

Table 10
Essential Amino Acid Content in mg/100 Grams Edible Portion (Sweet Potato)
Essential AA      T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine        80          320       420       383       388       433
Leucine           155         567       680       633       625       687
Lysine            125         450       510       493       493       537
Methionine        30          143       190       173       165       197
Phenylalanine     90          497       600       560       540       617
Threonine         105         423       480       430       445       487
Tryptophan        0.5         83        65        51        57        61
Valine            110         473       610       573       588       637
Nonessential AA
Aspartic Acid     230         2,260     1,267     1,533     2,395     2,567
Serine            95          513       450       547       558       660
Glutamic Acid     245         1,100     913       993       1,000     1,210
Proline           110         223       270       333       358       400
Glycine           100         387       333       397       393       443
Alanine           105         473       367       397       480       487
Tyrosine          55          327       310       367       358       403
Histidine         40          197       153       160       198       210
Arginine          95          417       357       590       450       507
Ammonium          45          250       140       160       260       277
Protein           2.300       10.150    7.593     9.473     10.403    11.083

Table 11
Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA    T-Control   ASP1 T5
Isoleucine      100         542
Leucine         100         443
Lysine          100         429
Methionine      100         656
Phenylalanine   100         685
Threonine       100         464
Tryptophan      100         1,230
Valine          100         579

Table 12
Overall Protein Content (Fresh Weight) in Storage Roots and Leaves
                      % Protein in Roots   % Protein in Leaves
Transformed Control   0.36                 0.94
ASP1 Transformant     2.42                 1.93
ASP1 Transformant     1.60                 1.21
ASP1 Transformant     2.11                 1.26
ASP1 Transformant     2.23                 1.60
ASP1 Transformant     2.46                 2.03

Table 13
Various Composition of Roots Content on a Dry Weight Basis
                      % Protein   % Dry Matter   % Moisture
Transformed Control   2.3         17.0           83
ASP1 Transformant     10.2        23.7           76
ASP1 Transformant     8.0         21.7           79
ASP1 Transformant     9.5         23.0           78
ASP1 Transformant     10.6        20.7           80
ASP1 Transformant     11.7        21.9           79

In a field study of 5 separate lines of sweetpotato which were transformed with ASP1, it was seen that 3 of the 5 lines grew more slowly than the control plants and produced fewer storage roots, while the remaining two transformed lines, which had the highest protein levels, grew normally. Application of nitrogenous fertilizer did not make any significant difference in the yield of these two lines.
Other Gene Constructs Which Can Yield Increased Overall Protein Content
Protein constructs similar to ASP1 will also cause plants to give elevated total protein yields when the plant is transformed with a gene construct which expresses such a protein. One such protein is HDNP1 which has the following monomeric amino acid sequence:
MLEEIFKKMTEWIEKVLKTM (SEQ ID NO:6)
hhHHhhHHhHHhhHHhhHhh (SEQ ID NO:31)
The "h" and "H" below the amino acid sequence refer to "hydrophobic" and "hydrophilic", respectively. Hydrophobic amino acids comprise: isoleucine, methionine, phenylalanine, tryptophan, valine, leucine, alanine and cysteine. Hydrophilic amino acids comprise: arginine, glutamic acid, histidine, lysine, asparagine, aspartic acid, glutamine, tyrosine and proline. Glycine, threonine and serine can act as either hydrophilic or hydrophobic amino acid residues depending upon their immediate environment. The HDNP1 monomer is composed of 20 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTE
WIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTM (SEQ ID NO:7).

This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
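The tetramer is simply four copies of the monomer joined by three gpgr turns, so its length is 4 x 20 + 3 x 4 = 92 residues. As a minimal illustration, the Python sketch below assembles the HDNP1 tetramer from the monomer given above and checks that length. The function and constant names are hypothetical and chosen only for this example.

```python
TURN = "gpgr"  # beta-turn linker placed between adjacent monomer units


def build_tetramer(monomer, turn=TURN, copies=4):
    """Join `copies` monomer units with the turn sequence between each pair."""
    return turn.join([monomer] * copies)


HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"

tetramer = build_tetramer(HDNP1_MONOMER)
print(tetramer)
print(len(tetramer))  # 4 monomers of 20 residues plus 3 turns of 4 residues
```

The same arithmetic, 4n + 12 for a monomer of n residues, applies to any monomer joined in this way.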
HDNP1 is quite similar to ASP1 except that the Leu in position 5 of the monomer has been changed to Ile and the Ile in position 17 of the ASP1 monomer has been changed to a Leu in HDNP1. For the tetramer, these changes are made throughout the protein, as can be seen by comparing the amino acid sequences. Yet another protein which can yield similarly elevated protein levels when plants are transformed with a gene construct expressing it is the protein HDNP2, which has the following monomeric sequence:
MTIEWKVELKFEMKIELKMT (SEQ ID NO:9)
hHhHhHhHhHhHhHhHhHhH (SEQ ID NO:36)
This monomer is composed of 20 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKF
EMKIELKMTgpgrMTIEWKVELKFEMKIELKMT (SEQ ID NO:10).
The tetramer is composed of 92 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
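The "h"/"H" annotation can be checked mechanically against the residue classes listed above. The Python sketch below is an illustration under stated assumptions: it uses the standard one-letter amino acid codes, treats glycine, threonine and serine as compatible with either label (as the text allows), and uses hypothetical function names. It also tests whether a pattern alternates strictly, which is the arrangement described for the β-pleated sheet designs.

```python
# Residue classes as listed in the text, in one-letter code.
HYDROPHOBIC = set("IMFWVLAC")
HYDROPHILIC = set("REHKNDQYP")
AMBIGUOUS = set("GTS")  # may act as either class depending on environment


def matches_pattern(sequence, pattern):
    """True if every residue is compatible with its 'h' or 'H' label."""
    if len(sequence) != len(pattern):
        return False
    for residue, label in zip(sequence, pattern):
        if residue in AMBIGUOUS:
            continue  # compatible with either label
        if label == "h" and residue not in HYDROPHOBIC:
            return False
        if label == "H" and residue not in HYDROPHILIC:
            return False
    return True


def is_strictly_alternating(pattern):
    """True if no two neighbouring labels are the same."""
    return all(a != b for a, b in zip(pattern, pattern[1:]))


HDNP1 = ("MLEEIFKKMTEWIEKVLKTM", "hhHHhhHHhHHhhHHhhHhh")
HDNP2 = ("MTIEWKVELKFEMKIELKMT", "hHhHhHhHhHhHhHhHhHhH")

for name, (seq, pat) in {"HDNP1": HDNP1, "HDNP2": HDNP2}.items():
    print(name, matches_pattern(seq, pat), is_strictly_alternating(pat))
```

For these two monomers the annotation is consistent with the residue classes, and only the HDNP2 pattern alternates at every position.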
Protein Designs Useful for Organisms Other than Humans
The proteins ASP1, HDNP1 and HDNP2 were designed to yield high levels of essential amino acids especially suitable for humans. Each type of animal has its own set of required essential amino acids and these sets, while usually overlapping, differ from each other. Other proteins can be designed which yield higher levels of essential amino acids more suitable for organisms other than humans. For example, pigs have one set of essential amino acids, chickens have a different set, and fish have yet a different set. Transgenic plants can be engineered specifically for feeding to one particular species of animal. For example, various transgenic corn plants can be produced wherein one transgenic form is most suitable for humans, a second transgenic form produces a high level of those essential amino acids suited for pigs, and a third transgenic form is most suited for chickens.
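One simple way to compare candidate designs for a given species is to compute the fraction of residues that belong to that species' essential set. The Python sketch below does this for the HDNP1 monomer. It is only an illustration: the function name is hypothetical, and the essential set used is an assumption taken from the amino acids listed as essential in Table 10 (isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan and valine). A different set would be supplied for another species.

```python
from collections import Counter

# Assumed essential set, taken from the "Essential AA" rows of Table 10.
ESSENTIAL_TABLE10 = set("ILKMFTWV")


def essential_fraction(sequence, essential=ESSENTIAL_TABLE10):
    """Fraction of residues in `sequence` that belong to the essential set."""
    counts = Counter(sequence.upper())
    essential_total = sum(n for aa, n in counts.items() if aa in essential)
    return essential_total / len(sequence)


HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"
print(f"{essential_fraction(HDNP1_MONOMER):.0%} of HDNP1 monomer residues are in the set")
```

Passing a different `essential` set lets the same function score a sequence against the requirements of another species.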
The design of such proteins can be based on the design of ASP1, HDNP1 and HDNP2. One of skill in the art knows how to prepare DNA which will encode each desired protein. The following are examples of the monomeric and tetrameric forms of proteins which may be used for specific species of animals. DNAs encoding these proteins are easily designed and used to make transgenic plants as described above.
A protein directed to use with swine is SDNP1 which has the amino acid sequence:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM (SEQ ID NO:11)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHHh (SEQ ID NO:32)
This monomer is composed of 41 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETM
HKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVT
MVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEIT
KKM (SEQ ID NO:12).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for swine is SDNP2 and has the monomeric amino acid sequence:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM (SEQ ID NO:13)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:37)
This monomer is composed of 41 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVET
HWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETK
IEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM
(SEQ ID NO:14).
The tetramer is composed of 176 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with poultry is PDNP1 which has the amino acid sequence:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:15)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHh (SEQ ID NO:33)
This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHW
TEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEG
FTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:16).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for poultry is PDNP2 and has the monomeric amino acid sequence:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:17)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:38)
This monomer is composed of 37 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMH
VGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGF
ETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:18).
The tetramer is composed of 160 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with fish is FDNP1 which has the amino acid sequence:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK (SEQ ID NO:19)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHH (SEQ ID NO:34)
This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMK
KWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILE
EFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK
(SEQ ID NO:20).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for fish is FDNP2 and has the monomeric amino acid sequence:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK (SEQ ID NO:21)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:39)
This monomer is composed of 40 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVE
LKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELK
MELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK
(SEQ ID NO:22).
The tetramer is composed of 172 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with dogs is DDNP1 which has the amino acid sequence:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:23)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWE
EMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETF
TKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:24).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for dogs is DDNP2 and has the monomeric amino acid sequence:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:25)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhh (SEQ ID NO:40)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKW
EVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFT
LTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:26).
The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
A protein directed to use with cats is CDNP1 which has the amino acid sequence:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:27)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic α-helix. The tetrameric form is:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWE
EMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRI
TRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:28).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr (SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic α-helix.
A second protein for cats is CDNP2 and has the monomeric amino acid sequence:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:29)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:41)
This monomer is composed of 38 amino acids in the structural motif to render an amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence of the tetrameric form is:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIK
VELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVR
LEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:30).
The tetramer is composed of 164 amino acids, including the 12 amino acids comprising the 3 β-turns, in the structural motif to render an amphiphilic β-pleated sheet.
Use of Vectors to Increase the Production of a Second Protein for which the Plant is Transformed
The generally enhanced levels of protein production can be useful in expressing other valuable proteins. For example, if a gene coding for insulin were cloned into a plant expressing the ASP1 gene, it is expected that levels of insulin production will be higher, as compared to control plants having the insulin gene but lacking the ASP1 gene. Therefore, plants which are transgenic both for the ASP1 gene (or a similar gene which also results in increased total protein production) and for a second gene which encodes a protein of interest will make more of the protein of interest than if the plant were transformed solely with the gene encoding the protein of interest and not with the ASP1 or similar gene. It is irrelevant whether the plant is first transformed with ASP1 or a similar gene and later transformed with a gene of interest, or whether the plant is first transformed with the gene of interest and later transformed with ASP1 or a similar gene. Also, a transformation can be performed using both genes simultaneously.
Use of Vectors to Increase the Production of Nonprotein Products
Plants and plant cells which have been made transgenic for ASP1 or similar amphipathic proteins produce greater amounts of all protein than do nontransgenic plants or cells. As a result of this generally higher level of protein, higher levels of nonprotein products will also be made. This result is expected because there will be an increase in the levels of enzymes which are used in the synthesis of such products. For example, taxol is naturally synthesized by certain plants and the synthesis of taxol is dependent on enzymes. Increased levels of those enzymes will lead to increased levels of taxol. Similarly, many plants produce sugars, e.g., sugarcane. Again, the synthesis of sugars is dependent on enzymes within the plant. Increased levels of these enzymes will yield increased levels of the sugars. Therefore, simply making a plant or plant cell transgenic for ASP1 or a similar amphipathic protein will result in the plant or cell producing more product, wherein said product need not be a protein but is synthesized by protein (enzyme) action.
Similarly, if one knows the enzymes involved in the synthetic pathway of a desired product, e.g., taxol or sugar, one can co-transform a plant or plant cell with a gene encoding ASP1 or similar amphipathic protein and with a gene encoding the enzyme which is utilized in synthesizing the desired product. In this way one can further enhance the production of the desired product. This can be especially useful if there is one limiting enzyme and the gene for this limiting enzyme of the pathway is used.
Sweetpotato was transformed with ASP1 and two transformed lines were assayed for sugar content and overall amount of dry matter versus moisture content. Results are shown in Table 14.

Table 14
                 % Sucrose   % Glucose   % Fructose
Control          0.54        0.06        0.06
Transformant 1   0.95        0.06        0.06
Transformant 2   1.57        0.15        0.13

Table 14 shows that Transformant 1 had an increased production of sucrose but normal production of both glucose and fructose, whereas Transformant 2 had increased production of all 3 sugars as compared to the control plant.
Table 13 indicates that the overall amount of dry matter is increased from 17% in the control to roughly 22% in the transformants. This is approximately a 30% increase in dry matter as a percent of the total weight of the plant.
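The "roughly 22%" and "approximately 30%" figures can be recovered from the dry matter column of Table 13 by averaging the five transformants and expressing the difference relative to the control. The Python sketch below is an arithmetic check only; averaging the five lines with equal weight is an assumption, and the function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def relative_increase(value, control):
    """Increase over the control as a percentage of the control."""
    return (value - control) / control * 100.0


# % dry matter values as listed in Table 13.
control_dry_matter = 17.0
transformant_dry_matter = [23.7, 21.7, 23.0, 20.7, 21.9]

mean_dry_matter = mean(transformant_dry_matter)
increase = relative_increase(mean_dry_matter, control_dry_matter)
print(f"mean dry matter: {mean_dry_matter:.1f}%")
print(f"relative increase over control: {increase:.1f}%")
```

Note that this is a relative increase: the absolute change is about five percentage points of total plant weight.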
Use in Plants Other than Tobacco and Sweet Potato
High-level, tissue-specific expression of ASP1 or related genes can also be performed, in a manner generally analogous to that described above for tobacco and sweet potato, in certain economically important plants such as rice, wheat, barley, sorghum, maize, potato, plantain, cassava, taro, soybean, alfalfa, or a forage grass. It is desirable to incorporate suitable promoters or other regulatory sequences to encourage expression (preferably constitutive expression) primarily in the part of the plant intended as a foodstuff. For example, in rice or maize, expression is desired primarily in the seeds, while in potato or sweet potato, expression is desired primarily in the tuber. Where necessary, transformation protocols known in the art other than the Agrobacterium protocol will be used, such as transformation through a DNA particle gun or via plant protoplasts. See, e.g., Klein et al. (1987) and Croughan et al. (1989). These plants can be transformed with vectors encoding not only ASP1 but also any similar protein, including any of the proteins disclosed above.
Cell Culture of Transgenic Plant Cells
It is not necessary to make transgenic plants to perform the invention. Plant cells can be made transgenic with a gene encoding ASP1 or other amphipathic protein and these transgenic cells can be grown in culture or in a bioreactor. This avoids the necessity of having to regenerate a plant. These transgenic cells will produce enhanced levels of protein and other products, as was seen in the transgenic plants. These cells can be cotransformed with any genes of interest, for example a gene encoding insulin. The desired product will be overproduced as compared to a nontransgenic plant cell or a cell not transformed with a gene encoding ASP1 or other amphipathic protein. The desired product can be purified from the cultured cells.
As used in the claims below, unless otherwise clearly indicated by context, the term "higher plant" is intended to encompass gymnosperms, monocotyledons, and dicotyledons; as well as any cells, tissues, or organs taken or derived from any of the above, including without limitation any seeds, leaves, stems, flowers, roots, tubers, single cells, gametes, or protoplasts taken or derived from any gymnosperm, monocotyledon, or dicotyledon. Also, the term "protein" is meant to include peptides such as dipeptides or any longer peptide as well as proteins.
Although the bulk of the above discussion regarding this invention has focused on de novo proteins having an α-helical structure, the same basic approach can work in designing de novo proteins having a β-sheet structure. To generate amphipathic β-sheets (which are not believed to have been reported in nature), amino acid residues will alternate between being hydrophobic and being hydrophilic, so that one side of the structure is hydrophobic and the other side is hydrophilic. This structure was seen in the sequences disclosed above.
Salt bridges to promote stability can be formed with internal sequences glu-X-lys. Other acid-X-base sequences may also serve this function. Lysine is preferred as the base because it is an essential amino acid.
It may be possible to substitute aspartic acid for glutamic acid, however, to give the internal sequence asp-X-lys. Turns between adjacent monomer units may be promoted, for example, by the internal sequence gly-asn, to form oligomers or polymers of the main peptide structure.
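Candidate acid-X-base triplets of the kind described above can be located in a sequence with a simple pattern search. The Python sketch below finds glu-X-lys (E-x-K) triplets, including overlapping ones, in the HDNP2 monomer. It is an illustration only: the function name is hypothetical, and a sequence match says nothing about whether a salt bridge actually forms in the folded protein.

```python
import re


def find_acid_x_base(sequence, acid="E", base="K"):
    """Return (1-based start position, triplet) for each acid-X-base match.

    A lookahead is used so that overlapping triplets are all reported.
    """
    pattern = re.compile(f"(?=({acid}.{base}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


HDNP2_MONOMER = "MTIEWKVELKFEMKIELKMT"

for position, triplet in find_acid_x_base(HDNP2_MONOMER):
    print(position, triplet)
```

Calling the function with `acid="D"` searches for the asp-X-lys alternative mentioned above, and `acid="[ED]"` searches for both at once.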
While the invention has been disclosed in this patent application by reference to the details of preferred embodiments of the invention, it is to be understood that the disclosure is intended in an illustrative rather than in a limiting sense, as it is contemplated that modifications will readily occur to those skilled in the art, within the spirit of the invention and the scope of the appended claims.

LITERATURE CITED
Agros, P., Pederson, K., Marks, D. and Larkins, B. A. 1982. A structural model for maize zein proteins. J. Biol. Chem. 257: 9984-9990.
Agros, P., Naravana, S. V. L.; and Nielsen, N. C. 1985. Structural similarity between legumin and vicillin storage proteins from legumes. The EMBO J. 4: 1111-1117.
Altenbach, S. B., Pederson, K. W., Meeker, G., Staraci, L. C., and Sun, S. S. M. 1989. Enhancement of the methionine content of seed proteins by the expression of a chimeric gene encoding a methionine-rich protein in transgenic plants. Plant Mol. Biol. 13: 513-522.
Bachmair, A., Finley, D., and Varshavsky, A. 1986. In vivo halflife of a protein is a function of its amino-terminal residue. Science 234: 179-186.
Badley, R.A., Atkinson, D., Hauler, H., Oldani, D., Green, J.P., and Stubbs, J.M. 1975. The structure, physical and chemical properties of the soybean protein glycinin. Biochim. Biophys. Acta. 412: 214-228.
Bartels, D., and Tompson, R.D. 1983. The characterization of cDNA clones coding for wheat storage proteins. Nucleic Acids Res. 11: 2961-2977.
Beachy R.N., Chen Z.L., Horsch R.B., Rogers S.G., Hoffman N.J., and Fraley R.T. 1985. Accumulation and assembly of soybean β-conglycinin in seeds of transformed petunia plants. EMBO J. 4: 3047-3053.
Bierzynski, A., Kim, P. S., and Baldwin, R. L. 1982. A salt bridge stabilizes the helix formed by isolated c-peptide of RNAse A. Proc. Natl. Acad. Sci. U. S. A. 79: 2470-2474.
Blundell, T.L., Thornton, S. J., Burley, S. K., and Petsco, G. A. 1986. Atomic interactions. Science 234: 1005-1009.
Bollini, R. and Chrispeels, M.J. 1978. Characterization and subcellular localization of vicillin and phyto-hemagglutinin, the two major reserve proteins of Phaseolus vulgaris. Planta 142: 291-298.
Brown, J. E. and Klee, W. A. 1971. Helix-coil transition of the isolated amino terminus of Ribonuclease. Biochemistry 10: 470-476.
Chen, Z. L., Pan, N. S., and Beachy, R. N. 1988. A DNA sequence element that confers seed-specific enhancement of a constitutive promoter. The EMBO J. 7: 297-302.
Chen, Z. L., Schuler, M. A. and Beachy, R. N. 1986. Functional analysis of regulatory elements in a plant embryo-specific gene. Proc. Natl. Acad. Sci. U. S.A. 83:8560-8564.
Chou, P. Y. and Fasman, G. D. 1978. Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. 47: 45-148.

Colot, V., Robert, L. S., Kavanagh, T. A., Beavan, M. W. and Tompson, R. D. 1987. Localization of sequences in wheat endosperm protein genes which confer tissue-specific expression in tobacco. The EMBO J. 6: 3559-3564.
Creighton, T.E. 1984. Proteins. New York: Freeman.
Crouch, M., Tenberge, K., Simone, N.E., and Ferl, R. 1983. Sequence of the 1.7K storage protein of Brassica napus. Mol. Appl. Genet. 2: 273-283.
Croughan et al. 1989. Advances in Plant Biotechnology, pp. 107-114.
Degrado, W. F., and Lear, J. D. 1985. Induction of peptide conformation at apolar/water interfaces. J. Am. Chem. Soc. 107: 7684-7689.
Degrado, W. F., Wasserman, Z. R., and Lear, J.D. 1989. Protein design, a minimalist approach. Science 241: 622-628.
Esen, E. 1986. Separation of alcohol-soluble proteins (zeins) from maize into three fractions by differential solubility. Plant Physiol. 80: 623-627.
Finley, D. and Varshavsky, A. 1985. The ubiquitin system: functions and mechanisms. Trends Biochem. Sci. 10: 343-346.
Forde, B.G., Kreis, M., Williamson, M.S., Fry, R.P. and Pywell, J. 1985. Short tandem repeats shared by B- and C-hordein cDNAs suggest a common evolutionary origin for two groups of cereal storage protein genes. The EMBO J. 4: 9-15.
Goldberg, A.L., and St John, A.C. 1976. Intracellular protein degradation in mammalian and bacterial cells: part 2. Annu. Rev. of Biochem. 45: 747-803.
Greenwood, J. S., and Chrispeels, M. J. 1985. Correct targeting of the bean storage protein phaseolin in the seeds of transformed tobacco. Plant Physiol. 79: 65-71.
Gross, D.S., and Garrard, W.T. 1987. Poising chromatin for transcription. Trends in Biochem. 12: 293-296.
Ho, S. P., and Degrado, W. F. 1987. Design of a 4-helix bundle protein: synthesis of peptides which self associate into helical protein. J. Am. Chem. Soc. 109: 6751-6758.
Hoffmann, L.E., Donaldson, D. D., and Herman, E. M. 1988. A modified storage protein is synthesized, processed, and degraded in the seeds of transgenic plants. Plant Mol. Biol. 11: 717-729.
Hoffmann, L. E., Donaldson, D. D., Bookland, R., Rashka, K. and Herman, E. M. 1987. Synthesis and protein body deposition of maize 15-kd zein in transgenic tobacco seeds. The EMBO J. 6: 3213-3221.

Hol, W. G. and Sander, H. C. 1981. Dipole of the α-helix and β-sheet: their role in protein folding. Nature 294: 532-536.
Horsch, R.B., Fry, J., Hoffmann, N., Neidermeyer, J., Rogers, S.G. and Fraley, R.T. 1988. In Plant Mol. Biol. Manual ed. S. B. Gelvin and R. A. Schilperoort, Dordrecht: Kluwer Academic.
Jaynes, J. M., Nagpala, P., Destefano, L., Denny, T., Clark, C., and Kim, J-H. 1992. Expression of a de novo designed peptide in transgenic tobacco plants confers enhanced resistance to Pseudomonas solanacearum infection. Plant Science 89: 43-53.
Jaynes, J. M., Yang, M. S., Espinoza, N. O., and Dodds, J. H. 1986. Plant protein improvement by genetic engineering: use of synthetic genes. Trends in Biotechnol. 4: 314-320.
Jones, J.D.G., and Gilbert, D.E. 1987. T-DNA structure and gene expression in petunia plants transformed by Agrobacterium tumefaciens C58 derivatives. Mol. Gen. Genet. 207: 478-485.
Kabsch, W., and Sander, C. 1983. How good are predictions of protein structure? FEBS Lett. 155: 179-182.
Kane, J.F. and Hartley, D.L. 1988. Formation of recombinant protein inclusion bodies in Escherichia coli. Trends in Biotechnol. 6: 95-101.
Kasarda, D.D., Okita, T.W., Bernardin, J.E., Baecker, P.A., and Nimmo, C.C. 1984. DNA and amino acid sequences of alpha and gamma gliadins. Proc. Natl. Acad. Sci. U.S.A. 81: 4712-4716.
Keris, M., Shewry, P. R., Forde, B. G., Forde, G. and Miflin, J. 1985. Structure and evolution of seed storage proteins and their genes with particular reference to those of wheat, barley and rye. Oxford Survey of Plant Mol. and Cell Biol. 2: 253-317.
Klein, T.M., Wolf, E.D., Wu, R. and Sanford, J.C. 1987. High-velocity microprojectiles for delivering nucleic acids into living cells. Nature 327:70-73.
Komoriya, A., and Chaiken, J. M. 1982. Sequence modeling using semisynthetic Ribonuclease S. J. Biol. Chem. 257: 2599-2604.
Larkins, B.A. 1983. Genetic engineering of seed storage protein. In Genetic Engineering of Plants ed. B. A. Larkins, pp. 93-120. New York: Plenum.
Larkins, B.A., Pederson, K., Mark, M.D., and Wilson, D.R. 1984. The zein protein of maize endosperm. Trends in Biochem. 9: 306-308.
Lawrence, M.C., Suzuki, E., Varghes, J.N., Davis, P.C., Van Donkelaar, A., Tulloch, P.A. and Collman, P.M. 1990. The three-dimensional structure of the seed storage protein phaseolin at 3 Å resolution. The EMBO J. 9: 9-15.

Lear, J. D., Wasserman, Z. R. and Degrado, W. F. 1988. Synthetic amphiphilic peptide model for protein ion channels. Science 240: 1177-1181.
Lending, C. R., Kriz, A., Larkins, B. A. and Bracker, C. E. 1988. Structure of maize protein bodies and immunocytochemical localization of zeins. Protoplasma 143: 51-62.
Lycett, G. W., Cory, R.D., Shirsat, A. H., Richards, D. M., and Boulter, D. 1985. The 5'-flanking regions of three pea legumin genes: comparison of DNA sequences. Nucleic Acids Res. 13: 6733-6743.
Marqusee, S. and Baldwin, R. 1987. Helix stabilization by GLU-LYS salt bridges in short peptides of de novo design. Proc. Natl. Acad. Sci. U. S. A. 84: 8898-8902.
Marries, C., Gallois, P., Copley, J. and Keris, M. 1988. The 5' flanking region of a barley B hordein gene controls tissue and developmental specific CAT expression in tobacco plants. Plant Mol. Biol. 10: 359-366.
Mutter, M. 1988. Nature's rules and chemist's tools: a way for creating novel proteins. Trends in Biochem. 13: 260-264.
Pace, C.N. and Barnet, A.J. 1984. Kinetics of tryptic hydrolysis of the arginine-valine bond in folded and unfolded ribonuclease T1. Biochem. J. 219: 411-417.
Pakula, A.A. and Sauer, R.T. 1986. Bacteriophage λ Cro mutation: effect on activity and intracellular degradation. Proc. Natl. Acad. Sci. U.S.A. 82: 8829-8833.
Pakula, A.A. and Sauer, R.T. 1989. Amino acid substitutions that increase the thermal stability of the λ Cro protein. Proteins 5: 202-210.
Parasell, D.A. and Sauer, R.T. 1989. The structural stability of a protein is an important determinant of its proteolytic susceptibility in Escherichia coli. J. Biol. Chem. 264: 7590-7595.
Pederson, K., Agros, P., Naravana, S. V. L., and Larkins, B. A. 1986. Sequence analysis and characterization of a maize gene encoding a high-sulfur zein protein of Mw 15,000. J. Biol. Chem. 261: 6279-6284.
Pernollet, J. C. and Mosse, J. 1983. Structure and location of legume and cereal seed storage protein. Seed Proteins. Phytochemical Soc. of Eur. Sym. Series 20: 155-187.
Pontremoli, S. and Melloni, E. 1986. Extralysosomal protein degradation. Annu. Rev. Biochem. 55: 455-481.
Presnell, S.R., and Cohen, F.E. 1989. Topological distribution of a four-α-helix bundle. Proc. Natl. Acad. Sci. U.S.A. 86: 6592-6596.
Presta, L. G. and Rose, G. D. 1988. Helix signals in proteins. Science 240: 1632-1641.

Rafalski, J.A., Scheets, K., Metzler, M., and Peterson, D.M. 1984. Developmentally regulated plant genes: the nucleotide sequence of a wheat gliadin genomic clone. The EMBO J. 3: 1409-1415.
Richardson, J. S. and Richardson, D. C. 1988. Amino acid preferences for specific locations at the ends of α-helices. Science 240: 1648-1652.
Richardson, J. S. and Richardson, D. C. 1989. The de novo design of protein structures. Trends in Biochem. 14: 304-309.
Sanders, P.R., Winter, J.A., Barnason, A.R. and Rogers, S.G. 1987. Comparison of cauliflower mosaic virus 35S and nopaline synthetase promoters in transgenic plants. Nucleic Acids Res. 15: 1543-1558.
Scheraga, H. 1978. Use of random copolymers to determine helix-coil stability constants of the naturally occurring amino acids. Pure. Appl. Chem. 50: 315-324.
Scheraga, H. A. 1985. Effect of side chain-backbone electrostatic interaction on the stability of α-helices. Proc. Natl. Acad. Sci. U. S. A. 82: 5585-5587.
Scott, R.J., and Draper, J. 1987. Transformation of carrot tissue derived from proembryogenic suspension cells: a useful model system for gene expression studies in plants. Plant Mol. Biol. 8: 265-274.
Sengupta, G. C., Reichert, N. A., Baker, R. F., Hall, T. C. and Kemp, J. D. 1985. Developmentally regulated expression of the bean β-phaseolin gene in tobacco seed. Proc. Natl. Acad. Sci. U. S. A. 82: 3320-3324.
Shen, S-H. 1984. Multiple joined genes prevent product degradation in E. coli. Proc. Natl. Acad. Sci. U. S. A. 81: 4627-4631.
Slightom, J.L., Sun, S.M., and Hall, T.C. 1983. Complete sequence of french bean storage protein gene: phaseolin. Proc. Natl. Acad. Sci. U.S.A. 80: 1897-1901.
Staswick, P. E. 1989. Preferential loss of an abundant storage protein from soybean pods during seed development. Plant Physiol. 90: 1251-1255.
Stockhaus, J., Eckes, P., Blau, A., Schell, J., and Willmitzer, L. 1987. Organ-specific and dosage-dependent expression of a leaf/stem specific gene from potato after tagging and transfer into potato and tobacco plants. Nucleic Acids Res. 15: 3479-3491.
Sueki, M., Lee, S., Power, S. P., Denton, J. B., Konishi, Y., and Scheraga, H. 1984. Helix-coil stability constants for the naturally occurring amino acids in water. Macromolecules 17: 148-155.
Twell, D. and Ooms, G. 1987. The 5' flanking DNA of a patatin gene directs tuber specific expression of a chimeric gene in potato. Plant Mol. Biol. 9: 365-375.

Wallace, J. C., Galili, G., Kawata, E. E., Cuellar, R. E., Shotwell, M. A., and Larkins, B. A. 1988. Aggregation of lysine containing zeins into protein bodies in Xenopus oocytes. Science 240: 662-664.
Wenzler, H. C., Mignery, G. A., Fisher, L. M., and Park, W. D. 1989. Analysis of a chimeric class I patatin-GUS gene in transgenic potato plants: high level expression of tubers and sucrose-inducible expression in cultured leaf and stem explants. Plant Mol. Biol. 12: 41-50.
Yang, M.S., Espinoza, N. O., Dodds, J. H., and Jaynes, J. M. 1989. Expression of a synthetic gene for improved protein quality in transformed potato plants. Plant Science 64: 99-111.
Zimm, B. H. and Bragg, J. R. 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J. Chem. Phys. 31: 526-535.

SEQUENCE LISTING
<110> Demegen, Inc.
<120> A Method for Increasing the Protein Content of Plants <130> 2093-124 <140>
<141>
<150> U.S. 09/066,056 <151> 1998-04-27 <160> 41 <170> PatentIn Ver. 2.0 <210> 1 <211> 293 <212> DNA
<213> Artificial Sequence <220>
<221> CDS
<222> (10)..(285) <220>
<223> Description of Artificial Sequence:Synthetic DNA
to encode artificial protein ASP1.
<400> 1
gatccaaca atg ctt gaa gag ctg ttc aaa aag atg acc gag tgg atc gag   51
          Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu
aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctg ttc   99
Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe
aaa aag atg acc gag tgg atc gag aaa gtg atc aaa acg atg gga cca   147
Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro
ggc agg atg ctc gag gag ctg ttc aaa aag atg acc gag tgg atc gag   195
Gly Arg Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu
aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctc ttt   243
Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe
aaa aaa atg act gag tgg atc gaa aaa gtg atc aaa act atg taggaatt   293
Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met
<210> 2 <211> 92 <212> PRT
<213> Artificial Sequence <400> 2 Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met <210> 3 <211> 20 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:Version of ASP1.
<400> 3 Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met <210> 4 <211> 24 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:Version of ASP1.
<400> 4 Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro Gly Arg <210> 5 <211> 5 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Protein segment to promote salt bridges.
<400> 5 Glu Xaa Xaa Xaa Lys <210> 6 <211> 20 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:HDNP1 monomer.
<400> 6 Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met <210> 7 <211> 92 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:HDNP1 tetramer.
<400> 7 Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met <210> 8 <211> 4 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:Protein segment to act as helix breaker.
<400> 8 Gly Pro Gly Arg <210> 9 <211> 20 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:HDNP2 monomer.
<400> 9 Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr <210> 10 <211> 92 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:HDNP2 tetramer.
<400> 10 Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr Gly Pro Gly Arg Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr Gly Pro Gly Arg Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr Gly Pro Gly Arg Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr <210> 11 <211> 41 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:SDNP1 monomer for use with swine.
<400> 11 Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met <210> 12 <211> 176 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:SDNP1 tetramer for use with swine.
<400> 12 Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met Gly Pro Gly Arg Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met Gly Pro Gly Arg Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met Gly Pro Gly Arg Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met <210> 13 <211> 41 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:SDNP2 monomer for use with swine.
<400> 13 Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met <210> 14 <211> 176 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:SDNP2 tetramer for use with swine.
<400> 14 Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met Gly Pro Gly Arg Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met Gly Pro Gly Arg Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met Gly Pro Gly Arg Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met <210> 15 <211> 37 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:PDNP1 monomer for use with poultry.
<400> 15 Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met <210> 16 <211> 160 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:PDNP1 tetramer for use with poultry.
<400> 16 Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met Gly Pro Gly Arg Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met Gly Pro Gly Arg Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met Gly Pro Gly Arg Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met <210> 17 <211> 37 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:PDNP2 monomer for use with poultry.
<400> 17 Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met <210> 18 <211> 160 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:PDNP2 tetramer for use with poultry.
<400> 18 Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met Gly Pro Gly Arg Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met Gly Pro Gly Arg Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met Gly Pro Gly Arg Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met <210> 19 <211> 40 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:FDNP1 monomer for use with fish.
<400> 19 Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys <210> 20 <211> 172 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:FDNP1 tetramer for use with fish.
<400> 20 Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys Gly Pro Gly Arg Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys Gly Pro Gly Arg Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys Gly Pro Gly Arg Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys <210> 21 <211> 40 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:FDNP2 monomer for use with fish.
<400> 21 Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys <210> 22 <211> 172 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:FDNP2 tetramer for use with fish.
<400> 22 Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys Gly Pro Gly Arg Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys Gly Pro Gly Arg Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys Gly Pro Gly Arg Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys <210> 23 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:DDNP1 monomer for use with dogs.
<400> 23 Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met <210> 24 <211> 164 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:DDNP1 tetramer for use with dogs.
<400> 24 Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met Gly Pro Gly Arg Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met Gly Pro Gly Arg Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met Gly Pro Gly Arg Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met <210> 25 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:DDNP2 monomer for use with dogs.
<400> 25 Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val <210> 26 <211> 164 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:DDNP2 tetramer for use with dogs.

<400> 26 Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val Gly Pro Gly Arg Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val Gly Pro Gly Arg Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val Gly Pro Gly Arg Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val <210> 27 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:CDNP1 monomer for use with cats.
<400> 27 Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met <210> 28 <211> 164 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:CDNP1 tetramer for use with cats.
<400> 28 Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met Gly Pro Gly Arg Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met Gly Pro Gly Arg Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met Gly Pro Gly Arg Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met <210> 29 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:CDNP2 monomer for use with cats.
<400> 29 Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr <210> 30 <211> 164 <212> PRT
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence:CDNP2 tetramer for use with cats.
<400> 30 Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr Gly Pro Gly Arg Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr Gly Pro Gly Arg Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr Gly Pro Gly Arg Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr <210> 31 <211> 20 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 2, 5, 6, 9, 12, 13, 16, 17, 19 and 20 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:6.
<400> 31 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 32 <211> 41 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16, 19, 20, 23, 24, 26, 27, 30, 31, 34, 37, 38 and 41 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:11.
<400> 32 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 33 <211> 37 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16, 19, 20, 23, 24, 26, 27, 30, 31, 34 and 37 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:15.
<400> 33 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 34 <211> 40 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16, 19, 20, 23, 24, 26, 27, 30, 31, 34, 37 and 38 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:19.
<400> 34 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 35 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16, 19, 20, 23, 24, 26, 27, 30, 31, 34, 37 and 38 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:23 or 27.

<400> 35 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 36 <211> 20 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17 and 19 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:9.
<400> 36 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 37 <211> 41 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39 and 41 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:13.
<400> 37 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 38 <211> 37 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35 and 37 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:17.
<400> 38 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 39 <211> 40 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37 and 39 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:21.
<400> 39 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 40 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37 and 38 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ
ID N0:25.
<400> 40 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 ~ 5 10 15 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa <210> 41 <211> 38 <212> PRT
<213> Artificial Sequence <220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35 and 37 are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and hydrophobic representation as exemplified by SEQ ID NO:29.
<400> 41 Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
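
As a hedged cross-check of the listing above, the following Python sketch translates the coding region of SEQ ID NO:1 (positions 10 to 285, per its <222> field) with the standard genetic code and compares the result with four copies of SEQ ID NO:3 joined by the SEQ ID NO:8 spacer, which is how SEQ ID NO:2 reads in the listing. The names `translate` and `tetramer` are illustrative assumptions and do not appear in the specification.

```python
# Standard genetic code, with codons enumerated in t/c/a/g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip([a + b + c for a in BASES for b in BASES for c in BASES], AMINO)
)

# SEQ ID NO:1 as given in the listing (spaces removed after joining).
SEQ1_DNA = "".join([
    "gatccaaca atg ctt gaa gag ctg ttc aaa aag atg acc gag tgg atc gag",
    "aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctg ttc",
    "aaa aag atg acc gag tgg atc gag aaa gtg atc aaa acg atg gga cca",
    "ggc agg atg ctc gag gag ctg ttc aaa aag atg acc gag tgg atc gag",
    "aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctc ttt",
    "aaa aaa atg act gag tgg atc gaa aaa gtg atc aaa act atg taggaatt",
]).replace(" ", "")

ASP1_MONOMER = "MLEELFKKMTEWIEKVIKTM"  # SEQ ID NO:3, one-letter code
HELIX_BREAKER = "GPGR"                  # SEQ ID NO:8, one-letter code


def translate(dna, start, end):
    """Translate a coding region given 1-based inclusive coordinates."""
    cds = dna[start - 1:end]
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def tetramer(monomer, linker=HELIX_BREAKER, units=4):
    """Join `units` copies of a monomer with the linker between them."""
    return linker.join([monomer] * units)


protein = translate(SEQ1_DNA, 10, 285)
print(len(SEQ1_DNA), len(protein))         # 293 92
print(protein == tetramer(ASP1_MONOMER))   # True
print(SEQ1_DNA[285:288])                   # tag (stop codon after the CDS)
```

The same `tetramer` helper can be applied to the other monomers in the listing to compare them with their tetramer entries.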

Claims

WHAT IS CLAIMED IS:

1. A plant wherein said plant is a transgenic plant comprising a heterologous gene selected from the group consisting of (i) a gene which encodes a protein comprising an amphiphilic .alpha.-helix and (ii) a gene which encodes a protein comprising a .beta.-pleated sheet, wherein said transgenic plant produces more protein per tissue weight than does said plant when said plant is not a transgenic plant and wherein said tissue is root, tuber, seed, leaf, stem, edible portion, flower or whole plant.

2. The plant of claim 1 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

3. The plant of claim 2 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

4. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

5. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

6. The plant of claim 2 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

7. The plant of claim 6 wherein said helix breaker is SEQ ID NO:8.

8. The plant of claim 6 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of .beta.-pleated sheet.

9. The plant of claim 7 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.

10. A plant wherein said plant is a transgenic plant comprising a heterologous gene which encodes a protein comprising a combination of amphiphilic .alpha.-helix and .beta.-pleated sheet and wherein said transgenic plant produces more protein per tissue weight than does said plant when said plant is not a transgenic plant wherein said tissue is root, tuber, seed, leaf, stem, edible portion, flower or whole plant.

11. The plant of claim 10 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

12. The plant of claim 11 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

13. The plant of claim 11 wherein X is SEQ ID NO:8.

14. A gene encoding a protein selected from the group consisting of (i) a protein comprising an amphiphilic .alpha.-helical sequence wherein said sequence comprises (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and (ii) a protein comprising a .beta.-pleated sheet sequence wherein said sequence comprises (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

15. The gene of claim 14 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
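
The residue patterns recited in claims 14 and 15 can be read as regular expressions over one-letter amino acid codes. The following is a minimal sketch under that reading, not part of the claims: the function names and the test sequences are hypothetical, and the one-letter codes are the standard abbreviations for the residues listed in claim 15. Because glycine and threonine appear in both the h and H groups, a residue can satisfy either role.

```python
import re

# Residue classes from claim 15, written as standard one-letter codes.
# H (hydrophilic): arginine, glutamate, glycine, histidine, lysine, threonine
# h (hydrophobic): glycine, isoleucine, leucine, methionine, phenylalanine,
#                  threonine, tryptophan, valine
HYDROPHILIC = "[REGHKT]"
HYDROPHOBIC = "[GILMFTWV]"

# (H)u((h)x(H)y)z(h)w with u in {0,1}, x and y in {1,2} (free to vary per
# repeat), z >= 5, w in {0,1,2}.
ALPHA_PATTERN = re.compile(
    HYDROPHILIC + "?"
    + "(?:" + HYDROPHOBIC + "{1,2}" + HYDROPHILIC + "{1,2}){5,}"
    + HYDROPHOBIC + "{0,2}"
)

# (H)r(hH)s(h)t with r in {0,1}, s >= 1, t in {0,1,2}.
BETA_PATTERN = re.compile(
    HYDROPHILIC + "?"
    + "(?:" + HYDROPHOBIC + HYDROPHILIC + ")+"
    + HYDROPHOBIC + "{0,2}"
)


def fits_alpha_formula(sequence: str) -> bool:
    """True if the whole sequence fits the claimed alpha-helical formula."""
    return ALPHA_PATTERN.fullmatch(sequence) is not None


def fits_beta_formula(sequence: str) -> bool:
    """True if the whole sequence fits the claimed beta-sheet formula."""
    return BETA_PATTERN.fullmatch(sequence) is not None
```

A sequence fitting a formula says nothing about whether the peptide actually folds as a helix or sheet; the check is purely about the residue pattern. Note also that a strictly alternating sequence of five or more hH pairs fits both formulas.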

16. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

17. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

18. The gene of claim 14 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helical sequence wherein each unit of amphiphilic .alpha.-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helical sequence is separated from any neighboring unit of amphiphilic .alpha.-helical sequence by a helix breaker and wherein any unit of amphiphilic .alpha.-helical sequence can be different from any other unit of amphiphilic .alpha.-helical sequence and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet sequence wherein each unit of .beta.-pleated sheet sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet sequence is separated from any neighboring unit of .beta.-pleated sheet sequence by a helix breaker and wherein each unit of .beta.-pleated sheet sequence can be different from any other unit of .beta.-pleated sheet sequence and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

19. The gene of claim 18 wherein said helix breaker is SEQ ID NO:8.

20. The gene of claim 18 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic .alpha.-helical sequence and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of .beta.-pleated sheet sequence.
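
The multi-unit arrangement of claims 18 and 20 (several .alpha.-helical units, each separated from its neighbor by a helix breaker, with 4 to 8 units in total) can be sketched as a split-and-check routine. This is an illustrative sketch only. The breaker string "PP" is a hypothetical placeholder chosen because proline is in neither residue group; it is not SEQ ID NO:8, whose sequence is not reproduced in this section. The function names, the one-letter residue classes (taken from the residue lists of claim 15) and the example unit are likewise assumptions.

```python
import re

# One amphiphilic alpha-helical unit, (H)u((h)x(H)y)z(h)w, as a regex over
# one-letter codes (H = R/E/G/H/K/T, h = G/I/L/M/F/T/W/V), with z >= 5.
ALPHA_UNIT = re.compile(
    r"[REGHKT]?(?:[GILMFTWV]{1,2}[REGHKT]{1,2}){5,}[GILMFTWV]{0,2}"
)

# HYPOTHETICAL placeholder breaker, NOT the sequence of SEQ ID NO:8.
PLACEHOLDER_BREAKER = "PP"


def count_alpha_units(sequence: str, breaker: str = PLACEHOLDER_BREAKER) -> int:
    """Split on the breaker and return the number of units, or 0 if any
    piece between breakers fails the single-unit formula."""
    units = sequence.split(breaker)
    if all(ALPHA_UNIT.fullmatch(unit) for unit in units):
        return len(units)
    return 0


def has_four_to_eight_units(sequence: str, breaker: str = PLACEHOLDER_BREAKER) -> bool:
    """True if the sequence consists of 4 to 8 valid units joined by breakers."""
    return 4 <= count_alpha_units(sequence, breaker) <= 8
```

For example, joining four copies of the hypothetical unit "LKLKLKLKLK" with the placeholder breaker gives a sequence that passes `has_four_to_eight_units`, while three or nine copies do not.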

21. The gene of claim 19 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.

22. A gene which encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

23. The gene of claim 22 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

24. The gene of claim 22 wherein X is SEQ ID NO:8.

25. A protein selected from the group consisting of (i) a protein comprising an amphiphilic .alpha.-helical sequence wherein said sequence comprises (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and (ii) a protein comprising a .beta.-pleated sheet sequence wherein said sequence comprises (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

26. The protein of claim 25 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

27. The protein of claim 25 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

28. The protein of claim 25 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

29. The protein of claim 25 wherein if said protein is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helical sequence wherein each unit of amphiphilic .alpha.-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helical sequence is separated from any neighboring unit of amphiphilic .alpha.-helical sequence by a helix breaker and wherein any unit of amphiphilic .alpha.-helical sequence can be different from any other unit of amphiphilic .alpha.-helical sequence and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and if said protein is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet sequence wherein each unit of .beta.-pleated sheet sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet sequence is separated from any neighboring unit of .beta.-pleated sheet sequence by a helix breaker and wherein each unit of .beta.-pleated sheet sequence can be different from any other unit of .beta.-pleated sheet sequence and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

30. The protein of claim 29 wherein said helix breaker is SEQ ID NO:8.

31. The protein of claim 29 wherein if said protein is selected from (i) then said protein comprises 4 to 8 units of amphiphilic .alpha.-helical sequence and if said protein is selected from (ii) then said protein comprises from 4 to 8 units of .beta.-pleated sheet sequence.

32. The protein of claim 30 wherein if said protein is selected from (i) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said protein is selected from (ii) then said protein comprises a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.

33. A protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0,1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

34. The protein of claim 33 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

35. The protein of claim 33 wherein X is SEQ ID NO:8.

36. A plant cell wherein said plant cell is a transgenic plant cell comprising a heterologous gene selected from the group consisting of (i) a gene which encodes a protein comprising an amphiphilic .alpha.-helix and wherein said transgenic plant cell produces more protein per gram of plant cell than does said plant cell when said plant cell is not a transgenic plant cell and (ii) a gene which encodes a protein comprising a .beta.-pleated sheet and wherein said transgenic plant cell produces more protein per gram of cells than does said plant cell when said plant cell is not a transgenic plant cell.

37. The plant cell of claim 36 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

38. The plant cell of claim 37 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

39. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

40. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

41. The plant cell of claim 37 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and if said gene is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

42. The plant cell of claim 41 wherein said helix breaker is SEQ ID NO:8.

43. The plant cell of claim 41 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of .beta.-pleated sheet.

44. The plant cell of claim 42 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.

45. A plant cell wherein said plant cell is a transgenic plant cell comprising a heterologous gene which encodes a protein comprising a combination of amphiphilic .alpha.-helix and .beta.-pleated sheet and wherein said transgenic plant cell produces more protein per gram of cells than does said plant cell when said plant cell is not a transgenic plant cell.

46. The plant cell of claim 45 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

47. The plant cell of claim 46 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

48. The plant cell of claim 46 wherein X is SEQ ID NO:8.

49. A method for increasing production of a protein or a nonprotein product in a plant of a specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic .alpha.-helical sequence and (ii) a gene which encodes a protein which comprises a .beta.-pleated sheet sequence;
b) growing said transgenic cell or cells to produce a transgenic plant; and
c) growing said transgenic plant.

50. The method of claim 49 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

51. The method of claim 50 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

52. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

53. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

54. The method of claim 50 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

55. The method of claim 54 wherein said helix breaker is SEQ ID NO:8.

56. The method of claim 54 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of .beta.-pleated sheet.

57. The method of claim 55 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.

58. A method for increasing production of a protein or a nonprotein product in a plant of a specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene encodes a protein which comprises a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet sequence;
b) growing said transgenic cell or cells to produce a transgenic plant; and
c) growing said transgenic plant.

59. The method of claim 58 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

60. The method of claim 59 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

61. The method of claim 59 wherein X is SEQ ID NO:8.

62. A method for increasing protein production in plant cells of a specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic .alpha.-helical sequence and (ii) a gene which encodes a protein which comprises a .beta.-pleated sheet sequence; and
b) growing said transgenic cell or cells in culture or in a bioreactor.

63. The method of claim 62 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

64. The method of claim 63 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.
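As a reading aid only, the two unit formulas of claim 63, restricted to the residue groups of claim 64, can be expressed as regular expressions. The Python sketch below assumes one-letter amino acid codes. The names `HYDROPHOBIC`, `HYDROPHILIC`, `ALPHA_UNIT`, `BETA_UNIT` and `classify` are hypothetical, and the test peptides are invented for illustration rather than taken from the sequence listing.

```python
import re

# Residue groups of claim 64 in one-letter code (an assumed notation).
# Glycine (G) and threonine (T) appear in both groups, as in the claim.
HYDROPHOBIC = "GILMFTWV"  # h: Gly, Ile, Leu, Met, Phe, Thr, Trp, Val
HYDROPHILIC = "REGHKT"    # H: Arg, Glu, Gly, His, Lys, Thr

h = "[" + HYDROPHOBIC + "]"
H = "[" + HYDROPHILIC + "]"

# (H)u((h)x(H)y)z(h)w with u in {0,1}, x and y in {1,2}, z >= 5, w in {0,1,2}
ALPHA_UNIT = re.compile(H + "?(?:" + h + "{1,2}" + H + "{1,2}){5,}" + h + "{0,2}")

# (H)r(hH)s(h)t with r in {0,1}, s >= 1, t in {0,1,2}
BETA_UNIT = re.compile(H + "?(?:" + h + H + ")+" + h + "{0,2}")


def classify(peptide):
    """Return which of the two unit formulas the whole peptide satisfies."""
    return {
        "alpha": ALPHA_UNIT.fullmatch(peptide) is not None,
        "beta": BETA_UNIT.fullmatch(peptide) is not None,
    }


# Made-up example peptides (not from the sequence listing).
print(classify("LLKKLLKKLLKKLLKKLLKK"))
print(classify("LKLK"))
```

Under this reading the two formulas overlap: a strictly alternating peptide such as LKLKLKLKLK satisfies the first formula (x = y = 1, z = 5) and also the second (s = 5).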

65. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

66. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

67. The method of claim 63 wherein if said gene is selected from (i) then said gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then said gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

68. The method of claim 67 wherein said helix breaker is SEQ ID NO:8.

69. The method of claim 67 wherein if said gene is selected from (i) then said gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected from (ii) then said gene encodes from 4 to 8 units of .beta.-pleated sheet.
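The unit count recited in claim 69 can be illustrated by splitting a sequence at each helix breaker and counting the pieces. The Python sketch below assumes one-letter amino acid codes and a breaker that does not occur inside any unit. `BREAKER` and `UNIT` are hypothetical placeholders, not sequences from the sequence listing, and the function names are invented for illustration. The sketch checks only the number of units, not whether each unit satisfies the formulas of claim 67.

```python
def split_units(protein, breaker):
    """Split a protein sequence into units at each occurrence of the breaker."""
    return [unit for unit in protein.split(breaker) if unit]


def has_four_to_eight_units(protein, breaker):
    """Check the 4-to-8 unit count of claim 69 (unit formulas are not checked)."""
    return 4 <= len(split_units(protein, breaker)) <= 8


# Hypothetical placeholder values, NOT sequences from the sequence listing.
BREAKER = "GPGR"
UNIT = "LKLKLKLKLK"

protein = BREAKER.join([UNIT] * 4)
print(split_units(protein, BREAKER))
print(has_four_to_eight_units(protein, BREAKER))
```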

70. The method of claim 68 wherein if said gene is selected from (i) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.

71. A method for increasing production of a protein or a nonprotein product in plant cells of a specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to produce a transgenic cell or transgenic cells wherein said gene encodes a protein which comprises a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet sequence; and
b) growing said transgenic cell or cells in culture or in a bioreactor.

72. The method of claim 71 wherein said gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

73. The method of claim 72 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

74. The method of claim 72 wherein X is SEQ ID NO:8.

75. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant of a specified species wherein said first protein is encoded by a first gene with which said plant is transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene is selected from the group consisting of (i) a heterologous gene which encodes a protein which comprises an amphiphilic .alpha.-helical sequence and (ii) a heterologous gene which encodes a protein which comprises a .beta.-pleated sheet sequence;
(d) growing said transgenic cell or cells to produce a transgenic plant comprising both said first gene and said second gene; and
(e) growing said transgenic plant,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.
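The conditional character of steps (b) and (c) of claim 75 can be illustrated with a short Python sketch. The function name and its boolean parameters are hypothetical. The sketch only lists which transformation steps remain to be performed; as the claim states, the two steps may be carried out in either order or simultaneously.

```python
def required_transformations(has_first_gene, has_second_gene):
    """List the transformation steps of claim 75 that still need to be performed.

    has_first_gene: the cell (or an ancestor) already carries the first gene.
    has_second_gene: the cell (or an ancestor) already carries the second gene.
    """
    steps = []
    if not has_first_gene:
        steps.append("b")  # transform with the first gene
    if not has_second_gene:
        steps.append("c")  # transform with the second gene
    return steps


# A cell line already transformed with the first gene needs only step (c).
print(required_transformations(has_first_gene=True, has_second_gene=False))
```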

76. The method of claim 75 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

77. The method of claim 76 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

78. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

79. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

80. The method of claim 76 wherein if said second gene is selected from (i) then said second gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y), and wherein if said second gene is selected from (ii) then said second gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

81. The method of claim 80 wherein said helix breaker is SEQ ID NO:8.

82. The method of claim 80 wherein if said second gene is selected from (i) then said second gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said second gene is selected from (ii) then said second gene encodes from 4 to 8 units of .beta.-pleated sheet.

83. The method of claim 81 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.

84. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant of a specified species wherein said protein is encoded by a first gene with which said plant is transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene encodes a protein which comprises a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet sequence and wherein said second gene does not naturally occur in said plant;
(d) growing said transgenic cell or cells to produce a transgenic plant comprising both said first gene and said second gene; and
(e) growing said transgenic plant,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.

85. The method of claim 84 wherein said second gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

86. The method of claim 85 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

87. The method of claim 85 wherein X is SEQ ID NO:8.

88. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant cell or plant cells of a specified species wherein said first protein is encoded by a first gene with which said plant cell or plant cells are transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said cell or cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene is selected from the group consisting of (i) a gene which encodes a protein which comprises an amphiphilic .alpha.-helical sequence and wherein said gene does not naturally occur in said plant and (ii) a gene which encodes a protein which comprises a .beta.-pleated sheet sequence and wherein said gene does not naturally occur in said plant; and
(d) growing said transgenic cell or cells in culture or in a bioreactor,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.

89. The method of claim 88 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be the same as the value of y in another ((h)x(H)y), and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.

90. The method of claim 89 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

91. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.

92. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.

93. The method of claim 89 wherein if said second gene is selected from (i) then said second gene encodes multiple units of said amphiphilic .alpha.-helix wherein each unit of amphiphilic .alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and wherein any unit of amphiphilic .alpha.-helix can be different from any other unit of amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y), and wherein if said second gene is selected from (ii) then said second gene encodes multiple units of said .beta.-pleated sheet wherein each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated from any neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any unit of .beta.-pleated sheet can be different from any other unit of .beta.-pleated sheet and wherein a value of r in one unit of .beta.-pleated sheet need not be the same as a value of r in another unit of .beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet need not be the same as a value of s in another unit of .beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet need not be the same as a value of t in another unit of .beta.-pleated sheet.

94. The method of claim 93 wherein said helix breaker is SEQ ID NO:8.

95. The method of claim 93 wherein if said second gene is selected from (i) then said second gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said second gene is selected from (ii) then said second gene encodes from 4 to 8 units of .beta.-pleated sheet.

96. The method of claim 94 wherein if said second gene is selected from (i) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said second gene encodes a protein comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.

97. A method of increasing production of a first protein, or of a nonprotein product in which case said first protein catalyzes a step in a synthesis of said nonprotein product, in a plant cell or plant cells of a specified species wherein said protein is encoded by a first gene with which said plant cell or plant cells are transformed or an ancestor of said plant had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if said plant cell or plant cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if said cell or cells or an ancestor of said cell or cells had not previously been transformed with said second gene, to form a transgenic cell or transgenic cells wherein said second gene encodes a protein which comprises a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet sequence and wherein said gene does not naturally occur in said plant; and
(d) growing said transgenic cell or cells in culture or in a bioreactor,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can be performed before step (c), step (c) can be performed before step (b), or steps (b) and (c) can be performed simultaneously.

98. The method of claim 97 wherein said second gene encodes a protein comprising a sequence of units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino acid residue and can vary along said protein, h is a hydrophobic amino acid residue and can vary along said protein, X is any amino acid and may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any whole number greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t equals 0, 1 or 2, m equals any whole number including 0, and p equals any whole number greater than 0 and wherein any one unit within said protein can differ from any other unit within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit and wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in another ((h)x(H)y).

99. The method of claim 98 wherein h is selected from the group consisting of glycine, isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and valine and wherein H is selected from the group consisting of arginine, glutamate, glycine, histidine, lysine and threonine.

100. The method of claim 98 wherein X is SEQ ID NO:8.