WO2001064891A2 - Nucleotide sequences for embryo and/or endosperm specific expression in plants - Google Patents

Nucleotide sequences for embryo and/or endosperm specific expression in plants

Info

Publication number
WO2001064891A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
vector
expression
embryo
plant
Prior art date
Application number
PCT/US2001/006297
Other languages
French (fr)
Other versions
WO2001064891A9 (en)
WO2001064891A3 (en)
Inventor
Ueli Grossniklaus
Jean-Philippe Vielle-Calzada
Original Assignee
Cold Spring Harbor Laboratory
Priority date
Filing date
Publication date
Application filed by Cold Spring Harbor Laboratory
Priority to AU2001241818A (patent AU2001241818A1)
Publication of WO2001064891A2
Publication of WO2001064891A3
Publication of WO2001064891A9

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/415 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216 Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8222 Developmentally regulated expression systems, tissue, organ specific, temporal or spatial regulation
    • C12N15/823 Reproductive tissue-specific promoters
    • C12N15/8234 Seed-specific, e.g. embryo, endosperm

Definitions

  • This invention relates generally to the field of plant molecular biology. More specifically, this invention relates to the characterization of novel nucleotide sequences which are specific to early seed development in Arabidopsis. Regulatory regions of these sequences will provide for the temporal and spatial expression of foreign gene products in the embryo and/or endosperm of transgenic plants.
  • Gene expression encompasses a number of steps originating from the DNA template, ultimately to the final protein or protein product. Control and regulation of gene expression can occur through numerous mechanisms. The initiation of transcription of a gene is generally thought of as the predominant control of gene expression. Transcriptional controls (or promoters) are generally short sequences imbedded in the 5'-flanking or upstream region of a transcribed gene. There are promoter sequences which affect gene expression in response to environmental stimuli, nutrient availability, or adverse conditions including heat shock, anaerobiosis or the presence of heavy metals. There are also DNA sequences which control gene expression during development, or in a tissue, or in an organ specific fashion. Promoters contain the signals for RNA polymerase to begin transcription so that protein synthesis can proceed.
  • the entire region containing all the ancillary elements affecting regulation or absolute levels of transcription may be comprised of less than 100 base pairs or as much as 1 kilobase pair.
  • promoters which are active in plant cells have been described in the literature. These include nopaline synthase (NOS) and octopine synthase (OCS) promoters (which are carried on tumor inducing plasmids of Agrobacterium tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S promoters, the light-inducible promoter from the small subunit of ribulose bisphosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide), and the sucrose synthase promoter. All of these promoters have been used to create various types of DNA constructs which have been expressed in plants. (See for example PCT publication WO84/02913 Rogers, et al).
  • tissue specific promoters: these promoters can activate gene sequences only in specific tissues where expression of a protein is desired. These types of promoters can also be temporally regulated, i.e. active only during a particular period of development.
  • organ-specific promoters: the E8 promoter is transcriptionally activated only during tomato fruit ripening, and can be used to target gene expression in ripening tomato fruit (Deikman and Fischer, EMBO J. (1988) 7:3315; Giovannoni et al., The Plant Cell (1989) 1:53).
  • the activity of the E8 promoter is not limited to tomato fruit, but is thought to be compatible with any system wherein ethylene activates biological processes.
  • the lipoxygenase ("the LOX gene") promoter is a fruit specific promoter.
  • Leaf specific promoters include the AS-1 promoter disclosed in US Patent 5,256,558 to Coruzzi and the RBCS-3A promoter, isolated from the pea RBCS-3A gene, disclosed in US Patent 5,023,179 to Lam et al.
  • Root specific promoters include the CaMV 35S promoter disclosed in US Patent 391,725 to Coruzzi et al.; the RB7 promoter disclosed in US Patent 5,459,252 to Conkling et al.; and the promoter isolated from Brassica napus disclosed in US Patent 5,401,836 to Baszczynski et al., which give root specific expression.
  • tissue specific promoters are those directed to the seed or embryo. Seed and embryo development is a particularly vulnerable period, and one which can be manipulated to modulate seed size, to increase yield, to provide plants with fruits without seeds, to induce autonomous (parthenogenetic) development of embryo and/or endosperm, or simply to learn about spatial and temporal gene regulation during this all-important period.
  • These types of promoters include maternal tissue promoters such as seed coat, pericarp and ovule.
  • One such promoter active in endosperm development is the promoter from the α' subunit of the soybean β-conglycinin gene [Walling et al., Proc. Natl. Acad. Sci. USA 83:2123-2127 (1986)] which is expressed early in seed development in the endosperm and the embryo.
  • Further seed specific promoters include the napin promoter described in United States Patent 5,110,728 to Calgene, which describes and discloses the use of the napin promoter in directing the expression to seed tissue of an acyl carrier protein to enhance seed oil production; the DC3 promoter from carrots, which is early to mid embryo specific and is disclosed at Plant Physiology, Oct. 1992, 100(2) p. 576-581, "Hormonal and Environmental Regulation of the Carrot Lea-class Gene Dc 3", and Plant Mol. Biol., April 1992, 18(6) p.
  • the present invention comprises the isolation and characterization of novel regulatory nucleotide sequences which provide for expression of genes during ovule and early seed development in Arabidopsis.
  • the invention also comprises the spatial and temporal expression of nucleotide sequences using these regulatory elements.
  • the invention further comprises expression cassettes comprising the promoters of the invention, a structural gene, the expression of which is desired in plant cells, and a polyadenylation or stop signal.
  • the expression cassette can be encompassed in plasmid or viral vectors for transformation of plant protoplast cells.
  • the invention also encompasses transformed bacterial cells for maintenance and replication of the vector, as well as transformed monocot or dicot cells and ultimately transgenic plants, and breeding materials developed from the transgenic plants.
  • the invention comprises the identification and location of novel nucleic acid sequences which are expressed or which regulate expression to activate and de-activate genes during embryo and/or endosperm development. These sequences may be used as markers for mapping, for manipulation of gene expression pre- and post-fertilization, and for assays and protocols for manipulation of paternal and maternal silencing during embryo development.
  • nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. Numeric ranges are inclusive of the numbers defining the range and include each integer within the defined range. Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes. Unless otherwise provided for, software, electrical, and electronics terms as used herein are as defined in The New IEEE Standard Dictionary of Electrical and Electronics Terms (5th edition, 1993). The terms defined below are more fully defined by reference to the specification as a whole.
  • amplified is meant the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template.
  • Amplification systems include the polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase systems, transcription-based amplification system (TAS), and strand displacement amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and Applications, D.H. Persing et al., Ed., American Society for Microbiology, Washington, D.C. (1993). The product of amplification is termed an amplicon.
  • antisense orientation includes reference to a duplex polynucleotide sequence that is operably linked to a promoter in an orientation where the antisense strand is transcribed.
  • the antisense strand is sufficiently complementary to an endogenous transcription product such that translation of the endogenous transcription product is often inhibited.
  • chromosomal region includes reference to a length of a chromosome that may be measured by reference to the linear segment of DNA that it comprises.
  • the chromosomal region can be defined by reference to two unique DNA sequences, i.e., markers.
  • conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation.
  • Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid.
  • each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule.
  • each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence and is within the scope of the present invention.
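The idea of silent variation described above can be illustrated with a short sketch. It assumes the standard ("universal") genetic code written in the RNA alphabet; the helper names `synonymous_codons` and `count_silent_variants` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard ("universal") genetic code in the conventional UCAG table order.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return every codon (sorted) that encodes the same residue as `codon`."""
    residue = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def count_silent_variants(mrna):
    """Count the distinct coding sequences that encode the same polypeptide.

    Hypothetical helper: multiplies the number of synonymous codons at each
    position; trailing bases that do not fill a codon are ignored.
    """
    total = 1
    usable = len(mrna) - len(mrna) % 3
    for i in range(0, usable, 3):
        total *= len(synonymous_codons(mrna[i:i + 3]))
    return total


if __name__ == "__main__":
    print(synonymous_codons("GCA"))            # the four alanine codons
    print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp
```

In this sketch AUG and UGG each have a single synonymous codon (themselves), so only the alanine position contributes alternatives to the Met-Ala-Trp example.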
  • As to amino acid sequences, one of skill will recognize that an individual substitution, deletion or addition to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid.
  • any number of amino acid residues selected from the group of integers consisting of from 1 to 15 can be so altered.
  • 1, 2, 3, 4, 5, 7, or 10 alterations can be made.
  • Conservatively modified variants typically provide similar biological activity as the unmodified polypeptide sequence from which they are derived.
  • substrate specificity, enzyme activity, or ligand/receptor binding is generally at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the native protein for its native substrate.
  • Conservative substitution tables providing functionally similar amino acids are well known in the art.
  • nucleic acid encoding a protein may comprise non-translated sequences (e.g., introns) within translated regions of the nucleic acid, or may lack such intervening non-translated sequences (e.g., as in cDNA).
  • the information by which a protein is encoded is specified by the use of codons.
  • amino acid sequence is encoded by the nucleic acid using the "universal" genetic code.
  • variants of the universal code such as are present in some plant, animal, and fungal mitochondria, the bacterium Mycoplasma capricolum, or the ciliate Macronucleus, may be used when the nucleic acid is expressed therein.
  • nucleic acid is prepared or altered synthetically, advantage can be taken of known codon preferences of the intended host where the nucleic acid is to be expressed.
  • nucleic acid sequences of the present invention may be expressed in both monocotyledonous and dicotyledonous plant species, sequences can be modified to account for the specific codon preferences and GC content preferences of monocotyledons or dicotyledons as these preferences have been shown to differ (Murray et al. Nucl. Acids Res. 17:477-498 (1989)).
  • the maize preferred codon for a particular amino acid may be derived from known gene sequences from maize. Maize codon usage for 28 genes from maize plants is listed in Table 4 of Murray et al., supra.
  • full-length sequence in reference to a specified polynucleotide or its encoded protein means having the entire amino acid sequence of a native (non-synthetic), endogenous, biologically active form of the specified protein.
  • Methods to determine whether a sequence is full-length are well known in the art including such exemplary techniques as northern or western blots, primer extensions, S1 protection, and ribonuclease protection. See, e.g., Plant Molecular Biology: A Laboratory Manual, Clark, Ed., Springer-Verlag, Berlin (1997). Comparison to known full-length homologous (orthologous and/or paralogous) sequences can also be used to identify full-length sequences of the present invention.
  • consensus sequences typically present at the 5' and 3' untranslated regions of mRNA aid in the identification of a polynucleotide as full-length.
  • the consensus sequence ANNNNAUGG, where the AUG codon (underlined in the original) represents the N-terminal methionine, aids in determining whether the polynucleotide has a complete 5' end.
  • Consensus sequences at the 3' end such as polyadenylation sequences, aid in determining whether the polynucleotide has a complete 3' end.
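A search for the 5' consensus ANNNNAUGG mentioned above can be sketched with a regular expression. This is only an illustration under simple assumptions: N is taken to mean any of A, C, G or U, DNA input is converted to the RNA alphabet, and the function name `find_start_context` is hypothetical.

```python
import re

# A, then any four bases, then the AUG start codon followed by G.
START_CONTEXT = re.compile(r"A[ACGU]{4}AUGG")


def find_start_context(sequence):
    """Return the 0-based index of the A of the first AUG found inside an
    ANNNNAUGG context, or None if the consensus is absent."""
    rna = sequence.upper().replace("T", "U")
    match = START_CONTEXT.search(rna)
    if match is None:
        return None
    return match.start() + 5  # skip the leading A and the four N positions


if __name__ == "__main__":
    print(find_start_context("GGACCGGAUGGCU"))
```

A hit suggests, but does not prove, that the 5' end is complete; the experimental methods listed above remain the stronger evidence.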
  • heterologous in reference to a nucleic acid is a nucleic acid that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention.
  • a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form.
  • a heterologous protein may originate from a foreign species or, if from the same species, is substantially modified from its original form by deliberate human intervention.
  • host cell is meant a cell which contains a vector and supports the replication and/or expression of the vector.
  • Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells.
  • host cells are monocotyledonous or dicotyledonous plant cells.
  • a particularly preferred monocotyledonous host cell is a maize host cell.
  • hybridization complex includes reference to a duplex nucleic acid structure formed by two single-stranded nucleic acid sequences selectively hybridized with each other.
  • introduction in the context of inserting a nucleic acid into a cell, means “transfection” or “transformation” or “transduction” and includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
  • isolated refers to material, such as a nucleic acid or a protein, which is: (1) substantially or essentially free from components that normally accompany or interact with it as found in its naturally occurring environment.
  • the isolated material optionally comprises material not found with the material in its natural environment; or (2) if the material is in its natural environment, the material has been synthetically (non-naturally) altered by deliberate human intervention to a composition and/or placed at a location in the cell (e.g., genome or subcellular organelle) not native to a material found in that environment.
  • the alteration to yield the synthetic material can be performed on the material within or removed from its natural state.
  • a naturally occurring nucleic acid becomes an isolated nucleic acid if it is altered, or if it is transcribed from DNA which has been altered, by means of human intervention performed within the cell from which it originates. See, e.g., Compounds and Methods for Site Directed Mutagenesis in Eukaryotic Cells, Kmiec, U.S. Patent No. 5,565,350; In Vivo Homologous Sequence Targeting in Eukaryotic Cells; Zarling et al, PCT/US93/03868.
  • a naturally occurring nucleic acid e.g., a promoter
  • Nucleic acids which are "isolated" as defined herein, are also referred to as "heterologous" nucleic acids.
  • chromosomal region defined by and including with respect to particular markers includes reference to a contiguous length of a chromosome delimited by and including the stated markers.
  • marker includes reference to a locus on a chromosome that serves to identify a unique position on the chromosome.
  • a "polymorphic marker” includes reference to a marker which appears in multiple forms (alleles) such that different forms of the marker, when they are present in a homologous pair, allow transmission of each of the chromosomes of that pair to be followed.
  • a genotype may be defined by use of one or a plurality of markers.
  • nucleic acid or “nucleotide” includes reference to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).
  • nucleic acid library is meant a collection of isolated DNA or RNA molecules which comprise and substantially represent the entire transcribed fraction of a genome of a specified organism. Construction of exemplary nucleic acid libraries, such as genomic and cDNA libraries, is taught in standard molecular biology references such as Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol. 152, Academic Press, Inc., San Diego, CA (Berger); Sambrook et al., Molecular Cloning - A Laboratory Manual, 2nd ed., Vol. 1-3 (1989); and Current Protocols in Molecular Biology, F.M. Ausubel et al., Eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc.
  • operably linked includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence.
  • operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.
  • plant can include reference to whole plants, plant parts or organs (e.g., leaves, stems, roots, etc.), plant cells, seeds and progeny of same.
  • Plant cell as used herein, further includes, without limitation, cells obtained from or found in: seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores. Plant cells can also be understood to include modified cells, such as protoplasts, obtained from the aforementioned tissues.
  • the class of plants which can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants. Particularly preferred plants include maize, soybean, sunflower, sorghum, canola, wheat, alfalfa, cotton, rice, barley, and millet.
  • polynucleotide includes reference to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s).
  • a polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof.
  • DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art.
  • polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.
  • The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues.
  • the terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
  • the essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids.
  • The terms "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. It will be appreciated, as is well known and as noted above, that polypeptides are not entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of posttranslation events, including natural processing events and events brought about by human manipulation which do not occur naturally.
  • Circular, branched and branched circular polypeptides may be synthesized by non-translation natural processes and by entirely synthetic methods as well. Further, this invention contemplates the use of both the methionine-containing and the methionine-less amino terminal variants of the protein of the invention.
  • promoter includes reference to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription.
  • a "plant promoter” is a promoter capable of initiating transcription in plant cells whether or not its origin is a plant cell. Exemplary plant promoters include, but are not limited to, those that are obtained from plants, plant viruses, and bacteria which comprise genes expressed in plant cells such as Agrobacterium or Rhizobium. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain tissues, such as leaves, roots, or seeds. Such promoters are referred to as “tissue preferred”. Promoters which initiate transcription only in certain tissue are referred to as "tissue specific”.
  • a “cell type” specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in roots or leaves.
  • An “inducible” or “repressible” promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue specific, tissue preferred, cell type specific, and inducible promoters constitute the class of "non-constitutive" promoters.
  • a “constitutive” promoter is a promoter which is active under most environmental conditions.
  • recombinant includes reference to a cell or vector, that has been modified by the introduction of a heterologous nucleic acid or that the cell is derived from a cell so modified.
  • recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under-expressed or not expressed at all as a result of deliberate human intervention.
  • the term "recombinant” as used herein does not encompass the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation, natural transformation/transduction/transposition) such as those occurring without deliberate human intervention.
  • an "expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a host cell.
  • the recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment.
  • the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid to be transcribed, and a promoter.
  • “residue” or “amino acid residue” or “amino acid” are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively “protein”).
  • the amino acid may be a naturally occurring amino acid and, unless otherwise limited, may encompass non-natural analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.
  • selectively hybridizes includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids.
  • Selectively hybridizing sequences typically have about at least 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (i.e., complementary) with each other.
  • stringent conditions or “stringent hybridization conditions” includes reference to conditions under which a probe will hybridize to its target sequence, to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and may be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length.
  • stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
  • Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C.
  • Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to 65°C.
  • Tm can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem., 138:267-284 (1984): Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
  • the Tm is the temperature (under defined ionic strength and pH) at which 50% of the complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH.
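The Tm approximation and the 1°C-per-1%-mismatch rule above can be worked through numerically. The sketch below assumes the commonly cited form of the Meinkoth and Wahl equation, with a leading constant of 81.5°C and a base-10 logarithm; the function names and the example input values are illustrative assumptions, not values taken from the specification.

```python
import math


def melting_temperature(monovalent_molarity, percent_gc,
                        percent_formamide, hybrid_length_bp):
    """Approximate Tm in degrees C:
    81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(monovalent_molarity)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / hybrid_length_bp)


def tm_for_identity(perfect_match_tm, percent_identity):
    """Lower Tm by about 1 degree C for each 1% of mismatch."""
    return perfect_match_tm - (100.0 - percent_identity)


if __name__ == "__main__":
    # Illustrative inputs: 1 M monovalent cation, 50% GC,
    # 50% formamide, 500 bp hybrid.
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(round(tm, 1))                         # 70.5
    print(round(tm_for_identity(tm, 90.0), 1))  # 60.5
```

With these inputs the terms are 81.5 + 0 + 20.5 - 30.5 - 1.0 = 70.5°C, and targeting 90% identity lowers the estimate by 10°C, matching the example in the text.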
  • transgenic plant includes reference to a plant which comprises within its genome a heterologous polynucleotide.
  • the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations.
  • the heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant expression cassette.
  • Transgenic is used herein to include any cell, cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence of heterologous nucleic acid including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic.
  • transgenic does not encompass the alteration of the genome (chromosomal or extra-chromosomal) by conventional plant breeding methods or by naturally occurring events such as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation.
  • vector includes reference to a nucleic acid used in transfection of a host cell and into which can be inserted a polynucleotide. Vectors are often replicons. Expression vectors permit transcription of a nucleic acid inserted therein.
  • reference sequence is a defined sequence used as a basis for sequence comparison.
  • a reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
  • comparison window includes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence may be compared to a reference sequence and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer.
  • Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981); by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol 48:443 (1970); by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci.
  • the BLAST family of programs which can be used for database similarity searches includes: BLASTN for nucleotide query sequences against nucleotide database sequences; BLASTX for nucleotide query sequences against protein database sequences; BLASTP for protein query sequences against protein database sequences; TBLASTN for protein query sequences against nucleotide database sequences; and TBLASTX for nucleotide query sequences against nucleotide database sequences.
  • sequence identity/similarity values refer to the value obtained using the BLAST 2.0 suite of programs using default parameters. Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence.
  • T is referred to as the neighborhood word score threshold (Altschul et al., supra).
  • These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them.
  • the word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased.
  • Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0).
  • a scoring matrix is used to calculate the cumulative score.
  • Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
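The extension rule described above can be illustrated with a minimal sketch of an ungapped, one-directional extension of a seed hit for nucleotide sequences. It is not the BLAST implementation: the function name, the assumption that the seed word matches exactly, and the default values of M, N and X are all chosen here for illustration.

```python
def extend_right(query, subject, q_start, s_start, word_len, M=1, N=-3, X=5):
    """Extend an exact seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts the seed.
    Extension halts when the score drops by X from its maximum, when the
    cumulative score reaches zero or below, or when either sequence ends.
    """
    score = M * word_len          # the seed is assumed to match exactly
    best_score, best_len = score, word_len
    offset = word_len
    while q_start + offset < len(query) and s_start + offset < len(subject):
        if query[q_start + offset] == subject[s_start + offset]:
            score += M            # reward for a matching pair
        else:
            score += N            # penalty for a mismatching pair
        if score > best_score:
            best_score, best_len = score, offset + 1
        if best_score - score >= X or score <= 0:
            break                 # X drop-off reached or score exhausted
        offset += 1
    return best_score, best_len


# Seed "ACGT" at position 0 in both sequences; the last two bases differ.
result = extend_right("ACGTACGTAA", "ACGTACGTCC", 0, 0, 4)
```

In this example the extension keeps the eight matching bases and stops once two mismatches have pulled the score more than X below its maximum.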
  • the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
  • the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)).
  • BLAST smallest sum probability
  • P(N) the smallest sum probability
  • BLAST searches assume that proteins can be modeled as random sequences. However, many real proteins comprise regions of nonrandom sequences which may be homopolymeric tracts, short-period repeats, or regions enriched in one or more amino acids. Such low-complexity regions may be aligned between unrelated proteins even though other regions of the protein are entirely dissimilar.
  • a number of low-complexity filter programs can be employed to reduce such low-complexity alignments. For example, the SEG (Wootton and Federhen, Comput. Chem., 17:149-163 (1993)) and XNU (Claverie and States, Comput. Chem., 17:191-201 (1993)) low-complexity filters can be employed alone or in combination.
  • sequence identity in the context of two nucleic acid or polypeptide sequences includes reference to the residues in the two sequences which are the same when aligned for maximum correspondence over a specified comparison window.
  • When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g. charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences which differ by such conservative substitutions are said to have "sequence similarity" or "similarity".
  • Means for making this adjustment are well-known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1.
  • the scoring of conservative substitutions is calculated, e.g., according to the algorithm of Meyers and Miller, Computer Applic. Biol Sci., 4:11-17 (1988) e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
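The partial-mismatch scoring described above can be sketched as follows. This is a simplified illustration, not the Meyers and Miller algorithm or the PC/GENE implementation: the residue groupings, the fixed partial score of 0.5, and the function name are assumptions made for the example, and the two sequences are assumed to be already aligned without gaps.

```python
# Illustrative groups of amino acids with similar chemical properties
# (an assumption for this sketch, not a grouping taken from the text).
CONSERVATIVE_GROUPS = [
    set("ILVM"),   # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("ST"),     # hydroxyl
    set("NQ"),     # amide
]


def percent_similarity(seq_a, seq_b, partial=0.5):
    """Score identical residues 1, conservative substitutions `partial`
    (between zero and 1), other substitutions 0; return a percentage."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    total = 0.0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            total += 1.0
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            total += partial
    return 100.0 * total / len(seq_a)


# K/R and D/E are conservative, L/L is identical, S/A is non-conservative.
score = percent_similarity("KLDS", "RLEA")
```

Strict identity for the same pair would be 25%; crediting the two conservative substitutions as half matches raises the adjusted value to 50%.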
  • percentage of sequence identity means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
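The calculation defined above can be written out directly. The sketch below assumes the two sequences have already been optimally aligned over the comparison window, with gaps written as "-"; the function name and the gap symbol are illustrative choices.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where both sequences carry the same base
    or residue; the denominator is the total number of window positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matched / len(aligned_a)


# Six aligned positions: four identical, one mismatch, one gap.
identity = percent_identity("ACGT-A", "ACCTGA")
```

Here four of the six window positions match, so the result is approximately 66.7%.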
  • Substantial identity of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70% sequence identity, preferably at least 80%, more preferably at least 90% and most preferably at least 95%, compared to a reference sequence using one of the alignment programs described using standard parameters.
  • Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions. However, nucleic acids which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
  • One indication that two nucleic acid sequences are substantially identical is that the polypeptide which the first nucleic acid encodes is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
  • substantially identical in the context of a peptide indicates that a peptide comprises a sequence with at least 70% sequence identity to a reference sequence, preferably 80%, or preferably 85%, most preferably at least 90% or 95% sequence identity to the reference sequence over a specified comparison window.
  • optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970).
  • an indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide.
  • a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
  • Peptides which are "substantially similar" share sequences as noted above except that residue positions which are not identical may differ by conservative amino acid changes.
  • Figure 1 Silencing of paternally inherited genes during seed development in Arabidopsis .
  • (a) If an ET2612 female is crossed to a wild-type male, GUS expression is detected in the free nuclear endosperm 12 HAP.
  • (b) and (c) F1 seeds of the same cross show more intense GUS expression in embryo and endosperm 48 HAP.
  • (d) If ET2612 is crossed as the male to wild-type plants, GUS expression is not detectable in embryo and free nuclear endosperm 48 HAP.
  • Figure 2 Allele-specific expression profile of PRL during early seed development. (a) Schematic of PRL showing the position of exons (squared boxes), introns (lines) and the SNP in an XhoI site which is present in Col but absent in Ler. (b) Allele-specific PCR analysis using titrated mixtures of genomic DNA, each of which is represented twice. Both PRL alleles are consistently amplified throughout the dilution series. (c) Allele-specific RT-PCR analysis of PRL in reciprocal crosses between Col and Ler. Only transcripts derived from the maternal PRL allele are detected up to 68 HAP. (d) If PRL enhancer detector ET1041 is used as the female in crosses to Col, GUS is detected in the endosperm 36 HAP.
  • If PRL enhancer detector ET1041 is used as the male parent, GUS is detected in a small portion of the chalazal endosperm 92 HAP.
  • Figure 3 Silencing of the paternally inherited EMB30 allele during early embryogenesis.
  • Figure 4 Identification of genes active during early embryo and endosperm formation in Arabidopsis. All lines show GUS expression only when self-pollinated or used as females in reciprocal crosses. (a) ET1041: embryo and endosperm, (b) ET1051: embryo and endosperm, (c) ET2209: endosperm, (d) ET1275: embryo, (e) ET1278: embryo, (f) ET3988: embryo and endosperm, (g) ET3992: embryo and endosperm, (h) ET3536: endosperm, (i) ET2612: embryo and endosperm, (j) ET2567: endosperm, (k) ET1811: embryo and endosperm, (l) ET4336: embryo and endosperm.
  • Transgenic plants have been generated using randomly inserted enhancer trap technology (O'Kane, C, and Gehring, W.J. (1987), "Detection in situ of genomic regulatory elements in Drosophila", Proc. Natl Acad. Sci. USA, 84, 9123-9127; Sundaresan et al., "Patterns of gene action revealed by enhancer trap and gene trap transposable elements", Genes & Development, 9:1797-1810 (1995)), which through screening has identified 10 regions active during this time period. Sequencing of the immediate area has been accomplished to "tag" the site. Using this information and routine molecular biology techniques known in the art several regulatory regions which are closely linked (including or neighboring) may be identified active during this time period. The novel Arabidopsis sequences can be used as markers to identify genes and/or regulatory regions as well as markers for mapping.
  • Organ-specific promoters, such as these for embryo or endosperm, appropriate for a desired target organ can be isolated using known procedures. These control sequences are generally associated with genes uniquely expressed in the desired organ. In a typical higher plant, each organ has thousands of mRNAs that are absent from other organ systems (reviewed in Goldberg, Phil. Trans. R. Soc. London (1986) B314:343). mRNAs are first isolated to obtain suitable probes for retrieval of the appropriate genomic sequence which retains the presence of the natively associated control sequences. An example of the use of techniques to obtain the cDNA associated with mRNA specific to avocado fruit is found in Christoffersen et al., Plant Molecular Biology (1984) 3:385.
  • mRNA was isolated from ripening avocado fruit and used to make a cDNA library. Clones in the library were identified that hybridized with labeled RNA isolated from ripening avocado fruit, but that did not hybridize with labeled RNAs isolated from unripe avocado fruit. Many of these clones represent mRNAs encoded by genes that are transcriptionally activated at the onset of avocado fruit ripening.
  • Enhancer detector or gene trap transposable elements are integrated randomly into the plant genome to create transgenic individuals, enhancer/gene trap lines (transposants), which are then assayed to identify genes located nearby or within the integration site. From the information gathered, namely the timing and location of expression of the reporter gene present on the enhancer detector transposon and the sequence surrounding the transposon (or sequence tag) (Liu, Y-G., Mitsukawa, N., Oosumi, T., &
  • the enhancer trap lies within the coding region of the detected gene.
  • the transposon insertion is potentially within the regulatory region of the corresponding tagged gene.
  • the putative regulatory regions driving expression in the embryo and/or endosperm from these lines can be isolated and tested using routine molecular techniques as disclosed in the references and teachings herein.
  • One method involves several steps: first, identification of the transcription units neighboring the insertion site; second, isolation of potential cis-regulatory elements; third, making constructs in which these cis-regulatory elements drive a reporter gene, e.g. beta-glucuronidase (GUS) [Jefferson, RA, Kavanagh, TA, and Bevan, MW (1987).
  • GUS fusions: β-glucuronidase as a sensitive and versatile gene fusion marker in higher plants.
  • GFP green fluorescent protein
  • Applicants herein provide the necessary data, i.e. the sequence tag and the expression pattern, that enables one to search for regulatory elements in the vicinity of the Ds element.
  • The enhancer trap reveals the presence of such elements in its vicinity, and the sequence tag localizes the element within the genomic sequence (either within known sequence, or by allowing the isolation of such sequence).
  • Two general cases showing how one would proceed to obtain regulatory regions are outlined below.
  • a) Sequence tag falls into sequenced region of the genome. If the area has been sequenced, the neighborhood of the Ds element is known. The genes in the region have been predicted by software programs or were identified as expressed sequence tags or previously isolated genes.
  • the Ds transposable element could be either inserted within the coding region (or an intron) of a gene or in its immediate promoter region.
  • the first intron of the Arabidopsis thaliana gene coding for elongation factor 1-beta contains an enhancer-like element. Gene 170: 201-6; Sieburth LE, Meyerowitz EM (1997). Molecular dissection of the AGAMOUS control region shows that cis elements for spatial regulation are located intragenically. Plant Cell 9: 355-65; Daniel SG, Becker WM (1995). Transgenic analysis of the 5'- and 3'-flanking regions of the NADH-dependent hydroxypyruvate reductase gene from Cucumis sativus L. Plant Mol Biol 28: 821-36].
  • introns and possibly 3' regions can be cloned in front of a minimal promoter (e.g. the minimal 35S promoter used in the enhancer trap [Sundaresan V, Springer P, Volpe T, Haward S, Jones JD, Dean C, Ma H, Martienssen (1995). Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes Dev 9: 1797-810]) driving a reporter gene. In the case where the Ds element lies in between two predicted genes, a similar analysis may have to be performed for both of them. It is likely that the gene whose promoter is closer to the enhancer trap is the better candidate.
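The heuristic that the gene whose promoter lies closer to the insertion is the better candidate can be sketched as a simple ranking. Everything in this example is hypothetical: the gene names, coordinates, and the simplification that the promoter sits at the 5' end of the annotated gene (the start coordinate on the "+" strand, the end coordinate on the "-" strand).

```python
def promoter_position(gene):
    """Return the 5' end of a gene given as (name, start, end, strand)."""
    _name, start, end, strand = gene
    return start if strand == "+" else end


def rank_candidates(insertion_pos, genes):
    """Sort predicted genes by distance from their 5' end to the insertion."""
    return sorted(
        genes,
        key=lambda gene: abs(promoter_position(gene) - insertion_pos),
    )


# Hypothetical Ds insertion between two predicted genes on one BAC.
genes = [
    ("geneA", 1000, 3000, "+"),   # 5' end at 1000
    ("geneB", 5000, 7000, "-"),   # 5' end at 7000
]
ranked = rank_candidates(6000, genes)
best_candidate = ranked[0][0]
```

Such a ranking only prioritizes which gene to test first; as the text notes, constructs may still have to be made for both neighbors.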
  • the sequence tag can be used to make specific primers which are used to isolate a larger genomic fragment, e.g. by using a PCR-based strategy such as TAIL-PCR [Liu YG, Mitsukawa N, Oosumi T, Whittier RF (1995). Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J 8: 457-63; Grossniklaus, U, Moore, JM, and Gagliano, WB (1998). Molecular and genetic approaches to understanding and engineering apomixis: Arabidopsis as a powerful tool. In Advances in Hybrid Rice Technology.
  • ENHANCER TRAP EXPRESSION Embryo and endosperm.
  • Ds LOCALIZATION Insertion in chromosome 4 (BAC ATF16G20). Insertion probably in the first intron of a gene that contains 6 exons. similarity to rape mRNA (Brassica napus) PIR2:S42651; contains EST gb:T45158, T4498.
  • DEFINITION Putative protein.
  • ACCESSION CAA20464.
  • ENHANCER TRAP EXPRESSION Egg apparatus and embryo.
  • EXPRESSION Egg apparatus, central cell, embryo.
  • Ds LOCALIZATION No perfect match; insertion within an isoflavone reductase homolog (close homology to accession CAB36830.1 in BAC F18A5). Insertion in a first exon, if similar molecular structure. Sequence not previously reported.
  • EXPRESSION Zygote, fertilized central cell, embryo and endosperm.
  • Ds LOCALIZATION No perfect match but extremely close; homology to a gene in chromosome 2 (BAC F14N22; sequence 51955-52135); homologous to Arabidopsis 14-3-3 regulatory protein GF14 mu. Insertion within the promoter if similar molecular structure (upstream of 1st exon).
  • Ref.: Chuong and Ferl (1999) Plant Physiology 120 (4): 1206.
  • DEFINITION Putative protein.
  • ACCESSION AAD23005.
  • SEQUENCE OF TAIL-PCR FRAGMENT Forward reaction (SEQ ID NO:4)
  • EXPRESSION embryo and endosperm.
  • Ds LOCALIZATION Insertion in chromosome 2 (BAC T30B22; seq. 66661-67130). The Ds element is inserted 5' to 3' in the 8th intron of a putative disulfide isomerase (10 exons).
  • DEFINITION Putative protein, disulfide-isomerase homologue.
  • ACCESSION AAC62863
  • Ds LOCALIZATION Insertion within chromosome 2 (BAC T3K9; sequence 59413-59555) within the untranslated 5' region of a transmembrane protein homolog found in Saccharomyces cerevisiae (yeast; accession YDR352W). The gene where the Ds is inserted is upstream of the first exon.
  • EXPRESSION endosperm (perhaps faint in embryo).
  • Ds LOCALIZATION Insertion within chromosome 4 (BAC F2009; sequence 36329-35751), in the first exon of a putative protein without function assigned.
  • DEFINITION Putative protein.
  • ACCESSION CAA16882.
  • EXPRESSION embryo and endosperm.
  • EXPRESSION seed coat, embryo, endosperm.
  • Ds LOCALIZATION No homology at DNA or protein level (no match to ESTs). Ds 5' end present in TAIL sequence.
  • EXPRESSION embryo and endosperm.
  • Ds LOCALIZATION Insertion within chromosome 1. Perfect hit to an EST obtained from germinating seeds of Arabidopsis. ACCESSION: AF162845.
  • the promoter region can be used in a virtually infinite array of transgenic protocols.
  • the selection of structural gene, vector and transformation method are all expedients which may be maximized by those of skill in the art and are intended to be within the scope of the invention.
  • agronomic genes can be expressed in transformed plants. More particularly, plants can be genetically engineered to express various phenotypes of agronomic interest with expression targeted to the embryo or endosperm. Exemplary genes implicated in this regard include, but are not limited to, those categorized below.
  • (A) Plant disease resistance genes. Plant defenses are often activated by specific interaction between the product of a disease resistance gene (R) in the plant and the product of a corresponding avirulence (Avr) gene in the pathogen.
  • a plant variety can be transformed with a cloned resistance gene to engineer plants that are resistant to specific pathogen strains. See, for example, Jones et al., Science 266: 789 (1994) (cloning of the tomato Cf-9 gene for resistance to Cladosporium fulvum); Martin et al., Science 262: 1432 (1993) (tomato Pto gene for resistance to Pseudomonas syringae pv. tomato encodes a protein kinase); Mindrinos et al., Cell 78: 1089 (1994) (Arabidopsis RPS2 gene for resistance to Pseudomonas syringae).
  • (B) A Bacillus thuringiensis protein, a derivative thereof or a synthetic polypeptide modeled thereon. See, for example, Geiser et al., Gene 48: 109 (1986), who disclose the cloning and nucleotide sequence of a Bt δ-endotoxin gene. Moreover, DNA molecules encoding δ-endotoxin genes can be purchased from American Type Culture Collection (Rockville, MD), for example, under ATCC Accession Nos. 40098, 67136, 31995 and 31998.
  • (C) A lectin. See, for example, the disclosure by Van Damme et al., Plant Molec. Biol. 24: 25 (1994), who disclose the nucleotide sequences of several Clivia miniata mannose-binding lectin genes.
  • an insect-specific hormone or pheromone such as an ecdysteroid and juvenile hormone, a variant thereof, a mimetic based thereon, or an antagonist or agonist thereof. See, for example, the disclosure by Hammock et al., Nature 344: 458 (1990), of baculovirus expression of cloned juvenile hormone esterase, an inactivator of juvenile hormone.
  • (G) An insect-specific peptide or neuropeptide which, upon expression, disrupts the physiology of the affected pest. For example, see the disclosures of Regan, J. Biol. Chem. 269: 9 (1994) (expression cloning yields DNA coding for insect diuretic hormone receptor), and Pratt et al., Biochem. Biophys. Res. Comm. 163: 1243 (1989) (an allatostatin is identified in Diploptera punctata). See also U.S. patent No. 5,266,317 to Tomalski et al., who disclose genes encoding insect-specific, paralytic neurotoxins.
  • (H) An insect-specific venom produced in nature by a snake, a wasp, etc. For example, see Pang et al., Gene 116: 165 (1992), for disclosure of heterologous expression in plants of a gene coding for a scorpion insectotoxic peptide.
  • (J) An enzyme involved in the modification, including the post-translational modification, of a biologically active molecule; for example, a glycolytic enzyme, a proteolytic enzyme, a lipolytic enzyme, a nuclease, a cyclase, a transaminase, an esterase, a hydrolase, a phosphatase, a kinase, a phosphorylase, a polymerase, an elastase, a chitinase and a glucanase, whether natural or synthetic.
  • DNA molecules which contain chitinase-encoding sequences can be obtained, for example, from the ATCC under Accession Nos. 39637 and 67152. See also Kramer et al., Insect Biochem. Molec. Biol. 23: 691 (1993), who teach the nucleotide sequence of a cDNA encoding tobacco hornworm chitinase, and Kawalleck et al., Plant Molec. Biol. 21: 673 (1993), who provide the nucleotide sequence of the parsley ubi4-2 polyubiquitin gene.
  • (N) A viral-invasive protein or a complex toxin derived therefrom.
  • the accumulation of viral coat proteins in transformed plant cells imparts resistance to viral infection and/or disease development effected by the virus from which the coat protein gene is derived, as well as by related viruses.
  • Coat protein-mediated resistance has been conferred upon transformed plants against alfalfa mosaic virus, cucumber mosaic virus, tobacco streak virus, potato virus X, potato virus Y, tobacco etch virus, tobacco rattle virus and tobacco mosaic virus. Id.
  • a herbicide that inhibits the growing point or meristem such as an imidazolinone or a sulfonylurea.
  • Exemplary genes in this category code for mutant ALS and AHAS enzyme as described, for example, by Lee et al., EMBO J. 7: 1241 (1988), and Miki et al., Theor. Appl. Genet. 80: 449 (1990), respectively.
  • Glyphosate resistance imparted by mutant 5-enolpyruvylshikimate-3-phosphate synthase (EPSP) and aroA genes, respectively
  • other phosphono compounds such as glufosinate (phosphinothricin acetyl transferase (PAT) and Streptomyces hygroscopicus phosphinothricin acetyl transferase (bar) genes), and pyridinoxy or phenoxy propionic acids and cyclohexones (ACCase inhibitor-encoding genes).
  • The nucleotide sequence of a phosphinothricin-acetyl-transferase gene is provided in European application No. 0 242 246 to Leemans et al. De Greef et al., Bio/Technology 7: 61 (1989), describe the production of transgenic plants that express chimeric bar genes coding for phosphinothricin acetyl transferase activity.
  • Exemplary of genes conferring resistance to phenoxy propionic acids and cyclohexones, such as sethoxydim and haloxyfop, are the Acc1-S1, Acc1-S2 and Acc1-S3 genes described by Marshall et al., Theor. Appl. Genet.
  • a herbicide that inhibits photosynthesis such as a triazine (psbA and gs+ genes) and a benzonitrile (nitrilase gene).
  • Przibilla et al., Plant Cell 3: 169 (1991), describe the transformation of Chlamydomonas with plasmids encoding mutant psbA genes. Nucleotide sequences for nitrilase genes are disclosed in U.S. patent No. 4,810,648 to Stalker, and DNA molecules containing these genes are available under ATCC Accession Nos. 53435, 67441 and 67442. Cloning and expression of DNA coding for a glutathione S-transferase is described by Hayes et al., Biochem. J. 285: 173 (1992).
  • a gene could be introduced that reduces phytate content. In maize, this, for example, could be accomplished, by cloning and then reintroducing DNA associated with the single allele which is responsible for maize mutants characterized by low levels of phytic acid. See Raboy et al, Maydica 35: 383 (1990).
  • (C) Modified carbohydrate composition effected, for example, by transforming plants with a gene coding for an enzyme that alters the branching pattern of starch. See Shiroza et al., J. Bacteriol. 170: 810 (1988) (nucleotide sequence of Streptococcus mutans fructosyltransferase gene), Steinmetz et al., Mol. Gen. Genet. 200: 220 (1985) (nucleotide sequence of Bacillus subtilis levansucrase gene), Pen et al., Bio/Technology 10: 292 (1992) (production of transgenic plants that express Bacillus licheniformis α-amylase), Elliot et al., Plant Molec. Biol. 21: 515 (1993) (nucleotide sequences of tomato invertase genes), Søgaard et al., J. Biol. Chem. 268: 22480 (1993) (site-directed mutagenesis of barley α-amylase gene), and Fisher et al., Plant Physiol. 102: 1045 (1993) (maize endosperm starch branching enzyme II).
  • (C) Genes that promote embryo formation, another component of apomixis (asexual reproduction through seeds), such as the SOMATIC EMBRYOGENESIS RELATED KINASE (SERK) (Schmidt ED, et al. "A leucine-rich repeat containing receptor-like kinase marks somatic plant cells competent to form embryos." Development 124:2049-62 (1997)), or LEAFY COTYLEDON1 (LEC1) (Lotan T, et al. "Arabidopsis LEAFY COTYLEDON1 is sufficient to induce embryo development in vegetative cells." Cell 93:1195-205 (1998)).
  • Toxin genes such as diphtheria toxin A (Nilsson O et al.
  • the promoters disclosed herein may be used in conjunction with naturally occurring flanking coding or transcribed sequences of the desired structural gene/s or with any other coding or transcribed sequence that is critical to structural gene formation and/or function.
  • It may also be desirable to include some intron sequences in the promoter constructs, since the inclusion of intron sequences in the coding region may result in enhanced expression and specificity.
  • regions of one promoter may be joined to regions from a different promoter in order to obtain the desired promoter activity resulting in a chimeric promoter.
  • Synthetic promoters which regulate gene expression may also be used.
  • the expression system may be further optimized by employing supplemental elements such as transcription terminators and/or enhancer elements.
  • an expression cassette or construct should also contain a transcription termination region downstream of the structural gene to provide for efficient termination.
  • the termination region or polyadenylation signal may be obtained from the same gene as the promoter sequence or may be obtained from different genes.
  • Polyadenylation sequences include, but are not limited to, the Agrobacterium octopine synthase signal (Gielen et al., EMBO J. (1984) 3:835-846) or the nopaline synthase signal (Depicker et al., Mol. and Appl. Genet. (1982) 1:561-573).
  • MARKER GENES
  • Recombinant DNA molecules containing any of the DNA sequences and promoters described herein may additionally contain selection marker genes which encode a selection gene product that confers on a plant cell resistance to a chemical agent or physiological stress, or confers a distinguishable phenotypic characteristic on the cells, such that plant cells transformed with the recombinant DNA molecule may be easily selected using a selective agent.
  • One such selection marker gene is neomycin phosphotransferase (NPT II), which confers resistance to kanamycin and the antibiotic G-418.
  • Cells transformed with this selection marker gene may be selected for by assaying for the presence in vitro of phosphorylation of kanamycin using techniques described in the literature or by testing for the presence of the mRNA coding for the NPT II gene by Northern blot analysis in RNA from the tissue of the transformed plant. Polymerase chain reactions are also used to identify the presence of a transgene or expression, using reverse transcriptase PCR amplification to monitor expression and PCR on genomic DNA. Other commonly used selection markers include the ampicillin resistance gene, the tetracycline resistance gene and the hygromycin resistance gene. Transformed plant cells thus selected can be induced to differentiate into plant structures which will eventually yield whole plants. It is to be understood that a selection marker gene may also be native to a plant.
  • TRANSFORMATION
  • a recombinant DNA molecule whether designed to inhibit expression or to provide for expression containing any of the DNA sequences and/or promoters described herein may be integrated into the genome of a plant by first introducing a recombinant DNA molecule into a plant cell by any one of a variety of known methods.
  • the recombinant DNA molecule(s) are inserted into a suitable vector and the vector is used to introduce the recombinant DNA molecule into a plant cell.
  • Cauliflower Mosaic Virus (Howell, S.H., et al, 1980, Science, 208:1265) and gemini viruses (Goodman, R.M., 1981, J. Gen Virol. 54:9) as vectors has been suggested but by far the greatest reported successes have been with Agrobacteria sp. (Horsch, R.B., et al, 1985, Science 227:1229- 1231).
  • hypocotyls (DeBlock, M., et al, 1989, Plant Physiol. 91:694-701), leaf discs (Feldman, K.A., and Marks, M.D., 1986, Plant Sci. 47:63-69), stems (Fry J., et al, 1987, Plant Cell Repts. 6:321-325), cotyledons (Moloney M. M., et al, 1989, Plant Cell Repts. 8:238-242) and embryoids (Neuhaus, G., et al, 1987, Theor. Appl. Genet.
  • a plant cell be transformed with a recombinant DNA molecule containing at least two DNA sequences or be transformed with more than one recombinant DNA molecule.
  • the DNA sequences or recombinant DNA molecules in such embodiments may be physically linked, by being in the same vector, or physically separate on different vectors.
  • a cell may be simultaneously transformed with more than one vector provided that each vector has a unique selection marker gene.
  • a cell may be transformed with more than one vector sequentially allowing an intermediate regeneration step after transformation with the first vector.
  • it may be possible to perform a sexual cross between individual plants or plant lines containing different DNA sequences or recombinant DNA molecules (preferably the DNA sequences or the recombinant molecules are linked or located on the same chromosome), and then to select from the progeny of the cross plants containing both DNA sequences or recombinant DNA molecules.
  • Expression of recombinant DNA molecules containing the DNA sequences and promoters described herein in transformed plant cells may be monitored using Northern blot techniques and/or Southern blot techniques or PCR-based methods known to those of skill in the art.
  • a large number of plants have been shown capable of regeneration from transformed individual cells to obtain transgenic whole plants.
  • regeneration has been shown for dicots as follows: apple, Malus pumila (James et al., Plant Cell Reports (1989) 7:658); blackberry, Rubus, Blackberry/raspberry hybrid, Rubus, red raspberry, Rubus (Graham et al., Plant Cell, Tissue and Organ Culture (1990) 20:35); carrot, Daucus carota (Thomas et al., Plant Cell Reports (1989) 8:354; Wurtele and Bulka, Plant
  • the regenerated plants are transferred to standard soil conditions and cultivated in a conventional manner. After the expression or inhibition cassette is stably incorporated into regenerated transgenic plants, it can be transferred to other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.
  • the transgenic plant provided for commercial production of foreign protein is maize.
  • the biomass of interest is seed.
  • a genetic map can be generated, primarily via conventional Restriction Fragment Length Polymorphisms (RFLP), Polymerase Chain Reaction (PCR) analysis, and Simple Sequence Repeats (SSR) which identifies the approximate chromosomal location of the integrated DNA molecule.
  • map of the integration region can be compared to similar maps for suspect plants, to determine if the latter have a common parentage with the subject plant. Map comparisons would involve hybridizations, RFLP, PCR, SSR and sequencing, all of which are conventional techniques.
  • plants may be self-fertilized, leading to the production of a mixture of seed that consists of, in the simplest case, three types, homozygous (25%), heterozygous (50%) and null (25%) for the inserted gene.
  • Transgenic homozygous parental lines make possible the production of hybrid plants and seeds which will contain a modified protein component.
  • Transgenic homozygous parental lines are maintained with each parent containing either the first or second recombinant DNA sequence operably linked to a promoter. Also incorporated in this scheme are the advantages of growing a hybrid crop, including the combining of more valuable traits and hybrid vigor.
  • double fertilization involves two sperm cells, one of which fuses with the egg cell to form a diploid zygote, while the second fuses with the binucleated central cell to give rise to the triploid primary endosperm nucleus (van Went, J.L. & Willemse, M.T.M. Fertilization. In Embryology of Angiosperms (Johri, B. ed) Springer-Verlag, Berlin, pp 273-318 (1984)). Double fertilization triggers rapid proliferation of the endosperm and slow cell divisions of the zygote, which usually undergoes an asymmetrical division (Bowman, J.L. & Mansfield S.G. (1994).
  • transposants that harbour Ds elements with a uidA reporter gene encoding β-glucuronidase (GUS) by using the system of Sundaresan et al. (Sundaresan, V. et al. Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes & Dev. 9, 1797-1810 (1995)). Screening for genes that act during ovule and early seed development in Arabidopsis (U. Grossniklaus et al., unpublished results), we identified 19 transposants that show GUS expression in the developing embryo and/or endosperm after fertilization.
  • GUS is expressed in the egg and/or central cell and persists for several rounds of cell division in either one or both fertilization products.
  • GUS expression was the result of transcription from only one or both parental alleles.
  • the resulting F1 seeds showed GUS expression in a pattern identical to the one found in developing seeds resulting from self-pollination in all 19 lines (Fig. 1a to 1c).
  • the transposants were used as male parents, GUS expression was absent from all F1 seeds and remained undetectable up to 80 hours after pollination (HAP) (Fig. 1d).
  • mRNA was detected in the developing endosperm (Fig. 1f) as well as the early globular embryo (data not shown) in a pattern identical to the one observed in GUS assays (Fig. 1e).
  • Uniparental expression could be the consequence of transgene silencing specifically affecting the paternally inherited allele.
  • our enhancer detection screen identified a subclass of genes which are transcribed maternally prior to fertilization and are subject to paternal silencing.
  • a third possibility is that the paternally inherited genome remains silent during the first days following double fertilization, and that the activation of the paternal genome occurs after several rounds of cell division in both the embryo and endosperm.
  • RT-PCR reverse transcription polymerase chain reaction
  • PROLIFERA is an MCM2-3-5-like replication licensing factor required for the initiation of DNA replication (Springer, P.S., McCombie, W.R., Sundaresan, V., & Martienssen, R.A. Gene trap tagging of PROLIFERA, an essential MCM2-3- 5-like gene in Arabidopsis. Science 268, 877-880 (1995)).
  • sequences from two insertions show no homology to sequences in public databases or have similarity to genes of unknown function
  • the remaining eight genes encode proteins with a wide variety of functions. Some of them are similar to genes involved in basic cellular functions such as cell cycle regulation, the basal transcription machinery, or the assembly of protein secondary structure. Others encode putative signal transduction proteins or transcription factors that may play regulatory roles during seed development.
  • the Ds insertions are located on at least four out of the five chromosomes of Arabidopsis (Table B).
  • EMB30/GNOM (Shevell, D.E. et al. EMB30 is essential for normal cell division, cell expansion, and cell adhesion in Arabidopsis and encodes a protein that has similarity to Sec7. Cell 77, 1051-1062 (1994); Busch, M., Mayer, U. & Jürgens, G. Molecular analysis of the Arabidopsis pattern formation gene GNOM: gene structure and intragenic complementation. Mol Gen Genet 250, 681-91 (1996)) expression using allele-specific RT-PCR.
  • Embryos homozygous for emb30 are defective in the establishment of the apical-basal axis. In some emb30 embryos the zygote divides almost symmetrically, giving rise to an enlarged apical cell that subsequently forms an abnormal globular embryo (Meinke, D.W. Embryo-lethal mutants of Arabidopsis thaliana: analysis of mutants with a wide range of lethal phases. Theor. Appl. Genet. 69, 543-552 (1985)). Self-pollinated heterozygous emb30/EMB30 individuals produce 25% of aborted embryo lethal seeds suggesting a zygotic requirement.
  • The phenotype of emb30 has been interpreted as being caused by a recessive mutation that affects a gene active during the earliest diploid (sporophytic) phase of embryogenesis (Howell, S.H. Molecular genetics of plant development. Cambridge University Press, New York. 365 pp. (1999)).
  • the emb30 embryo phenotype has been the strongest argument in favor of early genome activation in Arabidopsis.
  • EMB30 encodes a Sec7-like protein that has recently been shown to be essential for auxin transport (Steinmann, T. et al. Coordinated polar localization of auxin efflux carrier PIN1 by GNOM ARF GEF. Science 286, 316-318 (1999)).
  • EMB30 is expressed throughout the plant, its allele-specific pattern of expression has not been investigated during seed development.
  • a SNP in the first exon of EMB30 creates an Eco57I site present in Col but not in Ler (Fig. 3a).
  • EMB30 transcripts from the paternal allele could not be detected (Fig 3b), suggesting that the initial post-fertilization expression of EMB30 is exclusively dependent on transcription from the maternally inherited allele.
  • EMB30 may be expressed in sporophytic tissues such as the silique or seed coat, we could not discard the possibility that transcripts from the EMB30 maternal allele are present in vast excess over the paternally derived transcripts, obscuring their detection. Therefore, we tested for genetic activity of the paternal EMB30 allele by crossing heterozygous emb30/EMB30 plants with wild-type pollen. If the paternally inherited EMB30 allele is active following fertilization, heterozygous emb30^m/EMB30^p embryos should develop normally.
  • emb30^m/EMB30^p embryos include a nearly symmetrical plane of division of the zygote (Fig.
  • emb30 is a paternally rescuable maternal effect mutant and not a purely zygotic embryo lethal.
  • paternal silencing prolongs the functionally haploid phase, which serves to eliminate deleterious mutations (Walbot, V. Sources and consequences of phenotypic and genotypic plasticity in flowering plants. Trends Pl. Sci. 1, 27-32 (1996)), leading to a more stringent selection against such mutations inherited from the mother.
  • this mechanism is imperfect because only a subset of genes are expressed in the female gametophyte and early seed, and some maternal defects can be paternally rescued as illustrated by emb30.
  • The genetic analysis of emb30 indicates that some of the early embryo-lethal mutants, which have been interpreted as affecting zygotically transcribed genes, represent loci that are only transcribed from the maternal allele.
  • Genomic fragments flanking Ds insertions were isolated by TAIL-PCR as described (Liu, Y-G., Mitsukawa, N., Oosumi, T., & Whittier, R.F. (1995). Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert
  • RNA preparation young siliques were harvested in liquid nitrogen at specific time points after pollination.
  • RNA preparation, cDNA synthesis, and PCR amplification of 1/3 of the cDNA were performed as described (Grossniklaus, U., Vielle-Calzada, J-P., Hoeppner, M.A., Gagliano, W.B. Maternal control of embryogenesis by MEDEA, a Polycomb group gene in Arabidopsis. Science 280, 446-450 (1998); Vielle-Calzada, J-P., Thomas, J., Spillane, Ch., Coluccio, A., Hoeppner, M.A & Grossniklaus, U.
  • PCR products were digested with XhoI (PRL) and Eco57I (EMB30) overnight at 37°C.
  • Siliques were dissected with hypodermic needles (Becton-Dickinson, 1 cc insulin syringes) and fixed in FAA (4% formaldehyde, 5% acetic acid, 50% ethanol). Dissected siliques or individual seeds were cleared in Herr's solution (2:2:2:2:1 lactic acid:chloral hydrate:phenol:clove oil:xylene, by volume) and observed on a Leica DMRB microscope under brightfield or Nomarski optics.

Abstract

The invention discloses location and expression patterns of genes associated with early embryo and endosperm development in Arabidopsis thaliana. Methods for identification of regulatory regions for this expression are disclosed as well as transgenic methods for using these regions in expression cassettes. Novel sequence tags for regions associated with enhancer trap insertions are also disclosed.

Description

TITLE: NOVEL NUCLEOTIDE SEQUENCES FOR EMBRYO AND/OR ENDOSPERM SPECIFIC EXPRESSION IN PLANTS
FIELD OF THE INVENTION
This invention relates generally to the field of plant molecular biology. More specifically, this invention relates to the characterization of novel nucleotide sequences which are specific to early seed development in Arabidopsis. Regulatory regions of these sequences will provide for the temporal and spatial expression of foreign gene products in the embryo and/or endosperm of transgenic plants.
BACKGROUND OF THE INVENTION
Gene expression encompasses a number of steps leading from the DNA template ultimately to the final protein or protein product. Control and regulation of gene expression can occur through numerous mechanisms. The initiation of transcription of a gene is generally thought of as the predominant control of gene expression. Transcriptional controls (or promoters) are generally short sequences embedded in the 5'-flanking or upstream region of a transcribed gene. There are promoter sequences which affect gene expression in response to environmental stimuli, nutrient availability, or adverse conditions including heat shock, anaerobiosis or the presence of heavy metals. There are also DNA sequences which control gene expression during development, or in a tissue, or in an organ specific fashion. Promoters contain the signals for RNA polymerase to begin transcription so that protein synthesis can proceed. DNA-binding nuclear proteins interact specifically with these cognate promoter DNA sequences to promote the formation of the transcriptional complex and eventually initiate the gene expression process. The entire region containing all the ancillary elements affecting regulation or absolute levels of transcription may be comprised of less than 100 base pairs or as much as 1 kilobase pair.
A number of promoters which are active in plant cells have been described in the literature. These include nopaline synthase (NOS) and octopine synthase (OCS) promoters (which are carried on tumor inducing plasmids of Agrobacterium tumefaciens), the cauliflower mosaic virus (CaMV) 19S and 35S promoters, the light-inducible promoter from the small subunit of ribulose bisphosphate carboxylase (ssRUBISCO, a very abundant plant polypeptide), and the sucrose synthase promoter. All of these promoters have been used to create various types of DNA constructs which have been expressed in plants. (See for example PCT publication WO84/02913 Rogers, et al).
The most valuable promoters, however, are tissue specific promoters. These promoters can activate gene sequences only in specific tissues where expression of a protein is desired; these types of promoters can also be temporally expressed during a particular period of development. One example is organ-specific promoters. The E8 promoter is only transcriptionally activated during tomato fruit ripening, and can be used to target gene expression in ripening tomato fruit (Deikman and Fischer, EMBO J. (1988) 7:3315; Giovannoni et al., The Plant Cell (1989) 1:53). The activity of the E8 promoter is not limited to tomato fruit, but is thought to be compatible with any system wherein ethylene activates biological processes. Similarly, the lipoxygenase ("the LOX gene") promoter is a fruit specific promoter.
Other fruit specific promoters are the 1.45 promoter fragment disclosed in Bird, et al., Plant Mol. Bio., pp 651-663 (1988) and the polygalacturonase promoter from tomato disclosed in U.S. Patent 5,413,937 to Bridges et al. Leaf specific promoters include the AS-1 promoter disclosed in US Patent 5,256,558 to Coruzzi and the RBCS-3A promoter isolated from the pea RBCS-3A gene disclosed in US Patent 5,023,179 to Lam et al.
Root specific promoters include the CaMV 35S promoter disclosed in US Patent 391,725 to Coruzzi et al; the RB7 promoter disclosed in US patent 5,459,252 to Conkling et al; and the promoter isolated from Brassica napus disclosed in US Patent 5,401,836 to Baszczynski et al., which give root specific expression.
Perhaps the most important tissue specific promoters are those directed to the seed or embryo, a particularly vulnerable period of development, which can be manipulated to modulate seed size, to increase yield, to provide plants with fruits without seeds, to induce autonomous (parthenogenetic) development of embryo and/or endosperm, or simply to learn about spatial and temporal gene regulation during this all-important period. These types of promoters include maternal tissue promoters such as seed coat, pericarp and ovule. One such promoter active in endosperm development is the promoter from the α' subunit of the soybean β-conglycinin gene [Walling et al., Proc. Natl. Acad. Sci. USA 83:2123-2127 (1986)] which is expressed early in seed development in the endosperm and the embryo.
Further seed specific promoters include the napin promoter described in United States Patent 5,110,728 to Calgene, which describes and discloses the use of the napin promoter in directing the expression to seed tissue of an acyl carrier protein to enhance seed oil production; the DC3 promoter from carrots, which is early to mid embryo specific and is disclosed at Plant Physiology, Oct. 1992 100(2) p. 576-581, "Hormonal and Environmental Regulation of the Carrot Lea-class Gene Dc 3", and Plant Mol. Biol., April 1992, 18(6) p. 1049-1063, "Transcriptional Regulation of a Seed Specific Carrot Gene, DC 8"; and the phaseolin promoter described in United States Patent 5,504,200 to Mycogen, which discloses the gene sequence and regulatory regions for phaseolin, a protein isolated from P. vulgaris which is expressed only while the seed is developing within the pod, and only in tissues involved in seed generation. As can be seen, there is a continuing need in the art for tissue specific promoters and to identify gene expression patterns, particularly in the embryo post fertilization.
It is a primary object of this invention to provide regulatory elements that direct embryo and endosperm expression of nucleotide sequences in plant cells and plant tissues.
It is another object of this invention to provide transcription units for expression of gene products immediately post fertilization.
It is yet another object of the invention to provide vehicles for transformation of plant cells including viral or plasmid vectors incorporating the novel regulatory elements of the invention.
It is yet another object of the invention to provide bacterial cells comprising such vectors for maintenance, replication, and plant transformation.
It is yet another object of the invention to provide expression constructs, sequences and transformed cells which provide for creation of transgenic plants.
It is yet another object of the invention to provide for transgenic as well as nontransgenic (screening for naturally occurring or induced) mutants to provide for mechanisms to create transgenic plants.
It is yet another object of the invention to provide breeding materials that may be used in a breeding program to produce commercial crops with exogenous advantageous genes.
It is a further object of the invention to provide sequence tags which can be used in mapping the Arabidopsis genome or as tools to provide assays to learn about gene activation and inactivation during embryo development.
These and other objects will become apparent from the following description of the invention.
SUMMARY OF THE INVENTION
The present invention comprises the isolation and characterization of novel regulatory nucleotide sequences which provide for expression of genes during ovule and early seed development in Arabidopsis. The invention also comprises the spatial and temporal expression of nucleotide sequences using these regulatory elements.
The invention further comprises expression cassettes comprising the promoters of the invention, a structural gene, the expression of which is desired in plant cells, and a polyadenylation or stop signal. The expression cassette can be encompassed in plasmid or viral vectors for transformation of plant protoplast cells.
The invention also encompasses transformed bacterial cells for maintenance and replication of the vector, as well as transformed monocot or dicot cells and ultimately transgenic plants, and breeding materials developed from the transgenic plants.
In a further embodiment, the invention comprises the identification and location of novel nucleic acid sequences which are expressed or which regulate expression to activate and de-activate genes during embryo and/or endosperm development. These sequences may be used as markers for mapping, for manipulation of gene expression pre- and post fertilization, and for assays and protocols for manipulation of paternal and maternal silencing, during embryo development.
For purposes of this application the following terms shall have the definitions recited herein. Units, prefixes, and symbols may be denoted in their SI accepted form. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. Numeric ranges are inclusive of the numbers defining the range and include each integer within the defined range. Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes. Unless otherwise provided for, software, electrical, and electronics terms as used herein are as defined in The New IEEE Standard Dictionary of Electrical and Electronics Terms (5th edition, 1993). The terms defined below are more fully defined by reference to the specification as a whole.
By "amplified" is meant the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification systems include the polymerase chain reaction (PCR) system, ligase chain reaction (LCR) system, nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario), Q-Beta Replicase systems, transcription-based amplification system (TAS), and strand displacement amplification (SDA). See, e.g., Diagnostic Molecular Microbiology: Principles and Applications, D.H. Persing et al., Ed., American Society for Microbiology, Washington, D.C. (1993). The product of amplification is termed an amplicon.
As used herein, "antisense orientation" includes reference to a duplex polynucleotide sequence that is operably linked to a promoter in an orientation where the antisense strand is transcribed. The antisense strand is sufficiently complementary to an endogenous transcription product such that translation of the endogenous transcription product is often inhibited.
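The relationship between a sense strand and the transcript produced in antisense orientation can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and are not taken from the specification; the sketch only assumes standard Watson-Crick base pairing.

```python
# Minimal sketch: the RNA made in "antisense orientation" is the reverse
# complement of the sense (coding) strand, so it can pair with the mRNA.
# Function names and the example sequence are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def antisense_transcript(sense_dna: str) -> str:
    """Return the antisense RNA (5' to 3') for a sense-strand DNA sequence."""
    return reverse_complement(sense_dna).replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCTTAA"  # hypothetical sense-strand fragment
    print(antisense_transcript(sense))
```

Because the antisense transcript is complementary to the endogenous mRNA along its whole length, the two can form a duplex, which is the basis for the inhibition of translation described above.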
As used herein, "chromosomal region" includes reference to a length of a chromosome that may be measured by reference to the linear segment of DNA that it comprises. The chromosomal region can be defined by reference to two unique DNA sequences, i.e., markers.
The term "conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations" and represent one species of conservatively modified variation. Every nucleic acid sequence herein that encodes a polypeptide also, by reference to the genetic code, describes every possible silent variation of the nucleic acid. One of ordinary skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine; and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence and is within the scope of the present invention.
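The notion of silent variation can be made concrete with a short Python sketch that builds the standard ("universal") genetic code, lists the synonymous codons for a given codon, and counts how many distinct RNA sequences encode the same polypeptide. The function names and the example sequence are illustrative assumptions, not part of the specification.

```python
# Minimal sketch of "silent variations" under the standard genetic code.
# Function names and the example sequence are illustrative assumptions.
from itertools import product
from math import prod

BASES = "UCAG"
# Amino acids for codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return all codons encoding the same amino acid as `codon`."""
    target = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


def count_encodings(rna: str) -> int:
    """Count the RNA sequences (including `rna` itself) that encode the
    same polypeptide, i.e. the original plus every silent variation."""
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return prod(len(synonymous_codons(c)) for c in codons)


if __name__ == "__main__":
    print(synonymous_codons("GCA"))      # the four alanine codons
    print(count_encodings("AUGGCAUGG"))  # Met-Ala-Trp
```

For the hypothetical Met-Ala-Trp sequence only the alanine codon can vary, which matches the statement that AUG and UGG are ordinarily the only codons for methionine and tryptophan.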
As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Thus, any number of amino acid residues selected from the group of integers consisting of from 1 to 15 can be so altered. Thus, for example, 1, 2, 3, 4, 5, 7, or 10 alterations can be made.
Conservatively modified variants typically provide similar biological activity as the unmodified polypeptide sequence from which they are derived. For example, substrate specificity, enzyme activity, or ligand/receptor binding is generally at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the native protein for its native substrate. Conservative substitution tables providing functionally similar amino acids are well known in the art.
The following six groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
See also, Creighton (1984) Proteins W.H. Freeman and Company.
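The six groups listed above can be applied mechanically. The following Python sketch, offered only as an illustration, encodes the groups by one-letter symbol and reports, for two aligned peptides of equal length, which positions differ and whether each difference is a conservative substitution. The function names and the example peptides are hypothetical.

```python
# Minimal sketch: classify substitutions using the six conservative groups
# listed above. Function names and example peptides are assumptions.

CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b differ but belong to the same group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str):
    """Return (position, ref_residue, var_residue, conservative?) for each
    differing position; positions are 1-based, amino to carboxy."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


if __name__ == "__main__":
    for row in classify_substitutions("MKTAYD", "MRTGYD"):
        print(row)
```

In this hypothetical pair, the lysine to arginine change falls within group 4 and is conservative, whereas alanine to glycine is not, because glycine appears in none of the six groups.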
By "encoding" or "encoded", with respect to a specified nucleic acid, is meant comprising the information for translation into the specified protein. A nucleic acid encoding a protein may comprise non-translated sequences (e.g., introns) within translated regions of the nucleic acid, or may lack such intervening non-translated sequences (e.g., as in cDNA). The information by which a protein is encoded is specified by the use of codons. Typically, the amino acid sequence is encoded by the nucleic acid using the "universal" genetic code. However, variants of the universal code, such as are present in some plant, animal, and fungal mitochondria, the bacterium Mycoplasma capricolum, or the ciliate Macronucleus, may be used when the nucleic acid is expressed therein. When the nucleic acid is prepared or altered synthetically, advantage can be taken of known codon preferences of the intended host where the nucleic acid is to be expressed. For example, although nucleic acid sequences of the present invention may be expressed in both monocotyledonous and dicotyledonous plant species, sequences can be modified to account for the specific codon preferences and GC content preferences of monocotyledons or dicotyledons as these preferences have been shown to differ (Murray et al. Nucl. Acids Res. 17:477-498 (1989)). Thus, the maize preferred codon for a particular amino acid may be derived from known gene sequences from maize. Maize codon usage for 28 genes from maize plants is listed in Table 4 of Murray et al, supra.
As used herein "full-length sequence" in reference to a specified polynucleotide or its encoded protein means having the entire amino acid sequence of a native (non-synthetic), endogenous, biologically active form of the specified protein. Methods to determine whether a sequence is full-length are well known in the art including such exemplary techniques as northern or western blots, primer extensions, S1 protection, and ribonuclease protection. See, e.g., Plant Molecular Biology: A Laboratory Manual, Clark, Ed., Springer-Verlag, Berlin (1997). Comparison to known full-length homologous (orthologous and/or paralogous) sequences can also be used to identify full-length sequences of the present invention. Additionally, consensus sequences typically present at the 5' and 3' untranslated regions of mRNA aid in the identification of a polynucleotide as full-length. For example, the consensus sequence ANNNNAUGG, where the underlined codon (AUG) represents the N-terminal methionine, aids in determining whether the polynucleotide has a complete 5' end. Consensus sequences at the 3' end, such as polyadenylation sequences, aid in determining whether the polynucleotide has a complete 3' end.
As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that originates from a foreign species, or, if from the same species, is substantially modified from its native form in composition and/or genomic locus by deliberate human intervention. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, one or both are substantially modified from their original form. A heterologous protein may originate from a foreign species or, if from the same species, is substantially modified from its original form by deliberate human intervention.
By "host cell" is meant a cell which contains a vector and supports the replication and/or expression of the vector. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells. Preferably, host cells are monocotyledonous or dicotyledonous plant cells. A particularly preferred monocotyledonous host cell is a maize host cell.
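As a rough illustration of how the 5' consensus ANNNNAUGG might be checked computationally, the Python sketch below scans a sequence for the pattern and returns the position of each AUG found in that context. It is a sketch under stated assumptions: the function name and test sequences are hypothetical, N is taken to mean any of A, C, G or U, and a match is only a hint that the 5' end is complete, not proof that the sequence is full-length.

```python
# Minimal sketch: locate start codons lying in the ANNNNAUGG context.
# Function name and example sequences are illustrative assumptions.
import re

# Lookahead so that overlapping candidate sites are all reported;
# group 1 captures the AUG start codon itself.
_START_CONTEXT = re.compile(r"(?=A[ACGU]{4}(AUG)G)")


def start_codons_in_consensus(seq: str) -> list[int]:
    """Return 0-based positions of each AUG found within ANNNNAUGG.

    DNA input is accepted: T is converted to U before matching.
    """
    rna = seq.upper().replace("T", "U")
    return [m.start(1) for m in _START_CONTEXT.finditer(rna)]


if __name__ == "__main__":
    print(start_codons_in_consensus("GGACCACAUGGCU"))  # context present
    print(start_codons_in_consensus("GGUCCACAUGGCU"))  # leading A absent
```

A candidate that lacks the upstream A or the G following the AUG is simply not reported, so an empty result suggests either a truncated 5' end or a start codon in a non-consensus context.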
The term "hybridization complex" includes reference to a duplex nucleic acid structure formed by two single-stranded nucleic acid sequences selectively hybridized with each other.
The term "introduced" in the context of inserting a nucleic acid into a cell, means "transfection" or "transformation" or "transduction" and includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
The term "isolated" refers to material, such as a nucleic acid or a protein, which is: (1) substantially or essentially free from components that normally accompany or interact with it as found in its naturally occurring environment. The isolated material optionally comprises material not found with the material in its natural environment; or (2) if the material is in its natural environment, the material has been synthetically (non-naturally) altered by deliberate human intervention to a composition and/or placed at a location in the cell (e.g., genome or subcellular organelle) not native to a material found in that environment. The alteration to yield the synthetic material can be performed on the material within or removed from its natural state. For example, a naturally occurring nucleic acid becomes an isolated nucleic acid if it is altered, or if it is transcribed from DNA which has been altered, by means of human intervention performed within the cell from which it originates. See, e.g., Compounds and Methods for Site Directed Mutagenesis in Eukaryotic Cells, Kmiec, U.S. Patent No. 5,565,350; In Vivo Homologous Sequence Targeting in Eukaryotic Cells; Zarling et al, PCT/US93/03868.
Likewise, a naturally occurring nucleic acid (e.g., a promoter) becomes isolated if it is introduced by non-naturally occurring means to a locus of the genome not native to that nucleic acid. Nucleic acids which are "isolated" as defined herein, are also referred to as "heterologous" nucleic acids.
As used herein, "localized within the chromosomal region defined by and including" with respect to particular markers includes reference to a contiguous length of a chromosome delimited by and including the stated markers.
As used herein, "marker" includes reference to a locus on a chromosome that serves to identify a unique position on the chromosome. A "polymorphic marker" includes reference to a marker which appears in multiple forms (alleles) such that different forms of the marker, when they are present in a homologous pair, allow transmission of each of the chromosomes of that pair to be followed. A genotype may be defined by use of one or a plurality of markers.
As used herein, "nucleic acid" or "nucleotide" includes reference to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).
By "nucleic acid library" is meant a collection of isolated DNA or RNA molecules which comprise and substantially represent the entire transcribed fraction of a genome of a specified organism. Construction of exemplary nucleic acid libraries, such as genomic and cDNA libraries, is taught in standard molecular biology references such as Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, Vol. 152, Academic Press, Inc., San Diego, CA (Berger); Sambrook et al, Molecular Cloning - A Laboratory Manual, 2nd ed., Vol. 1-3 (1989); and Current Protocols in Molecular Biology, F.M. Ausubel et al., Eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc. (1994). As used herein "operably linked" includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the nucleic acid sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame. As used herein, the term "plant" can include reference to whole plants, plant parts or organs (e.g., leaves, stems, roots, etc.), plant cells, seeds and progeny of same. Plant cell, as used herein, further includes, without limitation, cells obtained from or found in: seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and microspores. Plant cells can also be understood to include modified cells, such as protoplasts, obtained from the aforementioned tissues. The class of plants which can be used in the methods of the invention is generally as broad as the class of higher plants amenable to transformation techniques, including both monocotyledonous and dicotyledonous plants. 
Particularly preferred plants include maize, soybean, sunflower, sorghum, canola, wheat, alfalfa, cotton, rice, barley, and millet.
As used herein, "polynucleotide" includes reference to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full-length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are "polynucleotides" as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. It will be appreciated that a great variety of modifications have been made to DNA and RNA that serve many useful purposes known to those of skill in the art. The term polynucleotide as it is employed herein embraces such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including among other things, simple and complex cells.
The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms "polypeptide", "peptide" and "protein" are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation. It will be appreciated, as is well known and as noted above, that polypeptides are not entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of posttranslation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translation natural processes and by entirely synthetic methods, as well. Further, this invention contemplates the use of both the methionine-containing and the methionine-less amino terminal variants of the protein of the invention.
As used herein "promoter" includes reference to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A "plant promoter" is a promoter capable of initiating transcription in plant cells whether or not its origin is a plant cell. Exemplary plant promoters include, but are not limited to, those that are obtained from plants, plant viruses, and bacteria which comprise genes expressed in plant cells such as Agrobacterium or Rhizobium. Examples of promoters under developmental control include promoters that preferentially initiate transcription in certain tissues, such as leaves, roots, or seeds. Such promoters are referred to as "tissue preferred". Promoters which initiate transcription only in certain tissue are referred to as "tissue specific". A "cell type" specific promoter primarily drives expression in certain cell types in one or more organs, for example, vascular cells in roots or leaves. An "inducible" or "repressible" promoter is a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light. Tissue specific, tissue preferred, cell type specific, and inducible promoters constitute the class of "non-constitutive" promoters. A "constitutive" promoter is a promoter which is active under most environmental conditions.
As used herein "recombinant" includes reference to a cell or vector, that has been modified by the introduction of a heterologous nucleic acid or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under-expressed or not expressed at all as a result of deliberate human intervention. The term "recombinant" as used herein does not encompass the alteration of the cell or vector by naturally occurring events (e.g., spontaneous mutation, natural transformation/transduction/transposition) such as those occurring without deliberate human intervention.
As used herein, a "recombinant expression cassette" is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a host cell. The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of an expression vector includes, among other sequences, a nucleic acid to be transcribed, and a promoter.
The term "residue" or "amino acid residue" or "amino acid" are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively "protein"). The amino acid may be a naturally occurring amino acid and, unless otherwise limited, may encompass non-natural analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.
The term "selectively hybridizes" includes reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have about at least 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (i.e., complementary) with each other.
The term "stringent conditions" or "stringent hybridization conditions" includes reference to conditions under which a probe will hybridize to its target sequence, to a detectably greater degree than to other sequences (e.g., at least 2-fold over background). Stringent conditions are sequence-dependent and may be different in different circumstances. By controlling the stringency of the hybridization and/or washing conditions, target sequences can be identified which are 100% complementary to the probe (homologous probing). Alternatively, stringency conditions can be adjusted to allow some mismatching in sequences so that lower degrees of similarity are detected (heterologous probing). Generally, a probe is less than about 1000 nucleotides in length, optionally less than 500 nucleotides in length.
Typically, stringent conditions will be those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55°C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to 65°C. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl, Anal. Biochem., 138:267-284 (1984):
Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L;
where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the complementary target sequence hybridizes to a perfectly matched probe. Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4°C lower than the thermal melting point (Tm); moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C lower than the thermal melting point (Tm); low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20°C lower than the thermal melting point (Tm). Using the equation, hybridization and wash compositions, and desired Tm, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a Tm of less than 45°C (aqueous solution) or 32°C (formamide solution) it is preferred to increase the SSC concentration so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, Part I, Chapter 2, Ausubel, et al., Eds., Greene Publishing and Wiley-Interscience, New York (1995).
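As an illustration only, the Tm approximation of Meinkoth and Wahl and the stringency ranges stated above can be sketched in Python. The sketch takes the equation as Tm = 81.5 + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L, with a base-10 logarithm. The function names and the handling of values that fall between the listed ranges are assumptions made for this example and are not part of the specification.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate Tm (deg C) of a DNA-DNA hybrid.

    na_molar          -- molarity of monovalent cations (M)
    gc_percent        -- percentage of G and C nucleotides (%GC)
    formamide_percent -- percentage of formamide in the solution (% form)
    length_bp         -- length of the hybrid in base pairs (L)
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


def adjusted_tm(tm, percent_mismatch):
    """Lower Tm by about 1 deg C for each 1% of mismatching."""
    return tm - 1.0 * percent_mismatch


def stringency_class(degrees_below_tm):
    """Name the stringency range for a wash run this many deg C below Tm.

    The cut-offs between the listed values are an assumption of this sketch.
    """
    d = degrees_below_tm
    if 1 <= d <= 4:
        return "severe"
    if 4 < d < 6:
        return "stringent"   # "about 5 deg C lower than Tm"
    if 6 <= d <= 10:
        return "moderate"
    if 10 < d <= 20:
        return "low"
    return "outside described ranges"


if __name__ == "__main__":
    tm = melting_temperature(1.0, 50.0, 50.0, 500)
    print(round(tm, 2))                    # Tm for a perfectly matched probe
    print(round(adjusted_tm(tm, 10), 2))   # Tm when >90% identity is sought
    print(stringency_class(8))
```

With 1 M monovalent cation, 50% GC, 50% formamide, and a 500 bp hybrid, the log term vanishes and the estimate is 81.5 + 20.5 - 30.5 - 1.0 = 70.5°C.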
As used herein, "transgenic plant" includes reference to a plant which comprises within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence of heterologous nucleic acid including those transgenics initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic. The term "transgenic" as used herein does not encompass the alteration of the genome (chromosomal or extra-chromosomal) by conventional plant breeding methods or by naturally occurring events such as random cross-fertilization, non-recombinant viral infection, non-recombinant bacterial transformation, non-recombinant transposition, or spontaneous mutation.
As used herein, "vector" includes reference to a nucleic acid used in transfection of a host cell and into which can be inserted a polynucleotide. Vectors are often replicons. Expression vectors permit transcription of a nucleic acid inserted therein.
The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) "reference sequence", (b) "comparison window", (c) "sequence identity", (d) "percentage of sequence identity", and (e) "substantial identity". (a) As used herein, "reference sequence" is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence. (b) As used herein, "comparison window" includes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence may be compared to a reference sequence and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence, a gap penalty is typically introduced and is subtracted from the number of matches.
Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981); by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970); by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. 85:2444 (1988); by computerized implementations of these algorithms, including, but not limited to: CLUSTAL in the PC/Gene program by Intelligenetics, Mountain View, California; GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wisconsin, USA; the CLUSTAL program is well described by Higgins and Sharp, Gene 73:237-244 (1988); Higgins and Sharp, CABIOS 5:151-153 (1989); Corpet, et al., Nucleic Acids Research 16:10881-90 (1988); Huang, et al., Computer Applications in the Biosciences 8:155-65 (1992), and Pearson, et al., Methods in Molecular Biology 24:307-331 (1994). The BLAST family of programs which can be used for database similarity searches includes: BLASTN for nucleotide query sequences against nucleotide database sequences; BLASTX for nucleotide query sequences against protein database sequences; BLASTP for protein query sequences against protein database sequences; TBLASTN for protein query sequences against nucleotide database sequences; and TBLASTX for nucleotide query sequences against nucleotide database sequences. See, Current Protocols in Molecular Biology, Chapter 19, Ausubel, et al., Eds., Greene Publishing and Wiley-Interscience, New York (1995).
Unless otherwise stated, sequence identity/similarity values provided herein refer to the value obtained using the BLAST 2.0 suite of programs using default parameters. Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). Software for performing BLAST analyses is publicly available, e.g., through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands.
For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915). In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. BLAST searches assume that proteins can be modeled as random sequences. However, many real proteins comprise regions of nonrandom sequences which may be homopolymeric tracts, short-period repeats, or regions enriched in one or more amino acids. Such low-complexity regions may be aligned between unrelated proteins even though other regions of the protein are entirely dissimilar. A number of low-complexity filter programs can be employed to reduce such low-complexity alignments. For example, the SEG (Wootton and Federhen, Comput. Chem., 17:149-163 (1993)) and XNU (Claverie and States, Comput. Chem., 17:191-201 (1993)) low-complexity filters can be employed alone or in combination.
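The word-hit extension described above for nucleotide sequences can be illustrated with a minimal ungapped sketch in Python. It uses the stated reward M=5 and penalty N=-4 and applies the three stated halting conditions in each direction. The function names, the return format, and the default value chosen for X are assumptions made for this example; this is not the BLAST 2.0 implementation.

```python
def _extend(query, subject, qi, si, step, start_score, match, mismatch, x_drop):
    """Extend in one direction; return (best score, residues added at best)."""
    best = score = start_score
    best_len = 0
    added = 0
    # Halting condition 3: the end of either sequence is reached.
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        added += 1
        if score > best:
            best, best_len = score, added
        # Halting condition 1: score fell off by X from its maximum.
        # Halting condition 2: cumulative score went to zero or below.
        if best - score >= x_drop or score <= 0:
            break
        qi += step
        si += step
    return best, best_len


def extend_hit(query, subject, q_start, s_start, word_len,
               match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit in both directions without gaps.

    Returns (score, q_begin, q_end, s_begin, s_end) with half-open ranges.
    x_drop=20 is an illustrative value, not a documented default.
    """
    seed_score = word_len * match
    score, right = _extend(query, subject, q_start + word_len,
                           s_start + word_len, +1, seed_score,
                           match, mismatch, x_drop)
    score, left = _extend(query, subject, q_start - 1, s_start - 1, -1,
                          score, match, mismatch, x_drop)
    return (score,
            q_start - left, q_start + word_len + right,
            s_start - left, s_start + word_len + right)


if __name__ == "__main__":
    print(extend_hit("AAAACCCCGGGG", "TTTTCCCCGGGG", 4, 4, 4))
```

In the usage line, the seed CCCC extends to the right through GGGG but not to the left, where every position is a mismatch, so the reported segment covers positions 4 to 12 of both sequences.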
(c) As used herein, "sequence identity" or "identity" in the context of two nucleic acid or polypeptide sequences includes reference to the residues in the two sequences which are the same when aligned for maximum correspondence over a specified comparison window. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g. charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences which differ by such conservative substitutions are said to have "sequence similarity" or "similarity". Means for making this adjustment are well-known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., according to the algorithm of Meyers and Miller, Computer Applic. Biol Sci., 4:11-17 (1988) e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
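The partial-credit adjustment described above can be illustrated with a short Python sketch: an identical residue scores 1, a non-conservative substitution scores 0, and a conservative substitution scores a value in between. The residue groupings and the value 0.5 are assumptions chosen only for illustration; the sketch does not reproduce the algorithm of Meyers and Miller or the PC/GENE implementation.

```python
# Hypothetical groups of chemically similar residues (illustrative only).
CONSERVATIVE_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
]


def is_conservative(a, b):
    """True if a and b differ but fall in the same illustrative group."""
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def similarity_percent(aligned_a, aligned_b, conservative_score=0.5, gap="-"):
    """Percent identity adjusted upwards for conservative substitutions."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    total = 0.0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue                      # gap positions score zero
        if a == b:
            total += 1.0                  # identical residue
        elif is_conservative(a, b):
            total += conservative_score   # partial rather than full mismatch
    return 100.0 * total / len(aligned_a)


if __name__ == "__main__":
    print(similarity_percent("LKD", "IKA"))
```

For the pair LKD and IKA, L/I counts as conservative (0.5), K/K as identical (1), and D/A as non-conservative (0), giving 1.5 out of 3 positions.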
(d) As used herein, "percentage of sequence identity" means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
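The calculation defined above can be written as a small Python function that takes two sequences already aligned over the comparison window, with gaps shown as "-". The function name and the gap character are assumptions made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where the same base or residue occurs in
    both sequences; the count is divided by the total number of positions
    in the window and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matched = sum(1 for a, b in zip(aligned_a, aligned_b)
                  if a == b and a != gap)
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # 8 matched positions in a window of 10 (one gap, one mismatch).
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))
```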
(e)(i) The term "substantial identity" of polynucleotide sequences means that a polynucleotide comprises a sequence that has at least 70% sequence identity, preferably at least 80%, more preferably at least 90% and most preferably at least 95%, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, or preferably at least 70%, 80%, 90%, and most preferably at least 95%.
Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions. However, nucleic acids which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. One indication that two nucleic acid sequences are substantially identical is that the polypeptide which the first nucleic acid encodes is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
(e)(ii) The term "substantial identity" in the context of a peptide indicates that a peptide comprises a sequence with at least 70% sequence identity to a reference sequence, preferably 80%, or preferably 85%, most preferably at least 90% or 95% sequence identity to the reference sequence over a specified comparison window. Optionally, optimal alignment is conducted using the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution. Peptides which are "substantially similar" share sequences as noted above except that residue positions which are not identical may differ by conservative amino acid changes.
DETAILED DESCRIPTION OF THE FIGURES
Figure 1. Silencing of paternally inherited genes during seed development in Arabidopsis. (a) If an ET2612 female is crossed to a wild-type male, GUS expression is detected in the free nuclear endosperm 12 HAP. (b) and (c) F1 seeds of the same cross show more intense GUS expression in embryo and endosperm 48 HAP. (d) If ET2612 is crossed as the male to wild-type plants, GUS expression is not detectable in embryo and free nuclear endosperm 48 HAP. (e) Transverse optical section through a seed of a self-pollinated ET2612 plant 48 HAP. (f) In situ hybridization to mRNA of the putative basal transcription factor tagged in ET2612; the pattern of mRNA and GUS expression are identical 48 HAP. Abbreviations: E=embryo; FNE=free nuclear endosperm; Ov=ovule. Scale bars: a & b=17 μm; c=23 μm; d=45 μm; e=40 μm.
Figure 2. Allele-specific expression profile of PRL during early seed development. (a) Schematic of PRL showing the position of exons (squared boxes), introns (lines) and the SNP in an XhoI site which is present in Col but absent in Ler. (b) Allele-specific PCR analysis using titrated mixtures of genomic DNA each of which is represented twice. Both PRL alleles are consistently amplified throughout the dilution series. (c) Allele-specific RT-PCR analysis of PRL in reciprocal crosses between Col and Ler. Only transcripts derived from the maternal PRL allele are detected up to 68 HAP. (d) If PRL enhancer detector ET1041 is used as the female in crosses to Col, GUS is detected in the endosperm 36 HAP. (e) If PRL enhancer detector ET1041 is used as the male parent, GUS is detected in a small portion of the chalazal endosperm 92 HAP. (f) Cleared seed showing the nodular cyst (arrowhead) in the chalazal endosperm. Scale bars: d=53 μm; e=50 μm; f=55 μm.
Figure 3. Silencing of the paternally inherited EMB30 allele during early embryogenesis. (a) Schematic of EMB30 showing the position of exons (squared boxes), introns (lines), and the SNP in an Eco57I restriction site which is present in Col but absent from Ler. (b) Allele-specific RT-PCR analysis of EMB30. In reciprocal crosses between Col and Ler plants, only the transcript derived from the maternal EMB30 allele is detected 24 HAP. (c) 2-cell wild-type embryo; the first division of the zygote (arrowhead) gives rise to a small apical and an elongated basal cell. (d) 2-cell emb30m/EMB30P embryo showing a nearly symmetrical plane of division (arrowhead). (e) Wild-type embryo with the division plane of the apical cell parallel to the apical-basal axis. (f) emb30m/EMB30P embryo with an oblique division plane of the apical cell. Scale bars: c & d=20 μm; e & f=18 μm.
Figure 4. Identification of genes active during early embryo and endosperm formation in Arabidopsis. All lines show GUS expression only when self-pollinated or used as females in reciprocal crosses. (a) ET1041: embryo and endosperm, (b) ET1051: embryo and endosperm, (c) ET2209: endosperm, (d) ET1275: embryo, (e) ET1278: embryo, (f) ET3988: embryo and endosperm, (g) ET3992: embryo and endosperm, (h) ET3536: endosperm, (i) ET2612: embryo and endosperm, (j) ET2567: endosperm, (k) ET1811: embryo and endosperm, (l) ET4336: embryo and endosperm.
DETAILED DESCRIPTION OF THE INVENTION
According to the invention, applicant has identified several areas of Arabidopsis DNA which encode genes active or, when in the paternal line, silenced in the embryo or endosperm immediately after fertilization.
Transgenic plants have been generated using randomly inserted enhancer trap technology (O'Kane, C., and Gehring, W.J. (1987), "Detection in situ of genomic regulatory elements in Drosophila", Proc. Natl. Acad. Sci. USA, 84, 9123-9127; Sundaresan et al., "Patterns of gene action revealed by enhancer trap and gene trap transposable elements", Genes & Development, 9:1797-1810 (1995)), which through screening has identified 10 regions active during this time period. Sequencing of the immediate area has been accomplished to "tag" the site. Using this information and routine molecular biology techniques known in the art, several closely linked regulatory regions (including or neighboring the tagged site) that are active during this time period may be identified. The novel Arabidopsis sequences can be used as markers to identify genes and/or regulatory regions as well as markers for mapping.
Generally, organ-specific promoters such as these for embryo or endosperm appropriate for a desired target organ can be isolated using known procedures. These control sequences are generally associated with genes uniquely expressed in the desired organ. In a typical higher plant, each organ has thousands of mRNAs that are absent from other organ systems (reviewed in Goldberg, Phil. Trans. R. Soc. London (1986) B314:343). mRNAs are first isolated to obtain suitable probes for retrieval of the appropriate genomic sequence which retains the presence of the natively associated control sequences. An example of the use of techniques to obtain the cDNA associated with mRNA specific to avocado fruit is found in Christoffersen et al., Plant Molecular Biology (1984) 3:385. Briefly, mRNA was isolated from ripening avocado fruit and used to make a cDNA library. Clones in the library were identified that hybridized with labeled RNA isolated from ripening avocado fruit, but that did not hybridize with labeled RNAs isolated from unripe avocado fruit. Many of these clones represent mRNAs encoded by genes that are transcriptionally activated at the onset of avocado fruit ripening.
Another very important method, and the one used by applicants herein, is enhancer detection, which can be used to identify cell-type-specific promoters and even allows the identification of genes expressed in a single cell (O'Kane, C, and Gehring, W.J. (1987), "Detection in situ of genomic regulatory elements in Drosophila", Proc. Natl Acad. Sci. USA, 84, 9123-9127). This method was first developed in Drosophila and rapidly adapted to mice and plants (Wilson, C, Pearson, R.K., Bellen, H.J., O'Kane, C.J., Grossniklaus, U., and Gehring, W.J. (1989), "P-element-mediated enhancer detection: an efficient method for isolating and characterizing developmentally regulated genes in Drosophila", Genes & Dev., 3, 1301-1313; Skarnes, W.C. (1990), "Entrapment vectors: a new tool for mammalian genetics", Biotechnology, 8, 827-831; Topping, J.F., Wei, W., and Lindsey, K. (1991), "Functional tagging of regulatory elements in the plant genome", Development, 112, 1009-1019; Sundaresan, V., Springer, P.S., Volpe, T., Haward, S., Jones, J.D.G., Dean, C, Ma, H., and Martienssen, R.A., (1995), "Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements", Genes & Dev., 9, 1797-1810).
Enhancer detector or gene trap transposable elements are integrated randomly into the plant genome to create transgenic individuals, enhancer/gene trap lines (transposants), which are then assayed to identify genes located nearby or within the integration site. From the information gathered, namely the timing and location of expression of the reporter gene present on the enhancer detector transposon and the sequence surrounding the transposon (or sequence tag) (Liu, Y-G., Mitsukawa, N., Oosumi, T., & Whittier, R.F. (1995). "Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR." Plant J. 8, 457-463; Grossniklaus, U., Moore, J.M., and Gagliano, W.B. (1998). "Molecular and genetic approaches to understanding and engineering apomixis: Arabidopsis as a powerful tool." In Advances in Hybrid Rice Technology. Proceedings of the 3rd International Symposium on Hybrid Rice 1996 (Virmani, S.S., Siddiq E.A., Muralidharan, K, eds.). International Rice Research Institute, Manila, Philippines), routine techniques can be used to identify the genes involved and the regulatory regions controlling their expression.
For example, in some of the transposon lines the enhancer trap lies within the coding region of the detected gene. For others the transposon insertion is potentially within the regulatory region of the corresponding tagged gene. The putative regulatory regions driving expression in the embryo and/or endosperm from these lines can be isolated and tested using routine molecular techniques as disclosed in the references and teachings herein. One method involves several steps: first, identifying the transcription units neighboring the insertion site; second, isolating potential cis-regulatory elements; third, making constructs in which these cis-regulatory elements drive a reporter gene, e.g. beta-glucuronidase (GUS) [Jefferson, RA, Kavanagh, TA, and Bevan, MW (1987). GUS-fusions: β-glucuronidase as a sensitive and versatile gene fusion marker in higher plants. EMBO J. 6: 3901-3907] or the green fluorescent protein (GFP) [Haseloff, J and Amos, B (1995). GFP in plants. Trends in Genetics 11: 328-329]; fourth, generating transgenic plants; and finally, identifying which DNA fragments (cis-regulatory elements) are functional.
Applicants herein provide the necessary data, i.e. the sequence tag and the expression pattern, that enable one to search for regulatory elements in the vicinity of the Ds element. The enhancer trap reveals the presence of such elements in its vicinity, and the sequence tag localizes the element within the genomic sequence (either within known sequence, or by allowing the isolation of such sequence). The two general cases outlined below show how one would proceed to obtain regulatory regions.
a) Sequence tag falls into sequenced region of the genome.
If the area has been sequenced, the neighborhood of the Ds element is known. The genes in the region have been predicted by software programs or were identified as expressed sequence tags or previously isolated genes. The Ds transposable element could be inserted either within the coding region (or an intron) of a gene or in its immediate promoter region. In this case, the identification of regulatory sequences would be focused on this gene. In the vast majority of the cases where regulatory regions were analyzed, they could be found within 2 kb or 3 kb of the 5' upstream region [Dwyer, KG, Kandasamy, MK, Mahosky, DI, Acciai, J, Kudish, BI, Mille, JE, Nasrallah, M, and Nasrallah, JB (1994). A superfamily of S-locus related sequences in Arabidopsis: Diverse structures and expression patterns. Plant Cell 6: 1829-1843; Thoma, S, Hecht, U, Kippers, A, Botella, J, de Vries, S, and Somerville, C (1994). Tissue-specific expression of the carrot gene EP2 lipid transfer protein gene. Plant Cell 3: 907-921; Xia, Y, Nikolau, BJ, and Schnable, PS (1996). Cloning and characterization of CER2, an Arabidopsis gene that affects cuticular wax accumulation. Plant Cell 8: 1291-1304]. Therefore, the 5' region upstream of the gene would be amplified by PCR and cloned in front of an appropriate reporter gene (GUS or GFP). In a few cases, regulatory regions were also found in larger introns or, rarely, in 3' regions [Gidekel M, Jimenez B, Herrera-Estrella L (1996). The first intron of the Arabidopsis thaliana gene coding for elongation factor 1-beta contains an enhancer-like element. Gene 170: 201-6; Sieburth LE, Meyerowitz EM (1997). Molecular dissection of the AGAMOUS control region shows that cis elements for spatial regulation are located intragenically. Plant Cell 9: 355-65; Daniel SG, Becker WM (1995). Transgenic analysis of the 5'- and 3'-flanking regions of the NADH-dependent hydroxypyruvate reductase gene from Cucumis sativus L. Plant Mol Biol 28: 821-36]. Therefore, if the 5' region does not give the expected expression pattern, introns and possibly 3' regions can be cloned in front of a minimal promoter (e.g. the minimal 35S promoter used in the enhancer trap [Sundaresan V, Springer P, Volpe T, Haward S, Jones JD, Dean C, Ma H, Martienssen (1995). Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes Dev 9: 1797-810]) driving a reporter gene. In the case where the Ds element lies between two predicted genes, a similar analysis may have to be performed for both of them. It is likely that the gene whose promoter is closer to the enhancer trap is the better candidate.
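The coordinate arithmetic behind case a) can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the `Gene` record, the function names and the example coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive on the sequenced clone. The sketch computes the 5' upstream window to amplify for a gene on either strand and orders two flanking genes by the distance from the insertion site to each gene's 5' end.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """A predicted gene on a sequenced clone (hypothetical record;
    coordinates are 1-based and inclusive)."""
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" or "-"


def upstream_region(gene: Gene, length: int, seq_len: int) -> tuple[int, int]:
    """Return (start, end) of the `length` bp immediately 5' of the gene,
    clipped to the bounds of the sequenced clone."""
    if gene.strand == "+":
        return max(1, gene.start - length), gene.start - 1
    # On the minus strand the 5' end is the rightmost coordinate.
    return gene.end + 1, min(seq_len, gene.end + length)


def promoter_distance(gene: Gene, insertion: int) -> int:
    """Distance in bp from the insertion site to the gene's 5' end."""
    five_prime = gene.start if gene.strand == "+" else gene.end
    return abs(insertion - five_prime)


def rank_candidates(genes: list[Gene], insertion: int) -> list[Gene]:
    """Order flanking genes so the one whose 5' end is nearest comes first."""
    return sorted(genes, key=lambda g: promoter_distance(g, insertion))


# Hypothetical example: an insertion at position 10,000 between two genes.
genes = [Gene("geneA", 4_000, 8_500, "-"), Gene("geneB", 10_900, 14_000, "+")]
best = rank_candidates(genes, insertion=10_000)[0]
print(best.name, upstream_region(best, length=3_000, seq_len=100_000))
```

The 3 kb window reflects the "2 kb or 3 kb" guideline stated above; where that window gives no expression, the same coordinate logic could be applied to introns or 3' regions.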
b) Sequence in novel region of the genome.
In this case, the sequence tag can be used to make specific primers which are used to isolate a larger genomic fragment, e.g. by using a PCR-based strategy such as TAIL-PCR [Liu YG, Mitsukawa N, Oosumi T, Whittier RF (1995). Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J 8: 457-63; Grossniklaus, U, Moore, JM, and Gagliano, WB (1998). Molecular and genetic approaches to understanding and engineering apomixis: Arabidopsis as a powerful tool. In Advances in Hybrid Rice Technology. Proceedings of the 3rd International Symposium on Hybrid Rice 1996 (Virmani, SS, Siddiq EA, Muralidharan, K, eds.). International Rice Research Institute, Manila, Philippines], the Genome Walker (Invitrogen) strategy or simply by isolating clones from a genomic library in a classical approach [Sambrook, J, Fritsch, EF, and Maniatis, T (1989). Molecular cloning: A laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York]. The fragments are then cloned in front of a minimal promoter driving a reporter gene and assayed for enhancer activity in transgenic plants.
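As a rough illustration of deriving specific primers from a sequence tag, the following Python sketch scans a tag for windows that contain no ambiguous base calls and have a moderate GC content. The window size, the GC limits and the function names are assumptions chosen for illustration, and the melting temperature uses the simple 2 °C per A/T plus 4 °C per G/C rule of thumb, which is only approximate for short oligonucleotides. Real primer design would also need to check specificity, secondary structure and compatibility with the chosen PCR strategy.

```python
def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature: 2 degrees C per A/T, 4 per G/C."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def candidate_primers(tag: str, size: int = 20,
                      gc_min: float = 0.4, gc_max: float = 0.6):
    """Yield (offset, primer, Tm) for every window of `size` bases that has
    no ambiguous base and a GC fraction within [gc_min, gc_max]."""
    seq = "".join(tag.split()).upper()  # drop spaces and line breaks
    for i in range(len(seq) - size + 1):
        window = seq[i:i + size]
        if set(window) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        gc = (window.count("G") + window.count("C")) / size
        if gc_min <= gc <= gc_max:
            yield i, window, wallace_tm(window)


# Opening bases of SEQ ID NO:3 as printed in Table A.
tag = "GGTCGAGTGAGAAGAACAAAAAATGGGAGAGAGCAAACGCACCGAGAAA"
for offset, primer, tm in candidate_primers(tag):
    print(offset, primer, tm)
```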
Based upon the information disclosed herein, the actual transformed lines are not necessary to perform the invention as the location of the putative gene and the PCR sequence identified is disclosed in the following Table A.
Table A
Localization of Ds Insertions in Enhancer Trap Lines which show GUS
Expression Pattern during Early Seed Development in Arabidopsis thaliana
LINE: ET1051
ENHANCER TRAP EXPRESSION: Embryo and endosperm.
Ds LOCALIZATION: Insertion in chromosome 4 (BAC ATF16G20). Insertion probably in the first intron of a gene that contains 6 exons. Similarity to rape mRNA (Brassica napus) PIR2:S42651; contains EST gb:T45158, T4498.
DEFINITION: Putative protein. ACCESSION: CAA20464.
SEQUENCE OF TAIL-PCR FRAGMENT: Forward Reaction. (SEQ ID NO:1)
ATCGAGTTTGGAGTTAACTCATGTTATTGGTGGGAGCTATTTTAAGGACAT TTGTTAAATCCTTTAAAAAGAAAGTTGTTAATAAATCCCCATCATTTGACT TTGGAGATGCACTATTCTACACTTCAACCTCAACCTCGTGAGACATGAGG ATTAAAGTAAAGTAATCACATTTCTGTCACATATGAACAAATAAAGTTATT TAAATCAGAATAAAAGCGAAATCGCTTGACCACTCAAGAAGAGAAAACGC TAACAAGTCCCATAGTTTGTGTTTATTGGGACTTCTTTCCTCTCATCTTCT CCAAGCAGAAACAAAAAAAAGCTGAAAGATGTGGTTTTTTGGATCGAAAA GGAGCATCTGGGATTCCTCGTCTCCGTTCAACTGCAGANGAAGTCACACA TGGTGTTGATGGGAACTGGGTCTCACTGCCATAATCACAAGGTTTTCCTT CTATCAACACTTCATGCCATTTTGCGNACATGGGGGTGTCTAAANTNGTA GACCTTTTTTTCTTANAATTTCAACAACTTTT
LINE: ET1275
ENHANCER TRAP EXPRESSION: Egg apparatus and embryo.
Ds LOCALIZATION: Insertion in chromosome 5 (P1 clone MXK3; seq. 29877-29649). Insertion within the COP1-interacting protein CIP8.
DEFINITION: Putative protein. ACCESSION: AF162150
SEQUENCE OF TAIL-PCR FRAGMENT:
Reverse reaction (SEQ ID NO:2)
GGTCGAGTGAGATGAACCAGAGGTAGAAGGAACAGAATCAGAAGACGAG
TAGCTGTTATCTACGATATCAAAGGAGAATATTCCGATGACGCAAAGAAG GAGTAAGAGGAGGACGAAGACCAACATTTGGATCCAAATTAGCCATGGGA
TATTATTGTAGCTATCTAAAACCAAAAACTCTTTCCCAGAATCCACCATTT
TTTGTAGAGAGATAAGAGTTTTTCCCGGGAAGGGAGAATTCCGTACCGAN
C
LINE: ET1278
EXPRESSION: Egg apparatus, central cell, embryo.
Ds LOCALIZATION: No perfect match; insertion within an isoflavone reductase homolog (close homology to accession CAB36830.1 in BAC F18A5). Insertion in the first exon, if the molecular structure is similar. Sequence not previously reported.
SEQUENCE OF TAIL-PCR FRAGMENT:
Forward reaction (SEQ ID NO: 3)
GGTCGAGTGAGAAGAACAAAAAATGGGAGAGAGCAAACGCACCGAGAAA ACGCGCGTTTTGGTGGTTGGAGCGACTGGATACATAGGGAAGAGGATAGT AAGGGCGTGTTTGGCTGAAGGTCACGAGACTTACGTTTTGCAGAGGCCAG AGATTGGTCTCGAAATCGAGAAAGTCCAACTCTTTCTTTCTTTCAAGAAAC TCGGCGCACGTATCGTTGAAGGTTCTTTCTCCGACCATCAAAGCCTCGTA TCCGCCGTGAAACTTGTTGACGTCGTTTGTCTCCGCCATGTCCGGTGTTC ACTTCCGTAGCCATAACATT
LINE: ET 1811
EXPRESSION: Zygote, fertilized central cell, embryo and endosperm.
Ds LOCALIZATION: No perfect match but extremely close; homology to gene in chromosome 2 (BAC F14N22; sequence 51955-52135), homologous to Arabidopsis 14-3-3 regulatory protein GF14 mu. Insertion within the promoter if similar molecular structure (upstream of 1st exon). Ref: Chuong and Ferl (1999) Plant Physiology 120 (4): 1206.
DEFINITION: Putative protein. ACCESSION: AAD23005.
SEQUENCE OF TAIL-PCR FRAGMENT: Forward reaction (SEQ ID NO:4)
GGTCGACAGAGAAGAACCAAATGGATCCCACGAGGAATAATATAATGCAT AAAATCACTTACTGACACAGACCATTGAGTAGTAATTACCCCGTTTAACCC GCGTTAATTACGCTTGTGTCCCTACGCGCGTCTCAAAGATCTCTCTCTTCT CTGGAGAAAAAATAAAACAAAAGAACAGCGTTCGCCGAGAGATATACATA CTCGGATCANTCGTCAAAAGCTTTGGAATTTGATACTTTTGATTTTTCGAG AATCTTGAAAATCAGTCATGGGTTCTGGAAAAAAGCGTGACACTTTCNTC NACCTCGCTAAGCTCTCTGAGCAAGCTGAGCGTTATGAAGGTANGGAAGC TTCTTCTTCTTCATCCCACTCGACCAAAGGGGGGAAGTCCATCACACTGG
LINE: ET 3988
EXPRESSION: embryo and endosperm.
Ds LOCALIZATION: Insertion in chromosome 2 (BAC T30B22; seq. 66661-67130). The Ds element is inserted 5' to 3' in the 8th intron of a putative disulfide isomerase (10 exons).
DEFINITION: Putative protein, disulfide-isomerase homologue. ACCESSION: AAC62863
SEQUENCE OF TAIL-PCR FRAGMENT: Reverse reaction (SEQ ID NO: 5)
GAATACCGAAATAGAATCACACTAACCTTTGCAATCAACACAGACTTAGCC TTCTTGAAGCTTGCCCCTAGCTTTTCATACTCTGGAGCAAGTTTCTTGCAG TGACCACACCTATTATTAAAACCATGTCCAATTCCATAAATGCAATTAGTC AGATTCAACAGCACACATCAGTAAGCTCTAATTTCTCAGTAAAACATATAA ATCGAAATCATGCTTATTAATCCAGAAGACTGCAATCACAGATCAAAACAT TTCTCGACCAGAAGATTTACAGATCAAAAAGGATCGCATCCGAATTGAGA GAATTGAGCCATACCAGGGAGCGTAAAACTCGACGAGAGCTCCTTTATCT TTACCAACTTCCTTTTCGAAGCTATCGTCAGTCAAAACAACCACATCGTCA GCTACGGGGTGAAACCAGAAGCAACGCGAGTAACGCANAACCAAACCAG ATTTGAGATTCGCCANTTTTTTTTCTGCNTCCACTTAAGCTTGATTTTTAN AAAAAGTGGGCTGCTCCCGGTTTATNAAAANGTTCTTCCCCCCCGAC
LINE: ET 2209 EXPRESSION: endosperm.
Ds LOCALIZATION: Insertion within chromosome 2 (BAC T3K9; sequence 59413-59555) within the untranslated 5' region of a transmembrane protein homolog found in Saccharomyces cerevisiae (yeast; accession YDR352W). The Ds is inserted upstream of the first exon of the gene.
DEFINITION: Hypothetical protein. ACCESSION: AAD12006
SEQUENCE OF TAIL-PCR FRAGMENT: Reverse Reaction: 3' end of the Ds. (SEQ ID NO:6)
AAGGTAACGTCTTTTTGATTGTGCAAACTAGAAAAAAGTGTTAGAAGGCG AAAAGGAAAAACATTTACACTAGAATGACCACATTTGAAAGAGTAGAAAA CGAATCCTAGAGAATTAAAAAAGATTTCTAATTCCGACAATAATTCAATAT ACTAGATTCACAAACAAAAAGACCCCACAAAGTGAATATCCCCCAAAACC CAGATCACACCCTCAGGATTCAGATTCTTCCATTAAAAGCAACAAAATCGA AAACGTTACACAAAATCTGGTTTCGACATTCACATATCAACACCAAACTCG AC
LINE: ET2567
EXPRESSION: endosperm (perhaps faint in embryo).
Ds LOCALIZATION: Insertion within chromosome 4 (BAC F2009; sequence 36329-35751), in the first exon of a putative protein without function assigned.
DEFINITION: Putative protein. ACCESSION: CAA16882.
SEQUENCE OF TAIL-PCR FRAGMENT: Forward Reaction: 5' end of the Ds. (SEQ ID NO:7)
GGTCGAGTGAGAAGAACGATACCCCAAGAAACAAAAGTCCAAGATAGGAG ATCGCTGAGAAAACACTTCAAACCACACAAAGTTTCATTATCAACTCATGA ATGTGAAAAATGATTAAAGCTAGAAAGATAAACCTTGACAGAGAAATCAA TGTTACCTAAAGTTAAATCCTTTAGCAGCAAAGCAAGTAGATAGGAAACAT ATACAGCCAAAACCAAACCAAAGTGTTGATTTCGCAACATCTCTCCACATA ATCAAATCGCTTATATTCTCTCTAATCCGCTCTAAATCACCTTGACAAACA TCTTTAGCTGAATTACACAAATCATTGAGCAATTATTAGAGCACGAAAGTG AAATCGAATGAGCTAATAGACAGAGGAAGGACATACGTTGATCTGAAAGA TGGACTAGGAGAAGCTAACAGAGGAACAAGAACTGTGCTTCTCCTTCTTA GATCGACCCAACTTCTTCTGTTTCCTCTGTTTTTGTTTCTTTTCANCGGTC ACGACAACAATTGGCTCCGTATTTTCCTTTGTATCTACCATCGCCTCGAAC TAAGTCCGACCTCCGGAAATTCCTCCGCGACGGGNGAGGCCAAAGACAAT TTGCC
LINE: ET 2612
EXPRESSION: embryo and endosperm.
Ds LOCALIZATION: Insertion within chromosome 2 (BAC F10A8; sequence 57165-57661) in the 7th intron of a putative transcription factor that acts as an activator of RNA Polymerase III transcription (8 exons). Reference: Khoo, B., Brophy, B. and Jackson, S. P. Conserved functional domains of the RNA polymerase III general transcription factor BRF. Genes Dev. 8 (23), 2879-2890 (1994).
DEFINITION: Putative transcription factor. ACCESSION: AAD14528.
SEQUENCE OF TAIL-PCR FRAGMENT:
Reverse reaction (SEQ ID NO:8)
GGTCGAGTGAGAAGAACATGTTCCACCTAGAAAACAGTGAACAAGTGAAA CTCTAGTAAGGGGAAAAAGACTCTAGAGCATTTCTCCATTCCATGTTTCAT ATTCAAAAGTTTAAGCATTCGACCTCAGAATCATCGTCGTCCGAACAGATG TCTGATTCATCCGAATATTCAGCATGTCTTTCACCCCCATATGTTTCTCCT TCGCCTTTCTCTGCGCACAAAAGATTCTCAAGGGCTCTAACGATAGCAAAT AAAAACTTTAAGCTGATAAGTGATAAGACATGAAGTAAACTCACCCGAAC AATTGTTTTGCCCTTTTGCTTACATTAACTTGTTCATCGTGGTTTAAGTNA CAATCCCTCCCTCGTTTTCTTCTCTAGCTGCTTTTTTCCATTCTCCNCTTTC TCTGCTCCGCTGATAAGCA
LINE: ET 3536
EXPRESSION: endosperm.
Ds LOCALIZATION: Insertion within chromosome 4 (BAC T12H17; seq. 58121-58474); similarity to DNA binding protein PD1 in Pisum sativum. Contains EST (gb N377194), and a novel AT-hook containing DNA binding protein from Arabidopsis (unpublished). Transposon inserted at the end of the last exon (exon 4).
DEFINITION: Putative DNA binding protein. ACCESSION: CAA16562.
SEQUENCE OF TAIL-PCR FRAGMENT: Reverse reaction (SEQ ID NO: 9) GAGACTACCGGAGAAGTTGTTAAAACAACCACCGGGAGCGACGGAGGCG TTACGGTGGTGAGATCCAACGCGCCGTCAGACTTCCACATGGCTCCGAGG TCAGAAACTTCAAACACACCTCCCAACTCCGTCGCTCCTCCTCCTCCTCCA CCGCCGCAAAACTCCTTTACTCCGTCGGCGGCTATGGATGGTTTCTCAAG CGGACCGATAAAGAAGAGACGTGGGCGCCCTAGGAAGTACNGACACGAC GGAGCAGCGGTGACGCTATCTCCGAATCCGATATCATCAGCCGCACCAAC GACTTCTCACGTTATCGATTTCTCGACGAACATCGGGAGAAACGTGGCAA AATGAA
AACCAAGCAAACTCCAACTCCAAACTCGAC
LINE: ET3992
EXPRESSION: seed coat, embryo, endosperm.
Ds LOCALIZATION: No homology at DNA or protein level (no match to ESTs). Ds 5' end present in TAIL sequence.
SEQUENCE OF TAIL-PCR FRAGMENT: Reverse reaction (SEQ ID NO: 10) GGTCGACGAGAGAAGAAAAAACAATGTTAGCTCTTCCATCAATCAGTCCA GCCAAAGCACAGAGAGTAACAACATGATGGTAATACATAACACTACATAA TGGTAAGGAGTANAAACACACTTTCAGCTAAATTCCAGTTGTTGCTATTAG GGTAAAGCATTCCATCATACTCAATTCACTCCAAACAAAGCATANAGACTA ACTATGTTTCANCANAATTCCANTTGTTGCTATCANGGATNAAAGTANGA TGGGAGAATTCCNTANCGACC
LINE: ET4336
EXPRESSION: embryo and endosperm.
Ds LOCALIZATION: Insertion within chromosome 1. Perfect hit to an EST obtained from germinating seeds of Arabidopsis. ACCESSION: AF162845.
SEQUENCE OF TAIL-PCR FRAGMENT:
Reverse reaction (5' end of the Ds) (SEQ ID NO: 11)
GGTCGAGAGAGATNAAATCCGGCGAAGGCGACNATGGAAGCAGCAGGTT GGAGAATCAATGAGACAAGCTGAACTACACGTTTCTTAAGAAGCATAGGG TAAGGGAGAGTCAACACGAGGGTGANTACTACTTCGACGGNCACCACATA CGATANAATCACCCATTGCNACGCCATTGT
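The TAIL-PCR fragments in Table A are printed with spaces and line breaks and contain ambiguous base calls (N). Before such a tag is compared against genomic sequence, it may help to normalize it and check how much of it is unambiguous. The Python sketch below is one hedged way to do so; the function names are hypothetical and nothing here is part of the disclosed procedure.

```python
import re
from collections import Counter


def clean(raw: str) -> str:
    """Remove spaces and line breaks and upper-case the sequence."""
    return "".join(raw.split()).upper()


def summarize(raw: str) -> dict:
    """Report length, number of ambiguous calls, and GC fraction of the
    unambiguously called bases."""
    seq = clean(raw)
    counts = Counter(seq)
    called = sum(counts[b] for b in "ACGT")
    return {
        "length": len(seq),
        "ambiguous": len(seq) - called,
        "gc_fraction": (counts["G"] + counts["C"]) / called if called else 0.0,
    }


def longest_unambiguous(raw: str) -> str:
    """Return the longest run of A/C/G/T, e.g. as a query for a database search."""
    runs = re.findall(r"[ACGT]+", clean(raw))
    return max(runs, key=len, default="")


# Final 30 bases of SEQ ID NO:11 as printed above.
tail = "CGATANAATCACCCATTGCNACGCCATTGT"
print(summarize(tail))
print(longest_unambiguous(tail))
```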
Once the regulatory regions have been identified and sequenced, the promoter region can be used in a virtually infinite array of transgenic protocols. The selection of structural gene, vector and transformation method are all expedients which may be optimized by those of skill in the art and are intended to be within the scope of the invention.
The following is a non-limiting general overview of molecular biology techniques which may be used in performing the methods of the invention.
STRUCTURAL GENE
Likewise, by means of the present invention, agronomic genes can be expressed in transformed plants. More particularly, plants can be genetically engineered to express various phenotypes of agronomic interest with expression targeted to the embryo or endosperm. Exemplary genes implicated in this regard include, but are not limited to, those categorized below.
1. Genes That Confer Resistance To Pests or Disease And That Encode:
(A) Plant disease resistance genes. Plant defenses are often activated by specific interaction between the product of a disease resistance gene (R) in the plant and the product of a corresponding avirulence (Avr) gene in the pathogen. A plant variety can be transformed with a cloned resistance gene to engineer plants that are resistant to specific pathogen strains. See, for example Jones et al., Science 266: 789 (1994) (cloning of the tomato Cf-9 gene for resistance to Cladosporium fulvum); Martin et al., Science 262: 1432 (1993) (tomato Pto gene for resistance to Pseudomonas syringae pv. tomato encodes a protein kinase); Mindrinos et al., Cell 78: 1089 (1994) (Arabidopsis RPS2 gene for resistance to Pseudomonas syringae).
(B) A Bacillus thuringiensis protein, a derivative thereof or a synthetic polypeptide modeled thereon. See, for example, Geiser et al, Gene 48: 109 (1986), who disclose the cloning and nucleotide sequence of a Bt δ-endotoxin gene. Moreover, DNA molecules encoding δ-endotoxin genes can be purchased from American Type Culture Collection (Rockville, MD), for example, under ATCC Accession Nos. 40098, 67136, 31995 and 31998.
(C) A lectin. See, for example, the disclosure by Van Damme et al, Plant Molec. Biol. 2 25 (1994), who disclose the nucleotide sequences of several Clivia miniata mannose-binding lectin genes.
(D) A vitamin-binding protein, such as avidin. See PCT application US93/06487, the contents of which are hereby incorporated by reference. The application teaches the use of avidin and avidin homologues as larvicides against insect pests.
(E) An enzyme inhibitor, for example, a protease inhibitor or an amylase inhibitor. See, for example, Abe et al, J. Biol. Chem. 262: 16793 (1987) (nucleotide sequence of rice cysteine proteinase inhibitor), Huub et al, Plant Molec. Biol. 21: 985 (1993) (nucleotide sequence of cDNA encoding tobacco proteinase inhibitor I), and Sumitani et al, Biosci. Biotech. Biochem. 57: 1243 (1993) (nucleotide sequence of Streptomyces nitrosporeus alpha-amylase inhibitor).
(F) An insect-specific hormone or pheromone such as an ecdysteroid and juvenile hormone, a variant thereof, a mimetic based thereon, or an antagonist or agonist thereof. See, for example, the disclosure by Hammock et al., Nature 344: 458 (1990), of baculovirus expression of cloned juvenile hormone esterase, an inactivator of juvenile hormone.
(G) An insect-specific peptide or neuropeptide which, upon expression, disrupts the physiology of the affected pest. For example, see the disclosures of Regan, J. Biol. Chem. 269: 9 (1994) (expression cloning yields DNA coding for insect diuretic hormone receptor), and Pratt et al, Biochem. Biophys. Res. Comm. 163: 1243 (1989) (an allatostatin is identified in Diploptera punctata). See also U.S. patent No. 5,266,317 to Tomalski et al, who disclose genes encoding insect-specific, paralytic neurotoxins.
(H) An insect-specific venom produced in nature by a snake, a wasp, etc. For example, see Pang et al., Gene 116: 165 (1992), for disclosure of heterologous expression in plants of a gene coding for a scorpion insectotoxic peptide.
(I) An enzyme responsible for a hyperaccumulation of a monoterpene, a sesquiterpene, a steroid, hydroxamic acid, a phenylpropanoid derivative or another non-protein molecule with insecticidal activity.
(J) An enzyme involved in the modification, including the post-translational modification, of a biologically active molecule; for example, a glycolytic enzyme, a proteolytic enzyme, a lipolytic enzyme, a nuclease, a cyclase, a transaminase, an esterase, a hydrolase, a phosphatase, a kinase, a phosphorylase, a polymerase, an elastase, a chitinase and a glucanase, whether natural or synthetic. See PCT application WO 93/02197 in the name of Scott et al., which discloses the nucleotide sequence of a callase gene. DNA molecules which contain chitinase-encoding sequences can be obtained, for example, from the ATCC under Accession Nos. 39637 and 67152. See also Kramer et al, Insect Biochem. Molec. Biol. 23: 691 (1993), who teach the nucleotide sequence of a cDNA encoding tobacco hornworm chitinase, and Kawalleck et al, Plant Molec. Biol 21: 673 (1993), who provide the nucleotide sequence of the parsley ubiA-2 polyubiquitin gene.
(K) A molecule that stimulates signal transduction. For example, see the disclosure by Botella et al, Plant Molec. Biol 24: 757 (1994), of nucleotide sequences for mung bean calmodulin cDNA clones, and Griess et al, Plant Physiol. 104: 1467 (1994), who provide the nucleotide sequence of a maize calmodulin cDNA clone.
(L) A hydrophobic moment peptide. See PCT application WO95/16776 (disclosure of peptide derivatives of Tachyplesin which inhibit fungal plant pathogens) and PCT application WO95/18855 (teaches synthetic antimicrobial peptides that confer disease resistance), the respective contents of which are hereby incorporated by reference.
(M) A membrane permease, a channel former or a channel blocker. For example, see the disclosure by Jaynes et al, Plant Sci. 89: 43 (1993), of heterologous expression of a cecropin lytic peptide analog to render transgenic tobacco plants resistant to Pseudomonas solanacearum.
(N) A viral-invasive protein or a complex toxin derived therefrom. For example, the accumulation of viral coat proteins in transformed plant cells imparts resistance to viral infection and/or disease development effected by the virus from which the coat protein gene is derived, as well as by related viruses. See Beachy et al, Ann. Rev. Phytopathol. 28: 451 (1990). Coat protein-mediated resistance has been conferred upon transformed plants against alfalfa mosaic virus, cucumber mosaic virus, tobacco streak virus, potato virus X, potato virus Y, tobacco etch virus, tobacco rattle virus and tobacco mosaic virus. Id.
(O) An insect-specific antibody or an immunotoxin derived therefrom. Thus, an antibody targeted to a critical metabolic function in the insect gut would inactivate an affected enzyme, killing the insect. Cf. Taylor et al, Abstract #497, SEVENTH INT'L SYMPOSIUM ON MOLECULAR PLANT-MICROBE INTERACTIONS (Edinburgh, Scotland, 1994) (enzymatic inactivation in transgenic tobacco via production of single-chain antibody fragments).
(P) A virus-specific antibody. See, for example, Tavladoraki et al, Nature 366: 469 (1993), who show that transgenic plants expressing recombinant antibody genes are protected from virus attack.
(Q) A developmental-arrestive protein produced in nature by a pathogen or a parasite. Thus, fungal endo α-1,4-D-polygalacturonases facilitate fungal colonization and plant nutrient release by solubilizing plant cell wall homo-α-1,4-D-galacturonase. See Lamb et al, Bio/Technology 10: 1436 (1992). The cloning and characterization of a gene which encodes a bean endopolygalacturonase-inhibiting protein is described by Toubart et al, Plant J. 2: 367 (1992).
(R) A developmental-arrestive protein produced in nature by a plant. For example, Logemann et al, Bio/Technology 10: 305 (1992), have shown that transgenic plants expressing the barley ribosome-inactivating gene have an increased resistance to fungal disease.
2. Genes That Confer Resistance To A Herbicide. For Example:
(A) A herbicide that inhibits the growing point or meristem, such as an imidazolinone or a sulfonylurea. Exemplary genes in this category code for mutant ALS and AHAS enzyme as described, for example, by Lee et al., EMBO J. 7: 1241 (1988), and Miki et al, Theor. Appl. Genet. 80: 449 (1990), respectively.
(B) Glyphosate (resistance imparted by mutant 5-enolpyruvylshikimate-3-phosphate synthase (EPSP) and aroA genes, respectively) and other phosphono compounds such as glufosinate (phosphinothricin acetyl transferase (PAT) and Streptomyces hygroscopicus phosphinothricin acetyl transferase (bar) genes), and pyridinoxy or phenoxy propionic acids and cyclohexones (ACCase inhibitor-encoding genes). See, for example, U.S. patent No. 4,940,835 to Shah et al., which discloses the nucleotide sequence of a form of EPSP which can confer glyphosate resistance. A DNA molecule encoding a mutant aroA gene can be obtained under ATCC accession No. 39256, and the nucleotide sequence of the mutant gene is disclosed in U.S. patent No. 4,769,061 to Comai. European patent application No. 0 333 033 to Kumada et al. and U.S. patent No. 4,975,374 to Goodman et al. disclose nucleotide sequences of glutamine synthetase genes which confer resistance to herbicides such as L-phosphinothricin. The nucleotide sequence of a phosphinothricin-acetyl-transferase gene is provided in European application No. 0 242 246 to Leemans et al. De Greef et al, Bio/Technology 7: 61 (1989), describe the production of transgenic plants that express chimeric bar genes coding for phosphinothricin acetyl transferase activity. Exemplary of genes conferring resistance to phenoxy propionic acids and cyclohexones, such as sethoxydim and haloxyfop, are the Acc1-S1, Acc1-S2 and Acc1-S3 genes described by Marshall et al, Theor. Appl. Genet. 83: 435 (1992).
(C) A herbicide that inhibits photosynthesis, such as a triazine (psbA and gs+ genes) and a benzonitrile (nitrilase gene). Przibilla et al, Plant Cell 3: 169 (1991), describe the transformation of Chlamydomonas with plasmids encoding mutant psbA genes. Nucleotide sequences for nitrilase genes are disclosed in U.S. patent No. 4,810,648 to Stalker, and DNA molecules containing these genes are available under ATCC Accession Nos. 53435, 67441 and 67442. Cloning and expression of DNA coding for a glutathione S-transferase is described by Hayes et al, Biochem. J. 285: 173 (1992).
3. Genes That Confer Or Contribute To A Value-Added Trait, Such As:
(A) Modified fatty acid metabolism, for example, by transforming a plant with an antisense gene of stearoyl-ACP desaturase to increase stearic acid content of the plant. See Knutzon et al, Proc. Natl Acad. Sci. USA 89: 2624 (1992).
(B) Decreased phytate content
(1) Introduction of a phytase-encoding gene would enhance breakdown of phytate, adding more free phosphate to the transformed plant. For example, see Van Hartingsveldt et al., Gene 127: 87 (1993), for a disclosure of the nucleotide sequence of an Aspergillus niger phytase gene.
(2) A gene could be introduced that reduces phytate content. In maize, for example, this could be accomplished by cloning and then reintroducing DNA associated with the single allele which is responsible for maize mutants characterized by low levels of phytic acid. See Raboy et al, Maydica 35: 383 (1990).
(C) Modified carbohydrate composition effected, for example, by transforming plants with a gene coding for an enzyme that alters the branching pattern of starch. See Shiroza et al, J. Bacteriol 170: 810 (1988) (nucleotide sequence of Streptococcus mutans fructosyltransferase gene), Steinmetz et al, Mol. Gen. Genet. 200: 220 (1985) (nucleotide sequence of Bacillus subtilis levansucrase gene), Pen et al, Bio/Technology 10: 292 (1992) (production of transgenic plants that express Bacillus licheniformis α-amylase), Elliot et al, Plant Molec. Biol. 21: 515 (1993) (nucleotide sequences of tomato invertase genes), Søgaard et al, J. Biol Chem. 268: 22480 (1993) (site-directed mutagenesis of barley α-amylase gene), and Fisher et al, Plant Physiol 102: 1045 (1993) (maize endosperm starch branching enzyme II).
4. Genes That Control Growth and Development. Such As:
(A) Genes that control cell proliferation and growth of the embryo and/or endosperm such as cell cycle regulators (Bogre L et al., "Regulation of cell division and the cytoskeleton by mitogen-activated protein kinases in higher plants." Results Probl Cell Differ 27:95-117 (2000); Huntley RP and Murray JA, "The plant cell cycle." Curr Opin Plant Biol 2:440-446 (1999)), the MEDEA (MEA) (Grossniklaus et al., "Maternal control of embryogenesis by MEDEA, a Polycomb group gene in Arabidopsis." Science 280: 446-50 (1998)), FERTILIZATION-INDEPENDENT SEEDS2 (FIS2) (Luo et al. "Genes controlling fertilization-independent seed development in Arabidopsis thaliana." Proc Natl Acad Sci U S A 96:296-301 (1999)), and FERTILIZATION-INDEPENDENT ENDOSPERM (FIE) (Ohad N, et al. "Mutations in FIE, a WD Polycomb group gene, allow endosperm development without fertilization." Plant Cell 11:407-16 (1999)) genes which affect cell proliferation in embryo and endosperm to increase seed size, change the relative biomass ratio of embryo to endosperm or to minimize seed content (seed ablation).
(B) Genes that promote autonomous endosperm proliferation in the absence of fertilization, a component of apomixis (asexual reproduction through seeds) such as MEA, FIS2 or FIE (see Grossniklaus et al., Luo et al., Ohad et al., supra).
(C) Genes that promote embryo formation, another component of apomixis (asexual reproduction through seeds) such as the SOMATIC EMBRYOGENESIS RELATED KINASE (SERK) (Schmidt ED, et al. "A leucine-rich repeat containing receptor-like kinase marks somatic plant cells competent to form embryos." Development 124:2049-62 (1997)), or LEAFY COTYLEDON1 (LEC1) (Lotan T, et al. "Arabidopsis LEAFY COTYLEDON1 is sufficient to induce embryo development in vegetative cells." Cell 93:1195-205 (1998)).
(D) Toxin genes such as diphtheria toxin A (Nilsson O et al. "Genetic ablation of flowers in transgenic Arabidopsis." Plant J 15:799-804 (1998)) or ricin A (Moffat et al. "Inducible cell ablation in Drosophila by cold-sensitive ricin A chain." Development 114:681-7 (1992)), or other genes that interfere with cell viability such as barnase (Mariani C, et al. "Engineered male sterility in plants." Symp Soc Exp Biol. 45:271-9 (1991)) to ablate seeds for the production of seedless fruits and plants.
PROMOTERS
The promoters disclosed herein may be used in conjunction with naturally occurring flanking coding or transcribed sequences of the desired structural gene/s or with any other coding or transcribed sequence that is critical to structural gene formation and/or function.
It may also be desirable to include some intron sequences in the promoter constructs since the inclusion of intron sequences in the coding region may result in enhanced expression and specificity. Thus, it may be advantageous to join the DNA sequences to be expressed to a promoter sequence that contains the first intron and exon sequences of a polypeptide which is unique to cells/tissues of a plant critical to seed specific Structural formation and/or function. Additionally, regions of one promoter may be joined to regions from a different promoter in order to obtain the desired promoter activity resulting in a chimeric promoter. Synthetic promoters which regulate gene expression may also be used. The expression system may be further optimized by employing supplemental elements such as transcription terminators and/or enhancer elements.
OTHER REGULATORY ELEMENTS
In addition to a promoter sequence, an expression cassette or construct should also contain a transcription termination region downstream of the structural gene to provide for efficient termination. The termination region or polyadenylation signal may be obtained from the same gene as the promoter sequence or may be obtained from different genes. Polyadenylation sequences include, but are not limited to, the Agrobacterium octopine synthase signal (Gielen et al., EMBO J. (1984) 3:835-846) or the nopaline synthase signal (Depicker et al., Mol. and Appl. Genet. (1982) 1:561-573).
MARKER GENES
Recombinant DNA molecules containing any of the DNA sequences and promoters described herein may additionally contain selection marker genes which encode a selection gene product that confers on a plant cell resistance to a chemical agent or physiological stress, or confers a distinguishable phenotypic characteristic on the cells, such that plant cells transformed with the recombinant DNA molecule may be easily selected using a selective agent. One such selection marker gene is neomycin phosphotransferase (NPT II), which confers resistance to kanamycin and the antibiotic G-418. Cells transformed with this selection marker gene may be selected for by assaying for the presence in vitro of phosphorylation of kanamycin using techniques described in the literature or by testing for the presence of the mRNA coding for the NPT II gene by Northern blot analysis in RNA from the tissue of the transformed plant. Polymerase chain reactions are also used to identify the presence of a transgene or its expression, using reverse transcriptase PCR amplification to monitor expression and PCR on genomic DNA to detect the transgene. Other commonly used selection markers include the ampicillin resistance gene, the tetracycline resistance gene and the hygromycin resistance gene. Transformed plant cells thus selected can be induced to differentiate into plant structures which will eventually yield whole plants. It is to be understood that a selection marker gene may also be native to a plant.
TRANSFORMATION
A recombinant DNA molecule whether designed to inhibit expression or to provide for expression containing any of the DNA sequences and/or promoters described herein may be integrated into the genome of a plant by first introducing a recombinant DNA molecule into a plant cell by any one of a variety of known methods. Preferably the recombinant DNA molecule(s) are inserted into a suitable vector and the vector is used to introduce the recombinant DNA molecule into a plant cell.
The use of Cauliflower Mosaic Virus (CaMV) (Howell, S.H., et al, 1980, Science, 208:1265) and gemini viruses (Goodman, R.M., 1981, J. Gen Virol. 54:9) as vectors has been suggested, but by far the greatest reported successes have been with Agrobacteria sp. (Horsch, R.B., et al, 1985, Science 227:1229-1231).
Methods for the use of Agrobacterium based transformation systems have now been described for many different species. Generally, strains of bacteria are used that harbor modified versions of the naturally occurring Ti plasmid such that DNA is transferred to the host plant without the subsequent formation of tumors. These methods involve the insertion, within the borders of the Ti plasmid, of the DNA to be inserted into the plant genome linked to a selection marker gene to facilitate selection of transformed cells. Bacteria and plant tissues are cultured together to allow transfer of foreign DNA into plant cells, then transformed plants are regenerated on selection media. Any number of different organs and tissues can serve as targets for Agrobacterium mediated transformation, as described specifically for members of the Brassicaceae. These include thin cell layers (Charest, P.J., et al, 1988, Theor. Appl. Genet. 75:438-444), hypocotyls (DeBlock, M., et al, 1989, Plant Physiol. 91:694-701), leaf discs (Feldman, K.A., and Marks, M.D., 1986, Plant Sci. 47:63-69), stems (Fry J., et al, 1987, Plant Cell Repts. 6:321-325), cotyledons (Moloney M. M., et al, 1989, Plant Cell Repts. 8:238-242) and embryoids (Neuhaus, G., et al, 1987, Theor. Appl. Genet. 75:30-36), or even whole plants using the vacuum infiltration, floral dip or floral spraying transformation procedures available in Arabidopsis and Medicago at present but likely applicable to other plants in the near future. It is understood, however, that it may be desirable in some crops to choose a different tissue or method of transformation.
Other methods that have been employed for introducing recombinant molecules into plant cells involve mechanical means such as direct DNA uptake, liposomes, electroporation (Guerche, P. et al, 1987, Plant Science 52:111-116) and micro-injection (Neuhaus, G., et al, 1987, Theor. Appl. Genet. 75:30-36). The possibility of using microprojectiles and a gun or other device to force small metal particles coated with DNA into cells has also received considerable attention (Klein, T.M. et al., 1987, Nature 327:70-73).
It is often desirable to have the DNA sequence in the homozygous state, which may require more than one transformation event to create a parental line, requiring transformation with a first and second recombinant DNA molecule both of which encode the same gene product. It is further contemplated in some of the embodiments of the process of the invention that a plant cell be transformed with a recombinant DNA molecule containing at least two DNA sequences or be transformed with more than one recombinant DNA molecule. The DNA sequences or recombinant DNA molecules in such embodiments may be physically linked, by being in the same vector, or physically separate on different vectors. A cell may be simultaneously transformed with more than one vector provided that each vector has a unique selection marker gene. Alternatively, a cell may be transformed with more than one vector sequentially, allowing an intermediate regeneration step after transformation with the first vector. Further, it may be possible to perform a sexual cross between individual plants or plant lines containing different DNA sequences or recombinant DNA molecules (preferably the DNA sequences or the recombinant molecules are linked or located on the same chromosome), and then to select from the progeny of the cross plants containing both DNA sequences or recombinant DNA molecules. Expression of recombinant DNA molecules containing the DNA sequences and promoters described herein in transformed plant cells may be monitored using Northern blot techniques and/or Southern blot techniques or PCR-based methods known to those of skill in the art.
A large number of plants have been shown capable of regeneration from transformed individual cells to obtain transgenic whole plants. For example, regeneration has been shown for dicots as follows: apple, Malus pumila (James et al., Plant Cell Reports (1989) 7:658); blackberry, Rubus; blackberry/raspberry hybrid, Rubus; red raspberry, Rubus (Graham et al., Plant Cell, Tissue and Organ Culture (1990) 20:35); carrot, Daucus carota (Thomas et al., Plant Cell Reports (1989) 8:354; Wurtele and Bulka, Plant Science (1989) 61:253); cauliflower, Brassica oleracea (Srivastava et al., Plant Cell Reports (1988) 7:504); celery, Apium graveolens (Catlin et al., Plant Cell Reports (1988) 7:100); cucumber, Cucumis sativus (Trulson et al., Theor. Appl. Genet. (1986) 73:11); eggplant, Solanum melongena (Guri and Sink, J. Plant Physiol. (1988) 133:52); lettuce, Lactuca sativa (Michelmore et al., Plant Cell Reports (1987) 6:439); potato, Solanum tuberosum (Sheerman and Bevan, Plant Cell Reports (1988) 7:13); rape, Brassica napus (Radke et al., Theor. Appl. Genet. (1988) 75:685; Moloney et al., Plant Cell Reports (1989) 8:238); soybean (wild), Glycine canescens (Rech et al., Plant Cell Reports (1989) 8:33); strawberry, Fragaria x ananassa (Nehra et al., Plant Cell Reports (1990) 9:10); tomato, Lycopersicon esculentum (McCormick et al., Plant Cell Reports (1986) 5:81); walnut, Juglans regia (McGranahan et al., Plant Cell Reports (1990) 8:512); melon, Cucumis melo (Fang et al., 86th Annual Meeting of the American Society for Horticultural Science, Hort. Science (1989) 24:89); grape, Vitis vinifera (Colby et al., Symposium on Plant Gene Transfer, UCLA Symposia on Molecular and Cellular Biology, J Cell Biochem Suppl (1989) 13D:255); mango, Mangifera indica (Mathews, et al., Symposium on Plant Gene Transfer, UCLA Symposia on Molecular and Cellular Biology, J Cell Biochem Suppl (1989) 13D:264); and for the following monocots: rice, Oryza sativa (Shimamoto et al., Nature (1989) 338:274); rye, Secale cereale (de la Pena et al., Nature (1987) 325:274); maize (Rhodes et al., Science (1988) 240:204).
In addition, regeneration of whole plants from cells (not necessarily transformed) has been observed in apricot, Prunus armeniaca (Pieterse, Plant Cell Tissue and Organ Culture (1989) 19:175); asparagus, Asparagus officinalis (Elmer et al., J. Amer. Soc. Hort. Sci. (1989) 114:1019); banana, hybrid Musa (Escalant and Teisson, Plant Cell Reports (1989) 7:665); bean, Phaseolus vulgaris (McClean and Grafton, Plant Science (1989) 60:117); cherry, hybrid Prunus (Ochatt et al., Plant Cell Reports (1988) 7:393); grape, Vitis vinifera (Matsuta and Hirabayashi, Plant Cell Reports (1989) 7:684); mango, Mangifera indica (DeWald et al., J Amer Soc Hort Sci (1989) 114:712); melon, Cucumis melo (Moreno et al., Plant Sci Letters (1985) 34:195); okra, Abelmoschus esculentus (Roy and Mangat, Plant Science (1989) 60:77; Dirks and van Buggenum, Plant Cell Reports (1989) 7:626); onion, hybrid Allium (Lu et al., Plant Cell Reports (1989) 7:696); orange, Citrus sinensis (Hidaka and Kajikura, Scientia Horticulturae (1988) 34:85); papaya, Carica papaya (Litz and Conover, Plant Sci Letters (1982) 26:153); peach, Prunus persica and plum, Prunus domestica (Mante et al., Plant Cell Tissue and Organ Culture (1989) 19:1); pear, Pyrus communis (Chevreau et al., Plant Cell Reports (1988) 7:688; Ochatt and Power, Plant Cell Reports (1989) 7:587); pineapple, Ananas comosus (DeWald et al., Plant Cell Reports (1988) 7:535); watermelon, Citrullus vulgaris (Srivastava et al., Plant Cell Reports (1989) 8:300); wheat, Triticum aestivum (Redway et al., Plant Cell Reports (1990) 8:714).
The regenerated plants are transferred to standard soil conditions and cultivated in a conventional manner. After the expression or inhibition cassette is stably incorporated into regenerated transgenic plants, it can be transferred to other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.
It may be useful to generate a number of individual transformed plants with any recombinant construct in order to recover plants free from any position effects. It may also be preferable to select plants that contain more than one copy of the introduced recombinant DNA molecule such that high levels of expression of the recombinant molecule are obtained.
According to a preferred embodiment, the transgenic plant provided for commercial production of foreign protein is maize. In another preferred embodiment, the biomass of interest is seed. For the relatively small number of transgenic plants that show higher levels of expression, a genetic map can be generated, primarily via conventional Restriction Fragment Length Polymorphisms (RFLP), Polymerase Chain Reaction (PCR) analysis, and Simple Sequence Repeats (SSR) which identifies the approximate chromosomal location of the integrated DNA molecule. For exemplary methodologies in this regard, see Glick and Thompson, METHODS IN PLANT MOLECULAR BIOLOGY AND BIOTECHNOLOGY 269-284 (CRC Press, Boca Raton, 1993). Map information concerning chromosomal location is useful for proprietary protection of a subject transgenic plant. If unauthorized propagation is undertaken and crosses made with other germplasm, the map of the integration region can be compared to similar maps for suspect plants, to determine if the latter have a common parentage with the subject plant. Map comparisons would involve hybridizations, RFLP, PCR, SSR and sequencing, all of which are conventional techniques.
As indicated above, it may be desirable to produce plant lines which are homozygous for a particular gene. In some species this is accomplished rather easily by the use of anther culture or isolated microspore culture. This is especially true for the oil seed crop Brassica napus (Keller and Armstrong, Z. Pflanzenzücht. 80:100-108, 1978). By using these techniques, it is possible to produce a haploid line that carries the inserted gene and then to double the chromosome number either spontaneously or by the use of colchicine. This gives rise to a plant that is homozygous for the inserted gene, which can be easily assayed for if the inserted gene carries with it a suitable selection marker gene for detection of plants carrying that gene. Alternatively, plants may be self-fertilized, leading to the production of a mixture of seed that consists of, in the simplest case, three types: homozygous (25%), heterozygous (50%) and null (25%) for the inserted gene. It is relatively easy to distinguish null plants from those that contain the gene, and in practice homozygous plants can be distinguished from heterozygous plants by Southern blot analysis in which careful attention is paid to loading exactly equivalent amounts of DNA from the mixed population, with heterozygotes scored by the intensity of the signal from a probe specific for the inserted gene. It is advisable to verify the results of the Southern blot analysis by allowing each independent transformant to self-fertilize, since additional evidence for homozygosity can be obtained from the simple fact that if the plant was homozygous for the inserted gene, all of the subsequent plants from the selfed seed will contain the gene, while if the plant was heterozygous for the gene, the generation grown from the selfed seed will contain null plants. Therefore, with simple selfing one can easily select homozygous plant lines, which can also be confirmed by Southern blot analysis.
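The segregation arithmetic described above can be sketched in a few lines of Python. This is only an illustration under the simplest-case assumption stated in the text (a single insertion locus segregating in Mendelian fashion); the function names and the allele labels "T" and "-" are hypothetical and do not appear in the original disclosure.

```python
from fractions import Fraction
from itertools import product


def selfing_ratios():
    """Genotype fractions among progeny of a selfed plant carrying one copy (T/-)."""
    gametes = ["T", "-"]  # hypothetical labels: T = inserted gene, - = no insert
    counts = {"homozygous": 0, "heterozygous": 0, "null": 0}
    for egg, pollen in product(gametes, repeat=2):
        copies = (egg == "T") + (pollen == "T")
        label = {2: "homozygous", 1: "heterozygous", 0: "null"}[copies]
        counts[label] += 1
    total = sum(counts.values())
    return {label: Fraction(n, total) for label, n in counts.items()}


def prob_heterozygote_shows_no_nulls(n_progeny):
    """Chance that a selfed heterozygote yields no null plant among n progeny.

    Each progeny plant carries the gene with probability 3/4, so a
    heterozygous parent passes the progeny test by chance with (3/4)**n.
    """
    return Fraction(3, 4) ** n_progeny


if __name__ == "__main__":
    print(selfing_ratios())
    for n in (4, 8, 16):
        print(n, float(prob_heterozygote_shows_no_nulls(n)))
```

Under these assumptions, the second function gives a rough sense of how many selfed progeny need to be scored before the absence of null plants becomes convincing evidence of homozygosity.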
Creation of homozygous parental lines makes possible the production of hybrid plants and seeds which will contain a modified protein component. Transgenic homozygous parental lines are maintained with each parent containing either the first or second recombinant DNA sequence operably linked to a promoter. Also incorporated in this scheme are the advantages of growing a hybrid crop, including the combining of more valuable traits and hybrid vigor.
The following examples serve to better illustrate the invention described herein and are not intended to limit the invention in any way. All references cited herein are hereby expressly incorporated to this document in their entirety by reference.
EXAMPLES
Little is known about the timing of the maternal-to-zygotic transition during seed development in flowering plants. Because plant embryos can develop from somatic cells or microspores (Mordhorst, A. P., Toonen, M.A.J. & de Vries, S.C. Plant embryogenesis. Crit. Rev. Pl. Sci. 16, 535-576 (1997)), maternal contributions are not considered to play a crucial role in early embryogenesis (Walbot, V. Sources and consequences of phenotypic and genotypic plasticity in flowering plants. Trends Pl. Sci. 1, 27-32 (1996)). Early acting embryo lethal mutants in Arabidopsis, including emb30/gnom affecting the first zygotic division (Meinke, D.W. Embryo-lethal mutants of Arabidopsis thaliana: analysis of mutants with a wide range of lethal phases. Theor. Appl. Genet. 69, 543-552 (1985); Mayer, U., Büttner, G. & Jürgens, G. Apical-basal pattern formation in the Arabidopsis embryo: studies on the role of the gnom gene. Dev. 117, 149-162 (1993)), have fuelled the perception that both maternal and paternal genomes are active immediately after fertilization. Here we show that none of the paternally inherited alleles of 20 loci that we tested is expressed during early seed development in Arabidopsis. For genes that are expressed at later stages, the paternally inherited allele becomes active three to four days after fertilization. The genes that we tested are involved in various processes and distributed throughout the genome, indicating that most, if not all, of the paternal genome may be initially silenced. Our findings are corroborated by genetic studies which show that emb30/gnom has a maternal effect phenotype that is paternally rescuable in addition to its zygotic lethality. Thus, contrary to previous interpretations, early embryo and endosperm development are mainly under maternal control.
In flowering plants, double fertilization involves two sperm cells, one of which fuses with the egg cell to form a diploid zygote, while the second fuses with the binucleated central cell to give rise to the triploid primary endosperm nucleus (van Went, J.L. & Willemse, M.T.M. Fertilization. In Embryology of Angiosperms (Johri, B. ed) Springer-Verlag, Berlin, pp 273-318 (1984)). Double fertilization triggers rapid proliferation of the endosperm and slow cell divisions of the zygote, which usually undergoes an asymmetrical division (Bowman, J.L. & Mansfield S.G. Embryogenesis. In Arabidopsis: An Atlas of Morphology and Development (Bowman, J. ed.) Springer-Verlag, New York, pp 351-361 (1994)). In Arabidopsis, as in the majority of plant species, the primary endosperm nucleus undergoes divisions without cytokinesis, giving rise to a syncytium that eventually cellularizes (Berger, F. Endosperm development. Curr. Opin. Plant Biol. 2, 28-32 (1999)). In contrast to animals (Zalokar, M. Autoradiographic study of protein and RNA formation during early development in Drosophila eggs. Dev. Biol. 49: 425-437 (1976); Newman, E.D. & Rothman, J.H. The maternal-to-zygotic transition in embryonic patterning of Caenorhabditis elegans. Curr. Op. Gen. & Dev. 8, 472-480 (1998); Bensaude, O., Babinet, C., Morange, M. & Jacob F. Heat shock proteins, first major products of zygotic gene activity in mouse embryo. Nature 305, 331-332 (1983)), the timing of transcriptional activation of the genome in the plant embryo and endosperm has not been intensively studied (Zimmerman, J.L. & Cohill, P.R. Heat shock and thermotolerance in plant and animal embryogenesis. New. Biol. 7, 641-650 (1991)). The identification of a large group of early acting embryo lethal Arabidopsis mutants that segregate as sporophytic recessive traits (Meinke, D.W. Seed development in Arabidopsis thaliana. In Arabidopsis (Meyerowitz, E.M. & Somerville C.R. eds.). Cold Spring Harbor Laboratory Press, p. 253-295 (1994)) suggested that the activation of the zygotic genome occurs before the first division of the zygote. However, a spatial and temporal pattern for zygotic genome activation has not yet been determined.
We generated a library of enhancer detector and gene trap lines (transposants) that harbour Ds elements with a uidA reporter gene encoding β-glucuronidase (GUS) by using the system of Sundaresan et al. (Sundaresan, V. et al. Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes & Dev. 9, 1797-1810 (1995)). Screening for genes that act during ovule and early seed development in Arabidopsis (U. Grossniklaus et al., unpublished results), we identified 19 transposants that show GUS expression in the developing embryo and/or endosperm after fertilization. GUS is expressed in the egg and/or central cell and persists for several rounds of cell division in either one or both fertilization products. To determine whether in each of these lines GUS expression was the result of transcription from only one or both parental alleles, we performed reciprocal crosses between wild-type plants and the 19 transposants. When wild-type plants were used as male parents, the resulting F1 seeds showed GUS expression in a pattern identical to the one found in developing seeds resulting from self-pollination in all 19 lines (Fig. 1a to 1c). In contrast, if the transposants were used as male parents, GUS expression was absent from all F1 seeds and remained undetectable up to 80 hours after pollination (HAP) (Fig. 1d). This suggests that the paternally inherited allele is not expressed during early stages of embryo and endosperm development. To verify that the pattern of GUS expression truly reflects the expression of genes neighbouring the insertion, we performed in-situ hybridization with digoxigenin-labeled probes of a gene encoding a putative basal transcription factor that was tagged by ET2612 (see Table B).
Table B. Identification of genes expressed during early seed formation in Arabidopsis
Line | Postfertilization GUS Expression* | Localization of the Insertion | Chromosome Position† | Accession Number††
ET1041 | Embryo and endosperm | PROLIFERA# | IV, 12 cM | L39954
ET1051 | Embryo and endosperm | Homology to rape mRNA | IV, 69 cM | AL031326 (CAA20464)
ET1275 | Embryo | Cop1 interacting protein CIP8## | V, 125 cM | AF162150
ET1278 | Embryo | Isoflavone reductase homolog | | New sequence
ET1811 | Embryo and endosperm | 14-3-3 GF14 mu | II, 83 cM | AC007087 (AAD23005)
ET2209 | Endosperm | Transmembrane protein homolog | II, 79 cM | AC004261 (AAD12006)
ET2612 | Embryo and endosperm | Basal pol III transcription factor homolog | II, 1 cM | AC006200 (AAD14528)
ET2567 | Endosperm | Hypothetical protein | IV, 77 cM | AL021749 (CAA16882)
ET3536 | Endosperm | AT-hook DNA binding protein | IV, 69 cM | AL021635 (CAA16562)
ET3988 | Embryo and endosperm | Disulfide isomerase homolog | II, 87 cM | AC002535 (AAC62863)
ET3992 | Embryo and endosperm | No homology | | New sequence
ET4336 | Embryo and endosperm | Germinating seed mRNA | I, 70 cM | AF162845
* All lines show postfertilization GUS expression only when self-pollinated or used as females in reciprocal crosses to the wild type. See Figure 4.
† The map position is based on the nearest genetic marker referenced in the RI list of Lister and Dean (see http://genome-www.stanford.edu/Arabidopsis/).
†† Accession numbers correspond to nucleotide sequences in Genbank; protein accession numbers are indicated in brackets.
# Springer, P.S., McCombie, W.R., Sundaresan, V., & Martienssen, R.A. Gene trap tagging of PROLIFERA, an essential MCM2-3-5-like gene in Arabidopsis. Science 268, 877-880 (1995).
## Torii, K.U., Stoop-Myer, C.D., Okamoto, H., Coleman, J.E., Matsui, M. & Deng, X.W. The RING finger motif of photomorphogenic repressor COP1 specifically interacts with the RING-H2 motif of a novel Arabidopsis protein. J. Biol. Chem. 274, 27674-27681 (1999).
mRNA was detected in the developing endosperm (Fig. 1f) as well as the early globular embryo (data not shown) in a pattern identical to the one observed in GUS assays (Fig. 1e). Uniparental expression could be the consequence of transgene silencing specifically affecting the paternally inherited allele. Alternatively, it could indicate that our enhancer detection screen identified a subclass of genes which are transcribed maternally prior to fertilization and are subject to paternal silencing. A third possibility is that the paternally inherited genome remains silent during the first days following double fertilization, and that the activation of the paternal genome occurs after several rounds of cell division in both the embryo and endosperm.
To determine whether the absence of paternally-derived GUS expression was due to sex-specific inactivation of the enhancer detector transgene or whether it reflected sex-specific silencing of endogenous genes, we designed a reverse transcription polymerase chain reaction (RT-PCR) assay that depends on single nucleotide polymorphisms (SNPs) between the Columbia (Col) and Landsberg erecta (Ler) ecotypes. SNPs in endonuclease restriction sites allow the distinction of transcripts derived from Col and Ler alleles. PROLIFERA (PRL) is an MCM2-3-5-like replication licensing factor required for the initiation of DNA replication (Springer, P.S., McCombie, W.R., Sundaresan, V., & Martienssen, R.A. Gene trap tagging of PROLIFERA, an essential MCM2-3-5-like gene in Arabidopsis. Science 268, 877-880 (1995)). We have identified a new enhancer detector insertion (ET1041) within the coding region of PRL (see Table B), and reporter gene expression confirmed that PRL is expressed in the developing embryo and free nuclear endosperm. A SNP in the second exon of PRL creates a XhoI site present in Col but not in Ler (Fig. 2a). To test the sensitivity of our assay we amplified a region spanning exon 2 in titrated mixtures of genomic DNA. After digesting the PCR products with XhoI, we could consistently detect both PRL alleles even when the amount of Col DNA used as a template was 80 times less than that of Ler DNA (Fig 2b). We examined PRL expression by RT-PCR on RNA isolated 24, 32, 50, 68 and 156 HAP from developing siliques derived from reciprocal crosses between Ler and Col plants. During the first three days following pollination (up to 68 HAP) only transcripts derived from the maternally inherited allele could be detected (Fig 2c). These results confirm that the paternally inherited PRL allele is not transcribed in either embryo or endosperm during early seed development. Transcripts from both PRL alleles could be detected at 156 HAP (Fig 2c), indicating that the activation of the paternally inherited PRL allele occurs only later during seed development. Additional histochemical analysis of ET1041 showed that paternally derived GUS expression can be observed earlier, in seeds containing embryos at the mid-globular stage 92 HAP (Fig 2e).
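The logic of the allele-specific assay described above can be illustrated with a minimal Python sketch. The XhoI recognition sequence (CTCGAG, cut after the first C) is standard; everything else here is an assumption made for illustration: the two amplicon sequences are invented placeholders, not the PRL sequence, and the function names are hypothetical. Partial digestion and detection sensitivity are not modelled.

```python
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence (palindromic)
XHOI_CUT_OFFSET = 1    # XhoI cuts after the first C: C^TCGAG


def digest(seq, site=XHOI_SITE, cut_offset=XHOI_CUT_OFFSET):
    """Return fragment lengths after complete digestion of a linear amplicon."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)
    return fragments


def bands(amplicons):
    """Distinct band sizes expected on a gel for a mixture of amplicons."""
    sizes = set()
    for amplicon in amplicons:
        sizes.update(digest(amplicon))
    return sorted(sizes)


def transcripts_detected(band_sizes, full_length):
    """Interpret bands: cut products imply Col, the uncut product implies Ler."""
    has_uncut = full_length in band_sizes
    has_cut = any(size < full_length for size in band_sizes)
    if has_cut and has_uncut:
        return "Col and Ler"
    if has_cut:
        return "Col only"
    if has_uncut:
        return "Ler only"
    return "no product"


# Invented placeholder amplicons: identical except for one SNP in the site.
LEFT = "ACGT" * 10
RIGHT = "TTGCA" * 12
COL_AMPLICON = LEFT + "CTCGAG" + RIGHT  # site present (Col-like)
LER_AMPLICON = LEFT + "CTCGAA" + RIGHT  # SNP destroys the site (Ler-like)

if __name__ == "__main__":
    full = len(COL_AMPLICON)
    for name, sample in [("Ler mother x Col father, early", [LER_AMPLICON]),
                         ("same cross, late", [LER_AMPLICON, COL_AMPLICON])]:
        print(name, bands(sample), transcripts_detected(bands(sample), full))
```

In this sketch, a sample in which only the maternal (Ler-like) transcript is present yields a single uncut band, whereas activation of the paternal (Col-like) allele adds the two cleavage products.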
Interestingly, GUS expression is initially restricted to a small region of the chalazal chamber in the nodular cyst of the developing endosperm (Fig. 2d-f), and progressively expands to include cellularizing tissue (data not shown). These results indicate that the paternal silencing we observed using transposants affects endogenous loci and is independent of the presence of a transgene. To test whether a particular subset of genes was affected by this epigenetic regulation we determined the molecular nature of the genes identified by enhancer detection. We isolated genomic regions flanking the Ds insertion using thermal asymmetric interlaced PCR (TAIL-PCR) (Liu, Y-G., Mitsukawa, N., Oosumi, T., & Whittier, R.F. (1995). Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 8, 457-463). We subcloned and sequenced a total of 16 flanking fragments representing 12 loci (Table B). For all of them the insertion site could be confirmed by the presence of a 3'- or 5'- Ds end bordering the genomic sequence. We identified ten insertions within the regulatory region or coding sequence of genes that are represented in Genbank, and two insertions in novel genomic sequence. (Table A)
Whereas sequences from two insertions show no homology to sequences in public databases or have similarity to genes of unknown function, the remaining eight genes encode proteins with a wide variety of functions. Some of them are similar to genes involved in basic cellular functions such as cell cycle regulation, the basal transcription machinery, or the assembly of protein secondary structure. Others encode putative signal transduction proteins or transcription factors that may play regulatory roles during seed development. The Ds insertions are located on at least four out of the five chromosomes of Arabidopsis (Table B). These results indicate that the genes identified in our screen do not encode for members of a specific family of proteins and are distributed throughout the Arabidopsis genome.
Our observations suggest that most if not all of the paternal genome is silenced during early seed development. We expect that other genes known to be expressed at early stages are also affected by this regulation. Therefore, we examined EMB30/GNOM (Shevell, D.E. et al. EMB30 is essential for normal cell division, cell expansion, and cell adhesion in Arabidopsis and encodes a protein that has similarity to Sec7. Cell 77, 1051-1062 (1994); Busch, M., Mayer, U. & Jürgens, G. Molecular analysis of the Arabidopsis pattern formation gene GNOM: gene structure and intragenic complementation. Mol Gen Genet 250, 681-91 (1996)) expression using allele-specific RT-PCR. Embryos homozygous for emb30 are defective in the establishment of the apical-basal axis. In some emb30 embryos the zygote divides almost symmetrically, giving rise to an enlarged apical cell that subsequently forms an abnormal globular embryo (Meinke, D.W. Embryo-lethal mutants of Arabidopsis thaliana: analysis of mutants with a wide range of lethal phases. Theor. Appl. Genet. 69, 543-552 (1985)). Self-pollinated heterozygous emb30/EMB30 individuals produce 25% aborted, embryo-lethal seeds, suggesting a zygotic requirement. The phenotype of emb30 has been interpreted as being caused by a recessive mutation that affects a gene active during the earliest diploid (sporophytic) phase of embryogenesis (Howell, S.H. Molecular genetics of plant development. Cambridge University Press, New York. 365 pp. (1999)). The emb30 embryo phenotype has been the strongest argument in favor of early genome activation in Arabidopsis. EMB30 encodes a Sec7-like protein that has recently been shown to be essential for auxin transport (Steinmann, T. et al. Coordinated polar localization of auxin efflux carrier PIN1 by GNOM ARF GEF. Science 286, 316-318 (1999)). Although EMB30 is expressed throughout the plant, its allele-specific pattern of expression has not been investigated during seed development.
A SNP in the first exon of EMB30 creates an Eco57I site present in Col but not in Ler (Fig. 3a). We examined the allelic pattern of EMB30 expression using RT-PCR followed by endonuclease digestion with Eco57I. At 24 HAP, EMB30 transcripts from the paternal allele could not be detected (Fig 3b), suggesting that the initial post-fertilization expression of EMB30 is exclusively dependent on transcription from the maternally inherited allele. However, since EMB30 may be expressed in sporophytic tissues such as the silique or seed coat, we could not discard the possibility that transcripts from the EMB30 maternal allele are present in vast excess over the paternally derived transcripts, obscuring their detection. Therefore, we tested for genetic activity of the paternal EMB30 allele by crossing heterozygous emb30/EMB30 plants with wild-type pollen. If the paternally inherited EMB30 allele is active following fertilization, heterozygous emb30^m/EMB30^p embryos should develop normally. Because of low expressivity of the emb30 phenotype at young stages, only 12.2% (n=131) of the embryos derived from self-fertilization of emb30/EMB30 plants show morphological defects 24 to 48 HAP. If the same plants were crossed to wild-type pollen, 12.9% (n=116) of the F1 seeds showed the same morphological defects, strongly suggesting that the paternally inherited wild-type allele does not provide EMB30 activity. No defective F1 embryos were observed when female wild-type plants were crossed with the same heterozygous emb30/EMB30 individuals (n=53). The defects in emb30^m/EMB30^p embryos include a nearly symmetrical plane of division of the zygote (Fig. 3c,d), an oblique plane of division of the apical cell (Fig. 3e,f), and delayed differentiation of the epidermal precursor cells (data not shown). All these abnormalities have previously been described in emb30 embryos (Mayer, U., Büttner, G. & Jürgens, G. Apical-basal pattern formation in the Arabidopsis embryo: studies on the role of the gnom gene. Dev. 117, 149-162 (1993)). At maturity, all F1 seeds resulting from emb30/EMB30 plants crossed to the wild type are morphologically normal and viable (n=157). Thus, late expression of a paternally derived EMB30 allele is sufficient to rescue embryos inheriting a maternal emb30 allele, or, in genetic terms, emb30 is a paternally rescuable maternal effect mutant and not a purely zygotic embryo lethal.
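The genetic reasoning behind these crosses can be made explicit with a small Python sketch that compares the fraction of early-defective embryos expected under two hypotheses. It is an illustration only: the function name, the allele labels "+" and "m", and the single `expressivity` parameter (the fraction of embryos lacking early EMB30 activity that show a visible defect) are assumptions introduced here, and the model ignores any contribution from transcripts stored before fertilization.

```python
from fractions import Fraction


def expected_defect_fraction(mother, father, paternal_allele_active, expressivity):
    """Expected fraction of embryos with early defects in a given cross.

    mother, father: tuples of alleles, "+" (wild type) or "m" (mutant).
    paternal_allele_active: True models conventional zygotic expression of
        both alleles; False models early silencing of the paternal allele.
    expressivity: fraction of embryos lacking early activity that show a defect.
    """
    per_combination = Fraction(1, len(mother) * len(father))
    lacking_activity = Fraction(0)
    for maternal in mother:
        for paternal in father:
            if paternal_allele_active:
                no_activity = maternal == "m" and paternal == "m"
            else:
                no_activity = maternal == "m"  # paternal allele contributes nothing early
            if no_activity:
                lacking_activity += per_combination
    return lacking_activity * expressivity


if __name__ == "__main__":
    het, wt = ("m", "+"), ("+", "+")
    crosses = {"het selfed": (het, het),
               "het female x wild-type male": (het, wt),
               "wild-type female x het male": (wt, het)}
    for active in (True, False):
        print("paternal allele active:", active)
        for name, (mother, father) in crosses.items():
            print("  ", name, expected_defect_fraction(mother, father, active, Fraction(1)))
```

Under the silencing hypothesis, the sketch predicts the same defect frequency for selfed heterozygotes and for heterozygous females pollinated by wild type, and none in the reciprocal cross, which is the qualitative pattern reported above; under conventional zygotic expression it predicts no defects when wild-type pollen is used.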
Our results suggest that the activity of many genes acting during early embryo and endosperm formation depends solely on transcription from the maternally inherited allele. These maternal transcripts are likely to represent a combination of mRNAs transcribed before and after fertilization. The identification of enhancer detector lines in which GUS expression is detected in the central cell only after fertilization or in specific cells of the embryo (data not shown) indicates that at least some of the detected genes are transcribed zygotically from the maternal allele only and, thus, are regulated by genomic imprinting as was recently shown for the medea locus which displays a true maternal effect (Grossniklaus, U., Vielle-Calzada, J-P., Hoeppner, M.A., Gagliano, W.B. Maternal control of embryogenesis by MEDEA, a Polycomb group gene in Arabidopsis. Science 280, 446-450 (1998); Vielle-Calzada, J-P., Thomas, J., Spillane, Ch., Coluccio, A., Hoeppner, M.A & Grossniklaus, U. Maintenance of genomic imprinting at the Arabidopsis medea locus requires zygotic DDM1 activity. Genes & Dev. 13, 2971-2982 (1999)). Taken together, these results strongly suggest that early seed formation in Arabidopsis is characterised by a delayed transcriptional activation of the paternally inherited genome. Reporter gene expression and RT-PCR analyses indicate that transcription of paternally inherited alleles initiates late after fertilization, when the embryo consists of 32 to 64 cells. Although a case of global paternal silencing has not been reported in the plant kingdom, it is reminiscent of genome-wide heterochromatinisation in scale insects (Nur U. Heterochromatization and euchromatization of whole genomes in scale insects (Coccoidea: Homoptera). Dev. Suppl 1, 29-34 (1990); Buglia, G., Predazzi, V. & Ferraro, M. Cytosine methylation is not involved in the heterochromatization of the paternal genome of mealybug Planococcus citri. Chrom. Res. 
7, 71-73 (1999)) and parent-specific X chromosome inactivation in the extra-embryonic tissues of mammals (Goto T. & Monk M. Regulation of X-chromosome inactivation in development in mice and humans. Microbiol Mol Biol Rev 62, 362-78 (1998)). The silencing of the paternal genome probably occurs during sperm cell differentiation and may be related to the tight packaging of sperm chromatin involving specific histones (Ueda, K. & Tanaka, I. The appearance of male gamete-specific histones gH2B and gH3 during pollen development in Lilium longiflorum. Dev. Biol. 169, 210-217 (1995); Xu, H., Swoboda, I., Bhalla, P.L. & Singh, M.B. Male gametic cell-specific expression of H2A and H3 histone genes. Plant Mol. Biol. 39, 607-614 (1999)) or changes in methylation levels of sperm DNA as compared to DNA in the vegetative nucleus (Oakeley, E.J., Podesta, A. & Jost, J.P. Developmental changes in DNA methylation of the two tobacco pollen nuclei during maturation. Proc. Natl. Acad. Sci. USA 94, 11721-11725 (1997)). Whatever the mechanism, paternal silencing prolongs the functionally haploid phase, which serves to eliminate deleterious mutations (Walbot, V. Sources and consequences of phenotypic and genotypic plasticity in flowering plants. Trends Pl. Sci. 1, 27-32 (1996)), leading to a more stringent selection against such mutations inherited from the mother. However, this mechanism is imperfect because only a subset of genes is expressed in the female gametophyte and early seed, and some maternal defects can be paternally rescued, as illustrated by emb30. The genetic analysis of emb30 indicates that some of the early embryo-lethal mutants, which have been interpreted as affecting zygotically transcribed genes, represent loci that are only transcribed from the maternal allele. 
The delayed transcription of a paternally inherited wild-type allele is sufficient to rescue heterozygous embryos carrying a maternally inherited mutant allele, resulting in the 25% aborted seeds observed for many mutants that act during the pre-globular stage of embryogenesis. Thus, paternal silencing is not necessarily reflected in the segregation ratio of aborted seeds in mature siliques. Our results strongly suggest that the first few days of embryogenesis and endosperm development are largely, if not exclusively, under maternal control. This finding should lead to a reinterpretation of the genetic basis and molecular mechanisms that regulate early seed development in plants.
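The segregation argument above can be checked with a small enumeration. The following is a minimal sketch, not part of the original analysis: it assumes Mendelian transmission of both alleles through each parent, and the function name `aborted_fraction` and the allele symbols `E` (wild type) and `e` (mutant) are hypothetical labels chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def aborted_fraction(mother, father, paternal_rescue):
    """Expected fraction of aborted seeds for a cross (illustrative model).

    mother, father: two-character genotypes, 'E' = wild type, 'e' = mutant.
    paternal_rescue: if True, late expression of a paternal wild-type allele
    rescues the seed, so only e/e seeds abort. If False, the model treats the
    locus as a strict maternal effect: any seed with a maternal 'e' aborts.
    Assumes each parent transmits its two alleles with equal probability.
    """
    outcomes = list(product(mother, father))  # (maternal, paternal) allele pairs
    aborted = Fraction(0)
    for maternal, paternal in outcomes:
        if paternal_rescue:
            dies = maternal == "e" and paternal == "e"
        else:
            dies = maternal == "e"
        if dies:
            aborted += Fraction(1, len(outcomes))
    return aborted


if __name__ == "__main__":
    # Selfed heterozygote: 1/4 aborted with rescue, 1/2 without.
    print(aborted_fraction("Ee", "Ee", paternal_rescue=True))
    print(aborted_fraction("Ee", "Ee", paternal_rescue=False))
    # Heterozygous mother crossed to wild-type pollen: no abortion with rescue.
    print(aborted_fraction("Ee", "EE", paternal_rescue=True))
```

Under these assumptions a selfed heterozygote gives the familiar 25% aborted seeds when paternal rescue operates, which is why the segregation ratio alone cannot distinguish a paternally rescuable maternal effect locus from a purely zygotic embryo lethal.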
Methods.
Plant Material and Growth Conditions.
Plant growth conditions and insertional mutagenesis were described previously (Moore, J.M., Vielle-Calzada, J-P., Gagliano, W.B. & Grossniklaus, U. Genetic characterization of hadad, a mutant disrupting female gametogenesis in Arabidopsis thaliana. CSHL Symp. Quant. Biol. 62, 35-47 (1997); Sundaresan, V. et al. Patterns of gene action in plant development revealed by enhancer trap and gene trap transposable elements. Genes & Dev. 9, 1797-1810 (1995)). The wild-type strains used were A. thaliana (L.) Heynh. var. Landsberg (erecta mutant: Ler) and A. thaliana (L.) Heynh. var. Columbia (Col.). Seeds of emb30-3/EMB30-3 plants were obtained from the Arabidopsis Biological Resource Center (CS6322).
GUS Assays
Developing carpels and siliques were dissected to expose the ovules and incubated in GUS staining buffer [10 mM EDTA, 0.1% Triton X-100, 2 mM Fe2+CN, 2 mM Fe3+CN, 100 μg/ml chloramphenicol, 1 mg/ml X-Gluc (Biosynth) in 50 mM sodium phosphate buffer pH 7.0] for 3 days at 37°C. The tissue was cleared in 20% lactic acid/20% glycerol and observed on a Leica DMRB microscope under Nomarski optics.
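As a convenience, the buffer composition above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. This is only a sketch: the final concentrations come from the text, but every stock concentration in `RECIPE` is a hypothetical value chosen for illustration, and topping up with 50 mM phosphate buffer is a simplification that leaves the buffer slightly below 50 mM.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("stock is more dilute than the target concentration")
    return final_conc / stock_conc * final_volume_ml


# name: (final concentration from the text, HYPOTHETICAL stock concentration, unit)
RECIPE = {
    "EDTA": (10.0, 500.0, "mM"),
    "Triton X-100": (0.1, 10.0, "%"),
    "Fe2+CN": (2.0, 100.0, "mM"),
    "Fe3+CN": (2.0, 100.0, "mM"),
    "chloramphenicol": (0.1, 50.0, "mg/ml"),  # 100 ug/ml expressed in mg/ml
    "X-Gluc": (1.0, 50.0, "mg/ml"),
}


def pipetting_plan(final_volume_ml):
    """Return a dict of component -> volume (ml) for the requested final volume."""
    plan = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock, _unit) in RECIPE.items()
    }
    # The remainder is made up with the phosphate buffer (simplification, see above).
    plan["sodium phosphate buffer pH 7.0"] = final_volume_ml - sum(plan.values())
    return plan


if __name__ == "__main__":
    for component, volume in pipetting_plan(10.0).items():
        print(f"{component}: {volume:.3f} ml")
```

Replacing the stock values in `RECIPE` with those actually on the bench is all that is needed to reuse the calculation.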
TAIL-PCR.
Genomic fragments flanking Ds insertions were isolated by TAIL-PCR as described (Liu, Y-G., Mitsukawa, N., Oosumi, T. & Whittier, R.F. Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant J. 8, 457-463 (1995); Grossniklaus, U., Vielle-Calzada, J-P., Hoeppner, M.A., Gagliano, W.B. Maternal control of embryogenesis by MEDEA, a Polycomb group gene in Arabidopsis. Science 280, 446-450 (1998)) and sequenced at the Iowa State University DNA Sequencing and Synthesis Facility after subcloning into pCR®II-TOPO (Invitrogen).
In-Situ Hybridization.
Subcloned TAIL-PCR fragments were linearised with restriction enzymes cutting in the polylinker (XhoI and BamHI, respectively). Probe synthesis using 1 μg as template and in situ hybridization were performed as described (Vielle-Calzada, J-P., Thomas, J., Spillane, Ch., Coluccio, A., Hoeppner, M.A & Grossniklaus, U. Maintenance of genomic imprinting at the Arabidopsis medea locus requires zygotic DDM1 activity. Genes & Dev. 13, 2971-2982 (1999)).
RT-PCR.
For RNA preparation, young siliques were harvested in liquid nitrogen at specific time points after pollination. RNA preparation, cDNA synthesis, and PCR amplification of 1/3 of the cDNA were performed as described (Grossniklaus, U., Vielle-Calzada, J-P., Hoeppner, M.A., Gagliano, W.B. Maternal control of embryogenesis by MEDEA, a Polycomb group gene in Arabidopsis. Science 280, 446-450 (1998); Vielle-Calzada, J-P., Thomas, J., Spillane, Ch., Coluccio, A., Hoeppner, M.A & Grossniklaus, U. Maintenance of genomic imprinting at the Arabidopsis medea locus requires zygotic DDM1 activity. Genes & Dev. 13, 2971-2982 (1999)). For PRL, the primers used were PRLS1 (5'-CAGTCACTGGTTCATTCCT-3') SEQ ID NO: 12 and PRLAS1 (5'-GTAACAACTCGTCAACAGC-3') SEQ ID NO: 13. For EMB30, the primers used were EMB30S2 (5'-CGCCTAAAGTTGCATTCTG-3') SEQ ID NO: 14 and EMB30AS3 (5'-AATCACTGTCTACTCCAGC-3') SEQ ID NO: 15. PCR products were digested with XhoI (PRL) and Eco57I (EMB30) overnight at 37°C.
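The four primer sequences listed above can be sanity-checked with a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is a rough estimate for short oligonucleotides and is not stated in the text as the basis for the cycling conditions used.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "PRLS1": "CAGTCACTGGTTCATTCCT",
    "PRLAS1": "GTAACAACTCGTCAACAGC",
    "EMB30S2": "CGCCTAAAGTTGCATTCTG",
    "EMB30AS3": "AATCACTGTCTACTCCAGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def summarize(primers):
    """Return one tuple per primer: (name, length, GC fraction, Wallace Tm)."""
    return [
        (name, len(seq), gc_fraction(seq), wallace_tm(seq))
        for name, seq in primers.items()
    ]


if __name__ == "__main__":
    for name, length, gc, tm in summarize(PRIMERS):
        print(f"{name}: {length} nt, GC {gc:.0%}, approx. Tm {tm} C")
```

All four primers are 19 nucleotides long, so their estimated melting temperatures differ only through base composition.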
Histological Analysis
Siliques were dissected with hypodermic needles (Becton-Dickinson, 1 cc insulin syringes) and fixed in FAA (4% formaldehyde, 5% acetic acid, 50% ethanol). Dissected siliques or individual seeds were cleared in Herr's solution (2:2:2:2:1 lactic acid:chloral hydrate:phenol:clove oil:xylene, by volume) and observed on a Leica DMRB microscope under brightfield or Nomarski optics.

Claims

What is claimed is:
1. A regulatory nucleotide sequence isolated and purified from Arabidopsis, said sequence characterized by the following: capable of directing expression in a plant cell to the embryo and/or endosperm after fertilization; and said regulatory sequence identifiable by a sequence tag selected from the group consisting of SEQ ID NOS: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11.
2. An expression construct comprising: a nucleotide sequence according to claim 1, operatively linked to a structural gene.
3. A vector capable of transforming or transfecting a host cell, said vector comprising an expression construct according to claim 2.
4. The vector of claim 3 wherein said vector is a plasmid based vector.
5. The vector of claim 3 wherein said vector is a viral based vector.
7. A prokaryotic or eukaryotic host cell transformed or transfected with a vector according to claim 3.
8. The host cell of claim 7 wherein said cell is a plant cell.
9. A purified and isolated Arabidopsis regulatory nucleotide sequence capable of directing expression in a plant cell, said sequence characterized by the following:
(a) directs expression of a transcription unit in the embryo and/or endosperm after fertilization;
(b) in Arabidopsis directs expression of a transcription unit which is silenced during this period when in the paternal allele;
(c) is further characterized by a location selected from the group consisting of: located on Arabidopsis chromosome IV at approximately 69 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 1; located on Arabidopsis chromosome V at approximately 125 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 2; located on Arabidopsis chromosome II at approximately 87 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 5; located on Arabidopsis chromosome VI at approximately 77 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 7; located on Arabidopsis chromosome II at approximately 1 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 8; located on Arabidopsis chromosome IV at approximately 69 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 9; located on Arabidopsis chromosome I at approximately 70 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 11; located on Arabidopsis chromosome II at approximately 83 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 4 in its 5' untranslated or promoter region; located on Arabidopsis chromosome II at approximately 79 cM and conferring tissue-specific expression to the transcription unit containing genomic sequence of SEQ ID NO: 6 in its 5' untranslated or promoter region.
10. An expression construct comprising: a nucleotide sequence according to claim 9, operatively linked to a structural gene.
11. A vector capable of transforming or transfecting a host cell, said vector comprising an expression construct according to claim 10.
12. The vector of claim 11 wherein said vector is a plasmid based vector.
13. The vector of claim 11 wherein said vector is a viral based vector.
14. A prokaryotic or eukaryotic host cell transformed or transfected with a vector according to claim 11.
15. The host cell of claim 14 wherein said cell is a plant cell.
16. A purified and isolated Arabidopsis regulatory nucleotide sequence capable of directing expression in a plant cell, said sequence characterized by the following:
(a) directs expression of a transcription unit in the embryo and/or endosperm after fertilization;
(b) in Arabidopsis directs expression of a transcription unit which is silenced during this period when in the paternal allele;
(c) is further characterized by conferring tissue-specific expression to the transcription unit containing genomic sequence SEQ ID NO: 3, or containing sequences that are closely linked to (sequences including or neighboring) SEQ ID NO: 10.
17. An expression construct comprising: a nucleotide sequence according to claim 16, operatively linked to a structural gene.
18. A vector capable of transforming or transfecting a host cell, said vector comprising an expression construct according to claim 17.
19. The vector of claim 18 wherein said vector is a plasmid based vector.
20. The vector of claim 18 wherein said vector is a viral based vector.
21. A prokaryotic or eukaryotic host cell transformed or transfected with a vector according to claim 18.
22. The host cell of claim 21 wherein said cell is a plant cell.
23. A method for identifying a regulatory sequence capable of directing expression to a plant embryo and/or endosperm post fertilization comprising: identifying a regulatory sequence based upon a sequence tag selected from the group consisting of SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and 11.
24. The method of claim 23 further comprising the step of: identifying a transcription unit neighboring said sequence tag region.
25. A novel nucleotide composition from Arabidopsis comprising a sequence selected from the group consisting of SEQ ID NO: 3 and 10.
26. A method for obtaining expression of a transgene during early embryo or endosperm development comprising: introducing said transgene to a maternally inherited genetic component of said embryo and endosperm.
27. The method of claim 26 wherein said genetic component is an ovule or the female gametophyte.
28. A method of silencing a transgene in early embryo and/or endosperm development comprising: introducing said transgene to a paternally inherited genetic component of said embryo, wherein said transgene becomes expressed after early embryo or endosperm development.
29. The method of claim 27 where said genetic component is pollen.
PCT/US2001/006297 2000-03-01 2001-02-27 Nucleotide sequences for embryo and/or endosperm specific expression in plants WO2001064891A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001241818A AU2001241818A1 (en) 2000-03-01 2001-02-27 Nucleotide sequences for embryo and/or endosperm specific expression in plants

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US51670500A 2000-03-01 2000-03-01
US09/516,705 2000-03-01

Publications (3)

Publication Number Publication Date
WO2001064891A2 true WO2001064891A2 (en) 2001-09-07
WO2001064891A3 WO2001064891A3 (en) 2002-05-10
WO2001064891A9 WO2001064891A9 (en) 2003-01-03

Family

ID=24056754

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/006297 WO2001064891A2 (en) 2000-03-01 2001-02-27 Nucleotide sequences for embryo and/or endosperm specific expression in plants

Country Status (2)

Country Link
AU (1) AU2001241818A1 (en)
WO (1) WO2001064891A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999020775A1 (en) * 1997-10-22 1999-04-29 Aventis Cropscience S.A. Novel seed specific promoters based on plant genes
EP1033405A2 (en) * 1999-02-25 2000-09-06 Ceres Incorporated Sequence-determined DNA fragments and corresponding polypeptides encoded thereby

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE EM_EST [Online] EMBL; 9 September 1999 (1999-09-09) CHEN, J. ET AL.: "Arabidopsis thaliana cDNA clone 701500412" retrieved from EBI, accession no. AI994916 Database accession no. AI994916 XP002178565 *
DATABASE EM_PL [Online] EMBL; 21 August 1998 (1998-08-21) WATSON, M.D. ET AL.: "Arabidopsis thaliana chromosome IV, BAC clone F16G20" retrieved from EBI, accession no. ATF16G20 Database accession no. AL031326 XP002178566 cited in the application *
SPRINGER P S ET AL: "GENE TRAP TAGGING OF PROLIFERA, AN ESSENTIAL MCM2-3-5-LIKE GENE IN ARABIDOPSIS" SCIENCE, US, vol. 268, 12 May 1995 (1995-05-12), pages 877-880, XP002913013 ISSN: 0036-8075 *
SUNDARESAN V ET AL: "PATTERNS OF GENE ACTION IN PLANT DEVELOPMENT REVEALED BY ENHANCER TRAP AND GENE TRAP TRANSPOSABLE ELEMENTS" GENES AND DEVELOPMENT, COLD SPRING HARBOR, NY, US, vol. 9, no. 14, 15 July 1995 (1995-07-15), pages 1797-1810, XP000674520 ISSN: 0890-9369 *
TORII KEIKO U ET AL: "The RING finger motif of photomorphogenic repressor COP1 specifically interacts with the RING-H2 motif of a novel Arabidopsis protein." JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 274, no. 39, 1999, pages 27674-27681, XP002178562 ISSN: 0021-9258 cited in the application *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6887988B2 (en) 1999-08-31 2005-05-03 Pioneer Hi-Bred International, Inc. Plant reproduction polynucleotides and methods of use
US7312375B2 (en) 1999-08-31 2007-12-25 Pioneer Hi-Bred International, Inc. Plant reproduction polynucleotides and methods of use
WO2005054482A2 (en) * 2003-12-04 2005-06-16 Plant Bioscience Limited Nucleic acids having utility in seeds
WO2005054482A3 (en) * 2003-12-04 2005-10-06 Plant Bioscience Nucleic acids having utility in seeds
CN103602684A (en) * 2013-11-22 2014-02-26 昆明理工大学 Enhanced subsample gene capable of improving expression of foreign protein and application thereof

Also Published As

Publication number Publication date
WO2001064891A9 (en) 2003-01-03
WO2001064891A3 (en) 2002-05-10
AU2001241818A1 (en) 2001-09-12

Similar Documents

Publication Publication Date Title
USRE41318E1 (en) Plant promoter sequences and methods of use for same
EP3018217B1 (en) Maize cytoplasmic male sterility (cms) c-type restorer rf4 gene, molecular markers and their use
AU2016257016B2 (en) Polynucleotide responsible of haploid induction in maize plants and related processes
US8163975B2 (en) Nucleic acid molecules and their use in plant sterility
US20120329674A1 (en) Methods for large scale functional evaluation of nucleotide sequences in plants
WO2019038417A1 (en) Methods for increasing grain yield
CN111818794A (en) Method for increasing nutrient utilization efficiency
US20230183729A1 (en) Methods of increasing seed yield
US6787687B1 (en) Rin gene compositions and methods for use thereof
US20220396804A1 (en) Methods of improving seed size and quality
US6762347B1 (en) NOR gene compositions and methods for use thereof
WO2001014561A1 (en) Nor gene compositions and methods for use thereof
WO2001064891A2 (en) Nucleotide sequences for embryo and/or endosperm specific expression in plants
US20030014776A1 (en) Maternal effect gametophyte regulatory polynucleotide
US6875906B1 (en) Control of sporocyte or meiocyte formation in plants
AU2005253642B8 (en) Nucleic acid molecules and their use in plant male sterility
CA2828931C (en) Regulatory regions preferentially expressing in non-pollen plant tissue
MXPA04011042A (en) Nucleotide sequences and methods for the specific expression of genes in the female gametophyte, female reproductive cells, pollen grain and/or male reproductive cells of plants.
WO2001004315A2 (en) Ringene compositions and methods for use thereof
TW201125978A (en) Inducible promoter from lily and use thereof
WO2005054482A2 (en) Nucleic acids having utility in seeds

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

COP Corrected version of pamphlet

Free format text: PAGES 1/4-4/4, DRAWINGS, REPLACED BY NEW PAGES 1/4-4/4; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP