EP1644510A2 - Plant centromere compositions - Google Patents

Plant centromere compositions

Info

Publication number
EP1644510A2
EP1644510A2 EP03817686A EP03817686A EP1644510A2 EP 1644510 A2 EP1644510 A2 EP 1644510A2 EP 03817686 A EP03817686 A EP 03817686A EP 03817686 A EP03817686 A EP 03817686A EP 1644510 A2 EP1644510 A2 EP 1644510A2
Authority
EP
European Patent Office
Prior art keywords
gene
plant
seq
nucleic acid
centromere
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP03817686A
Other languages
German (de)
English (en)
Other versions
EP1644510A4 (fr)
Inventor
Jennifer Mach
Helge Zieler
Rongguan Jin
Kevin Keith
Gregory Copenhaver
Daphne Preuss
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chicago
Chromatin Inc
Original Assignee
University of Chicago
Chromatin Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chicago, Chromatin Inc
Priority to EP10011458A (EP2295586A3)
Priority to EP10011419A (EP2357240A1)
Publication of EP1644510A2
Publication of EP1644510A4
Legal status: Withdrawn

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • transformation into cells.
  • One approach is to introduce the new genetic information as part of another DNA molecule, referred to as an "episomal vector," or
  • Episomal vectors contain all the necessary DNA sequence elements required for DNA replication and maintenance of the vector within the cell. Many episomal vectors are available for use in bacterial cells (for example, see Maniatis et al, 1982). However, only a few episomal vectors that function in higher eukaryotic cells have been developed. Higher eukaryotic episomal vectors were primarily based on naturally occurring viruses.
  • gemini viruses are double-stranded DNA viruses that replicate through a double-stranded intermediate upon which an episomal vector could be based, although the gemini virus is limited to an approximately 800 bp insert.
  • an episomal plant vector based on the Cauliflower Mosaic Virus has been developed, its capacity to carry new genetic information also is limited (Brisson et al., 1984).
  • the other general method of genetic transformation involves integration of introduced DNA sequences into the recipient cell's chromosomes, permitting the new information to be replicated and partitioned to the cell's progeny as a part of the natural chromosomes.
  • the introduced DNA usually is broken and joined together in various combinations before it is integrated at random sites into the cell's chromosome (see, for example, Wigler et al., 1977).
  • Common problems with this procedure are the rearrangement of introduced DNA sequences and unpredictable levels of expression due to the location of the transgene in the genome or so called "position effect variation" (Shingo et al, 1986).
  • integrated DNA cannot normally be precisely removed.
  • a more refined form of integrative transformation can be achieved by exploiting naturally occurring viruses that integrate into the host's chromosomes as part of their life cycle, such as retroviruses (see
  • the most common genetic transformation method used in higher plants is based on the transfer of bacterial DNA into plant chromosomes that occurs during infection by the phytopathogenic soil bacterium Agrobacterium (see
  • a third drawback of the Agrobacterium T-DNA system is the reliance on a "gene addition" mechanism: the new genetic information is added to the genome (i.e., all the genetic information a cell possesses) but does not replace information already present in the genome.
  • Artificial chromosomes are man-made linear or circular DNA molecules constructed from cis-acting DNA sequence elements that provide replication and partitioning of the constructed chromosomes (see Murray et al, 1983).
  • Desired elements include: (1) Autonomous Replication Sequences (ARS) (these have properties of replication origins, which are the sites for initiation of DNA replication), (2) Centromeres (site of kinetochore assembly and responsible for proper distribution of replicated chromosomes at mitosis or meiosis), and (3) if the chromosome is linear, telomeres (specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule).
  • ARSs have been isolated from unicellular fungi, including Saccharomyces cerevisiae (brewer's yeast) and Schizosaccharomyces pombe (see Stinchcomb et al., 1979 and Hsiao et al, 1979).
  • An ARS behaves like a replication origin allowing DNA molecules that contain the ARS to be replicated as an episome after introduction into the cell nuclei of these fungi. DNA molecules containing these sequences replicate, but in the absence of a centromere they are partitioned randomly into daughter cells.
  • chromosomes have been constructed in yeast using the three cloned essential chromosomal elements.
  • Murray et al, 1983 disclose a cloning system based on the in vitro construction of linear DNA molecules that can be transformed into yeast, where they are maintained as artificial chromosomes.
  • yeast artificial chromosomes contain cloned genes, origins of replication, centromeres and telomeres and are segregated in daughter cells with high fidelity when the YAC is at least 100 kb in length. Smaller CEN-containing vectors may be stably segregated, however, when in circular form.
  • telomeres are important in maintaining the stability of chromosomal termini, they do not encode the information needed to ensure stable inheritance of an artificial chromosome. It is well documented that centromere function is crucial for stable chromosomal inheritance in almost all eukaryotic organisms (reviewed in Nicklas 1988). For example, broken chromosomes that lack a centromere (acentric chromosomes) are rapidly lost from cell lines, while fragments that have a centromere are faithfully segregated.
  • centromere accomplishes this by attaching, via centromere binding proteins, to the spindle fibers during mitosis and meiosis, thus ensuring proper gene segregation during cell divisions.
  • In contrast to the detailed studies done in S. cerevisiae and S. pombe, less is known about the molecular structure of functional centromeric DNA of higher eukaryotes. Ultrastructural studies indicate that higher eukaryotic kinetochores, which are specialized complexes of proteins that form on the centromere during late prophase, are large structures (mammalian kinetochore plates are approximately 0.3 μm in diameter) which possess multiple microtubule attachment sites (reviewed in Rieder, 1982). It is therefore possible that the centromeric DNA regions of these organisms will be correspondingly large, although the minimal amount of DNA necessary for centromere function may be much smaller.
  • The above studies have been useful in elucidating the structure and function of centromeres.
  • the production of artificial chromosomes with centromeres which function in higher eukaryotes would overcome many of the problems associated with the prior art and represent a significant breakthrough in biotechnology research.
  • the present invention allows the isolation and identification of plant centromere DNA sequences from the total genomic DNA of an organism or fractions thereof. With centromere DNA sequences, it is possible to construct chromosomes having functional centromeres and carrying large numbers of genes. Genes for producing a vast set of products have been identified, but technologies used within the industry severely limit the delivery of these genes to plant cells. One or at most a few genes are typically inserted into random locations in the host chromosomes, which can irreversibly disrupt host gene functions while causing variable and uncontrolled expression of the introduced genes.
  • the present invention makes it possible to overcome the technical limitations associated with gene delivery in crop species, thereby allowing for the ability to shorten the time required for crop development.
  • the invention provides a method to obtain a centromere DNA sequence from a selected organism, the method comprising the steps of preparing a sample of genomic DNA from a selected organism, obtaining a plurality of nucleic acid segments from the genomic DNA and screening the nucleic acid segments to identify one or more centromere nucleic acid sequences.
  • the method of obtaining the plurality of nucleic acid segments comprises contacting said genomic DNA with a restriction endonuclease and selecting nucleic acid segments containing repetitive DNA to obtain said plurality of nucleic acid segments.
  • the method of obtaining the plurality of nucleic acid segments comprises contacting said genomic DNA with a methylation sensitive restriction endonuclease and selecting nucleic acid segments exhibiting resistance to cleavage with said methylation sensitive restriction endonuclease to obtain said plurality of nucleic acid segments.
  • the method of obtaining the plurality of nucleic acid segments comprises contacting said genomic DNA with a restriction endonuclease or physically shearing said genomic DNA and selecting nucleic acid segments that anneal rapidly after denaturation to obtain said plurality of nucleic acid segments.
  • the invention provides a method for identifying a centromere nucleic acid sequence from a dataset of the genomic sequences of an organism.
  • the method comprises the steps of (1) providing a first dataset consisting of the genomic sequences, or a representative fraction of genomic sequence, of the organism; (2) identifying and eliminating known non-centromeric repeat sequences from the first dataset by using the BLAST sequence comparison algorithm to create a second dataset; (3) comparing each sequence in the second dataset to itself by using the BLAST sequence comparison algorithm, obtaining a BLAST score for each pair of sequences compared, and collecting high score pairs to create a third dataset; (4) examining the BLAST score of each high score pair in the third dataset and eliminating the pairs having a score greater than 10^-20 to create a fourth dataset; (5) eliminating the high score pairs in the fourth dataset having less than 80 bp or more than 250 bp to create a fifth dataset; (6) examining the nucleotide position of each high score pair in the fifth dataset and eliminating pairs having 100% identity as well as identical nucleo
  • the known non-centromeric repeat sequence in the second step is a ribosomal DNA.
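Steps (4) and (5) of the method above amount to two successive filters over the high score pairs, and a minimal Python sketch may make the thresholds concrete. It assumes that the "score" compared against the threshold is a BLAST expectation value and that the threshold reads 10^-20. The `HighScorePair` record and the `filter_high_score_pairs` function are hypothetical names chosen for illustration; they are not part of the patent or of any BLAST library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HighScorePair:
    """Hypothetical record for one self-comparison hit (not a BLAST API type)."""
    query_id: str
    subject_id: str
    e_value: float   # assumed: the "score" in step (4) is an expectation value
    length_bp: int   # aligned length of the pair in base pairs


def filter_high_score_pairs(
    pairs: Iterable[HighScorePair],
    max_e_value: float = 1e-20,  # assumed reading of the step (4) threshold
    min_length_bp: int = 80,     # step (5): drop pairs shorter than 80 bp
    max_length_bp: int = 250,    # step (5): drop pairs longer than 250 bp
) -> List[HighScorePair]:
    """Apply the step (4) score filter, then the step (5) length filter."""
    # Step (4): eliminate pairs whose score is greater than the threshold.
    fourth_dataset = [p for p in pairs if p.e_value <= max_e_value]
    # Step (5): eliminate pairs outside the 80 to 250 bp window.
    return [
        p for p in fourth_dataset
        if min_length_bp <= p.length_bp <= max_length_bp
    ]


if __name__ == "__main__":
    # Toy input, invented purely for illustration.
    third_dataset = [
        HighScorePair("read1", "read2", 1e-30, 180),
        HighScorePair("read1", "read3", 1e-5, 180),
        HighScorePair("read1", "read4", 1e-40, 300),
    ]
    for pair in filter_high_score_pairs(third_dataset):
        print(pair.query_id, pair.subject_id, pair.length_bp)
```

Keeping the two filters as separate passes mirrors the "fourth dataset" and "fifth dataset" wording of the method; the remaining steps are not sketched here because the source text is cut off partway through step (6).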
  • the invention provides a Brassica oleracea centromere comprising Brassica oleracea centromere DNA.
  • the Brassica oleracea centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
  • any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • the repeated nucleotide sequence is isolated from Brassica oleracea given by SEQ ID NO:1, 2, 3, or 4.
  • the invention provides a Glycine max centromere comprising Glycine max centromere DNA.
  • the Glycine max centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2. Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • the repeated nucleotide sequence is isolated from Glycine max given by SEQ ID NO:5, 6, 7, or 8.
  • the invention provides a Lycopersicon esculentum centromere comprising Lycopersicon esculentum centromere DNA.
  • the Lycopersicon esculentum centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
  • any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • the repeated nucleotide sequence is isolated from Lycopersicon esculentum given by SEQ ID NO:9 or 10.
  • the invention provides a Zea mays centromere comprising Zea mays centromere DNA.
  • the centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
  • Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • the repeated nucleotide sequence is isolated from Zea mays given by SEQ ID NO:11, 12 or 13.
  • the invention provides a recombinant DNA construct comprising a plant centromere sequence of the present invention.
  • the recombinant DNA construct may additionally comprise any other desired sequences, for example, a telomere.
  • Examples of structural genes one may wish to use include a selectable or screenable marker gene, an antibiotic resistance gene, a ligand gene, an enzyme gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a gene encoding an enzyme, a gene encoding an antibody, a gene encoding an antigen for a vaccine, a transcription factor, a cytoskeletal protein, a DNA-binding protein, a protease, an endonuclease, a lipid, a seed storage gene, an interleukin gene, a clotting factor gene, a cytokine gene, a growth factor gene and a biosynthetic gene for producing pharmaceutically active proteins, small molecules with medicinal properties, chemicals with industrial utility, nutraceuticals, carbohydrates, RNAs, lipids, fuels, dyes, pigments, vitamins, scents, flavors, vaccines, antibodies, and hormones.
  • the construct is capable of expressing the structural gene, for example, in a prokaryote or eukaryote, including a lower eukaryote, or a higher eukaryote such as a plant.
  • the recombinant construct could contain other useful non-coding sequences, including promoters, terminators, boundary elements that regulate gene expression, sequences that alter maintenance, inheritance, or stability of the construct, and sequences that allow subsequent modification of the composition of the construct.
  • the invention provides a recombinant DNA construct comprising a plant centromere sequence of the present invention and which is capable of being maintained as a chromosome, wherein the chromosome is transmitted in dividing cells.
  • the plant centromere may be from any plant or may be from any other source of DNA or may be partially or entirely synthetic in origin.
  • the invention provides a recombinant DNA construct comprising a plant centromere sequence of the present invention and which is a plasmid.
  • the plasmid may contain any desired sequences, such as an origin of replication.
  • the plasmid may also comprise a selection marker.
  • the invention provides a minichromosome comprising a plant centromere sequence of the present invention, which may also contain a telomere sequence. Any additional desired sequences may be added to the minichromosome, such as an autonomous replicating sequence and a structural gene such as those described above.
  • the minichromosome may comprise any of the centromere compositions disclosed herein.
  • the minichromosome also may contain "negative" selectable markers which confer susceptibility to an antibiotic, herbicide or other agent, thereby allowing for selection against plants, plant cells or cells of any other organism of interest containing a minichromosome.
  • the minichromosome also may include genes or other sequences which control the copy number of the minichromosome within a cell.
  • One or more structural genes also may be included in the minichromosome. Specifically contemplated as being useful will be as many structural genes as may be inserted into the minichromosome.
  • the invention provides a cell transformed with a recombinant DNA construct comprising a plant centromere sequence of the present invention.
  • the cell may be of any type, including a prokaryotic cell or eukaryotic cell. Where the cell is a eukaryotic cell, the cell may be, for example, a yeast cell or a higher eukaryotic cell, such as a plant cell.
  • the plant cell may be from a dicotyledonous plant, such as tobacco, tomato, potato, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis, or may be a monocotyledonous plant cell, such as wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane.
  • the plant centromere is a centromere chosen from the group consisting of Brassica oleracea, Glycine max, Lycopersicon esculentum, and Zea mays and the cell may be a cell chosen from one of the above species or any other species.
  • the recombinant DNA construct may comprise additional sequences, such as a telomere, an autonomous replicating sequence (ARS), a structural gene or genes, or a selectable or screenable marker gene or genes, including as many of such sequences as may physically be placed on said recombinant DNA construct.
  • the cell is further defined as capable of expressing said structural gene.
  • a plant is provided comprising the aforementioned cells.
  • the invention provides a method for preparing a transgenic plant cell. The method comprises the steps of contacting a starting plant cell with a recombinant DNA construct comprising a plant centromere sequence of the present invention, whereby the starting plant cell is transformed with the recombinant DNA construct.
  • the invention provides a transgenic crop comprising a minichromosome, wherein the minichromosome comprises a plant centromere sequence of the present invention.
  • the minichromosome may further comprise a telomere sequence, an autonomous replicating sequence or a structural gene, such as a selectable or screenable marker gene, an antibiotic resistance gene, a ligand gene, an enzyme gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a gene encoding an enzyme, a gene encoding an antibody, a gene encoding an antigen for a vaccine, a transcription factor, a cytoskeletal protein, a DNA-binding protein, a protease, an endonuclease, a lipid, a seed storage gene, an interleukin gene, a clotting factor gene, a cytokine gene, a growth factor gene and a biosynthetic
  • the transgenic crop may be any type of crop, such as a dicotyledonous plant, for example, tobacco, tomato, potato, pea, carrot, cauliflower, broccoli, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis, or may be a monocotyledonous plant, such as wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane.
  • the invention provides a method for preparing a transgenic crop tissue.
  • the method comprises the steps of contacting a starting crop tissue with a recombinant DNA construct comprising a plant centromere sequence of the present invention, whereby the starting crop tissue is transformed with the recombinant DNA construct.
  • the invention provides a method for preparing a transgenic crop seed. The method comprises the steps of contacting a starting crop, crop tissue, or crop cell, with a recombinant DNA construct comprising a plant centromere sequence of the present invention, whereby the starting crop, crop tissue, or crop cell is transformed with the recombinant DNA construct.
  • These transformed crops, crop tissues, or crop cells are allowed to develop into mature crops, using standard agricultural techniques. Transgenic seed is then collected from these crops.
  • the invention provides a method for preparing an extract of a transgenic crop, crop tissue, crop seed, or crop cell.
  • the method comprises the steps of contacting a starting crop, crop tissue, or crop cell with a recombinant DNA construct comprising a plant centromere sequence of the present invention, whereby the starting crop cell is transformed with the recombinant DNA construct.
  • the resulting transgenic crop, crop tissue, crop seed, or crop cell is then extracted and processed to yield the desirable product.
  • One preferred desirable product is a food product.
  • Another preferred desirable product is a pharmaceutical product.
  • Yet another preferred desirable product is a chemical product.
  • FIG. 1A-F Consensus sequences of repeats from Brassica oleracea.
  • FIG. 1A is the consensus sequence of ChrBo1. This consensus was assembled from 33 sequences collected by the inventors. The length of this repeat is 180 ± 0.86 base pairs and A and T compose 60% of the consensus.
  • FIG. 1B is the consensus sequence of ChrBo2. This consensus was assembled from 7 sequences collected by the inventors. The length of this repeat is 180 ± 0.45 base pairs and A and T compose 63% of the consensus.
  • FIG. 1D is a revised consensus sequence of ChrBo1. This consensus was assembled from 33 DNA sequences collected by the inventors and 18 sequences from Genbank, identified by the accession numbers:
  • FIG. 1E is a revised consensus sequence of ChrBo2. This consensus was assembled from 7 DNA sequences collected by the inventors and 5 sequences from Genbank, identified by the accession numbers AJ228347, M30962, X12736, X61583, and X68785.
  • FIG. 1F is a comparison of the revised consensus sequences of ChrBo1 and ChrBo2, aligned as for FIG. 1C.
  • FIG. 2A-F Consensus sequences of repeats from Glycine max.
  • FIG. 2A is a consensus sequence of ChrGm1. This consensus was assembled from 32 sequences collected by the inventors. The length of this repeat is 92 ± 0.79 base pairs and A and T compose 63% of the consensus.
  • FIG. 2B is a consensus sequence of ChrGm2. This consensus was assembled from 21 sequences collected by the inventors. The length of this repeat is 91 ± 0.48 base pairs and A and T compose 62% of the consensus.
  • FIG. 2C is a comparison of the consensus sequences of ChrGm1 and ChrGm2. The two repeats (ChrGm1 and ChrGm2) were aligned to each other using the ClustalX program (ClustalX is a free multiple sequence alignment program for Windows).
  • FIG. 2D is a revised consensus sequence of ChrGm1. This consensus was assembled from 32 DNA sequences collected by the inventors and 1 sequence from Genbank, identified by the accession number Z26334.
  • FIG. 2E is a revised consensus sequence of ChrGm2. This consensus was assembled from 21 DNA sequences collected by the inventors and 13 sequences from Genbank, identified by the accession numbers AF297983, AF297984, and AF297985.
  • FIG. 2F is a comparison of the revised consensus sequences of ChrGm1 and ChrGm2, aligned as for FIG. 2C.
  • FIG. 3A-B Consensus sequences of repeats from Lycopersicon esculentum.
  • FIG. 3A is a consensus sequence of ChrLe1. This consensus was assembled from 42 sequences collected by the inventors. The length of this repeat is 181 ± 0.61 base pairs and A and T compose 50% of the consensus.
  • FIG. 3B is a revised consensus sequence of ChrLe1. This consensus was assembled from 32 sequences collected by the inventors and 2 Genbank sequences identified by the accession numbers X87233 and AY007367.
  • FIG. 4A-C Consensus sequences of repeats from Zea mays.
  • FIG. 4A is a consensus sequence of ChrZm1. This consensus was assembled from 38 sequences collected by the inventors. The length of this repeat is 180 ± 1.15 base pairs and A and T compose 56% of the consensus.
  • FIG. 4B is a revised consensus sequence of ChrZm1. This consensus was assembled from 38 sequences collected by the inventors and 26 sequences from Genbank, identified by the accession numbers:
  • FIG. 4C is a consensus sequence of ChrZm2. This consensus was assembled from 6 sequences collected from Genbank identified by the accession numbers:
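The figure descriptions above summarize each repeat family by a consensus sequence, a repeat length with a spread, and the share of A and T in the consensus. A minimal Python sketch of how such summaries could be computed is given below. It assumes pre-aligned, equal-length input for the consensus, a simple per-column majority rule, and the sample standard deviation as the spread; none of these choices is stated in the source, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Sequence, Tuple


def majority_consensus(aligned_repeats: Sequence[str]) -> str:
    """Return the most common character in each column of an alignment."""
    if len({len(s) for s in aligned_repeats}) != 1:
        raise ValueError("aligned repeats must be non-empty and of equal length")
    # zip(*...) walks the alignment column by column.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_repeats)
    )


def at_fraction(sequence: str) -> float:
    """Fraction of A and T among the A/C/G/T characters of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return sum(b in "AT" for b in bases) / len(bases)


def length_summary(repeats: Sequence[str]) -> Tuple[float, float]:
    """Mean and sample standard deviation of ungapped repeat lengths."""
    lengths = [len(r.replace("-", "")) for r in repeats]
    return mean(lengths), stdev(lengths)  # stdev needs at least two repeats


if __name__ == "__main__":
    # Toy alignment, invented purely for illustration.
    toy = ["AATGC", "AATGC", "ACTGC"]
    consensus = majority_consensus(toy)
    avg, spread = length_summary(toy)
    print(consensus, f"{at_fraction(consensus):.0%}", avg, spread)
```

A per-column majority vote is only one way to call a consensus; alignment programs such as ClustalX, which the figure descriptions mention, produce the alignment that a step like this would consume.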
  • FIG. 5 Minichromosome containing centromere sequences as well as minichromosome vector sequences
  • FIG. 6 Minichromosome construct formed by minichromosome vector tailing method.
  • FIG. 7A-7N Exemplary Minichromosome vectors: The vectors shown in FIG. 7A, FIG. 7B, FIG. 7E, FIG. 7F, FIG. 7I and FIG. 7J have an E. coli origin of replication which can be high copy number, low copy number or single copy.
  • the vectors include a multiple cloning site which can contain recognition sequences for conventional restriction endonucleases with 4-8 bp specificity as well as recognition sequences for very rare cutting enzymes such as, for example, I-Ppo I, I-Ceu I, PI-Tli, PI-Psp I, Not I, and PI-Sce I.
  • FIG. 7A-7N: the centromere is flanked by Lox sites, which can act as targets for the site-specific recombinase Cre.
  • FIG. 7A Shows an E. coli-plant circular shuttle vector with a plant ARS.
  • FIG. 7B Shows a plant circular vector without a plant ARS. The vector relies on a plant origin of replication function found in other DNA sequences such as selectable or screenable markers.
  • FIG. 7C Shows a yeast-plant circular shuttle vector with a plant ARS. The yeast ARS is included twice, once on either side of multiple cloning site to ensure that large inserts are stable.
  • FIG. 7D Shows a yeast-plant circular shuttle vector without a plant ARS.
  • FIG. 7E Shows an E. coli-Agrobacterium-plant circular shuttle vector with a plant ARS. Vir functions for T-DNA transfer would be provided in trans by using the appropriate Agrobacterium strain.
  • FIG. 7F Shows an E. coli-Agrobacterium-plant circular shuttle vector without a plant ARS. The vector relies on a plant origin of replication function found in other plant DNA sequences such as selectable markers. Vir functions for T-DNA transfer would be provided in trans by using the appropriate Agrobacterium strain.
  • FIG. 7G Shows an E. coli-Agrobacterium-plant circular shuttle vector with a plant ARS. Vir functions for T-DNA transfer would be provided in trans by using the appropriate Agrobacterium strain.
  • FIG. 7H Shows a linear plant vector without a plant ARS.
  • The linear vector could be assembled in vitro and then transferred into the plant by, for example, mechanical means such as microprojectile bombardment, electroporation, or PEG-mediated transformation.
  • FIGs. 7I-7N The figures are identical to FIGs. 7A-7F, respectively, with the exception that they do not contain plant telomeres. These vectors will remain circular once delivered into the plant cell and therefore do not require telomeres to stabilize their ends.
  • FIG. 8 Sequence features at Arabidopsis CEN2 (A) and CEN4 (B). Central bars depict annotated genomic sequence of indicated BAC clones; black, genetically-defined centromeres; white, regions flanking the centromeres. Sequences corresponding to genes and repetitive features, filled boxes (above and below the bars, respectively), are defined as in FIG. 11 A-T; predicted nonmobile genes, red; genes carried by mobile elements, black; nonmobile pseudogenes, pink; pseudogenes carried by mobile elements, gray; retroelements, yellow; transposons, green; previously defined centromeric repeats, dark blue; 180 bp repeats, pale blue.
  • Chromosome-specific centromere features include a large mitochondrial DNA insertion (orange; CEN2), and a novel array of tandem repeats (purple; CEN4). Gaps in the physical maps (//), unannotated regions (hatched boxes), and expressed genes (filled circles) are shown.
  • FIG. 9 Method for converting a BAC clone (or any other bacterial clone) into a minichromosome.
  • A portion of the conversion vector will integrate into the BAC clone (or other bacterial clone of interest) either through non-homologous recombination (transposable element mediated) or by the action of a site-specific recombinase system, such as Cre-Lox or FLP-FRT.
  • FIG. 10A-G Method for converting a BAC clone (or any other bacterial clone) into a minichromosome. The necessary selectable markers and origins of replication for propagation of genetic material in E. coli, Agrobacterium and Arabidopsis, as well as the necessary genetic loci for Agrobacterium-mediated transformation into Arabidopsis, are cloned into a conversion vector.
  • Using Cre/loxP recombination, the conversion vectors are recombined into BACs containing centromere fragments to form minichromosomes.
  • FIGs. 11A-11T Properties of centromeric regions on chromosomes II and IV of Arabidopsis.
  • (Top) Drawing of genetically-defined centromeres (gray shading, CEN2, left; CEN4, right), adjacent pericentromeric DNA, and a distal segment of each chromosome, scaled in Mb as determined by DNA sequencing (gaps in the gray shading correspond to gaps in the physical maps). Positions in cM on the RI map (http://nasc.nott.ac.uk/new_ri_map.html) and physical distances in Mb, beginning at the northern telomere and at the centromeric gap, are shown. (Bottom) The density of each feature (FIGs. 11A-11T) is plotted relative to the position on the chromosome in Mb.
  • FIGs. 11A, 11K cM positions for markers on the RI map (solid squares) and a curve representing the genomic average of 1 cM/221 kb (dashed line).
  • A single crossover within CEN4 in the RI mapping population (http://nasc.nott.ac.uk/new_ri_map.html; Somerville and Somerville, 1999) may reflect a difference between male meiotic recombination monitored here and recombination in female meiosis.
  • centromeric repeats including 163A, 164A, 164B, 278A, 11B7RE, mil67, pAT27, 160-, 180- and 500-bp repeats, and telomeric sequences.
  • FIGs. 11F, 11P The % adenosine + thymidine was calculated for a 50 kb window with a sliding interval of 25 kb.
  • FIGs. 11G-11J, 11Q-11T The number of predicted genes or pseudogenes was plotted over a window of 100 kb with a sliding interval of 10 kb.
  • FIGs. 11G, 11I, 11Q, 11S Predicted genes (FIGs.
  • Dashed lines indicate regions in which sequencing or annotation is in progress. Annotation was obtained from GenBank records (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html), from the AGAD database (http://www.tigr.org/tdb/at/agad/.), and by BLAST comparisons to the database of repetitive Arabidopsis sequences.
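  • The sliding-window calculation described for the % adenosine + thymidine panels can be sketched as follows. This is a minimal illustration in Python and not the software used to prepare the figures; the function name `at_content_windows` and the toy sequence are assumptions, and only the 50 kb window and 25 kb interval are taken from the legend.

```python
def at_content_windows(seq, window=50_000, step=25_000):
    """Return (window_start, percent A+T) for each full window along seq."""
    seq = seq.upper()
    results = []
    # Slide the window along the sequence; a partial window at the end is skipped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        results.append((start, 100.0 * at_count / window))
    return results


# Toy example with a tiny window so the result can be checked by hand.
toy = "AAAATTTTGGGGCCCC"
profile = at_content_windows(toy, window=8, step=4)
```

  • With the default arguments, each window overlaps its neighbor by half, which smooths the plotted profile at the cost of correlated adjacent points.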
  • FIG. 12 Methods for converting a BAC clone containing centromere DNA into a minichromosome for introduction into plant cells.
  • The specific elements described are provided for exemplary purposes and are not limiting.
  • FIG. 13A-B Conservation of Arabidopsis centromere DNA. BAC clones (bars) used to sequence CEN2 (FIG. 13A) and CEN4 (FIG. 13B) are indicated; arrows denote the boundaries of the genetically-defined centromeres. PCR primer pairs yielding products from only Columbia (filled circles) or from both Landsberg and Columbia (open circles); BACs encoding DNA with homology to the mitochondrial genome (gray bars); 180 bp repeats (gray boxes); unsequenced DNA (dashed lines); and gaps in the physical map (double slashes) are shown.
  • FIG. 14A-B Primers used to analyze conservation of centromere sequences in the A. thaliana Columbia and Landsberg ecotypes.
  • FIG. 14A Primers used for amplification of chromosome 2 sequences.
  • FIG. 14B Primers used for amplification of chromosome 4 sequences.
  • FIG. 15 Sequences common to Arabidopsis CEN2 and CEN4. Genetically-defined centromeres (bold lines), sequenced (thin lines), and unannotated (dashed lines) BAC clones are displayed as in FIG. 14A, B. Repeats AtCCS1 (A. thaliana centromere conserved sequence) and AtCCS2 (closed and open circles, respectively), AtCCS3 (triangles), and AtCCS4-7 (4-7, respectively) are indicated (GenBank Accession numbers AF204874 to AF204880), and were identified using BLAST 2.0 (http://blast.wustl.edu).
  • The inventors have overcome the deficiencies in the prior art by providing the nucleic acid sequences of plant centromeres.
  • The significance of this achievement relative to the prior art is exemplified by the general lack of detailed information in the art regarding the centromeres of multicellular organisms.
  • Centromere sequences have been characterized most clearly in studies of lower eukaryotes such as S. cerevisiae and S. pombe, where the ability to analyze centromere functions has provided a clear picture of the desired DNA sequences.
  • The S. cerevisiae centromere consists of three essential regions, CDEI, CDEII, and CDEIII, totaling only 125 bp, or approximately 0.006 to 0.06% of each yeast chromosome (Carbon et al., 1990; Bloom 1993).
  • S. pombe centromeres are between 40 and 100 kb in length and consist of repetitive elements that comprise 1 to 3% of each chromosome (Baum et al., 1994). Subsequent studies, using tetrad analysis to follow the segregation of artificial chromosomes, demonstrated that less than 1/5 of the naturally occurring S. pombe centromere is sufficient for centromere function (Baum et al., 1994).
  • The centromeres of mammals and other higher eukaryotes are less understood.
  • Although DNA fragments that hybridize to centromeric regions in higher eukaryotes have been identified, in many cases little is known regarding the functionality of these sequences (see Tyler-Smith et al., 1993).
  • Centromere repeats often correlate with centromere location, with probes to the repeats mapping both cytologically and genetically to centromere regions. Many of these sequences are tandemly-repeated satellite elements and dispersed repeated sequences in arrays ranging from 300 kb to 5000 kb in length (Willard 1990).
  • One method for mapping centromeres that can be used in intact chromosomes is tetrad analysis (Mortimer et al., 1981), which provides a functional definition of a centromere in its native chromosomal context.
  • Centromeres that have been mapped in this manner include those from the yeasts Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Kluyveromyces lactis (Carbon et al., 1990; Hegemann et al., 1993). In many of these systems, accurate mapping of the centromeres made it possible to clone centromeric DNA, using a chromosome walking strategy (Clarke et al., 1980).
  • Although plant centromeres can be visualized easily in condensed chromosomes, they have not been characterized as extensively as centromeres from yeast or mammals. Genetic characterization has relied on segregation analysis of chromosome fragments, and in particular on analysis of trisomic strains that carry a genetically marked, telocentric fragment (for example, see Koornneef 1983). In addition, repetitive elements have been identified that are either genetically (Richards et al., 1991) or physically (Alfenito et al., 1993; Maluszynska et al., 1991) linked to a centromere. In no case, however, has the functional significance of these sequences been tested.
  • Cytology in Arabidopsis thaliana has served to correlate centromere structure with repeat sequences.
  • A fluorescent dye, DAPI, allows visualization of centromeric chromatin domains in metaphase chromosomes.
  • A. pumila is thought to be an amphidiploid, derived from a cross between A. thaliana and another close relative (Maluszynska et al, 1991; Price et al, 1995).
  • Another repetitive sequence, pAtT12, has been genetically mapped to within 5 cM of the centromere on chromosome 1 and to the central region of chromosome 5 (Richards et al., 1991), although its presence on other chromosomes has not been established. As with pAL1, a role for pAtT12 in centromere function remains to be demonstrated.
  • centromere-associated proteins CENPs, Rattner 1991
  • Yen 1991 kinesin superfamily of microtubule-based motors
  • The centromeres of Arabidopsis thaliana have been mapped using trisomic strains, where the segregation of chromosome fragments (Koornneef 1983) or whole chromosomes (Sears et al., 1970) was used to localize four of the centromeres to within 5, 12, 17 and 38 cM, respectively. These positions have not been refined by more recent studies because the method is limited by the difficulty of obtaining viable trisomic strains (Koornneef 1983).
  • Centromere DNA can be purified from total genomic DNA using several methods which include: 1) digesting genomic DNA with restriction enzymes and separating the fragments on agarose gels, to reveal major classes of repetitive DNA; 2) digesting genomic DNA with restriction enzymes sensitive to DNA methylation and separating the fragments on agarose gels to reveal the heavily methylated fraction of the genome; and 3) collecting the rapidly annealing fraction of denatured genomic DNA.
  • Because all three methods isolate centromere DNA, they are expected to independently isolate the same sequences, thus validating the centromere origin of those sequences. It is anticipated that each of these methods can be applied to genomic DNA from any organism, including some lower organisms such as yeasts, as well as higher organisms such as plants and animals. Each of these methods is described in detail below.
  • Centromere regions often contain many copies of the same DNA sequence (repetitive DNA); such repeats can range in size from a few nucleotides to hundreds or thousands of bases.
  • Repetitive DNA can be identified following digestion of genomic DNA with restriction endonucleases. Digestion of non-repetitive genomic DNA with a particular restriction enzyme produces a distribution of fragment sizes; in contrast, digestion of repeats with a restriction enzyme that cuts within each repeat produces a fragment of a typical size.
  • Genomic DNA that has been cut with a restriction enzyme can be size fractionated by agarose gel electrophoresis to reveal repetitive DNA elements; after staining the gel to reveal the DNA, the repetitive fragment can be excised and purified using conventional techniques or commercial kits.
  • Such repeats can be introduced into cloning vectors and characterized as described below. By using this method with a variety of restriction enzymes, different repetitive elements can be purified from genomic DNA.
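  • The reason a tandem repeat yields a fragment of a typical size can be illustrated with a small in silico digest. The following Python sketch is illustrative only: the 180 bp unit is a made-up sequence, the recognition sequence AAGCTT and the cut offset are used purely as an example, and the function name `digest` is an assumption.

```python
from collections import Counter


def digest(seq, site, cut_offset=0):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the cut position within the site (enzyme-specific, assumed here).
    Returns the list of fragment lengths in bp.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls exactly at an end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical 180 bp repeat unit containing exactly one AAGCTT site.
unit = "AAGCTT" + "CG" * 87          # 6 + 174 = 180 bp
tandem_array = unit * 10             # ten head-to-tail copies

fragments = digest(tandem_array, "AAGCTT", cut_offset=1)
size_counts = Counter(fragments)     # the unit-length fragment dominates
```

  • Nine of the eleven fragments have the unit length of 180 bp, which is why such a repeat appears as a discrete band over the background smear of non-repetitive DNA.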
  • Centromere DNA is often extensively modified by methylation; the presence of this methylation can be used to purify centromere fragments.
  • Digestion of genomic DNA with a methylation-sensitive restriction endonuclease yields a range of fragment sizes; endonuclease sites that are methylated are protected from digestion.
  • Heavily methylated DNA molecules, such as centromere DNA, yield large fragments after digestion and can therefore be separated from the lightly or non-methylated fraction by virtue of their size.
  • For example, agarose gel electrophoresis, acrylamide gel electrophoresis, sucrose gradient fractionation, or other size fractionation techniques can be used to separate these fragments into pools of "large" (7-12 kb) and "smaller" fragments (3-7 kb and 0-3 kb).
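  • The effect of methylation on fragment size, and the assignment of fragments to the size pools named above, can be sketched in Python. This is a simplified model under stated assumptions: recognition sites are given as positions on a linear molecule, a methylated site is never cut, and the handling of pool boundaries (for example, whether exactly 3 kb counts as 3-7 kb) is a choice made here, not one stated in the text. All function names are hypothetical.

```python
def methylation_sensitive_digest(length, sites, methylated):
    """Fragment lengths after cutting a linear molecule at unmethylated sites only."""
    cuts = sorted(p for p in sites if p not in methylated)
    bounds = [0] + cuts + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def size_pool(length_bp):
    """Assign a fragment to one of the size pools named in the text."""
    if length_bp < 3_000:
        return "0-3 kb"
    if length_bp < 7_000:
        return "3-7 kb"
    if length_bp <= 12_000:
        return "7-12 kb"
    return "over 12 kb"  # outside the pools named in the text


# Hypothetical 12 kb molecule with a recognition site every 1 kb.
sites = list(range(1_000, 12_000, 1_000))
unmethylated = methylation_sensitive_digest(12_000, sites, methylated=set())
heavily_methylated = methylation_sensitive_digest(
    12_000, sites, methylated=set(sites) - {2_000}
)
pools_unmethylated = sorted({size_pool(f) for f in unmethylated})
pools_methylated = sorted({size_pool(f) for f in heavily_methylated})
```

  • In this toy case the unmethylated molecule falls entirely into the 0-3 kb pool, while the heavily methylated molecule contributes a 10 kb fragment to the 7-12 kb pool.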
  • Repetitive or methylated DNA fragments isolated using the methods described above can be ligated (using T4 DNA ligase, for example) to a plasmid vector and cloned by transformation into E. coli. These clones can then be propagated, sequenced, used to assemble minichromosomes, or used to identify larger centromere clones, generate molecular markers that facilitate genetic mapping of centromeres, or create probes for chromosome mapping experiments such as fluorescent in situ hybridization (FISH).
  • A genomic library can be screened for clones carrying centromere DNA by arraying the clones onto solid supports, such as membrane filters, and probing with labeled fragments of purified centromere DNA, including the cloned repetitive or methylated DNA fragments described above, or alternatively, the entire set of rapidly annealing genomic DNA or highly methylated genomic DNA fragments.
  • Probes can be used singly or in combination. Typically these probes are labeled by incorporation of radionucleotides, fluorescent nucleotides, or other chemical or enzymatic ligands that enable easy detection.
  • The labeled probe DNA is denatured and hybridized to the arrayed library using standard molecular biology techniques.
  • Hybridization is performed at a temperature that will discourage non-specific DNA annealing while promoting the hybridization of the labeled probe to complementary sequences. After incubation, the arrayed library is washed to remove unannealed probe, and a detection method appropriate to the label incorporated in the probe is used. For example, if the probe is radiolabeled, the labeled filter is exposed to X-ray film.
  • To identify centromere clones, the results of several hybridization experiments are quantitated and compared. In some cases, centromere clones may hybridize to only one probe; in other cases, the clones will hybridize to multiple probes.
  • The hybridization intensity of each clone to each probe can be measured and stored in a database.
  • A preferred method for this analysis is to use software that digitizes the hybridization signals, assigns each signal to its corresponding clone address, ensures that duplicate copies of the clones successfully hybridized, and enters the resulting information into a relational database (MySQL, for example).
  • Another possible method for this analysis is to examine the hybridization results visually, estimate the hybridization intensity, and tabulate the resulting information.
  • The results of each hybridization experiment can be classified by grouping clones that show hybridization to each probe above a threshold value. For example, a computerized relational database can be queried for clones giving hybridization signals above a certain threshold for individual probes or for multiple probes. Based on these hybridization patterns, clones can be grouped into categories, and representative members of each category can be tested in minichromosomes.
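  • A threshold query of this kind might look like the following sketch. It uses Python's built-in `sqlite3` module in place of MySQL so that the example is self-contained; the table name `hybridization`, its columns, the clone addresses, the probe names, and the intensity values are all hypothetical.

```python
import sqlite3


def clones_above_threshold(conn, probes, threshold):
    """Return clones whose signal exceeds `threshold` for every probe in `probes`."""
    placeholders = ",".join("?" for _ in probes)
    sql = (
        "SELECT clone FROM hybridization "
        f"WHERE probe IN ({placeholders}) AND intensity > ? "
        "GROUP BY clone HAVING COUNT(DISTINCT probe) = ? "
        "ORDER BY clone"
    )
    rows = conn.execute(sql, (*probes, threshold, len(probes)))
    return [row[0] for row in rows]


# Hypothetical measurements: one row per (clone address, probe) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hybridization (clone TEXT, probe TEXT, intensity REAL)")
conn.executemany(
    "INSERT INTO hybridization VALUES (?, ?, ?)",
    [
        ("A01", "repeat_probe", 0.9), ("A01", "methyl_probe", 0.8),
        ("A02", "repeat_probe", 0.7), ("A02", "methyl_probe", 0.1),
        ("B07", "repeat_probe", 0.2), ("B07", "methyl_probe", 0.6),
    ],
)

single_probe = clones_above_threshold(conn, ["repeat_probe"], 0.5)
both_probes = clones_above_threshold(conn, ["repeat_probe", "methyl_probe"], 0.5)
```

  • Querying one probe at a time and then several together gives the categories of clones (single-probe versus multi-probe hybridizers) from which representatives can be chosen.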
  • BLAST is the Basic Local Alignment Search Tool, a family of freely available algorithms for sequence database searches. BLAST aligns two sequences and yields an estimate of the probability that this alignment is significant, i.e., that it did not occur by chance.
  • The two sequences compared by BLAST are called the 'query', usually a single sequence of interest, and the 'subject', often part of a large database of sequences that are compared to the query.
  • The query sequence (query) can also be part of a database of sequences.
  • The outputs of BLAST are High Scoring Pairs (HSPs), which are alignments of subject and query sequences.
  • Nucleotide position describes the position of a given nucleotide within the sequence, relative to the first nucleotide of the sequence.
  • The BLAST score (e value) is a measure of the likelihood that a given sequence alignment is significant (the lower the value, the higher the significance).
  • The algorithm is as follows: (1) provide a first dataset consisting of the genomic sequences, or a representative fraction of genomic sequence, of the organism of interest; (2) identify and eliminate known non-centromeric repeat sequences from the first dataset by using the BLAST sequence comparison algorithm to create a second dataset; (3) compare each sequence in the second dataset to itself by using the BLAST sequence comparison algorithm, obtain a BLAST score for each pair of sequences compared, and collect high score pairs to create a third dataset; (4) examine the BLAST score of each high score pair in the third dataset and eliminate the pairs having a score greater than 10 " to create a fourth dataset; (5) eliminate the high score pairs in the fourth dataset having less than 80 bp or more than 250 bp to create a fifth dataset; (6) examine the nucleotide position of each high score pair in the fifth dataset and eliminate pairs having 100% identity and identical nucleotide positions (i.e.
  • The dataset used in step (1) of the above algorithm would be a whole genome dataset, such as the Arabidopsis genome, which was derived by methodical sequencing of mapped clones, or the rice genome dataset, which was derived by shotgun sequencing.
  • The algorithm would also work well on representative genome datasets.
  • By representative genome datasets, it is meant that the genomic sequences in the dataset are a subset of the sequences of the whole genome, collected from the whole genome without bias, such as bias toward coding sequences. These sequences would be representative of the genome as a whole. For example, the use of a 0.5X or even a 0.1X library of Arabidopsis with representative genome datasets would return a true positive result. On the contrary, the use of a subset of genomic sequences that is not representative of the whole genome and is biased toward certain sequences, such as coding sequences, would return false positive results.
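  • Steps (4) through (6) of the algorithm above amount to three successive filters on a list of high score pairs, which can be sketched in Python as follows. This is an illustration under stated assumptions, not the inventors' implementation: the `HSP` record and its field names are hypothetical, the score cutoff is passed in as a required parameter because its value is not legible in the text, the value 1e-3 in the example is an arbitrary placeholder, and a pair is treated as a self-match only when query and subject are the same sequence with the same coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSP:
    """One high score pair; field names are illustrative."""
    query_id: str
    subject_id: str
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    length: int      # alignment length in bp
    identity: float  # percent identity
    evalue: float    # BLAST e value


def is_self_match(h):
    """True for a sequence aligned to itself at identical positions."""
    return (
        h.identity == 100.0
        and h.query_id == h.subject_id
        and h.q_start == h.s_start
        and h.q_end == h.s_end
    )


def filter_hsps(hsps, max_evalue, min_len=80, max_len=250):
    kept = [h for h in hsps if h.evalue <= max_evalue]           # step (4)
    kept = [h for h in kept if min_len <= h.length <= max_len]   # step (5)
    return [h for h in kept if not is_self_match(h)]             # step (6)


hsps = [
    HSP("bac1", "bac1", 1, 500, 1, 500, 500, 100.0, 0.0),          # too long
    HSP("bac1", "bac1", 100, 279, 100, 279, 180, 100.0, 1e-90),    # self-match
    HSP("bac1", "bac1", 100, 279, 1000, 1179, 180, 96.0, 1e-80),   # candidate repeat
    HSP("bac1", "bac2", 10, 129, 40, 159, 120, 85.0, 5.0),         # weak alignment
]
kept = filter_hsps(hsps, max_evalue=1e-3)  # placeholder cutoff
```

  • Only the third record survives: a strong 180 bp alignment between two different positions of the same sequence, which is the signature of a repeated element.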
  • The term "nucleic acid segment" refers to a nucleic acid molecule that has been purified from total genomic nucleic acids of a particular species. Therefore, a nucleic acid segment conferring centromere function refers to a nucleic acid segment that contains centromere sequences yet is isolated away from, or purified free from, total genomic nucleic acids.
  • The term "nucleic acid segment" includes nucleic acid segments and smaller fragments of such segments, and also recombinant vectors, including, for example, minichromosomes, artificial chromosomes, BACs, YACs, plasmids, cosmids, phage, viruses, and the like.
  • A "nucleic acid segment comprising an isolated or purified centromeric sequence" refers to a nucleic acid segment including centromere sequences and, in certain aspects, regulatory sequences, isolated substantially away from other naturally occurring sequences, or other nucleic acid sequences.
  • The term "gene" is used for simplicity to refer to a protein-, polypeptide- or peptide-encoding unit. As will be understood by those in the art, this functional term includes genomic sequences, cDNA sequences and smaller engineered gene segments that may express, or may be adapted to express, proteins, polypeptides or peptides.
  • "Isolated substantially away from other sequences" means that the sequences of interest, in this case centromere sequences, are included within the genomic nucleic acid clones provided herein. Of course, this refers to the nucleic acid segment as originally isolated, and does not exclude all genes or coding regions.
  • The invention concerns isolated nucleic acid segments and recombinant vectors incorporating nucleic acid sequences that encode a centromere functional sequence that includes a contiguous sequence from the centromeres of the current invention.
  • Nucleic acid segments that exhibit centromere function activity will be most preferred.
  • The invention provides a plant centromere which is further defined as an Arabidopsis thaliana centromere.
  • The plant centromere comprises an Arabidopsis thaliana chromosome 2 centromere.
  • The chromosome 2 centromere may comprise, for example, from about 100 to about 611,000, about 500 to about 611,000, about 1,000 to about 611,000, about 10,000 to about 611,000, about 20,000 to about 611,000, about 40,000 to about 611,000, about 80,000 to about 611,000, about 150,000 to about 611,000, or about 300,000 to about 611,000 contiguous nucleotides of a first nucleic acid sequence flanking a first series of 180 bp repeats in centromere 2 of A. thaliana.
  • The centromere may also be defined as comprising from about 100 to about 50,959, about 500 to about 50,959, about 1,000 to about 50,959, about 5,000 to about 50,959, about 10,000 to about 50,959, about 20,000 to about 50,959, about 30,000 to about 50,959, or about 40,000 to about 50,959 contiguous nucleotides of a second nucleic acid sequence flanking a second series of 180 bp repeats in centromere 2 of A. thaliana.
  • The centromere may comprise sequences from both the first and the second sequences, including the aforementioned fragments, or the entirety of these sequences.
  • The inventors contemplate that a 3' fragment of the first sequence can be fused to a 5' fragment of the second sequence, optionally including one or more 180 bp repeat sequences disposed therebetween.
  • The invention provides an Arabidopsis thaliana chromosome 4 centromere.
  • The centromere may comprise from about 100 to about 1,082,000, about 500 to about 1,082,000, about 1,000 to about 1,082,000, about 5,000 to about 1,082,000, about 10,000 to about 1,082,000, about 50,000 to about 1,082,000, about 100,000 to about 1,082,000, about 200,000 to about 1,082,000, about 400,000 to about 1,082,000, or about 800,000 to about 1,082,000 contiguous nucleotides of a third nucleic acid sequence flanking a third series of repeated sequences, and may be defined as comprising the nucleic acid sequence of the third sequence.
  • The centromere may also be defined as comprising from about 100 to about 163,317, about 500 to about 163,317, about 1,000 to about 163,317, about 5,000 to about 163,317, about 10,000 to about 163,317, about 30,000 to about 163,317, about 50,000 to about 163,317, about 80,000 to about 163,317, or about 120,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of a fourth sequence flanking a fourth series of repeated sequences, and may be defined as comprising the nucleic acid sequence of the fourth sequence.
  • The centromere may comprise sequences from both the third and the fourth sequences, including the aforementioned fragments, or the entirety of the third and the fourth sequences.
  • The inventors contemplate that a 3' fragment of the third sequence can be fused to a 5' fragment of the fourth sequence, optionally including one or more 180 bp repeat sequences disposed therebetween.
  • An Arabidopsis thaliana chromosome 1, 3 or 5 centromere selected from the nucleic acid sequence given by one of the repeated sequences in these chromosomes, or fragments thereof.
  • The length of the repeat used may vary, but will preferably range from about 20 bp to about 250 bp, from about 50 bp to about 225 bp, from about 75 bp to about 210 bp, from about 100 bp to about 205 bp, from about 125 bp to about 200 bp, from about 150 bp to about 195 bp, from about 160 bp to about 190 bp, and from about 170 bp to about 185 bp, including about 180 bp.
  • The construct comprises at least 100 base pairs, up to and including the full length, of one of the preceding sequences.
  • The construct may include 1 or more 180 base pair repeats.
  • Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • The copies, while largely identical, can vary from each other. Such repeat variation is commonly observed in naturally occurring centromeres.
  • The centromere is a Brassica oleracea centromere comprising Brassica oleracea centromere DNA.
  • The Brassica oleracea centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2. Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • The repeated nucleotide sequence is isolated from Brassica oleracea and is given by SEQ ID NO: 1, 2, 3, or 4.
  • The centromere is a Glycine max centromere comprising Glycine max centromere DNA.
  • The Glycine max centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2. Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • The repeated nucleotide sequence is isolated from Glycine max and is given by SEQ ID NO: 5, 6, 7, or 8.
  • The centromere is a Lycopersicon esculentum centromere comprising Lycopersicon esculentum centromere DNA.
  • The Lycopersicon esculentum centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
  • Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • The repeated nucleotide sequence is isolated from Lycopersicon esculentum and is given by SEQ ID NO: 9 or 10.
  • The centromere is a Zea mays centromere comprising Zea mays centromere DNA.
  • The centromere is defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
  • Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers.
  • In one embodiment, the repeated nucleotide sequence is isolated from Zea mays and is given by SEQ ID NO: 11, 12 or 13.
  • The centromere can additionally be defined as the region of the chromosome where the sister chromatids pair during cell division.
  • The centromere is also the chromosomal region where the kinetochore (the chromosomal attachment structure for the spindle) and the spindle (the cellular machinery that provides the motive force for chromosome segregation) attach to the chromosome during mitosis and meiosis.
  • The centromere is also defined as the region of the primary constriction in a condensed chromosome.
  • The DNA of the centromere is characteristically heavily methylated, repetitive, and condensed (heterochromatic).
  • Minichromosome construction. Minichromosomes are constructed by combining fragments of centromere DNA with other DNA sequences useful for propagation of the resultant recombinant DNA molecule in E. coli, other bacteria, yeast or plants.
  • Recombinant plasmids containing large fragments of centromere DNA are referred to as centromere clones.
  • Centromere sequences removed from centromere clones, or centromere sequences derived directly from genomic DNA, are referred to as centromere fragments.
  • Minichromosome vector sequences, or minichromosome vectors, can include, but are not limited to, selectable marker genes, visible marker genes, origins of replication, restriction endonuclease recognition sites, homing endonuclease recognition sites, sequences recognized by site-specific recombinase enzymes, telomere sequences, and sequences required for delivery of minichromosomes into bacteria, yeast or plant cells.
  • Recombinant constructs containing both large centromere fragments as well as minichromosome vector sequences are referred to as minichromosomes.
  • The process of assembling minichromosomes from centromere clones/fragments and minichromosome vector sequences can be done in several ways, and involves techniques that are common practice among those trained in molecular biology:
  • Centromere fragments and minichromosome vector DNA fragments are generated and purified using conventional techniques, some of which include restriction enzyme digestion, agarose gel electrophoresis, gel purification of specific fragments, anion-exchange purification and ethanol precipitation.
  • The resulting purified centromere and vector fragments are enzymatically joined in vitro, using, for example, T4 DNA ligase.
  • The ends of the fragments can be cohesive, as the result of digestion with compatible restriction endonucleases or from the addition of compatible oligonucleotide linkers; alternatively, the ends of the fragments can be blunt and can be directly joined.
  • The resulting minichromosomes are introduced into E. coli, other bacteria, yeast, or plant cells using chemical or physical transformation methods.
  • The structure of the resulting minichromosomes can be determined by recovering them from the host organism and assessing DNA fragment size and composition.
  • Minichromosome vector sequences can be constructed to include site-specific recombination sequences (for example, those recognized by the bacteriophage P1 Cre recombinase, or the bacteriophage lambda integrase, or similar recombination enzymes).
  • A compatible recombination site, or a pair of such sites, can also be included in the centromere clones.
  • Incubation of the minichromosome vector and the centromere clone in the presence of the recombinase enzyme causes strand exchange to occur between the recombination sites in the two plasmids; the resulting minichromosomes contain centromere sequences as well as minichromosome vector sequences (FIG. 5).
  • Introducing the DNA molecules formed in such recombination reactions into E. coli, other bacteria, yeast or plant cells can be followed by selection for marker genes present on both parental plasmids, allowing the isolation of minichromosomes.
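  • The logic of recombining two plasmids and then selecting for markers from both parents can be modeled in a few lines of Python. This is a deliberately simplified sketch: each circular plasmid is represented as an ordered list of feature names, each carries a single recombination site in the same orientation, and all feature names (`KanR`, `CmR`, `ori_E_coli`, and so on) are hypothetical.

```python
LOX = "loxP"


def rotate_to_site(features, site=LOX):
    """Rewrite a circular feature list so that it starts at the recombination site."""
    i = features.index(site)
    return features[i:] + features[:i]


def cointegrate(plasmid_a, plasmid_b, site=LOX):
    """Strand exchange between one site on each circle joins them into one circle."""
    return rotate_to_site(plasmid_a, site) + rotate_to_site(plasmid_b, site)


def survives_selection(plasmid, required_markers):
    """True if the molecule carries every marker being selected for."""
    return all(marker in plasmid for marker in required_markers)


vector = ["ori_E_coli", "KanR", LOX, "plant_marker"]
centromere_clone = ["CmR", LOX, "centromere", "ori_BAC"]

minichromosome = cointegrate(vector, centromere_clone)
selection = ["KanR", "CmR"]  # one marker from each parental plasmid
```

  • Neither parental plasmid alone carries both markers, so double selection recovers only the recombined molecule, which contains the centromere together with the vector sequences.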
  • Minichromosome vector tailing method for minichromosome construction. Centromere DNA fragments isolated from genomic DNA or from centromere clones can be modified on their ends by treatment with restriction endonucleases, or by ligation with DNA molecules including, but not limited to, oligonucleotide linkers, or by the addition of nucleotides, to produce a desired cohesive or blunt end. These fragments are size-fractionated by agarose gel electrophoresis or other methods, and the centromere fragments purified using conventional techniques. Minichromosome vector fragments are generated and purified in a similar manner, resulting in linear minichromosome vector sequences with DNA ends compatible with those on the centromere fragments.
  • Compatible ends in this case are defined by ends that can be joined in vitro by the action of a ligase enzyme. As shown in FIG. 6, the two fragments are then mixed so that the minichromosome vector molecules are present in at least two-fold molar excess over the centromere fragments. The fragments are joined by the addition of a ligase enzyme (for example bacteriophage T4 DNA ligase), resulting in the formation of DNA molecules in which minichromosome vector molecules have been joined to both ends of the same centromere fragment.
  • linear minichromosome precursors consisting of a fragment of the original minichromosome vector attached to each end of the centromere fragment.
  • the ends of this hybrid molecule are compatible because they were created by the same restriction enzyme.
  • This linear minichromosome precursor is purified, for example, by agarose gel electrophoresis followed by gel purification of the DNA fragments of the expected length.
  • the purified DNA molecules are circularized by joining the ends, for example by treatment with a DNA ligase enzyme.
  • the resulting minichromosome molecules can be introduced into E. coli, other bacteria, yeast or plant cells, followed by purification and characterization using conventional methods.
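The tailing method calls for the vector molecules to be present in at least two-fold molar excess over the centromere fragments. Because molar amounts depend on fragment length, the required vector mass can be estimated as sketched below. The figure of about 650 g/mol per base pair of double-stranded DNA is a commonly used approximation, not a value from this description, and the function names and example quantities are hypothetical.

```python
# Approximate average molar mass of one base pair of double-stranded DNA (assumption).
GRAMS_PER_MOL_PER_BP = 650.0


def picomoles(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * GRAMS_PER_MOL_PER_BP)


def vector_mass_ng(centromere_ng: float, centromere_bp: int,
                   vector_bp: int, molar_excess: float = 2.0) -> float:
    """Mass of vector (ng) giving the requested molar excess over the centromere fragment."""
    if molar_excess <= 0:
        raise ValueError("molar_excess must be positive")
    return centromere_ng * (vector_bp / centromere_bp) * molar_excess


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 100 kb centromere fragment, 10 kb vector.
    needed = vector_mass_ng(100.0, 100_000, 10_000)
    print(f"vector needed: {needed:.1f} ng")
    print(f"centromere: {picomoles(100.0, 100_000):.5f} pmol")
    print(f"vector:     {picomoles(needed, 10_000):.5f} pmol")
```

Because the vector is much shorter than a typical centromere fragment, a two-fold molar excess corresponds to a comparatively small mass of vector DNA.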
  • Minichromosomes are purified and delivered into plant cells, either individually or as a mixture.
  • the minichromosomes can be either circular or linear or mixtures thereof.
  • the plant cells used for minichromosome delivery can be either intact seedlings, immature or mature plants, parts of seedlings or plants, specific plant tissues (for example leaves, stems, roots, flowers, fruits), differentiated tissues cultured in vitro (for example roots), or undifferentiated cells (for example callus) cultured in vitro.
  • the minichromosome DNA can be delivered into plant cells by a variety of methods including but not limited to the following: electroporation; Agrobacterium-mediated DNA delivery; virus-mediated DNA delivery; delivery mediated by salts or lipids that facilitate the cellular uptake of DNA; microinjection of DNA; manipulation into a cell of DNA-coated or DNA-containing particles, droplets, micelles, microspheres, or chemical complexes using a variety of techniques, including biolistic particle bombardment, optical tweezers, particle beams, and electrospray apparatus; manipulation of DNA-coated magnetic particles into the cells by magnetic fields; DNA delivery into cells by cell wounding using micro-needles (for example silicon carbide needles); sonication or other acoustic treatment of the cells to facilitate DNA uptake; fusion of plant cells with other cell types carrying a minichromosome, including bacterial, yeast, or other plant cells; any other electrical, chemical, physical, or biological mechanism that results in the introduction of minichromosome DNA into the plant cell.
  • Following minichromosome delivery, plant cells, plant tissues, or complete plants carrying the minichromosome can be isolated by a variety of selection methods. Selection involves subjecting the plant cells, tissues or plants to chemical, environmental, or mechanical treatments that enrich for those cells, tissues or plants that contain a minichromosome.
  • the selection methods include but are not limited to: fluorescence-activated cell sorting of cells, cell clumps, or cell protoplasts based on expression of a marker protein encoded by the minichromosome (for example, a fluorescent protein such as DsRed); affinity purification of cells, cell clumps, or protoplasts based on expression of a cell wall protein, membrane protein, or membrane-associated protein encoded by the minichromosome; any cell fractionation method capable of separating cells based on their density, size or shape to enrich for cells with a property that differs from that of the starting population and is conferred by the minichromosome; selection of cells for resistance to an antibiotic conferred by the minichromosome; selection of cells for resistance to an herbicide conferred by the minichromosome; selection of cells for resistance to a toxic metal, salt, mineral or other substance conferred by the minichromosome; selection of cells for resistance to abiotic stress (for example heat, cold, acid, base, osmotic stress) conferred by the minichromosome;
  • As a result of the treatment, a population of plant cells can be obtained that contain minichromosomes. Individual clones or sub-populations of these cells can be expanded in culture for further characterization. Alternatively, plant cells, plant tissues, or complete plants that carry minichromosomes can be identified by direct screening. Such methods involve subjecting each cell, plant, or tissue to diagnostic tests indicative of the presence of the minichromosome. These tests can include direct assays for the presence of minichromosome DNA, or indirect assays for properties conferred by the minichromosome.
  • Direct assays for the presence of the minichromosome DNA include but are not limited to: staining of cells with DNA-binding molecules to allow detection of an additional chromosome; in situ hybridization with labeled DNA probes corresponding to sequences present on the minichromosome; Southern blots or dot blots of DNA extracted from the cells, plant or tissue and probed with labeled DNA sequences corresponding to sequences present on the minichromosome; electrophoresis of genomic DNA extracted from the cells, plant or tissue under conditions that allow identification of the minichromosome; amplification of specific sequences present on the minichromosome from genomic DNA extracted from the cells, plant or tissue using the polymerase chain reaction.
  • Indirect assays for properties conferred by the minichromosome include but are not limited to: detection of the expression of a fluorescent marker encoded by the minichromosome by fluorescence microscopy, flow cytometry or fluorimetry; detection of the expression of a protein encoded by the minichromosome by use of specific antibodies, or any other reagent capable of specifically binding to the protein; use of cell fractionation methods capable of detecting a specific density, size or shape of the cells or tissues that is conferred by the minichromosome; growth of cells, seedlings, plants or tissues on an antibiotic-containing medium to determine the presence of an antibiotic-resistance gene encoded by the minichromosome; growth of cells, seedlings, plants or tissues on an herbicide-containing medium to determine the presence of an herbicide-resistance gene encoded by the minichromosome; growth of cells, seedlings, plants or tissues on a medium containing a toxic metal, salt, mineral or other substance to determine the presence of a gene conferring resistance to this substance encoded by the minichromosome.
  • Plant cells, tissues, or entire plants containing minichromosomes can be further characterized to determine whether the minichromosome is an autonomous DNA molecule, or whether it is associated with one of the plant cell's chromosomes by integration.
  • the methods used for this analysis include, but are not limited to, the following: 1) Detection of marker protein expression by microscopy, flow cytometry, fluorimetry, enzymatic assays, cell staining or any other technique that allows the detection of a marker protein having a specific enzymatic activity, or conferring a specific color, or fluorescence property onto the cells.
  • If a cell line has been selected for containing a minichromosome by selecting for the function of a resistance gene encoded by the minichromosome, and if a marker protein is also encoded by the minichromosome, then expression of this marker protein in the selected cells is an indication of the presence of the entire minichromosome, and could indicate autonomy of this minichromosome from the cell's other chromosomes.
  • genomic DNA isolated from the cells, tissues or plants can be fractionated by gel electrophoresis, either intact or following digestion with restriction endonucleases or homing endonucleases, allowing the detection of a minichromosome or a fragment of a minichromosome.
  • Markers include but are not limited to: visible markers conferring a visible characteristic to the plant; selectable markers, conferring resistance to an antibiotic, herbicide, or other toxic compound; enzymatic markers, conferring an enzymatic activity that can be assayed in the plant or in extracts made from the plant; protein markers, allowing the specific detection of a protein expressed in the plant; molecular markers, such as restriction fragment length polymorphisms, amplified fragment length polymorphisms, short sequence repeat (microsatellite) markers, presence of certain sequences in the DNA of the plant as detected by the polymerase chain reaction, single nucleotide polymorphisms or cleavable amplified polymorphic sites.
  • Plant regeneration from transformed cell clones. Plant cells or tissues that harbor minichromosomes can be used to regenerate entire plants. This will be accomplished with standard techniques of plant regeneration from differentiated tissues or undifferentiated cells. Typically, transformed tissues or callus are subjected to a series of treatments with media containing various mixtures of plant hormones and growth regulators that promote the formation of a plant embryo, specific plant tissues or organs, or a complete plant (roots and shoot) from the starting cells or tissues. Following plant regeneration, the plant can be grown either in sterile media or in soil.
  • the inheritance of minichromosomes can be measured through one or more cell divisions. After isolating cells, tissues, or entire plants that contain the minichromosome, the population of cells is allowed to grow (either with or without selection), and the presence of the minichromosome is monitored as the cells divide.
  • Minichromosomes can be detected in cells by a variety of methods, including but not limited to: detection of fluorescence or any other visual characteristic arising from a marker protein gene present on the minichromosome; resistance to an antibiotic, herbicide, toxic metal, salt, mineral or other substance, or abiotic stress as outlined above (Isolating plant cells containing minichromosomes); staining of cells with DNA-binding molecules to allow detection of an additional chromosome; in situ hybridization with labeled DNA probes corresponding to sequences present on the minichromosome; Southern blots or dot blots of DNA extracted from the cell population and probed with labeled DNA sequences corresponding to sequences present on the minichromosome; expression of a marker enzyme encoded by a gene present on the minichromosome (e.g., luciferase, alkaline phosphatase, beta-galactosidase, etc.).
  • the percentage of cells containing the chromosome is determined at regular intervals during this growth phase.
  • the change in the fraction of cells harboring the minichromosome, divided by the number of cell divisions, represents the average minichromosome loss rate. Minichromosomes with the lowest loss rates have the highest level of inheritance.
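The loss-rate definition above (change in the fraction of minichromosome-containing cells divided by the number of cell divisions) can be written out directly. The following is a minimal sketch; the function names, the minichromosome labels and the measured fractions are hypothetical values chosen only to illustrate the arithmetic.

```python
def average_loss_rate(fraction_start: float, fraction_end: float,
                      divisions: int) -> float:
    """Average minichromosome loss rate per cell division.

    Computed as the change in the fraction of cells harboring the
    minichromosome divided by the number of cell divisions.
    """
    if divisions <= 0:
        raise ValueError("divisions must be a positive number")
    return (fraction_start - fraction_end) / divisions


def rank_by_inheritance(measurements: dict) -> list:
    """Order minichromosomes from lowest to highest loss rate.

    `measurements` maps a name to (fraction_start, fraction_end, divisions).
    The first entry of the result has the highest level of inheritance.
    """
    rates = {name: average_loss_rate(*m) for name, m in measurements.items()}
    return sorted(rates, key=rates.get)


if __name__ == "__main__":
    # Hypothetical data: fraction of cells carrying each minichromosome
    # before and after 15 cell divisions.
    data = {
        "MC-A": (0.95, 0.80, 15),
        "MC-B": (0.90, 0.30, 15),
    }
    for name, m in data.items():
        print(name, round(average_loss_rate(*m), 4))
    print("best inherited first:", rank_by_inheritance(data))
```

In this example the first minichromosome loses about one percentage point of cells per division and the second about four, so the first would be ranked as better inherited.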
  • the resulting minichromosomes "rescued" in this fashion may differ from their parental molecules in total size, size of the centromere, presence or absence of additional sequences, and overall arrangement of the sequences.
  • These procedures allow the isolation of DNA molecules capable of replicating and segregating in plant cells without having to test minichromosomes individually. For example, delivering pools of minichromosomes, or pools of centromere clones, into plant cells, tissues or whole plants, and recovering them by the methods listed above, facilitates the selection of specific minichromosomes or centromere clones that remain autonomous in plant cells.
  • X. Exogenous Genes for Expression in Plants. One particularly important advance of the present invention is that it provides methods and compositions for expression of exogenous genes in plant cells.
  • One advance of the constructs of the current invention is that they enable the introduction of multiple genes (often referred to as gene "stacking"), potentially representing an entire biochemical pathway, or any combination of genes encoding different biochemical processes or pathways.
  • the current invention allows for the transformation of plant cells with a minichromosome comprising a number of structural genes.
  • Another advantage is that more than one minichromosome could be introduced, allowing combinations of genes to be moved and shuffled.
  • the ability to eliminate a minichromosome from a plant would provide additional flexibility, making it possible to alter the set of genes contained within a plant. Further, by using site-specific recombinases, it should be possible to add genes to an existing minichromosome once it is in a plant.
  • an "expressible gene" is any gene that is capable of being transcribed into RNA (e.g., mRNA, antisense RNA, etc.) or translated into a protein, expressed as a trait of interest, or the like, and is not limited to selectable, screenable or non-selectable marker genes.
  • the inventors also contemplate that, where an expressible gene that is not necessarily a marker gene is employed in combination with a marker gene, one may employ the separate genes on either the same or different DNA segments for transformation. In the latter case, the different vectors may be delivered concurrently to recipient cells to maximize cotransformation or may be delivered sequentially.
  • One of the major purposes of transformation of crop plants is to add some commercially desirable, agronomically important traits to the plant.
  • Such traits include, but are not limited to, herbicide resistance or tolerance; insect resistance or tolerance; disease resistance or tolerance (viral, bacterial, fungal, nematode); stress tolerance and/or resistance, as exemplified by resistance or tolerance to drought, heat, chilling, freezing, excessive moisture, salt stress; oxidative stress; increased yields; food content and makeup; physical appearance; male sterility; drydown; standability; prolificacy; starch quantity and quality; oil quantity and quality; protein quality and quantity; amino acid composition; the production of a pharmaceutically active protein; the production of a small molecule with medicinal properties; the production of a chemical including those with industrial utility; the production of nutraceuticals, carbohydrates, RNAs, lipids, fuels, dyes, pigments, vitamins, scents, flavors, vaccines, antibodies,
  • minichromosomes a desired genomic segment such as one that includes a quantitative trait onto a minichromosome.
  • the present invention contemplates the transformation of a recipient cell with minichromosomes comprising more than one exogenous gene.
  • An "exogenous gene" can be a gene not normally found in the host genome in an identical context, or alternatively, the minichromosome could be used to introduce extra copies of host genes into a cell.
  • the gene may be isolated from a different species than that of the host genome, or alternatively, isolated from the host genome but operably linked to one or more regulatory regions which differ from those found in the unaltered, native gene.
  • Two or more exogenous genes also can be supplied in a single transformation event using either distinct transgene-encoding vectors, or using a single vector incorporating two or more gene coding sequences.
  • plasmids bearing the bar and aroA expression units in either convergent, divergent, or colinear orientation are considered to be particularly useful.
  • Further preferred combinations are those of an insect resistance gene, such as a Bt gene, along with a protease inhibitor gene such as pinII, or the use of bar in combination with either of the above genes.
  • any two or more transgenes of any description such as those conferring herbicide, insect, disease (viral, bacterial, fungal, nematode) or drought resistance, male sterility, drydown, standability, prolificacy, starch properties, oil quantity and quality, modified chemical production, pharmaceutical or nutraceutical properties, bioremediation properties, increased biomass, altered growth rate, altered fitness, altered salinity tolerance, altered thermal tolerance, altered growth form, altered composition, altered metabolism, altered biodegradability, altered CO2 fixation, altered stress tolerance, presence of bioindicator activity, altered digestibility by humans or animals, altered allergenicity, altered mating characteristics, altered pollen dispersal, altered appearance, improved environmental impact, nitrogen fixation capability, or those increasing yield or nutritional quality may be employed as desired.
  • Herbicide Resistance. The genes encoding phosphinothricin acetyltransferase (bar and pat), glyphosate tolerant EPSP synthase genes, the glyphosate degradative enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a dehalogenase enzyme that inactivates dalapon), herbicide resistant (e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes (encoding a nitrilase enzyme that degrades bromoxynil) are good examples of herbicide resistance genes for use in transformation.
  • the bar and pat genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which inactivates the herbicide phosphinothricin and prevents this compound from inhibiting glutamine synthetase enzymes.
  • the enzyme 5-enolpyruvylshikimate 3-phosphate synthase (EPSP Synthase) is normally inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate).
  • genes are known that encode glyphosate-resistant EPSP synthase enzymes. These genes are particularly contemplated for use in plant transformation.
  • the deh gene encodes the enzyme dalapon dehalogenase and confers resistance to the herbicide dalapon.
  • the bxn gene codes for a specific nitrilase enzyme that converts bromoxynil to a non-herbicidal degradation product.
  • Insect Resistance. Potential insect resistance genes that can be introduced include Bacillus thuringiensis crystal toxin genes or Bt genes (Watrud et al., 1985). Bt genes may provide resistance to lepidopteran or coleopteran pests such as European Corn Borer (ECB).
  • Preferred Bt toxin genes for use in such embodiments include the CryIA(b) and CryIA(c) genes. Endotoxin genes from other species of B. thuringiensis which affect insect growth or development also may be employed in this regard.
  • preferred Bt genes for use in the transformation protocols disclosed herein will be those in which the coding sequence has been modified to effect increased expression in plants, and more particularly, in monocot plants.
  • Means for preparing synthetic genes are well known in the art and are disclosed in, for example, U.S. Patent No. 5,500,365 and U.S. Patent No. 5,689,052, each of the disclosures of which is specifically incorporated herein by reference in its entirety.
  • modified Bt toxin genes include a synthetic Bt CryIA(b) gene (Perlak et al., 1991), and the synthetic CryIA(c) gene termed 1800b (PCT Application WO 95/06128).
  • Some examples of other Bt toxin genes known to those of skill in the art are given in Table 1 below.
  • Protease inhibitors also may provide insect resistance (Johnson et al, 1989), and will thus have utility in plant transformation.
  • the use of a protease inhibitor II gene, pinII, from tomato or potato is envisioned to be particularly useful. Even more advantageous is the use of a pinII gene in combination with a Bt toxin gene, the combined effect of which has been discovered to produce synergistic insecticidal activity.
  • genes encoding lectins may confer additional or alternative insecticide properties.
  • Lectins (originally termed phytohemagglutinins) are multivalent carbohydrate-binding proteins which have the ability to agglutinate red blood cells from a range of species. Lectins have been identified recently as insecticidal agents with activity against weevils, ECB and rootworm (Murdock et al, 1990; Czapla & Lang, 1990). Lectin genes contemplated to be useful include, for example, barley and wheat germ agglutinin (WGA) and rice lectins (Gatehouse et al, 1984), with WGA being preferred.
  • Genes controlling the production of large or small polypeptides active against insects when introduced into the insect pests, such as, e.g., lytic peptides, peptide hormones and toxins and venoms, form another aspect of the invention.
  • the expression of juvenile hormone esterase, directed towards specific insect pests also may result in insecticidal activity, or perhaps cause cessation of metamorphosis (Hammock et al, 1990).
  • Transgenic plants expressing genes which encode enzymes that affect the integrity of the insect cuticle form yet another aspect of the invention.
  • genes include those encoding, e.g., chitinase, proteases, lipases and also genes for the production of nikkomycin, a compound that inhibits chitin synthesis, the introduction of any of which is contemplated to produce insect resistant plants.
  • Genes that code for activities that affect insect molting such as those affecting the production of ecdysteroid UDP-glucosyl transferase, also fall within the scope of the useful transgenes of the present invention.
  • Genes that code for enzymes that facilitate the production of compounds that reduce the nutritional quality of the host plant to insect pests also are encompassed by the present invention. It may be possible, for instance, to confer insecticidal activity on a plant by altering its sterol composition. Sterols are obtained by insects from their diet and are used for hormone synthesis and membrane stability. Therefore alterations in plant sterol composition by expression of novel genes, e.g., those that directly promote the production of undesirable sterols or those that convert desirable sterols into undesirable forms, could have a negative effect on insect growth and/or development and hence endow the plant with insecticidal activity.
  • Lipoxygenases are naturally occurring plant enzymes that have been shown to exhibit anti-nutritional effects on insects and to reduce the nutritional quality of their diet. Therefore, further embodiments of the invention concern transgenic plants with enhanced lipoxygenase activity which may be resistant to insect feeding.
  • Tripsacum dactyloides is a species of grass that is resistant to certain insects, including corn rootworm. It is anticipated that genes encoding proteins that are toxic to insects or are involved in the biosynthesis of compounds toxic to insects will be isolated from Tripsacum and that these novel genes will be useful in conferring resistance to insects. It is known that the basis of insect resistance in Tripsacum is genetic, because said resistance has been transferred to Zea mays via sexual crosses (Branson and Guss, 1972). It is further anticipated that other cereal, monocot or dicot plant species may have genes encoding proteins that are toxic to insects which would be useful for producing insect resistant plants.
  • genes encoding proteins characterized as having potential insecticidal activity also may be used as transgenes in accordance herewith.
  • Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder et al., 1987) which may be used as a rootworm deterrent; genes encoding avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda et al., 1987) which may prove particularly useful as a corn rootworm deterrent; ribosome inactivating protein genes; and even genes that regulate plant structures.
  • Transgenic plants including anti-insect antibody genes and genes that code for enzymes that can convert a non-toxic insecticide (pro-insecticide) applied to the outside of the plant into an insecticide inside the plant also are contemplated.
  • Environment or Stress Resistance. Improvement of a plant's ability to tolerate various environmental stresses such as, but not limited to, drought, excess moisture, chilling, freezing, high temperature, salt, and oxidative stress, also can be effected through expression of novel genes. It is proposed that benefits may be realized in terms of increased resistance to freezing temperatures through the introduction of an "antifreeze" protein such as that of the Winter Flounder (Cutler et al., 1989) or synthetic gene derivatives thereof.
  • Improved chilling tolerance also may be conferred through increased expression of glycerol-3-phosphate acetyltransferase in chloroplasts (Wolter et al, 1992). Resistance to oxidative stress (often exacerbated by conditions such as chilling temperatures in combination with high light intensities) can be conferred by expression of superoxide dismutase (Gupta et al, 1993), and may be improved by glutathione reductase (Bowler et al, 1992). Such strategies may allow for tolerance to freezing in newly emerged fields as well as extending later maturity higher yielding varieties to earlier relative maturity zones.
  • The terms "drought resistance" and "drought tolerance" are used to refer to a plant's increased resistance or tolerance to stress induced by a reduction in water availability, as compared to normal circumstances, and the ability of the plant to function and survive in lower-water environments.
  • the expression of genes encoding for the biosynthesis of osmotically-active solutes, such as polyol compounds, may impart protection against drought.
  • such genes include those encoding mannitol-1-phosphate dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase (Kaasen et al., 1992).
  • these introduced genes will result in the accumulation of either mannitol or trehalose, respectively, both of which have been well documented as protective compounds able to mitigate the effects of stress. Mannitol accumulation in transgenic tobacco has been verified and preliminary results indicate that plants expressing high levels of this metabolite are able to tolerate an applied osmotic stress (Tarczynski et al, 1992, 1993).
  • Naturally occurring metabolites that are osmotically active and/or provide some direct protective effect during drought and/or desiccation include fructose, erythritol (Coxson et al., 1992), sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984; Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988; Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline (Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and Bohnert, 1992).
  • genes which promote the synthesis of an osmotically active polyol compound are genes which encode the enzymes mannitol-1-phosphate dehydrogenase, trehalose-6-phosphate synthase and myoinositol O-methyltransferase.
  • Three classes of Late Embryogenic Proteins (LEAs) have been assigned based on structural similarities (see Dure et al., 1989). All three classes of LEAs have been demonstrated in maturing (i.e. desiccating) seeds. Within these 3 types of LEA proteins, the Type-II (dehydrin-type) have generally been implicated in drought and/or desiccation tolerance in vegetative plant parts (e.g., Mundy and Chua, 1988; Piatkowski et al., 1990; Yamaguchi-Shinozaki et al., 1992). Recently, expression of a Type-III LEA (HVA-1) in tobacco was found to influence plant height, maturity and drought tolerance (Fitzpatrick, 1993).
  • expression of the HVA-1 gene influenced tolerance to water deficit and salinity (Xu et al., 1996). Expression of structural genes from all three LEA groups may therefore confer drought tolerance. Other types of proteins induced during water stress include thiol proteases, aldolases and transmembrane transporters (Guerrero et al., 1990), which may confer various protective and/or repair-type functions during drought stress. It also is contemplated that genes that affect lipid biosynthesis and hence membrane composition might also be useful in conferring drought resistance on the plant.
  • genes that are involved with specific morphological traits that allow for increased water extraction from drying soil would be of benefit. For example, introduction and expression of genes that alter root characteristics may enhance water uptake. It also is contemplated that expression of genes that enhance reproductive fitness during times of stress would be of significant value. For example, expression of genes that improve the synchrony of pollen shed and receptiveness of the female flower parts, i.e., silks, would be of benefit. In addition it is proposed that expression of genes that minimize kernel abortion during times of stress would increase the amount of grain to be harvested and hence be of value.
  • Resistance to viruses may be produced through expression of novel genes.
  • expression of a viral coat protein in a transgenic plant can impart resistance to infection of the plant by that virus and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et al., 1988; Abel et al., 1986).
  • expression of antisense genes targeted at essential viral functions may also impart resistance to viruses.
  • an antisense gene targeted at the gene responsible for replication of viral nucleic acid may inhibit replication and lead to resistance to the virus. It is believed that interference with other viral functions through the use of antisense genes also may increase resistance to viruses. Further, it is proposed that it may be possible to achieve resistance to viruses through other approaches, including, but not limited to, the use of satellite viruses.
  • Peptide antibiotics are polypeptide sequences which are inhibitory to growth of bacteria and other microorganisms.
  • the classes of peptides referred to as cecropins and magainins inhibit growth of many species of bacteria and fungi.
  • expression of PR proteins in monocotyledonous plants such as maize may be useful in conferring resistance to bacterial disease.
  • these genes are induced following pathogen attack on a host plant and have been divided into at least five classes of proteins (Bol, Linthorst, and Cornelissen, 1990). Included amongst the PR proteins are β-1,3-glucanases, chitinases, and osmotin and other proteins that are believed to function in plant resistance to disease organisms. Other genes have been identified that have antifungal properties, e.g., UDA (stinging nettle lectin) and hevein (Broekaert et al., 1989; Barkai-Golan et al., 1978). It is known that certain plant diseases are caused by the production of phytotoxins.
  • resistance to these diseases would be achieved through expression of a novel gene that encodes an enzyme capable of degrading or otherwise inactivating the phytotoxin. It also is contemplated that expression of novel genes that alter the interactions between the host plant and pathogen may be useful in reducing the ability of the disease organism to invade the tissues of the host plant, e.g., an increase in the waxiness of the leaf cuticle or other morphological characteristics.
  • genes that influence maturity and/or dry down can be identified and introduced into plant lines using transformation techniques to create new varieties adapted to different growing locations or the same growing location, but having improved yield to moisture ratio at harvest. Expression of genes that are involved in regulation of plant development may be especially useful.
  • genes may be introduced into plants that would improve standability and other plant growth characteristics. Expression of novel genes in plants which confer stronger stalks, improved root systems, or prevent or reduce ear droppage would be of great value to the farmer. It is proposed that introduction and expression of genes that increase the total amount of photoassimilate available by, for example, increasing light distribution and/or interception would be advantageous. In addition, the expression of genes that increase the efficiency of photosynthesis and/or the leaf canopy would further increase gains in productivity. It is contemplated that expression of a phytochrome gene in crop plants may be advantageous. Expression of such a gene may reduce apical dominance, confer semidwarfism on a plant, and increase shade tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for increased plant populations in the field.
  • E. coli gdhA genes may lead to increased fixation of nitrogen in organic compounds.
  • expression of gdhA in plants may lead to enhanced resistance to the herbicide glufosinate by incorporation of excess ammonia into glutamate, thereby detoxifying the ammonia.
  • expression of a novel gene may make a nutrient source available that was previously not accessible, e.g., an enzyme that releases a component of nutrient value from a more complex molecule, perhaps a macromolecule.
  • male sterility is useful in the production of hybrid seed. It is proposed that male sterility may be produced through expression of novel genes. For example, it has been shown that expression of genes that encode proteins that interfere with development of the male inflorescence and/or gametophyte result in male sterility. Chimeric ribonuclease genes that express in the anthers of transgenic tobacco and oilseed rape have been demonstrated to lead to male sterility (Mariani et al., 1990). A number of mutations were discovered in maize that confer cytoplasmic male sterility. One mutation in particular, referred to as T cytoplasm, also correlates with sensitivity to Southern corn leaf blight.
  • TURF-13 (Levings, 1990) was identified that correlates with T cytoplasm. It is proposed that it would be possible through the introduction of TURF-13 via transformation, to separate male sterility from disease sensitivity. As it is necessary to be able to restore male fertility for breeding purposes and for grain production, it is proposed that genes encoding restoration of male fertility also may be introduced.
  • (viii) Improved Nutritional Content: Genes may be introduced into plants to improve the nutrient quality or content of a particular crop.
  • Introduction of genes that alter the nutrient composition of a crop may greatly enhance the feed or food value.
  • the protein of many grains is suboptimal for feed and food purposes, especially when fed to pigs, poultry, and humans.
  • the protein is deficient in several amino acids that are essential in the diet of these species, requiring the addition of supplements to the grain.
  • Limiting essential amino acids may include lysine, methionine, tryptophan, threonine, valine, arginine, and histidine. Some amino acids become limiting only after corn is supplemented with other inputs for feed formulations.
  • the levels of these essential amino acids in seeds and grain may be elevated by mechanisms which include, but are not limited to, the introduction of genes to increase the biosynthesis of the amino acids, decrease the degradation of the amino acids, increase the storage of the amino acids in proteins, or increase transport of the amino acids to the seeds or grain.
  • the protein composition of a crop may be altered to improve the balance of amino acids in a variety of ways including elevating expression of native proteins, decreasing expression of those with poor composition, changing the composition of native proteins, or introducing genes encoding entirely new proteins possessing superior composition.
  • genes that alter the oil content of a crop plant may also be of value. Increases in oil content may result in increases in metabolizable-energy content and density of the seeds for use in feed and food.
  • the introduced genes may encode enzymes that remove or reduce rate-limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes may include, but are not limited to, those that encode acetyl-CoA carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well known fatty acid biosynthetic activities. Other possibilities are genes that encode proteins that do not possess enzymatic activity such as acyl carrier protein.
  • Genes may be introduced that alter the balance of fatty acids present in the oil providing a more healthful or nutritive feedstuff.
  • the introduced DNA also may encode sequences that block expression of enzymes involved in fatty acid biosynthesis, altering the proportions of fatty acids present in crops.
  • Genes may be introduced that enhance the nutritive value of the starch component of crops, for example by increasing the degree of branching, resulting in improved utilization of the starch in livestock by delaying its metabolism. Additionally, other major constituents of a crop may be altered, including genes that affect a variety of other nutritive, processing, or other quality aspects. For example, pigmentation may be increased or decreased.
  • Feed or food crops may also possess sub-optimal quantities of vitamins, antioxidants or other nutraceuticals, requiring supplementation to provide adequate nutritive value and ideal health value.
  • Introduction of genes that enhance vitamin biosynthesis may be envisioned including, for example, vitamins A, E, B12, choline, and the like.
  • Mineral content may also be sub-optimal.
  • genes that affect the accumulation or availability of compounds containing phosphorus, sulfur, calcium, manganese, zinc, and iron among others would be valuable.
  • the improvements may not necessarily involve grain, but may, for example, improve the value of a crop for silage.
  • Introduction of DNA to accomplish this might include sequences that alter lignin production such as those that result in the
  • genes also may be introduced which improve the processing of crops and improve the value of the products resulting from the processing.
  • One use of crops is via wetmilling.
  • novel genes that increase the efficiency and reduce the cost of such processing, for example by decreasing steeping time, may also find use.
  • Improving the value of wetmilling products may include altering the quantity or quality of starch, oil, corn gluten meal, or the components of gluten feed. Elevation of starch may be achieved through the identification and elimination of rate limiting steps in starch biosynthesis or by decreasing levels of the other components of crops resulting in proportional increases in starch.
  • Oil is another product of wetmilling, the value of which may be improved by introduction and expression of genes. Oil properties may be altered to improve its performance in the production and use of cooking oil, shortenings, lubricants or other oil-derived products or improvement of its health attributes when used in the food-related applications. Novel fatty acids also may be synthesized which upon extraction can serve as starting materials for chemical syntheses. The changes in oil properties may be achieved by altering the type, level, or lipid arrangement of the fatty acids present in the oil. This in turn may be accomplished by the addition of genes that encode enzymes that catalyze the synthesis of novel fatty acids and the lipids possessing them or by increasing levels of native fatty acids while possibly reducing levels of precursors.
  • DNA sequences may be introduced which slow or block steps in fatty acid biosynthesis resulting in the increase in precursor fatty acid intermediates.
  • Genes that might be added include desaturases, epoxidases, hydratases, dehydratases, and other enzymes that catalyze reactions involving fatty acid intermediates.
  • Representative examples of catalytic steps that might be blocked include the desaturations from stearic to oleic acid and oleic to linolenic acid resulting in the respective accumulations of stearic and oleic acids.
  • Another example is the blockage of elongation steps resulting in the accumulation of C8 to C12 saturated fatty acids.
  • transgenic plant prepared in accordance with the invention may be used for the production or manufacturing of useful biological compounds that were either not produced at all, or not produced at the same level, in the corn plant previously.
  • plants produced in accordance with the invention may be made to metabolize or absorb and concentrate certain compounds, such as hazardous wastes, thereby allowing bioremediation of these compounds.
  • the novel plants producing these compounds are made possible by the introduction and expression of one or potentially many genes with the constructs provided by the invention.
  • the vast array of possibilities include but are not limited to any biological compound which is presently produced by any organism such as proteins, nucleic acids, primary and intermediary metabolites, carbohydrate polymers, enzymes for uses in bioremediation, enzymes for modifying pathways that produce secondary plant metabolites such as flavonoids or vitamins, enzymes that could produce pharmaceuticals, and for introducing enzymes that could produce compounds of interest to the manufacturing industry such as specialty chemicals and plastics.
  • the compounds may be produced by the plant, extracted upon harvest and/or processing, and used for any presently recognized useful purpose such as pharmaceuticals, fragrances, and industrial enzymes to name a few.
  • Non-Protein-Expressing Sequences: DNA may be introduced into plants for the purpose of expressing RNA transcripts that function to affect plant phenotype yet are not translated into protein.
  • Two examples are antisense RNA and RNA with ribozyme activity. Both may serve possible functions in reducing or eliminating expression of native or introduced plant genes. However, as detailed below, DNA need not be expressed to affect the phenotype of a plant.
  • Antisense RNA: Genes may be constructed or isolated, which when transcribed, produce antisense RNA that is complementary to all or part(s) of a targeted messenger RNA(s). The antisense RNA reduces production of the polypeptide product of the messenger RNA.
  • Genes may also be constructed to produce double-stranded RNA molecules complementary to all or part of the targeted messenger RNA(s). Genes designed in this manner will be referred to as RNAi constructs; the double-stranded RNA or RNAi constructs can trigger the sequence-specific degradation of the target messenger RNA.
  • the polypeptide product of the target messenger RNA may be any protein.
  • the aforementioned genes will be referred to as antisense genes and RNAi constructs, respectively.
  • An antisense gene or RNAi construct may thus be introduced into a plant by transformation methods to produce a novel transgenic plant with reduced expression of a selected protein of interest.
  • the protein may be an enzyme that catalyzes a reaction in the plant.
  • Reduction of the enzyme activity may reduce or eliminate products of the reaction which include any enzymatically synthesized compound in the plant such as fatty acids, amino acids, carbohydrates, nucleic acids and the like.
  • the protein may be a storage protein, such as a zein, or a structural protein, the decreased expression of which may lead to changes in seed amino acid composition or plant morphological changes respectively.
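The antisense concept described above, an RNA complementary to all or part of a targeted messenger RNA, can be illustrated with a minimal sketch. The function name `antisense_rna` and the example target are hypothetical and are not taken from the specification; the sketch only computes a reverse complement and says nothing about how effective a given antisense transcript would be in a plant.

```python
# Illustrative only: names and the example sequence are assumptions.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement (read 5'->3') of an mRNA sequence."""
    mrna = mrna.upper()
    if set(mrna) - set("ACGU"):
        raise ValueError("expected an RNA sequence containing only A, C, G, U")
    # Complement each base, then reverse so the result is written 5'->3'.
    return mrna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    target = "AUGGCUUACG"          # hypothetical fragment of a target mRNA
    print(antisense_rna(target))   # CGUAAGCCAU
```

Applying the function twice returns the original sequence, which is a quick sanity check on any complement table.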
  • Ribozymes: Genes also may be constructed or isolated, which when transcribed, produce RNA enzymes (ribozymes) which can act as endoribonucleases and catalyze the cleavage of RNA molecules with selected sequences. The cleavage of selected messenger RNAs can result in the reduced production of their encoded polypeptide products. These genes may be used to prepare novel transgenic plants which possess them. The transgenic plants may possess reduced levels of polypeptides including, but not limited to, the polypeptides cited above. Ribozymes are RNA-protein complexes that cleave nucleic acids in a site-specific fashion.
  • Ribozymes have specific catalytic domains that possess endonuclease activity (Kim and Cech, 1987; Gerlach et al., 1987; Forster and Symons, 1987). For example, a large number of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often cleaving only one of several phosphoesters in an oligonucleotide substrate (Cech et al., 1981; Michel and Westhof, 1990; Reinhold-Hurek and Shub, 1992). This specificity has been attributed to the requirement that the substrate bind via specific base-pairing interactions to the internal guide sequence ("IGS") of the ribozyme prior to chemical reaction.
  • Ribozyme catalysis has primarily been observed as part of sequence-specific cleavage/ligation reactions involving nucleic acids (Joyce, 1989; Cech et al, 1981).
  • U. S. Patent 5,354,855 reports that certain ribozymes can act as endonucleases with a sequence specificity greater than that of known ribonucleases and approaching that of the DNA restriction enzymes.
  • RNA cleavage activity examples include sequences from the Group I self-splicing introns including Tobacco Ringspot Virus (Prody et al., 1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981), and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences from these and related viruses are referred to as hammerhead ribozymes based on a predicted folded secondary structure.
  • ribozymes include sequences from RNase P with RNA cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U. S. Patents 5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes (U. S. Patent 5,625,047).
  • Ribozymes are targeted to a given sequence by virtue of annealing to a site by complementary base pair interactions. Two stretches of homology are required for this targeting. These stretches of homologous sequences flank the catalytic ribozyme structure defined above. Each stretch of homologous sequence can vary in length from 7 to 15 nucleotides.
  • the cleavage site is a dinucleotide sequence on the target RNA: a uracil (U) followed by either an adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et al., 1995).
  • the frequency of this dinucleotide occurring in any given RNA is statistically 3 out of 16. Therefore, for a given target messenger RNA of 1,000 bases, 187 dinucleotide cleavage sites are statistically possible.
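The arithmetic behind the figure of 187 sites can be sketched as follows. Three of the sixteen possible dinucleotides (UA, UC, UU) match the cleavage rule, so under the assumption of equal and independent base frequencies a 1,000-base message is expected to contain about 1,000 × 3/16 = 187.5 sites. The function names below are hypothetical, and the scan simply lists candidate positions; it does not predict which sites a ribozyme would cleave efficiently.

```python
import re


def cleavage_sites(rna: str) -> list[int]:
    """Return 0-based positions of every U that is followed by A, C or U."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    # The lookahead keeps overlapping sites, e.g. both U's of 'UUA'.
    return [m.start() for m in re.finditer(r"U(?=[ACU])", rna)]


def expected_sites(length: int) -> float:
    """Expected site count, assuming each base is equally likely (an assumption).

    Follows the text's approximation of length * 3/16; a strict count of
    dinucleotide positions would use (length - 1) instead.
    """
    return length * 3 / 16


if __name__ == "__main__":
    print(cleavage_sites("GUAGUCGUUGUG"))  # [1, 4, 7]
    print(int(expected_sites(1000)))       # 187
```

Real messages rarely have uniform base composition, so the observed count from `cleavage_sites` on an actual sequence is the more useful number.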
  • Designing and testing ribozymes for efficient cleavage of a target RNA is a process well known to those skilled in the art. Examples of scientific methods for designing and testing ribozymes are described by Chowrira et al. (1994) and Lieber and Strauss (1995), each incorporated by reference. The identification of operative and preferred sequences for use in down regulating a given gene is simply a matter of preparing and testing a given sequence, and is a routinely practiced "screening" method known to those of skill in the art.
  • genes may be introduced to produce novel transgenic plants which have reduced expression of a native gene product by the mechanism of co-suppression. It has been demonstrated in tobacco, tomato, and petunia
  • Non-RNA-Expressing Sequences: DNA elements including those of transposable elements such as Ds, Ac, or Mu, may be inserted into a gene to cause mutations. These DNA elements may be inserted in order to inactivate (or activate) a gene and thereby "tag" a particular trait.
  • the transposable element does not cause instability of the tagged mutation, because the utility of the element does not depend on its ability to move in the genome.
  • the introduced DNA sequence may be used to clone the corresponding gene, e.g., using the introduced DNA sequence as a PCR primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta et al., 1988). Once identified, the entire gene(s) for the particular trait, including control or regulatory regions where desired, may be isolated, cloned and manipulated as desired.
  • DNA elements introduced into an organism for purposes of gene tagging is independent of the DNA sequence and does not depend on any biological activity of the DNA sequence, i.e., transcription into RNA or translation into protein.
  • the sole function of the DNA element is to disrupt the DNA sequence of a gene.
  • unexpressed DNA sequences could be introduced into cells as proprietary "labels" of those cells and plants and seeds thereof. It would not be necessary for a label DNA element to disrupt the function of a gene endogenous to the host organism, as the sole function of this DNA would be to identify the origin of the organism. For example, one could introduce a unique DNA sequence into a plant and this DNA element would identify all cells, plants, and progeny of these cells as having arisen from that labeled source. It is proposed that inclusion of label DNAs would enable one to distinguish proprietary germplasm or germplasm derived from such, from unlabelled germplasm.
  • MAR matrix attachment region element
  • Stief chicken lysozyme A element
  • tRNA sequences for example, to alter codon usage
  • rRNA variants for example, which may confer resistance to various agents such as antibiotics.
  • mutated centromeric sequences are contemplated to be useful for increasing the utility of the centromere. It is specifically contemplated that the function of the centromeres of the current invention may be based upon the secondary structure of the DNA sequences of the centromere, modification of the DNA with methyl groups or other adducts, and/or the proteins which interact with the centromere. By changing the DNA sequence of the centromere, one may alter the affinity of one or more centromere-associated protein(s) for the centromere and/or the secondary structure or modification of the centromeric sequences, thereby changing the activity of the centromere.
  • changes may be made in the centromeres of the invention which do not affect the activity of the centromere.
  • Changes in the centromeric sequences which reduce the size of the DNA segment needed to confer centromere activity are contemplated to be particularly useful in the current invention, as would changes which increased the fidelity with which the centromere was transmitted during mitosis and meiosis.
  • Plants refers to any type of plant.
  • the inventors have provided below an exemplary description of some plants that may be used with the invention. However, the list is not in any way limiting, as other types of plants will be known to those of skill in the art and could be used with the invention.
  • a common class of plants exploited in agriculture are vegetable crops, including artichokes, kohlrabi, arugula, leeks, asparagus, lettuce (e.g., head, leaf, romaine), bok choy, malanga, broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew, cantaloupe), brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower, okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage, peppers, collards, potatoes, cucumber plants (marrows, cucumbers), pumpkins, cucurbits, radishes, dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic, spinach, green onions, squash, greens, beet (sugar beet and fodder beet), sweet potatoes, swiss chard, horseradish, tomatoes,
  • fruit and vine crops such as apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince, almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries, boysenberries, cranberries, currants, loganberries, raspberries, strawberries, blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate, pineapple, tropical fruits, pomes, melon, mango, papaya, and lychee.
  • plants include bedding plants such as flowers, cactus, succulents and ornamental plants, as well as trees such as forest (broad-leaved trees and evergreens, such as conifers), fruit, ornamental, and nut-bearing trees, as well as shrubs and other nursery stock.
  • origin of replication refers to an origin of DNA replication recognized by proteins that initiate DNA replication.
  • binary BAC or "binary bacterial artificial chromosome" refers to a bacterial vector that contains the T-DNA border sequences necessary for Agrobacterium-mediated transformation (see, for example, Hamilton et al., 1996; Hamilton, 1997; and Liu et al., 1999).
  • centromere sequence refers to a nucleic acid sequence which one wishes to assay for potential centromere function.
  • a "centromere" is any DNA sequence that confers an ability to segregate to daughter cells through cell division.
  • this sequence may produce a segregation efficiency to daughter cells ranging from about 1% to about 100%, including to about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or about 95% of daughter cells. Variations in such a segregation efficiency may find important applications within the scope of the invention; for example, minichromosomes carrying centromeres that confer 100% stability could be maintained in all daughter cells without selection, while those that confer 1% stability could be temporarily introduced into a transgenic organism, but be eliminated when desired.
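To make the contrast between 100% and 1% segregation efficiency concrete, the following sketch uses a deliberately simple model that is an assumption of this example and not part of the specification: each cell division is treated as an independent event in which a daughter cell receives the minichromosome with probability equal to the segregation efficiency. The function name `retention_after_divisions` is hypothetical.

```python
def retention_after_divisions(efficiency: float, divisions: int) -> float:
    """Fraction of a cell lineage expected to retain the construct.

    Assumes (for illustration only) that every division transmits the
    construct independently with probability `efficiency`, given as 0..1.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return efficiency ** divisions


if __name__ == "__main__":
    for eff in (1.00, 0.95, 0.50, 0.01):
        kept = retention_after_divisions(eff, 10)
        print(f"efficiency {eff:4.0%}: {kept:.6f} retained after 10 divisions")
```

Under this model a construct with 100% efficiency persists indefinitely without selection, while one with 1% efficiency is effectively lost within a division or two, which matches the two use cases the passage describes.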
  • the centromere may confer stable segregation of a nucleic acid sequence, including a recombinant construct comprising the centromere, through mitotic or meiotic divisions, including through both meiotic and mitotic divisions.
  • a plant centromere is not necessarily derived from plants, but has the ability to promote DNA segregation in plant cells.
  • the term "centromere-associated protein" refers to a protein encoded by a sequence of the centromere or a protein which is encoded by host DNA and binds with relatively high affinity to the centromere.
  • circular permutations refer to variants of a sequence that begin at base n within the sequence, proceed to the end of the sequence, resume with base number one of the sequence, and proceed to base n - 1.
  • n may be any number less than or equal to the length of the sequence.
  • circular permutations of the sequence ABCD are: ABCD, BCDA, CDAB, and DABC.
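The definition of circular permutations can be expressed directly in code. This is a minimal sketch with hypothetical function names; it enumerates every rotation of a sequence and offers a quick membership check based on the fact that any rotation of a sequence appears as a substring of the sequence concatenated with itself.

```python
def circular_permutations(seq: str) -> list[str]:
    """All rotations of `seq`, starting at base 1, then base 2, and so on."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]


def is_circular_permutation(candidate: str, seq: str) -> bool:
    """True if `candidate` is some rotation of `seq`."""
    return len(candidate) == len(seq) and candidate in seq + seq


if __name__ == "__main__":
    print(circular_permutations("ABCD"))  # ['ABCD', 'BCDA', 'CDAB', 'DABC']
    print(is_circular_permutation("CDAB", "ABCD"))  # True
    print(is_circular_permutation("ACBD", "ABCD"))  # False
```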
  • crop includes any plant or portion of a plant grown or harvested for commercial or beneficial purposes.
  • eukaryote refers to living organisms whose cells contain nuclei. A eukaryote may be distinguished from a "prokaryote” which is an organism which lacks nuclei. Prokaryotes and eukaryotes differ fundamentally in the way their genetic information is organized, as well as their patterns of RNA and protein synthesis.
  • expression refers to the process by which a structural gene produces an RNA molecule, typically termed messenger RNA (mRNA). The mRNA is typically, but not always, translated into polypeptide(s).
  • the term "genome” refers to all of the genes and DNA sequences that comprise the genetic information within a given cell of an organism. Usually, this is taken to mean the information contained within the nucleus, but also includes the organelles.
  • higher eukaryote means a multicellular eukaryote, typically characterized by its more complex physiological mechanisms and relatively large size. Generally, complex organisms such as plants and animals are included in this category. Preferred higher eukaryotes to be transformed by the present invention include, for example, monocot and dicot angiosperm species, gymnosperm species, fern species, plant tissue culture cells of these species, animal cells and algal cells. It will of course be understood that prokaryotes and eukaryotes alike may be transformed by the methods of this invention.
  • the term "host" refers to any organism that contains a plasmid, expression vector, or integrated construct comprising a plant centromere.
  • Preferred examples of host cells for cloning, useful in the present invention are bacteria such as Escherichia coli, Bacillus subtilis, Pseudomonas, Streptomyces, Salmonella, and yeast cells such as S. cerevisiae.
  • Host cells which can be targeted for expression of a minichromosome may be plant cells of any source and specifically include Arabidopsis, maize, rice, sugarcane, sorghum, barley, soybeans, tobacco, wheat, tomato, potato, citrus, or any other agronomically or scientifically important species.
  • hybridization refers to the pairing of complementary
  • linker refers to a DNA molecule, generally up to 50 or 60 nucleotides long and synthesized chemically, or cloned from other vectors. In a preferred embodiment, this fragment contains one, or preferably more than one, restriction enzyme site for a blunt-cutting enzyme and a staggered-cutting enzyme, such as BamHI. One end of the linker fragment is adapted to be ligatable to one end of the linear molecule and the other end is adapted to be ligatable to the other end of the linear molecule.
  • a "library" is a pool of random DNA fragments which are cloned. In principle, any gene can be isolated by screening the library with a specific hybridization probe (see, for example, Young et al., 1977).
  • Each library may contain the DNA of a given organism inserted as discrete restriction enzyme-generated fragments or as randomly sheared fragments into many thousands of plasmid vectors.
  • E. coli, yeast, and Salmonella plasmids are particularly useful when the genome inserts come from other organisms.
  • lower eukaryote refers to a eukaryote characterized by a comparatively simple physiology and composition, and most often unicellularity. Examples of lower eukaryotes include flagellates, ciliates, and yeast.
  • a "minichromosome" is a recombinant DNA construct including a centromere and capable of transmission to daughter cells. Minichromosomes may remain separate from the host genome (as episomes) or may integrate into host chromosomes. The stability of this construct through cell division could range from about 1% to about 100%, including about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and about 95%.
  • the minichromosome construct may be a circular or linear molecule. It may include elements such as one or more telomeres, ARS sequences, and genes. The number of such sequences included is only limited by the physical size limitations of the construct itself.
  • the minichromosome could contain DNA derived from a natural centromere, although it may be preferable to limit the amount of DNA to the minimal amount required to obtain a segregation efficiency in the range of 1-100%.
  • the minichromosome could also contain a synthetic centromere composed of tandem arrays of repeats of any sequence, either derived from a natural centromere, or of synthetic DNA.
  • the minichromosome could also contain DNA derived from multiple natural centromeres.
  • the minichromosome may be inherited through mitosis or meiosis, or through both meiosis and mitosis.
  • minichromosome specifically encompasses and includes the terms "plant artificial chromosome" or "PLAC," or engineered chromosomes or microchromosomes and all teachings relevant to a PLAC or plant artificial chromosome specifically apply to constructs within the meaning of the term minichromosome.
  • minichromosome-encoded protein it is meant a polypeptide which is encoded by a sequence of a minichromosome of the current invention. This includes sequences such as selectable markers, telomeres, etc., as well as those proteins encoded by any other selected functional genes on the minichromosome.
  • plant includes plant cells, plant protoplasts, plant calli, and the like, as well as whole plants regenerated therefrom.
  • plasmid or "cloning vector” refers to a closed covalently circular extrachromosomal DNA or linear DNA which is able to replicate in a host cell and which is normally nonessential to the survival of the cell.
  • plasmids and other vectors are known and commonly used in the art (see, for example, Cohen et al, U.S. Patent No. 4,468,464, which discloses examples of DNA plasmids, and which is specifically incorporated herein by reference).
  • a "probe” is any biochemical reagent (usually tagged in some way for ease of identification), used to identify or isolate a gene, a gene product, a DNA segment or a protein.
  • recombination refers to any genetic exchange that involves breaking and rejoining of DNA strands.
  • regulatory sequence refers to any DNA sequence that influences the efficiency of transcription or translation of any gene.
  • the term includes, but is not limited to, sequences comprising promoters, enhancers and terminators.
  • a "selectable marker” is a gene whose presence results in a clear phenotype, and most often a growth advantage for cells that contain the marker. This growth advantage may be present under standard conditions, altered conditions such as elevated temperature, or in the presence of certain chemicals such as herbicides or antibiotics.
  • selectable markers are described, for example, in Broach et al. (1979). Examples of selectable markers include the thymidine kinase gene, the cellular adenine-phosphoribosyltransferase gene and the dihydrofolate reductase gene, hygromycin phosphotransferase genes, the bar gene and neomycin phosphotransferase genes, among others.
  • Preferred selectable markers in the present invention include genes whose expression confers antibiotic or herbicide resistance to the host cell, sufficient to enable the maintenance of a vector within the host cell, and which facilitate the manipulation of the plasmid into new host cells.
  • a "screenable marker” is a gene whose presence results in an identifiable phenotype. This phenotype may be observable under standard conditions, altered conditions such as elevated temperature, or in the presence of certain chemicals used to detect the phenotype.
  • site-specific recombination refers to any genetic exchange that involves breaking and rejoining of DNA strands at a specific DNA sequence.
  • a "structural gene” is a sequence which codes for a polypeptide or RNA and includes 5' and 3' ends.
  • the structural gene may be from the host into which the structural gene is transformed or from another species.
  • a structural gene will preferably, but not necessarily, include one or more regulatory sequences which modulate the expression of the structural gene, such as a promoter, terminator or enhancer.
  • a structural gene will preferably, but not necessarily, confer some useful phenotype upon an organism comprising the structural gene, for example, herbicide resistance.
  • a structural gene may encode an RNA sequence which is not translated into a protein, for example a tRNA or rRNA gene.
  • Telomere refers to a sequence capable of capping the ends of a chromosome, thereby preventing degradation of the chromosome end, ensuring replication and preventing fusion to other chromosome sequences. Telomeres can include naturally occurring telomere sequences or synthetic sequences. Telomeres from one species may confer telomere activity in another species.
  • The terms "transformation" or "transfection" refer to the acquisition in cells of new DNA sequences through the chromosomal or extrachromosomal addition of DNA. This is the process by which naked DNA, DNA coated with protein, or whole minichromosomes are introduced into a cell, resulting in a potentially heritable change.
  • The term "consensus" refers to a nucleic acid sequence derived by comparing two or more related sequences. A consensus sequence defines both the conserved and variable sites between the sequences being compared. Any one of the sequences used to derive the consensus, or any permutation defined by the consensus, may be useful in constructing minichromosomes.
  • Repeated nucleotide sequence refers to any nucleic acid sequence of at least 25 bp, present in a genome or a recombinant molecule, that occurs at least two times and whose copies are preferably at least 80% identical, in either head-to-tail or head-to-head orientation, with or without intervening sequence between repeat units.
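The repeat definition above can be read as three numeric criteria (length of at least 25 bp, at least two copies, preferably at least 80% identity). The following is a minimal sketch of such a check, assuming the copies have already been aligned to equal length and are in the same orientation; the function names are hypothetical and not part of the disclosure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def is_repeated_sequence(copies, min_len=25, min_identity=80.0):
    """Return True if there are at least two copies, each at least min_len bp,
    and every copy is at least min_identity percent identical to the first one."""
    if len(copies) < 2 or any(len(c) < min_len for c in copies):
        return False
    reference = copies[0]
    return all(percent_identity(reference, c) >= min_identity for c in copies[1:])


if __name__ == "__main__":
    unit = "ACGT" * 7                      # 28 bp illustrative repeat unit
    variant = "TT" + unit[2:]              # 2 mismatches out of 28 positions
    print(is_repeated_sequence([unit, variant]))
```

Comparing every copy to the first one is a simplification; a pairwise or consensus-based comparison would be an equally reasonable reading of the definition.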
  • Tissue from various plants is harvested for DNA extraction.
  • Leaf tissue is cooled in liquid nitrogen, ground to a fine powder and transferred to an organic solvent-resistant test tube or beaker.
  • Warm CTAB extraction solution (2% (w/v) CTAB, 100 mM Tris-Cl, pH 9.5, 20 mM EDTA, pH 8.0, 1.4 M NaCl, 1% polyethylene glycol) is added in a ratio of 20 ml per gram of tissue and mixed thoroughly.
  • 50 microliters of β-mercaptoethanol and 30 microliters of 30 mg/ml RNAse A are added and the mixture is incubated for 10-60 min. at 65°C with occasional mixing.
  • the homogenate is extracted with an equal volume of chloroform, and is then centrifuged 5 min at 7500 x g (8000 rpm in JA20; 10,000 rpm in a microcentrifuge, for smaller samples), 4°C.
  • The top (aqueous) phase is recovered and nucleic acids are precipitated by adding 1 volume isopropanol. After mixing, the precipitate is pelleted for 15 min at 7500 x g, 4°C.
  • the pellet is washed with 70% ethanol, dried and resuspended in a minimal volume of TE (10 mM Tris-Cl, pH 8.0, 0.1 mM EDTA, pH 8.0).
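The buffer composition and the 20 ml per gram ratio given in the extraction protocol lend themselves to a small calculation. The sketch below is illustrative only: the stock concentrations (1 M Tris-Cl, 0.5 M EDTA) are assumptions not stated in the text, the polyethylene glycol component is omitted because its basis (w/v or v/v) is not specified, and the function names are hypothetical.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl


def ctab_buffer_recipe(final_ml: float, tris_stock_m: float = 1.0,
                       edta_stock_m: float = 0.5) -> dict:
    """Amounts needed for final_ml of extraction solution.

    Targets from the protocol: 2% (w/v) CTAB, 100 mM Tris-Cl, 20 mM EDTA, 1.4 M NaCl.
    Stock molarities are assumed values and can be overridden.
    """
    if final_ml <= 0:
        raise ValueError("final volume must be positive")
    return {
        "ctab_g": 0.02 * final_ml,                           # 2 g per 100 ml
        "nacl_g": 1.4 * NACL_G_PER_MOL * final_ml / 1000.0,  # mol/L * g/mol * L
        "tris_stock_ml": 0.100 / tris_stock_m * final_ml,    # C1*V1 = C2*V2
        "edta_stock_ml": 0.020 / edta_stock_m * final_ml,
    }


def ctab_volume_ml(tissue_g: float) -> float:
    """Extraction solution volume at the stated ratio of 20 ml per gram of tissue."""
    if tissue_g <= 0:
        raise ValueError("tissue mass must be positive")
    return 20.0 * tissue_g


if __name__ == "__main__":
    volume = ctab_volume_ml(2.5)
    print(volume, ctab_buffer_recipe(volume))
```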
  • The consensus sequence of ChrBo1 is shown in FIG. 1A (SEQ ID NO:1). This consensus was assembled from DNA sequences collected by the inventors. Twenty-four of these sequences completely spanned the repeat, and nine others partially covered the repeat. The length of this repeat is 180 ± 0.86 base pairs, and A and T comprise 60% of the consensus.
  • The consensus sequence of ChrBo2 is shown in FIG. 1B (SEQ ID NO:2). This consensus was assembled from DNA sequences collected by the inventors. Five of these sequences completely spanned the repeat, and two others partially covered the repeat. The length of this repeat is 180 ± 0.45 base pairs, and A and T comprise 63% of the consensus.
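Each repeat family above is summarized by a repeat length with a spread and by the A+T share of the consensus. A minimal sketch of how such summary figures could be computed follows; it assumes the spread is a sample standard deviation over the full-length repeat copies, which the text does not state, and the function names are hypothetical.

```python
from statistics import mean, stdev


def length_summary(lengths):
    """Mean and sample standard deviation of repeat-unit lengths (in bp)."""
    return mean(lengths), stdev(lengths)


def at_percent(sequence: str) -> float:
    """Percentage of A and T among the unambiguous (A, C, G, T) bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(1 for b in bases if b in "AT") / len(bases)


if __name__ == "__main__":
    avg, sd = length_summary([179, 180, 180, 181])   # illustrative lengths only
    print(f"{avg} +/- {sd:.2f} bp")
    print(f"{at_percent('AATTGCNN'):.1f}% A+T")
```

Ambiguity codes such as N are excluded from the A+T denominator here; counting them differently would shift the percentage slightly.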
  • The two repeats (ChrBo1 and ChrBo2) were aligned to each other using the ClustalX program (ClustalX is a free multiple sequence alignment program for Windows; Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F. and Higgins, D.G. (1997) The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research, 24:4876-4882). The two consensus sequences differ significantly from each other at several bases. Those sites with significant differences (chi-squared, P < 0.05) are highlighted as shown in FIG. 1C.
  • The GenBank nt database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/, March 29 version, downloaded on 04/07/2002) and the plant satellite DNA database (http://w31amc.umbr.cas.cz/PlantSat/, downloaded on 4/14/2002) were compared to the inventors' consensus sequences using the blastn program and an Expect value threshold score of -3. Consensus sequences were assembled using all inventors' and GenBank sequences that matched with an Expect (E) value of less than -45.
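The database comparison above selects hits by Expect value. The sketch below shows one way such a filter could look, assuming that the quoted figures (-3, -45) are base-10 exponents of the E value, which is an interpretation rather than something the text states. The hit list format and the function name are hypothetical.

```python
import math


def filter_hits(hits, max_log10_e: float):
    """Keep subject IDs whose E value satisfies log10(E) < max_log10_e.

    hits: iterable of (subject_id, e_value) pairs, e.g. parsed from a BLAST report.
    An E value reported as exactly 0 is treated as passing any cutoff.
    """
    kept = []
    for subject_id, e_value in hits:
        if e_value == 0.0 or math.log10(e_value) < max_log10_e:
            kept.append(subject_id)
    return kept


if __name__ == "__main__":
    # Illustrative identifiers and values only.
    example = [("hitA", 1e-50), ("hitB", 1e-10), ("hitC", 0.0), ("hitD", 1e-44)]
    print(filter_hits(example, -45))
```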
  • The revised consensus sequence of ChrBo1 is shown in FIG. 1D (SEQ ID NO:3). This consensus was assembled from thirty-three DNA sequences collected by the inventors and eighteen GenBank sequences (Table 10). Thirty of these sequences completely spanned the repeat, and twenty-one others partially covered the repeat. The length of this repeat is 180 ± 0.81 base pairs, and A and T comprise 59% of the consensus.
  • Table 1. GenBank sequences (accession numbers) that match inventors' ChrBo1 consensus
  • The revised consensus sequence of ChrBo2 is shown in FIG. 1E (SEQ ID NO:4). This consensus was assembled from seven DNA sequences collected by the inventors and five GenBank sequences (Table 2). Seven of these sequences completely spanned the repeat, and five others partially covered the repeat. The length of this repeat is 180 ± 0.44 base pairs, and A and T comprise 63% of the consensus.
  • Table 2. GenBank sequences (accession numbers) that match inventors' ChrBo2 consensus
  • The two revised consensus sequences (ChrBo1 and ChrBo2) were aligned to each other using the ClustalX program.
  • The two consensus sequences differ significantly (chi-squared, P < 0.05) from each other at several bases (highlighted as shown in FIG. 1F).
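The per-site comparison between two repeat families can be illustrated with a chi-squared test on base counts at one aligned position. The following sketch is only one plausible reading: the text does not say how the test was set up, so the contingency-table layout (two families by observed bases) and the function names are assumptions. Critical values are the standard 5% points for 1 to 3 degrees of freedom.

```python
CHI2_CRITICAL_005 = {1: 3.841, 2: 5.991, 3: 7.815}  # 5% critical values by degrees of freedom


def chi_squared(counts_a: dict, counts_b: dict):
    """Chi-squared statistic and degrees of freedom for base counts at one aligned site.

    counts_a, counts_b: mapping base -> number of sequences showing that base.
    """
    bases = [b for b in "ACGT" if counts_a.get(b, 0) + counts_b.get(b, 0) > 0]
    n_a = sum(counts_a.get(b, 0) for b in bases)
    n_b = sum(counts_b.get(b, 0) for b in bases)
    if n_a == 0 or n_b == 0:
        raise ValueError("each family needs at least one observed base")
    total = n_a + n_b
    stat = 0.0
    for b in bases:
        column_total = counts_a.get(b, 0) + counts_b.get(b, 0)
        for observed, row_total in ((counts_a.get(b, 0), n_a), (counts_b.get(b, 0), n_b)):
            expected = column_total * row_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(bases) - 1


def differs_significantly(counts_a: dict, counts_b: dict) -> bool:
    """True if the site differs between the two families at P < 0.05."""
    stat, df = chi_squared(counts_a, counts_b)
    return df >= 1 and stat > CHI2_CRITICAL_005[df]


if __name__ == "__main__":
    # Illustrative counts only: one family fixed for A, the other for G.
    print(differs_significantly({"A": 30}, {"G": 12}))
```

With small sequence counts the chi-squared approximation is rough, so an exact test may be preferable in practice.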
  • GenBank entries match the Brassica oleracea centromere sequences defined by the inventors. These are annotated as follows: Xle7-2EB gene; Xle4-7B gene; Xle6-14H gene; Satellite tandem repeat monomer; HindIII satellite repeat; Satellite DNA inverted direct repeat; Tandem repeated DNA; Highly repetitive DNA. They are not annotated as centromere repeats in GenBank. A complete list of these sequences is shown in Table 3.
  • The consensus sequence for ChrGm2 is shown in FIG. 2B (SEQ ID NO:6). This consensus was assembled from DNA sequences collected by the inventors. Ten of these sequences completely spanned the repeat, and eleven others partially covered the repeat. It is 91 ± 0.48 base pairs in length, and A and T comprise 62% of the consensus.
  • The GenBank nt database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/, March 29 version, downloaded on 04/07/2002) and the plant satellite DNA database (http://w31amc.umbr.cas.cz/PlantSat/, downloaded on 4/14/2002) were compared to the inventors' consensus sequences using the blastn program and an Expect value threshold of -3. Consensus sequences were built using all inventors' and GenBank sequences that matched with an Expect (E) value of less than -25.
  • The revised consensus sequence for ChrGm2 is shown in FIG. 2E (SEQ ID NO:8). This consensus was assembled from twenty-one DNA sequences collected by the inventors and three matching sequences from GenBank (accession numbers AF297983, AF297984, AF297985). Ten of these sequences completely spanned the repeat, and fourteen others partially covered the repeat. It is 91 ± 0.53 base pairs in length, and A and T comprise 61% of the consensus. The two repeats (ChrGm1 and ChrGm2) were aligned to each other using the ClustalX program. Those sites with significant differences (chi-squared, P < 0.05) are highlighted in FIG. 2F.
  • GenBank entries match the Glycine max centromere sequences defined by the inventors. These are annotated as follows: Satellite DNA; Tospovirus resistance protein C (Sw5-c), tospovirus resistance protein D (Sw5-d), and tospovirus resistance protein E (Sw5-e) genes. They are not annotated as centromere repeats in GenBank. A complete list of these sequences is shown in Table 4.
  • The consensus sequence of ChrLe1 is shown in FIG. 3A (SEQ ID NO:9). This consensus was assembled from forty-two DNA sequences collected by the inventors. Eighteen of these sequences completely spanned the repeat, and twenty-four others partially covered the repeat. The repeat is 181 ± 0.61 base pairs in length, and A and T comprise 50% of the consensus.
  • The GenBank nt database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/, March 29 version, downloaded on 04/07/2002) and the plant satellite DNA database (http://w31amc.umbr.cas.cz/PlantSat/, downloaded on 4/14/2002) were compared to the inventors' consensus sequences using the blastn program and an Expect value threshold of -3. Consensus sequences were built using all inventors' and GenBank sequences that matched with an Expect (E) value of less than -40.
  • The repeat is 181 ± 0.61 base pairs in length, and A and T comprise 50% of the consensus.
  • The revised consensus sequence of ChrLe1 is shown in FIG. 3B (SEQ ID NO:10). This consensus was assembled from forty-two sequences collected by the inventors and two GenBank sequences (nt database at ftp://ftp.ncbi.nlm.nih.gov/blast/db/, March 29 version, downloaded on 04/07/2002). Eighteen of these sequences completely spanned the repeat, and twenty-six others partially covered the repeat.
  • The GenBank sequences are accession numbers X87233 and AY007367. Neither of the 2 GenBank entries that match the Lycopersicon esculentum centromere sequences defined by the inventors is a complete repeat; they match only a portion of the sequence identified by the company. These are annotated as follows: Satellite DNA; Tandem repetitive repeat region. They are not annotated as centromere repeats in GenBank. A complete list of these sequences is shown in Table 5.
  • ChrZm1 centromere repeat
  • The repeat is 180 ± 1.15 base pairs in length, and A and T comprise 56% of the consensus.
  • The consensus sequence of ChrZm1 is shown in FIG. 4A (SEQ ID NO:11). This consensus was assembled from thirty-eight DNA sequences collected by the inventors. Three of these sequences completely spanned the repeat, and thirty-five others partially covered the repeat.
  • The GenBank nt database (ftp://ftp.ncbi.nlm.nih.gov/blast/db/, March 29 version, downloaded on 04/07/2002) and the plant satellite DNA database (http://w31amc.umbr.cas.cz/PlantSat/, downloaded on 4/14/2002) were compared to the inventors' consensus sequences using the blastn program and an Expect value threshold score of -3. Consensus sequences were built using all inventors' and GenBank sequences that matched with an Expect (E) value of -50.
  • The revised consensus sequence of ChrZm1 is shown in FIG. 4B (SEQ ID NO:12). This consensus was assembled from thirty-eight DNA sequences collected by the inventors and twenty-six matching GenBank sequences (Table 6). Twenty of these sequences completely spanned the repeat, and forty-four others partially covered the repeat. The length of the repeat is 180 ± 0.51 base pairs, and A and T comprise the consensus.
  • GenBank entries match the Lycopersicon esculentum centromere sequences defined by the inventors
  • GenBank sequences of the Zea mays centromeric repeat CentC were collected (Table 13) and assigned the identifier ChrZm2.
  • The consensus of the repeat was determined as shown in Example 6.
  • The repeat is 158 ± 1.6 base pairs in length.
  • A and T comprise 53% of the bases. All 6 sequences are of unit length.
  • The consensus sequence of ChrZm2 (SEQ ID NO:13) is shown in FIG. 4C. Table 8. GenBank sequences of Zea mays centromeric repeat ChrZm2: AF078918
  • the most common base is designated as the consensus if it occurs three times more frequently than the second most common base.
  • the consensus is assigned according to the IUPAC ambiguity codes for the three most common bases. If the four bases occur approximately equally (23-27%), the consensus is assigned as N.
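The consensus rules above can be sketched in code. The text as extracted gives only part of the rule set (the three-fold rule, IUPAC ambiguity codes, and N for four roughly equal bases), so the sketch fills the gap with a stated assumption: when no base is three times more frequent than the runner-up, every base occurring at more than one third of the top frequency is folded into an IUPAC ambiguity code. The function names are hypothetical.

```python
from collections import Counter

# Standard IUPAC nucleotide codes keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus_base(column: str) -> str:
    """Consensus symbol for one alignment column (gaps and ambiguity codes ignored)."""
    counts = Counter(b for b in column.upper() if b in "ACGT")
    if not counts:
        return "N"
    ranked = counts.most_common()
    top = ranked[0][1]
    # Rule from the text: a single base wins if it is at least 3x the runner-up.
    if len(ranked) == 1 or top >= 3 * ranked[1][1]:
        return ranked[0][0]
    # Assumed rule: keep every base above one third of the top frequency.
    kept = frozenset(base for base, n in ranked if 3 * n > top)
    return IUPAC[kept]


def consensus(aligned_sequences) -> str:
    """Column-by-column consensus of equal-length aligned sequences."""
    return "".join(consensus_base("".join(col)) for col in zip(*aligned_sequences))


if __name__ == "__main__":
    print(consensus(["ACGT", "ACGA", "ACGT", "ACGT"]))
```

Under this assumed cutoff, four bases at roughly equal frequency naturally yield N, which agrees with the 23-27% rule stated above.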
  • a BAC clone may be retrofitted with one or more plant telomeres and selectable markers together with the DNA elements necessary for Agrobacterium transformation (FIG. 9).
  • This method will provide a means to deliver any BAC clone into plant cells and to test it for centromere function. The method works in the following way.
  • The conversion vector contains a retrofitting cassette.
  • The retrofitting cassette is flanked by Tn10, Tn5, Tn7, Mu or other transposable elements and contains an origin of replication and a selectable marker for Agrobacterium, a plant telomere array followed by T-DNA right and left borders followed by a second plant telomere array and a plant selectable marker (FIG. 9).
  • The conversion vector is transformed into an E. coli strain carrying the target BAC.
  • The transposable elements flanking the retrofitting cassette then mediate transposition of the cassette randomly into the BAC clone.
  • Genes in the centromeric regions: Expressed genes are located within 1 kb of essential centromere sequences in S. cerevisiae, and multiple copies of tRNA genes reside within an 80 kb fragment necessary for centromere function in S. pombe (Kuhn et al., 1991). In contrast, genes are thought to be relatively rare in the centromeres of higher eukaryotes, though there are notable exceptions.
  • the Drosophila light, concertina, responder, and rolled loci all map to the centromeric region of chromosome 2, and translocations that remove light from its native heterochromatic context inhibit gene expression.
  • The phosphoenolpyruvate gene (CUE1) defines one CEN5 border; mutations in this gene cause defects in light-regulated gene expression.
  • the current inventors contemplate use of these genes, or DNA sequences 0 to 5 kb upstream or downstream of these sequences, for insertion into a gene of choice in a minichromosome. It is expected that such elements could potentially yield beneficial regulatory controls of the expression of these genes, even when in the unique environment of a centromere.
  • a search was made in the database of annotated genomic Arabidopsis sequences. With the exception of two genes, no homologs with >95% identity were found elsewhere in the 80% of the genome that has been sequenced. The number of independent cDNA clones that correspond to a single-copy gene provides an estimate of the level of gene expression.
  • genes encoded at CEN2 and CEN4 are not members of a single gene family, nor do they correspond to genes predicted to play a role in centromere functions, but instead have diverse roles.
  • Table 9 Predicted genes within CEN2 and CEN4 that correspond to the cDNA database.
  • Table 10 List of additional genes encoded within the boundaries of CEN4.
  • FIGs. 14A, B: Amplification products of the appropriate length were obtained in both ecotypes for most primer pairs (85%), indicating that the amplified regions were highly similar. In the remaining cases, primer pairs amplified Columbia, but not Landsberg DNA, even at very low stringencies. In these regions, additional primers were designed to determine the extent of nonhomology.
  • The matching sequences were sorted into groups, including two families containing 8 sequences each, 3 sequences from a small family encoding a putative open reading frame, and 4 sequences found once within the centromeres, one of which corresponds to predicted CEN2 and CEN4 proteins with similarity throughout their exons and introns (FIG. 15).
  • AtCCS1-AtCCS5 were moderately repeated sequences that appear in centromeric and pericentromeric regions. The remaining sequences were present only in the genetically-defined centromeres. Similar comparisons of all 16 S. cerevisiae centromeres defined a consensus consisting of a conserved 8 bp CDEI motif, an AT-rich 85 bp CDEII element, and a 26 bp CDEIII region with 7 highly conserved nucleotides (Fleig et al., 1995). In contrast, surveys of the three S. pombe centromeres revealed conservation of overall centromere structure, but no universally conserved motifs (Clark, 1998).
  • Minichromosomes are constructed by combining the previously isolated essential chromosomal elements.
  • Exemplary minichromosome vectors include those designed to be "shuttle vectors"; i.e., they can be maintained in a convenient host (such as E. coli, Agrobacterium or yeast) as well as plant cells.
  • a minichromosome can be maintained in E. coli or other bacterial cells as a circular molecule by placing a removable stuffer fragment between the telomeric sequence blocks.
  • the stuffer fragment is a dispensable DNA sequence, bordered by unique restriction sites, which can be removed by restriction digestion of the circular DNAs to create linear molecules with telomeric ends.
  • The linear minichromosome can then be isolated by, for example, gel electrophoresis.
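The stuffer-fragment design described above can be modeled as a string operation on a circular sequence. The sketch below is a toy model under several assumptions: the same restriction site borders the stuffer on both sides and occurs nowhere else, the whole site is dropped rather than modeling the exact cut position and overhangs, and no site spans the arbitrary start of the circular string. The sequences and function name are illustrative, not taken from the disclosed vectors.

```python
def linearize(circular: str, site: str):
    """Excise the stuffer between two copies of `site` from a circular sequence.

    Returns (linear_molecule, stuffer). The circular molecule is written as a
    string whose last base is joined to its first base.
    """
    if circular.count(site) != 2:
        raise ValueError("expected exactly two copies of the bordering site")
    first = circular.find(site)
    second = circular.find(site, first + len(site))
    stuffer = circular[first + len(site):second]
    # The linear molecule runs from after the second site, around the origin
    # of the string, up to the first site.
    linear = circular[second + len(site):] + circular[:first]
    return linear, stuffer


if __name__ == "__main__":
    telomere = "TTTAGGG" * 3          # Arabidopsis-type telomere repeat
    telomere_rc = "CCCTAAA" * 3       # the same array read on the other strand
    plasmid = "ACGTACGT" + telomere + "GAATTC" + "ATATATAT" + "GAATTC" + telomere_rc
    linear, stuffer = linearize(plasmid, "GAATTC")
    print(linear)
    print(stuffer)
```

In the toy plasmid, removing the stuffer leaves a linear molecule whose two ends are the telomere arrays, which is the outcome the text describes.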
  • The minichromosome contains a replication origin and selectable marker that can function in bacteria to allow the circular molecules to be maintained in bacterial cells.
  • the minichromosomes also include a plant selectable marker, a plant centromere, and a plant ARS to allow replication and maintenance of the DNA molecules in plant cells.
  • the minichromosome includes several unique restriction sites where additional DNA sequence inserts can be cloned. The most expeditious method of physically constructing such a minichromosome, i.e., ligating the various essential elements together for example, will be apparent to those of ordinary skill in this art.
  • A number of minichromosome vectors have been designed by the current inventors and are disclosed herein for the purpose of illustration (FIGs. 7A-7H). These vectors are not limiting, however, as it will be apparent to those of skill in the art that many changes and alterations may be made and still obtain a functional vector.
  • One plasmid can be created that contains markers, origins and border sequences for Agrobacterium transfer, markers for selection and screening in plants, plant telomeres, and a loxP site or other site useful for site-specific recombination in vivo or in vitro.
  • the second plasmid can be an existing BAC clone, isolated from the available genomic libraries (FIG. 10A).
  • The two plasmids are mixed, either within a single E. coli cell, or in a test tube, and the site-specific recombinase Cre is introduced. This will cause the two plasmids to fuse at the loxP sites (FIG. 10B).
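The fusion of two circular plasmids at their loxP sites can be pictured as a simple sequence operation. The sketch below is a toy model, assuming each plasmid carries exactly one loxP site in the same orientation; the loxP string is the commonly cited 34 bp wild-type site (written in one of its two orientations) and is not taken from this document, and the flanking sequences and function names are illustrative.

```python
# Commonly cited 34 bp wild-type loxP site; supplied here as an assumption.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def rotate_to_site(circular: str, site: str) -> str:
    """Rewrite a circular sequence so that it starts at its single copy of `site`."""
    if circular.count(site) != 1:
        raise ValueError("expected exactly one copy of the site")
    start = circular.find(site)
    return circular[start:] + circular[:start]


def fuse_at_site(plasmid_a: str, plasmid_b: str, site: str = LOXP) -> str:
    """Cointegrate formed by recombination between the sites of two circular plasmids.

    The result is itself circular and carries two directly repeated copies of
    the site, one at each junction between the parental sequences.
    """
    return rotate_to_site(plasmid_a, site) + rotate_to_site(plasmid_b, site)


if __name__ == "__main__":
    vector = "AAAA" + LOXP + "CCCC"   # stands in for the marker/telomere plasmid
    bac = "GGGG" + LOXP + "TTTT"      # stands in for the BAC clone
    cointegrate = fuse_at_site(vector, bac)
    print(len(cointegrate), cointegrate.count(LOXP))
```

Because the cointegrate retains two directly repeated sites, the same recombinase can in principle resolve it back into the two parental circles; the model does not attempt to capture that equilibrium.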
  • Variations include vectors with or without a Kan R gene (FIGs. 10B, 10C), with or without a LAT52 GUS gene, with a LAT52 GFP gene, and with a GUS gene under the control of other plant promoters (FIGs. 10C, 10D and 10E).
  • FIG. 10F: Method for Preparation of Stable Non-Integrated Minichromosomes
  • The inventors envision a variety that would encode a lethal plant gene (such as diphtheria toxin or any other gene product that, when expressed, causes lethality in plants). This gene could be located between the right Agrobacterium border and the telomere. Minichromosomes that enter a plant nucleus and integrate into a host chromosome would result in lethality. However, if the minichromosome remains separate, and further, if the ends of this construct are degraded up to the telomeres, then the lethal gene would be removed and the cells would survive. It should be understood that various changes and modifications to the presently preferred embodiments described herein will be apparent to those skilled in the art. Such changes and modifications can be made without departing from the spirit and scope of the present invention and without diminishing its intended advantages. It is therefore intended that such changes and modifications be covered by the appended claims.
  • Carbon et al., In: Recombinant Molecules: Impact on Science and Society (Raven Press), 335-378, 1977. Carbon et al., "Centromere structure and function in budding and fission yeasts," New Biologist, 2:10-19, 1990. Carpenter et al., "The control of the distribution of meiotic exchange in Drosophila melanogaster," Genetics, 101:81-90, 1982. Cech et al., "In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening sequence," Cell, 27:487-496, 1981.
  • Copenhaver et al., "Use of RFLPs larger than 100 kbp to map position and internal organization of the nucleolus organizer region on chromosome 2 in Arabidopsis thaliana," Plant J., 7:273-286, 1995.
  • Copenhaver et al., Proc. Natl. Acad. Sci., 95:247, 1998. Copenhaver et al., Science, 286:2468-2474, 1999. Copenhaver and Preuss, Plant Biology, 2:104-108, 1999. Coxson et al., Biotropica, 24:121-133, 1992. Creusot et al., Plant Journal, 8:763-70, 1995. Cristou et al., Plant Physiol., 87:671-674, 1988. Cuozzo et al., Bio/Technology, 6:549-553, 1988.
  • Curiel et al., "Adenovirus enhancement of transferrin-polylysine-mediated gene delivery," Proc. Natl. Acad. Sci. USA, 88(19):8850-8854, 1991.
  • Curiel et al., "High-efficiency gene transfer mediated by adenovirus coupled to DNA-polylysine complexes," Hum. Gen. Ther., 3(2):147-154, 1992. Cutler et al., J. Plant Physiol., 135:351-354, 1989.
  • Eglitis et al., "Retroviral-mediated gene transfer into hemopoietic cells," Adv. Exp. Med. Biol., 241:19-27, 1988.
  • Enomoto et al., "Mapping of the pin locus coding for a site-specific recombinase that causes flagellar-phase variation in Escherichia coli K-12," J. Bacteriol., 156:663-668, 1983.
  • Erdmann et al., J. Gen. Microbiology, 138:363-368, 1992.
  • Grellet et al., "Organization and evolution of a higher plant alphoid-like satellite DNA sequence," J. Mol. Biol., 187:495-507, 1986. Grill and Somerville, Mol. Gen. Genet., 226:484-90, 1991. Guerrero et al., Plant Molecular Biology, 15:11-26, 1990. Gupta et al., Proc. Natl. Acad. Sci. USA, 90:1629-1633, 1993. Gutierrez-Marcos et al., Proc. Natl. Acad. Sci. USA, 93:13377, 1996.
  • Lechner et al., "A 240 kd multisubunit protein complex, CBF3 is a major component of the budding yeast centromere," Cell, 64:717-725, 1991. Lee and Saier, J. of Bacteriol., 153:685, 1983. Levings, Science, 250:942-947, 1990. Lewin, Genes II, John Wiley & Sons, Publishers, N.Y., 1985. Li et al., Plant Cell, 7:1599, 1995. Li et al., Proc. Natl. Acad. Sci., 87:4580-4584, 1990. Lieber and Strauss, "Selection of efficient cleavage sites in target RNAs by using a ribozyme expression library," Mol.
  • Rattner, "The structure of the mammalian centromere," Bioessays, 13(2):51-56, 1991. Ravatn et al., Journal of Bacteriology, 180:5505-14, 1998. Reed et al., J. Gen. Microbiology, 130:1-4, 1984. Reichel et al., Proc. Natl. Acad. Sci. USA, 93(12):5888-5893, 1996. Reinhold-Hurek and Shub, "Self-splicing introns in tRNA genes of widely divergent bacteria," Nature, 357:173-176, 1992. Rensburg et al., J. Plant Physiol., 141:188-194, 1993.
  • Yuan and Altman, "Selection of guide sequences that direct efficient cleavage of mRNA by human ribonuclease P," Science, 263:1269-1273, 1994. Yuan et al., "Targeted cleavage of mRNA by human RNase P," Proc. Natl. Acad. Sci. USA, 89:8006-8010, 1992. Zatloukal et al., "Transferrinfection: a highly efficient way to express gene constructs in eukaryotic cells," Ann. N.Y. Acad. Sci., 660:136-153, 1992. Zhang et al., Gene, 202:139-46, 1997.

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Zoology (AREA)
  • Analytical Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Botany (AREA)
  • Mycology (AREA)
  • Plant Pathology (AREA)
  • Cell Biology (AREA)
  • Immunology (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Abstract

The present invention relates to plant centromere nucleic acid sequences, which make it possible to produce stably inherited recombinant DNA constructs and to construct minichromosomes that can serve as vectors for the construction of transgenic plant and animal cells.
EP03817686A 2003-06-27 2003-06-27 Compositions a base de centromeres de vegetaux Withdrawn EP1644510A4 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP10011458A EP2295586A3 (fr) 2003-06-27 2003-06-27 Compositions à base de centromères de végétaux
EP10011419A EP2357240A1 (fr) 2003-06-27 2003-06-27 Compositions à base de centromères de végétaux

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2003/020381 WO2005010142A2 (fr) 2003-06-27 2003-06-27 Compositions à base de centromères de végétaux

Publications (2)

Publication Number Publication Date
EP1644510A2 true EP1644510A2 (fr) 2006-04-12
EP1644510A4 EP1644510A4 (fr) 2007-10-24

Family

ID=34102342

Family Applications (2)

Application Number Title Priority Date Filing Date
EP03817686A Withdrawn EP1644510A4 (fr) 2003-06-27 2003-06-27 Compositions a base de centromeres de vegetaux
EP10011419A Withdrawn EP2357240A1 (fr) 2003-06-27 2003-06-27 Compositions à base de centromères de végétaux

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP10011419A Withdrawn EP2357240A1 (fr) 2003-06-27 2003-06-27 Compositions à base de centromères de végétaux

Country Status (5)

Country Link
EP (2) EP1644510A4 (fr)
AU (2) AU2003276839B2 (fr)
BR (1) BR0318377A (fr)
CA (1) CA2532809A1 (fr)
WO (1) WO2005010142A2 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2621874C (fr) * 2005-09-08 2014-12-16 Chromatin Inc. Plantes modifiees par des mini-chromosomes
AU2007249385B2 (en) * 2006-05-09 2012-09-13 The Curators Of The University Of Missouri Plant artificial chromosome platforms via telomere truncation
CN101490267B (zh) * 2006-05-17 2013-04-17 先锋高级育种国际公司 人工植物微染色体
WO2008112972A2 (fr) * 2007-03-15 2008-09-18 Chromatin, Inc. Séquences de centromères et minichromosomes
US9096909B2 (en) 2009-07-23 2015-08-04 Chromatin, Inc. Sorghum centromere sequences and minichromosomes
KR101866904B1 (ko) * 2011-06-23 2018-06-14 방글라데시 주트 리서치 인스티튜트 황마에서 질환 저항성을 부여하는 효소들을 인코딩하는 핵산 분자들

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996040965A1 (fr) * 1995-06-07 1996-12-19 Case Western Reserve University Chromosome synthetique de mammifere et procedes de construction de celui-ci
US5869294A (en) * 1995-06-07 1999-02-09 Case Western Reserve University Method for stably cloning large repeating DNA sequences
WO2000018941A1 (fr) * 1998-09-30 2000-04-06 Medical Research Council Chromosomes artificiels de mammifere et leurs applications
WO2000055325A2 (fr) * 1999-03-18 2000-09-21 The University Of Chicago Compositions de chromosomes vegetaux et methodes
WO2001000858A1 (fr) * 1999-06-30 2001-01-04 Wisconsin Alumni Research Foundation Sequences d'adn specifiques aux centromeres du riz
WO2002000842A2 (fr) * 2000-06-23 2002-01-03 The University Of Chicago Methodes d'isolation d'adn centromere
WO2002012555A2 (fr) * 2000-08-03 2002-02-14 Dosagene - R & D Moyens pour cibler des regions repetees d'acides nucleiques
WO2002081710A1 (fr) * 2001-04-06 2002-10-17 The Government Of The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Chromosomes artificiels pouvant servir de navette entre des cellules de bacterie, de levure et de mammifere
US20030033617A1 (en) * 1996-04-10 2003-02-13 Gyula Hadlaczky Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes
WO2005010187A1 (fr) * 1997-06-03 2005-02-03 Chromatin, Inc. Procedes de generation ou d'augmentation des revenus tires de cultures

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4468464A (en) 1974-11-04 1984-08-28 The Board Of Trustees Of The Leland Stanford Junior University Biologically functional molecular chimeras
US4987071A (en) 1986-12-03 1991-01-22 University Patents, Inc. RNA ribozyme polymerases, dephosphorylases, restriction endoribonucleases and methods
US5270201A (en) 1988-03-24 1993-12-14 The General Hospital Corporation Artificial chromosome vector
US5268526A (en) 1988-07-29 1993-12-07 E. I. Du Pont De Nemours And Company Overexpression of phytochrome in transgenic plants
AU638438B2 (en) 1989-02-24 1993-07-01 Monsanto Technology Llc Synthetic plant genes and method for preparation
US5168053A (en) 1989-03-24 1992-12-01 Yale University Cleavage of targeted RNA by RNAase P
US5624824A (en) 1989-03-24 1997-04-29 Yale University Targeted cleavage of RNA using eukaryotic ribonuclease P and external guide sequence
US7705215B1 (en) 1990-04-17 2010-04-27 Dekalb Genetics Corporation Methods and compositions for the production of stably transformed, fertile monocot plants and cells thereof
AU683011B2 (en) 1992-01-13 1997-10-30 Duke University Enzymatic RNA molecules
US5689052A (en) 1993-12-22 1997-11-18 Monsanto Company Synthetic DNA sequences having enhanced expression in monocotyledonous plants and method for preparation thereof
CA2292893C (fr) * 1997-06-03 2012-01-17 Arch Development Corporation Compositions de chromosomes artificiels de plantes et procedes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of WO2005010142A2 *

Also Published As

Publication number Publication date
WO2005010142A3 (fr) 2005-06-09
EP2357240A1 (fr) 2011-08-17
EP1644510A4 (fr) 2007-10-24
BR0318377A (pt) 2006-07-25
AU2003276839A8 (en) 2005-02-14
WO2005010142A2 (fr) 2005-02-03
AU2010241309A1 (en) 2010-11-25
AU2003276839A1 (en) 2005-02-14
WO2005010142A8 (fr) 2005-10-20
AU2010241309B2 (en) 2012-02-16
CA2532809A1 (fr) 2005-02-03
AU2003276839B2 (en) 2010-12-02

Similar Documents

Publication Publication Date Title
WO2005010187A1 (fr) Procedes de generation ou d'augmentation des revenus tires de cultures
US7226782B2 (en) Plant centromere compositions
US7235716B2 (en) Plant centromere compositions
US7227057B2 (en) Plant centromere compositions
US20050266560A1 (en) Plant chromosome compositions and methods
AU2005217648A1 (en) Plants modified with mini-chromosomes
US20150337321A1 (en) Plant centromere compositions
US20130007927A1 (en) Novel centromeres and methods of using the same
WO2000055325A9 (fr) Plant chromosome compositions and methods
AU2010241309B2 (en) Plant centromere compositions
AU2012202836B2 (en) Plant centromere compositions
AU2008207566B2 (en) Plant centromeres
EP2295586A2 (fr) Plant centromere compositions
CA2530178A1 (fr) Methods for generating or increasing revenues from crops

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20060130

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PT RO SE SI SK TR

R17P Request for examination filed (corrected)

Effective date: 20060127

DAX Request for extension of the european patent (deleted)
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1089481

Country of ref document: HK

RIN1 Information on inventor provided before grant (corrected)

Inventor name: MACH, JENNIFER

Inventor name: KEITH, KEVIN

Inventor name: ZIELER, HELGE

Inventor name: JIN, RONGGUAN

Inventor name: PREUSS, DAPHNE

Inventor name: COPENHAVER, GREGORY

RIN1 Information on inventor provided before grant (corrected)

Inventor name: COPENHAVER, GREGORY

Inventor name: JIN, RONGGUAN

Inventor name: ZIELER, HELGE

Inventor name: KEITH, KEVIN

Inventor name: PREUSS, DAPHNE

Inventor name: MACH, JENNIFER

RIC1 Information provided on ipc code assigned before grant

Ipc: C12N 15/82 20060101AFI20070619BHEP

Ipc: C12Q 1/68 20060101ALI20070619BHEP

Ipc: A01H 5/00 20060101ALI20070619BHEP

Ipc: C12N 15/10 20060101ALI20070619BHEP

A4 Supplementary search report drawn up and despatched

Effective date: 20070920

17Q First examination report despatched

Effective date: 20080228

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20160408