WO2005121346A1 - Transformation vectors - Google Patents

Transformation vectors

Info

Publication number
WO2005121346A1
Authority
WO
WIPO (PCT)
Prior art keywords
plant
sequence
derived
dna
vector
Prior art date
Application number
PCT/NZ2005/000117
Other languages
French (fr)
Inventor
Anthony John Conner
Philippa Jane Barrell
Johanna Maria Elisabeth Jacobs
Samantha Jane Baldwin
Annemarie Suzanne Lokerse
Jan-Peter Hendrik Nap
Original Assignee
New Zealand Institute For Crop & Food Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from NZ533371A
Application filed by New Zealand Institute For Crop & Food Research
Priority to AU2005252598A (AU2005252598B8)
Priority to EP05757541A (EP1766029A4)
Publication of WO2005121346A1
Priority to AU2010257316A (AU2010257316B2)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8202 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation by biological means, e.g. cell mediated or natural vector
    • C12N15/8205 Agrobacterium mediated transformation
    • C12N15/8213 Targeted insertion of genes into the plant genome by homologous recombination

Definitions

  • An option provided by genetic engineering is the ability to extend the germplasm base available for crop improvement to any source of DNA, including that from other plants, microbes or animals.
  • This cross-species transformation has raised ethical concerns with the public, especially when associated with food.
  • Agrobacterium-mediated transformation is the preferred method and requires the construction of modified T-DNA (transferred-DNA) on a vector (usually a binary vector).
  • The transformation requires the use of vector systems based on DNA sequences from other species (e.g. the T-DNA border regions, the DNA region into which target genes are inserted, selectable marker genes and sequences allowing such vectors to replicate in additional host systems); these sequences have usually been derived from bacterial systems.
  • T-DNA border region: The minimum requirement of a vector to perform Agrobacterium-mediated plant transformation is at least one T-DNA border region, although in practice transformation vector systems include other vector sequences as described above.
  • Two T-DNA border regions are usually used flanking the sequence of interest to be integrated into the plant genome. However, in most instances such border sequences or parts thereof also become integrated into the genome of the transformed plant.
  • T-DNA sequences have been identified as naturally occurring in the genomes of plants (White et al 1983, Nature 301: 348-350; Furner et al 1986, Nature 319: 422-427; Aoki et al 1994, Molecular and General Genetics 243: 706-710; Suzuki et al 2002, Plant Journal 32: 775-787). Plant transformation vectors in which the Agrobacterium borders are replaced with plant-derived T-DNA border-like sequences have also been reported (WO 03/069980). If the T-DNA border-like sequences are chosen from a plant of the species to be transformed, this allows for the possibility of production of plants transformed with only their own DNA.
  • the invention provides a plant transformation vector comprising: a) a T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising two polynucleotide sequence fragments, wherein all of the sequences of the T-DNA-like sequence are derived from plant species. Also possible but less preferred is use of a similar T-DNA border-like sequence containing three or more polynucleotide sequence fragments derived from plant species.
  • the invention provides a plant transformation vector comprising a) a T-DNA-like sequence including at least one T-DNA border-like sequence b) additional plant polynucleotide sequence on one or both sides of the T-DNA-like sequence in which all of said sequences are derived from plants, preferably from the same plant species.
  • the additional plant polynucleotide sequence is 5' to the left border when two T-DNA border-like sequences are used, or 5' to the single T-DNA border-like sequence when a single T-DNA border-like sequence is used.
  • the said additional plant polynucleotide sequence is at least about 1 bp in length, preferably at least about 5 bp, preferably at least about 10 bp, preferably at least about 50 bp, preferably at least about 100 bp, preferably at least about 200 bp, preferably at least about 500 bp, more preferably at least about 1 kb.
  • the T-DNA-like sequence includes two T-DNA border-like polynucleotide sequences flanking the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plants, preferably from the same plant species.
  • the T-DNA-like sequence further comprises additional base polynucleotide sequence(s), the additional base polynucleotide sequence(s) being derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
  • the T-DNA-like sequence includes first and second recombinase recognition site sequences, wherein all of said sequences are derived from plants, preferably from the same plant species.
  • first recombinase recognition site and the second recombinase recognition site are loxP-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
  • first recombinase recognition site and the second recombinase recognition site are frt-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
  • the vector comprises a selectable marker sequence flanked by the first and second recombinase recognition site sequences.
  • the selectable marker is operably linked to a constitutive promoter sequence.
  • the selectable marker and/or the constitutive promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
  • the vector comprises a recombinase sequence flanked by the first and second recombinase recognition site sequences.
  • the recombinase is operably linked to an inducible promoter sequence.
  • the recombinase and/or inducible promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
  • when the recombinase recognition sites are loxP-like sequences, the recombinase sequence is Cre, and when the recombinase recognition sites are frt-like sequences, the recombinase sequence is FLP.
  • a negative selection marker may be flanked by the first and second recombinase recognition site sequences.
  • the negative selection marker is CodA.
  • neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence contain regulatory elements, such as promoters, which may influence the expression of inserted genes of interest.
  • neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence are derived from heterochromatic regions of the genome from which they are derived.
  • the polynucleotide encompassing the T-DNA border-like sequences, the base polynucleotide sequence of the T-DNA-like sequence and the plant polynucleotide sequence additional to the T-DNA-like sequence are constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
  • the plant transformation vector of the invention further comprises an origin of replication sequence.
  • the origin of replication sequence is derived from a plant, preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
  • the T-DNA-like sequence of the plant transformation vector of the invention comprises a selectable marker polynucleotide sequence for selection of a plant cell or plant harbouring the T-DNA-like sequence.
  • the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
  • the plant transformation vector of the invention further comprises a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector.
  • the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
  • the selectable marker polynucleotide sequence for selection of a plant harbouring the T-DNA-like sequence also functions in selection of a bacterium harbouring the vector.
  • the T-DNA-like sequence further comprises a genetic construct as herein defined.
  • the genetic construct comprises a promoter polynucleotide sequence operably linked to a polynucleotide sequence of interest and a terminator polynucleotide sequence, wherein all of said polynucleotide sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
  • polynucleotide sequence of the entire vector is derived from plant species, preferably from the same plant species.
  • the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
  • first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences.
  • first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences.
  • the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
  • the selectable marker sequence is derived from plants.
  • the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
  • the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
  • the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
  • the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
  • the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
  • the T-DNA-like sequence of the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
  • polynucleotide sequence of the plant transformation vector is derived from plant species.
  • polynucleotide sequence of the plant transformation vector is derived from plant species which are interfertile.
  • the plant-derived sequence of at least 20 bp in length is at least about 50 bp in length, more preferably at least about 100 bp in length, more preferably at least about 200 bp in length, more preferably at least about 500 bp in length, most preferably at least about 1 kb in length.
  • the plant transformation vector includes, 5' to the border sequence, first and second recombinase recognition sequences derived from plant species.
  • first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences.
  • first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences.
  • the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
  • the selectable marker sequence is derived from plants.
  • polynucleotide of at least 20 bp in length and any recombinase recognition site sequences, of the plant transformation vector are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plant species.
  • the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
  • the plant transformation vector includes, 5' to the border sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide sequence, wherein the selectable marker sequence is derived from plant species.
  • the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
  • the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the selectable marker polynucleotide sequence is also capable of functioning in selection of a bacterium harbouring the vector.
  • the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
  • all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species.
  • all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species which are interfertile.
  • all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from the same plant species.
  • the invention provides a plant transformation vector comprising a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide, wherein the selectable marker sequence is derived from plant species.
  • the invention provides a plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
  • first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
  • first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences derived from plant species.
  • the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
  • the selectable marker sequence is derived from plants.
  • the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
  • the plant transformation vector further comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
  • the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
  • the invention provides a plant transformation vector comprising: a) an origin of replication polynucleotide sequence, and b) a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector in which all of said sequences are derived from plant species.
  • the plant transformation vector further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
  • the plant transformation vector is constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plants.
  • the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
  • all of the polynucleotide sequence of the plant transformation vector is derived from plant species, more preferably from plant species which are interfertile and most preferably from the same plant species.
  • the invention provides a plant transformation vector comprising a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector.
  • the selectable marker sequence is derived from a plant.
  • the vector also comprises an origin of replication sequence functional in bacteria, preferably in E. coli.
  • the origin of replication sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Yet more preferably the vector further comprises a genetic construct as herein defined.
  • the genetic construct sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector.
  • the polynucleotide sequence of the entire vector is derived from plant species, most preferably from the same plant species.
  • the invention provides a method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using a transformation vector of the invention.
  • any polynucleotide stably integrated into the plant cell or plant is derived from a plant.
  • any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed.
  • Most preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed.
  • the invention also provides a method of modifying a trait in a plant cell or plant comprising: (a) transforming a plant cell or plant with a vector of the invention, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and (b) obtaining a stably transformed plant cell or plant modified for the trait.
  • transformation is vir gene-mediated.
  • transformation is Agrobacterium-mediated.
  • transformation involves direct DNA uptake.
  • the invention provides a method for modifying a plant cell or plant, comprising: (a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by loxP-like recombinase recognition sites; (b) selecting a plant cell or plant expressing the selectable marker flanked by loxP-like recombinase recognition sites; (c) inducing the expression of the Cre gene in the plant cell or plant; (d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
  • the invention provides a method for modifying a plant cell or plant, comprising: (a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by frt-like recombinase recognition sites; (b) selecting a plant cell or plant expressing the selectable marker flanked by frt-like recombinase recognition sites; (c) inducing the expression of the FLP gene in the plant cell or plant; (d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
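The outcome of step (d) in the two methods above can be illustrated with a toy string model. This is a sketch only: it assumes the two recognition sites are identical direct repeats, so that recombination removes the intervening sequence together with one copy of the site and leaves a single site behind. The function name, the placeholder site label and the element labels are hypothetical and do not represent any sequence disclosed herein.

```python
def excise_between_sites(construct: str, site: str) -> str:
    """Toy model of recombinase-mediated excision between two direct repeats.

    Returns the construct with the segment between the first two copies of
    `site` removed, leaving one copy of `site`. If fewer than two copies
    are present, the construct is returned unchanged.
    """
    first = construct.find(site)
    if first == -1:
        return construct
    second = construct.find(site, first + len(site))
    if second == -1:
        return construct
    # Keep everything up to and including the first site, then resume
    # after the second site.
    return construct[: first + len(site)] + construct[second + len(site):]


# Hypothetical layout: a gene of interest, then a marker and a recombinase
# flanked by two recognition sites (placeholder label, not a real sequence).
SITE = "[site]"
before = "LB|gene|" + SITE + "|marker|recombinase|" + SITE + "|RB"
after = excise_between_sites(before, SITE)
print(after)
```

In this model the marker and the recombinase are both lost on excision, while the gene of interest and a single recognition site remain.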
  • the invention provides a plant modified by a method of the invention.
  • the plant cell or plant modified is of the same species as the vector sequence used to modify it.
  • the invention also provides a plant cell or plant produced by a method of the invention.
  • the plant cell or plant produced is of the same species as the vector sequence used to produce it.
  • the invention also provides a plant tissue, organ, propagule or progeny of the plant cell or plant of the invention.
  • polynucleotide(s), means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and include as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides.
  • variant refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polypeptides and polynucleotides possess biological activities that are the same or similar to those of the inventive polypeptides or polynucleotides.
  • variant with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
  • Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 5 nucleotide positions, preferably at least 10 nucleotide positions, preferably at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
  • Polynucleotide sequence identity can be determined in the following manner.
  • The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq may be utilized.
  • Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453).
  • An implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277), which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/.
  • The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences online at http://www.ebi.ac.uk/emboss/align/.
  • GAP Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.
  • BLASTN as described above is preferred for use in the determination of sequence identity for polynucleotide variants according to the present invention.
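By way of illustration only, the global-alignment route to percent identity described above can be sketched in a few lines of Python. This is not BLASTN, bl2seq or the EMBOSS needle program, and it will not necessarily reproduce their figures: the scoring values (match +1, mismatch -1, linear gap -1), the function names, and the use of the full alignment length (gaps included) as the denominator are all assumptions made for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    al_a, al_b = needleman_wunsch(a, b)
    if not al_a:
        return 0.0
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)


print(percent_identity("ACGT", "ACGA"))
```

A candidate would then be compared against whichever identity threshold applies (for example at least 70% or at least 90%), bearing in mind that the figure depends on the scoring scheme and the denominator chosen.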
  • variant polynucleotides of the present invention hybridize to the polynucleotide sequences disclosed herein, or complements thereof under stringent conditions.
  • hybridize under stringent conditions refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration.
  • the ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
  • Tm: melting temperature
  • Typical stringent conditions for polynucleotide molecules of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.
  • exemplary stringent hybridization conditions are 5 to 10°C below Tm.
  • Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
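The arithmetic in the two statements above can be sketched as follows. This is a minimal illustration, not a melting-temperature calculator: the baseline Tm is taken as an input supplied by the user, the function names are hypothetical, and returning no reduction for molecules of 100 bases or more is an assumption of the example, since the text only gives the correction for shorter molecules.

```python
from typing import Tuple


def tm_reduction_c(length_nt: int) -> float:
    """Approximate Tm reduction in degrees C for a short molecule.

    Applies the 500 / length rule to molecules shorter than 100 bases.
    Assumption: no correction is applied at 100 bases or more.
    """
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    return 500.0 / length_nt if length_nt < 100 else 0.0


def stringent_window_c(tm_c: float) -> Tuple[float, float]:
    """Exemplary stringent hybridization range: 10 to 5 degrees C below Tm."""
    return (tm_c - 10.0, tm_c - 5.0)


# Example with a hypothetical baseline Tm of 80 degrees C and a 25-mer.
baseline_tm = 80.0
adjusted_tm = baseline_tm - tm_reduction_c(25)   # reduction is 500 / 25
print(adjusted_tm, stringent_window_c(adjusted_tm))
```

For the 25-base example the reduction is 20°C, so a hypothetical baseline of 80°C gives an adjusted Tm of 60°C and an exemplary stringent range of 50 to 55°C.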
  • Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention.
  • a sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
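The notion of a silent variation can be illustrated with the standard genetic code. The sketch below builds the standard codon table from its conventional TCAG ordering, checks whether two coding sequences translate identically, and lists the synonymous codons for a given codon. The function names are hypothetical, DNA is assumed to be upper-case and in frame, and organism-specific or organellar codes are not considered.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame, upper-case DNA coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent(original: str, altered: str) -> bool:
    """True if the two sequences differ but encode the same amino acids."""
    return original != altered and translate(original) == translate(altered)


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid (or stop) as `codon`."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


print(synonymous_codons("ATG"), synonymous_codons("TGG"))
print(is_silent("ATGGCT", "ATGGCC"))
```

ATG and TGG each return only themselves, which is why they are the exceptions noted above; every other sense codon has at least one synonymous alternative.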
  • Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention.
  • a skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al, 1990, Science 247, 1306).
  • Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.
  • a "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is at least 5 nucleotides in length.
  • the fragments of the invention comprise at least 5 nucleotides, preferably at least 10 nucleotides, preferably at least 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide of the invention.
  • primer refers to a short polynucleotide, usually having a free 3' OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
  • probe refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay.
  • the probe may consist of a "fragment" of a polynucleotide as defined herein.
  • polypeptide encompasses amino acid chains of any length, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds.
  • Polypeptides of the present invention may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques.
  • the term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.
  • isolated as applied to the polynucleotide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment.
  • An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
  • the term “genetic construct” refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule.
  • a genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide.
  • the insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant or synthetic polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA.
  • the term “genetic construct” includes "expression construct” as herein defined. The genetic construct may be linked to a vector.
  • expression construct refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide.
  • An expression construct typically comprises in a 5' to 3' direction: a) a promoter functional in the host cell into which the construct will be transformed, b) the polynucleotide to be transcribed and/or expressed, and c) a terminator functional in the host cell into which the construct will be transformed.
  • vector refers to a polynucleotide molecule, usually double stranded DNA, which may include a genetic construct and be used to transport the genetic construct into a host cell.
  • the vector may be capable of replication in at least one additional host system, such as Escherichia coli or Agrobacterium tumefaciens.
  • The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences.
  • the coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon.
  • a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
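Identifying a coding sequence by its 5' start codon and 3' stop codon can be sketched as a simple scan of the sense strand. This is a minimal illustration that assumes ATG as the only start codon and TAA, TAG and TGA as stop codons; `find_orfs` and its `min_codons` parameter are hypothetical names, not part of the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG...stop stretch on the given strand.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    Only the supplied (sense) strand is scanned, in all three frames.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # look for the next ORF in this frame
    return sorted(orfs)
```

For example, `find_orfs("CCATGGCTTGGTAACC")` reports one open reading frame, `ATGGCTTGGTAA`, beginning at position 2.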
  • “Operably-linked” means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, chemical-inducible regulatory elements, environment-inducible regulatory elements, enhancers, repressors and terminators.
  • noncoding region refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.
  • Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.
  • promoter refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.
  • a “transformed plant” refers to a plant which contains new genetic material as a result of genetic manipulation or transformation.
  • the new genetic material may be derived from a plant of the same species or from a different species in which case it can also be known as a "transgenic plant”.
  • An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,
      (5')GATCTA TAGATC(3')
      (3')CTAGAT ATCTAG(5')
  • Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
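The inverted repeat definition can be checked mechanically: the second arm must equal the reverse complement of the first. The sketch below is illustrative only; the helper names and the default minimum spacer of 3 bp (taken from the lower end of the 3-5 bp range stated above) are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm: str, right_arm: str) -> bool:
    """True if the right arm is the reverse complement of the left arm."""
    return right_arm.upper() == reverse_complement(left_arm)


def can_form_hairpin(seq: str, arm_len: int, min_spacer: int = 3) -> bool:
    """True if seq is arm + spacer + reverse-complement arm with a long enough spacer."""
    seq = seq.upper()
    spacer_len = len(seq) - 2 * arm_len
    if spacer_len < min_spacer:
        return False
    return is_inverted_repeat(seq[:arm_len], seq[-arm_len:])
```

Using the arms from the example above, `reverse_complement("GATCTA")` gives `TAGATC`, so the two halves pair with each other once a spacer separates them.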
  • The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide of the invention are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations.
  • the "altered expression” can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.
  • The term "loxP-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of a Cre recombinase recognition site.
  • the loxP-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
  • A loxP-like sequence is between 24-100 bp in length, preferably 24-80 bp in length, preferably 24-70 bp in length, preferably 24-60 bp in length, preferably 24-50 bp in length, preferably 24-40 bp in length, preferably 24-34 bp in length, preferably 26-34 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
  • A loxP-like sequence preferably comprises the consensus motif
  • frt-like sequence refers to a sequence derived from the genome of a plant which can perform the function of an FLP recombinase recognition site.
  • the frt-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
  • An frt-like sequence is between 28-100 bp in length, preferably 28-80 bp in length, preferably 28-70 bp in length, preferably 28-60 bp in length, preferably 28-50 bp in length, preferably 28-40 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
  • An frt-like sequence preferably comprises the consensus motif 5' GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC 3'
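A consensus motif written in IUPAC nucleotide codes, such as the frt-like motif above, can be turned into a regular expression and used to scan a sequence. The following is a minimal sketch, not the applicants' search procedure; `motif_to_regex` and `find_motif` are hypothetical names, and only the strand supplied is scanned.

```python
import re

# IUPAC nucleotide ambiguity codes (N = any base, W = A or T, and so on).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

FRT_LIKE_MOTIF = "GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC"  # 34 bp, from the text above


def motif_to_regex(motif: str) -> str:
    """Convert an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str) -> list[int]:
    """0-based start positions of all (possibly overlapping) motif matches."""
    lookahead = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [m.start() for m in lookahead.finditer(seq.upper())]
```

To cover both strands, the same scan can be repeated on the reverse complement of the sequence.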
  • T-DNA border-like sequence refers to a sequence derived from the genome of a plant which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
  • the T-DNA border-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two or more sequences derived from the genome of a plant.
  • a T-DNA border-like sequence is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
  • The T-DNA border-like sequence of the invention is preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to any Agrobacterium-derived T-DNA border sequence.
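As a rough illustration of a percent-identity threshold, the sketch below compares two equal-length sequences position by position. This is a simplified, ungapped calculation assumed here for illustration; it is not the BLAST-based identity calculation the specification relies on, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    a, b = a.upper(), b.upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

For instance, two 10-base toy sequences that differ at two positions are 80% identical under this measure.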
  • a T-DNA border-like sequence of the invention may include a sequence naturally occurring in a plant which is modified or mutated to change the efficiency at which it is capable of integrating a linked polynucleotide sequence into the genome of a plant.
  • T-DNA-like sequence refers to a sequence derived from a plant genome which includes at one or both ends a T-DNA border-like sequence, or a chimeric T-DNA-border-like sequence as herein defined.
  • a T-DNA-like sequence may include additional base sequence between the T-DNA border-like sequences, or to one side of a T-DNA border-like sequence.
  • the base sequence of the T-DNA-like sequences of the invention preferably includes restriction sites or alternative cloning sites to facilitate insertion of further polynucleotide sequences.
  • chimeric T-DNA border-like sequence refers to a sequence which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein part of the sequence is derived from a plant and part of the sequence is derived from another source, such as Agrobacterium.
  • T-DNA integration from the right border is very precise.
  • Molecular cloning and sequencing across T-DNA/plant genomic DNA junctions has repeatedly established that T-DNA integration at the right border is highly conserved, with only the first few nucleotides of the right border being integrated into plant genomes (Gheysen, G., Angenon, G., van Montagu, M., Agrobacterium-mediated plant transformation: a scientifically interesting story with significant applications, pp. 1-33, in Transgenic Plant Research, editor Lindsey, K., Harwood Academic Publishers, Amsterdam, 1998).
  • border sequence refers to a sequence derived from a plant which can perform the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
  • a "border sequence” is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
  • a "border sequence” preferably comprises the consensus motif:
  • border sequence includes known Agrobacterium borders, including those disclosed herein.
  • border sequence also includes modified versions of known Agrobacterium sequences, which have been modified, for example by substitution, addition or deletion, to improve the efficiency at which they are capable of performing function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
  • plant-derived origin of replication refers to a sequence derived from a plant which can support replication of a vector in which it is included in a bacterium.
  • plant-derived origins of replication may be composed of one, two or more sequence fragments derived from plants.
  • plant-derived origins of replication are composed of two sequence fragments derived from plants.
  • the plant-derived origin of replication may comprise the consensus motif:
  • selectable marker derived from a plant or “plant-derived selectable marker” or grammatical equivalents thereof refers to a sequence derived from a plant which can enable selection of a plant cell harbouring the sequence or a sequence to which the selectable marker is linked.
  • the "plant-derived selectable markers” may be composed of one, two or more sequence fragments derived from plants.
  • the “plant-derived selectable markers” are composed of two sequence fragments derived from plants.
  • The plant-derived selectable marker is at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to the sequence of SEQ ID NO: 10.
  • the plant-derived selectable marker is at least 90%, preferably at least 95%, and most preferably 100% identical to SEQ ID NO:39 or SEQ ID NO:40.
  • the invention provides novel plant derived loxP-like and frt-like recombinase recognition sequences, novel T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for producing transformed plant cells and plants, and plant cells and plants produced by the methods.
  • marker genes are no longer required. Moreover it is desirable to remove the promoter and enhancer elements used to drive the expression of the marker genes as these may interfere with the expression of neighboring endogenous genes.
  • Two such recombination systems are the Escherichia coli bacteriophage P1 Cre/loxP system and the Saccharomyces cerevisiae FLP/frt system, which require only a single-polypeptide recombinase, Cre or FLP, and minimal 34 bp DNA recombination sites, loxP or frt.
  • the recombinase enzyme can either be located next to the selectable marker gene so that it is in effect auto excised (Mlynarova, L and Nap J-P, A self-excising Cre recombinase allows efficient recombination of multiple ectopic heterospecific lox sites in transgenic tobacco, Transgenic Research, 12: 45-57, 2003), or it can be transiently expressed (Gleave, A.P, Mitra, D.S, Mudge, S.R and Morris, B.A.M. Selectable marker-free transgenic plants without sexual crossing: transient expression of cre recombinase and use of a conditional lethal dominant gene, Plant Molecular Biology, 40: 223-235, 1999).
  • the invention provides T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for transforming plant cells and plants, and the plant cells and plants produced by the methods.
  • the applicants have also identified novel plant derived loxP-like and frt-like recombinase recognition sequences from plant genomes and devised further improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant.
  • the invention provides methods which allow for within-species or "intragenic" as opposed to transgenic transformation of plants.
  • Vectors useful for this approach can therefore be described as intragenic vectors.
  • the invention provides such intragenic vectors and methods of using them to produce intragenic transformed plants without any foreign DNA.
  • DNA sequences used to construct such "intragenic vectors” are preferentially derived from DNA sequences (ESTs or cDNAs) known to be expressed in plant genomes. In this manner sequences derived from heterochromatic regions, promoters or introns can be avoided.
  • the use of such sequences for the construction of intragenic vectors may influence the subsequent expression of genes of interest following their transfer to plants via intragenic vectors.
  • the invention provides novel T-DNA border-like sequences from several plant species (as shown in Example 1) formed by combining two to three fragments of genomic DNA, with all fragments being from a single plant species of interest or a closely related species. The common nature of such sequences in plant genomes is shown in Example 1.
  • the invention further provides isolated T-DNA-like sequences from several plant species as shown in Example 2.
  • the T-DNA-like region sequences in Example 2 include the T-DNA-like sequences flanked (and delineated) by T-DNA border-like sequences (high-lighted) and additional sequence on either one or both sides of the T-DNA-like sequence.
  • Plant-derived selectable marker sequences which are useful for selecting transformed plant cells and plants harbouring a particular T-DNA-like sequence include PPga22 (Zuo et al, Curr Opin Biotechnol. 13: 173-80, 2002), Cki1 (Kakimoto, Science 274: 982-985, 1996), Esr1 (Banno et al, Plant Cell 13: 2609-18, 2001), and dhdps-r1 (Ghislain et al, Plant Journal, 8: 733-743, 1995).
  • pigmentation markers to visually select transformed plant cells and plants, such as the R and C1 genes (Lloyd et al, Science, 258: 1773-1775, 1992; Bodeau and Walbot, Molecular and General Genetics, 233: 379-387, 1992).
  • a preferred plant-derived selectable marker is the acetohydroxyacid synthase gene as shown in Example 6 and Example 7. Non-plant derived selectable markers are also described herein.
  • Preferred intragenic vectors of the invention contain a plant-derived selectable marker which functions in selection of bacteria harbouring the marker as described in Example 3 and Example 5.
  • the preferred intragenic vectors of the invention consist entirely of plant-derived polynucleotide sequence from the species to be transformed, or from closely related species, such as species interfertile with the plant to be transformed, considered to be within the germplasm pool accessible to traditional plant breeding.
  • Such vectors preferably include a plant-derived origin of replication which is functional in bacteria, particularly in Agrobacterium species and preferably also in E. coli.
  • the invention provides plant transformation vectors comprising such sequences.
  • Preferred origin of replication sequences include those shown in Example 4.
  • The invention provides novel loxP-like and frt-like recombinase recognition sequences from several plant species as shown in Example 9 and Example 10.
  • Construction of a vector of the invention is described in Example 6 and Example 8. Plant transformation using these vectors is described in Example 7 and Example 8.
  • Example 6 and Example 7 also illustrate the construction and successful use of a vector with a chimeric T-DNA border-like sequence.
  • The "right border" is composed of 5'GAC3' from the end of a sequence isolated from Arabidopsis thaliana, with the remainder of the chimeric T-DNA border-like sequence, 5'AGGATATATTGGCGGGTAAAC3', being derived from the binary vector pART27 (see sequence of pTCl in Example 6).
  • Such chimeric T-DNA border-like sequences are preferably used as the right border when two border-like sequences are used to flank the T-DNA-like sequence.
  • The plant-derived end (e.g. 5'GRC3') of the T-DNA border-like sequence must be contiguous with the plant-derived sequence(s) destined for integration into a plant genome.
  • polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art.
  • such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference.
  • the polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
  • Further methods for isolating polynucleotides of the invention include use of all, or portions of, the disclosed polynucleotide sequences as hybridization probes.
  • the technique of hybridizing labeled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries.
  • Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.
  • polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis.
  • A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding further contiguous polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene.
  • the fragment is then circularized by intramolecular ligation and used as a PCR template.
  • Divergent primers are designed from the known region.
  • standard molecular biology approaches can be utilized (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
  • Variant polynucleotides may be identified using PCR-based methods (Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser).
  • the polynucleotide sequence of a primer useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
  • Further methods for identifying variant polynucleotides of the invention include use of all, or portions of, the polynucleotides disclosed herein as hybridization probes to screen plant genomic or cDNA libraries as described above. Typically probes based on a sequence encoding a conserved region of the corresponding amino acid sequence may be used. Hybridisation conditions may also be less stringent than those used when screening for sequences identical to the probe.
  • Variant polynucleotide sequences of the invention may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.
  • An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from ftp://ftp.ncbi.nih.gov/blast/ or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA.
  • the NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases.
  • BLASTN compares a nucleotide query sequence against a nucleotide sequence database.
  • BLASTP compares an amino acid query sequence against a protein sequence database.
  • BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
  • tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
  • tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
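The phrase "translated in all reading frames" used for BLASTX, tBLASTN and tBLASTX refers to the three forward and three reverse-complement frames of a nucleotide sequence. The sketch below shows what such a six-frame translation looks like under the standard genetic code; it is an illustration only and does not reproduce BLAST's own implementation. The function names and frame labels are assumptions.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate codon by codon; unknown codons become 'X', partial codons are dropped."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translations(seq: str) -> dict[str, str]:
    """Translations of frames +1..+3 (given strand) and -1..-3 (reverse complement)."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames
```

In a tBLASTX-style comparison, each of the six query translations would be compared against each of the six translations of every database sequence.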
  • the BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
  • The BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al, Nucleic Acids Res. 25: 3389-3402, 1997.
  • the "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm align and identify similar portions of sequences.
  • the hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
  • the BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments.
  • the Expect value (E) indicates the number of hits one can "expect” to see by chance when searching a database of the same size containing random contiguous sequences.
  • the Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance.
  • the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.
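The Expect value and the probability of a chance match are related through Poisson statistics: if E chance hits are expected, the probability of seeing at least one is P = 1 - exp(-E). The sketch below applies that standard relationship; the function name is hypothetical, and the specific E threshold that the specification pairs with "1% or less" is not stated in the text above.

```python
import math


def chance_hit_probability(e_value: float) -> float:
    """Probability of at least one chance hit, given an Expect value E.

    Assumes the number of chance hits is Poisson-distributed with mean E,
    so P = 1 - exp(-E). For small E, P is approximately equal to E.
    """
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    return -math.expm1(-e_value)  # numerically stable form of 1 - exp(-E)
```

Under this relationship an E value of 0.01 corresponds to a probability just under 1%, and an E value of 0.1 to roughly 9.5%.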
  • Pattern recognition software applications are available for finding motifs or signature sequences.
  • MEME Multiple Em for Motif Elicitation
  • MAST Motif Alignment and Search Tool
  • the MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found.
  • MEME and MAST were developed at the University of California, San Diego.
  • PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al, 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences.
  • the PROSITE database www.expasy.org/prosite
  • Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
  • the function of a variant of a polynucleotide of the invention may be assessed by replacing the corresponding sequence in an intragenic vector with the variant sequence and testing the functionality of the vector in a host bacterial cell or in a plant transformation procedure as herein defined.
  • Such methods may involve the transformation of plant cells and plants, using a vector of the invention including a genetic construct designed to alter expression of a polynucleotide or polypeptide which modulates such a trait in plant cells and plants.
  • Such methods also include the transformation of plant cells and plants with a combination of the construct of the invention and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides which modulate such traits in such plant cells and plants.
  • a number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297).
  • Strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage where/when it is not normally expressed.
  • the expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
  • Transformation strategies may be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage where/when it is normally expressed. Such strategies are known as gene silencing strategies.
  • Direct gene transfer involves the uptake of naked DNA by cells and its subsequent integration into the genome (Conner, A.J. and Meredith, C.P., Genetic manipulation of plant cells, pp. 653-688, in The Biochemistry of Plants: A Comprehensive Treatise, Vol 15, Molecular Biology, editor Marcus, A., Academic Press, San Diego, 1989; Petolino, J., Direct DNA delivery into intact cells and tissues, pp. 137-143, in Transgenic Plants and Crops, editors Khachatourians et al., Marcel Dekker, New York, 2002).
  • the cells can include those of intact plants, pollen, seeds, intact plant organs, in vitro cultures of plants, plant parts, tissues and cells or isolated protoplasts.
  • Methods to effect direct DNA transfer may involve, but are not limited to: passive uptake; the use of electroporation; treatments with polyethylene glycol and related chemicals and their adjuncts; electrophoresis; cell fusion with liposomes or spheroplasts; microinjection; silicon carbide whiskers; and microparticle bombardment.
  • Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
  • The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression desired for the cloned polynucleotide.
  • the promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi.
  • promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention.
  • constitutive promoters used in plants include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize.
  • Plant promoters which are active in specific tissues, respond to internal developmental signals or external abiotic or biotic stresses are also described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.
  • Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
  • NPT II neomycin phosphotransferase II gene
  • aadA gene which confers spectinomycin and streptomycin resistance
  • phosphinothricin acetyl transferase bar gene
  • Ignite AgrEvo
  • Basta Hoechst
  • hpt hygromycin phosphotransferase gene
  • non-plant derived regulatory elements described above may be used in the intragenic vectors of the invention operably linked to selectable markers placed between the recombinase recognition sites.
  • Gene silencing strategies may be focused on the gene itself or regulatory elements which affect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.
  • Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention.
  • the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator.
  • An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,
  • Genetic constructs designed for gene silencing may also include an inverted repeat as herein defined.
  • the preferred approach to achieve this is via RNA-interference strategies using genetic constructs encoding self-complementary "hairpin" RNA (Wesley et al., 2001, Plant Journal, 27: 581-590).
  • the transcript formed may undergo complementary base pairing to form a hairpin structure.
  • a spacer of at least 3-5 bp between the repeated regions is required to allow hairpin formation.
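The inverted-repeat design described above can be sketched in a few lines of Python. This is a minimal illustration only: the arm and spacer sequences, and the function names `build_hairpin` and `arms_pair`, are hypothetical placeholders and are not sequences or tools from the invention.

```python
# Hypothetical sketch: assemble an inverted-repeat ("hairpin") insert from a
# sense arm, a spacer, and the reverse complement of the arm.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_hairpin(arm: str, spacer: str) -> str:
    """Join arm + spacer + reverse-complemented arm.

    The text asks for a spacer of at least 3-5 bp; 3 bp is used here as the
    lower bound (an assumption about how to read that range).
    """
    if len(spacer) < 3:
        raise ValueError("spacer shorter than 3 bp; the loop may not form")
    return arm + spacer + reverse_complement(arm)


def arms_pair(insert: str, arm_len: int) -> bool:
    """Check that the first and last arm_len bases can base-pair."""
    return insert[:arm_len] == reverse_complement(insert[-arm_len:])


arm = "ATGGCTTCAGGA"   # placeholder sense segment
spacer = "TTCAA"       # placeholder 5 bp spacer
hairpin = build_hairpin(arm, spacer)
print(hairpin, arms_pair(hairpin, len(arm)))
```

A transcript of such an insert carries both arms on one strand, which is what allows the complementary base pairing described above.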
  • Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al, 2002, Science 297, 2053). Use of such small antisense RNA corresponding to polynucleotide of the invention is expressly contemplated.
  • genetic construct as used herein also includes small antisense RNAs and other such polynucleotides effecting gene silencing.
  • Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al, 1990, Plant Cell 2, 279; de Carvalho Niebel et al, 1995, Plant Cell, 7, 347).
  • sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR).
  • Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al, 2002, Plant Physiol. 128(3): 844-53; Jones et al, 1998, Planta 204: 499-505).
  • polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, or the corresponding gene.
  • Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements.
  • Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.
  • the plant-derived sequences in the vectors of the invention may be derived from any plant species.
  • the plant-derived sequences in the vectors of the invention are from gymnosperm species.
  • Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea.
  • Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannii x sitchensis, Picea sitchensis and Picea glauca.
  • the plant-derived sequences in the vectors of the invention are from bryophyte species.
  • Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon.
  • Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
  • the plant-derived sequences in the vectors of the invention are from algae species.
  • Preferred algae genera include Chlamydomonas.
  • Preferred algae species include Chlamydomonas reinhardtii.
  • the plant-derived sequences in the vectors of the invention are from angiosperm species.
  • Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Pet
  • Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, G
  • Particularly preferred angiosperm genera include Solanum, Petunia and Allium.
  • Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
  • the plant cells and plants of the invention may be derived from any plant species.
  • the plant cells and plants of the invention are from gymnosperm species.
  • Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea.
  • Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannii x sitchensis, Picea sitchensis and Picea glauca.
  • the plant cells and plants of the invention are from bryophyte species.
  • Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon.
  • Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
  • the plant cells and plants of the invention are from algae species.
  • Preferred algae genera include Chlamydomonas.
  • Preferred algae species include Chlamydomonas reinhardtii.
  • the plant cells and plants of the invention are from angiosperm species.
  • Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum
  • Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Gly
  • Particularly preferred angiosperm genera include Solanum, Petunia and Allium.
  • Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
  • the cells and plants of the invention may be grown in culture, in greenhouses or in the field. They may be propagated vegetatively, or either selfed or crossed with a different plant strain, and the resulting hybrids with the desired phenotypic characteristics may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.
  • Figure 1 shows PCR verification of the propagation of plasmid pPOTCOLE2SPEC in E. coli mediated by a potato-derived COLE2-like origin of replication.
  • Lanes 1 and 2 are plasmid preparations restricted with a BamHI/EcoRI double digest from two independent transformation events of pPOTCOLE2SPEC into E. coli DH5α already possessing pBX243; they show 3.9 kb, 2.5 kb, and 1.5 kb fragments, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene respectively.
  • Lane 3 is a plasmid preparation restricted with a BamHI/EcoRI double digest from a culture transformed with only pBX243 and shows 3.9 kb and 1.5 kb fragments, representing the pBX243 backbone and the pBX243 Rep gene.
  • Lane 4 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker.
  • Figure 2 shows PCR verification of the potato-derived LacO1-like sequences functioning as a plasmid selectable element by operator-repressor titration.
  • Lane 1 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker.
  • Lanes 2-6 are plasmid preparations restricted with PstI from five independent transformation events of pBR322POTLACO1 into E. coli strain DH1lacdapD using repressor titration selection; they show the expected 1.3 kb and 3.8 kb fragments.
  • Lane 7 is a plasmid preparation restricted with PstI following transformation of pBR322POTLACO1 into E. coli strain DH5α using ampicillin selection and also shows the expected 1.3 kb and 3.8 kb fragments.
  • Lane 8 is linearised pBR322 visualised as a 4.4 kb fragment.
  • Figure 3 shows PCR verification of Arabidopsis thaliana 'Columbia' transformed with the intragenic vector pTCAHAS.
  • Lanes 1&2, 3&4 and 5&6 are three A. thaliana lines transformed with the intragenic vector, lanes 1,3,5 using primers E+F, lanes 2,4,6 using primers G+H; lanes 8&9 are untransformed A.
  • Figure 4 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPOTINV. This involved a multiplexed PCR using primers I+J to amplify the 570 bp fragment from the pPOTINV T-DNA-like region and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato.
  • Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is the intragenic vector pPOTINV, lane 6 is a no template control.
  • Figure 5 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 4. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene.
  • Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California)
  • lane 2 is the co-transformed hairy root line #18,
  • lane 3 is the co-transformed hairy root line #74
  • lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV
  • lane 5 is Agrobacterium strain A4T, lane 6 is a no template control.
  • Figure 6 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPETINV. This involved PCR using primers O+P to amplify the 447 bp fragment from the pPETINV T-DNA-like region (lanes 2-5) and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato (lanes 6-8).
  • Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lanes 2 and 6 are the co-transformed hairy root line #24; lanes 3 and 7 are a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV; lane 4 is the intragenic vector pPETINV; lanes 5 and 8 are no template controls.
  • Figure 7 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 6. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene.
  • Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California)
  • lane 2 is the co-transformed hairy root line #18,
  • lane 3 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV
  • lane 4 is Agrobacterium strain A4T
  • lane 5 is a no template control.
  • Figure 8 illustrates recombination between the POTLOXP sites mediated by Cre recombinase. Plasmid was isolated from E. coli strain 294-Cre transformed with pPOTLOXP2 and restricted with SalI. Expression of Cre recombinase was induced by raising the temperature from 23 °C to 37 °C.
  • Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lane 2 illustrates the expected 3.0 kb and 2.3 kb SalI fragments of unrecombined pPOTLOXP2 isolated from a culture maintained at 23 °C; lanes 3-8 illustrate the 3.0 kb and 1.5 kb SalI fragments expected from Cre-mediated recombination between the POTLOXP sites in six different colonies cultured at 37 °C.
  • Figure 9 illustrates recombination between the POTFRT sites mediated by FLP recombinase.
  • Plasmid was isolated from E. coli strain 294-FLP transformed with pPOTFRT2 and restricted with SalI. Expression of FLP recombinase was induced by raising the temperature from 23 °C to 37 °C.
  • Lanes 1 and 8 are the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker; lane 2 illustrates the expected 3.0 kb and 1.4 kb SalI fragments of unrecombined pPOTFRT2 isolated from a culture maintained at 23 °C; lanes 3-7 illustrate the 3.0 kb and 1.4 kb fragments, and the 1.1 kb SalI fragments expected from FLP-mediated recombination between the POTFRT sites in five different colonies cultured at 37 °C.
  • NCBI GenBank http://www.ncbi.nlm.nih.gov/BLAST/
  • TIGR database http://tigrblast.tigr.org/tgi/ using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded multiple accession numbers for each motif 5'GACAGGATATAT3' and 5'GGCAGGATATAT3' as shown in Table 1.
  • the search was limited to Viridiplantae and the expect value was 10000. Searches were also conducted in the EST Database of Japan, carried out using Expect values of 10000 and the gap tool off (http://www.ddbj.nig.ac.jp).
  • the initial 5'GRCAGGATATAT3' of the T-DNA border-like motif is less likely to be identified in database searches than the shorter sequence 5'KSTMAWN3'. If the entire border sequence is formed using 2 EST sequences as shown in Example 2 of the patent application, then a second BLAST search is undertaken using 5'KSTMAWN3' from known T-DNA border sequences.
  • a list of such sequences is: 5'TGTCATG3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3', 5'GGTAAAA3'; which correspond to the following border sequences: 5'gacaggatatatgttcttgtcatg3' (pRi), 5'gacaggatatattggcgggtaaac3' (pTiT37 and pTiC58), 5'ggcaggatatatcgaggtgtaaaa3' (pTi15955), 5'ggcaggatatattgtggtgtaaac3' (pART27 lb) and 5'gacaggatatattggcgggtaaac3' (pART27 rb).
  • BLAST searches using these sequences produce multiple matches. For example, just within Solanum tuberosum (a plant whose genome has not been completely sequenced), a search (BLAST "search for short, nearly exact matches", Expect 20000 and descriptions 1000) for only 5'TGTAAAC3' in NCBI GenBank yields 997 exact matches, of which 985 are S. tuberosum ESTs (search performed 2 June 2004).
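As a rough local analogue of these database searches, the degenerate motifs can be translated into regular expressions and scanned against sequences already in hand. The sketch below is an assumption-laden illustration, not the BLAST procedure used in the text: the helper names `iupac_to_regex` and `find_motif` are hypothetical, and the test sequences are the five border sequences quoted above.

```python
import re

# IUPAC nucleotide codes needed for the two motifs in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate a degenerate motif such as GRCAGGATATAT into a regex."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of all (overlapping) motif matches."""
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]


# Border sequences as quoted in the text (plasmid labels as best read).
borders = {
    "pRi": "gacaggatatatgttcttgtcatg",
    "pTiT37/pTiC58": "gacaggatatattggcgggtaaac",
    "pTi15955": "ggcaggatatatcgaggtgtaaaa",
    "pART27 lb": "ggcaggatatattgtggtgtaaac",
    "pART27 rb": "gacaggatatattggcgggtaaac",
}

for name, seq in borders.items():
    print(name, find_motif(seq, "GRCAGGATATAT"), find_motif(seq, "KSTMAWN"))
```

Each 24-nucleotide border should show the 12-mer at position 1 and a 7-mer match at position 18; the short 7-mer may also match elsewhere, which mirrors the large number of hits reported for it.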
  • T-DNA-like regions for possible intragenic vectors was undertaken by searching plant EST databases for Agrobacterium border-like sequences. Limiting searches to EST sequences facilitates the design of intragenic vectors by ensuring that:
  • 1. the base DNA making up the T-DNA-like region does not involve regulatory elements such as promoters that may influence expression of inserted target genes; and
  • 2. the DNA on which the T-DNA-like region is based is not derived from heterochromatic regions (non coding, non expressed, condensed DNA) as this may suppress activity of the genes intended for transfer.
  • BLAST searches were conducted as described by Altschul et al. (Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25: 3389-3402, 1997).
  • NCBI BLAST www.ncbi.nlm.nih.gov/BLAST/ "blastn" and "search for short, nearly exact matches" was used to search the EST database. Expect values of 10000 or 20000 (dependent on word size) were used and the search was limited by entrez query, potato (Solanum), tomato (Lycopersicon), or Petunia. All Petunia EST sequences from the NCBI site were also downloaded in FASTA format and searched using the "find" tool in Microsoft Notepad.
  • Solanaceae genomics network http://soldb.cit.cornell.edu/cgi-bin/tools/blast/simple.pl BLAST settings included expect values of 10,000 (due to short sequences) and the default settings. All searches were done in EST databases. Unigene sequences were identified using the EST searches.
  • BLAST was carried out as above with an Expect value of 10,000 and limited by entrez query to Pinus, Nicotiana, Medicago, apple or onion (Allium).
  • NCBI BLAST www.ncbi.nlm.nih.gov/BLAST/. Settings were as above but limited by entrez query rice or Oryza.
  • TIGR - http://tigrblast.tigr.org/tgi/ searched unique gene indices. Used an expect value of 10,000 and matrix blosum62 or blosum100. All other values were the default settings. The searches identified some TC# sequences (tentative consensus sequences) and ESTs containing the region of interest were identified from these.
  • the RGP EST database was used to search for ESTs containing the sequences of interest, using Expect values of 10000 and the remaining options at default settings.
  • ESTs were identified that showed sequence identity to parts of the Agrobacterium border-like sequences. These identified EST sequences were then assessed for homology, length of sequence flanking the borders and unique restriction sites. This was carried out using DNAMAN (version 3.2, Lynnon BioSoft, copyright © 1994-1997). ESTs were adjoined (usually 3 ESTs) to give a T-DNA-like region containing two border sequences, unique restriction sites between the border sequences (that can be used as cloning sites) and extra plant EST sequence beyond the borders to minimize the opportunity for non-intragenic vector backbone sequences being transferred with the T-DNA-like region into plant genomes. Multiple intragenic T-DNA-like regions were designed and compared. Those designed to have the optimum sequence and useful unique restriction sites are presented below.
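The screen for unique restriction sites between the two borders can be approximated with a short script. This is only a sketch of the idea, not the DNAMAN analysis used in the text: the enzyme panel is a small illustrative subset with well-known palindromic recognition sequences, and the toy region, border coordinates and function names are hypothetical.

```python
# Illustrative subset of enzymes; all recognition sites are palindromic, so
# scanning one strand is sufficient.
SITES = {
    "SalI": "GTCGAC",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
}


def site_positions(seq: str, site: str) -> list[int]:
    """Return 1-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions


def unique_cloning_sites(seq: str, left_end: int, right_start: int) -> dict[str, int]:
    """Enzymes that cut exactly once in seq, with the site lying strictly
    between the end of the left border and the start of the right border."""
    result = {}
    for enzyme, site in SITES.items():
        hits = site_positions(seq, site)
        if len(hits) == 1:
            first, last = hits[0], hits[0] + len(site) - 1
            if left_end < first and last < right_start:
                result[enzyme] = first
    return result


# Toy region: SalI sites at both ends stand in for the flanking cloning sites.
toy_region = "GTCGACTTTTGGATCCTTTTCTGCAGTTTTGAATTCTTTTGAATTCGTCGAC"
print(unique_cloning_sites(toy_region, left_end=6, right_start=47))
```

In a real design the check would be run on the whole binary vector rather than on the region alone, since a site is only useful for cloning if it is unique in the complete construct.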
  • T-DNA-like region of a potato intragenic vector. This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the potato genome sequence.
  • Nucleotides 6 - 334 are the reverse complement of nucleotides 315 - 643 of sgn-U179068.
  • Nucleotides 335 - 974 are nucleotides 131 - 770 of sgn-U174278.
  • Nucleotides 975 - 1265 are nucleotides 117 - 407 of CN216800.
  • T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 314 - 337 and the right border is nucleotides 957 - 980.
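The coordinate bookkeeping for a composite region of this kind can be checked mechanically. The sketch below uses the potato coordinates quoted above (accession labels as best they can be read from the text); the `Segment` type and the helper names are hypothetical. It confirms that each target range has the same length as its source range, that the segments abut, and that each border spans a junction between two ESTs.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    start: int       # first nucleotide in the composite region (1-based)
    end: int         # last nucleotide in the composite region (inclusive)
    source: str      # accession label
    src_start: int   # first nucleotide in the source EST
    src_end: int     # last nucleotide in the source EST
    revcomp: bool    # True if used as the reverse complement


potato = [
    Segment(6, 334, "sgn-U179068", 315, 643, True),
    Segment(335, 974, "sgn-U174278", 131, 770, False),
    Segment(975, 1265, "CN216800", 117, 407, False),
]
potato_borders = {"left": (314, 337), "right": (957, 980)}


def check_segments(segments: list[Segment]) -> int:
    """Validate lengths and contiguity; return the total covered length."""
    for seg in segments:
        if seg.end - seg.start != seg.src_end - seg.src_start:
            raise ValueError(f"length mismatch for {seg.source}")
    for prev, nxt in zip(segments, segments[1:]):
        if nxt.start != prev.end + 1:
            raise ValueError(f"gap or overlap before {nxt.source}")
    return segments[-1].end - segments[0].start + 1


def junctions_spanned(segments: list[Segment], border: tuple[int, int]) -> list[tuple[str, str]]:
    """List EST junctions that fall inside a border interval."""
    lo, hi = border
    return [
        (a.source, b.source)
        for a, b in zip(segments, segments[1:])
        if lo <= a.end and b.start <= hi
    ]


print(check_segments(potato))
for name, border in potato_borders.items():
    print(name, junctions_spanned(potato, border))
```

With these numbers each 24-nucleotide border straddles exactly one EST junction, consistent with a border being formed by adjoining two ESTs.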
  • Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
  • This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the petunia genome sequence.
  • Nucleotides 6-399 are the complete sequence of the 394 nucleotide fragment from sgn-e521144.
  • Nucleotides 400-855 are the reverse complement of nucleotides 85-540 from sgn-e534315.
  • Nucleotides 856-1071 are the reverse complement of nucleotides 121-336 from sgn-u207691.
  • T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 347-370 and right border is nucleotides 844-867.
  • Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: AccIII at 392, AgeI at 788, BbvI at 453, BspMII at 392, Bst71I at 453, Cfr10I at 788, ClaI site at 398, Fnu4HI at 442, ⁇ el at 665, ⁇ f ⁇ l at 752, PinAI at 788
  • NspI sites within the T-DNA-like region (616 and 755) that could be used as cloning sites.
  • the most useful restriction site for cloning into the T-DNA-like region is the ClaI site which is shown in underlined bold.
  • T-DNA-like region of a tomato (Lycopersicon esculentum) intragenic vector. This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the tomato genome sequence.
  • Nucleotides 5 - 537 are nucleotides 2-534 of SGN-E260320.
  • Nucleotides 538 - 976 are the reverse complement of nucleotides 79 - 517 of SGN-E291502
  • Nucleotides 977 - 1188 are the reverse complement of nucleotides 1 - 212 of CK575027.
  • the T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 375 - 398 and the right border nucleotides 960 - 983.
  • the restriction sites and positions that could be used for cloning within the T-DNA are shown below (as calculated by DNAMAN):
  • T-DNA-like region of a Nicotiana benthamiana intragenic vector This sequence can be ligated into pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-
  • Nucleotides 5 - 853 are nucleotides 111 - 959 of CK292156
  • Nucleotides 854 - 1469 are the reverse complement of nucleotides 81-696 of CK286377.
  • Nucleotides 1470 - 1787 are nucleotides 285 - 602 of CN748849.
  • T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 566 - 589 and the right border is nucleotides 1455 - 1478.
  • Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: AccIII at 611, AflII at 654, AhaIII at 1160, BamHI at 614, BsiI at 1362, BspMII at 611, DraI at 1160, EcoNI at 622, ⁇ ell at 840, NspI at 726, ScaI at 921, &pl at 1420, Kspl at 1085, XhoII at 614
  • This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the apple genome sequence.
  • Nucleotides 5 - 246 are nucleotides 1 - 242 of CN862631.
  • Nucleotides 247 - 644 are the reverse complement of nucleotides 28 - 425 of CN942531.
  • Nucleotides 645 - 943 are the reverse complement of nucleotides 1 - 299 of CO541348.
  • the T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 229 -
  • Nucleotides 6 - 357 are nucleotides 2 - 353 of CA921810.
  • Nucleotides 358 - 694 are nucleotides 112 - 448 of AL375389.
  • Nucleotides 695 - 1055 are the reverse complement of nucleotides 2-362 of CF069972.
  • the T-DNA border-like sequence is shown in bold.
  • the left border is nucleotides 339 - 362 and the right border is nucleotides 677 - 700.
  • Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
  • This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the onion genome sequence.
  • Nucleotides 5 - 537 are nucleotides 4 - 536 of CF449263.
  • Nucleotides 538 - 1186 are nucleotides 94 - 742 of CF441521.
  • Nucleotides 1187 - 1503 are nucleotides 162 - 478 of CF452730.
  • the T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 520 -
  • This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. This requires a partial digest due to a SalI site within the T-DNA-like region.
  • the nucleotides in italics are not part of the rice genome sequence.
  • Nucleotides 6 - 634 are nucleotides 1 - 629 of CR287857.
  • Nucleotides 635 - 1258 are nucleotides 156 - 779 of AK100350.
  • Nucleotides 1259 - 1740 are nucleotides 222 - 703 of CB619781.
  • the T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 616 —
  • T-DNA-like region of Pinus taeda intragenic vector. This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined.
  • the nucleotides in italics are not part of the P. taeda genome sequence.
  • Nucleotides 1 - 333 are nucleotides 114 - 446 of BM133642.
  • Nucleotides 334 - 914 are nucleotides 81 - 661 of CF392877.
  • Nucleotides 915 - 1172 are nucleotides 138 - 395 of CX715693.
  • the T-DNA border-like sequences are shown in bold.
  • the left border is nucleotides 314 -
  • the complete vector is made up entirely of plant-derived sequences.
  • One desirable component for effective vector manipulation is a bacterial selectable marker.
  • Preferred marker sequences include plant genes that complement bacterial mutants deficient in genes essential for their growth, such as amino acid biosynthesis genes.
  • One such gene is acetohydroxyacid synthase.
  • Acetohydroxyacid synthase (AHAS) is an enzyme which catalyses the formation of acetolactate from pyruvate, the first step in valine, leucine and isoleucine biosynthesis.
  • mutant forms of AHAS can confer on plants resistance to sulfonylurea herbicides and related compounds (Mazur and Falco, Annual Review of Plant Physiology and Plant Molecular Biology, 40: 441-470, 1989).
  • the Arabidopsis thaliana mutant AHAS gene confers resistance to the herbicide chlorsulfuron upon transformation into tobacco (Haughn et al., Molecular and General Genetics, 211: 266-271, 1988).
  • AHAS genes from Arabidopsis thaliana (Smith et al., Proceedings of the National Academy of Science, USA, 86: 4179-4183, 1989), Nicotiana tabacum (Kim and Chang, Journal of Biochemistry and Molecular Biology, 28: 265-270, 1995), and Brassica napus (Wiersma et al., Molecular and General Genetics, 224: 155-159, 1990) have been used to complement AHAS-deficient bacteria such as Escherichia coli and Salmonella typhimurium.
  • plant-derived sequences such as AHAS known to complement bacterial deficiencies can be placed under the control of plant promoters known to be transcriptionally active in bacteria.
  • Jacob et al., Transgenic Research, 11: 291-303, 2002
  • the potato (Solanum tuberosum) AHAS gene. This gene can be used in the manner described above to provide a bacterial selectable marker gene to maintain vectors in bacteria.
  • Primer R 5'CAACGGCAAACTAGACAGATAGAA3'
  • a polymerase chain reaction was then performed with high fidelity Pwo polymerase with primers Q and R to amplify a fragment using genomic DNA from potato cultivar 'Iwa' as a template.
  • This product was A-tailed, and ligated into pGemT (Promega) following the manufacturer's instructions.
  • the cloned AHAS allele was then sequenced using primers based on the consensus sequence anchored about every 400 bp along the cloned fragment.
  • the following sequence for the coding region of a potato cultivar 'Iwa' AHAS allele was obtained:
  • Preferred intragenic vectors of the invention comprise an origin of replication that functions in E. coli and Agrobacterium tumefaciens.
  • Plant derived bacterial origins of replication in this example are based on the smallest known prokaryotic replication origins of Colicin E plasmids (ColE plasmids), specifically ColE2-P9 (from Shigella sp.) and ColE3-CA38 (from E. coli).
  • the minimal replication origins of these plasmids, named COLE2 and COLE3, require only 1 specific factor (Rep) to be provided in trans.
  • Plasmids pBX243 and pBX343 provide Rep in trans for ColE2 and ColE3 respectively.
  • ColE2 and ColE3 origin sequences. There are 2 differences between ColE2 and ColE3 origin sequences, one mismatch and a deletion of a single nucleotide in ColE2 (or an insertion in ColE3).
  • the deletion/insertion, not the mismatch, is responsible for determining the plasmid specificity in the interaction of the origins with the trans-acting factors.
  • Characteristic features of these sequences are two direct repeat sequences of 7 bp (5'CAPuATAA) or of 9 bp (APyCAPuATAA) which are separated from each other by 7 bp or 5 bp in ColE2 and by 8 bp or 6 bp in ColE3.
  • ColE2 AGACCAGATAAGCCT TATCAGATAACAGCGCC (SEQ ID NO:11)
  • ColE3 AGACCAAATAAGCCTATATCAGATAACAGCGCC (SEQ ID NO: 12)
  • Consensus ColE2 AGAgCAJATAAGCCT TA CAJATAACAGCgCC
  • Consensus ColE3 AGAg ⁇ CA
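The two stated differences between the origins, and the spacing of the direct repeats, can be confirmed directly from the ColE2 and ColE3 sequences listed above. In this minimal sketch the gap shown as a space in the ColE2 line is written as `-`; the function names are hypothetical, and only the 7 bp repeat form (CAPuATAA) is examined.

```python
import re

# Origin sequences as listed in the text, with the ColE2 gap written as "-".
COLE2_ALIGNED = "AGACCAGATAAGCCT-TATCAGATAACAGCGCC"
COLE3_ALIGNED = "AGACCAAATAAGCCTATATCAGATAACAGCGCC"


def differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based column, base in a, base in b) wherever a and b differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


def repeat_spacing(origin: str) -> list[int]:
    """Number of bases separating successive CAPuATAA repeats."""
    seq = origin.replace("-", "")
    hits = [m.start() for m in re.finditer("CA[AG]ATAA", seq)]
    return [nxt - (prev + 7) for prev, nxt in zip(hits, hits[1:])]


print(differences(COLE2_ALIGNED, COLE3_ALIGNED))
print(repeat_spacing(COLE2_ALIGNED), repeat_spacing(COLE3_ALIGNED))
```

The comparison reports one mismatch (column 7) and one single-nucleotide indel (column 16), and the 7 bp repeats are separated by 7 bp in ColE2 and 8 bp in ColE3, in agreement with the description.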
  • the ColE2 consensus sequence was used to search publicly available potato (Solanum tuberosum) DNA sequences.
  • the potato COLE2-like replication sequence POTCOLE2 was constructed in silico from two sequences, accessions: SGN U254575 nucleotides 359-721 correspond to POTCOLE2 1-363; TIGR EST494490 nucleotides 248-693 correspond to POTCOLE2 362-807.
  • the POTCOLE2 replication sequence is underlined.
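Joining two database sequences in silico through a short shared overlap, as described for POTCOLE2, can be sketched as follows. The function names are hypothetical and the example strings are toy sequences, not the accessions themselves; only the coordinate ranges are taken from the text.

```python
def join_with_overlap(first: str, second: str, overlap: int) -> str:
    """Join two fragments whose ends share `overlap` identical bases."""
    if overlap and first[-overlap:] != second[:overlap]:
        raise ValueError("fragments do not agree over the stated overlap")
    return first + second[overlap:]


def joined_length(range_a: tuple[int, int], range_b: tuple[int, int], overlap: int) -> int:
    """Length of the composite built from two inclusive 1-based ranges."""
    len_a = range_a[1] - range_a[0] + 1
    len_b = range_b[1] - range_b[0] + 1
    return len_a + len_b - overlap


# Source ranges quoted in the text map to POTCOLE2 1-363 and 362-807,
# so the two fragments overlap by 2 nucleotides.
print(joined_length((359, 721), (248, 693), overlap=2))

# Toy illustration of the join itself.
print(join_with_overlap("ACGTAC", "ACGGT", overlap=2))
```

The length check gives 807, matching the stated size of the synthesised POTCOLE2 fragment.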
  • the 807 bp POTCOLE2 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTCOLE2).
  • Primer S 5'GTGTCGACAACTACGATACGG3' (SEQ ID NO:14)
  • Primer T 5'CGTAAGCTTGAACGAATTCTTAG3' (SEQ ID NO:15)
  • Nucleotides underlined represent a SalI site in primer S and represent HindIII and EcoRI sites in primer T.
  • a 1661 bp fragment with the spectinomycin resistance gene was PCR amplified from pART27 using high fidelity Pwo polymerase. This fragment was ligated as a SalI to HindIII region into pUC57POTCOLE2 to give pUC57POTCOLE2SPEC and position the spectinomycin resistance gene immediately adjacent to the POTCOLE2 fragment.
  • the fragment corresponding to the spectinomycin resistance gene and the POTCOLE2 was isolated as a 2.5 kb EcoRI fragment from pUC57POTCOLE2SPEC and self-circularised to generate pPOTCOLE2SPEC.
  • the ligation was transformed into E. coli DH5α harbouring helper plasmid pBX243 (with an ampicillin resistance gene) and transformants were selected on L plates supplemented with ampicillin and spectinomycin (100 µg/mL). Resulting colonies were picked, plasmid DNA isolated and analysed by restriction enzyme digest using BamHI and EcoRI. A BamHI/EcoRI double digest will release the Rep gene from pBX243 and will linearise pPOTCOLE2SPEC.
  • Beta vulgaris AGGCCAAATAAGCCT/TATCAGATAACAGCGCC (SEQ ID NO:
  • Theobroma cacao AGACCAAATAAGACTTA/TCAGATAACAGCACG (SEQ ID NO: 1
  • Vitis vinifera AGATCAGATAAGCCTTTA/TCAGATAACAGCCCC (SEQ ID NO:30) CF207293 /CF515867
  • ORT Operator-Repressor Titration
  • E. coli ORT strain DH1lacdapD (genotype recA endA1 gyrA96 thi1 hsdR17 supE44 relA1 Δ(dapD)::kan hipA::lac-dapD) contains a chromosomal conditionally essential gene dapD under the control of the lac operator/promoter system. Under normal conditions, a repressor protein encoded by a second chromosomal gene binds to the chromosomal lac operator and prevents transcription of dapD, and cells lyse. Growth is permitted when an inducer (IPTG) is provided, i.e. on a nutrient agar plate.
  • growth is also permitted when a plasmid containing a lac operator sequence is introduced into the cell.
  • the repressor protein binds to the plasmid-borne operator sequence, derepressing the chromosomal operator and allowing dapD expression.
  • LacO1 is 21 bp and is derived from the wild-type E. coli lac operon.
  • LacO is 20 bp and is an 'ideal' version of LacO1, being a perfect palindrome of the first 10 bp of LacO1.
  • LacO1 AATTGTGAGCGGATAACAATT (SEQ ID NO:39)
  • LacO AATTGTGAGCGCTCACAATT (SEQ ID NO:40)
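The relationship between the two operator sequences listed above can be verified with a few lines of Python. This is a small sketch with hypothetical helper names: it rebuilds the 'ideal' 20 bp operator from the first 10 bp of the 21 bp wild-type sequence and its reverse complement, and checks that the result is its own reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

LACO1 = "AATTGTGAGCGGATAACAATT"  # 21 bp wild-type operator (SEQ ID NO:39)
LACO = "AATTGTGAGCGCTCACAATT"    # 20 bp 'ideal' operator (SEQ ID NO:40)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def ideal_operator(wild_type: str, half: int = 10) -> str:
    """Build a perfect palindrome from the first `half` bases."""
    left = wild_type[:half]
    return left + reverse_complement(left)


print(ideal_operator(LACO1) == LACO)      # rebuilt operator matches LacO
print(LACO == reverse_complement(LACO))   # LacO is a perfect palindrome
print(LACO1[-11:])                        # last 11 bp of LacO1
```

Because the two operators share their first 10 bp, a database search with that half-site can serve for both, while the last 11 bp distinguish the wild-type form.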
  • NCBI accession numbers that have sequences identical to at least 10 bp that comprise one of the two inverted repeats that make up LacO:
  • NCBI GenBank http://www.ncbi.nlm.nih.gov/BLAST/
  • TIGR database http://tigrblast.tigr.org/tgi/
  • the 21 bp LacO1 sequence is identical in its first 10 bp to LacO.
  • the following list gives accession numbers where at least the last 11 bp of the LacO1 sequence, GGATAACAATT, are found: Dicotyledonous plants: Chenopodiaceae: Beta vulgaris CX779649, CF542856; Compositae: Lactuca sativa BU004821, BU008839; Helianthus annuus BU671786, BQ965452; Convolvulaceae: Ipomoea nil BJ567255; Cruciferae: Brassica rapa CV433907, CV432343; Brassica napus CX195012, CD838296; Raphanus sativus AF051115; Cucurbitaceae: Citrullus lanatus AI563425; Leguminosae: Cicer arietinum CK148974; Glycine max CO036432, CX709893; Medicago truncatula BQ144942; Phaseolus
  • Potato lacO1 sequence as a recombinant plasmid selectable element
  • the 21 bp lacO1 sequence was used to search publicly available potato (Solanum tuberosum) EST sequences. Matching sequences were found in NCBI accessions CV501815 and CK259105, which were joined in silico, with BglII restriction enzyme recognition sites (agatct) added to the termini, to make the 693 bp POTLACO1: agatctAATATTTACTTCTCCACTTAAACAAATACCCCAATCAGAATCACTAGCTGGCAGAT TCCTTGTCCTCTATTGACAGCAAACATAGACGTACATTATAGAGCCACCACAACATTAGACA ⁇ AACATTCTTTAAACAAGAGGTGGATACTGCTTAGACTGCAGGCACCCTCTTTCGGTACTC CAGAACATCCTGAATAAACATATGATACCCTTCAGTTTGGGCAGGATCAGCAGGGTTTGGCT GATCTAACAAGTCCTGGATACCAACCAGTATCTGTTTCACGGTG
  • The lacO1 sequence is underlined. The first nucleotide of CK259105 is underlined and in bold. The terminal BglII restriction enzyme sites (agatct in lowercase) are not of potato sequence origin.
  • the 693 bp POTLACO1 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTLACO1).
  • POTLACO1 was excised from pUC57POTLACO1 with BglII and ligated into pBR322 previously linearised with BamHI.
  • the resulting plasmid pBR322POTLACO1 was transformed into E. coli strain DH1lacdapD and colonies were selected using repressor titration. Plasmid DNA was isolated from selected colonies and digested with restriction enzyme PstI (see Figure 2).
  • Linearised pBR322 is visualised as a band at 4.4 kb.
  • PstI-digested pBR322POTLACO1 is visualised as two bands, one at 1.3 kb and one at 3.8 kb. The results indicate that POTLACO1 functions as a plasmid selectable element.
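  • The band sizes reported above can be sanity-checked with simple arithmetic. The sketch below assumes the published pBR322 size of 4361 bp (not stated in the text) and treats the 693 bp synthesised insert as an upper bound, since a few base pairs are lost when the BglII ends are cut and ligated; the 5% tolerance and the function name are arbitrary choices for illustration.

```python
PBR322_BP = 4361    # published size of pBR322 (assumption, not from the text)
POTLACO1_BP = 693   # as synthesised, including the terminal BglII sites


def bands_match(bands_kb, expected_bp, tolerance=0.05):
    """True if the summed band sizes are within `tolerance` of expected_bp."""
    observed_bp = sum(bands_kb) * 1000
    return abs(observed_bp - expected_bp) / expected_bp <= tolerance


# Upper bound on the size of the recombinant plasmid.
recombinant_bp = PBR322_BP + POTLACO1_BP

linear_vector_ok = bands_match([4.4], PBR322_BP)
recombinant_ok = bands_match([1.3, 3.8], recombinant_bp)
```

  • The two PstI bands sum to about 5.1 kb, close to the expected size of roughly 5.05 kb. Two bands rather than one are also consistent with the insert contributing a second PstI site (a CTGCAG motif is visible in the POTLACO1 sequence shown above) in addition to the single PstI site of pBR322.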
  • ALLLACO and ALLLACO1 have also been made in silico.
  • NCBI accessions CF448121 and CF450773 were used to generate a 756 bp ALLLACO (lacO sequence underlined):
  • NCBI accessions CF448121 and CF449604 were used to generate a 662 bp ALLLACO1 (lacO1 sequence underlined):
  • lacO1-like sequences will also function as plasmid selectable elements. For example, the following sequence was also found.
  • Example 6 Design and construction of an intragenic vector for Arabidopsis thaliana
  • the consensus T-DNA border sequence can be defined as: 5' GGCAGGATATATXXXXXTGTAAXX 3'. Although other variants can include:
  • T-DNA border is remarkably similar to authentic T-DNA borders from Agrobacterium
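  • A search for sequences matching the consensus 5' GGCAGGATATATXXXXXTGTAAXX 3' can be sketched with a regular expression. This is a minimal illustration, not the search procedure used in the examples: it assumes that X stands for any nucleotide, it scans both strands, and the function name and toy sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Consensus with each X read as "any nucleotide" (an assumption): 24 nt in total.
BORDER = re.compile(r"GGCAGGATATAT[ACGT]{5}TGTAA[ACGT]{2}")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_border_like(seq: str):
    """Return (start, end, strand) hits using 1-based forward-strand coordinates.

    Minus-strand hits are reported high-to-low, in the same style as a
    border quoted as 60629-60606.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start() + 1, m.end(), "+") for m in BORDER.finditer(seq)]
    for m in BORDER.finditer(reverse_complement(seq)):
        hits.append((n - m.start(), n - m.end() + 1, "-"))
    return hits


# Hypothetical 32 nt test sequence with one border-like motif at 5-28.
toy = "TTTT" + "GGCAGGATATATTGTGGTGTAAAC" + "CCCC"
```

  • Calling find_border_like(toy) reports a single plus-strand hit; the same motif presented on the opposite strand is reported with descending coordinates.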
  • the A. thaliana "T-DNA border” is from an open reading frame (nucleotides 59676-63206 from AL138652) for a putative protein of unknown function [i.e. no promoters and presumably not a heterochromatic region]. Examination of sequences flanking this "T-DNA border” reveal a 2838 bp fragment (nucleotides 59735-62572 from AL138652) with several unique restriction sites suitable as potential insertion sites for other genes and Southern analysis of plants transformed using this vector.
  • if the T-DNA border found at nucleotides 60629-60606 is considered the "left border" of a binary vector, there are several unique restriction sites, including XbaI, between this left border and the first three nucleotides equivalent to a right border at positions 59735-59737.
  • the right border beyond these three nucleotides can be provided by authentic right border sequences of non-plant origin, thereby resulting in a "chimeric right border".
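  • The coordinate ranges quoted for AL138652 can be cross-checked with inclusive-interval arithmetic. The sketch below only re-derives lengths and nesting from the numbers given in the text; the helper names are illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range, in either orientation."""
    return abs(end - start) + 1


def contains(outer, inner) -> bool:
    """True if every endpoint of `inner` lies within `outer`."""
    low, high = sorted(outer)
    return all(low <= position <= high for position in inner)


ORF = (59676, 63206)                 # open reading frame
FRAGMENT = (59735, 62572)            # fragment used in the vector
LEFT_BORDER = (60629, 60606)         # reverse orientation, hence high-to-low
RIGHT_BORDER_START = (59735, 59737)  # first three right-border nucleotides

fragment_bp = span_length(*FRAGMENT)
left_border_bp = span_length(*LEFT_BORDER)
```

  • The fragment works out at 2838 bp and the left border at 24 nt, and both lie inside the open reading frame, in line with the statement that the region contains no promoters.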
  • Primer A 5'CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC3' (SEQ ID NO:
  • Primer B 5'AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC3' (SEQ ID NO:51)
  • Primer C 5'AAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC3' (SEQ ID NO:52)
  • Primer D 5'GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG3' (SEQ ID NO: 1
  • the 2864 bp SalI to XhoI fragment of pPROEX-AtTD was ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form pTC1.
  • the orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing.
  • the full sequence of pTC1 is shown below and comprises a 2838 bp DNA fragment derived from Arabidopsis thaliana (nucleotides 59735-62572 from AL138652) presented in italics.
  • the right and left T-DNA borders are in bold and the unique XbaI site used for subsequent cloning is in bold and underlined.
  • CTTGGTGTAT CCAACGGCGT CAGCCGGGCA GGATAGGTGA AGTAGGCCCA CCCGCGAGCG 181 GGTGTTCCTT CTTCACTGTC CCTTATTCGC ACCTGGCGGT GCTCAACGGG AATCCTGCTC
  • SEQ ID NO:54. A mutant form of the Arabidopsis thaliana acetohydroxyacid synthase gene conferring resistance to sulfonylurea herbicides such as chlorsulfuron was inserted into the T-DNA of pTC1.
  • the 5.8 kb XbaI fragment from pGH1 (Haughn et al. 1988, Molecular and General Genetics 211: 266-271) was ligated into the unique XbaI site between the left and right T-DNA borders of pTC1 to produce pTCAHAS. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing.
  • the pTCAHAS binary vector was transformed into the disarmed Agrobacterium tumefaciens strain EHA105 (Hood et al 1993, Transgenic Research, 2:208-218), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877).
  • Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin and used to transform Arabidopsis thaliana 'Columbia' using the floral dip method (Clough and Bent, Plant Journal 16: 735-743, 1998).
  • the resulting self-pollinated seed was screened in vitro on half-strength MS salts (Murashige and Skoog 1962, Physiologia Plantarum, 15: 473-497) supplemented with 10 µg/L chlorsulfuron. Seeds were also sown on a standard potting mix in a greenhouse and the germinated seedlings at the 3-4 true leaf stage were sprayed with a standard application of Glean (active ingredient chlorsulfuron) at a rate equivalent to 20 g/ha.
  • the recovered chlorsulfuron-resistant seedlings were confirmed as being transformed with the intragenic vector pTCAHAS by polymerase chain reactions on their genomic DNA across the junctions of the two XbaI sites joining the original T-DNA of pTC1 to the 5.8 kb XbaI fragment inserted to form pTCAHAS.
  • the following primers were used:
  • Primer E 5'CATCCACTGCATAGTTCCC3' (SEQ ID NO:55)
  • Primer F 5'GATGCGTTGATCTCTTCATCA3' (SEQ ID NO:56)
  • Primer G 5'TCAACATCAATCCGAGTACG3' (SEQ ID NO:57)
  • Primer H 5'AGAGATTGTGGACCGAGGAG3' (SEQ ID NO:58)
  • the expected 643 bp DNA fragment was PCR amplified from the binary vector pTCAHAS and three A. thaliana lines transformed with pTCAHAS using primers E+F designed to flank the XbaI site inside the right T-DNA border.
  • the expected 149 bp DNA fragment was PCR amplified from the same DNA sources using primers G+H designed to flank the XbaI site inside the left T-DNA border.
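  • The idea of an "expected" PCR product size can be illustrated with a small in silico PCR. The sketch below locates a forward primer and the reverse complement of a reverse primer on a template and reports the distance spanned. Primers G and H are taken from the text, but the template is a hypothetical construct with arbitrary filler, sized only so that the product is 149 bp; it is not the pTCAHAS sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str):
    """Product length for exact primer matches, or None if either is absent."""
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


PRIMER_G = "TCAACATCAATCCGAGTACG"
PRIMER_H = "AGAGATTGTGGACCGAGGAG"

# Hypothetical template: primer G site, 109 nt of filler, primer H site.
toy_template = PRIMER_G + "A" * 109 + reverse_complement(PRIMER_H)
```

  • On a real template the same function would be run against the vector sequence; a result of None indicates that one of the primers has no exact match in the searched orientation.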
  • Example 2 The 1268 bp sequence illustrated in Example 2 as a T-DNA-like region of a potato (Solanum tuberosum) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57POTINV).
  • the SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPOTINV.
  • the orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
  • the pPOTINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877).
  • the Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
  • Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m⁻² sec⁻¹; 16 h photoperiod).
  • Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPOTINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on potato medium defined above and incubated under reduced light intensity (5-10 µmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
  • Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPOTINV. The following primers were used:
  • Primer I 5'GCTCACCTTGCAGCTTCACT3' (SEQ ID NO:59)
  • Primer J 5'CAGAGCTGGATTTGCATCAG3' (SEQ ID NO:60) to amplify an expected 570 bp DNA fragment from the T-DNA-like region of pPOTINV
  • Primer K 5'GATGGCAGAAGGCGAAGATA3' (SEQ ID NO:61)
  • Primer L 5'GAGCTGGTCTTTGAAGTCTCG3' (SEQ ID NO:62) as an internal control to amplify an expected 1069 bp fragment from the endogenous potato actin gene.
  • the expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV.
  • the expected 570 bp DNA fragment was PCR amplified from the binary vector pPOTINV and from two of 80 hairy root lines tested using primers I and J (Figure 4).
  • Primer N 5'GCGTCAAAGAAATA3' (SEQ ID NO:64)
  • A binary vector with a T-DNA composed of petunia DNA. The 1507 bp sequence illustrated in Example 2 as a T-DNA-like region of a petunia (Petunia hybrida) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57PETINV).
  • the SalI fragment encompassing the T-DNA composed of petunia DNA from pUC57PETINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPETINV.
  • the orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
  • the pPETINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877).
  • the Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
  • Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m⁻² sec⁻¹; 16 h photoperiod).
  • Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPETINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on potato medium defined above and incubated under reduced light intensity (5-10 µmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
  • Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPETINV. The following primers were used:
  • Primer O 5'GAGATAAACAAATAGTCCGGATCG3' (SEQ ID NO:65)
  • Primer P 5OGGAGCATTTGGTGGAAATAG3' (SEQ ID NO:66) to amplify an expected 447 bp DNA fragment from the T-DNA-like region of pPETINV.
  • the same DNA samples were also used in a PCR using primers K and L designed to amplify an expected 1069 bp fragment from the endogenous potato actin gene as an internal control.
  • the expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV.
  • the expected 447 bp DNA fragment from the T-DNA-like region of pPETINV was PCR amplified from the binary vector pPETINV and from one of 85 hairy root lines tested using primers O and P (Figure 6).
  • the DNA sample from the hairy root line positive for the T-DNA from pPETINV failed to amplify a PCR product using primers M and N designed for the Agrobacterium virG gene (Figure 7).
  • a culture of this hairy root line failed to grow bacteria when incubated in LB medium.
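  • The PCR screening results for the two vectors (two positive lines of 80 for pPOTINV, one of 85 for pPETINV) can be expressed as co-transformation frequencies. The sketch below is plain arithmetic on the counts reported in the text; the function name is illustrative and no statistical inference is attempted.

```python
def positive_fraction(positive: int, tested: int) -> float:
    """Fraction of tested hairy root lines that were PCR-positive."""
    if tested <= 0:
        raise ValueError("at least one line must be tested")
    if not 0 <= positive <= tested:
        raise ValueError("positive count must lie between 0 and tested")
    return positive / tested


potinv_percent = 100 * positive_fraction(2, 80)
petinv_percent = 100 * positive_fraction(1, 85)
```

  • These counts correspond to 2.5% of lines for pPOTINV and roughly 1.2% for pPETINV.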
  • Example 2 The 1075 bp sequence illustrated in Example 2 as a T-DNA-like region of an onion (Allium cepa) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57ALLINV).
  • the SalI fragment encompassing the T-DNA composed of onion DNA from pUC57ALLINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pALLINV.
  • the orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
  • BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.
  • a fragment containing a loxP-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ111407 and BQ045786). This fragment, named POTLOXP, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the loxP-like sequence is shown in bold and light grey.
  • Nucleotides 4-402: nucleotides 17-415 of NCBI accession BQ111407
  • Nucleotides 403-653: nucleotides 298-548 of NCBI accession BQ045786
  • Nucleotides 654-655: part of EcoRV restriction enzyme site (from the potato intragenic T-DNA)
  • the designed potato loxP-like sequence has 6 nucleotide mismatches from the native loxP sequence as illustrated in bold below.
  • loxP sequence ATAACTTCGTATAGCATACATTATACGAAGTTAT SEQ ID NO:68
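  • Counting mismatches against the native 34 bp loxP site, and scanning a longer sequence for its closest window, can be sketched as follows. The native sequence is the one given for SEQ ID NO:68; the variant used to exercise the functions is hypothetical (it is not POTLOXP), and the function names are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # native loxP (SEQ ID NO:68)


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(candidate: str, reference: str = LOXP) -> int:
    """Number of positions at which two equal-length sites differ."""
    if len(candidate) != len(reference):
        raise ValueError("sites must have the same length")
    return sum(a != b for a, b in zip(candidate.upper(), reference.upper()))


def best_window(sequence: str, reference: str = LOXP):
    """Return (1-based start, mismatch count) of the closest window."""
    sequence = sequence.upper()
    width = len(reference)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the reference site")
    scored = [
        (mismatches(sequence[i:i + width], reference), i + 1)
        for i in range(len(sequence) - width + 1)
    ]
    count, start = min(scored)
    return start, count


# loxP is two 13 bp inverted repeats around an 8 bp spacer.
left_arm, spacer, right_arm = LOXP[:13], LOXP[13:21], LOXP[21:]

# Hypothetical variant differing from loxP at its first and last positions.
variant = "GTAACTTCGTATAGCATACATTATACGAAGTTAC"
```

  • Only the forward strand is scanned here; a fuller search would also scan the reverse complement of the query sequence.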
  • the 655 bp POTLOXP sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α.
  • the DNA sequence of the 2316 bp SalI fragment comprising the potato derived T-DNA region in pPOTLOXP2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTLOXP regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 2005-2028. Restriction sites illustrated in bold represent those used in cloning the POTLOXP regions into pGEMTPOTINV. Unique restriction sites in pPOTLOXP2 for cloning between POTLOXP sites are:
  • Plasmid was isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 °C, then DNA sequenced across the SalI region inserted into pGEMT. The resulting sequence from two independent cultures is illustrated below and confirms that recombination is base pair faithful through the remaining POTLOXP site in plasmid preparations. Only the nucleotides in italics are not part of the potato genome sequences. The remaining POTLOXP region is shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1169-1192. Restriction sites illustrated in bold represent those remaining from cloning the POTLOXP regions into pPOTINV.
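  • Site-specific recombination between two directly repeated recognition sites removes the intervening DNA together with one copy of the site, leaving a single site behind. The sketch below models that outcome on a string and relates it to the border coordinates quoted above. It is a simplification: the native loxP sequence stands in for the POTLOXP site, whose sequence is not reproduced here, and the toy construct and function name are hypothetical.

```python
def excise_between_direct_repeats(sequence: str, site: str) -> str:
    """Delete everything from the first copy of `site` up to the second copy.

    One copy of the site remains. If fewer than two copies are present,
    the sequence is returned unchanged.
    """
    first = sequence.find(site)
    if first == -1:
        return sequence
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence
    return sequence[:first] + sequence[second:]


SITE = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # stand-in recognition site

# Hypothetical construct: flank, site, 20 nt cargo, site, flank.
toy = "GG" * 5 + SITE + "TTTT" * 5 + SITE + "CC" * 5
recombined = excise_between_direct_repeats(toy, SITE)

# Shift of the right border reported in the text (2005-2028 to 1169-1192).
border_shift = 2005 - 1169
implied_fragment_bp = 2316 - border_shift
```

  • The right border moves by 836 bp at both ends, which, if nothing else changed, implies that the 2316 bp SalI fragment is reduced to about 1480 bp after recombination; that figure is derived here and is not stated in the text.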
  • Medicago truncatula (barrel medic) loxP-like sequence designed from 2 ESTs
  • the barrel medic loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
  • Picea (spruce) loxP-like sequence designed from 2 ESTs
  • the spruce loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold)
  • the maize loxP-like site has 6 nucleotide mismatches from the native loxP sequence (illustrated above in bold)
  • BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.
  • a fragment containing a frt-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ513657 and BG098563). This fragment, named POTFRT, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the frt-like sequence is shown in bold and light grey.
  • the designed potato frt-like sequence has 5 nucleotide mismatches from the native frt sequence as illustrated in bold below.
  • the 185 bp POTFRT sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
  • POTFRT was cloned twice into the T-DNA composed of potato DNA residing in the plasmid pGEMTPOTINV (described in Example 9), firstly as an EcoRI to AvrII fragment, then subsequently as a BfrI to BamHI fragment. The POTFRT inserts were verified using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTFRT2.
  • the DNA sequence of the 1432 bp SalI fragment comprising the potato derived T-DNA region in the resulting pPOTFRT2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at
  • Restriction sites illustrated in bold represent those used to clone the POTFRT regions into pGEMTPOTINV.
  • POTFRT sites are:

Abstract

The invention provides plant transformation vectors, methods for transforming plants and transformed plants. The vectors are designed to avoid incorporation of non-plant DNA into the plant genome. One embodiment of the invention includes use of plant sequences selected from sequences functioning as T-DNA border sequences, selectable markers, recombinase recognition sequences and origins of replication. Another embodiment includes use of additional plant sequences contiguous to border sequences. A further embodiment includes use of vectors having border sequences where the border DNA to be transferred forms part of a plant sequence.

Description

TRANSFORMATION VECTORS
BACKGROUND ART
Over the past 20 years rapid scientific advances in molecular and cell biology have resulted in the development of technology to enable genetic engineering of plants (development of transformed plants, transgenic plants or GMOs). This offers new opportunities for the incorporation of genes into crop plants and represents a new technology platform for the next level of genetic gain in crop breeding.
An option provided by genetic engineering is the ability to extend the germplasm base available for crop improvement to any source of DNA, including that from other plants, microbes or animals. However this cross-species transformation has raised ethical concerns with the public, especially when associated with food.
As this technology develops further, more genes are being identified from crop species which would be of benefit to agriculture and industry if they were transferred to other genotypes of the same crop, i.e. within species transformation. The use of such "within-species transformation" approaches for moving genes between genotypes within the existing gene pools available to plant breeders also has several advantages over traditional breeding:
1. Direct gene transfer to elite plants and cultivars without repeated backcrossing. This allows the efficient development of new plant lines without the many generations of hybridisation and selection usually required to recover the desired plant.
2. The transfer of single discrete genes, without the "linkage drag" associated with the transfer of many undefined and often undesirable neighboring genes in traditional plant breeding.
3. The specific design and development of new gene formulations. This can involve the matching of molecular switches (promoters) with the desired coding regions to target the expression of the new gene at a specific location within a plant. Alternatively, "reverse genetics" approaches can be used to "knock-out" specific functions in plants. This can be achieved by positioning the coding region of a gene in the reverse orientation, relative to the promoter responsible for "turning the gene on" or components of the coding region arranged in an inverted repeat under control of the promoter.
In addition, moving genes between plants of the same species does not raise the same ethical concerns as cross-species transformation.
The application of genetic engineering requires the use of vectors for either Agrobacterium-mediated transformation or direct DNA uptake into plant cells. Agrobacterium-mediated transformation is the preferred method and requires the construction of modified T-DNA (transferred-DNA) on a vector (usually a binary vector).
However, the transformation requires the use of vector systems based on DNA sequences from other species (e.g. the T-DNA border regions, the DNA region into which target genes are inserted, selectable marker genes and sequences allowing such vectors to replicate in additional host systems); these sequences have usually been derived from bacterial systems.
The minimum requirement of a vector to perform Agrobacterium-mediated plant transformation is at least one T-DNA border region, although in practice transformation vector systems include other vector sequences as described above. Two T-DNA border regions are usually used, flanking the sequence of interest to be integrated into the plant genome. However, in most instances such border sequences or parts thereof also become integrated into the genome of the transformed plant.
T-DNA sequences have been identified as naturally occurring in the genomes of plants (White et al 1983, Nature 301: 348-350; Furner et al 1986, Nature 319: 422-427; Aoki et al 1994, Molecular and General Genetics 243: 706-710; Suzuki et al 2002, Plant Journal 32: 775-787). Plant transformation vectors in which the Agrobacterium borders are replaced with plant derived T-DNA border-like sequences have also been reported (WO 03/069980). If the T-DNA border-like sequences are chosen from a plant of the species to be transformed, this allows for the possibility of production of plants transformed with only their own DNA. However, in practice integration is relatively unpredictable and often results in integration of other vector sequences from outside of the T-DNA borders, and even transfer of the whole transformation vector, which includes many additional non-plant sequences.
It is an object of the invention to provide improved compositions and methods for plant transformation which reduce or eliminate the transfer of foreign DNA into the plant, or at least provide the public with a useful choice.
SUMMARY OF INVENTION
In one aspect the invention provides a plant transformation vector comprising: a) a T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising two polynucleotide sequence fragments, wherein all of the sequences of the T-DNA-like sequence are derived from plant species. Also possible but less preferred is use of a similar T-DNA border-like sequence containing three or more polynucleotide sequence fragments derived from plant species.
In a further aspect the invention provides a plant transformation vector comprising a) a T-DNA-like sequence including at least one T-DNA border-like sequence b) additional plant polynucleotide sequence on one or both sides of the T-DNA-like sequence in which all of said sequences are derived from plants, preferably from the same plant species.
Preferably the additional plant polynucleotide sequence is 5' to the left border when two T-DNA border-like sequences are used, or 5' to the single T-DNA border-like sequence when a single T-DNA border-like sequence is used.
In a preferred embodiment the said additional plant polynucleotide sequence is at least about 1 bp in length, preferably at least about 5 bp, preferably at least about 10 bp, preferably at least about 50 bp, preferably at least about 100 bp, preferably at least about 200 bp, preferably at least about 500 bp, more preferably at least about 1 kb.
In a preferred embodiment the T-DNA-like sequence includes two T-DNA border-like polynucleotide sequences flanking the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plants, preferably from the same plant species. In a further embodiment the T-DNA-like sequence further comprises additional base polynucleotide sequence(s), the additional base polynucleotide sequence(s) being derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the T-DNA-like sequence includes first and second recombinase recognition site sequences, wherein all of said sequences are derived from plants, preferably from the same plant species.
In a further embodiment the first recombinase recognition site and the second recombinase recognition site are loxP-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the first recombinase recognition site and the second recombinase recognition site are frt-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
In one embodiment the vector comprises a selectable marker sequence flanked by the first and second recombinase recognition site sequences. Preferably the selectable marker is operably linked to a constitutive promoter sequence. Preferably the selectable marker and/or the constitutive promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the vector comprises a recombinase sequence flanked by the first and second recombinase recognition site sequences. Preferably the recombinase is operably linked to an inducible promoter sequence. Preferably the recombinase and/or inducible promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In preferred vectors, when the recombinase recognition sites are loxP-like sequences, the recombinase sequence is Cre, and when the recombinase recognition sites are frt-like sequences, the recombinase sequence is FLP.
Alternatively a negative selection marker may be flanked by the first and second recombinase recognition site sequences. Preferably the negative selection marker is CodA.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, contain regulatory elements, such as promoters, which may influence the expression of inserted genes of interest.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, contain introns, which may influence the expression of inserted genes of interest.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence are derived from heterochromatic regions of the genome from which they are derived.
In a further embodiment the polynucleotide encompassing the T-DNA border-like sequences, the base polynucleotide sequence of the T-DNA-like sequence and the plant polynucleotide sequence additional to the T-DNA-like sequence are constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
In a further embodiment the plant transformation vector of the invention further comprises an origin of replication sequence. Preferably the origin of replication sequence is derived from a plant, preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the T-DNA-like sequence of the plant transformation vector of the invention comprises a selectable marker polynucleotide sequence for selection of a plant cell or plant harbouring the T-DNA-like sequence. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the plant transformation vector of the invention further comprises a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the selectable marker polynucleotide sequence for selection of a plant harbouring the T-DNA-like sequence also functions in selection of a bacterium harbouring the vector.
In a further embodiment the T-DNA-like sequence further comprises a genetic construct as herein defined. Preferably the genetic construct comprises a promoter polynucleotide sequence operably linked to a polynucleotide sequence of interest and a terminator polynucleotide sequence, wherein all of said polynucleotide sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a preferred embodiment the polynucleotide sequence of the entire vector is derived from plant species, preferably from the same plant species.
In a further aspect the invention provides a plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GR-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GR-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
In a further aspect the invention provides a plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRC-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
In a further aspect the invention provides a plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRCA-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
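By way of illustration only, the requirement that a plant-derived sequence of at least about 20 bp supplies the 5'-GR-3', 5'-GRC-3' or 5'-GRCA-3' start of a chimeric border can be sketched as a simple scan for candidate junction points. The function name `junction_candidates` and its parameters are hypothetical, and the sketch assumes that the 20 bp minimum is counted up to and including the motif; it does not assess whether a resulting chimeric border would be functional.

```python
import re

def junction_candidates(plant_seq, motif="GRCA", min_len=20):
    """Return the end positions (exclusive) of motif occurrences that leave at
    least min_len bp of plant sequence up to and including the motif.
    Illustrative helper only; R is read as G or A, as defined in the text."""
    iupac = {"R": "[AG]"}
    pattern = "".join(iupac.get(base, base) for base in motif)
    seq = plant_seq.upper()
    hits = []
    # A lookahead is used so that overlapping occurrences are all reported.
    for m in re.finditer(f"(?=({pattern}))", seq):
        end = m.start() + len(motif)
        if end >= min_len:
            hits.append(end)
    return hits

# Hypothetical plant-derived sequence, invented for this example.
example = "GGCA" + "T" * 16 + "GACA"
print(junction_candidates(example))
```

Each reported position marks where the plant-derived sequence could end, so that the motif forms the 5' end of the border sequence that follows it.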
In one embodiment, the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
Preferably the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences. Alternatively the first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences.
In a preferred embodiment the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants.
In a more preferred embodiment, the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
Preferably the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
In a preferred embodiment, the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
Preferably the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
Preferably the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
In a preferred embodiment the T-DNA-like sequence of the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
Preferably all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from plant species.
More preferably all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from plant species which are interfertile.
Most preferably all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from the same plant species.
In a further aspect the invention provides a plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GR-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GR-3' from a) forms the 5' end of the border sequence.
In a further aspect the invention provides a plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRC-3' from a) forms the 5' end of the border sequence.
In a further aspect the invention provides a plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRCA-3' from a) forms the 5' end of the border sequence.
Preferably the plant-derived sequence of at least 20 bp in length is at least about 50 bp in length, more preferably at least about 100 bp in length, more preferably at least about 200 bp in length, more preferably at least about 500 bp in length, most preferably at least about 1 kb in length.
In a preferred embodiment the plant transformation vector includes, 5' to the border sequence, first and second recombinase recognition sequences derived from plant species. Preferably the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences.
Alternatively the first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences.
In a further embodiment the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
Preferably the selectable marker sequence is derived from plants.
In a further embodiment the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences, of the plant transformation vector, are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plant species.
In a further embodiment the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
In a further embodiment the plant transformation vector includes, 5' to the border sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide sequence, wherein the selectable marker sequence is derived from plant species.
In a further embodiment the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
Preferably the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the selectable marker polynucleotide sequence is also capable of functioning in selection of a bacterium harbouring the vector.
In a further embodiment the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
In a preferred embodiment all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species.
In a more preferred embodiment all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species which are interfertile.
In a yet more preferred embodiment all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from the same plant species.
In a further aspect the invention provides a plant transformation vector comprising a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide, wherein the selectable marker sequence is derived from plant species.
In a further aspect the invention provides a plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
In one embodiment the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
In an alternative embodiment the first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences derived from plant species.
In a preferred embodiment the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants. In a further embodiment the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
In a further embodiment the plant transformation vector further comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
In a preferred embodiment the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
In a further aspect the invention provides a plant transformation vector comprising: a) an origin of replication polynucleotide sequence, and b) a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector in which all of said sequences are derived from plant species.
In one embodiment the plant transformation vector further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
In a preferred embodiment the plant transformation vector is constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plants.
In a further embodiment the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
In a preferred embodiment all of the polynucleotide sequence of the plant transformation vector is derived from plant species, more preferably from plant species which are interfertile and most preferably from the same plant species.
In a further aspect the invention provides a plant transformation vector comprising a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant. More preferably the vector also comprises an origin of replication sequence functional in bacteria, preferably in E. coli. Preferably the origin of replication sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Yet more preferably the vector further comprises a genetic construct as herein defined. Preferably the genetic construct sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the polynucleotide sequence of the entire vector is derived from plant species, most preferably from the same plant species.
In a further aspect the invention provides a method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using a transformation vector of the invention.
In a preferred embodiment any polynucleotide stably integrated into the plant cell or plant is derived from a plant. Preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed. Most preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed.
The invention also provides a method of modifying a trait in a plant cell or plant comprising: (a) transforming of a plant cell or plant with a vector of the invention, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and (b) obtaining a stably transformed plant cell or plant modified for the trait.
In one embodiment transformation is vir gene-mediated.
In a further embodiment transformation is Agrobacterium-mediated.
In an alternative embodiment transformation involves direct DNA uptake.
In a further aspect the invention provides a method for modifying a plant cell or plant, comprising: (a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by loxP-like recombinase recognition sites; (b) selecting a plant cell or plant expressing the selectable marker flanked by loxP-like recombinase recognition sites; (c) inducing the expression of the Cre gene in the plant cell or plant; (d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
In a further aspect the invention provides a method for modifying a plant cell or plant, comprising: (a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by frt-like recombinase recognition sites; (b) selecting a plant cell or plant expressing the selectable marker flanked by frt-like recombinase recognition sites; (c) inducing the expression of the FLP gene in the plant cell or plant; (d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
The invention provides a plant modified by a method of the invention.
In a preferred embodiment the plant cell or plant modified is of the same species as the vector sequence used to modify it.
The invention also provides a plant cell or plant produced by a method of the invention.
In a preferred embodiment the plant cell or plant produced is of the same species as the vector sequence used to produce it.
The invention also provides a plant tissue, organ, propagule or progeny of the plant cell or plant of the invention.
DETAILED DESCRIPTION
The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides.
As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polypeptides and polynucleotides possess biological activities that are the same or similar to those of the inventive polypeptides or polynucleotides. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 5 nucleotide positions, preferably at least 10 nucleotide positions, preferably at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq may be utilized.
Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences on line at http://www.ebi.ac.uk/emboss/align/.
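For illustration, a global alignment of the Needleman-Wunsch type and the resulting percent identity may be sketched as follows. This is a minimal sketch and not the EMBOSS needle implementation: the function names, the simple match/mismatch scores and the linear gap penalty are assumptions chosen for brevity, and identity is expressed here over the full alignment length including gap columns, whereas other programs may use a different denominator.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b with a linear gap penalty and return
    the two aligned strings ('-' marks a gap). Scores are illustrative."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    al_a, al_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            al_a.append(a[i - 1]); al_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            al_a.append(a[i - 1]); al_b.append("-"); i -= 1
        else:
            al_a.append("-"); al_b.append(b[j - 1]); j -= 1
    return "".join(reversed(al_a)), "".join(reversed(al_b))

def percent_identity(al_a, al_b):
    """Percentage of alignment columns in which both sequences carry the same base."""
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)

# Hypothetical short sequences, invented for this example.
aligned = needleman_wunsch("GATTACA", "GATTAC")
print(aligned, percent_identity(*aligned))
```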
Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.
Use of BLASTN as described above is preferred for use in the determination of sequence identity for polynucleotide variants according to the present invention.
Alternatively, variant polynucleotides of the present invention hybridize to the polynucleotide sequences disclosed herein, or complements thereof under stringent conditions.
The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency. With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30° C (for example, 10° C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84:1390). Typical stringent conditions for polynucleotide molecules of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65° C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65° C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65° C.
With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10° C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)° C.
Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
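The notion of a silent variation can be illustrated with a short sketch that translates two coding sequences using the standard genetic code and checks whether they differ at the nucleotide level while encoding the same polypeptide. The names `translate`, `is_silent_variant` and `synonymous_codons` are hypothetical, and the sketch assumes in-frame DNA sequences containing only A, C, G and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate an in-frame coding sequence; '*' denotes a stop codon."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def is_silent_variant(reference, variant):
    """True if the sequences differ but encode the same amino acid sequence."""
    return reference.upper() != variant.upper() and translate(reference) == translate(variant)

def synonymous_codons(codon):
    """All codons encoding the same amino acid as the given codon."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)

# ATG and TGG have no synonymous alternatives, consistent with the text above.
print(synonymous_codons("ATG"), synonymous_codons("TGG"))
```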
Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306). Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.
A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is at least 5 nucleotides in length. The fragments of the invention comprise at least 5 nucleotides, preferably at least 10 nucleotides, preferably at least 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide of the invention.
The term "primer" refers to a short polynucleotide, usually having a free 3' OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein.
The term "polypeptide", as used herein, encompasses amino acid chains of any length, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides of the present invention may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.
The term "isolated" as applied to the polynucleotide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant or synthetic polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The term "genetic construct" includes "expression construct" as herein defined. The genetic construct may be linked to a vector.
The term "expression construct" refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction: a) a promoter functional in the host cell into which the construct will be transformed, b) the polynucleotide to be transcribed and/or expressed, and c) a terminator functional in the host cell into which the construct will be transformed.
The term "vector" refers to a polynucleotide molecule, usually double stranded DNA, which may include a genetic construct and be used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as Escherichia coli or Agrobacterium tumefaciens.
The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
"Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, chemical-inducible regulatory elements, environment-inducible regulatory elements, enhancers, repressors and terminators.
The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.
Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.
The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.
A "transformed plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species or from a different species in which case it can also be known as a "transgenic plant".
An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,
(5')GATCTA TAGATC(3')
(3')CTAGAT ATCTAG(5')
Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
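As an illustration of this definition, the following sketch scans a sequence for inverted repeats whose two arms are separated by a spacer of 3 to 5 bp. The function names, the fixed arm length of 6 bases and the example sequence are assumptions made for the example and are not taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_inverted_repeats(seq, arm=6, min_spacer=3, max_spacer=5):
    """Return (start, spacer_length, left_arm) for every position where an arm
    is followed, after a spacer of min_spacer..max_spacer bases, by its
    reverse complement."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm - min_spacer + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        for spacer in range(min_spacer, max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break
            if seq[j:j + arm] == target:
                hits.append((i, spacer, left))
    return hits

# Arms taken from the example above, joined by a hypothetical 4 bp spacer.
print(find_inverted_repeats("GATCTACCCCTAGATC"))
```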
The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide of the invention, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The "altered expression" can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.
The term "loxP-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of a Cre recombinase recognition site. The loxP-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
A loxP-like sequence is between 24-100 bp in length, preferably 24-80 bp in length, preferably 24-70 bp in length, preferably 24-60 bp in length, preferably 24-50 bp in length, preferably 24-40 bp in length, preferably 24-34 bp in length, preferably 26-34 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
A loxP-like sequence preferably comprises the consensus motif
5' ATAACTTCGTATANNNNNNNNTATACGAAGTTAT 3'
(where N = any nucleotide), or similar sequences.
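A search for exact matches to this 34 bp consensus motif may be sketched as below. The function name `find_loxp_like` and the test sequence are hypothetical. Because the text also admits "similar sequences", an exact-match scan of this kind is stricter than the definition and would serve only as a first pass.

```python
import re

# 13 bp arm, 8 bp spacer of any nucleotides, 13 bp arm (34 bp in total).
LOXP_LIKE = re.compile(r"ATAACTTCGTATA([ACGT]{8})TATACGAAGTTAT")

def find_loxp_like(seq):
    """Return (start, spacer) for each exact match to the consensus motif."""
    return [(m.start(), m.group(1)) for m in LOXP_LIKE.finditer(seq.upper())]

# Hypothetical sequence with an arbitrary 8 bp spacer, invented for this example.
site = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"
print(find_loxp_like("CC" + site + "GG"))
```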
The term "frt-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an FLP recombinase recognition site. The frt-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
An frt-like sequence is between 28-100 bp in length, preferably 28-80 bp in length, preferably 28-70 bp in length, preferably 28-60 bp in length, preferably 28-50 bp in length, preferably 28-40 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
An frt-like sequence preferably comprises the consensus motif
5' GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC 3'
(where W = A or T, N = any nucleotide).
The term "T-DNA border-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant. The T-DNA border-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two or more sequences derived from the genome of a plant.
A T-DNA border-like sequence is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A T-DNA border-like sequence preferably comprises the consensus motif: 5'GRCAGGATATATNNNNNKSTMAWN3' (where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
The T-DNA border-like sequence of the invention is preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to any Agrobacterium-derived T-DNA border sequence.
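The identity thresholds above can be illustrated with a simple position-by-position comparison. This is a minimal sketch that assumes two sequences of equal length compared without gaps; it is not the alignment procedure of the specification, which would normally rely on an alignment tool. The reference used is the 24-nucleotide sequence formed by the 5'GAC3' and pART27-derived fragments described later in this document, serving here only as a stand-in, and the candidate sequence is hypothetical.

```python
def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Stand-in reference (see lead-in) and an invented candidate sequence.
reference = "GACAGGATATATTGGCGGGTAAAC"
candidate = "GACAGGATATATGGTACGGTAAAC"  # hypothetical plant-derived sequence

identity = percent_identity(reference, candidate)
meets_80_percent = identity >= 80.0
```

With these inputs 20 of the 24 positions agree, so the hypothetical candidate would fall within the "at least 80%" tier but not the "at least 85%" tier.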
Although not preferred, a T-DNA border-like sequence of the invention may include a sequence naturally occurring in a plant which is modified or mutated to change the efficiency at which it is capable of integrating a linked polynucleotide sequence into the genome of a plant.
The term "T-DNA-like sequence" refers to a sequence derived from a plant genome which includes at one or both ends a T-DNA border-like sequence, or a chimeric T-DNA-border-like sequence as herein defined. A T-DNA-like sequence may include additional base sequence between the T-DNA border-like sequences, or to one side of a T-DNA border-like sequence. The base sequence of the T-DNA-like sequences of the invention preferably includes restriction sites or alternative cloning sites to facilitate insertion of further polynucleotide sequences.
The term "chimeric T-DNA border-like sequence" refers to a sequence which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein part of the sequence is derived from a plant and part of the sequence is derived from another source, such as Agrobacterium.
Upon plant transformation, it is well understood that T-DNA integration from the right border is very precise. Molecular cloning and sequencing across T-DNA/plant genomic DNA junctions has repeatedly established that T-DNA integration at the right border is highly conserved, with only the first few nucleotides of the right border being integrated into plant genomes (Gheysen, G., Angenon, G., van Montagu, M., Agrobacterium-mediated plant transformation: a scientifically intriguing story with significant applications, pp. 1-33, in Transgenic Plant Research, editor Lindsey, K., Harwood Academic Publishers, Amsterdam, 1998).
For this reason, when deriving a chimeric border for use as a "right border" in intragenic transformation, it is only necessary for the first few nucleotides (up to four nucleotides) to be of plant origin; i.e. 5'GRCA...3'. The remaining DNA sequence of such right borders can be authentic sequences from Agrobacterium T-DNA borders.
It will be well understood by those skilled in the art that a DNA sequence of 5'GRCA3' will occur frequently in any genome. Any specific four-nucleotide sequence is expected to be found at random once in every 256 nucleotides (the degenerate motif 5'GRCA3', which covers two such sequences, about once in every 128), so the motif is likely to be found on any other fragment useful for the construction of vectors for plant transformation.
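The frequency estimate above can be checked with a short calculation. The sketch below assumes, purely for simplicity, that the four bases are equally likely and independent at every position, which real genomes only approximate. Under that assumption a fully specified four-nucleotide sequence such as GACA is expected once per 256 positions on a given strand, while the degenerate motif GRCA (R = G or A) covers two such sequences and is expected once per 128. The function names are illustrative.

```python
from fractions import Fraction

# Number of bases each IUPAC symbol can stand for.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "K": 2, "M": 2, "S": 2, "W": 2,
    "N": 4,
}


def match_probability(motif):
    """Probability that a random position matches the motif (uniform bases)."""
    p = Fraction(1)
    for symbol in motif.upper():
        p *= Fraction(IUPAC_CHOICES[symbol], 4)
    return p


def expected_spacing(motif):
    """Expected number of positions per chance occurrence on one strand."""
    return 1 / match_probability(motif)


spacing_specific = expected_spacing("GACA")
spacing_degenerate = expected_spacing("GRCA")
```

Either way the motif is short enough to be expected many times on any fragment of a few kilobases, which is the point made in the text.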
The term "border sequence" refers to a sequence derived from a plant which can perform the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant. A "border sequence" is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A "border sequence" preferably comprises the consensus motif:
5'GRCAGGATATATNNNNNKSTMAWN3'
(where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
The term "border sequence" as used herein includes known Agrobacterium borders, including those disclosed herein.
The term "border sequence" also includes modified versions of known Agrobacterium sequences, which have been modified, for example by substitution, addition or deletion, to improve the efficiency at which they are capable of performing the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
The terms "origin of replication derived from a plant" or "plant-derived origin of replication" or grammatical equivalents thereof refers to a sequence derived from a plant which can support replication of a vector in which it is included in a bacterium. The "plant-derived origins of replication" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived origins of replication" are composed of two sequence fragments derived from plants.
The plant-derived origin of replication may comprise the consensus motif:
AGAgCAgATAAGCCT TA|CA§ATAACAGCJCC
where R = G or A (Pu), Y = C or T (Py) and W = A or T.
Alternatively, the plant-derived origin of replication comprises the consensus motif:
AGAScA§ATAAGCCTgTA|CABATAACAGC|CC
where R = G or A (Pu), Y = C or T (Py) and W = A or T.
The terms "selectable marker derived from a plant" or "plant-derived selectable marker" or grammatical equivalents thereof refers to a sequence derived from a plant which can enable selection of a plant cell harbouring the sequence or a sequence to which the selectable marker is linked. The "plant-derived selectable markers" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived selectable markers" are composed of two sequence fragments derived from plants.
In one embodiment, the plant-derived selectable marker is at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to the sequence of SEQ ID NO: 10.
Alternatively the plant-derived selectable marker is at least 90%, preferably at least 95%, and most preferably 100% identical to SEQ ID NO:39 or SEQ ID NO:40.
Methods for transforming plant cells, plants and portions thereof with polynucleotides are described in Draper et al, 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenburg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin.; and Gelvin et al, 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London. It will be well understood by those skilled in the art that the intragenic vectors of the invention can function in the place of the binary vectors for Agrobacterium-mediated transformation and as vectors for direct DNA uptake approaches.
The invention provides novel plant derived loxP-like and frt-like recombinase recognition sequences, novel T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for producing transformed plant cells and plants, and plant cells and plants produced by the methods.
The majority of selectable markers for plant transformation are antibiotic or herbicide resistance genes; their presence in transgenic crop plants has given rise to public concerns on environmental safety.
Regardless of their origin, once a transgenic plant has been established marker genes are no longer required. Moreover it is desirable to remove the promoter and enhancer elements used to drive the expression of the marker genes as these may interfere with the expression of neighboring endogenous genes.
Previously site-specific recombination systems have been elegantly used to excise precise sequences corresponding to selectable marker constructs in transgenic plants (reviewed by Gilbertson, L. Cre-lox recombination: Cre-ative tools for plant biotechnology TRENDS in Biotechnology 21(12) 550-555 2003).
Two such recombination systems are the Escherichia coli bacteriophage P1 Cre/loxP system and the Saccharomyces cerevisiae FLP/frt system, which require only a single-polypeptide recombinase, Cre or FLP, and minimal 34 bp DNA recombination sites, loxP or frt.
When two recombination sites in the same orientation flank integrated DNA such as a selectable marker, recombinase mediates a crossover between these sites effectively excising the intervening DNA.
The recombinase enzyme can either be located next to the selectable marker gene so that it is in effect auto excised (Mlynarova, L and Nap J-P, A self-excising Cre recombinase allows efficient recombination of multiple ectopic heterospecific lox sites in transgenic tobacco, Transgenic Research, 12: 45-57, 2003), or it can be transiently expressed (Gleave, A.P, Mitra, D.S, Mudge, S.R and Morris, B.A.M. Selectable marker-free transgenic plants without sexual crossing: transient expression of cre recombinase and use of a conditional lethal dominant gene, Plant Molecular Biology, 40: 223-235, 1999).
Following excision only one recombination site remains.
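The outcome described above, in which the DNA between two sites in the same orientation is excised and a single site is left behind, can be modelled on strings. This is only a sketch of the end result, not of the enzymology; the function name, the flanking sequences and the "marker" sequence are hypothetical, and the 34 bp site used is simply one sequence that fits the loxP consensus motif given earlier.

```python
def excise_between_sites(seq, site):
    """Model recombination between the first two direct repeats of `site`.

    Everything from the end of the first site to the end of the second site
    is removed, so exactly one copy of the site remains at the junction.
    If fewer than two copies are present, the sequence is returned unchanged.
    """
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    return seq[: first + len(site)] + seq[second + len(site):]


# One 34 bp sequence fitting the loxP consensus (spacer chosen for illustration).
SITE = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

# Hypothetical integrated DNA: flank, site, selectable marker, site, flank.
left_flank, marker, right_flank = "AAAACCCC", "GGGGTTTTGGGGTTTT", "CCCCAAAA"
integrated = left_flank + SITE + marker + SITE + right_flank

after_excision = excise_between_sites(integrated, SITE)
```

In this model the marker is absent from `after_excision`, while both flanks and one copy of the site are retained.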
The applicants have identified novel T-DNA border-like sequences from plant genomes and devised improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant cell or plant.
The invention provides T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for transforming plant cells and plants, and the plant cells and plants produced by the methods.
The applicants have also identified novel plant derived loxP-like and frt-like recombinase recognition sequences from plant genomes and devised further improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant.
It will be understood by those skilled in the art that corresponding recombinase sequences can be expressed in plants in order to facilitate recombination of the loxP-like and frt-like recombinase recognition sequences of the invention.
The invention provides methods which allow for within-species or "intragenic" as opposed to transgenic transformation of plants. Vectors useful for this approach can therefore be described as intragenic vectors. The invention provides such intragenic vectors and methods of using them to produce intragenic transformed plants without any foreign DNA.
It will be understood by those skilled in the art that DNA sequences used to construct such "intragenic vectors" are preferentially derived from DNA sequences (ESTs or cDNAs) known to be expressed in plant genomes. In this manner sequences derived from heterochromatic regions, promoters or introns can be avoided. The use of such sequences for the construction of intragenic vectors may influence the subsequent expression of genes of interest following their transfer to plants via intragenic vectors. The invention provides novel T-DNA border-like sequences from several plant species (as shown in Example 1) formed by combining two to three fragments of genomic DNA, with all fragments being from a single plant species of interest or a closely related species. The common nature of such sequences in plant genomes is shown in Example 1.
The invention further provides isolated T-DNA-like sequences from several plant species as shown in Example 2. The T-DNA-like region sequences in Example 2 include the T-DNA-like sequences flanked (and delineated) by T-DNA border-like sequences (highlighted) and additional sequence on either one or both sides of the T-DNA-like sequence.
Plant-derived selectable marker sequences which are useful for selecting transformed plant cells and plants harbouring a particular T-DNA-like sequence include PPga22 (Zuo et al, Curr Opin Biotechnol. 13: 173-80, 2002), Cki1 (Kakimoto, Science 274: 982-985, 1996), Esr1 (Banno et al, Plant Cell 13: 2609-18, 2001), and dhdps-r1 (Ghislain et al, Plant Journal, 8: 733-743, 1995). It is also possible to use pigmentation markers to visually select transformed plant cells and plants, such as the R and C1 genes (Lloyd et al, Science, 258: 1773-1775, 1992; Bodeau and Walbot, Molecular and General Genetics, 233: 379-387, 1992). A preferred plant-derived selectable marker is the acetohydroxyacid synthase gene as shown in Example 6 and Example 7. Non-plant derived selectable markers are also described herein.
Preferred intragenic vectors of the invention contain a plant-derived selectable marker which functions in selection of bacteria harbouring the marker as described in Example 3 and Example 5.
The preferred intragenic vectors of the invention consist entirely of plant-derived polynucleotide sequence from the species to be transformed, or from closely related species, such as species interfertile with the plant to be transformed, considered to be within the germplasm pool accessible to traditional plant breeding. Such vectors preferably include a plant-derived origin of replication which is functional in bacteria, particularly in Agrobacterium species and preferably also in E. coli. The invention provides plant transformation vectors comprising such sequences. Preferred origin of replication sequences include those shown in Example 4. The invention provides novel loxP-like and frt-like recombinase recognition sequences from several plant species as shown in Example 9 and Example 10.
Construction of a vector of the invention is described in Example 6 and Example 8. Plant transformation using these vectors is described in Example 7 and Example 8.
Example 6 and Example 7 also illustrate the construction and successful use of a vector with a chimeric T-DNA border-like sequence. In this instance the "right border" is composed of 5'GAC3' from the end of a sequence isolated from Arabidopsis thaliana, with the remainder of the chimeric T-DNA border-like sequence, 5'AGGATATATTGGCGGGTAAAC3', being derived from the binary vector pART27 (see sequence of pTCl in Example 6). Such chimeric T-DNA border-like sequences are preferably used as the right border when two border-like sequences are used to flank the T-DNA-like sequence. When vectors with only one border-like sequence are used, the plant-derived end (e.g. 5'GRC3') of the T-DNA border-like sequence must be contiguous with the plant-derived sequence(s) destined for integration into a plant genome.
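The chimeric border described above can be assembled on paper and checked against the border consensus motif 5'GRCAGGATATATNNNNNKSTMAWN3' defined earlier. The following minimal sketch does this position by position; the two fragments and the consensus are taken from the text, while the variable and function names are illustrative only.

```python
# Degenerate symbols as defined alongside the consensus motif in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "GA", "K": "TG", "S": "GC", "M": "CA", "W": "AT",
    "N": "ACGT",
}

CONSENSUS = "GRCAGGATATATNNNNNKSTMAWN"

PLANT_END = "GAC"                        # from the Arabidopsis thaliana sequence
VECTOR_PART = "AGGATATATTGGCGGGTAAAC"    # from the binary vector pART27


def matches_consensus(seq, consensus=CONSENSUS):
    """True if seq has the consensus length and every base is permitted."""
    seq = seq.upper()
    if len(seq) != len(consensus):
        return False
    return all(base in IUPAC[symbol] for base, symbol in zip(seq, consensus))


chimeric_border = PLANT_END + VECTOR_PART
is_border_like = matches_consensus(chimeric_border)
```

The assembled 24-nucleotide sequence, GACAGGATATATTGGCGGGTAAAC, satisfies the consensus at every position, with only the first three nucleotides being of plant origin.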
The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
Further methods for isolating polynucleotides of the invention include use of all, or portions of, the disclosed polynucleotide sequences as hybridization probes. The technique of hybridizing labeled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis.
A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding further contiguous polynucleotide sequence. Such methods would include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1998, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
It will be understood by those skilled in the art that in order to produce intragenic vectors for further species it may be necessary to identify the sequences corresponding to essential or preferred elements of such vectors in other plant species. It will be appreciated by those skilled in the art that this may be achieved by identifying polynucleotide variants of the sequences disclosed. Many methods are known by those skilled in the art for isolating such variant sequences.
Variant polynucleotides may be identified using PCR-based methods (Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
Further methods for identifying variant polynucleotides of the invention include use of all, or portions of, the polynucleotides disclosed herein as hybridization probes to screen plant genomic or cDNA libraries as described above. Typically probes based on a sequence encoding a conserved region of the corresponding amino acid sequence may be used. Hybridisation conditions may also be less stringent than those used when screening for sequences identical to the probe.
The variant polynucleotide sequences of the invention may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.
An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al, Nucleic Acids Res. 25: 3389-3402, 1997.
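The five program descriptions above amount to a small lookup table keyed on the type of query, the type of database, and whether nucleotide sequences are translated before comparison. The sketch below encodes that table; it does not invoke any BLAST software, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class BlastMode(NamedTuple):
    query: str                # "nucleotide" or "protein"
    database: str             # "nucleotide" or "protein"
    translate_query: bool     # query translated in all reading frames
    translate_database: bool  # database translated in all reading frames


# Encodes the descriptions of the five programs given in the text.
BLAST_MODES = {
    "BLASTN": BlastMode("nucleotide", "nucleotide", False, False),
    "BLASTP": BlastMode("protein", "protein", False, False),
    "BLASTX": BlastMode("nucleotide", "protein", True, False),
    "tBLASTN": BlastMode("protein", "nucleotide", False, True),
    "tBLASTX": BlastMode("nucleotide", "nucleotide", True, True),
}


def choose_program(query, database, protein_level=False):
    """Pick a program name; protein_level only matters for
    nucleotide-versus-nucleotide searches (BLASTN versus tBLASTX)."""
    for name, mode in BLAST_MODES.items():
        if (mode.query, mode.database) != (query, database):
            continue
        if query == database == "nucleotide" and mode.translate_query != protein_level:
            continue
        return name
    raise ValueError("query and database must be 'nucleotide' or 'protein'")


program_for_border_search = choose_program("nucleotide", "nucleotide")
```

For short non-coding elements such as border-like or recombinase recognition sequences, a nucleotide-level comparison would be the natural choice in this table, whereas the translated modes suit searches for variants of coding sequences.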
The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.
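The relationship between an Expect value and the probability of a chance match can be made explicit. The sketch below assumes, as is conventional for BLAST statistics but not stated in the text, that the number of chance hits follows a Poisson distribution with mean E, so that the probability of at least one chance hit is 1 - exp(-E). The function name is illustrative.

```python
import math


def probability_of_chance_hit(e_value):
    """P(at least one chance hit), assuming hits are Poisson with mean E."""
    if e_value < 0:
        raise ValueError("E value cannot be negative")
    # -expm1(-E) equals 1 - exp(-E) but is more accurate for small E.
    return -math.expm1(-e_value)


p_at_e_001 = probability_of_chance_hit(0.01)
p_at_e_01 = probability_of_chance_hit(0.1)
```

For small E the probability is almost equal to E itself, which is why an E value of 0.01 or less corresponds to a chance-match probability of 1% or less.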
To identify the polynucleotide variants most likely to be functional equivalents of the disclosed sequences, several further computer based approaches are known to those skilled in the art.
Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, http://www-igbmc.u-strasbg.fr/BioInfo/ClustalW/Top.html) or T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).
Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple Em for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.
PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al, 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
The function of a variant of a polynucleotide of the invention may be assessed by replacing the corresponding sequence in an intragenic vector with the variant sequence and testing the functionality of the vector in a host bacterial cell or in a plant transformation procedure as herein defined.
Methods for assembling and manipulating genetic constructs and vectors are well known in the art and are described generally in Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al, Current Protocols in Molecular Biology, Greene Publishing, 1987.
Numerous traits in plants may also be altered through methods of the invention. Such methods may involve the transformation of plant cells and plants, using a vector of the invention including a genetic construct designed to alter expression of a polynucleotide or polypeptide which modulates such a trait in plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of the construct of the invention and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides which modulate such traits in such plant cells and plants.
A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage where/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
Transformation strategies may be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage where/when it is normally expressed. Such strategies are known as gene silencing strategies.
Direct gene transfer involves the uptake of naked DNA by cells and its subsequent integration into the genome (Conner, A.J. and Meredith, C.P., Genetic manipulation of plant cells, pp. 653-688, in The Biochemistry of Plants: A Comprehensive Treatise, Vol 15, Molecular Biology, editor Marcus, A., Academic Press, San Diego, 1989; Petolino, J. Direct DNA delivery into intact cells and tissues, pp. 137-143, in Transgenic Plants and Crops, editors Khachatourians et al., Marcel Dekker, New York, 2002). The cells can include those of intact plants, pollen, seeds, intact plant organs, in vitro cultures of plants, plant parts, tissues and cells or isolated protoplasts. Those skilled in the art will understand that methods to effect direct DNA transfer may involve, but are not limited to: passive uptake; the use of electroporation; treatments with polyethylene glycol and related chemicals and their adjuncts; electrophoresis; cell fusion with liposomes or spheroplasts; microinjection; silicon carbide whiskers; and microparticle bombardment.
Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression of the cloned polynucleotide, so desired. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive promoters used in plants include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, respond to internal developmental signals or external abiotic or biotic stresses are also described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.
Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyl transferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.
It will be understood by those skilled in the art that non-plant derived regulatory elements described above may be used in the intragenic vectors of the invention operably linked to selectable markers placed between the recombinase recognition sites.
Gene silencing strategies may be focused on the gene itself or regulatory elements which affect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.
Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention. In such constructs the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator. An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,
5'GATCTA 3' (coding strand)
3'CTAGAT 5' (antisense strand)
3'CUAGAU 5' mRNA
5'GAUCUA 3' antisense RNA
Genetic constructs designed for gene silencing may also include an inverted repeat as herein defined. The preferred approach to achieve this is via RNA-interference strategies using genetic constructs encoding self-complementary "hairpin" RNA (Wesley et al., 2001, Plant Journal, 27: 581-590).
The transcript formed may undergo complementary base pairing to form a hairpin structure. Usually a spacer of at least 3-5 bp between the repeated region is required to allow hairpin formation.
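The inverted-repeat arrangement described above can be sketched as a sense arm, a spacer and the reverse complement of the sense arm, so that the two arms of the resulting transcript can base pair. The sequences below are toy-sized and hypothetical (the six-nucleotide arm reuses the GATCTA example from the text), and the function names are illustrative; the minimum spacer length of 3 reflects the lower bound mentioned in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_inverted_repeat(target, spacer):
    """Sense arm + spacer + antisense arm, written 5' to 3'."""
    if len(spacer) < 3:
        raise ValueError("spacer should be at least 3 nucleotides long")
    return target.upper() + spacer.upper() + reverse_complement(target)


def arms_can_pair(insert, arm_len):
    """True if the first and last arm_len bases are reverse complements."""
    return reverse_complement(insert[:arm_len]) == insert[-arm_len:].upper()


target = "GATCTA"   # toy segment of the gene to be silenced
spacer = "TTCG"     # hypothetical spacer between the repeated regions
hairpin_insert = build_inverted_repeat(target, spacer)
```

Functional hairpin constructs use far longer arms than this; the sketch only shows the orientation of the two copies relative to one another.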
Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al, 2002, Science 297, 2053). Use of such small antisense RNA corresponding to polynucleotide of the invention is expressly contemplated.
The term genetic construct as used herein also includes small antisense RNAs and other such polynucleotides effecting gene silencing.
Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al, 1990, Plant Cell 2, 279; de Carvalho Niebel et al, 1995, Plant Cell, 7, 347). In some cases sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of a non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR). Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al, 2002, Plant Physiol. 128(3): 844-53; Jones et al, 1998, Planta 204: 499-505). The use of such sense suppression strategies to silence the expression of a polynucleotide of the invention is also contemplated. The polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, of the corresponding gene.
Other gene silencing strategies include dominant negative approaches and the use of ribozyme constructs (McIntyre, 1996, Transgenic Res, 5, 257).
Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements. Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.
The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: onions (WO00/44919); peas (Grant et al., 1995, Plant Cell Rep., 15, 254-258; Grant et al., 1998, Plant Science, 139: 159-164); petunia (Deroles and Gardner, 1988, Plant Molecular Biology, 11: 355-364); Medicago truncatula (Trieu and Harrison, 1996, Plant Cell Rep. 16: 6-11); rice (Alam et al., 1999, Plant Cell Rep. 18, 572); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al., 1996, Plant J. 9: 821); cassava (Li et al., 1996, Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep. 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); and cereals (US Patent No. 6,074,877). It will be understood by those skilled in the art that the above protocols may be adapted, for example, for use with an alternative selectable marker for transformation.
The plant-derived sequences in the vectors of the invention may be derived from any plant species. In one embodiment the plant-derived sequences in the vectors of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannia x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant-derived sequences in the vectors of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant-derived sequences in the vectors of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant-derived sequences in the vectors of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The plant cells and plants of the invention may be derived from any plant species.
In one embodiment the plant cells and plants of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannia x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant cells and plants of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant cells and plants of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant cells and plants of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The cells and plants of the invention may be grown in culture, in greenhouses or in the field. They may be propagated vegetatively, or either selfed or crossed with a different plant strain, and the resulting hybrids with the desired phenotypic characteristics may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows PCR verification of the propagation of plasmid pPOTCOLE2SPEC in E. coli mediated by a potato-derived COLE2-like origin of replication. Lanes 1 and 2 are plasmid preparations restricted with a BamHI/EcoRI double digest from two independent transformation events of pPOTCOLE2SPEC into E. coli DH5α already possessing pBX243; they show 3.9 kb, 2.5 kb, and 1.5 kb fragments, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene respectively. Lane 3 is a plasmid preparation restricted with a BamHI/EcoRI double digest from a culture transformed with only pBX243 and shows 3.9 kb and 1.5 kb fragments, representing the pBX243 backbone and the pBX243 Rep gene. Lane 4 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker.
Figure 2 shows PCR verification of the potato-derived LacO1-like sequences functioning as a plasmid selectable element by operator-repressor titration. Lane 1 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker. Lanes 2-6 are plasmid preparations restricted with PstI from five independent transformation events of pBR322POTLACO1 into E. coli strain DH1lacdapD using repressor titration selection; they show the expected 1.3 kb and 3.8 kb fragments. Lane 7 is a plasmid preparation restricted with PstI following transformation of pBR322POTLACO1 into E. coli strain DH5α using ampicillin selection and also shows the expected 1.3 kb and 3.8 kb fragments. Lane 8 is linearised pBR322 visualised as a 4.4 kb fragment.
Figure 3 shows PCR verification of Arabidopsis thaliana 'Columbia' transformed with the intragenic vector pTCAHAS. Lanes 1&2, 3&4 and 5&6 are three A. thaliana lines transformed with the intragenic vector, lanes 1, 3 and 5 using primers E+F and lanes 2, 4 and 6 using primers G+H; lanes 8&9 are untransformed A. thaliana, lane 8 using primers E+F, lane 9 using primers G+H; lanes 10&11 are no template controls, lane 10 using primers E+F, lane 11 using primers G+H; lanes 12&13 are the intragenic vector pTCAHAS, lane 12 using primers E+F, lane 13 using primers G+H; lanes 7&14 are the 100 bp molecular ruler (170-8206, BioRad Laboratories, USA). Primers E+F amplify an expected 643 bp fragment and primers G+H amplify an expected 149 bp fragment from the T-DNA-like region of pTCAHAS.
Figure 4 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPOTINV. This involved a multiplexed PCR using primers I+J to amplify the 570 bp fragment from the pPOTINV T-DNA-like region and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is the intragenic vector pPOTINV, lane 6 is a no template control.
Figure 5 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 4. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is Agrobacterium strain A4T, lane 6 is a no template control.
Figure 6 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPETINV. This involved PCR using primers O+P to amplify the 447 bp fragment from the pPETINV T-DNA-like region (lanes 2-5) and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato (lanes 6-8). Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lanes 2 and 6 are the co-transformed hairy root line #24; lanes 3 and 7 are a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV; lane 4 is the intragenic vector pPETINV; lanes 5 and 8 are no template controls.
Figure 7 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 6. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 4 is Agrobacterium strain A4T, lane 5 is a no template control.
Figure 8 illustrates recombination between the POTLOXP sites mediated by Cre recombinase. Plasmid was isolated from E. coli strain 294-Cre transformed with pPOTLOXP2 and restricted with SalI. Expression of Cre recombinase was induced by raising the temperature from 23 °C to 37 °C. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lane 2 illustrates the expected 3.0 kb and 2.3 kb SalI fragments of unrecombined pPOTLOXP2 isolated from a culture maintained at 23 °C; lanes 3-8 illustrate the 3.0 kb and 1.5 kb SalI fragments expected from Cre-mediated recombination between the POTLOXP sites in six different colonies cultured at 37 °C.
Figure 9 illustrates recombination between the POTFRT sites mediated by FLP recombinase. Plasmid was isolated from E. coli strain 294-FLP transformed with pPOTFRT2 and restricted with SalI. Expression of FLP recombinase was induced by raising the temperature from 23 °C to 37 °C. Lanes 1 and 8 are the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker; lane 2 illustrates the expected 3.0 kb and 1.4 kb SalI fragments of unrecombined pPOTFRT2 isolated from a culture maintained at 23 °C; lanes 3-7 illustrate the 3.0 kb and 1.4 kb fragments, and the 1.1 kb SalI fragments expected from FLP-mediated recombination between the POTFRT sites in five different colonies cultured at 37 °C.
EXAMPLES
The invention will now be illustrated with reference to the following non-limiting examples.
Example 1 Identification of T-DNA border-like sequences in many plant species
Agrobacterium T-DNA borders contain the following consensus motif: 5'GRCAGGATATATNNNNNKSTMAWN3' (Where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
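As an illustration of how the degenerate consensus above can be applied to a sequence, the following Python sketch translates the IUPAC codes defined in the text into a regular expression and reports every match on the strand supplied. The function names and the test sequence are assumptions made for this example and do not come from the specification.

```python
import re

# IUPAC codes as defined in the text for the border consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[GA]", "K": "[TG]", "S": "[GC]",
    "M": "[CA]", "W": "[AT]", "N": "[ACGT]",
}

CONSENSUS = "GRCAGGATATATNNNNNKSTMAWN"


def consensus_to_pattern(motif: str) -> str:
    """Translate an IUPAC motif into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in motif)


def find_borders(seq: str):
    """Return (1-based start, matched text) for every consensus match.

    A lookahead is used so that overlapping matches are also reported.
    Only the strand given is searched.
    """
    pattern = re.compile("(?=(" + consensus_to_pattern(CONSENSUS) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical test sequence: a known border with arbitrary flanking bases.
print(find_borders("TTT" + "gacaggatatattggcgggtaaac" + "CC"))
```

To search the opposite strand as well, the same function can be applied to the reverse complement of the input.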
A search of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/), using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded multiple accession numbers for each of the motifs 5'GACAGGATATAT3' and 5'GGCAGGATATAT3', as shown in Table 1. The search was limited to Viridiplantae and the Expect value was 10000. Searches were also conducted in the EST Database of Japan (http://www.ddbj.nig.ac.jp) using Expect values of 10000 with the gap tool off.
Table 1. Plant species and DNA accession numbers in which a partial T-DNA border has been identified in EST sequences. All accession numbers were found from searches in the NCBI GenBank EST databases, except for those labelled A, which were identified using the TIGR database, and those labelled B, which were found in the EST Database of Japan. Note: 1 indicates 5'GRCAGGATAT(A); 2 indicates 5'GRCAGGATA3'; 3 indicates 5'GRCAGGAT3'; 4 indicates 5'GRCAGGA3'; and "+" indicates there are many more accessions than the example(s) listed.
[Table 1 appears as page images in the source document; its contents are not reproduced in this text.]
The initial 5'GRCAGGATATAT3' of the T-DNA border-like motif is less likely to be identified in database searches than the shorter sequence 5'KSTMAWN3'. If the entire border sequence is formed using 2 EST sequences as shown in Example 2 of the patent application, then a second BLAST search is undertaken using 5'KSTMAWN3' from known T-DNA border sequences. Such sequences are: 5'TGTCATG3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3', 5'GGTAAAA3'; which correspond to the following border sequences: 5'gacaggatatatgttcttgtcatg3' (pRi), 5'gacaggatatattggcgggtaaac3' (pTiT37 and pTiC58), 5'ggcaggatatatcgaggtgtaaaa3' (pTi15955), 5'ggcaggatatattgtggtgtaaac3' (pART27 lb) and 5'gacaggatatattggcgggtaaac3' (pART27 rb).
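The relationship between the known borders and the 3' heptamers listed above can be checked with a short Python sketch. The border sequences are copied from the text; the dictionary keys, function name and regular expression (a direct transcription of 5'KSTMAWN3') are choices made for this illustration.

```python
import re

# Known border sequences as listed in the text.
KNOWN_BORDERS = {
    "pRi": "gacaggatatatgttcttgtcatg",
    "pTiT37/pTiC58": "gacaggatatattggcgggtaaac",
    "pTi15955": "ggcaggatatatcgaggtgtaaaa",
    "pART27 lb": "ggcaggatatattgtggtgtaaac",
    "pART27 rb": "gacaggatatattggcgggtaaac",
}

# 5'KSTMAWN3' written as a regular expression
# (K = T/G, S = G/C, M = C/A, W = A/T, N = any nucleotide).
HEPTAMER_RE = re.compile(r"[TG][GC]T[CA]A[AT][ACGT]")


def three_prime_heptamer(border: str) -> str:
    """Return the last seven nucleotides of a 24-nt border, upper-cased."""
    return border[-7:].upper()


heptamers = {name: three_prime_heptamer(seq) for name, seq in KNOWN_BORDERS.items()}
for name, heptamer in heptamers.items():
    status = "matches" if HEPTAMER_RE.fullmatch(heptamer) else "does not match"
    print(f"{name}: {heptamer} {status} KSTMAWN")
```

Each heptamer derived this way is a candidate query for the second BLAST search described in the text.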
BLAST searches using these sequences produce multiple matches. For example, just within Solanum tuberosum (a plant whose genome has not been completely sequenced), a search (BLAST "search for short, nearly exact matches", Expect 20000 and descriptions 1000) for only 5'TGTAAAC3' in NCBI GenBank yields 997 exact matches, of which 985 are S. tuberosum ESTs (search performed 2 June 2004).
Alternatively, the sequences defined in Table 1 can be used for the design of "chimeric right borders" for plant-derived T-DNA-like sequences.
Example 2 Identification of T-DNA-like regions from plant genomes
The design of T-DNA-like regions for possible intragenic vectors was undertaken by searching plant EST databases for Agrobacterium border-like sequences. Limiting searches to EST sequences facilitates the design of intragenic vectors because:
1. The base DNA making up the T-DNA-like region (including the T-DNA-like sequence and additional sequence at either one or both sides of the T-DNA-like sequence) does not involve regulatory elements such as promoters that may influence expression of inserted target genes; and
2. The DNA on which the T-DNA-like region is based is not derived from heterochromatic regions (non-coding, non-expressed, condensed DNA), as this may suppress activity of the genes intended for transfer.
BLAST searches were conducted as described by Altschul et al. (Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25: 3389-3402, 1997).
Border sequences used to search the databases:
Sequence motifs used to search the databases were 5'GACAGGATATAT3' or 5'GGCAGGATATAT3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3', 5'GGTAAAA3'.
Other known borders were used as query sequences, these being:
5'GACAGGATATATGTTCTTGTCATG3' (pRi )
5'GACAGGATATATTGGCGGGTAAAC3' (pTiT37 and pTiC58)
5'GGCAGGATATATCGAGGTGTAAAA3' (pTi15955)
5'GGCAGGATATATTGTGGTGTAAAC3' (pART27 lb)
5'GACAGGATATATTGGCGGGTAAAC3' (pART27 rb)
Potato, Petunia and tomato vectors
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/. "blastn" and "search for short, nearly exact matches" were used to search the EST database. Expect values of 10000 or 20000 (dependent on word size) were used and the search was limited by entrez query to potato (Solanum), tomato (Lycopersicon), or Petunia. All Petunia EST sequences from the NCBI site were also downloaded in FASTA format and searched using the "find" tool in Microsoft Notepad.
Solanaceae genomics network - http://soldb.cit.cornell.edu/cgi-bin/tools/blast/simple.pl. BLAST settings included Expect values of 10,000 (due to short sequences) and otherwise the default settings. All searches were done in EST databases. Unigene sequences were identified using the EST searches.
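The plain-text "find" search of downloaded FASTA records described above could equally be scripted. The sketch below is an assumption-laden illustration rather than the procedure actually used: the record names and sequences are invented, and the two 12-mers are the border motifs quoted in the text. Unlike a simple text search, it joins wrapped sequence lines and also checks the reverse strand.

```python
import io

# The two 12-nt border motifs quoted in the text.
MOTIFS = ("GACAGGATATAT", "GGCAGGATATAT")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(handle):
    """Yield (record name, upper-case sequence) pairs from a FASTA stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks).upper()
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks).upper()


def scan(handle):
    """Return (record, strand, 1-based position, motif) for every exact hit."""
    hits = []
    for name, seq in read_fasta(handle):
        for motif in MOTIFS:
            for strand, query in (("+", motif), ("-", reverse_complement(motif))):
                pos = seq.find(query)
                while pos != -1:
                    hits.append((name, strand, pos + 1, motif))
                    pos = seq.find(query, pos + 1)
    return hits


# Hypothetical two-record FASTA file; the first sequence is wrapped over two lines.
example = io.StringIO(
    ">est_demo_1 invented record\nTTGACAGGATA\nTATCC\n"
    ">est_demo_2\nAAATATATCCTGCCAA\n"
)
print(scan(example))
```

Positions on the "-" strand are reported in the coordinates of the record as written, which keeps them directly comparable with the FASTA file.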
Pinus, Nicotiana, Medicago, apple and onion vectors
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/. BLAST was carried out as above with an Expect value of 10,000 and limited by entrez query to Pinus, Nicotiana, Medicago, apple or onion (Allium).
Rice vector
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/. Settings were as above but limited by entrez query rice or Oryza.
TIGR - http://tigrblast.tigr.org/tgi/ (searched unique gene indices). An Expect value of 10,000 and matrix blosum62 or blosum100 were used. All other values were the default settings. The searches identified some TC# sequences (tentative consensus sequences), and ESTs containing the region of interest were identified from these.
Staff - http://web.staff.or.jp/. The RGP EST database was used to search for ESTs containing the sequences of interest, using Expect values of 10000 and the remaining options at default settings.
Design of extended intragenic T-DNA-like regions
ESTs were identified that showed sequence identity to parts of the Agrobacterium border-like sequences. These identified EST sequences were then assessed for homology, length of sequence flanking the borders and unique restriction sites. This was carried out using DNAMAN (version 3.2, Lynnon BioSoft, copyright ©1994-1997). ESTs were adjoined (usually 3 ESTs) to give a T-DNA-like region containing two border sequences, unique restriction sites between the border sequences (that can be used as cloning sites) and extra plant EST sequence beyond the borders to minimize the opportunity for non-intragenic vector backbone sequences being transferred with the T-DNA-like region into plant genomes. Multiple intragenic T-DNA-like regions were designed and compared. Those designed to have the optimum sequence and useful unique restriction sites are presented below.
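The text states that candidate regions were assessed for unique restriction sites using DNAMAN. As a rough stand-in for that step, the Python sketch below counts occurrences of a handful of well-known palindromic recognition sequences and keeps the enzymes that occur exactly once. The enzyme table is deliberately small, the demo sequence is invented, and positions refer to the first base of the recognition sequence rather than the cut position, so the numbers are not directly comparable with the site lists given in the specification.

```python
import re

# Recognition sequences of a few common enzymes (all palindromic, so
# searching one strand is sufficient). This table is illustrative only.
SITES = {
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
    "EcoRV": "GATATC",
}


def site_positions(seq, site):
    """1-based start positions of every (possibly overlapping) occurrence."""
    return [m.start() + 1 for m in re.finditer("(?=" + site + ")", seq.upper())]


def unique_cutters(seq, sites=SITES):
    """Map enzyme name -> position for enzymes whose site occurs exactly once."""
    result = {}
    for enzyme, site in sites.items():
        positions = site_positions(seq, site)
        if len(positions) == 1:
            result[enzyme] = positions[0]
    return result


# Invented demo sequence: SalI at both ends, one EcoRI site, two EcoRV sites.
demo = "GTCGACAAGAATTCTTGATATCAAGATATCTTGTCGAC"
print(unique_cutters(demo))
```

In this demo only EcoRI qualifies as a unique cloning site; SalI and EcoRV each occur twice and BamHI and XbaI are absent.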
T-DNA-like region of a potato intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the potato genome sequence.
Nucleotides 6 - 334 are the reverse complement of nucleotides 315 - 643 of sgn-U179068.
Nucleotides 335 - 974 are nucleotides 131 - 770 of sgn-U174278.
Nucleotides 975 - 1265 are nucleotides 117 - 407 of CN216800.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 314 - 337 and the right border is nucleotides 957 - 980. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
AflII at 611
AgeI at 518
ApaBI at 912
AsuI at 516
AvaI at 357
AvaII at 516
BamHI at 687
BstD102I at 514
Cfr10I at 518
ClaI at 507
CspI at 516
EcoHI at 683
EcoRI at 340
HaeIII at 725
HgiAI at 916
MaeII at 405
NlaIII at 875
PinAI at 518
SciI at 359
XbaI at 433
XhoI at 357
XhoII at 687
There are also two EcoRV sites within the T-DNA-like region (698 and 853) that could be used as cloning sites.
1 GTCGACAGTA AAAGTTGCAC CTGGAATAAG GTTTTCATTC TTCACAGGAG GCATCTCACT
61 CTTTCTAGCA GGTCTTGAAC GCTTAGATTG AACAGATGTA GGACTCACAT CTGATATGGA
121 GGATTCTTGA CTTGTTTCAG CAGCATCAGA TGAAGCTTCT GAGACTTCAC CTGATCCATC
181 ATCTGTAGCA GTTGCTTCTA CTTCTTCCAC TGCTACATCA GTCTCAGTTG CTGATACTAT
241 AAGACCTCTT AATTTAGGTC GTAAAATGCA ACCAACTCTA AAATGGGGAA ACAATTTAAT
301 AGATGTTGAC AGAGGCAGGA TATATTTTGG GGTAAACGGG AATTCTTCAG CAGTTGCTCG
361 AGGGAGATTG GCGGTGCTTT CAGCTCACCT TGCAGCTTCA CTCAACGTCT CCGATTTAAC
421 AACCTTCAAA CTTCTAGAAA CTTCCGGTGT ATCCGCCGTT TCCGGCGTTG CACCTCCGCC
481 GAATCTAAAA GGTGCGTTGA CGATCATCGA TGAGCGGACC GGTAAGAAGT ATCCGGTTCA
541 GGTTTCTGAG GATGGCACTA TCAAAGCCAC CGACTTAAAG AAGATAACAA CAGGACAGAA
601 TGATAAAGGT CTTAAGCTTT ATGATCCAGG CTATCTCAAC ACAGCACCTG TTAGGTCATC
661 AATATGCTAT ATAGATGGTG ATGCCGGGAT CCTTAGATAT CGAGGCTACC CTATTGAAGA
721 GCTGGCCGAG GGAAGTTCCT TCTTGGAAGT GGCATATCTT TTGTTGTATG GTAATTTACC
781 ATCTGAGAAC CAGTTAGCAG ACTGGGAGTT CACAGTTTCA CAGCATTCAG CGGTTCCACA
841 AGGACTCTTG GATATCATAC AGTCAATGCC CCATGATGCT CATCCAATGG GGGTTCTTGT
901 CAGTGCAATG AGTGCTCTTT CCGTTTTTCA TCCTGATGCA AATCCAGCTC TGAGAGGACA
961 GGATATATAC AAGTGTAAAC AATTTAAAAG CATATGGTGG CACTGCTCAA TATATGAGGT
1021 GGGCGCGAGA AGCAGGTACC AATGTGTCCT CATCAAGAGA TGCATTCTTT ACCAATCCAA
1081 CGGTCAAAGC ATACTACAAG TCTTTTGTCA AGGCTATTGT GACAAGAAAA AACTCTATAA
1141 GTGGAGTTAA ATATTCAGAA GAGCCCGCCA TATTTGCGTG GGAACTCATA AATGAGCCTC
1201 GTTGTGAATC CAGTTCATCA GCTGCTGCTC TCCAGGCGTG GATAGCAGAG ATGGCTGGAT
1261 TTGTCGAC
(SEQ ID NO:1)
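The coordinates quoted for the potato T-DNA-like region can be sanity-checked with simple arithmetic. The Python sketch below is illustrative only: the tuple layout and function names are assumptions, the accession spelling sgn-U174278 is a reading of the garbled source text, and the 40-nt window is copied from positions 301 - 340 of SEQ ID NO:1 as printed above.

```python
# (start, end) within SEQ ID NO:1, source accession, (start, end) in the source.
SEGMENTS = [
    (6, 334, "sgn-U179068", 315, 643),
    (335, 974, "sgn-U174278", 131, 770),  # accession spelling is a reading of the source
    (975, 1265, "CN216800", 117, 407),
]
BORDERS = {"left": (314, 337), "right": (957, 980)}


def span(start, end):
    """Number of nucleotides in an inclusive 1-based interval."""
    return end - start + 1


def check_segments(segments):
    """Return a list of problems: length mismatches or gaps between segments."""
    problems = []
    previous_end = None
    for start, end, accession, src_start, src_end in segments:
        if span(start, end) != span(src_start, src_end):
            problems.append(f"{accession}: length mismatch")
        if previous_end is not None and start != previous_end + 1:
            problems.append(f"{accession}: not contiguous with previous segment")
        previous_end = end
    return problems


def extract(window, window_start, start, end):
    """Slice an inclusive 1-based interval out of a window of the sequence."""
    return window[start - window_start:end - window_start + 1]


# Positions 301-340 of SEQ ID NO:1, copied from the listing above.
WINDOW_301_340 = "AGATGTTGACAGAGGCAGGATATATTTTGGGGTAAACGGG"
left_border = extract(WINDOW_301_340, 301, *BORDERS["left"])

print(check_segments(SEGMENTS))
print(left_border, span(*BORDERS["left"]), span(*BORDERS["right"]))
```

Under these assumptions the three EST-derived segments are contiguous and equal in length to their source intervals, both borders are 24 nt long, and the extracted left border begins with the 5'GGCAGGATATAT3' motif.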
T-DNA-like region of a petunia intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the petunia genome sequence.
Nucleotides 6-399 are the complete sequence of the 394 nucleotide fragment from sgn-e521144.
Nucleotides 400-855 are the reverse complement of nucleotides 85-540 from sgn-e534315.
Nucleotides 856-1071 are the reverse complement of nucleotides 121-336 from sgn-u207691.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 347-370 and the right border is nucleotides 844-867. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
AccIII at 392
AgeI at 788
BbvI at 453
BspMII at 392
Bst71I at 453
Cfr10I at 788
ClaI at 398
Fnu4HI at 442
MaeI at 665
NsiI at 752
PinAI at 788
There are also two NspI sites (RCATG/Y) within the T-DNA-like region (616 and 755) that could be used as cloning sites. The most useful restriction site for cloning into the T-DNA-like region is the ClaI site, which is shown in underlined bold.
1 GTCGACTTTA TGATCCTGGC TATCTCAACA CAGCGCCTGT TCGGTCATCA ATATGTTATA
61 TAGATGGTGA TGCCGGGATC CTTAGGTATC GAGGTTACCC TATTGAAGAG CTGGCTGAGG
121 GAAGCTCCTT CTTGGAAGTG GCTTATCTTT TATTGTACGG TAATTTGCCA TCTGAGAACC
181 AGTTGGCAGA CTGTGAGTTC ACAGTTTCAC AACATTCAGC AGTTCCACAA GGACTCCTTG
241 GATATCATAC AGTCAATGCC CCATGATGCT CATCCGATGG GTGTTCTTGT CAGTGCAATG
301 AGCGCTCTTT CTGTCTTTCA CCCTGATGCC AATCCAGCTC TTAGGGGACA GGATATATAC
361 AAGTCTAAAC AAATGAGAGA TAAACAAATA GTCCGGATCG ATACGTGAAG ATCAAAATGA
421 AAAGGGGAGG CGATAGATTA GCAGCATGAG CCTATATTTC TCTCACAAAA ATTCCCAGAT
481 ATTCGACACA ATAGCTCTAA CAACACTGAG CTTTTGATTA CTTGGGTCAC TTCTTCATTT
541 CTCTATCGTC TGTTCAGTCT TTTCCTCTGA TTTAGTTTCT GCATCATAAG TTTTGCCAAA
601 GCCAAGTTCT GACATGTCTT GCTTTGCCAT CAAATTCTTC TCCATACGAC ACTCCAGGTA
661 CTTCCTAGAG AGGTGTCTAC ACTGCTCAGA TTTATGCCCA GCGGATTTTA GACAACTAAG
721 GTATTCCTTC TTCTCCACGT CACATAAATG CATGTGATCC AAAGGGAAAA CTCCTTTTTC
781 TGGTGGAACC GGTCTCAATC CTCTATTTCC ACCAAATGCT CCCCCTGCAC TCATTACGGA
841 GATGGCAGGA TATATGTTCT TGTCATGGAA TAGGCCACTG CTTTCAGCTG TCTGGAGACC
901 GTGAAGTGTA CGTTGAGCCA CAGCCCATTG TGCTTCCCTC TCACCTTTTC CGTAATCCTT
961 CTTGGTTGTG AAGGCAGTCT TATTCTGCAT CATTGATTGC CAGGCGTCAC CACTCAACGT
1021 GTAACGGCTG ATGAATTTAA GAATATCAAG AGGGAAATAG GTGATAATTG TCGAC
(SEQ ID NO:2)
T-DNA-like region of a tomato (Lycopersicon esculentum) intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the tomato genome sequence.
Nucleotides 5 - 537 are nucleotides 2 - 534 of SGN-E260320.
Nucleotides 538 - 976 are the reverse complement of nucleotides 79 - 517 of SGN-E291502.
Nucleotides 977 - 1188 are the reverse complement of nucleotides 1 - 212 of CK575027.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 375 - 398 and the right border nucleotides 960 - 983. The restriction sites and positions that could be used for cloning within the T-DNA are shown below (as calculated by DNAMAN):
Alw26I at 881
AlwNI at 876
BbvI at 798, 843 and 442
BglII at 740
Bsp1407I at 528
BspMI at 705
Bst71I at 798, 843 and 442
ClaI at 637
Eco31I at 881
Eco57I at 787
EcoNI at 431
Fnu4HI at 456, 787 and 832
MaeI at 573, 711 and 744
MfeI at 444
MspA1I at 455
NdeI at 511
NspBII at 455
NspI at 535 and 583
PstI at 833
PvuII at 455
RsaI at 530
StyI at 427
XcmI at 925
1 GTCGACAACA GGACAGAATG TAAGGGTCT AAGCTTTAT GATCCAGGCT ATCTCAACAC
61 GGCACCTGTT AGGTCATCAA TATGTTATAT TGATGGTGAT GCCGGGATCC TTAGATATCG
121 AGGCTACCCT ATTGAAGAGC TGGCCGAGGG AAGTTCCTTC TTGGAAGTGG CATATCTTTT
181 GTTGTATGGT AATTTACCAT CTGAGAATCA GTTAGCAGAC TGGGAGTTCA CAGTTTCACA
241 GCATTCAGCA GTTCCACAAG GACTCTTGGA TATCATACAG TCAATGCCAC ATGATGCTCA
301 TCCAATGGGG GTTCTTGTCA GTGCAATGAG TGCTCTTTCC GTTTTTCATC CTGATGCAAA
361 TCCAGCTCTG AGAGGGCAGG ATATATACAA GTCTAAACAA GTGAGAGATA AACAAATAGT
421 TCGGATCCTT GGCAAGGCAC CTACAATTGC TACAGCTGCT TACTTAAGAA TGGCTGGCAG
481 GCCACCTGTC CTTCCATCCA ACAATCTCTC ATATGCGGAG AACTTCTTGT ACATGCTTGC
541 TTCCTACATC CTTTACATAA CTATCACTCA ACCTAGAAAC ATGCACCAAT CCATCCGTAA
601 AAGCTCCAAA ATCAATAAAA GCACCGAATG GCTGTATCGA TCTGACCTTT CCAGGAAAAG
661 TTGCACCTGG AATAAGGTCT TCATTCTTCA CAGGAGGCAT CTCACTCTTT CTAGCAGGTC
721 TTGAACGCTT AGACTGAACA GATCTAGGAC TCACATCTGA TACAGAGGAT TCTTCACTTA
781 TTTCAGCAGC ATCAGATGAA GCTTCAGCAA CTCCACCAGA TCCATCATCT GCAGCAGTTG
841 CTTCTACTTC TTCCACTGCT ACATCGGTTT CAGTTGCTGA TACTACGAGA CCTCTTAATT
901 TATGTCGTAA AATGCAACCA ACTCTAAAAT GGGGAAACAA TTTAATAGAT GTTGACAGGG
961 GCAGGATATA TTTTGGTGTA AACCTGTTTC TTGCACTAAT CGTGCTTTGT CTTCCTCAGT
1021 TGGATAAGGC CACTTAGAAT GTGATTGCCA CCAAGCTTTC AACACAGATG TAGTATCACC
1081 AGGCAGTTTT CCTGCTCTTC TTTTGCGTAA AATTTCCTCT CTAATGTCAA CAATTTTTTC
1141 CTTATAACCC TGTTTGAGTT CATGCTTGAG TTCTTGCCTA ACACGCTCGT CGAC
(SEQ ID NO:3)
T-DNA-like region of a Nicotiana benthamiana intragenic vector
This sequence can be ligated into the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not in the N. benthamiana genome sequence.
Nucleotides 5 - 853 are nucleotides 111 - 959 of CK292156.
Nucleotides 854 - 1469 are the reverse complement of nucleotides 81 - 696 of CK286377.
Nucleotides 1470 - 1787 are nucleotides 285 - 602 of CN748849.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 566 - 589 and the right border is nucleotides 1455 - 1478. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
AccIII at 611
AflII at 654
AhaIII at 1160
BamHI at 614
BsiI at 1362
BspMII at 611
DraI at 1160
EcoNI at 622
MaeII at 840
NspI at 726
ScaI at 921
SspI at 1420
VspI at 1085
XhoII at 614
There are also two AvaI sites (773 and 1072), two BanI sites (627 and 906), two HaeIII sites (672 and 1393), three MaeI sites (727, 1182 and 1355), and three NlaIV sites (616, 629 and 908) within the T-DNA-like region that could be used as cloning sites.
1 GTCGACCTCG CCGCTTCAGT CAATCTCTCC GATTCCAAAC TTTTAGAAAC TTCCGTTGTA
61 TCCGCCACTT CCGTCGTCGC GCCGCCGCCG AATCTAAAAG GCGCTTTGAC GATCATCGAT 121 GAGCGAACCG GTAAGAGGTA TCCAGTTCAA GTTTCGGAGG AAGGCACTAT CAAAGCCACC
181 GACTTGAAAA AGATAACAGC AGGACATAAT GATAAGGGTC TCAAGCTTTA TGATCCGGGA
241 TATCTCAACA CAGCACCTGT TCGGTCATCA ATATGTTATA TAGATGGTGA TGCTGGTATC
301 CTTAGATATC GAGGTTACCC AATTGAAGAG CTGGCTGAGG GAAGTTCCTT CTTGGAAGTG
361 GCTTATCTTT TGATGTATGG TAATTTACCA TCTGAGAACC AGTTGGCAGA TTGGGAGTTC 421 ACAGTTTCAC AACATTCAGC AGTTCCACAA GGAATCATGG ATATTATACA TTCGATGCCC
481 CATGATGCTC ATCCAATGGG TGTTCTTGTC AGCGCAATGA GTGCTCTTTC TGTCTTTCAT
541 CCTGATGCCA ATCCAGCTCT GAGAGGACAG GATATATACA AGTCTAAACA AGTGAGAGAT
601 AAACAAATAG TCCGGATCCT TGGCAAGGCA CCTACAATTG CTGCGGCTGC TTACTTAAGA
661 ATGGCTGGAA GGCCACCTGT CCTTCCATCC AACAATCTCT CTTATGCAGA GAACTTCTTG
721 TACATGCTAG ATTCATTAGG TAATAGGTCT TACAAACCCA ATCCTCGACT CGCTCGGGTG
781 CTCGACATTC TTTTCATATT ACACGCGGAA CATGAAATGA ATTGCTCTAC TGCTGCAGCA
841 CGTCATCTTG CTTAAATGCA ACTGCTCTAT TTTGTGTCAG AATTTGGTGA AAAATGCACT
901 GTTTTGGCAC CAAAAGTTAG TACTTTTGGA CAACTTTTTG GTGAACCAAA ATCTGTCCAA
961 AATGACTTGT TTACCTACTT AAAGAGGTCA TTTTTTCATA CCAGGGGACA TCCCCGACAT 1021 CCCAGGATAC ATAGCTTTTG AAAAATTTTT TACACTCAAG AATACACAAA ACTCGGGAGC
1081 AAAATTAATA GCTGAATGTT TAATAGTAAG CTGAAACTTG AGAGTTTTGG AGTGAGTTTT
1141 TTGAGAGAAA ATAACACTTT AAAAAACAAA AGTCCATACA GCTAGTTTAT AGTTTTTCTT 1201 TCACTAAAGA TGCTGAGTTT TACGGTTTGT TTTTGGTTGT TTTGGGTTCA ATTTATTGCT
1261 GTTTTTTTTA CTATTTTTAC TGTCACTGCT GCTGCATTTT TGCTACTGCT GTATTTTTGC
1321 TCTTCAGGTA ACCTGAGAAG CTTATTTTTT GATACTAGCC ACTCGTGTTG TATTTGTCCT
1381 TTTTAATTTA AGGCCAAATA GTTTCAGTTG TAGAAGTAAT ATTTTCTCCT TTCATTAGTA 1441 AAGTTCAATT AAAAGGCAGG ATATATTGTG ATGTAAACAC CGTCCTGAAG TGTACCAGCT
1501 AAGGACAAGG GATCAAAGAA TTTGCCGCCA GGGTAACCCT GTTCTCCAGT GAAGTTGGCA
1561 AAGTTCTCAG CAGTCTTGGA CCATGGAGTT GCCCACTCTA CTGACTGAGA CTCAGGGTTG
1621 AAGAAGTCCA CCCACCTTTT GCTCTCAACC CACCCCATGA GTAGAAGTTG AGTGCCAAGA
1681 AGTGAGCCAA AAGAGAAAGT GCAATAGCAC CAGGGTCAGC ACCAGCCTCG AACCATGGGA
1741 TGCCACTCCA GGCTTGACCA ACAAAGAGCC CAATACTGCA GCCATTGTCG AC
(SEQ ID NO:4)
T-DNA-like region of an apple intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the apple genome sequence.
Nucleotides 5 - 246 are nucleotides 1 - 242 of CN862631.
Nucleotides 247 - 644 are the reverse complement of nucleotides 28 - 425 of CN942531.
Nucleotides 645 - 943 are the reverse complement of nucleotides 1 - 299 of CO541348.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 229 - 252 and the right border is nucleotides 627 - 650. The restriction sites and positions that could be used for cloning within the T-DNA are shown below:
AccII at 392
AluI at 338
Alw26I at 593
AsuI at 353 and 387
AvaI at 270
AvaII at 353
BanI at 415
BsaOI at 374 and 466
Bsp1286I at 418
BstD102I at 262
ClaI at 466
EcoHI at 269 and 270
EcoNI at 348
EcoRV at 438
FnuDII at 392
FokI at 619 and 235
HhaI at 394
HinP1I at 392
HpaI at 565
MaeII at 449
MaeIII at 509
MspA1I at 400
NlaIV at 289 and 417
NspBII at 400
PvuI at 466
RleAI at 269
ScrFI at 271 and 272
SduI at 418
SmaI at 272
StuI at 653
ThaI at 392
XhoII at 586 and 818
XmaI at 270
XorII at 466
1 GTCGACTAAT GAGGCTTTGA TCTACCACAA GGCTTTTCCA ATGCCGGCAT TGTCATACAA 61 GTTTCAGAAC ACAGACTCAC TTTCCGGCCA TGACACAGAT GATGCTGCAC AGTTTATCTC
121 TTCCGTTTGT TGGCGAGGCC AAACCTCCAC CTTAATTGCT GCAAATTCGA CGGGGAATAT
181 AAAAATTTTG GAGATGGTTT GATGATCTCC AAGGTGATTC TTGAATCTGG CAGGATATAT
241 GGGGTGGTCA TCCCACATCG AGCGGATCAC CCGGGAGAAG GTGAACGGTT CCACCGTCAA
301 TGTCGGCATC AACCCCCTCC AAGGTCGTCA TCACCAAGCT CCGCCTCGAC AAGGACCGCA 361 AAGTTCTGCT CGACCGCAAA GGCCAAGGGC CGCGCCGCCG CTGACAAGGA CAAGGGCACC
421 AAGTTCACTG CCGAGGATAT CATGCAGAAC GTCGATTGAT TTCGATCGAT TTCATTTCGG
481 TTTGTGTTTT TGTTAGTTAA ATGAAAGTAG TAACTGTCAA GTTAAGCACT TTAGTCGGAA
541 TCACTTTTAA TTTGAAGTAT GCGTTAACGG ATTTGGTGTT TAATCGGATC TTCGATTTGA
601 GACATGGATG GATTTGTGCT TTTTTTGACA GGATATATTA TATTGTAAAC AGGCCTCCCT 661 CAGACGATAC AAATGAACCC TCATGTAAAT TTGTTTCATT ATTTATTCTC ATTAACAATG
721 ACTAACACAC AAATATAAAA GAAATAAATC ACATTTGGGG TCTTGTCTGG ACAACATAGA 781 GTTTACCGTC CCTGATCACG CCTTCAATGT CTTGTGGAGA TCCATACAGT TCCTCGATTG 841 CACTTCCTGC ACGAGCAATG CTAGAGAGGA TTGTCTTGCG AAAGTTTCCG TCAACCATAA 901 GCGGGTCAGA TGAGTAGTCG ACCAAGACTT TCTCTTCCTC GTC
(SEQ ID NO:5)
T-DNA-like region of a Medicago truncatula intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the M. truncatula genome sequence.
Nucleotides 6 - 357 are nucleotides 2 - 353 of CA921810.
Nucleotides 358 - 694 are nucleotides 112 - 448 of AL375389.
Nucleotides 695 - 1055 are the reverse complement of nucleotides 2-362 of CF069972.
The T-DNA border-like sequence is shown in bold. The left border is nucleotides 339 - 362 and the right border is nucleotides 677 - 700. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:
AccI at 403
AflIII at 465
AgeI at 409
AsuII at 522
BsaOI at 410
BsmI at 539
Cfr10I at 409
Csp45I at 522
HpaII at 410
MaeII at 391
MboII at 511
MspI at 410
NlaIV at 585
PinAI at 409
RsaI at 394
There are also two AhaIII sites (462 and 567), two DraI sites (462 and 567) and three TaqI sites (522, 615 and 636) that could be used as cloning sites.
1 GTCGACTCAT TAAACAAATA AAAGAACTAT TCAAATGTTT GCACATTTG AAACTGAGTC
61 AGCATAAAAA CATTTACCAT GCTAGTTTAA CTAGTTCAGA CGCAACCAAA ACACCATGAA 121 ACTTAATTGC ATAGGAAAGC ACCAACCTGT TTCAGCAGCA AGACATGCTG ACTACAGGTC
181 AACTACTGTT TCTGCATAAT AACTATTTAT CAACTACTAA TATTCCATGG TAGAATAGCC
241 ATCAAAACCA TCATTGCGCA GCAGAGGGTC AAAAAGGAAC AATGATTTTC AACAGTTACT
301 CCAAGGATAA CTGATGCTGC AGCTGGTAAC AAGTTATTGG CAGGATATAT TACCATGTAA
361 AATTCTAGGC TATGTTTACA AAAAAATTGA ACGTACTTAA TGTATACGAC CGGTAAAGGA 421 GAAAAAGGAA GTATAAGTCA CTTAATTTAA TTTTTTAACT TTAAACATGT TTTTTAGGAG
481 GCACAATTAT AAGTTAAAAA TGTAAGGAAA ACCTATTTTC TTCGAATATA TAGATTTGGC
541 ATTCCATTTT AGTTAAGGAG TTAATTTAAA AATATGAAAG TAGGAGCCTC TGTTCGTAAA
601 ATTTGTGAAA AATGTCGATT GATACGCAGG CGAGGTCGAA TTATAGTAAT TTGTTCCAAC
661 CCAATAACAA ATACAAGGCA GGATATATTT AACTGTAAAC GACCATGAGC CCTGTGCTCT 721 GCAGGTGCAT GCCAATGAAG TTGTTTCAAT GAATAATTCA TTCCATTTAT ATTAATATCG
781 CCCACTTTTC CTTCAAAATG CACCCCAATA CTGTATTGGT GGTTAACAAG TGTGGCATTT
841 GTAGGAAGGT AGTTTCTGTC TAAGGATTTC AATACATTGT TCACAACAAT ATTTGTCATT
901 GCAAGATCCA CTGGACTCTG TGCTTTCCCA TTTGAGCATG CTGCGAATGA TTGTTTTAGT
961 GTTCCCCATT TCAGAGGACC ATTTGGTCCA ATATAACTAA AATTAACCGA ATCATGATTT 1021 GCTGAAGTGC ATAGAGCCAA AGCAGCTATG AAAGGTCGAC
(SEQ ID NO:6)
T-DNA-like region of an onion intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the onion genome sequence.
Nucleotides 5 - 537 are nucleotides 4 - 536 of CF449263.
Nucleotides 538 - 1186 are nucleotides 94 - 742 of CF441521.
Nucleotides 1187 - 1503 are nucleotides 162 - 478 of CF452730.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 520 - 543 and the right border is nucleotides 1169 - 1192. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are:
AccI at 2 and 739
AsuI at 1046
AvaII at 1046
BclI at 1124
ClaI at 556
DraIII at 869
Eco57I at 910 and 975
HindIII at 628
MfeI at 852
NdeI at 1006
PflMI at 649
RsaI at 999
SapI at 580
XbaI at 578
1 GTCGACTTCC CTTTCCTCTA CTCCACTTGT TTCTCGCTTT CTCTACTTCC TTTTTCTCTC
61 TTTTCTTTAT ATTTATTGCT CAGCTGGGAT TAATTACTGT CATTTATTCC TCATATCTAT 121 TTTATTGAAT TAAAACGGTT ATTTAGCTCG AGGCCTTCTC TCTTATTCTT TGCTTCCAAG
181 GAGAGAGAAT ATGGCGAGTG GTAGCAATCA TCAGCATGGT GGAGGAGGAA GAAGAAGAGG
241 CGGAATGTTA GTCGCTGCGA CCTTGCTTAT TCTTCCTGCC ATTTTCCCCA ATTTGTTTGT
301 TCCTCTTCCC TTTGCTTTTG GTAGTTCTGG CAGCGGTGCA TCTCCTTCTC TCTTCTCCGA
361 ATGGAATGCT CCTAAACCTA GGCATCTCTC TCTTCTGAAA GCAGCCATTG AGCGTGAGAT 421 TTCTGACGAA CAAAAATCAG AGCTGTGGTC TCCCTTGCCT CCACAGGGAT GGAAACCGTG
481 CCTTGAGACT CAATATAGTA GCGGGCTACC CAGTAGATCG ACAGGATATA TTCAAGTGTA
541 AAACAAGATG CTGAATCGAT TAGCAATGGT TCGCTCTTCT AGACTTGCTT CTCGGATAAT
601 CAATCCTCAG TTTTTGATTC CTTCTCGAAG CTTCCTTGAT CTCCATAAGA TGGTAAACAA
661 GGAGGCGATA AAAAAAGAAA GGGCTAGACT TGCTGATGAG ATGAGCAGAG GATATTTTGC 721 GGATATGGCA GAGATTCGTA TACATGGTGG CAAGATTGCT ATGGCAAATG AAATTCTTAT
781 TCCATCAGGG GAAGCAATCA AATTTCCTGA TTTGACAGTA AAATTGTCTG ATGATAGCAG
841 TTTGCATTTA CCAATTGTAT CTACACAAAG TGCTACAAAT AACAATGCTA AATCCACTCC 901 TGCTGCCTCA TTGTTGTGCC TTTCCTTCAG AGCAAGTTCA CAGACAATGG TTGAATCATG 961 GACTGTTCCT TTTTTGGACA CTTTTAACTC TTCAGAAGTA CAAGCATATG AGGTATCATT 1021 TTTGGATTCT TGGTTTTTCT CATTCGGACC AATCAAGAGA ATGTTTCTTA ACATGACGAA
1081 GAAACCCACT GCTACTCAGC GGAAGATTGG TTATTTCATT TGGTGATCAC TATGATTTTA
1141 GGAAGCAGCT TCAAATTGTA AATCTTTTGA CAGGATATAT ATTACTGTAA AAAGTGAAGA
1201 GAGAAATGTG ATATATGCTG ATGTTTCCAT GGAGAGGGGT GCATTTCTTG TTCAACAAGC
1261 TATGAGGGCT TTCCATGGAA AGAATATAGA AAGCGCAAAA TCAAGGCTTA GTCTTTGCGA 1321 GGAGGATATT CGTGGGCAGT TAGAGATGAC AGATAACAAA CCAGAGTTAT ATTCACAGCT
1381 TGGTGCTGTC CTTGGAATGC TAGGAGACTG CTGTCGAGGA ATGGGTGATA CTAATGGTGC
1441 GATTCCATAT TATGAAGAGA GTGTGGAATT CCTCTTAAAA ATGCCTGCAA AAGATCCCGA
1501 GGTCGAC
(SEQ ID NO:7)
T-DNA-like region of a rice intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. This requires a partial digest because of a SalI site within the T-DNA-like region. The nucleotides in italics are not part of the rice genome sequence.
Nucleotides 6 - 634 are nucleotides 1 - 629 of CR287857.
Nucleotides 635 - 1258 are nucleotides 156 - 779 of AK100350.
Nucleotides 1259 - 1740 are nucleotides 222 - 703 of CB619781.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 616 - 639 and the right border is nucleotides 1247 - 1270. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are:
AccI at 945
AclI at 716
AflII at 984
AhaIII at 893
Alw26I at 1102
ApaI at 1166
BanII at 662 and 1166
BglII at 1026
BsaOI at 1137 and 1200
BspHI at 771
ClaI at 1200
DraI at 893
DraII at 1162
HgiAI at 675
HindII at 946
MaeII at 716, 842 and 1125
MseI at 892 and 985
PssI at 1165
PvuI at 1137 and 1200
SalI at 944
SnaBI at 1126
SspI at 1073
XorII at 1137 and 1200
1 GTCGA GGGA ATTCGCCATT ATGGCCGGGG GAAGCTTCCC TGTCACTACT GCAGGATCAT
61 CAAAAGCAAG CTAGGGCGCA TCTTCTTGAC ACTGAACCTT TTGAGCATGC ATTTGGACCA
121 AAGGGCAAGA GGAAACGCCC AAAACTAATG GCTCTTGATT ATGAATCTCT ATTGAAGAAA
181 GCTGATGATT CTCAAGGTGC ATTTGAGGAT AAGCATGCTA CAGCGAAGTT GCTGAAAGAG 241 GAAGAGGAAG ATGGCTTACG ATACCTAGTC CGGCACACAA TGTTTGAGAA GGGACAGAGC
301 AAAAGAATTT GGGGTGAACT CTATAAAGTT ATTGACTCTT CAGATGTTGT CGTGCAGGTG
361 TTGGATGCCA GGGATCCAAT GGGTACTAGA TGCTACCATC TGGAGAAACA TCTGAAGGAG
421 AATGCCAAGC ACAAACACTT GGTATTCTTA CTAAATAAGT GTGATCTAGT ACCTGCTTGG
481 GCCACAAAAG GATGGTTGCG CACTTTATCA AAGGACTATC CCAACCTAGC ATACCATGCA 541 AGCATCAACA GTTCATTTGG CAAAGGATCA CTTCTTTCAG TGTTACGGGA GGATGGACGC
601 CCTGAGAGAT GTGACGACAG GATATATAGT GAGGTCATGC AGTGCAAGCC CCTCCCCGAG
661 CCCGAGGTCA GAGCACTTTG CGAGAAGGCA AAAGAGATAT TGATGGAGGA GAGCAACGTT
721 CAACCTGTAA AGAGTCCTGT TACAATATGT GGTGATATTC ATGGGCAGTT TCATGACCTT
781 GCAGAACTGT TCCGAATCGG TGGAAAGTGC CCAGATACAA ACTACTTGTT TATGGGAGAT 841 TACGTGGATC GTGGTTATTA TTCTGTTGAA ACTGTCACGC TTTTGGTGGC TTTAAAGGTT
901 CGTTATCCTC AGCGAATTAC TATTCTCAGA GGAAACCACG AAAGTCGACA GATCACTCAA
961 GTTTATGGAT TCTATGACGA GTGCTTAAGG AAGTACGGGA ATGCAAATGT GTGGAAAACT
1021 TTTACAGATC TCTTCGATTA CTTCCCCTTG ACAGCATTGG TTGAGTCAGA AATATTTTGC
1081 CTGCATGGTG GATTATCGCC ATCCATTGAG ACACTTGATA ACATACGTAA CTTCGATCGT 1141 GTCCAAGAAG TTCCCCATGA AGGGCCCATG TGTGATCTTC TGTGGTCTGA TCCAGACGAT
1201 CGATGTGGTT GGGGTATTTC TCCTCGAGGT GCTGGATACA CCTTCGGGCA GGATATATTG
1261 GCGGGTAAAC CAATTCCTGG TTTTCCCGAC AAACCCTCGA GAATAAATTC ATTCTTTGCA
1321 GAAGGATGTC AAACTGGTGA CAATGGTGCT GGTTCCTCGC AAGAGTTGAA TGGTCATTGC
1381 AATGGAGAAC CCAGTTGCCC AGAGCAAGGA GTTCTGACCA ATGGTGGCAA CACGCCCTCT 1441 CCAAGCACAC AATGCTATGA AAATAAGTTT GCAACATCCA CCAACGGCAA CTATTCTATT
1501 GGGAATGGTG ATACATTATC TAGCAGCAAC TCATTACATG CGGGCAAACA GAATGCTGGC
1561 TTTACCTATA ATGGTTTCAA TCCAAAACCT TACAAAGAAC CATCAGGAAG CAACACATAT 1621 CTGAATAATA CATGCAATGG TAAACCATCG GAAGATAATC ACAATAAATG TGCCCCAAAC 1681 CTGCCGGCAA AAGATTGCCA AGGGGGCATG CCATTCTTAC ATCGTGGCTT CCTTCTAAGG 1741 TCGAC
(SEQ ID NO:8)
T-DNA-like region of a Pinus taeda intragenic vector
This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the P. taeda genome sequence.
Nucleotides 1 - 333 are nucleotides 114 - 446 of BM133642.
Nucleotides 334 - 914 are nucleotides 81 - 661 of CF392877.
Nucleotides 915 - 1172 are nucleotides 138 - 395 of CX715693.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 314 - 337 and the right border is nucleotides 898 - 921. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are:
AccIII at 442
Alw26I at 513
AsuI at 813
AvaII at 813
BspMII at 442
DraII at 813
DraIII at 879
Eco72I at 876
EcoNI at 808
EcoRV at 555
FokI at 871
HaeIII at 339
HpaII at 443
MaeI at 534
MspI at 443
PmaCI at 876
PpuMI at 813
PssI at 816
There are also two AccI sites (2 and 888) and two BspHI sites (587 and 806) that could be used as cloning sites.
1 GTCGACATTC GCAGCGATGA CGCTTGGCGC TTTCATTATC TTACAGTCGA GGATAAACT
61 CCACGTCTGG ATTGCTCTCC GTCAAGACGA AAAGAGGGAG GGCAAATGC
AGGAATTAGA 121 GATTGCAAAA ATGAAAAAGA AATTATTGGC GGAACTTAAA GAGAAAGAGT
CAAGCGTAT
181 GCTTTTTGTA GTGATAATTG TTAACCTCCA ATGGTAATCA TTATTTTGAA
TGATAGCTAT
241 TCGATTATTT GGTGAAAAAA CAAACTTGAA AATGATTGTG GACAGTTATT TGTGTTGAAG
301 CCACAGCAGT GTTGGCAGGA TATATCCAAT TGTAAAAGGC CAAGGAGATT
CTCATGGAAG
361 AAAGTAATGT GCAGCCTGTC AAGTGCCCTG TTACAATCTG TGGTGACATA
CATGGTCAAT 421 TTCACGATCT TGCCGAGCTG TTCCGGATTG GTGGAAAGTG TCCAGATACG
AATTATTTGT
481 TCATGGGTGA CTATGTTGAC CGAGGATACT ATTCAGTCGA GACTGTCACT
CTTCTAGTGT
541 CATTGAAGGT GCGATATCCC CAACGAATTA CCATTCTTCG AGGAAATCAT GAGAGTCGTC
601 AGATTACTCA AGTATATGGA TTCTATGATG AATGTTTACG GAAGTATGGA
AATGCAAACG
661 TCTGGAAGAT ATTTACAGAC CTTTTTGACT ACTTCCCATT GACAGCACTG
GTAGAATCAG 721 AAATTTTCTG CTTACATGGT GGTTTGTCAC CTTCCATTGA CACATTAGAT
AACATAAGGA
781 ATTTTGACCG TGTTCAAGAA GTGCCTCATG AAGGTCCCAT GTGCGATCTT
TTGTGGTCAG
841 ATCCAGATGA TAGATGTGGA TGGGGAATCT CTCCACGTGG TGCAGGGTAT ACATTTGGAC
901 AGGATATATC TGAAGGTAAA CCGAAGGGAT CCTTCAAAGT TGACCAATAT
CAGAGATAAT
961 TATGTTGGCA TTGAATATGA TACCAACCCT GAAATATTGT GCTAAAGAGA
GATGCTGAGG 1021 TTTTAAAATT TCTATCAGCT TAACGAGCAC ATGATACATA ATATATGCCA
CAAGAATGGA 1081 ATGGAATTTC CATTTGGCTT TAAAAAATGA TTTGTAATGT CACGTACATT AGCATCTACA
1141 AAGAATTGGA TTGCCTTCAT TCACTTTTCG TCGAC
(SEQ ID NO:9)
Example 3 Identification of plant-derived sequences that function as a selectable marker in bacteria: complementation of deficient bacteria
In preferred intragenic vectors of the invention, the complete vector is made up entirely of plant-derived sequences. One desirable component for effective vector manipulation is a bacterial selectable marker. Preferred marker sequences include plant genes that complement bacterial mutants deficient in genes essential for their growth, such as amino acid biosynthesis genes. One such gene is acetohydroxyacid synthase.
Acetohydroxyacid synthase (AHAS) is an enzyme which catalyses the formation of acetolactate from pyruvate, the first step in valine, leucine and isoleucine biosynthesis. Furthermore, mutant forms of plant AHAS can confer resistance to sulfonylurea herbicides and related compounds (Mazur and Falco, Annual Review of Plant Physiology and Plant Molecular Biology, 40: 441-470, 1989). For example, the Arabidopsis thaliana mutant AHAS gene confers resistance to the herbicide chlorsulfuron upon transformation into tobacco (Haughn et al., Molecular and General Genetics, 211: 266-271, 1988).
Wild-type AHAS Genbank details are:
LOCUS NM_114714 2270 bp mRNA linear PLN 19-FEB-2004
DEFINITION Arabidopsis thaliana acetolactate synthase, chloroplast / acetohydroxy-acid synthase (ALS) (At3g48560) mRNA, complete cds.
Located on chromosome 3.
A discontiguous megablast of the gene sequence above against all publicly available Viridiplantae genome sequences using the standard NCBI parameters shows that the AHAS gene is present in many plant species. Genes encoding acetohydroxyacid synthase are found in both the E. coli genome and the Agrobacterium tumefaciens C58 genome (Genbank accessions NC00913 and NC003062 respectively). Furthermore, functional expression of plant AHAS genes in bacterial systems to complement deficiencies in AHAS has been well established. For example, the AHAS genes from Arabidopsis thaliana (Smith et al., Proceedings of the National Academy of Science, USA, 86: 4179-4183, 1989), Nicotiana tabacum (Kim and Chang, Journal of Biochemistry and Molecular Biology, 28: 265-270, 1995), and Brassica napus (Wiersma et al., Molecular and General Genetics, 224: 155-159, 1990) have been used to complement AHAS-deficient bacteria such as Escherichia coli and Salmonella typhimurium.
Furthermore, it will be understood by those with ordinary skill in the art that plant-derived sequences such as AHAS, known to complement bacterial deficiencies, can be placed under the control of plant promoters known to be transcriptionally active in bacteria. For example, Jacob et al. (Transgenic Research, 11: 291-303, 2002) describe several such plant promoters, one of which is the potato ST-LS1 promoter. In order to provide an example in which all the components of the present invention are derived from a single plant species, we have isolated the potato (Solanum tuberosum) AHAS gene. This gene can be used in the manner described above to provide a bacterial selectable marker gene to maintain vectors in bacteria.
Using potato (Solanum tuberosum) cultivar 'Iwa' genomic DNA as a template, various fragments of the AHAS gene were isolated based on primers designed from related species. These fragments were cloned, their DNA sequence determined, and a composite consensus sequence generated for the potato AHAS gene. In order to generate the complete sequence for a single allele, the following primers flanking the coding region were designed:
Primer Q: 5'AGCCATTTTGCCTCCTTTC3'
Primer R: 5'CAACGGCAAACTAGACAGATAGAA3'
A polymerase chain reaction was then performed with high fidelity Pwo polymerase and primers Q and R to amplify a fragment using genomic DNA from potato cultivar 'Iwa' as a template. This product was A-tailed and ligated into pGemT (Promega) following the manufacturer's instructions. The cloned AHAS allele was then sequenced using primers based on the consensus sequence anchored about every 400 bp along the cloned fragment. The following sequence for the coding region of a potato cultivar 'Iwa' AHAS allele (from the start codon to the stop codon) was obtained:
1 ATGGCGGCTG CTGCCTCACC ATCTCCATGT TTCTCCAAAA CCCTACCTCC ATCTTCCTCC 61 AAATCTTCCA CCATTCTTCC TAGATCTACC TTCCCTTTCC ACAATCACCC TCAAAAAGCC
121 TCACCCCTTC ATCTCACCCA CACCCATCAT CATCGTCGTG GTTTCGCCGT TTCCAATGTC
181 GTCATATCCA CTACCACCCA TAACGACGTT TCTGAACCTG AAACATTCGT TTCCCGTTTC
241 GCCCCTGACG AACCCAGAAA GGGTTGTGAT GTTCTTGTGG AGGCACTTGA AAGGGAGGGG
301 GTTACGGATG TATTTGCGTA CCCAGGAGGT GCTTCTATGG AGATTCATCA GGCTTTGACA 361 CGTTCGAATA TTATTCGTAA TGTGCTGCCA CGTCATGAGC AAGGTGGTGT GTTTGCTGCA
421 GAGGGTTACG CACGGGCGAC TGGGTTCCCT GGTGTTTGCA TTGCTACCTC TGGTCCGGGA 481 GCTACGAATC TTGTTAGTGG TCTTGCGGAT GCTTTGTTGG ATAGTATTCC GATTGTTGCT
541 ATTACGGGTC AAGTGCCGAG GAGGATGATT GGTACTGATG CGTTTCAGGA AACGCCTATT
601 GTTGAGGTAA CGAGATCTAT TACGAAGCAT AATTATCTTG TTATGGATGT AGAGGATATT
661 CCTAGGGTTG TTCGTGAAGC GTTTTTTCTA GCGAAATCGG GACGGCCTGG GCCGGTTTTG 721 ATTGATGTAC CTAAGGATAT TCAGCAACAA TTGGTGATAC CTAATTGGGA TCAGCCAATG
781 AGGTTGCCTG GTTACATGTC TAGGTTACCT AAATTGCCTA ATGAGATGCT TTTGGAACAA
841 ATTATTAGGC TGATTTCGGA GTCGAAGAAG CCTGTTTTGT ATGTGGGTGG TGGGTGTTTG
901 CAATCAAGTG AGGAGCTGAG ACGATTTGTG GAGCTTACGG GTATTCCTGT GGCGAGTACT
961 TTGATGGGTC TTGGAGCTTT TCCAACTGGG GATGAGCTTT CCCTTCAAAT GTTGGGTATG 1021 CATGGGACTG TGTATGCTAA TTATGCTGTG GATGGTAGTG ATTTGTTGCT TGCATTTGGG
1081 GTGAGGTTTG ATGATCGAGT TACTGGTAAA TTGGAAGCTT TTGCTAGCCG AGCGAAAATT
1141 GTCCACATTG ATATTGATTC GGCTGAGATT GGAAAGAACA AGCAACCTCA TGTTTCCATT
1201 TGTGCAGATA TCAAGTTGGC ATTACAGGGT TTGAATTCCA TATTGGAGGG TAAAGAAGGT
1261 AAGCTGAAGT TGGACTTTTC TGCTTGGAGA CAGGAGTTAA CGGAACAGAA GGTGAAGTAC 1321 CCATTGAGTT TTAAGACTTT TGGTGAAGCC ATCCCTCCAC AATATGCTAT TCAGGTTCTT
1381 GATGAGTTAA CTAACGGAAA TGCCATTATT AGTACTGGTG TGGGGCAACA CCAGATGTGG
1441 GCTGCCCAAT ACTATAAGTA CAAAAAGCCA CACCAATGGT TGACATCTGG TGGATTAGGA
1501 GCAATGGGAT TTGGTTTGCC TGCTGCAATA GGTGCGGCTG TTGGAAGACC GGGTGAGATT
1561 GTGGTTGACA TTGATGGTGA CGGGAGTTTT ATCATGAATG TGCAGGAGTT AGCAACAATT 1621 AAGGTGGAGA ATCTCCCAGT TAAGATTATG TTGCTGAATA ATCAACACTT GGGAATGGTG
1681 GTTCAATGGG AGGATCGATT CTATAAGGCT AACAGAGCAC ACACTTACTT GGGTGATCCT
1741 GCTAATGAGG AAGAGATCTT CCCTAATATG TTGAAATTCG CAGAGGCTTG TGGCGTACCT
1801 GCTGCAAGAG TGTCACACAG GGATGATCTT AGAGCTGCCA TTCAAAAGAT GTTAGACACT
1861 CCTGGGCCAT ACTTGTTGGA TGTGATTGTA CCTCATCAGG AGCACGTTCT ACCTATGATT 1921 CCCAGTGGCG GTGCTTTCAA AGATGTGATC ACAGAGGGTG ATGGGAGACG TTCATATTGA
(SEQ ID NO:10)
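A coding region reported from start codon to stop codon, as above, can be given a quick consistency check before it is used as a selectable marker. The following is a minimal sketch under the assumption that the listing is pasted with its position numbers and spacing; the function name and the toy inputs are illustrative only.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(listing: str) -> dict:
    """Basic sanity checks for a coding sequence given as a numbered listing."""
    # Remove position numbers and whitespace from the listing.
    seq = re.sub(r"[\s\d]", "", listing).upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return {
        "length": len(seq),
        "starts_with_atg": seq.startswith("ATG"),
        "in_frame": len(seq) % 3 == 0,
        "ends_with_stop": seq[-3:] in STOP_CODONS,
        # 1-based nucleotide positions of stop codons before the final codon.
        "internal_stops": [i * 3 + 1 for i, codon in enumerate(codons[:-1])
                           if codon in STOP_CODONS],
    }


# Toy example built from the first 12 nucleotides of the listing plus a stop codon.
print(check_cds("1 ATGGCGGCTG CTTGA"))
```

A sequence that fails any of these checks would suggest a transcription or sequencing error rather than a problem with the allele itself.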
Example 4 Identification of plant-derived sequences that function as plasmid origins of replication in bacteria
Preferred intragenic vectors of the invention comprise an origin of replication that functions in E. coli and Agrobacterium tumefaciens.
Plant-derived bacterial origins of replication in this example are based on the smallest known prokaryotic replication origins, those of the Colicin E plasmids (ColE plasmids), specifically ColE2-P9 (from Shigella sp.) and ColE3-CA38 (from E. coli). The minimal replication origins of these plasmids, named COLE2 and COLE3, require only one specific factor (Rep) to be provided in trans. Plasmids pBX243 and pBX343 provide Rep in trans for ColE2 and ColE3 respectively. The minimal origins also require host DNA polymerase I and other factors (see Yasueda et al., Molecular and General Genetics, 215: 209-216, 1989; Shinohara and Itoh, Journal of Molecular Biology, 257: 290-300, 1996).
There are 2 differences between ColE2 and ColE3 origin sequences, one mismatch and a deletion of a single nucleotide in ColE2 (or an insertion in ColE3). The deletion/insertion, not the mismatch, is responsible for determining the plasmid specificity in the interaction of the origins with the trans-acting factors.
Characteristic features of these sequences are two direct repeat sequences of 7 bp (5'CAPuATAA) or of 9 bp (APyCAPuATAA) which are separated from each other by 7 bp or 5 bp in ColE2 and by 8 bp or 6 bp in ColE3.
ColE2 AGACCAGATAAGCCT TATCAGATAACAGCGCC (SEQ ID NO:11)
ColE3 AGACCAAATAAGCCTATATCAGATAACAGCGCC (SEQ ID NO: 12)
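The spacing between the two direct repeats described above can be verified directly on the two origin sequences. The sketch below is illustrative only: the regular expressions are one way of writing the 7 bp (CAPuATAA) and 9 bp (APyCAPuATAA) repeats, the alignment gap in the ColE2 line is removed before scanning, and the function name is hypothetical.

```python
import re

COLE2 = "AGACCAGATAAGCCTTATCAGATAACAGCGCC"    # SEQ ID NO:11, alignment gap removed
COLE3 = "AGACCAAATAAGCCTATATCAGATAACAGCGCC"   # SEQ ID NO:12

REPEAT_7BP = "CA[AG]ATAA"        # CAPuATAA
REPEAT_9BP = "A[CT]CA[AG]ATAA"   # APyCAPuATAA


def repeat_spacers(origin: str, repeat_pattern: str) -> list:
    """Return the number of nucleotides between consecutive repeat occurrences."""
    # Lookahead with a capture group reports each occurrence with its length.
    hits = [(m.start(), m.start() + len(m.group(1)))
            for m in re.finditer(f"(?=({repeat_pattern}))", origin)]
    return [nxt[0] - cur[1] for cur, nxt in zip(hits, hits[1:])]


for name, origin in (("ColE2", COLE2), ("ColE3", COLE3)):
    print(name,
          "7 bp repeats spaced by", repeat_spacers(origin, REPEAT_7BP),
          "9 bp repeats spaced by", repeat_spacers(origin, REPEAT_9BP))
```

The single-nucleotide difference in spacer length between the two origins corresponds to the deletion/insertion that determines plasmid specificity.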
The one-nucleotide mismatch (G/A) can be substituted without effect. Only T or A is acceptable in the insertion position, and the third-to-last position can be G or A. It is likely that other changes that do not affect the composition of the two direct repeat sequences can also be made. Consensus sequences for ColE2 and ColE3 can be described as:
R = G or A (Pu)
Y = C or T (Py)
W = A or T
Consensus ColE2 AGAYCARATAAGCCT TAYCARATAACAGCRCC
Consensus ColE3 AGAYCARATAAGCCTWTAYCARATAACAGCRCC
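A degenerate consensus of this kind can be turned into a search pattern for scanning candidate plant sequences. The sketch below assumes the consensus strings AGAYCARATAAGCCTTAYCARATAACAGCRCC (ColE2) and AGAYCARATAAGCCTWTAYCARATAACAGCRCC (ColE3); these are a reading of the degenerate positions described in the text (R, Y and W) and should be treated as an assumption, as should the function names.

```python
import re

# Degenerate-base codes defined in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}

# Assumed consensus strings (see lead-in).
CONSENSUS_COLE2 = "AGAYCARATAAGCCTTAYCARATAACAGCRCC"
CONSENSUS_COLE3 = "AGAYCARATAAGCCTWTAYCARATAACAGCRCC"


def to_regex(consensus: str):
    """Translate a degenerate consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def find_origins(seq: str, consensus: str) -> list:
    """Return 1-based start positions of exact matches to the consensus."""
    return [m.start() + 1 for m in to_regex(consensus).finditer(seq.upper())]


cole2 = "AGACCAGATAAGCCTTATCAGATAACAGCGCC"
cole3 = "AGACCAAATAAGCCTATATCAGATAACAGCGCC"
# Part of the POTCOLE2 sequence surrounding its minimal origin.
potcole2_part = "GAAAATAAGACCAAATAAGCCTTATCAGATAACAGCACCTGAAG"

print(find_origins(cole2, CONSENSUS_COLE2))
print(find_origins(potcole2_part, CONSENSUS_COLE2))
print(find_origins(cole3, CONSENSUS_COLE2))
```

Exact matching is deliberately strict; candidate sequences with a few mismatches would need an approximate search instead.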
Other minimal replication origins and Rep genes from other (Colicin E) plasmids could also be used when constructing plant-derived replication origins (Table 2).
Table 2. Replication origins from ColE plasmids that could be used to construct plant-derived replication origins.
Original host      Plasmid         (Putative) minimal origin             NCBI accession
Shigella sp.       ColE2-P9        agaccagataa-gcct-tatcagataacagcgcc    D30054
Escherichia coli   ColE3-CA38      agaccaaataa-gcctatatcagataacagcgcc    D30055
Escherichia coli   ColE2-CA42      -ga--aaata--gcctatatcagataacagcgcc    D30056
Escherichia coli   ColE2-GEI602    -ga--aaata--gcctatatcagataacagcgcc    D30057
Escherichia coli   ColE2imm-K317   -ga--aaata--gcctatatcagataacagcgcc    D30058
Shigella sonnei    ColE4-CT9       agaccagataa-gcct-tatcagataacagcgcc    D30059
Shigella sonnei    ColE5-O99       agaccaaataaaacctatatcagataacagcgct    D30060
Shigella sonnei    ColE6-CT14      agaccaaataaaacctatatcagataacagcgct    D30061
Escherichia coli   ColE7-K317      agaccaaataa-gcctatatcagataacagcgcc    D30062
Escherichia coli   ColE8-J         agaccaaataa-gcctatatcagataacagcgcc    D30063
Escherichia coli   ColE9-J         agaccaaataaaacctatatcagataacagcgct    D30064
Potato COLE2-like replication sequence
The ColE2 consensus sequence was used to search publicly available potato (Solanum tuberosum) DNA sequences. The potato COLE2-like replication sequence POTCOLE2 was constructed in silico from two sequences, accessions:
SGN U254575 nucleotides 359-721 correspond to POTCOLE2 1-363
TIGR EST494490 nucleotides 248-693 correspond to POTCOLE2 362-807
TGGCCACAAAACAAGCGCCAAACAACGAGCAACAACAAATCAAGATTGCACCAAAACTAGAA AATTAAAGAAGAGTATCACCCCAAATGCGTTACTGTTCACGACCTCAAATCAGAATCTACAG ATCTCTAAATCCGATCTCCACTGTTGAATTGCAAGAACCAGATGCTGAGAACTCTCAGTTCA AATTTGAGCACGATCCAACGGTTAACGAAGCGGCAAACTCTGTCTGAAGCGGACTGCCTGAG CAGAAAATTTCCAGAAGCAAAAACGGGATTTTCTCTTTTTCTCTCAATCTCTAAAACGAATC TCTCTTGATTTTTCTCTCTTGTGTTTCTGAAAATAAGACCAAATAAGCCTTATCAGATAACA GCACCTGAAGCAGCTCATGTAGCTTGTCAGCACCAGGTCCTGGCCTAAACACTGTATCATTG CCACGCAAAGAGCACGGGTCTGCTCCACCATCTGATGACCCAATACACCATGCACCTGGCGA AAACCTATGTGTGCGCCCCAAAAGTTCTTTGTCAATCTTGTCTAGGACTGGATCATCCCTTG CAAA-TTTTCGGGCAAAGGGAGCATTGCTCTTGGCCATTTTGTCGAAGTTCTTCATAGTTAGA GACATGGGATGTTGCTTTGGTGGACTGTCCCAAGCAATGTAATGAAGATCGTGGCTTATTGC TGTGTGCCGAAATTTCTGAGTGTTGCAAATGACAGTGTGAAAATATCCCTCTGGCGAAGAGA CAAAATTTGTATAATACATAAGCATAGTCCGTGGAAAGTTATCCCATCCCCATATGCAGTAC T (SEQ ID NO:13)
Within the POTCOLE2 replication sequence, the minimal origin sequence is underlined.
The 807 bp POTCOLE2 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTCOLE2).
The following primers were designed to PCR amplify the spectinomycin resistance gene of pART27 (Gleave, Plant Molecular Biology, 20: 1203-1207, 1992).
Primer S: 5'GTGTCGACAACTACGATACGG3' (SEQ ID NO:14)
Primer T: 5'CGTAAGCTTGAACGAATTCTTAG3' (SEQ ID NO:15)
Nucleotides underlined represent a SalI site in primer S and HindIII and EcoRI sites in primer T. A 1661 bp fragment containing the spectinomycin resistance gene was PCR amplified from pART27 using high fidelity Pwo polymerase. This fragment was ligated as a SalI to HindIII fragment into pUC57POTCOLE2 to give pUC57POTCOLE2SPEC, positioning the spectinomycin resistance gene immediately adjacent to the POTCOLE2 fragment.
The fragment corresponding to the spectinomycin resistance gene and POTCOLE2 was isolated as a 2.5 kb EcoRI fragment from pUC57POTCOLE2SPEC and self-circularised to generate pPOTCOLE2SPEC. The ligation was transformed into E. coli DH5α harbouring helper plasmid pBX243 (with an ampicillin resistance gene) and transformants were selected on L plates supplemented with ampicillin and spectinomycin (100 μg/mL). Resulting colonies were picked, and plasmid DNA was isolated and analysed by restriction enzyme digestion using BamHI and EcoRI. A BamHI/EcoRI double digest will release the Rep gene from pBX243 and will linearise pPOTCOLE2SPEC.
Successful bacterial propagation of pPOTCOLE2SPEC using the potato-derived origin of replication is evident as three bands on a gel at 3.9 kb, 2.5 kb, and 1.5 kb, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene respectively. Control digests of pBX243 alone result in two bands, the 3.9 kb pBX243 backbone and the 1.5 kb Rep gene. Figure 1 provides confirmation of replication of pPOTCOLE2SPEC in bacteria mediated by the potato-derived origin of replication.
COLE2-like sequences from the genomes of other plant species
A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded COLE2-like sequences in the genomes of other plant species. Searches were made for the consensus ColE2 sequence:
AGAYCARATAAGCCTTAYCARATAACAGCRCC
It was possible to readily assemble COLE2-like sequences by joining two sequences from the same species with a few mismatches unlikely to impact on the functionality of the origin of replication. In the examples below the position of the join between two EST sequences is indicated by "/" and mismatches to the consensus sequence are indicated in bold.
Species Sequence
Consensus ColE2 AGAYCARATAAGCCT TAYCARATAACAGCRCC
Allium cepa                     AGACCAAATAAGCTC/TATCAGATAGCAGCTGC (SEQ ID NO:16)   CF436111/CF452305
Beta vulgaris                   AGGCCAAATAAGCCT/TATCAGATAACAGCGCC (SEQ ID NO:17)   BQ589076/BQ590618
Medicago truncatula             AAATCAAATAAGCCTTATCA/GATAACAGCACC (SEQ ID NO:18)   TC111839/TC102142
Gossypium arboreum              AGACCAGATAATCCT/TAACAGATAACAGCGCC (SEQ ID NO:19)   BM358442/AW729597
Hordeum vulgare                 AGATCAGATAAGCCTTA/TCAGATAACAGCGCC (SEQ ID NO:20)   DN158808/AV924388
Sorghum bicolor                 GACCAGATAAGCATTATTAG/ATAACAGCGCC (SEQ ID NO:21)    CF427156/CX613542
Picea glauca                    AGACCTAATAAGCCT/TATCAGATAACTGTGCG (SEQ ID NO:22)   CO485190/CO235782
Theobroma cacao                 AGACCAAATAAGACTTA/TCAGATAACAGCACG (SEQ ID NO:23)   CA796667/CF974720
Mesembryanthemum crystallinum   AAACCAAATAAGCTTTA/TCAGATAACAGCACA (SEQ ID NO:24)   CA838853/BE036300
Petunia hybrida                 GCATCAGATAAGCCT/AACCAAATAACAGCAAC (SEQ ID NO:25)   NP1240078/TC390
Brassica napus                  AGACCAGATAAGACT/CATCAGATAACAACACA (SEQ ID NO:26)   CD814492/CD814199
Zea mays                        CCACCAGATAAGCCTT/ATCAGATAACAGTTGC (SEQ ID NO:27)   DN211845/DN232238
Pinus taeda                     AGTACAGATAAGCCTT/ACCAAATAACAACACC (SEQ ID NO:28)   DR019180/BQ699992
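Because the bold highlighting of mismatches is not reproduced in plain text, the mismatch positions of an assembled sequence can be recomputed against the consensus. The following sketch assumes the ColE2 consensus AGAYCARATAAGCCTTAYCARATAACAGCRCC (a reading of the degenerate positions described earlier, not a verbatim quotation) and handles only candidates of the same length as the consensus; the function name is hypothetical.

```python
# Sets of nucleotides allowed by each code used in the assumed consensus.
IUPAC_SETS = {"A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
              "R": {"A", "G"}, "Y": {"C", "T"}, "W": {"A", "T"}}

CONSENSUS_COLE2 = "AGAYCARATAAGCCTTAYCARATAACAGCRCC"  # assumed, see lead-in


def mismatch_positions(consensus: str, candidate: str) -> list:
    """Return 1-based positions where the candidate is not allowed by the consensus."""
    candidate = candidate.replace("/", "").upper()  # "/" marks the EST join only
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus differ in length")
    return [i + 1 for i, (code, base) in enumerate(zip(consensus, candidate))
            if base not in IUPAC_SETS[code]]


print(mismatch_positions(CONSENSUS_COLE2, "AGGCCAAATAAGCCT/TATCAGATAACAGCGCC"))  # Beta vulgaris
print(mismatch_positions(CONSENSUS_COLE2, "AGATCAGATAAGCCTTA/TCAGATAACAGCGCC"))  # Hordeum vulgare
```

Sequences that differ in length from the consensus, such as one with a missing terminal nucleotide, would first need to be aligned.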
COLE3-like sequences from the genomes of plant species
A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded COLE3-like sequences in the genomes of other plant species. Searches were made for the consensus ColE3 sequence:
AGAYCARATAAGCCTWTAYCARATAACAGCRCC
It was also possible to readily assemble COLE3-like sequences by joining two sequences from the same species with a few mismatches unlikely to impact on the functionality of the replication origin. In the examples below the position of the join between two EST sequences is indicated by "/"and mismatches to the consensus sequence are indicated in bold.
Species Sequence
Consensus ColE3 AGAYCARATAAGCCTWTAYCARATAACAGCRCC
Allium cepa             AGACCAAATAAGC/TTATATCAGATAGCAGCTGC (SEQ ID NO:29)   CF436111/CF452305
Vitis vinifera          AGATCAGATAAGCCTTTA/TCAGATAACAGCCCC (SEQ ID NO:30)   CF207293/CF515867
Nicotiana tabacum       AGACCAAATAAGCA/TATATCAGATAACTTCGGA (SEQ ID NO:31)   TC1407/BP530912
Glycine max             TGACCAAATAAGCTTATAT/CAGATAACAGAGTC (SEQ ID NO:32)   BE209626/AI959871
Saccharum officinarum   AGACCAAATAACCCTAAAT/CAGATAACAACGC (SEQ ID NO:33)    CA156596/CA092850
Secale cereale          AGACCAAATAATCATAT/TTCAGATAACAGCGCC (SEQ ID NO:34)   BE704886/CD453313
Capsicum annuum         AAACCAAATAAGCAAA/TATCAGATAACTTCGCA (SEQ ID NO:35)   CA525915/TC6186
Populus euphratica      GCACCAAATAAGCCAATA/TCAGATAACAGCTGC (SEQ ID NO:36)   AJ777378/AJ768273
Lotus japonicus         CTACCAAATAAGCA TATATCAGATAACAGCGTA (SEQ ID NO:37)   TC15168/AV774815
Medicago truncatula     AGATCAAATAAGCCTTTA TCAGATAACAGCAGA (SEQ ID NO:38)   CR931730/AC148360
Note: the last example was derived from nr database rather than an EST database.
Example 5 Identification of plant-derived sequences that function as a selectable marker in bacteria: operator-repressor titration
Antibiotics and antibiotic resistance genes are traditionally used for the selection and maintenance of recombinant plasmids in hosts such as E. coli and A. tumefaciens. Their continued use is undesirable in plant biotechnology, where the threat of horizontal transfer to other microbes exists. An alternative plasmid selection strategy based on the phenomenon of repressor titration was developed by Cobra Biomanufacturing Plc (WO 03/097838 A1; Williams et al., Nucleic Acids Research, 26: 2120-2124, 1998; Cranenburgh et al., Nucleic Acids Research, 29: e26, 2001; Cranenburgh et al., Journal of Molecular Microbiology and Biotechnology, 7: 197-203, 2004).
Background
The Operator-Repressor Titration (ORT) system enables selection and maintenance of plasmids that are free from expressed selectable marker genes and require only the short non-expressed lac operator for selection and maintenance.
E. coli ORT strain DH1lacdapD (genotype recA endA1 gyrA96 thi1 hsdR17 supE44 relA1 Δ(dapD)::kan hipA::lac-dapD) contains a chromosomal, conditionally essential gene, dapD, under the control of the lac operator/promoter system. Under normal conditions, a repressor protein encoded by a second chromosomal gene binds to the chromosomal lac operator and prevents transcription of dapD, and cells lyse. Growth is permitted when an inducer (IPTG) is provided, i.e. on a nutrient agar plate. Alternatively, growth is also permitted when a plasmid containing a lac operator sequence is introduced into the cell. The repressor protein binds to the plasmid-borne operator sequence, derepressing the chromosomal operator and allowing dapD expression.
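The selection logic just described can be summarised as a small truth table. The sketch below is a deliberately simplified boolean model, not a description of the ORT system's kinetics; the function name and the all-or-nothing treatment of repressor binding are assumptions made for illustration.

```python
def ort_strain_grows(iptg_present: bool, plasmid_has_lac_operator: bool) -> bool:
    """Toy boolean model of operator-repressor titration (ORT) selection."""
    # The repressor stays on the chromosomal operator unless it is released
    # by the inducer or titrated away by plasmid-borne operator copies.
    repressor_on_chromosome = not (iptg_present or plasmid_has_lac_operator)
    dapD_expressed = not repressor_on_chromosome
    # In this model the cell survives only when dapD is expressed.
    return dapD_expressed


for iptg in (False, True):
    for plasmid in (False, True):
        print(f"IPTG={iptg!s:5} plasmid operator={plasmid!s:5} "
              f"growth={ort_strain_grows(iptg, plasmid)}")
```

Only the combination without inducer and without a plasmid-borne operator gives no growth, which is what allows plasmid-containing cells to be selected on medium lacking IPTG.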
Two lac operator sequences have previously been shown to function as plasmid selectable elements:
lacO1 is 21 bp and is derived from the wild-type E. coli lac operon.
lacO is 20 bp and is an 'ideal' version of lacO1, being a perfect palindrome of the first 10 bp of lacO1.
lacO1: AATTGTGAGCGGATAACAATT (SEQ ID NO:39)
lacO: AATTGTGAGCGCTCACAATT (SEQ ID NO:40)
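The relationship between the two operators can be checked directly from the sequences given as SEQ ID NO:39 and SEQ ID NO:40. The following is a minimal sketch; the helper name `reverse_complement` is an illustrative choice, not something taken from the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


LACO1 = "AATTGTGAGCGGATAACAATT"  # SEQ ID NO:39, 21 bp
LACO = "AATTGTGAGCGCTCACAATT"    # SEQ ID NO:40, 20 bp

# Build the 'ideal' operator from the first 10 bp of lacO1 and its
# reverse complement, then compare it with the stated lacO sequence.
half_site = LACO1[:10]
ideal = half_site + reverse_complement(half_site)

print(ideal == LACO)                     # ideal operator equals lacO
print(reverse_complement(LACO) == LACO)  # lacO is a perfect palindrome
```

Because lacO reads identically on both strands, a search for either 10 bp half of it on one strand also covers the other strand.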
Previous research conducted to understand the lac operator has involved the analysis of various operator analogues, or lacO1-like sequences, i.e. sequences that are able to titrate the lac repressor. As little as 13 bp of a 14 bp symmetrical consensus sequence TGTGAGCGCTCACA is able to bind the lac repressor (Simons et al., Proceedings of the National Academy of Science, USA, 81: 1624-1628, 1984). No work has been conducted to show that these lacO-like sequences will function as plasmid selectable elements but it appears likely. With only 13-14 bp required, it is statistically probable that sequences capable of binding the lac repressor and acting as a plasmid selectable element will be found in all plant genomes.
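The statistical argument can be made concrete with a rough expectation calculation. The sketch below assumes a uniform random genome in which each base is equally likely, which real genomes are not, so the figures are order-of-magnitude guides only. The genome sizes are illustrative round numbers rather than values from the specification, and edge effects at sequence ends are ignored.

```python
def expected_hits(genome_bp: int, k: int, palindromic: bool) -> float:
    """Expected exact matches of one k-mer when both strands of a
    uniform random genome of genome_bp base pairs are scanned."""
    per_strand = genome_bp / 4 ** k
    # A palindromic site reads identically on both strands, so scanning
    # the second strand re-finds the same sites rather than new ones.
    return per_strand if palindromic else 2 * per_strand


for genome_bp in (10**8, 10**9, 10**10):  # illustrative sizes only
    full_14mer = expected_hits(genome_bp, 14, palindromic=True)
    sub_13mer = expected_hits(genome_bp, 13, palindromic=False)
    print(f"{genome_bp:.0e} bp: 14-mer ~{full_14mer:.2f}, 13-mer ~{sub_13mer:.2f}")
```

Under these assumptions a 13 bp match is expected several times even in a genome of about 10^8 bp, while an exact 14 bp match becomes likely only in larger genomes.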
Search for LacO-like sequences in plant genomes
Searches of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) were made for the lacO sequence, using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases:
AATTGTGAGCGCTCACAATT (SEQ ID NO:40)
The following list gives NCBI accession numbers that have sequences identical to at least 10 bp that comprise one of the two inverted repeats that make up lacO:
Dicotyledonous plants
Camelliaceae: Camellia sp. CV014004
Chenopodiaceae: Beta vulgaris CV301904; Chenopodium sp. CN782052
Compositae: Lactuca sativa BQ999659; Helianthus annuus CD848561, BU671795, BU672101; Helianthus argophyllus CF097400
Convolvulaceae: Ipomoea batatas CB330510; Ipomoea nil BJ577170, BJ576074, BJ570027
Cruciferae: Brassica rapa L33645; Brassica rapa pekinensis CV523215; Brassica napus CD837028; Thellungiella halophila BM985805; Thellungiella salsuginea DN777083
Cucurbitaceae: Cucumis sativus DN910885
Ericaceae: Vaccinium corymbosum CV191519
Euphorbiaceae: Manihot esculenta CK651449
Lauraceae: Persea americana CK758014
Leguminosae: Arachis hypogaea CX127972, CD038632; Cicer arietinum CD051347; Glycine max CX708729; Glycine soja BG041485; Lotus corniculatus var. japonicus BP073115; Phaseolus vulgaris CV542405; Phaseolus coccineus CA914174
Linaceae: Linum usitatissimum CV478930, CV478503
Malvaceae: Gossypium hirsutum DR044140; Gossypium raimondii CO131784
Myrtaceae: Eucalyptus tereticornis CD669011
Pedaliaceae: Sesamum indicum BU668313
Plumbaginaceae: Limonium bicolor CX263567
Rosaceae: Fragaria x ananassa AB208578; Malus x domestica CV882575; Prunus persica AJ825706; Prunus armeniaca CV048462; Prunus dulcis BI203104; Pyrus communis AJ504986
Rutaceae: Citrus sinensis DN618703, CN191267; Citrus reticulata CF828122
Salicaceae: Populus tremula BU823277
Solanaceae: Capsicum annuum BM066713; Lycopersicon esculentum BP883198, BP876932; Medicago truncatula AL384864; Petunia x hybrida CV300353; Solanum habrochaites DN168862; Solanum tuberosum DN849072, CV469139
Sterculiaceae: Theobroma cacao CF974287
Vitaceae: Vitis vinifera CX127882
Monocotyledonous plants
Gramineae: Avena sativa CN821127; Hordeum vulgare CV062014; Oryza sativa CR290368; Saccharum officinarum CA104782; Sorghum bicolor CX607714; Schedonorus arundinaceus CK802645; Triticum aestivum BQ578949; Triticum monococcum BQ801760; Zea mays CO526196
Liliaceae: Allium cepa CF448121
Gymnosperms
Pinus taeda DR013559; Picea engelmannii x sitchensis CO213279; Picea glauca CK441720; Pinus pinaster BX676975; Pseudotsuga menziesii CN638414
Bryophytes
Marchantia polymorpha AU081717
Algae
Chlamydomonas reinhardtii CF558875
Search for LacOl-like sequences in plant genomes
Searches of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) were made for GGATAACAATT, using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases.
The 21 bp lacO1 sequence is identical in its first 10 bp to lacO. The following list gives accession numbers where at least the last 11 bp of the lacO1 sequence, GGATAACAATT, are found:
Dicotyledonous plants
Chenopodiaceae: Beta vulgaris CX779649, CF542856
Compositae: Lactuca sativa BU004821, BU008839; Helianthus annuus BU671786, BQ965452
Convolvulaceae: Ipomoea nil BJ567255
Cruciferae: Brassica rapa CV433907, CV432343; Brassica napus CX195012, CD838296; Raphanus sativus AF051115
Cucurbitaceae: Citrullus lanatus AI563425
Leguminosae: Cicer arietinum CK148974; Glycine max CO036432, CX709893; Medicago truncatula BQ144942; Phaseolus vulgaris CV533775, CB543020; Phaseolus coccineus CA913133; Pisum sativum CD860446
Linaceae: Linum usitatissimum CA482669
Malvaceae: Gossypium raimondii CO128755; Gossypium hirsutum CO497326
Rosaceae: Malus x domestica CV997415, CN879093, CN882088; Prunus persica AJ823535; Prunus armeniaca CV048921; Prunus americana CV458467, CK758903; Pyrus communis AJ504896
Rutaceae: Citrus sinensis CX675412, CX075530; Citrus clementina CX298649
Solanaceae: Capsicum annuum CA516533; Lycopersicon esculentum BI926125; Nicotiana tabacum CN949741, BP535353; Solanum tuberosum CK719419
Vitaceae: Vitis vinifera CD715798
Monocotyledonous plants
Gramineae: Avena sativa CN820280; Hordeum vulgare CK568615; Lolium multiflorum AU247989; Oryza sativa CF986696; Saccharum officinarum CA279301; Secale cereale BE495021; Sorghum bicolor CX615619; Triticum aestivum CK215572; Zea mays CF046268
Liliaceae: Allium cepa BE205560, CF449604
Gymnosperms
Pinus taeda AW065199, DN614133; Picea glauca CO251715, CO241938; Pinus pinaster BX682941; Pseudotsuga menziesii CN640766
Potato LacO1 sequence as a recombinant plasmid selectable element
The 21 bp lacO1 sequence was used to search publicly available potato (Solanum tuberosum) EST sequences. Sequences found in NCBI accessions CV501815 and CK259105 were joined in silico, with BglII restriction enzyme recognition sites (agatct) added to the termini, to make the 693 bp POTLACO1:
agatctAATATTTACTTCTCCACTTAAACAAATACCCCAATCAGAATCACTAGCTGGCAGAT
TCCTTGTCCTCTATTGACAGCAAACATAGACGTACATTATAGAGCCACCACAACATTAGACA
AACATTCTTTAAACAAGAGGTGGATACTGCTTAGACTGCAGGCGCACCCTCTTTCGGTACTC
CAGAACATCCTGAATAAACATATGATACCCTTCAGTTTGGGCAGGATCAGCAGGGTTTGGCT
GATCTAACAAGTCCTGGATACCAACCAGTATCTGTTTCACGGTGATGGCTGGTCTCCACCCA
CTATCTTCATTGAGGATCGACAAGCAAACTGTTCCAGATGGATAGACATTGGGATGGAAAAA
GCCTGGTGGGAATTTACACTTTGGCGGTTTACTCGGATAATCTTCACTGAAGTGAATTGTGA
GCGGATAACAATTGGGGAAATCATTATGTAAATTCAACAAATATTTCAATTTATGCATTAGC
AAATTGTTATCAGGATCTACCACATCAGGATTGTCTTCTATGCTACGCAGCTAGTCGAACTC
GACTCCCTCGTTGTCTTCCTGGTAAATCCGGTCGAATATATCTCGACGGATGTTTTCTCCGG
TACGATCAATACAATTTTTTCAACGAAACTACTGATTCAGCTAAAGATACAGTGAACTGTAG
CAGCTagatct (SEQ ID NO:41)
The lacO1 sequence is underlined. The first nucleotide of CK259105 is underlined and in bold. The terminal BglII restriction enzyme sites (agatct, in lowercase) are not of potato sequence origin.
The 693 bp POTLACO1 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTLACO1).
POTLACO1 was excised from pUC57POTLACO1 with BglII and ligated into pBR322 previously linearised with BamHI. The resulting plasmid pBR322POTLACO1 was transformed into E. coli strain DH1lacdapD and colonies were selected using repressor titration. Plasmid DNA was isolated from selected colonies and digested with restriction enzyme PstI (see Figure 2). Linearised pBR322 is visualised as a band at 4.4 kb. PstI-digested pBR322POTLACO1 is visualised as two bands, one at 1.3 kb and one at 3.8 kb. The results indicate that POTLACO1 functions as a plasmid selectable element.
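The band sizes reported for the PstI digest can be cross-checked against the expected size of the recombinant plasmid. This is a back-of-envelope sketch: the 4361 bp length of pBR322 is the generally published figure and is not stated in the text, and the variable names are illustrative.

```python
PBR322_BP = 4361    # published length of pBR322 (assumption, not from the text)
POTLACO1_BP = 693   # synthetic fragment, including both agatct termini

# BglII cuts A^GATCT, so the top strand of the excised insert lacks the
# leading "a" of the first site and the trailing "gatct" of the last one.
insert_bp = POTLACO1_BP - 1 - 5

# BamHI (G^GATCC) linearises pBR322 without removing any nucleotides, and
# its GATC overhang is compatible with the BglII overhang of the insert.
recombinant_bp = PBR322_BP + insert_bp

observed_bands_kb = (1.3, 3.8)  # PstI bands reported for pBR322POTLACO1
print(recombinant_bp, "bp expected;", round(sum(observed_bands_kb), 1), "kb observed")
```

Two PstI bands are consistent with the insert contributing a second PstI site (CTGCAG occurs within POTLACO1) in addition to the single site assumed for pBR322.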
Onion LacO and LacOl sequences
Onion-derived lacO and lacO1 sequences, ALLLACO and ALLLACO1, have also been made in silico.
NCBI accessions CF448121 and CF450773 were used to generate a 756 bp ALLLACO (lacO sequence underlined):
agatcttcgtcagcctacaactaccgacaatcccaaacccacatccgacgacgataactacg
aaaatagggaggtggattctgacggagcttcggattccgacgatgattgggaaggggttgag
agcacggagttggatgagatttttagtgcggcgactacgtttatagctgcgactgctgcgga
taagaattctgcaaaagtttcgaatgatctgcagctgcagttatatgggttttacaagattg
ctactgaggggccttgtaccgttccccaaccttctgcacttaaaatgacagctcgtgccaag
tggaatgcatggcagaaacttggttccatgcctcctgaagaagctatggagaagtacattgc
aattgtgagcgctcacaattgcttttacgctatgtatgacaatatggataatcatggtgggg
cccagagagccccaatgaatcctcagcaaattccatttggaaattcattatatggagctggg
tctggactcatccgaggtggcttgggtgcctatggagagagatttttaggttcaagctccga
gtttatgcagagcaatataagtagatggttctccaaccctcagtattactttcaagtgaatg
accagtatgtgaggaacaagttgaaagttgttttgtttccctttttacacagagggcattgg
acaaggatcactgaaccggttggtggcaggctttcttacaaacctccaatttttgacatcaa
tgccccagatct (SEQ ID NO:42)
NCBI accessions CF448121 and CF449604 were used to generate a 662 bp ALLLACO1 (lacO1 sequence underlined):
agatctggggcattgatgtcaaaaattggaggtttgtaagaaagcctgccaccaaccggttc
agtgatccttgtccaatgccctctgtgtaaaaagggaaacaaaacaactttcaacttgttcc
tcacatactggtcattcacttgaaagtaatactgagggttggagaaccatctacttatattg
ctctgcataaactcggagcttgaacctaaaaatctctctccataggcacccaagccacctcg
gatgagtccagacccagctccatataatgaatttccaaatggaatttgctgaggattcattg
gggctctctgggccccaccatgattatccatattgtcatacatagcgtaaaagcaattgtga
gcggataacaattcatttcaaaagggaggaggaggaggacaacagagtcagggtcttacgct
ttttgtgaaaggttttgatagctctcaagatccattcacgattcgtgatactcttcgatcgc
attttgagtcctgtggagagatttctcgtgtttcagttccaaaagattttgaaaccggcagc
tccagggggattgcgtacattgatttcaatgaacaagagagttttaacaaagccctagaact
gaatggatcagaaatagatggatactacctggttgttgatca (SEQ ID NO:43)
Other LacOl-like sequences from plant genomes
It is highly likely that lacO1-like sequences will also function as plasmid selectable elements. For example, the following sequence was also found.
gi|114495426|gb|BE205560.1| API26F NPI Onion cDNA library Allium cepa cDNA clone API26F similar to catalase, mRNA sequence.
Length = 628
Score = 28.2 bits (14), Expect = 0.34
Identities = 21/22 (95%), Gaps = 1/22 (4%)
Strand = Plus / Minus
Query: 1   aattgtgagcgg-ataacaatt 21
           |||||||||||| |||||||||
Sbjct: 477 aattgtgagcgggataacaatt 456
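The identity and gap figures reported for this alignment can be recomputed from the two aligned strings. The sketch below uses only the aligned rows shown above; truncating the percentages to whole numbers is an assumption that happens to reproduce the reported values.

```python
query = "aattgtgagcgg-ataacaatt"  # lacO1 with one gap introduced
sbjct = "aattgtgagcgggataacaatt"  # BE205560, positions 477 to 456

columns = len(query)
pairs = list(zip(query, sbjct))

# An identity is an aligned column with the same base in both rows.
identities = sum(q == s and q != "-" for q, s in pairs)
# A gap is an aligned column where either row carries "-".
gaps = sum("-" in (q, s) for q, s in pairs)

print(f"Identities = {identities}/{columns} ({100 * identities // columns}%)")
print(f"Gaps = {gaps}/{columns} ({100 * gaps // columns}%)")
```

The single difference from lacO1 is therefore one extra g in the onion sequence, with the 21 operator positions otherwise matching exactly.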
Example 6 Design and construction of an intragenic vector for Arabidopsis thaliana
The consensus T-DNA border sequence can be defined as:
5'GGCAGGATATATXXXXXTGTAAXX3'
although other variants can include:
5'GACAGGATATATXXXXXGGTCAXX3'
(nucleotides in bold represent possible substitutions in some T-DNA borders).
Searches for such DNA sequences identified a single sequence in the A. thaliana genome corresponding to:
5'GACAGGATATATCGTGATGTCAAC3' (SEQ ID NO:44)
(ex AL138652 from chromosome 3, bp 60629-60606)
This "T-DNA border" is remarkably similar to authentic T-DNA borders from Agrobacterium Ti or Ri plasmids, with all nucleotide substitutions occurring in variable regions:
5'GACAGGATATATGGTGATGTCACG3'   pTiS4              (SEQ ID NO:45)
5'GACAGGATATATGTTCTTGTCATG3'   pRi (TR rb)        (SEQ ID NO:46)
5'GGCAGGATATATCGAGGTGTAAAA3'   pTi15955 (TR lb)   (SEQ ID NO:47)
5'GACAGGATATATTGGCGGGTAAAC3'   pTiT37 (rb)        (SEQ ID NO:48)
5'GACAGGATATATTGGCGGGTAAAC3'   pTiC58 (rb)        (SEQ ID NO:49)
(nucleotides in bold represent nucleotide substitutions)
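The comparison of border sequences can be expressed as a pattern match. In the sketch below the regular expression merges the consensus and the variant given earlier, treating X as any nucleotide and allowing either stated base wherever the two forms differ. The original marks the variable positions in bold, and that emphasis is not available here, so this merged pattern is an assumption.

```python
import re

# Consensus GGCAGGATATATXXXXXTGTAAXX merged with variant
# GACAGGATATATXXXXXGGTCAXX; X is read as "any nucleotide".
BORDER_PATTERN = re.compile(r"G[GA]CAGGATATAT[ACGT]{5}[TG]GT[AC]A[ACGT]{2}")

BORDERS = {
    "A. thaliana (SEQ ID NO:44)": "GACAGGATATATCGTGATGTCAAC",
    "pTiS4 (SEQ ID NO:45)": "GACAGGATATATGGTGATGTCACG",
    "pRi TR rb (SEQ ID NO:46)": "GACAGGATATATGTTCTTGTCATG",
    "pTi15955 TR lb (SEQ ID NO:47)": "GGCAGGATATATCGAGGTGTAAAA",
    "pTiT37 rb (SEQ ID NO:48)": "GACAGGATATATTGGCGGGTAAAC",
    "pTiC58 rb (SEQ ID NO:49)": "GACAGGATATATTGGCGGGTAAAC",
}

for name, seq in BORDERS.items():
    matched = BORDER_PATTERN.fullmatch(seq) is not None
    print(f"{name:32} {seq} match={matched}")
```

The same compiled pattern could be applied with `BORDER_PATTERN.finditer` to a longer genomic sequence and to its reverse complement to look for candidate border-like sites.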
The A. thaliana "T-DNA border" is from an open reading frame (nucleotides 59676-63206 from AL138652) for a putative protein of unknown function [i.e. no promoters and presumably not a heterochromatic region]. Examination of the sequences flanking this "T-DNA border" reveals a 2838 bp fragment (nucleotides 59735-62572 from AL138652) with several unique restriction sites suitable as potential insertion sites for other genes and for Southern analysis of plants transformed using this vector.
If the "T-DNA border" found at nucleotides 60629-60606 is considered the "left border" of a binary vector, there are several unique restriction sites, including XbaI, between this left border and the first three nucleotides equivalent to a right border at positions 59735-59737. The right border beyond these three nucleotides can be provided by authentic right border sequences of non-plant origin, thereby resulting in a "chimeric right border".
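The coordinates quoted for AL138652 can be checked for internal consistency. This is a small arithmetic sketch using only the positions given in the text; the helper names are illustrative. The border coordinates are listed high-to-low, and the sketch treats that ordering as a span on the complementary strand, which is an assumption.

```python
def span_bp(start: int, end: int) -> int:
    """Length of an inclusive coordinate range, in either orientation."""
    return abs(end - start) + 1


def contains(outer: tuple, inner: tuple) -> bool:
    """True if the inner range lies entirely within the outer range."""
    return min(outer) <= min(inner) and max(inner) <= max(outer)


ORF = (59676, 63206)          # putative open reading frame
FRAGMENT = (59735, 62572)     # fragment used for the T-DNA region
LEFT_BORDER = (60629, 60606)  # plant-derived "T-DNA border"
RIGHT_STUB = (59735, 59737)   # first three nucleotides of the right border

print(span_bp(*FRAGMENT), span_bp(*LEFT_BORDER), span_bp(*RIGHT_STUB))
print(contains(ORF, FRAGMENT), contains(FRAGMENT, LEFT_BORDER))
```

The fragment length agrees with the stated 2838 bp, the left border is a 24 bp site like the authentic borders, and both lie inside the open reading frame.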
In assembling an intragenic vector all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
The following primers were designed:
Primer A: 5'CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC3' (SEQ ID NO:50)
Primer B: 5'AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC3' (SEQ ID NO:51)
Primer C: 5'AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC3' (SEQ ID NO:52)
Primer D: 5'GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG3' (SEQ ID NO:53)
Restriction sites within the primers are indicated in italics: TCTAGA - XbaI site, CTCGAG - XhoI site, GTCGAC - SalI site.
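The restriction sites carried by each primer can be confirmed with a simple substring search. In the sketch below the primer strings are the readings of SEQ ID NO:50-53 obtained by restoring the named recognition sites (TCTAGA, CTCGAG, GTCGAC) where the printed text is damaged, which is an assumption checked against the pTC1 sequence. The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

SITES = {"XbaI": "TCTAGA", "XhoI": "CTCGAG", "SalI": "GTCGAC"}

PRIMERS = {
    "A": "CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC",
    "B": "AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC",
    "C": "AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC",
    "D": "GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def sites_in(primer: str) -> list:
    """Names of the listed restriction sites present in a primer."""
    return [name for name, site in SITES.items() if site in primer]


for name, seq in PRIMERS.items():
    print(name, len(seq), sites_in(seq))

# Primers A and D anneal to opposite strands at the same XbaI site.
print(reverse_complement(PRIMERS["A"]) == PRIMERS["D"])
```

Primers A and D being exact reverse complements is what lets the two PCR products be joined at the shared XbaI site.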
Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers A and B, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "right border" 703 bp fragment, which was subsequently restricted with XbaI and XhoI and ligated into pPROEX-1 restricted with the same endonucleases, to form pPROEX-1-rb.
Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers C and D, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "left border" 2216 bp fragment, which was subsequently restricted with XbaI and SalI and ligated into pPROEX-1-rb restricted with the same endonucleases, to form pPROEX-AtTD.
The 2864 bp SalI to XhoI fragment of pPROEX-AtTD was ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form pTC1. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing. The full sequence of pTC1 is shown below and comprises a 2838 bp DNA fragment derived from Arabidopsis thaliana (nucleotides 59735-62572 from AL138652) presented in italics. The right and left T-DNA borders are in bold and the unique XbaI site used for subsequent cloning is in bold and underlined.
1 GTCGACGGAT CTTTTCCGCT GCATAACCCT GCTTCGGGGT CATTATAGCG ATTTTTTCGG
61 TATATCCATC CTTTTTCGCA CGATATACAG GATTTTGCCA AAGGGTTCGT GTAGACTTTC
121 CTTGGTGTAT CCAACGGCGT CAGCCGGGCA GGATAGGTGA AGTAGGCCCA CCCGCGAGCG 181 GGTGTTCCTT CTTCACTGTC CCTTATTCGC ACCTGGCGGT GCTCAACGGG AATCCTGCTC
241 TGCGAGGCTG GCCGGCTACC GCCGGCGTAA CAGATGAGGG CAAGCGGATG GCTGATGAAA
301 CCAAGCCAAC CAGGGGTGAT GCTGCCAACT TACTGATTTA GTGTATGATG GTGTTTTTGA
361 GGTGCTCCAG TGGCTTCTGT TTCTATCAGC TGTCCCTCCT GTTCAGCTAC TGACGGGGTG
421 GTGCGTAACG GCAAAAGCAC CGCCGGACAT CAGCGCTATC TCTGCTCTCA CTGCCGTAAA 481 ACATGGCAAC TGCAGTTCAC TTACACCGCT TCTCAACCCG GTACGCACCA GAAAATCATT
541 GATATGGCCA TGAATGGCGT TGGATGCCGG GCAACAGCCC GCATTATGGG CGTTGGCCTC
601 AACACGATTT TACGTCACTT AAAAAACTCA GGCCGCAGTC GGTAACCTCG CGCATACAGC
661 CGGGCAGTGA CGTCATCGTC TGCGCGGAAA TGGACGAACA GTGGGGCTAT GTCGGGGCTA
721 AATCGCGCCA GCGCTGGCTG TTTTACGCGT ATGACAGTCT CCGGAAGACG GTTGTTGCGC 781 ACGTATTCGG TGAACGCACT ATGGCGACGC TGGGGCGTCT TATGAGCCTG CTGTCACCCT
841 TTGACGTGGT GATATGGATG ACGGATGGCT GGCCGCTGTA TGAATCCCGC CTGAAGGGAA
901 AGCTGCACGT AATCAGCAAG CGATATACGC AGCGAATTGA GCGGCATAAC CTGAATCTGA
961 GGCAGCACCT GGCACGGCTG GGACGGAAGT CGCTGTCGTT CTCAAAATCG GTGGAGCTGC
1021 ATGACAAAGT CATCGGGCAT TATCTGAACA TAAAACACTA TCAATAAGTT GGAGTCATTA 1081 CCCAACCAGG AAGGGCAGCC CACCTATCAA GGTGTACTGC CTTCCAGACG AACGAAGAGC
1141 GATTGAGGAA AAGGCGGCGG CGGCCGGCAT GAGCCTGTCG GCCTACCTGC TGGCCGTCGG
1201 CCAGGGCTAC AAAATCACGG GCGTCGTGGA CTATGAGCAC GTCCGCGAGC TGGCCCGCAT
1261 CAATGGCGAC CTGGGCCGCC TGGGCGGCCT GCTGAAACTC TGGCTCACCG ACGACCCGCG 1321 CACGGCGCGG TTCGGTGATG CCACGATCCT CGCCCTGCTG GCGAAGATCG AAGAGAAGCA
1381 GGACGAGCTT GGCAAGGTCA TGATGGGCGT GGTCCGCCCG AGGGCAGAGC CATGACTTTT
1441 TTAGCCGCTA AAACGGCCGG GGGGTGCGCG TGATTGCCAA GCACGTCCCC ATGCGCTCCA
1501 TCAAGAAGAG CGACTTCGCG GAGCTGGTAT TCGTGCAGGG CAAGATTCGG AATACCAAGT 1561 ACGAGAAGGA CGGCCAGACG GTCTACGGGA CCGACTTCAT TGCCGATAAG GTGGATTATC
1621 TGGACACCAA GGCACCAGGC GGGTCAAATC AGGAATAAGG GCACATTGCC CCGGCGTGAG
1681 TCGGGGCAAT CCCGCAAGGA GGGTGAATGA ATCGGACGTT TGACCGGAAG GCATACAGGC
1741 AAGAACTGAT CGACGCGGGG TTTTCCGCCG AGGATGCCGA AACCATCGCA AGCCGCACCG
1801 TCATGCGTGC GCCCCGCGAA ACCTTCCAGT CCGTCGGCTC GATGGTCCAG CAAGCTACGG 1861 CCAAGATCGA GCGCGACAGC GTGCAACTGG CTCCCCCTGC CCTGCCCGCG CCATCGGCCG
1921 CCGTGGAGCG TTCGCGTCGT CTCGAACAGG AGGCGGCAGG TTTGGCGAAG TCGATGACCA
1981 TCGACACGCG AGGAACTATG ACGACCAAGA AGCGAAAAAC CGCCGGCGAG GACCTGGCAA
2041 AACAGGTCAG CGAGGCCAAG CAGGCCGCGT TGCTGAAACA CACGAAGCAG CAGATCAAGG
2101 AAATGCAGCT TTCCTTGTTC GATATTGCGC CGTGGCCGGA CACGATGCGA GCGATGCCAA 2161 ACGACACGGC CCGCTCTGCC CTGTTCACCA CGCGCAACAA GAAAATCCCG CGCGAGGCGC
2221 TGCAAAACAA GGTCATTTTC CACGTCAACA AGGACGTGAA GATCACCTAC ACCGGCGTCG
2281 AGCTGCGGGC CGACGATGAC GAACTGGTGT GGCAGCAGGT GTTGGAGTAC GCGAAGCGCA
2341 CCCCTATCGG CGAGCCGATC ACCTTCACGT TCTACGAGCT TTGCCAGGAC CTGGGCTGGT
2401 CGATCAATGG CCGGTATTAC ACGAAGGCCG AGGAATGCCT GTCGCGCCTA CAGGCGACGG 2461 CGATGGGCTT CACGTCCGAC CGCGTTGGGC ACCTGGAATC GGTGTCGCTG CTGCACCGCT
2521 TCCGCGTCCT GGACCGTGGC AAGAAAACGT CCCGTTGCCA GGTCCTGATC GACGAGGAAA
2581 TCGTCGTGCT GTTTGCTGGC GACCACTACA CGAAATTCAT ATGGGAGAAG TACCGCAAGC
2641 TGTCGCCGAC GGCCCGACGG ATGTTCGACT ATTTCAGCTC GCACCGGGAG CCGTACCCGC
2701 TCAAGCTGGA AACCTTCCGC CTCATGTGCG GATCGGATTC CACCCGCGTG AAGAAGTGGC 2761 GCGAGCAGGT CGGCGAAGCC TGCGAAGAGT TGCGAGGCAG CGGCCTGGTG GAACACGCCT
2821 GGGTCAATGA TGACCTGGTG CATTGCAAAC GCTAGGGCCT TGTGGGGTCA GTTCCGGCTG
2881 GGGGTTCAGC AGCCAGCGCT TTACTGGCAT TTCAGGAACA AGCGGGCACT GCTCGACGCA
2941 CTTGCTTCGC TCAGTATCGC TCGGGACGCA CGGCGCGCTC TACGAACTGC CGATAAACAG
3001 AGGATTAAAA TTGACAATTG TGATTAAGGC TCAGATTCGA CGGCTTGGAG CGGCCGACGT 3061 GCAGGATTTC CGCGAGATCC GATTGTCGGC CCTGAAGAAA GCTCCAGAGA TGTTCGGGTC
3121 CGTTTACGAG CACGAGGAGA AAAAGCCCAT GGAGGCGTTC GCTGAACGGT TGCGAGATGC
3181 CGTGGCATTC GGCGCCTACA TCGACGGCGA GATCATTGGG CTGTCGGTCT TCAAACAGGA
3241 GGACGGCCCC AAGGACGCTC ACAAGGCGCA TCTGTCCGGC GTTTTCGTGG AGCCCGAACA
3301 GCGAGGCCGA GGGGTCGCCG GTATGCTGCT GCGGGCGTTG CCGGCGGGTT TATTGCTCGT 3361 GATGATCGTC CGACAGATTC CAACGGGAAT CTGGTGGATG CGCATCTTCA TCCTCGGCGC
3421 ACTTAATATT TCGCTATTCT GGAGCTTGTT GTTTATTTCG GTCTACCGCC TGCCGGGCGG
3481 GGTCGCGGCG ACGGTAGGCG CTGTGCAGCC GCTGATGGTC GTGTTCATCT CTGCCGCTCT
3541 GCTAGGTAGC CCGATACGAT TGATGGCGGT CCTGGGGGCT ATTTGCGGAA CTGCGGGCGT
3601 GGCGCTGTTG GTGTTGACAC CAAACGCAGC GCTAGATCCT GTCGGCGTCG CAGCGGGCCT 3661 GGCGGGGGCG GTTTCCATGG CGTTCGGAAC CGTGCTGACC CGCAAGTGGC AACCTCCCGT
3721 GCCTCTGCTC ACCTTTACCG CCTGGCAACT GGCGGCCGGA GGACTTCTGC TCGTTCCAGT 3781 AGCTTTAGTG TTTGATCCGC CAATCCCGAT GCCTACAGGA ACCAATGTTC TCGGCCTGGC
3841 GTGGCTCGGC CTGATCGGAG CGGGTTTAAC CTACTTCCTT TGGTTCCGGG GGATCTCGCG
3901 ACTCGAACCT ACAGTTGTTT CCTTACTGGG CTTTCTCAGC CGGGATGGCG CTAAGAAGCT
3961 ATTGCCGCCG ATCTTCATAT GCGGTGTGAA ATACCGCACA GATGCGTAAG GAGAAAATAC 4021 CGCATCAGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG
4081 CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT 4141 AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC 4201 GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC 4261 TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA 4321 AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT
4381 CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG
4441 TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC
4501 GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG
4561 GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC 4621 TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG
4681 CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC
4741 GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT
4801 CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT
4861 TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA 4921 AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA
4981 TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 5041 TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 5101 GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCCA 5161 GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT 5221 AAACAAGTGG CAGCAACGGA TTCGCAAACC TGTCACGCCT TTTGTGCCAA AAGCCGCGCC
5281 AGGTTTGCGA TCCGCTGTGC CAGGCGTTAG GCGTCATATG AAGATTTCGG TGATCCCTGA
5341 GCAGGTGGCG GAAACATTGG ATGCTGAGAA CCATTTCATT GTTCGTGAAG TGTTCGATGT
5401 GCACCTATCC GACCAAGGCT TTGAACTATC TACCAGAAGT GTGAGCCCCT ACCGGAAGGA
5461 TTACATCTCG GATGATGACT CTGATGAAGA CTCTGCTTGC TATGGCGCAT TCATCGACCA 5521 AGAGCTTGTC GGGAAGATTG AACTCAACTC AACATGGAAC GATCTAGCCT CTATCGAACA
5581 CATTGTTGTG TCGCACACGC ACCGAGGCAA AGGAGTCGCG CACAGTCTCA TCGAATTTGC
5641 GAAAAAGTGG GCACTAAGCA GACAGCTCCT TGGCATACGA TTAGAGACAC AAACGAACAA 5701 TGTACCTGCC TGCAATTTGT ACGCAAAATG TGGCTTTACT CTCGGCGGCA TTGACCTGTT 5761 CACGTATAAA ACTAGACCTC AAGTCTCGAA CGAAACAGCG ATGTACTGGT ACTGGTTCTC 5821 GGGAGCACAG GATGACGCCT AACAATTCAT TCAAGCCGAC ACCGCTTCGC GGCGCGGCTT
5881 AATTCAGGAG TTAAACATCA TGAGGGAAGC GGTGATCGCC GAAGTATCGA CTCAACTATC
5941 AGAGGTAGTT GGCGTCATCG AGCGCCATCT CGAACCGACG TTGCTGGCCG TACATTTGTA
6001 CGGCTCCGCA GTGGATGGCG GCCTGAAGCC ACACAGTGAT ATTGATTTGC TGGTTACGGT
6061 GACCGTAAGG CTTGATGAAA CAACGCGGCG AGCTTTGATC AACGACCTTT TGGAAACTTC 6121 GGCTTCCCCT GGAGAGAGCG AGATTCTCCG CGCTGTAGAA GTCACCATTG TTGTGCACGA
6181 CGACATCATT CCGTGGCGTT ATCCAGCTAA GCGCGAACTG CAATTTGGAG AATGGCAGCG 6241 CAATGACATT CTTGCAGGTA TCTTCGAGCC AGCCACGATC GACATTGATC TGGCTATCTT
6301 GCTGACAAAA GCAAGAGAAC ATAGCGTTGC CTTGGTAGGT CCAGCGGCGG AGGAACTCTT
6361 TGATCCGGTT CCTGAACAGG ATCTATTTGA GGCGCTAAAT GAAACCTTAA CGCTATGGAA
6421 CTCGCCGCCC GACTGGGCTG GCGATGAGCG AAATGTAGTG CTTACGTTGT CCCGCATTTG 6481 GTACAGCGCA GTAACCGGCA AAATCGCGCC GAAGGATGTC GCTGCCGACT GGGCAATGGA
6541 GCGCCTGCCG GCCCAGTATC AGCCCGTCAT ACTTGAAGCT AGGCAGGCTT ATCTTGGACA
6601 AGAAGATCGC TTGGCCTCGC GCGCAGATCA GTTGGAAGAA TTTGTTCACT ACGTGAAAGG
6661 CGAGATCACC AAGGTAGTCG GCAAATAATG TCTAACAATT CGTTCAAGCC GACGCCGCTT
6721 CGCGGCGCGG CTTAACTCAA GCGTTAGAGA GCTGGGGAAG ACTATGCGCG ATCTGTTGAA 6781 GGTGGTTCTA AGCCTCGTAC TTGCGATGGC ATCGGGGCAG GCACTTGCTG ACCTGCCAAT
6841 TGTTTTAGTG GATGAAGCTC GTCTTCCCTA TGACTACTCC CCATCCAACT ACGACATTTC
6901 TCCAAGCAAC TACGACAACT CCATAAGCAA TTACGACAAT AGTCCATCAA ATTACGACAA
6961 CTCTGAGAGC AACTACGATA ATAGTTCATC CAATTACGAC AATAGTCGCA ACGGAAATCG
7021 TAGGCTTATA TATAGCGCAA ATGGGTCTCG CACTTTCGCC GGCTACTACG TGATTGCCAA 7081 CAATGGGACA ACGAACTTCT TTTCCACATC TGGCAAAAGG ATGTTCTACA CCCCAAAAGG
7141 GGGGCGCGGC GTCTATGGCG GCAAAGATGG GAGCTTCTGC GGGGCATTGG TCGTCATAAA
7201 TGGCCAATTT TCGCTTGCCC TGACAGATAA CGGCCTGAAG ATCATGTATC TAAGCAACTA
7261 GCCTGCTCTC TAATAAAATG TTAGGAGCTT GGCTGCCATT TTTGGGGTGA GGCCGTTCGC
7321 GGCCGAGGGG CGCAGCCCCT GGGGGGATGG GAGGCCCGCG TTAGCGGGCC GGGAGGGTTC 7381 GAGAAGGGGG GGCACCCCCC TTCGGCGTGC GCGGTCACGC GCCAGGGCGC AGCCCTGGTT
7441 AAAAACAAGG TTTATAAATA TTGGTTTAAA AGCAGGTTAA AAGACAGGTT AGCGGTGGCC
7501 GAAAAACGGG CGGAAACCCT TGCAAATGCT GGATTTTCTG CCTGTGGACA GCCCCTCAAA
7561 TGTCAATAGG TGCGCCCCTC ATCTGTCAGC ACTCTGCCCC TCAAGTGTCA AGGATCGCGC
7621 CCCTCATCTG TCAGTAGTCG CGCCCCTCAA GTGTCAATAC CGCAGGGCAC TTATCCCCAG 7681 GCTTGTCCAC ATCATCTGTG GGAAACTCGC GTAAAATCAG GCGTTTTCGC CGATTTGCGA
7741 GGCTGGCCAG CTCCACGTCG CCGGCCGAAA TCGAGCCTGC CCCTCATCTG TCAACGCCGC
7801 GCCGGGTGAG TCGGCCCCTC AAGTGTCAAC GTCCGCCCCT CATCTGTCAG TGAGGGCCAA
7861 GTTTTCCGCG AGGTATCCAC 7ACGCCGGCG GCCGGCCGCG GTGTCTCGCA CACGGCTTCG
7921 ACGGCGTTTC TGGCGCGTTT GCAGGGCCAT AGACGGCCGC CAGCCCAGCG GCGAGGGCAA 7981 CCAGCCCGGT GAGCGTCGGA AAGGGTCGAG GTTTACCCGC CAATATATCC TGTCTATGTT
8041 TCACATGAAC ACGTGAATCT TCTTCAACAC GCCCACCTAA CCGCTCCTTT GCAGATAATC
8101 GACGGCGTCG AGTTGATGTG TGATCAACAT TACCAGAATT CCTTTCATCA GCTGAGTATC
8161 GGAATTGTTC TCTGCTTATT CCTCCATCCA CTGCATAGTT CCCTAGCTTG TCTCTGTAAT
8221 CATATGCTAC TTCATGTTCA CGGAACCTTT TACTATCTGC CTTCTCATAA GACATTCTTG 8281 ATTGCTTAGC ATCCCTGTAG TTGTAATCAT AAGGCATATT CTCATGCATA ACCTCACTTG
8341 CGTTGTCTCT AAGACCATAA TCATCTCTTG TACGCAAAAT TGAATCATTC GAATGATAAA
8401 CCTCTTGTCT ACCATCTTGA TATCTCATAT TGGCATAAAC TTTAACATCA CCACCATTAC
8461 GTCGTTGCAA ACGCTCATCA TCCAAGTAGA CTTGATCTCG GTCATCAAAA AGATATCTCC
8521 TGCCTCGAAG AGCTTCCTCA TCTTGCTTGC CAGCTGATGA TCTACTGACA TCAGGATGCA 8581 TCACCCCATA CGAATCAATT TCATGATCTC TTAGGAGTTG CTGGCTTTCA TAGGGCAAAT
8641 AGGCTTCCCT TCCGTCATTC GAGGACATTC CTTTACGCTC TAGAGCTCTA GCACCTCCTC 8701 GGTCCACAAT CTCTGCTTTG GTGACAGCAG GATACATCCT CTCATCAATG CCAGAGTCGT
8761 AGTACTTCAG TTGTTGTTTA TTGTAATGCT GATAAACATC CTTGCTTTCA TTATCCAAAT
8821 ACGCTTCATT TCTATCAATG AAGGCTACTC TCCTAAGCTC TAGCGCCTTG GCATCTCCAT
8881 GGTCTACTAT AATATCTGAC GAGTTGACAT CACGATATAT CCTGTCATCA ATGCCATAGT 8941 CATGATCTTT CTTAAGTTGT TGGCTTTCGT AATGCAGATA TGCATCCCCC CTTTTATAAT
9001 CCATGTATGA TTCCTCTCCA TCATCGAAGG ATCCTCTTCT ACGCTCAAGA GCTCTGGCTT
9061 CTTCCCCGTT TACAAGAATA TCTGATTTAT TGAGACTGGG ATGCATCATG CCAAAAGAGT
9121 TAGTTTCATG ATCTTTTAGG AGTTGCTGGC TTTCACTTTG AAAATATGCT TCCTTTCGAT
9181 CATTTAAGGA TACTCCTCTA TACCTTAGAC CTCTTGCATC TTCATGGTCT ACTAGAATAT 9241 CTGATCTGTT GACATCAGGA GGCATCATGA CATAAGAGTC AGTTTCATAA TCGTTTAGGA
9301 GTTGCTGGTT TTCACATTGC AAGTATGCGT CCTTTTTATC ATTCAAGGAC ACTCCTCCAT
9361 ACCTCCGACC TCTGGCATCT TCATGGTCTA CCAGAATATC TGATTTGTTG ACATCGGGAT
9421 GCCTCATGAC GTAAGAGTCA GTTTCATGAT CGTTTAGGAG TCGCTGCCTT TCACATTGCA
9481 AGTATGCTTC CTTTTTATCA TTCAAGGAAA CTCCTCTATA CCTCCGACCT CTGCCATCTT 9541 CATGGTCTAC CAGAGTATCT GATTTGTTGA CATGGGGATG CATCATGCCA TAGGAGTTAG
9601 TTTCATAATC ATTTAGGAGT CTCTGTCTTT CACATTGCAT GTATGCTTCC TTTTTATCAT
9661 TCAAGGACCC TCCTCTATAC CTTAGACCTT TGGAATCTTC CCGGTCTTCC AGAGTATCTG
9721 ATTTGTTGAC ATCGGGATGC ATTATGCCAT AGGAGTTAGT TTCATAATCA TTTAGGAGTT
9781 GCTGGCTTTC ACATTGCAAG TAAGCTTCCC TTCTATCATT TAAGGACCCT CCTCTATACC 9841 TTAGACCTCT GGAATTTTCC CGTTCCCAGT CTGCTAGAAT ATCTAATCTG TTGACATCAT
9901 CAGGATAAAT CTTAGCATCA GAGCGAGAGT CATAATCTTT CTCCAGTTGT TGGATTTTGT
9961 AATTCAGATA AACATCCTTC CTTTCATTAT CCAAGTATGC TGCCTTTCCT TTGTTTAAGC
10021 ATCGTCGTTG AAGCTGCACT CCTCTTCCAT CCTGATCTAC CACAATATCT GCCATATCTA
10081 CATCAGAATG TATCCTACCA TCATTGCCAG GGTGATACTC GTTTCTCAGT TCCTGTCTTT 10141 TATCATCTGA AAAAACAGCA TTTCTCTCAT TATCCAAATA TGCTTCCCTT TCTTTATTTA
10201 AGCAACGTGT TATGCTCTGT GAAGCTTTAT CATCTCGATC TACTACAGCA TATGGTTCAC
10261 TGAAATCAAG ATTCTTCTTA CTGTCAACAC CATCATATAG ATAATCCTTT CTCAGTACTT
10321 GACATTTGTT ATCCAAATAA ACAACCTTTC TTTCTTCGTT ACGGAAGTTC CTGTAGTGAT
10381 CATCTCGTTC ATCCACCACT CTTGATCCCA ACTCCACAAA AGGATAATCT TCCTTCACAG 10441 ACTCATAATG GTCAGCCATC CTCTCTTTCC TGCTAAACTC AAGATGGGTA TCGGCCGCAT
10501 CAACATCAGC TATATTTGAA CCACAGACAT GGGATTTTGA TAAAGATCCT CTCCTCTGCA
10561 TAAAAAGATC ATTCTCTCTA GCCACATTAT TGACCTCATG CCTAACACTG GGAAACTCTC
10621 TCATTGCTAT ATCAGAGCCT ATATGATAAT TATCCCGAGC TTCATCCACT ATCTCTTTGA
10681 CACACCTGCT CACAACTGGT GAATCATGGT CTCCACGACT TAAATCTCTA ACTTGTTGAT 10741 CCCTTGGTGA GTTTCTACCA ACATAATCAT CGACACGTCT AGTACGTAGA ACCTGTGGTA
10801 CACAAAGATT CCCATCATAA TCATGTCTTC TTCTCTCATC GGAATCATTC ACACAACCGA
10861 AAGATCTA
(SEQ ID NO:54)
A mutant form of the Arabidopsis thaliana acetohydroxyacid synthase gene conferring resistance to sulfonylurea herbicides such as chlorsulfuron was inserted into the T-DNA of pTC1. The 5.8 kb XbaI fragment from pGH1 (Haughn et al. 1988, Molecular and General Genetics 211: 266-271) was ligated into the unique XbaI site between the left and right T-DNA borders of pTC1 to produce pTCAHAS. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing.
Example 7 Transformation of Arabidopsis thaliana with an intragenic vector
The pTCAHAS binary vector was transformed into the disarmed Agrobacterium tumefaciens strain EHA105 (Hood et al 1993, Transgenic Research, 2:208-218), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin and used to transform Arabidopsis thaliana 'Columbia' using the floral dip method (Clough and Bent, Plant Journal 16: 735-743, 1998).
The resulting self pollinated seed was screened in vitro on half-strength MS salts (Murashige and Skoog 1962, Physiologia Plantarum, 15: 473-497) supplemented with 10 μg/L chlorsulfuron. Seeds were also sown on a standard potting mix in a greenhouse and the germinated seedlings at the 3-4 true leaf stage were sprayed with a standard application of Glean (active ingredient chlorsulfuron) at a rate equivalent to 20 g/ha.
The recovered chlorsulfuron-resistant seedlings were confirmed as being transformed with the intragenic vector pTCAHAS by polymerase chain reactions on their genomic DNA across the junctions of the two XbaI sites adjoining the original T-DNA of pTC1 and the 5.8 kb XbaI fragment inserted to form pTCAHAS. The following primers were used:
Primer E: 5'CATCCACTGCATAGTTCCC3' (SEQ ID NO:55)
Primer F: 5'GATGCGTTGATCTCTTCATCA3' (SEQ ID NO:56)
Primer G: 5'TCAACATCAATCCGAGTACG3' (SEQ ID NO:57)
Primer H: 5'AGAGATTGTGGACCGAGGAG3' (SEQ ID NO:58)
As illustrated in Figure 3, the expected 643 bp DNA fragment was PCR amplified from the binary vector pTCAHAS and three A. thaliana lines transformed with pTCAHAS using primers E+F designed to flank the XbaI site inside the right T-DNA border. Similarly, the expected 149 bp DNA fragment was PCR amplified from the same DNA sources using primers G+H designed to flank the XbaI site inside the left T-DNA border.
Example 8 Construction of additional intragenic vectors
In assembling the intragenic vector all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al , Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
A binary vector with a T-DNA composed of potato DNA
The 1268 bp sequence illustrated in Example 2 as a T-DNA-like region of a potato (Solanum tuberosum) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57POTINV).
The SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPOTINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
The pPOTINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 μmol m⁻² sec⁻¹; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPOTINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on the potato medium defined above and incubated under reduced light intensity (5-10 μmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPOTINV. The following primers were used:
Primer I: 5'GCTCACCTTGCAGCTTCACT3' (SEQ ID NO:59)
Primer J: 5'CAGAGCTGGATTTGCATCAG3' (SEQ ID NO:60)
to amplify an expected 570 bp DNA fragment from the T-DNA-like region of pPOTINV, and
Primer K: 5'GATGGCAGAAGGCGAAGATA3' (SEQ ID NO:61)
Primer L: 5'GAGCTGGTCTTTGAAGTCTCG3' (SEQ ID NO:62)
as an internal control to amplify an expected 1069 bp fragment from the endogenous potato actin gene.
The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV. The expected 570 bp DNA fragment was PCR amplified from the binary vector pPOTINV and from two of 80 hairy root lines tested using primers I and J (Figure 4). The DNA samples from the two hairy root lines positive for the T-DNA from pPOTINV and the control hairy root lines were also used for PCR using primers designed for the Agrobacterium virG gene:
Primer M: 5'GCGGTAGCCGACAG3' (SEQ ID NO:63)
Primer N: 5'GCGTCAAAGAAATA3' (SEQ ID NO:64)
The DNA samples from all hairy root lines failed to amplify PCR products using primers M and N (Figure 5). Furthermore, cultures of these hairy roots failed to grow bacteria when incubated in LB medium. These results establish the absence of associated Agrobacterium with the hairy roots. The 2.5% co-transformation frequency (2 of 80) of T-DNAs from pArA4b and pPOTINV was achieved despite selection for only hairy roots. This demonstrates that the pPOTINV binary vector is effective in transforming potatoes.
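The screening logic described above can be sketched in code. This is an illustrative sketch only: the class name, field names and category labels are assumptions introduced here, and the sketch applies the virG test to every line, whereas the text reports that test only for the positive and control lines.

```python
from dataclasses import dataclass


@dataclass
class PcrResult:
    """Band calls for one hairy root line (field names are hypothetical)."""
    line_id: str
    tdna_band: bool   # 570 bp product with primers I and J
    actin_band: bool  # 1069 bp internal control with primers K and L
    virg_band: bool   # product with primers M and N (Agrobacterium virG)


def classify(result: PcrResult) -> str:
    """Assign a hairy root line to a category from its three PCR outcomes."""
    if not result.actin_band:
        # Without the internal control the DNA sample cannot be interpreted.
        return "uninformative"
    if result.virg_band:
        # A virG product suggests residual Agrobacterium in the sample.
        return "agrobacterium-present"
    if result.tdna_band:
        return "co-transformed"
    return "hairy-root-only"


def count_by_category(results):
    """Tally how many lines fall into each category."""
    counts = {}
    for result in results:
        category = classify(result)
        counts[category] = counts.get(category, 0) + 1
    return counts


# Hypothetical data set mirroring the reported outcome: 80 lines, 2 positive.
lines = [PcrResult(f"line{i}", tdna_band=(i < 2), actin_band=True, virg_band=False)
         for i in range(80)]
summary = count_by_category(lines)
print(summary)
```

Requiring the actin control before reading the other bands reflects its role in the text as an internal control for template quality.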
A binary vector with a T-DNA composed of petunia DNA
The 1507 bp sequence illustrated in Example 2 as a T-DNA-like region of a petunia (Petunia hybrida) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57PETINV). The SalI fragment encompassing the T-DNA composed of petunia DNA from pUC57PETINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPETINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
The pPETINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 μmol m⁻² sec⁻¹; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPETINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on potato medium defined above and incubated under reduced light intensity (5-10 μmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPETINV. The following primers were used:
Primer O: 5'GAGATAAACAAATAGTCCGGATCG3' (SEQ ID NO:65)
Primer P: 5OGGAGCATTTGGTGGAAATAG3' (SEQ ID NO:66)
to amplify an expected 447 bp DNA fragment from the T-DNA-like region of pPETINV. The same DNA samples were also used in a PCR using primers K and L designed to amplify an expected 1069 bp fragment from the endogenous potato actin gene as an internal control. The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV. The expected 447 bp DNA fragment from the T-DNA-like region of pPETINV was PCR amplified from the binary vector pPETINV and from one of 85 hairy root lines tested using primers O and P (Figure 6). The DNA sample from the hairy root line positive for the T-DNA from pPETINV failed to amplify a PCR product using primers M and N designed for the Agrobacterium virG gene (Figure 7). Furthermore, a culture of this hairy root line failed to grow bacteria when incubated in LB medium. These results establish the absence of associated Agrobacterium with the hairy root line positive for the T-DNA from pPETINV. The approximately 1.2% co-transformation frequency (1 of 85) of T-DNAs from pArA4b and pPETINV was achieved despite selection for only hairy roots. Overall, these results demonstrate that the pPETINV binary vector is effective in transforming plants.
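The co-transformation frequencies reported for the two vectors follow directly from the counts of PCR-positive hairy root lines. A minimal sketch of that arithmetic, with a function name chosen here purely for illustration:

```python
def cotransformation_percent(positive_lines: int, lines_tested: int) -> float:
    """Percentage of tested hairy root lines that also carry the binary-vector T-DNA."""
    if lines_tested <= 0:
        raise ValueError("lines_tested must be positive")
    return 100.0 * positive_lines / lines_tested


# Counts taken from the text: 2 of 80 lines for pPOTINV, 1 of 85 for pPETINV.
potato_vector = cotransformation_percent(2, 80)
petunia_vector = cotransformation_percent(1, 85)
print(f"pPOTINV: {potato_vector:.1f}%")
print(f"pPETINV: {petunia_vector:.1f}%")
```

With so few positive lines, these percentages are best read as rough estimates of the true frequency.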
A binary vector with a T-DNA composed of onion DNA
The 1075 bp sequence illustrated in Example 2 as a T-DNA-like region of an onion (Allium cepa) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57ALLINV).
The SalI fragment encompassing the T-DNA composed of onion DNA from pUC57ALLINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pALLINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
Example 9 Design, Construction and Verification of Plant Derived Recombination Sites: loxP-like sites for recombination with Cre recombinase
BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.
1) Potato DNA fragment containing a loxP-like sequence - POTLOXP
A fragment containing a loxP-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ111407 and BQ045786). This fragment, named POTLOXP, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the loxP-like sequence shown in bold and light grey.
[Figure imgf000103_0001: POTLOXP sequence, shown as an image in the original]
(SEQ ID NO:67)
Nucleotides 1-3 part of EcoRV restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-402 nucleotides 17-415 of NCBI accession BQ111407
Nucleotides 403-653 nucleotides 298-548 of NCBI accession BQ045786
Nucleotides 654-655 part of EcoRV restriction enzyme site (from the potato intragenic T-DNA)
The designed potato loxP-like sequence has 6 nucleotide mismatches from the native loxP sequence as illustrated in bold below.
loxP sequence ATAACTTCGTATAGCATACATTATACGAAGTTAT (SEQ ID NO:68)
Potato loxP-like CGATTCGM (SEQ ID NO:69)
The 655 bp POTLOXP sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
Initially the 1286 bp SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV (described in Example 8) was subcloned into pGEMT to form pGEMTPOTINV. POTLOXP was then cloned into pGEMTPOTINV twice, firstly as an XbaI to ClaI fragment, then subsequently as an EcoRV to EcoRV fragment. The POTLOXP inserts were verified using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTLOXP2.
The DNA sequence of the 2316 bp SalI fragment comprising the potato derived T-DNA region in pPOTLOXP2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTLOXP regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 2005-2028. Restriction sites illustrated in bold represent those used in cloning the POTLOXP regions into pGEMTPOTINV. Unique restriction sites in pPOTLOXP2 for cloning between POTLOXP sites are:
AflII C/TTAAG
AgeI A/CCGGT
BamHI G/GATCC
BstD102I GAG/CGG
CspI CG/GWCCG
PinAI A/CCGGT
GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTT
[shaded POTLOXP region: sequence not legible in the source]
TCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACTATC AAGCCACCGACTTAAAGAAGATAACAACAGGACAGAATGATAAAGGTCTTAAGCTTTATGA TCCAGGCTATCTCAACACAGCACCTGTTAGGTCATCAATATGCTATATAGATGGTGATGCCG
GGATCC AGA
[shaded POTLOXP region: sequence not legible in the source]
GATGCTCATCCAATGGGGGTTCTTGTCAGTGCAATGAGTGCTCTTTCCGTTTTTCATCCTGA TGCAAATCCAGCTCTGAGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGT GGCACTGCTCAATATATGAGGTGGGCGCGAGAAGCAGGTACCAATGTGTCCTCATCAAGAGA TGCATTCTTTACCAATCCAACGGTCAAAGCATACTACAAGTCTTTTGTCAAGGCTATTGTGA CAAGAAAAAACTCTATAAGTGGAGTTAAATATTCAGAAGAGCCCGCCATATTTGCGTGGGAA CTCATAAATGAGCCTCGTTGTGAATCCAGTTCATCAGCTGCTGCTCTCCAGGCGTGGATAGC AGAGATGGCTGGATTTGTCGAC (SEQ ID NO:70)
The ability of this construct to undergo recombination between the POTLOXP sites was tested in vivo using the Cre recombinase-expressing Escherichia coli strain 294-Cre (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTLOXP2 was transformed into E. coli strain 294-Cre and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of Cre recombinase in E. coli strain 294-Cre, which effected recombination between the two POTLOXP sites in pPOTLOXP2. This was evident from a reduction in the size of pPOTLOXP2 from 5316 bp to 4480 bp. Plasmid isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 °C was restricted with SalI. All colonies tested produced the fragments of 3.0 kb and 1.5 kb expected when recombination between the POTLOXP sites has occurred (Figure 8).
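The reported sizes can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function and variable names are assumptions made here, and the backbone size is inferred as the plasmid size minus the SalI insert size rather than taken from the text.

```python
def after_excision(insert_bp: int, plasmid_bp: int, excised_bp: int) -> dict:
    """Sizes expected once a segment of excised_bp is lost from the insert."""
    backbone_bp = plasmid_bp - insert_bp      # inferred, not stated in the text
    new_insert_bp = insert_bp - excised_bp
    return {
        "backbone_bp": backbone_bp,
        "insert_bp": new_insert_bp,
        "plasmid_bp": backbone_bp + new_insert_bp,
    }


# Figures from the text: 2316 bp SalI fragment, plasmid 5316 bp before
# recombination and 4480 bp after it.
excised_bp = 5316 - 4480
result = after_excision(insert_bp=2316, plasmid_bp=5316, excised_bp=excised_bp)

# The right border lies downstream of the excised segment, so its coordinates
# shift by the same amount; the left border (314-337) lies upstream and stays put.
right_border_after = (2005 - excised_bp, 2028 - excised_bp)

print(excised_bp, result, right_border_after)
```

Under these assumptions the SalI digest of the recombined plasmid gives fragments of about 3.0 kb and 1.5 kb, and the right border moves from 2005-2028 to 1169-1192, in agreement with the figures given in the text.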
Recombination between the POTLOXP sites was further verified by DNA sequencing.
Plasmid was isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 °C, then DNA sequenced across the SalI region inserted into pGEMT. The resulting sequence from two independent cultures is illustrated below and confirms that recombination is base pair faithful through the remaining POTLOXP site in plasmid preparations. Only the nucleotides in italics are not part of the potato genome sequences. The remaining POTLOXP region is shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1169-1192. Restriction sites illustrated in bold represent those remaining from cloning the POTLOXP regions into pPOTINV.
GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTT
[Figure imgf000107_0001: shaded remaining POTLOXP region, shown as an image in the original]
ATCATACAGTCAATGCCCCATGATGCTCATCCAATGGGGGTTCTTGTCAGT
GCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTGAGAGGACAGGATAT
ATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATATGAGGTGGGCGCGA
GAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAATCCAACGGTCAAAGC
ATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTATAAGTGGAGTTAAAT
ATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTCGTTGTGAATCCAGT
TCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTTGTCGAC
(SEQ ID NO:71)
2) LoxP-like sequences from other species
Medicago truncatula (barrel medic) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Barrel medic loxP-like ATGACTTCGTATAATGTATGCTATACGAAGTGTG (SEQ ID NO:72)
Nucleotides 1-19 Nucleotides 109-127 of NCBI accession CA919120
Nucleotides 20-34 Nucleotides 14-28 of NCBI accession CA989265
The barrel medic loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
Picea (spruce) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Spruce loxP-like ATACCTTCGTATAATGTATGCTATACAAAGAAAT (SEQ ID NO:73)
Nucleotides 1-15 Nucleotides 226-240 of NCBI accession CO215992
Nucleotides 16-34 Nucleotides 148-166 of NCBI accession CO255617
The spruce loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold)
Zea mays (maize) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Maize loxP-like GCCACTCCGTATAATGTATGCTATACGAAATGAT (SEQ ID NO:74)
Nucleotides 1-20 Nucleotides 326-345 of NCBI accession CB278114
Nucleotides 21-34 Nucleotides 11-27 of NCBI accession CD001443
The maize loxP-like site has 6 nucleotide mismatches from the native loxP sequence (illustrated above in bold)
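The mismatch counts quoted for the loxP-like sites can be reproduced by a position-by-position comparison with the native loxP sequence. The following sketch uses the sequences exactly as printed above; the function name and dictionary layout are illustrative assumptions.

```python
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # SEQ ID NO:68 as printed above

LOXP_LIKE = {
    "barrel medic": "ATGACTTCGTATAATGTATGCTATACGAAGTGTG",  # SEQ ID NO:72
    "spruce": "ATACCTTCGTATAATGTATGCTATACAAAGAAAT",        # SEQ ID NO:73
    "maize": "GCCACTCCGTATAATGTATGCTATACGAAATGAT",         # SEQ ID NO:74
}


def mismatch_positions(reference: str, candidate: str) -> list:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (r, c) in enumerate(zip(reference, candidate)) if r != c]


mismatches = {name: mismatch_positions(LOXP, seq) for name, seq in LOXP_LIKE.items()}
for name, positions in mismatches.items():
    print(f"{name}: {len(positions)} mismatches at positions {positions}")
```

In each case the mismatches fall outside the central eight-nucleotide stretch (positions 14-21), which is identical to the native sequence in all three designed sites.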
Example 10 Design, Construction and Verification of Plant Derived Recombination Sites: frt-like sites for recombination with FLP recombinase
BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.
1) Potato DNA fragment containing a frt-like sequence - POTFRT
A fragment containing a frt-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ513657 and BG098563). This fragment, named POTFRT, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the frt-like sequence shown in bold and light grey.
Figure imgf000109_0001
(SEQ ID NO:75)
Nucleotides 1-3 part of BfrI restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-45 nucleotides 454 to 495 of NCBI accession BQ513657
Nucleotides 46-185 nucleotides 40 to 179 of NCBI accession BG098563
The designed potato frt-like sequence has 5 nucleotide mismatches from the native frt sequence as illustrated in bold below.
frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Potato frt-like sequence P?CTG*raeCTA ACTTTCTAGAGAATAGG GITG (SEQ ID NO:77)
The 185 bp POTFRT sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
POTFRT was cloned into the T-DNA composed of potato DNA residing in the plasmid pGEMTPOTINV (described in Example 9) twice, firstly as an EcoRI to AvrII fragment, then subsequently as a BfrI to BamHI fragment. The POTFRT inserts were verified using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTFRT2.
The DNA sequence of the 1432 bp SalI fragment comprising the potato derived T-DNA region in the resulting pPOTFRT2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1121-1144. Restriction sites illustrated in bold represent those used to clone the POTFRT regions into pGEMTPOTINV. Unique restriction sites in pPOTFRT2 for cloning between POTFRT sites are:
AgeI A/CCGGT
BstD102I GAG/CGG
ClaI AT/CGAT
CspI CG/GWCCG
PinAI A/CCGGT
GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC
AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCpTAGqCCfgCfGTTCCTATACTTTCTA' GAG TAGAGXTCCTJrø ^^ΪTCM^^OT.rø∞§|TT^T^^^^|CATgTO^^^^^^gcT AGAAACTTCCGGTGTATCCGCCGTTTCCGGCGTTGCACCTCCGCCGAATCTAAAAGGTGCGT TGACGATCATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACT
Figure imgf000111_0001
GCTACCCTATTGAAGAGCTGGCCGAGGGAAGTTCCTTCTTGGAAGTGGCATATCTTTTGTTG TATGGTAATTTACCATCTGAGAACCAGTTAGCAGACTGGGAGTTCACAGTTTCACAGCATTC AGCGGTTCCACAAGGACTCTTGGATATCATACAGTCAATGCCCCATGATGCTCATCCAATGG GGGTTCTTGTCAGTGCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTG AGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATA TGAGGTGGGCGCGAGAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAAT CCAACGGTCAAAGCATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTAT AAGTGGAGTTAAATATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTC GTTGTGAATCCAGTTCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTT GTCGAC
(SEQ ID NO:78)
The ability of this construct to undergo recombination between the POTFRT sites was tested in vivo using the FLP recombinase-expressing Escherichia coli strain 294-FLP (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTFRT2 was transformed into E. coli strain 294-FLP and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of FLP recombinase in E. coli strain 294-FLP, which effected recombination between the two POTFRT sites in pPOTFRT2. This was evident from a reduction in the size of pPOTFRT2 from 4432 bp to 4086 bp. Plasmid isolated from colonies of E. coli strain 294-FLP transformed with pPOTFRT2 and cultured at 37 °C was restricted with SalI. All colonies tested produced fragments of 3.0 kb, 1.4 kb, and 1.1 kb. These three fragments represent the pGEMT backbone, the unrecombined POTFRT2 fragment, and the expected fragment from recombination between the POTFRT sites, respectively (Figure 9).
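Recombination between two identical sites in direct orientation removes the intervening DNA together with one copy of the site. This can be modelled on a plain string. The sketch is a simplified illustration, not a model of the FLP reaction itself: the flanking and intervening sequences are hypothetical, the native frt sequence (SEQ ID NO:76) stands in for the POTFRT site, and the molecule is treated as linear.

```python
FRT = "GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC"  # native frt sequence, SEQ ID NO:76


def excise_between_direct_repeats(sequence: str, site: str):
    """Return (product, excised) after recombination between the first two
    direct-repeat copies of site; the sequence is returned unchanged if
    fewer than two copies are found."""
    first = sequence.find(site)
    if first == -1:
        return sequence, ""
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence, ""
    # The product keeps a single copy of the site; the excised segment
    # carries the other copy plus everything between the two sites.
    product = sequence[:first] + sequence[second:]
    excised = sequence[first:second]
    return product, excised


# Hypothetical construct: flank, site, intervening DNA, site, flank.
construct = "AAAA" + FRT + "CCCC" + FRT + "TTTT"
product, excised = excise_between_direct_repeats(construct, FRT)
size_reduction = len(construct) - len(product)
print(product)
print(excised, size_reduction)
```

The size reduction equals the length of one site plus the intervening DNA, which is the quantity reflected in the plasmid size change reported in the text.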
Recombination between the POTFRT sites was further verified by DNA sequencing. The 1.1 kb fragment from lane 3 of Figure 9 was gel purified and direct DNA sequenced. The resulting sequence is illustrated below and confirms that recombination is base pair faithful through the remaining POTFRT site. The remaining POTFRT region is shaded. The left T-DNA border is illustrated in bold and positioned at 253-276. Restriction sites illustrated in bold represent those remaining from cloning the POTFRT regions into pGEMTPOTINV.
TTTCTAGCAAGTCTTGTACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGA TTCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTG TAGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCT CTTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGA
CAGAGGCAGGATATATTTTGGGGTAAACGGGAATKΠ^
Figure imgf000112_0001
TTCTTGGAAGTGGCATATCTTTTGTTGTATGGTAATTTACCATCTGAGAACCAGTTAGCAGA
CTGGGAGTTCACAGTTTCACAGCATTCAGCGGTTCCACAAGGACTCTTGGATATCATACAGT
CAATGCCCCATGATGCTCATCCAATGGGGGTACTTGTCAGTGCAATGAGTGCTCTTTCCGTT
TTT
(SEQ ID NO:79)
2) Onion (Allium cepa) FRT-like fragment - ALLFRT
A fragment containing a frt-like sequence was designed from two EST sequences from onion (NCBI accessions CF434781 and CF445353). This fragment, named ALLFRT, is illustrated below. Restriction enzyme sites to allow cloning into the onion intragenic binary vector described in Example 8 are shown in bold and the frt-like sequence is illustrated in bold and light grey.
[Figure imgf000113_0001: ALLFRT sequence, shown as an image in the original]
(SEQ ID NO:80)
Nucleotides 1-450 nucleotides 28-477 of NCBI accession CF434718
Nucleotides 451-875 nucleotides 105-529 of NCBI accession CF445383
The designed onion frt-like sequence has 7 nucleotide mismatches from the native frt sequence as illustrated in bold below.
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Onion frt-like sequence CTTGTΪCCΪATACTCTCTGό^GAATAGβAACTG (SEQ ID NO:81)
The 875 bp ALLFRT sequence can be cloned into pALLINV twice, once via flanking VspI sites into Noel site of pALLINV and subsequently via Mel and XbaI site into the oαl site of pALLINV. The correct orientation and confirmation of the ALLFRT insert can be verified by restriction enzyme analysis and DNA sequencing.
The DNA sequence of the 2896 bp SalI fragment comprising the onion derived T-DNA region in the resulting pALLFRT2 is illustrated below. Only the nucleotides in italics are not part of onion genome sequences. The ALLFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 520-543 and the right border positioned at 2490-2513. Restriction sites illustrated in bold represent those used to clone the ALLFRT regions into the onion T-DNA like sequence.
GTCGACTTCCCTTTCCTCTACTCCACTTGTTTCTCGCTTTCTCTACTTCCTTTTTCTCTCTT TTCTTTATATTTATTGCTCAGCTGGGATTAATTACTGTCATTTATTCCTCATATCTATTTTA TTGAATTAAAACGGTTATTTAGCTCGAGGCCTTCTCTCTTATTCTTTGCTTCCAAGGAGAGA GAATATGGCGAGTGGTAGCAATCATCAGCATGGTGGAGGAGGAAGAAGAAGAGGCGGAATGT TAGTCGCTGCGACCTTGCTTATTCTTCCTGCCATTTTCCCCAATTTGTTTGTTCCTCTTCCC TTTGCTTTTGGTAGTTCTGGCAGCGGTGCATCTCCTTCTCTCTTCTCCGAATGGAATGCTCC TAAACCTAGGCATCTCTCTCTTCTGAAAGCAGCCATTGAGCGTGAGATTTCTGACGAACAAA AATCAGAGCTGTGGTCTCCCTTGCCTCCACAGGGATGGAAACCGTGCCTTGAGACTCAATAT AGTAGCGGGCTACCCAGTAGATCGACAGGATATATTCAAGTGTAAAACAAGATGCTGAATCG A AGCAATGG CGC CT .^^^^^M^^^gMK^gi^^^^^^M
Figure imgf000114_0001
Figure imgf000114_0002
^CTAGACTTGCTTCTCGGATAATCAATCCTCAGTTTTTGATTCCTTCTCGAAGCTTCCTTG ATCTCCATAAGATGGTAAACAAGGAGGCGATAAAAAAAGAAAGGGCTAGACTTGCTGATGAG ATGAGCAGAGGATATTTTGCGGATATGGCAGAGATTCGTATACATGGTGGCAAGATTGCTAT GGCAAATGAAATTCTTATTCCATCAGGGGAAGCAATCAAATTTCCTGATTTGACAGTAAAAT TGTCTGATGATAGCAGTTTGCATTTACCAATTGTATCTACACAAAGTGCTACAAATAACAAT GCTAAATCCACTCCTGCTGCCTCATTGTTGTGCCTTTCCTTCAGAGCAAGTTCACAGACAAT
Figure imgf000114_0003
Figure imgf000114_0004
Figure imgf000115_0001
TTTCTCATTCGGACCAATCAAGAGAATGTTTCTTAACATGACGAAGAAACCCACTGCTACTC AGCGGAAGATTGGTTATTTCATTTGGTGATCACTATGATTTTAGGAAGCAGCTTCAAATTGT AAATCTTTTGACAGGATATATATTACTGTAAAAAGTGAAGAGAGAAATGTGATATATGCTGA TGTTTCCATGGAGAGGGGTGCATTTCTTGTTCAACAAGCTATGAGGGCTTTCCATGGAAAGA ATATAGAAAGCGCAAAATCAAGGCTTAGTCTTTGCGAGGAGGATATTCGTGGGCAGTTAGAG ATGACAGATAACAAACCAGAGTTATATTCACAGCTTGGTGCTGTCCTTGGAATGCTAGGAGA CTGCTGTCGAGGAATGGGTGATACTAATGGTGCGATTCCATATTATGAAGAGAGTGTGGAAT TCCTCTTAAAAATGCCTGCAAAAGATCCCGAGGTTGTACATACACTATCAGTTTCCTTGAAT AAAATTGGAGACCTGAAATACTACGAAGGAGATCTGCAGTCGAC (SEQ ID NO:82)
Restriction enzyme sites available for cloning between ALLFRT sequences include:
ApaBI GCANNNNN/TGC
BsiI C/TCGTG
BspMI ACCTGCNNNN/
DraIII CACNNN/GTG
HindIII A/AGCTT
MfeI C/AATTG
NheI G/CTAGC
PflMI CCANNNN/NTGG
ScaI AGT/ACT
SphI GCATG/C
XbaI T/CTAGA
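Recognition sequences written with ambiguity codes (N for any base, W for A or T) can be located in a DNA sequence by translating them into regular expressions. The sketch below is a minimal illustration under stated assumptions: the test sequence is hypothetical, the function names are introduced here, and only the strand as written is searched, so non-palindromic sites on the opposite strand would be missed.

```python
import re

# IUPAC ambiguity codes needed for the recognition sequences listed above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}


def site_regex(recognition: str) -> str:
    """Translate a recognition sequence such as 'CACNNN/GTG' into a regex.
    The '/' cut-position marker is ignored for searching."""
    bases = recognition.replace("/", "").upper()
    return "".join(IUPAC[base] for base in bases)


def find_sites(sequence: str, recognition: str) -> list:
    """Return 1-based start positions of every match, including overlaps."""
    pattern = re.compile("(?=(" + site_regex(recognition) + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


# Hypothetical test sequence, not taken from the vector.
toy = "GGAAGCTTCCACAAAGTGTTCTAGAGG"
hits = {
    "HindIII": find_sites(toy, "A/AGCTT"),
    "DraIII": find_sites(toy, "CACNNN/GTG"),
    "XbaI": find_sites(toy, "T/CTAGA"),
    "PflMI": find_sites(toy, "CCANNNN/NTGG"),
}
print(hits)
```

A site is unique in a construct when its list of positions has exactly one entry; the lookahead pattern ensures that overlapping occurrences are counted.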
3) Frt-like sequences from other species
Brassica napus (rape) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Rape frt-like sequence ACAGTTCCTATACTTTCTGGAGAATAGGAAGGTG (SEQ ID NO:83)
Nucleotides 1-14 Nucleotides 397-410 of NCBI accession CD824140
Nucleotides 15-34 Nucleotides 128-147 of NCBI accession CD825268
The rape frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Glycine max (soybean) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Soybean frt-like sequence ACAGTTCCTATACTTTCTACAGAATAGGAACTTC (SEQ ID NO:84)
Nucleotides 1-19 Nucleotides 84-102 of NCBI accession BE057270
Nucleotides 20-34 Nucleotides 243-257 of NCBI accession BI970552
The soybean frt-like sequence has 3 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Triticum aestivum (wheat) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Wheat frt-like sequence AGAGTTCCTATACTTTCTAGAGAATAGGAACCCC (SEQ ID NO:85)
Nucleotides 1-18 Nucleotides 446-463 of NCBI accession CD877128
Nucleotides 19-34 Nucleotides 1805-1820 of NCBI accession BT009538
The wheat frt-like sequence has 4 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Pinus taeda (loblolly pine) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Loblolly pine frt-like sequence AAAGTTCCTATACTTTCTGGAGAATAGGAAAACA (SEQ ID NO:86)
Nucleotides 1-16 Nucleotides 14-29 of NCBI accession AA556441
Nucleotides 17-34 Nucleotides 764-781 of NCBI accession AF101785
The loblolly pine frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).
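Each frt-like site is described as a join of two EST segments, so the coordinate ranges can be checked for internal consistency: the two design segments should tile positions 1-34 without gaps, and each should be the same length as the EST interval it is taken from. A sketch of such a check, with the data structure and function names being assumptions made here:

```python
# (design_start, design_end), (est_start, est_end) for each segment,
# transcribed from the coordinate listings above.
FRT_DESIGNS = {
    "rape": [((1, 14), (397, 410)), ((15, 34), (128, 147))],
    "soybean": [((1, 19), (84, 102)), ((20, 34), (243, 257))],
    "wheat": [((1, 18), (446, 463)), ((19, 34), (1805, 1820))],
    "loblolly pine": [((1, 16), (14, 29)), ((17, 34), (764, 781))],
}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive 1-based interval."""
    return end - start + 1


def check_design(segments, site_length: int = 34) -> list:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    expected_start = 1
    for (d_start, d_end), (e_start, e_end) in segments:
        if d_start != expected_start:
            problems.append(f"gap or overlap before position {d_start}")
        if span_length(d_start, d_end) != span_length(e_start, e_end):
            problems.append(f"length mismatch for design segment {d_start}-{d_end}")
        expected_start = d_end + 1
    if expected_start != site_length + 1:
        problems.append(f"segments end at {expected_start - 1}, not {site_length}")
    return problems


report = {species: check_design(segments) for species, segments in FRT_DESIGNS.items()}
print(report)
```

The same check can be applied to any other two-segment design in this document by adding its coordinates to the dictionary.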
The above examples illustrate practice of the invention. It will be well understood by those with ordinary skill in the art that such DNA sequences can be assembled together to construct a complete vector composed entirely of plant DNA of the same or related species. It will also be appreciated by those skilled in the art that numerous variations and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A plant transformation vector comprising: a) T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising two polynucleotide sequence fragments, wherein all of the sequences of the T-DNA-like sequence are derived from plant species.
2. A plant transformation vector comprising: a) a T-DNA-like sequence including at least one T-DNA border-like sequence, and b) at least 20 bp of additional polynucleotide sequence on one or both sides of the T-DNA-like sequence in which all of said sequences are derived from plant species.
3. The plant transformation vector of claim 1 or 2 in which the T-DNA-like sequence includes two T-DNA border-like polynucleotide sequences, one at each end of the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plant species.
4. The plant transformation vector of any preceding claim in which the T-DNA-like sequence further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
5. The plant transformation vector of any preceding claim in which T-DNA-like sequence includes first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
6. The plant transformation vector of claim 5 in which the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
7. The plant transformation vector of claim 5 in which the first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences derived from plant species.
8. The plant transformation vector of any one of claims 5 to 7 which comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
9. The plant transformation vector of claim 8 in which the selectable marker sequence is derived from plants.
10. The plant transformation vector of any preceding claim in which the polynucleotide encompassing the T-DNA border-like sequence(s), any base polynucleotide sequence, any recombinase recognition site sequences and any plant polynucleotide sequence additional to the T-DNA-like sequence are constructed from fewer than 10 polynucleotide sequence fragments derived from plant species.
11. The plant transformation vector of any preceding claim which further comprises an origin of replication polynucleotide sequence derived from plant species.
12. The plant transformation vector of any preceding claim in which the T-DNA-like sequence comprises a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
13. The plant transformation vector of any preceding claim which comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
14. The plant transformation vector of claim 12 in which the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
15. The plant transformation vector of any preceding claim in which the T-DNA-like sequence further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
16. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species.
17. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species which are interfertile.
18. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from the same plant species.
19. A plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising, the nucleotide sequence 5'-GR-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GR-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
20. A plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising, the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRC-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
21. A plant transformation vector including a T-DNA like sequence, the T-DNA like sequence comprising: a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising, the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRCA-3' from a) wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
22. The plant transformation vector of any one of claims 19 to 21 in which T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
23. The plant transformation vector of claim 22 in which the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
24. The plant transformation vector of claim 22 in which the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
25. The plant transformation vector of any one of claims 22 to 24 which comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
26. The plant transformation vector of claim 25 in which the selectable marker sequence is derived from plants.
27. The plant transformation vector of any one of claims 19 to 26 in which the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences are constructed from fewer than 10 fragments of a polynucleotide sequence derived from plant species.
28. The plant transformation vector of any one of claims 19 to 27 which further comprises an origin of replication polynucleotide sequence derived from plant species.
29. The plant transformation vector of any one of claims 19 to 28 in which the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
30. The plant transformation vector of any one of claims 19 to 29 which comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
31. The plant transformation vector of claim 27 in which the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
32. The plant transformation vector of any one of claims 19 to 31 in which the T-DNA-like sequence further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
33. The plant transformation vector of any one of claims 19 to 32 in which all of the polynucleotide sequence of the vector, except for the chimeric T-DNA border-like sequence, is derived from plant species.
34. The plant transformation vector of any one of claims 19 to 32 in which all of the polynucleotide sequence of the vector, except for the chimeric T-DNA border-like sequence, is derived from plant species which are interfertile.
35. The plant transformation vector of any one of claims 19 to 32 in which all of the polynucleotide sequence of the vector, except for the chimeric T-DNA border-like sequence, is derived from the same plant species.
36. A plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 nucleotides including the nucleotide sequence 5'-GR-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GR-3' from a) forms the 5' end of the border sequence.
37. A plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 nucleotides including the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRC-3' from a) forms the 5' end of the border sequence.
38. A plant transformation vector including a chimeric sequence, the chimeric sequence comprising: a) at the 5' end a plant-derived sequence of at least 20 nucleotides including the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRCA-3' from a) forms the 5' end of the border sequence.
39. The plant transformation vector of any one of claims 36 to 38 which includes, 5' to the border, first and second recombinase recognition sequences derived from plant species.
40. The plant transformation vector of claim 39 in which the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
41. The plant transformation vector of claim 39 in which the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
42. The plant transformation vector of any one of claims 39 to 41 which comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
43. The plant transformation vector of claim 42 in which the selectable marker sequence is derived from plants.
44. The plant transformation vector of any one of claims 36 to 43 in which the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences are constructed from fewer than 10 fragments of a polynucleotide sequence derived from plant species.
45. The plant transformation vector of any one of claims 36 to 43 which further comprises an origin of replication polynucleotide sequence derived from plant species.
46. The plant transformation vector of any one of claims 33 to 45 which includes, 5' to the border sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide sequence, wherein the selectable marker sequence is derived from plant species.
47. The plant transformation vector of any one of claims 33 to 46 which comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
48. The plant transformation vector of claim 46 in which the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the selectable marker polynucleotide sequence is also capable of functioning in selection of a bacterium harbouring the vector.
49. The plant transformation vector of any one of claims 36 to 48 which further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
50. The plant transformation vector of any one of claims 36 to 49 in which all of the polynucleotide sequence of the vector, except for the border sequence, is derived from plant species.
51. The plant transformation vector of any one of claims 36 to 49 in which all of the polynucleotide sequence of the vector, except for the border sequence, is derived from plant species which are interfertile.
52. The plant transformation vector of any one of claims 36 to 49 in which all of the polynucleotide sequence of the vector, except for the border sequence, is derived from the same plant species.
53. A plant transformation vector comprising a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide, wherein the selectable marker sequence is derived from plant species.
54. A plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
55. The plant transformation vector of claim 54 in which the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
56. The plant transformation vector of claim 54 in which the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
57. The plant transformation vector of any one of claims 54 to 56 which comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
58. The plant transformation vector of claim 57 in which the selectable marker sequence is derived from plants.
59. The plant transformation vector of any one of claims 53 to 58 which further comprises an origin of replication-like polynucleotide sequence derived from plant species.
60. The plant transformation vector of any one of claims 53 to 59 which further comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
61. The plant transformation vector of claim 60 in which the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
62. A plant transformation vector comprising: a) an origin of replication polynucleotide sequence, and b) a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector in which all of said sequences are derived from plant species.
63. The plant transformation vector of any one of claims 53 to 62 which further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
64. The plant transformation vector of any one of claims 53 to 63 constructed from fewer than 10 polynucleotide sequence fragments derived from plant species.
65. The plant transformation vector of any one of claims 53 to 64 which further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
66. The plant transformation vector of any one of claims 53 to 65 in which all of the polynucleotide sequence of the entire vector is derived from plant species.
67. The plant transformation vector of any one of claims 53 to 65 in which all of the polynucleotide sequence of the entire vector is derived from plant species which are interfertile.
68. The plant transformation vector of any one of claims 53 to 65 in which all of the polynucleotide sequence of the entire vector is derived from the same plant species.
69. A method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using the vector of any one of claims 1 to 68.
70. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant, the method comprising transformation of the plant with the vector of any one of claims 1 to 68.
71. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 68.
72. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 68.
73. A method of modifying a trait in a plant cell or plant comprising: (a) transforming of a plant cell or plant with a vector of any one of claims 1 to 68, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and (b) obtaining a stably transformed plant cell or plant modified for the trait.
74. The method of any one of claims 69 to 73 in which transformation is vir gene-mediated.
75. The method of any one of claims 69 to 73 in which transformation is Agrobacterium-mediated.
76. The method of any one of claims 69 to 73 in which transformation involves direct DNA uptake.
77. A plant cell or plant produced by a method of any one of claims 69 to 76.
78. A plant tissue, organ, propagule or progeny of the plant cell or plant of claim 77.
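Several of the claims above (19 to 21 and 36 to 38) turn on a plant-derived sequence containing the motif 5'-GR-3', 5'-GRC-3' or 5'-GRCA-3', where R stands for G or A. As an illustration only, the following Python sketch shows one way such a motif could be located in a candidate sequence. The function name `motif_positions`, the lookup table and the example sequence are hypothetical and are not taken from the specification; the sketch says nothing about whether a given match would actually function as a border sequence.

```python
import re

# IUPAC ambiguity code used in the claims: R = G or A.
# Only the symbols needed for the claimed motifs are listed (an assumption
# made to keep the sketch short).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[GA]"}


def motif_positions(sequence, motif):
    """Return 0-based start positions of every match of `motif` in `sequence`.

    Overlapping matches are reported as well, because the pattern is wrapped
    in a zero-width lookahead.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Made-up 12-nucleotide example, not a sequence from the patent.
example = "TTGGCATTGACA"
print(motif_positions(example, "GRCA"))  # [2, 8]
```

Searching first for the shortest motif (GR) and then for the longer ones narrows the candidates progressively, which mirrors how claims 19 to 21 differ only in the length of the required motif.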
PCT/NZ2005/000117 2004-06-08 2005-06-08 Transformation vectors WO2005121346A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU2005252598A AU2005252598B8 (en) 2004-06-08 2005-06-08 Transformation vectors
EP05757541A EP1766029A4 (en) 2004-06-08 2005-06-08 Transformation vectors
AU2010257316A AU2010257316B2 (en) 2004-06-08 2010-12-21 Transformation Vectors

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
NZ533371A NZ533371A (en) 2004-06-08 2004-06-08 Vectors for plant transformation containing mostly or exclusively genetic material from plants useful with Agrobacterium transformation system
NZ533372 2004-06-08
NZ533371 2004-06-08
NZ53337204 2004-06-08
NZ53833105 2005-02-18
NZ538331 2005-02-18
NZ53833005 2005-02-18
NZ538330 2005-02-18

Publications (1)

Publication Number Publication Date
WO2005121346A1 (en) 2005-12-22

Family

ID=35503070

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ2005/000117 WO2005121346A1 (en) 2004-06-08 2005-06-08 Transformation vectors

Country Status (3)

Country Link
EP (1) EP1766029A4 (en)
AU (2) AU2005252598B8 (en)
WO (1) WO2005121346A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001059086A2 (en) * 2000-02-08 2001-08-16 Sakata Seed Corporation Methods and constructs for agrobacterium-mediated plant transformation
WO2002081711A1 (en) * 2001-04-06 2002-10-17 Cropdesign N.V. The use of double and opposite recombination sites for the single step cloning of two dna segments
WO2003069980A2 (en) * 2002-02-20 2003-08-28 J.R. Simplot Company Precise breeding
US20030188345A1 (en) * 2000-06-28 2003-10-02 Ute Heim Binary vectors for the improved transformation of plants systems
WO2005004585A2 (en) * 2003-06-27 2005-01-20 J.R. Simplot Company Precise breeding

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
AOKI S AND SYONO K.: "Horizontal gene transfer and mutation: Ngrol genes in the genome of Nicotiana glauca.", PROC NATL ACAD SCI., vol. 86, no. 23, 1999, pages 13229 - 13234, XP008096696 *
OSBORNE BI ET AL: "A system for insertional mutagenesis and chromosomal rearrangement using the Ds transposon and Cre-lox.", PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY., vol. 7, no. 4, 1995, pages 687 - 701, XP008096697 *
ROMMENS CM ET AL: "Crop improvement through modification of the plant's own genome.", PLANT PHYSIOLOGY., vol. 135, no. 1, 2004, pages 421 - 431, XP002418025 *
See also references of EP1766029A4 *
STUURMAN J ET AL: "Single-site manipulation of tomato chromosomes in vitro and in vivo using Cre-lox site-specific recombination.", PLANT MOLECULAR BIOLOGY., vol. 32, no. 5, December 1996 (1996-12-01), pages 901 - 913, XP002094598 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8137961B2 (en) 2004-09-08 2012-03-20 J.R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
EP1802760A4 (en) * 2004-09-08 2009-05-06 Simplot Co J R Plant-specific genetic elements and transfer cassettes for plant transformation
US7601536B2 (en) 2004-09-08 2009-10-13 J. R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
EP1802760A2 (en) * 2004-09-08 2007-07-04 J.R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
WO2007103383A2 (en) * 2006-03-07 2007-09-13 J.R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
WO2007103383A3 (en) * 2006-03-07 2008-05-08 Simplot Co J R Plant-specific genetic elements and transfer cassettes for plant transformation
US8759616B2 (en) 2006-07-19 2014-06-24 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
US7928291B2 (en) 2006-07-19 2011-04-19 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
US8404931B2 (en) 2006-07-19 2013-03-26 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
US9464295B2 (en) 2006-07-19 2016-10-11 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
US10066235B2 (en) 2006-07-19 2018-09-04 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
US10865418B2 (en) 2006-07-19 2020-12-15 Monsanto Technology Llc Use of multiple transformation enhancer sequences to improve plant transformation efficiency
EP2387613A2 (en) * 2009-01-15 2011-11-23 The New Zealand Institute for Plant and Food Research Limited Plant transformation using dna minicircles
US20120042409A1 (en) * 2009-01-15 2012-02-16 Anthony Conner Plant transformation using dna minicircles
EP2387613A4 (en) * 2009-01-15 2012-06-06 Nz Inst Plant & Food Res Ltd Plant transformation using dna minicircles
WO2010090536A3 (en) * 2009-01-15 2010-09-30 The New Zealand Institute For Plant And Food Research Limited Plant transformation using dna minicircles
WO2017185136A1 (en) * 2016-04-27 2017-11-02 Nexgen Plants Pty Ltd Construct and vector for intragenic plant transformation
CN109477091A (en) * 2016-04-27 2019-03-15 纳斯根植物私人有限公司 Construct and carrier for the conversion of gene implants
AU2018253628B2 (en) * 2016-04-27 2019-07-25 Nexgen Plants Pty Ltd Construct and vector for intragenic plant transformation

Also Published As

Publication number Publication date
AU2005252598A1 (en) 2005-12-22
AU2005252598B2 (en) 2010-10-28
EP1766029A4 (en) 2008-07-23
AU2010257316B2 (en) 2013-03-21
EP1766029A1 (en) 2007-03-28
AU2010257316A1 (en) 2011-01-27
AU2005252598B8 (en) 2011-05-12

Similar Documents

Publication Publication Date Title
US20020178463A1 (en) Method for transforming monocotyledons
AU2005252598B2 (en) Transformation vectors
WO2001006844A1 (en) Method for superrapid transformation of monocotyledon
CN109722439B (en) Application of MLO2, MLO6 and MLO12 genes of tobacco in preparation of powdery mildew resistant tobacco variety and method thereof
JP2023156474A (en) Regeneration of genetically modified plants
WO2018220929A1 (en) Protein expression system in plant cell and use thereof
AU2010211450B2 (en) Plant transformation using DNA minicircles
BR112020002321A2 (en) new strains of agrobacterium tumefaciens claim priority
US11649465B2 (en) Methods and compositions for increasing expression of genes of interest in a plant by co-expression with p21
NZ579038A (en) Vectors for transformation
NZ533371A (en) Vectors for plant transformation containing mostly or exclusively genetic material from plants useful with Agrobacterium transformation system
US20100257632A1 (en) Methods for generating marker-free transgenic plants
JP4543161B2 (en) Gene disruption method using retrotransposon of tobacco
JP3605633B2 (en) Novel plant gene, plant modification method using the gene, and plant obtained by the method
WO2023212556A2 (en) Compositions and methods for somatic embryogenesis in dicot plants
NZ585926A (en) Methods for generating marker free transgenic plants using Agrobacterium strains and virE2 protein
CN113490747A (en) Methods for increasing efficiency of genome engineering
NZ574191A (en) Plant transformation using DNA minicircles
MXPA06002799A (en) Methods and compositions for enhanced plant cell transformation

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

WWE Wipo information: entry into national phase

Ref document number: 2005252598

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 2005757541

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2005252598

Country of ref document: AU

Date of ref document: 20050608

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 2005252598

Country of ref document: AU

WWP Wipo information: published in national office

Ref document number: 2005757541

Country of ref document: EP