NZ579038A - Vectors for transformation - Google Patents

Vectors for transformation

Info

Publication number
NZ579038A
Authority
NZ
New Zealand
Prior art keywords
plant
sequence
vector
dna
derived
Prior art date
Application number
NZ579038A
Inventor
Anthony John Conner
Philippa Jane Barrell
Johanna Maria Elisabeth Jacobs
Samantha Jane Baldwin
Annemarie Suzanne Lokerse
Original Assignee
Nz Inst Plant & Food Res Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nz Inst Plant & Food Res Ltd
Priority to NZ579038A
Publication of NZ579038A

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)

Abstract

A plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species, and are not identical to any recombinase recognition sequences from non-plant species. Further provided are methods of producing transformed plants using the vector and plants or plant cells produced by such methods. (62) Divided Out of 553676

Description

NEW ZEALAND PATENTS ACT, 1953
No: Divided out of NZ 553676
Date: Dated 8 June 2004
COMPLETE SPECIFICATION
VECTORS FOR TRANSFORMATION
We, THE NEW ZEALAND INSTITUTE FOR PLANT AND FOOD RESEARCH LIMITED, a New Zealand company and Crown Research Institute (under the Crown Research Institutes Act 1992) having a place of business at Mt Albert Research Centre, 120 Mt Albert Road, Mt Albert, Auckland, New Zealand, do hereby declare the invention for which we pray that a patent may be granted to us, and the method by which it is to be performed, to be particularly described in and by the following statement:
Intellectual Property Office of N.Z., RECEIVED 13 AUG 2009
BACKGROUND ART
Over the past 20 years rapid scientific advances in molecular and cell biology have resulted in the development of technology to enable genetic engineering of plants (development of transformed plants, transgenic plants or GMOs). This offers new opportunities for the incorporation of genes into crop plants and represents a new technology platform for the next level of genetic gain in crop breeding.
An option provided by genetic engineering is the ability to extend the germplasm base available for crop improvement to any source of DNA, including that from other plants, microbes or animals. However this cross-species transformation has raised ethical concerns with the public, especially when associated with food.
As this technology develops further, more genes are being identified from crop species which would be of benefit to agriculture and industry if they were transferred to other genotypes of the same crop, i.e. within-species transformation. The use of such "within-species transformation" approaches for moving genes between genotypes within the existing gene pools available to plant breeders also has several advantages over traditional breeding:
1. Direct gene transfer to elite plants and cultivars without repeated backcrossing. This allows the efficient development of new plant lines without the many generations of hybridisation and selection usually required to recover the desired plant.
2. The transfer of single discrete genes, without the "linkage drag" associated with the transfer of many undefined and often undesirable neighboring genes in traditional plant breeding.
3. The specific design and development of new gene formulations. This can involve the matching of molecular switches (promoters) with the desired coding regions to target the expression of the new gene at a specific location within a plant. Alternatively, "reverse genetics" approaches can be used to "knock-out" specific functions in plants. This can be achieved by positioning the coding region of a gene in the reverse orientation, relative to the promoter responsible for "turning the gene on", or components of the coding region arranged in an inverted repeat under control of the promoter.
In addition, moving genes between plants of the same species does not raise the same ethical concerns as cross-species transformation.
The application of genetic engineering requires the use of vectors for either Agrobacterium-mediated transformation or direct DNA uptake into plant cells. Agrobacterium-mediated transformation is the preferred method and requires the construction of modified T-DNA (transferred-DNA) on a vector (usually a binary vector).
However, the transformation requires the use of vector systems based on DNA sequences from other species (e.g. the T-DNA border regions, the DNA region into which target genes are inserted, selectable marker genes and sequences allowing such vectors to replicate in additional host systems); sequences that have usually been derived from bacterial systems.
The minimum requirement of a vector to perform Agrobacterium-mediated plant transformation is at least one T-DNA border region, although in practice transformation vector systems include other vector sequences as described above. Two T-DNA border regions are usually used flanking the sequence of interest to be integrated into the plant genome. However in most instances such border sequences or parts thereof also become integrated into the genome of the transformed plant.
T-DNA sequences have been identified as naturally occurring in the genomes of plants (White et al 1983, Nature 301: 348-350; Furner et al 1986, Nature 319: 422-427; Aoki et al 1994, Molecular and General Genetics 243: 706-710; Suzuki et al 2002, Plant Journal 32: 775-787). Plant transformation vectors in which the Agrobacterium borders are replaced with plant-derived T-DNA border-like sequences have also been reported (WO 03/069980). If the T-DNA border-like sequences are chosen from a plant of the species to be transformed, this allows for the possibility of production of plants transformed with only their own DNA. However, in practice integration is relatively unpredictable and often results in integration of other vector sequences from outside of the T-DNA borders and even transfer of the whole transformation vector, which includes many additional non-plant sequences.
RECEIVED at IPONZ on 11 March 2010
It is an object of the invention to provide improved compositions and methods for plant transformation which reduce or eliminate the transfer of foreign DNA into the plant, or at least provide the public with a useful choice.
SUMMARY OF INVENTION
In one aspect the invention provides a plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species, and are not identical to any recombinase recognition sequences from non-plant species.
In one embodiment the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
In an alternative embodiment the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
In a preferred embodiment the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants.
In a further embodiment the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
In a further embodiment the plant transformation vector further comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
In a preferred embodiment the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
In a preferred embodiment the plant transformation vector is constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plants.
In a further embodiment the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
In a preferred embodiment all of the polynucleotide sequence of the plant transformation vector is derived from plant species, more preferably from plant species which are interfertile and most preferably from the same plant species.
In a further aspect the invention provides a method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using a transformation vector of the invention.
The invention also provides a method of modifying a trait in a plant cell or plant comprising:
(a) transforming a plant cell or plant with a vector of the invention, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and
(b) obtaining a stably transformed plant cell or plant modified for the trait.
In a preferred embodiment any polynucleotide stably integrated into the plant cell or plant is derived from a plant. Preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed. Most preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed.
In one embodiment transformation is vir gene-mediated.
In a further embodiment transformation is Agrobacterium-mediated.
In an alternative embodiment transformation involves direct DNA uptake.
The invention also provides a plant cell or plant produced by a method of the invention.
The invention also provides a plant tissue, organ, propagule or progeny of the plant cell or plant of the invention.
Also disclosed is a plant transformation vector comprising:
a) a T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising two polynucleotide sequence fragments,
wherein all of the sequences of the T-DNA-like sequence are derived from plant species. Also possible but less preferred is use of a similar T-DNA border-like sequence containing three or more polynucleotide sequence fragments derived from plant species.
Also disclosed is a plant transformation vector comprising:
a) a T-DNA-like sequence including at least one T-DNA border-like sequence
b) additional plant polynucleotide sequence on one or both sides of the T-DNA-like sequence
in which all of said sequences are derived from plants, preferably from the same plant species.
Preferably the additional plant polynucleotide sequence is 5' to the left border when two T-DNA border-like sequences are used, or 5' to the single T-DNA border-like sequence when a single T-DNA border-like sequence is used.
The additional plant polynucleotide sequence may be at least about 1 bp in length, preferably at least about 5 bp, preferably at least about 10 bp, preferably at least about 50 bp, preferably at least about 100 bp, preferably at least about 200 bp, preferably at least about 500 bp, more preferably at least about 1 kb.
The T-DNA-like sequence may include two T-DNA border-like polynucleotide sequences flanking the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plants, preferably from the same plant species.
The T-DNA-like sequence may further comprise additional base polynucleotide sequence(s), the additional base polynucleotide sequence(s) being derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
The T-DNA-like sequence may include first and second recombinase recognition site sequences, wherein all of said sequences are derived from plants, preferably from the same plant species.
The first recombinase recognition site and the second recombinase recognition site may be loxP-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
Alternatively the first recombinase recognition site and the second recombinase recognition site are frt-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
The vector may comprise a selectable marker sequence flanked by the first and second recombinase recognition site sequences. Preferably the selectable marker is operably linked to a constitutive promoter sequence. Preferably the selectable marker and/or the constitutive promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
The vector may comprise a recombinase sequence flanked by the first and second recombinase recognition site sequences. Preferably the recombinase is operably linked to an inducible promoter sequence. Preferably the recombinase and/or inducible promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
When the recombinase recognition sites are loxP-like sequences, the recombinase sequence may be Cre, and when the recombinase recognition sites are frt-like sequences, the recombinase sequence may be FLP.
Alternatively a negative selection marker may be flanked by the first and second recombinase recognition site sequences. Preferably the negative selection marker is CodA.
Neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, should contain regulatory elements, such as promoters, which may influence the expression of inserted genes of interest.
Alternatively neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, should contain introns, which may influence the expression of inserted genes of interest.
Alternatively neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, should be derived from heterochromatic regions of the genome from which they are derived.
The polynucleotide encompassing the T-DNA border-like sequences, the base polynucleotide sequence of the T-DNA-like sequence and the plant polynucleotide sequence additional to the T-DNA-like sequence may be constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
The plant transformation vector may further comprise an origin of replication sequence. Preferably the origin of replication sequence is derived from a plant, preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
The T-DNA-like sequence of the plant transformation vector may comprise a selectable marker polynucleotide sequence for selection of a plant cell or plant harbouring the T-DNA-like sequence. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
The plant transformation vector of the invention may further comprise a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
The selectable marker polynucleotide sequence for selection of a plant harbouring the T-DNA-like sequence may also function in selection of a bacterium harbouring the vector.
The T-DNA-like sequence may further comprise a genetic construct as herein defined. Preferably the genetic construct comprises a promoter polynucleotide sequence operably linked to a polynucleotide sequence of interest and a terminator polynucleotide sequence, wherein all of said polynucleotide sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
The polynucleotide sequence of the entire vector may be derived from plant species, preferably from the same plant species.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GR-3' (wherein R = G or A); and
b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GR-3' from a)
wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and
b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRC-3' from a)
wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and
b) a chimeric T-DNA-border-like sequence comprising at its 5' end, the nucleotide sequence 5'-GRCA-3' from a)
wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
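By way of illustration only, the 5'-GR-3', 5'-GRC-3' and 5'-GRCA-3' motifs referred to above (R = G or A) could be located in a candidate plant-derived sequence with a short scan such as the following Python sketch. The function name, the example sequence and the reading of "at least about 20 bp" as the length of plant sequence up to and including the motif are assumptions made for the sketch and are not taken from the specification.

```python
import re

# IUPAC code R denotes a purine, i.e. G or A, as defined in the text.
MOTIF_PATTERNS = {"GR": "G[GA]", "GRC": "G[GA]C", "GRCA": "G[GA]CA"}


def candidate_junctions(plant_seq, motif="GRCA", min_len=20):
    """Return 0-based start positions of `motif` in `plant_seq`.

    Hypothetical helper: a hit is kept only when the plant sequence from
    its 5' end up to and including the motif is at least `min_len` bases
    long (an assumed reading of the 20 bp requirement).
    """
    seq = plant_seq.upper()
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile("(?=" + MOTIF_PATTERNS[motif] + ")")
    hits = []
    for match in pattern.finditer(seq):
        motif_end = match.start() + len(motif)
        if motif_end >= min_len:
            hits.append(match.start())
    return hits


# Made-up example sequence: GGCA at position 0, GACA at position 18.
example = "GGCA" + "T" * 14 + "GACA"
print(candidate_junctions(example, "GRCA"))             # only the hit at 18
print(candidate_junctions(example, "GRCA", min_len=0))  # both hits
```

Each reported position marks a point at which the plant-derived motif could form the 5' end of a chimeric border; whether such a border functions in integration would still need to be established experimentally.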
The T-DNA-like sequence may include, 5' to the chimeric T-DNA border-like sequence, first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
The first recombinase recognition sequence and the second recombinase recognition sequence may be loxP-like sequences. Alternatively the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences.
The plant transformation vector may comprise a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants.
The polynucleotide of at least 20 bp in length and any recombinase recognition site sequences may be constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
The plant transformation vector may further comprise an origin of replication polynucleotide sequence derived from plant species.
The T-DNA-like sequence may include, 5' to the chimeric T-DNA border-like sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
The plant transformation vector may comprise a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
The selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence may also be capable of functioning in selection of a bacterium harbouring the vector.
The T-DNA-like sequence of the plant transformation vector may further comprise a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
All of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, may be derived from plant species.
All of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, may be derived from plant species which are interfertile.
Advantageously all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from the same plant species.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GR-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant,
wherein the nucleotide sequence 5'-GR-3' from a) forms the 5' end of the border sequence.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant,
wherein the nucleotide sequence 5'-GRC-3' from a) forms the 5' end of the border sequence.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant,
wherein the nucleotide sequence 5'-GRCA-3' from a) forms the 5' end of the border sequence.
Advantageously the plant-derived sequence of at least 20 bp in length is at least about 50 bp in length, more preferably at least about 100 bp in length, more preferably at least about 200 bp in length, more preferably at least about 500 bp in length, most preferably at least about 1 kb in length.
More advantageously the plant transformation vector includes, 5' to the border sequence, first and second recombinase recognition sequences derived from plant species.
Preferably the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences.
Alternatively the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences.
The plant transformation vector may comprise a selectable marker sequence flanked by the first and second recombinase recognition sequences.
Preferably the selectable marker sequence is derived from plants.
The polynucleotide of at least 20 bp in length and any recombinase recognition site sequences, of the plant transformation vector, may be constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plant species.
The plant transformation vector may further comprise an origin of replication polynucleotide sequence derived from plant species.
The plant transformation vector may include, 5' to the border sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide sequence, wherein the selectable marker sequence is derived from plant species.
The plant transformation vector may comprise a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
Advantageously the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the selectable marker polynucleotide sequence is also capable of functioning in selection of a bacterium harbouring the vector.
The plant transformation vector may further comprise a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
Advantageously all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species.
Alternatively all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species which are interfertile.
Alternatively all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from the same plant species.
Also disclosed is a plant transformation vector comprising a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide, wherein the selectable marker sequence is derived from plant species.
Also disclosed is a plant transformation vector comprising:
a) an origin of replication polynucleotide sequence, and
b) a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector
in which all of said sequences are derived from plant species.
The plant transformation vector may further comprise additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
Also disclosed is a plant transformation vector comprising a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant. More preferably the vector also comprises an origin of replication sequence functional in bacteria, preferably in E. coli. Preferably the origin of replication sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Yet more preferably the vector further comprises a genetic construct as herein defined. Preferably the genetic construct sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the polynucleotide sequence of the entire vector is derived from plant species, most preferably from the same plant species.
Also disclosed is a method for modifying a plant cell or plant, comprising:
(a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by loxP-like recombinase recognition sites;
(b) selecting a plant cell or plant expressing the selectable marker flanked by loxP-like recombinase recognition sites;
(c) inducing the expression of the Cre gene in the plant cell or plant;
(d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
Also disclosed is a method for modifying a plant cell or plant, comprising:
(a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by frt-like recombinase recognition sites;
(b) selecting a plant cell or plant expressing the selectable marker flanked by frt-like recombinase recognition sites;
(c) inducing the expression of the FLP gene in the plant cell or plant;
(d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
Also disclosed is a plant modified by a method of the invention.
Preferably the plant cell or plant modified is of the same species as the vector sequence used to modify it.
Advantageously the plant cell or plant produced is of the same species as the vector sequence used to produce it.
DETAILED DESCRIPTION
The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides.
As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polypeptides and polynucleotides possess biological activities that are the same or similar to those of the inventive polypeptides or polynucleotides. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 5 nucleotide positions, preferably at least 10 nucleotide positions, preferably at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
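As a non-limiting illustration, percent identity over a comparison window could be computed from two already-aligned sequences as in the following Python sketch. The function names are hypothetical, gap characters are assumed to be written as "-", and the choice to count every aligned column (including gapped columns) in the denominator is an assumption; alignment programs may define the denominator differently.

```python
# Identity tiers and minimum comparison window taken from the text.
IDENTITY_TIERS = (50, 70, 80, 90, 95, 98, 99)
MIN_WINDOW = 5


def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences of equal length.

    ASSUMPTION: every alignment column, gapped or not, counts toward the
    denominator; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if len(aligned_a) < MIN_WINDOW:
        raise ValueError("comparison window shorter than 5 positions")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def highest_tier_met(identity):
    """Return the highest identity tier (in %) that is met, or None."""
    met = [tier for tier in IDENTITY_TIERS if identity >= tier]
    return max(met) if met else None


# Made-up 10-position window with a single mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, highest_tier_met(identity))
```

The sketch takes the alignment as given; in practice the alignment itself would come from a program such as BLASTN, as set out in the paragraphs that follow.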
Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq may be utilized.
Polynucleotide sequence identity may also be calculated over the entire length of the overlap between candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6. pp.276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences online at http://www.ebi.ac.uk/emboss/align/.
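By way of illustration only, the global alignment approach can be sketched as follows. This is a minimal Needleman-Wunsch implementation with a linear gap penalty; the function name `global_identity` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example and are not the defaults of the EMBOSS needle program. Percent identity is reported here as identical columns divided by total alignment columns, which is one of several conventions in use.

```python
def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity taken from one optimal global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the corner, counting identical and total columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                identical += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return 100.0 * identical / columns if columns else 0.0
```

With these assumed scores, two identical sequences give 100% identity, and a single substitution in a 7-nucleotide sequence gives 6/7, or approximately 85.7%.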
Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.
Use of BLASTN as described above is preferred for use in the determination of sequence identity for polynucleotide variants according to the present invention.
Alternatively, variant polynucleotides of the present invention hybridize to the polynucleotide sequences disclosed herein, or complements thereof under stringent conditions.
The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30°C (for example, 10°C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 84:1390). Typical stringent conditions for polynucleotide molecules of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.
With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10°C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
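As a worked illustration of the short-polynucleotide rule, the following sketch computes the approximate Tm reduction and the exemplary stringent hybridization window of 5 to 10°C below Tm. The function names are hypothetical, and the Tm value passed in is assumed to have been estimated separately.

```python
def short_oligo_tm_reduction(length_nt):
    """Approximate Tm reduction (degrees C) for a polynucleotide under 100 bases."""
    if not 0 < length_nt < 100:
        raise ValueError("rule applies to lengths of 1 to 99 bases")
    return 500.0 / length_nt


def stringent_window(tm_c):
    """Exemplary stringent hybridization range: 10 to 5 degrees C below Tm."""
    return (tm_c - 10.0, tm_c - 5.0)
```

For example, a 25-base oligonucleotide would have its Tm reduced by approximately 500/25 = 20°C, and a duplex with an assumed Tm of 60°C would be hybridized at about 50 to 55°C.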
Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
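The degeneracy referred to above can be illustrated with the standard genetic code. The sketch below builds the standard codon table (bases ordered T, C, A, G) and lists the synonymous codons for a given codon; the names `CODON_TABLE` and `synonymous_codons` are chosen for the example only, and organism-specific variant codes are not considered.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon):
    """Return all codons encoding the same amino acid as `codon` (sorted)."""
    aa = CODON_TABLE[codon.upper()]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)
```

Calling `synonymous_codons("ATG")` or `synonymous_codons("TGG")` returns only the codon itself, which is why methionine and tryptophan codons cannot be altered silently, whereas a leucine codon such as CTG has six synonymous alternatives.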
Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.
A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is at least 5 nucleotides in length. The fragments of the invention comprise at least 5 nucleotides, preferably at least 10 nucleotides, preferably at least 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide of the invention.
The term "primer" refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein.
The term "polypeptide", as used herein, encompasses amino acid chains of any length, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides of the present invention may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.
The term "isolated" as applied to the polynucleotide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant or synthetic polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The term "genetic construct" includes "expression construct" as herein defined. The genetic construct may be linked to a vector.
The term "expression construct" refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction: a) a promoter functional in the host cell into which the construct will be transformed, b) the polynucleotide to be transcribed and/or expressed, and c) a terminator functional in the host cell into which the construct will be transformed.
The term "vector" refers to a polynucleotide molecule, usually double stranded DNA, which may include a genetic construct and be used to transport the genetic construct into a host cell. The vector may be capable of replication in at least one additional host system, such as Escherichia coli or Agrobacterium tumefaciens.
The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
"Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, chemical-inducible regulatory elements, environment-inducible regulatory elements, enhancers, repressors and terminators.
The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.
Terminators are sequences, which terminate transcription, and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.
The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.
A "transformed plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species or from a different species in which case it can also be known as a "transgenic plant".
An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,
(5')GATCTA TAGATC(3')
(3')CTAGAT ATCTAG(5')
Read-through transcription will produce a transcript that undergoes complementary base-pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
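The inverted repeat relationship can be checked computationally: the second arm must equal the reverse complement of the first. The following sketch, in which the function names and the scanning strategy are illustrative assumptions, tests this for the example arms GATCTA and TAGATC and searches a sequence for arms separated by a 3-5 bp spacer.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hairpin_candidates(seq, arm_len, min_spacer=3, max_spacer=5):
    """List (start, spacer_length) pairs where an arm of arm_len bases is
    followed, after a spacer, by its reverse complement."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len - min_spacer + 1):
        left = seq[start:start + arm_len]
        for spacer in range(min_spacer, max_spacer + 1):
            right_start = start + arm_len + spacer
            right = seq[right_start:right_start + arm_len]
            if len(right) == arm_len and right == reverse_complement(left):
                hits.append((start, spacer))
    return hits
```

Here `reverse_complement("GATCTA")` gives `"TAGATC"`, matching the example above, and scanning the hypothetical sequence `GATCTAAAAATAGATC` with an arm length of 6 reports a single candidate at position 0 with a 4 bp spacer.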
The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide of the invention, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The "altered expression" can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.
The term "loxP-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of a Cre recombinase recognition site. The loxP-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
A loxP-like sequence is between 24-100 bp in length, preferably 24-80 bp in length, preferably 24-70 bp in length, preferably 24-60 bp in length, preferably 24-50 bp in length, preferably 24-40 bp in length, preferably 24-34 bp in length, preferably 26-34 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
A loxP-like sequence preferably comprises the consensus motif 5' ATAACTTCGTATANNNNNNNNTATACGAAGTTAT 3' (where N = any nucleotide), or similar sequences.
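A sequence can be screened for the loxP consensus motif, or for similar sequences, by sliding a 34 bp window along it and counting mismatches at the non-N positions. The sketch below is one possible approach only; the function names and the `max_mismatches` threshold are assumptions for illustration, and sequence similarity alone does not establish that a candidate functions as a Cre recombinase recognition site.

```python
LOXP_CONSENSUS = "ATAACTTCGTATANNNNNNNNTATACGAAGTTAT"  # 13 bp arm, 8 bp spacer, 13 bp arm


def mismatches(window, consensus=LOXP_CONSENSUS):
    """Count positions where window differs from consensus, ignoring N positions."""
    return sum(1 for w, c in zip(window, consensus) if c != "N" and w != c)


def find_loxp_like(seq, max_mismatches=0):
    """Return (position, mismatch_count) for each 34 bp window within the threshold."""
    seq = seq.upper()
    k = len(LOXP_CONSENSUS)
    hits = []
    for i in range(len(seq) - k + 1):
        d = mismatches(seq[i:i + k])
        if d <= max_mismatches:
            hits.append((i, d))
    return hits
```

Raising `max_mismatches` widens the search to "similar sequences"; any candidates found this way would still need to be tested for recombination.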
The term "frt-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an FLP recombinase recognition site. The frt-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
An frt-like sequence is between 28-100 bp in length, preferably 28-80 bp in length, preferably 28-70 bp in length, preferably 28-60 bp in length, preferably 28-50 bp in length, preferably 28-40 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
An frt-like sequence preferably comprises the consensus motif 5' GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC 3' (where W = A or T, N = any nucleotide).
The term "T-DNA border-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant. The T-DNA border-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two or more sequences derived from the genome of a plant.
A T-DNA border-like sequence is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A T-DNA border-like sequence preferably comprises the consensus motif: 5' GRCAGGATATATNNNNNKSTMAWN 3' (where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
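Because the consensus motif uses degenerate nucleotide codes, it can be expanded into a regular expression for searching sequence data. The following sketch maps only the codes defined above (R, K, S, M, W, N) to character classes; the function names are illustrative. The test sequence used below is the chimeric right border described elsewhere in this specification, 5'GAC3' followed by AGGATATATTGGCGGGTAAAC.

```python
import re

# Degenerate codes as defined for the consensus motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "K": "[GT]", "S": "[CG]",
    "M": "[AC]", "W": "[AT]", "N": "[ACGT]",
}
BORDER_CONSENSUS = "GRCAGGATATATNNNNNKSTMAWN"


def consensus_to_regex(consensus):
    """Translate a degenerate consensus into a regular expression string."""
    return "".join(IUPAC[c] for c in consensus.upper())


def find_border_like(seq, consensus=BORDER_CONSENSUS):
    """Return start positions of all (possibly overlapping) consensus matches."""
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]
```

Only the given strand is searched; a fuller screen would also search the reverse complement.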
The T-DNA border-like sequence of the invention is preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to any Agrobacterium-derived T-DNA border sequence.
Although not preferred, a T-DNA border-like sequence of the invention may include a sequence naturally occurring in a plant which is modified or mutated to change the efficiency at which it is capable of integrating a linked polynucleotide sequence into the genome of a plant.
The term "T-DNA-like sequence" refers to a sequence derived from a plant genome which includes at one or both ends a T-DNA border-like sequence, or a chimeric T-DNA border-like sequence as herein defined. A T-DNA-like sequence may include additional base sequence between the T-DNA border-like sequences, or to one side of a T-DNA border-like sequence. The base sequence of the T-DNA-like sequences of the invention preferably includes restriction sites or alternative cloning sites to facilitate insertion of further polynucleotide sequences.
The term "chimeric T-DNA border-like sequence" refers to a sequence which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein part of the sequence is derived from a plant and part of the sequence is derived from another source, such as Agrobacterium.
Upon plant transformation, it is well understood that T-DNA integration from the right border is very precise. Molecular cloning and sequencing across T-DNA/plant genomic DNA junctions has repeatedly established that T-DNA integration at the right border is highly conserved, with only the first few nucleotides of the right border being integrated into plant genomes (Gheysen, G., Angenon, G., van Montagu, M., Agrobacterium-mediated plant transformation: a scientifically intriguing story with significant applications, pp. 1-33, in Transgenic Plant Research, editor Lindsey, K., Harwood Academic Publishers, Amsterdam, 1998).
For this reason, when deriving a chimeric border for use as a "right border" in intragenic transformation, it is only necessary for the first few nucleotides (up to four nucleotides) to be of plant origin; i.e. 5'GRCA...3'. The remaining DNA sequence of such right borders can be authentic sequences from Agrobacterium T-DNA borders.
It will be well understood by those skilled in the art that a DNA sequence of 5'GRCA3' will occur frequently in any genome. It is expected to be found at random once in every 256 nucleotides and is likely to be found on any other fragment useful for the construction of vectors for plant transformation.
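The expected frequency can be worked through under the simplifying assumption, made here for illustration only, that the four bases occur independently and with equal frequency (real genomes deviate from this). Any one specific four-nucleotide sequence, such as GACA, is then expected at 1 in 4^4 = 256 positions on a given strand. Because R stands for either G or A, the degenerate motif GRCA covers two such sequences and would be expected about twice as often. The function name below is hypothetical.

```python
from fractions import Fraction

# Number of bases matched by each nucleotide code.
CODE_SIZES = {"A": 1, "C": 1, "G": 1, "T": 1,
              "R": 2, "Y": 2, "K": 2, "M": 2, "S": 2, "W": 2, "N": 4}


def match_probability(motif):
    """Probability that a random position matches motif, assuming
    independent, equally frequent bases."""
    p = Fraction(1)
    for code in motif.upper():
        p *= Fraction(CODE_SIZES[code], 4)
    return p
```

Under this assumption `match_probability("GACA")` is 1/256 and `match_probability("GRCA")` is 1/128, which supports the point that such a sequence is readily found on fragments used for vector construction.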
The term "border sequence" refers to a sequence derived from a plant which can perform the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
A "border sequence" is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A "border sequence" preferably comprises the consensus motif: 5' GRCAGGATATATNNNNNKSTMAWN 3' (where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
The term "border sequence" as used herein includes known Agrobacterium borders, including those disclosed herein.
The term "border sequence" also includes modified versions of known Agrobacterium sequences, which have been modified, for example by substitution, addition or deletion, to improve the efficiency at which they are capable of performing the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
The terms "origin of replication derived from a plant" or "plant-derived origin of replication" or grammatical equivalents thereof refer to a sequence derived from a plant which can support replication of a vector in which it is included in a bacterium. The "plant-derived origins of replication" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived origins of replication" are composed of two sequence fragments derived from plants.
The plant-derived origin of replication may comprise the consensus motif:
AGApCAlATAAGCCT TaBcAlATAACAGCiCC
Where R = G or A (Pu), Y = C or T (Py) and W = A or T
Alternatively the plant-derived origin of replication comprises the consensus motif:
AGAllC a1§AT AAGCCTjjjT Aj§C AjlAT A AC AGCBCC
Where R = G or A (Pu), Y = C or T (Py) and W = A or T
The terms "selectable marker derived from a plant" or "plant-derived selectable marker" or grammatical equivalents thereof refer to a sequence derived from a plant which can enable selection of a plant cell harbouring the sequence or a sequence to which the selectable marker is linked. The "plant-derived selectable markers" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived selectable markers" are composed of two sequence fragments derived from plants.
In one embodiment, the plant-derived selectable marker is at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to the sequence of SEQ ID NO: 10.
Alternatively the plant-derived selectable marker is at least 90%, preferably at least 95%, and most preferably 100% identical to SEQ ID NO:39 or SEQ ID NO:40.
Methods for transforming plant cells, plants and portions thereof with polynucleotides are described in Draper et al., 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenberg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants. Imperial College Press, London.
It will be well understood by those skilled in the art that the intragenic vectors of the invention can function in the place of the binary vectors for Agrobacterium-mediated transformation and as vectors for direct DNA uptake approaches.
The invention provides novel plant-derived loxP-like and frt-like recombinase recognition sequences, novel T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for producing transformed plant cells and plants, and plant cells and plants produced by the methods.
The majority of selectable markers for plant transformation are antibiotic or herbicide resistance genes; their presence in transgenic crop plants has given rise to public concerns on environmental safety.
Regardless of their origin, once a transgenic plant has been established marker genes are no longer required. Moreover it is desirable to remove the promoter and enhancer elements used to drive the expression of the marker genes as these may interfere with the expression of neighboring endogenous genes.
Previously, site-specific recombination systems have been elegantly used to excise precise sequences corresponding to selectable marker constructs in transgenic plants (reviewed by Gilbertson, L. Cre-lox recombination: Cre-ative tools for plant biotechnology, TRENDS in Biotechnology 21(12) 550-555, 2003).
Two such recombination systems are the Escherichia coli bacteriophage P1 Cre/loxP system and the Saccharomyces cerevisiae FLP/frt system, which require only a single-polypeptide recombinase, Cre or FLP, and minimal 34 bp DNA recombination sites, loxP or frt.
When two recombination sites in the same orientation flank integrated DNA such as a selectable marker, recombinase mediates a crossover between these sites effectively excising the intervening DNA.
The recombinase enzyme can either be located next to the selectable marker gene so that it is in effect auto excised (Mlynarova, L. and Nap, J-P., A self-excising Cre recombinase allows efficient recombination of multiple ectopic heterospecific lox sites in transgenic tobacco, Transgenic Research, 12: 45-57, 2003), or it can be transiently expressed (Gleave, A.P., Mitra, D.S., Mudge, S.R. and Morris, B.A.M. Selectable marker-free transgenic plants without sexual crossing: transient expression of cre recombinase and use of a conditional lethal dominant gene, Plant Molecular Biology, 40: 223-235, 1999).
RECEIVED at IPONZ on 11 March 2010
Following excision only one recombination site remains.
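The outcome of excision between two directly repeated recombination sites can be modelled at the sequence level as follows. This is a simplified string model given for illustration only: the function name is hypothetical, the two sites are assumed to be identical and in the same orientation, and the actual strand exchange within the spacer is not represented.

```python
def excise(dna, site):
    """Model excision between the first two direct repeats of `site`.

    Returns the product retaining a single copy of the site; if fewer
    than two copies are present the input is returned unchanged.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    second = dna.find(site, first + len(site))
    if second == -1:
        return dna
    # Keep everything up to and including the first site, then resume
    # immediately after the second site, dropping the intervening DNA.
    return dna[:first + len(site)] + dna[second + len(site):]
```

For a hypothetical construct of the form left flank, site, marker, site, right flank, the product is left flank, site, right flank, so that only one recombination site remains.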
The applicants have identified novel T-DNA border-like sequences from plant genomes and devised improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant cell or plant.
The applicants provide T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for transforming plant cells and plants, and the plant cells and plants produced by the methods.
The applicants have also identified novel plant-derived loxP-like and frt-like recombinase recognition sequences from plant genomes and devised further improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant.
It will be understood by those skilled in the art that corresponding recombinase sequences can be expressed in plants in order to facilitate recombination of the loxP-like and frt-like recombinase recognition sequences of the invention.
The invention provides methods which allow for within-species or "intragenic" as opposed to transgenic transformation of plants. Vectors useful for this approach can therefore be described as intragenic vectors. The invention provides such intragenic vectors and methods of using them to produce intragenic transformed plants without any foreign DNA.
It will be understood by those skilled in the art that DNA sequences used to construct such "intragenic vectors" are preferentially derived from DNA sequences (ESTs or cDNAs) known to be expressed in plant genomes. In this manner sequences derived from heterochromatic regions, promoters or introns can be avoided. The use of such sequences for the construction of intragenic vectors may influence the subsequent expression of genes of interest following their transfer to plants via intragenic vectors.
The applicants provide novel T-DNA border-like sequences from several plant species (as shown in Example 1) formed by combining two to three fragments of genomic DNA, with all fragments being from a single plant species of interest or a closely related species. The common nature of such sequences in plant genomes is shown in Example 1.
The applicants further provide isolated T-DNA-like sequences from several plant species as shown in Example 2. The T-DNA-like region sequences in Example 2 include the T-DNA-like sequences flanked (and delineated) by T-DNA border-like sequences (highlighted) and additional sequence on either one or both sides of the T-DNA-like sequence.
Plant-derived selectable marker sequences which are useful for selecting transformed plant cells and plants harbouring a particular T-DNA-like sequence include PPga22 (Zuo et al., Curr Opin Biotechnol. 13: 173-80, 2002), Cki1 (Kakimoto, Science 274: 982-985, 1996), Esr1 (Banno et al., Plant Cell 13: 2609-18, 2001), and dhdps-r1 (Ghislain et al., Plant Journal, 8: 733-743, 1995). It is also possible to use pigmentation markers to visually select transformed plant cells and plants, such as the R and C1 genes (Lloyd et al., Science, 258: 1773-1775, 1992; Bodeau and Walbot, Molecular and General Genetics, 233: 379-387, 1992). A preferred plant-derived selectable marker is the acetohydroxyacid synthase gene as shown in Example 6 and Example 7. Non-plant derived selectable markers are also described herein.
Preferred intragenic vectors of the invention contain a plant-derived selectable marker which functions in selection of bacteria harbouring the marker as described in Example 3 and Example 5.
The preferred intragenic vectors of the invention consist entirely of plant-derived polynucleotide sequence from the species to be transformed, or from closely related species, such as species interfertile with the plant to be transformed, considered to be within the germplasm pool accessible to traditional plant breeding. Such vectors preferably include a plant-derived origin of replication which is functional in bacteria, particularly in Agrobacterium species and preferably also in E. coli. The invention provides plant transformation vectors comprising such sequences. Preferred origin of replication sequences include those shown in Example 4.
The invention provides novel loxP-like and frt-like recombinase recognition sequences from several plant species as shown in Example 9 and Example 10.
Construction of a vector is described in Example 6 and Example 8. Plant transformation using these vectors is described in Example 7 and Example 8.
Example 6 and Example 7 also illustrate the construction and successful use of a vector with a chimeric T-DNA border-like sequence. In this instance the "right border" is composed of 5'GAC3' from the end of a sequence isolated from Arabidopsis thaliana, with the remainder of the chimeric T-DNA border-like sequence, 5'AGGATATATTGGCGGGTAAAC3', being derived from the binary vector pART27 (see sequence of pTCl in Example 6). Such chimeric T-DNA border-like sequences are preferably used as the right border when two border-like sequences are used to flank the T-DNA-like sequence. When vectors with only one border-like sequence are used, the plant-derived end (e.g. 5'GRC3') of the T-DNA border-like sequence must be contiguous with the plant-derived sequence(s) destined for integration into a plant genome.
The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
Further methods for isolating polynucleotides of the invention include use of all, or portions of, the disclosed polynucleotide sequences as hybridization probes. The technique of hybridizing labeled polynucleotide probes to polynucleotides immobilized on solid supports such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.
The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis.
A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding further contiguous polynucleotide sequence. Such methods would include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1988, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
It will be understood by those skilled in the art that in order to produce intragenic vectors for further species it may be necessary to identify the sequences corresponding to essential or preferred elements of such vectors in other plant species. It will be appreciated by those skilled in the art that this may be achieved by identifying polynucleotide variants of the sequences disclosed. Many methods are known by those skilled in the art for isolating such variant sequences.
Variant polynucleotides may be identified using PCR-based methods (Mullis et al, Eds. 1994 The Polymerase Chain Reaction, Birkhauser). Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
Further methods for identifying variant polynucleotides of the invention include use of all, or portions of, the polynucleotides disclosed herein as hybridization probes to screen plant genomic or cDNA libraries as described above. Typically probes based on a sequence encoding a conserved region of the corresponding amino acid sequence may be used.
Hybridisation conditions may also be less stringent than those used when screening for sequences identical to the probe.
The variant polynucleotide sequences of the invention may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.
An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
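By way of illustration only, the query and database types handled by each BLAST program, as described above, can be tabulated in a short Python sketch. The names `BlastMode`, `BLAST_PROGRAMS` and `candidate_programs` are hypothetical and are not part of the BLAST suite; the sketch merely restates the comparisons listed in the preceding paragraph.

```python
from typing import NamedTuple


class BlastMode(NamedTuple):
    """Query/database types for one BLAST program (illustrative structure only)."""
    query: str                 # "nucleotide" or "protein"
    database: str              # "nucleotide" or "protein"
    query_translated: bool     # query translated in all reading frames
    database_translated: bool  # database translated in all reading frames


# Restatement of the program descriptions given in the text above.
BLAST_PROGRAMS = {
    "BLASTN": BlastMode("nucleotide", "nucleotide", False, False),
    "BLASTP": BlastMode("protein", "protein", False, False),
    "BLASTX": BlastMode("nucleotide", "protein", True, False),
    "tBLASTN": BlastMode("protein", "nucleotide", False, True),
    "tBLASTX": BlastMode("nucleotide", "nucleotide", True, True),
}


def candidate_programs(query_type: str, database_type: str) -> list:
    """Return the programs that accept the given query and database types."""
    return [
        name
        for name, mode in BLAST_PROGRAMS.items()
        if mode.query == query_type and mode.database == database_type
    ]


if __name__ == "__main__":
    # A nucleotide query against a nucleotide database can use either program.
    print(candidate_programs("nucleotide", "nucleotide"))
```

For a nucleotide query screened against an EST database, for instance, such a lookup would return BLASTN and tBLASTX.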
The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.
The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.
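The relationship between an Expect value and the probability of a chance match can be sketched as follows. The sketch assumes the commonly used Poisson approximation, under which the probability of at least one chance hit is 1 - exp(-E); this formula and the function name `chance_match_probability` are assumptions made for illustration and are not recited in this specification.

```python
import math


def chance_match_probability(e_value: float) -> float:
    """Probability of at least one chance hit, assuming hits are Poisson-distributed.

    Uses P = 1 - exp(-E); expm1 keeps precision for very small E values.
    """
    if e_value < 0:
        raise ValueError("Expect value must be non-negative")
    return -math.expm1(-e_value)


if __name__ == "__main__":
    for e in (0.01, 0.1, 1.0):
        print(f"E = {e}: P(chance match) = {chance_match_probability(e):.4f}")
```

Under this approximation an E value of 0.01 corresponds to a probability just under 1%, consistent with the statement above, while the probability and the E value diverge as E approaches 1.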
To identify the polynucleotide variants most likely to be functional equivalents of the disclosed sequences, several further computer based approaches are known to those skilled in the art.
Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, http://www-igbmc.u-strasbg.fr/BioInfo/ClustalW/Top.html) or T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).
Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple Em for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.
PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
The function of a variant of a polynucleotide of the invention may be assessed by replacing the corresponding sequence in an intragenic vector with the variant sequence and testing the functionality of the vector in a host bacterial cell or in a plant transformation procedure as herein defined.
Methods for assembling and manipulating genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.
Numerous traits in plants may also be altered through methods of the invention. Such methods may involve the transformation of plant cells and plants, using a vector of the invention including a genetic construct designed to alter expression of a polynucleotide or polypeptide which modulates such a trait in plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of the construct of the invention and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides which modulate such traits in such plant cells and plants.
A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
Transformation strategies may be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage which/when it is normally expressed. Such strategies are known as gene silencing strategies.
Direct gene transfer involves the uptake of naked DNA by cells and its subsequent integration into the genome (Conner, A.J. and Meredith, C.P., Genetic manipulation of plant cells, pp. 653-688, in The Biochemistry of Plants: A Comprehensive Treatise, Vol 15, Molecular Biology, editor Marcus, A., Academic Press, San Diego, 1989; Petolino, J. Direct DNA delivery into intact cells and tissues, pp. 137-143, in Transgenic Plants and Crops, editors Khachatourians et al., Marcel Dekker, New York, 2002). The cells can include those of intact plants, pollen, seeds, intact plant organs, in vitro cultures of plants, plant parts, tissues and cells or isolated protoplasts. Those skilled in the art will understand that methods to effect direct DNA transfer may involve, but are not limited to: passive uptake; the use of electroporation; treatments with polyethylene glycol and related chemicals and their adjuncts; electrophoresis; cell fusion with liposomes or spheroplasts; microinjection; silicon carbide whiskers; and microparticle bombardment.
Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters. Choice of promoter will depend upon the temporal and spatial expression of the cloned polynucleotide, so desired. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive promoters used in plants include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, respond to internal developmental signals or external abiotic or biotic stresses are also described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.
Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyl transferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.
It will be understood by those skilled in the art that non-plant derived regulatory elements described above may be used in the intragenic vectors of the invention operably linked to selectable markers placed between the recombinase recognition sites.
Gene silencing strategies may be focused on the gene itself or regulatory elements which effect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.
Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention. In such constructs the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator.
An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,
5'GATCTA 3' (coding strand)        3'CUAGAU 5' mRNA
3'CTAGAT 5' (antisense strand)     5'GAUCUA 3' antisense RNA
Genetic constructs designed for gene silencing may also include an inverted repeat as herein defined. The preferred approach to achieve this is via RNA-interference strategies using genetic constructs encoding self-complementary "hairpin" RNA (Wesley et al., 2001, Plant Journal, 27: 581-590).
The transcript formed may undergo complementary base pairing to form a hairpin structure. Usually a spacer of at least 3-5 bp between the repeated region is required to allow hairpin formation.
Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al., 2002, Science 297, 2053). Use of such small antisense RNA corresponding to a polynucleotide of the invention is expressly contemplated.
The term genetic construct as used herein also includes small antisense RNAs and other such polynucleotides effecting gene silencing.
Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al., 1990, Plant Cell 2, 279; de Carvalho Niebel et al., 1995, Plant Cell, 7, 347). In some cases sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of a non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR). Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al., 2002, Plant Physiol. 128(3): 844-53; Jones et al., 1998, Planta 204: 499-505). The use of such sense suppression strategies to silence the expression of a polynucleotide of the invention is also contemplated.
The polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, or the corresponding gene.
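The strand relationships used in antisense and hairpin constructs can be illustrated with a minimal Python sketch. It uses the six-base example sequence GATCTA from the description; the function names and the five-base spacer are hypothetical and are chosen only to show how an inverted repeat separated by a spacer gives a self-complementary transcript.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def transcribe(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def are_complementary(rna_a: str, rna_b: str) -> bool:
    """True if two RNAs, both written 5' to 3', can base-pair along their full length."""
    as_dna = rna_a.upper().replace("U", "T")
    return transcribe(reverse_complement(as_dna)) == rna_b.upper()


def hairpin_insert(segment: str, spacer: str) -> str:
    """Build an inverted repeat: segment, spacer, then the segment's reverse complement."""
    return segment.upper() + spacer.upper() + reverse_complement(segment)


if __name__ == "__main__":
    segment = "GATCTA"
    print(reverse_complement(segment))            # other DNA strand, written 5' to 3'
    print(are_complementary("GAUCUA", "UAGAUC"))  # the two RNAs of the example pair
    print(hairpin_insert(segment, "AAAAA"))       # hypothetical 5-base spacer
```

In the hairpin case, the transcript of the assembled insert carries both the segment and its reverse complement, so the two arms can pair with each other while the spacer forms the loop.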
Other gene silencing strategies include dominant negative approaches and the use of ribozyme constructs (McIntyre, 1996, Transgenic Res, 5, 257).
Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements. Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.
The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: onions (WO 00/44919); peas (Grant et al., 1995, Plant Cell Rep. 15, 254-258; Grant et al., 1998, Plant Science, 139: 159-164); petunia (Deroles and Gardner, 1988, Plant Molecular Biology, 11: 355-364); Medicago truncatula (Trieu and Harrison, 1996, Plant Cell Rep. 16: 6-11); rice (Alam et al., 1999, Plant Cell Rep. 18, 572); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al., 1996, Plant J. 9: 821); cassava (Li et al., 1996, Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep. 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); and cereals (US Patent No. 6,074,877). It will be understood by those skilled in the art that the above protocols may be adapted, for example, for use with alternative selectable markers for transformation.
The plant-derived sequences in the vectors of the invention may be derived from any plant species.
In one embodiment the plant-derived sequences in the vectors of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannia x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant-derived sequences in the vectors of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant-derived sequences in the vectors of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant-derived sequences in the vectors of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The plant cells and plants of the invention may be derived from any plant species.
In one embodiment the plant cells and plants of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannia x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant cells and plants of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant cells and plants of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant cells and plants of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The cells and plants of the invention may be grown in culture, in greenhouses or the field. They may be propagated vegetatively, as well as either selfed or crossed with a different plant strain and the resulting hybrids, with the desired phenotypic characteristics, may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows PCR verification of the propagation of plasmid pPOTCOLE2SPEC in E. coli mediated by a potato-derived COLE2-like origin of replication. Lanes 1 and 2 are plasmid preparations restricted with a BamHI/EcoRI double digest from two independent transformation events of pPOTCOLE2SPEC into E. coli DH5α already possessing pBX243; they show 3.9 kb, 2.5 kb, and 1.5 kb fragments, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene respectively. Lane 3 is a plasmid preparation restricted with a BamHI/EcoRI double digest from a culture transformed with only pBX243 and shows 3.9 kb and 1.5 kb fragments, representing the pBX243 backbone and the pBX243 Rep gene. Lane 4 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker.
Figure 2 shows PCR verification of the potato-derived LacO1-like sequences functioning as a plasmid selectable element by operator-repressor titration. Lane 1 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker. Lanes 2-6 are plasmid preparations restricted with PstI from five independent transformation events of pBR322POTLACO1 into E. coli strain DH1 lacdapD using repressor titration selection; they show the expected 1.3 kb and 3.8 kb fragments. Lane 7 is a plasmid preparation restricted with PstI following transformation of pBR322POTLACO1 into E. coli strain DH5α using ampicillin selection and also shows the expected 1.3 kb and 3.8 kb fragments. Lane 8 is linearised pBR322 visualised as a 4.4 kb fragment.
Figure 3 shows PCR verification of Arabidopsis thaliana 'Columbia' transformed with the intragenic vector pTCAHAS. Lanes 1&2, 3&4 and 5&6 are three A. thaliana lines transformed with the intragenic vector, lanes 1,3,5 using primers E+F, lanes 2,4,6 using primers G+H; lanes 8&9 are untransformed A. thaliana, lane 8 using primers E+F, lane 9 using primers G+H; lanes 10&11 are no template controls, lane 10 using primers E+F, lane 11 using primers G+H; lanes 12&13 are the intragenic vector pTCAHAS, lane 12 using primers E+F, lane 13 using primers G+H; lanes 7&14 are the 100 bp molecular ruler (170-8206, BioRad laboratories, USA). Primers E+F amplify an expected 643 bp fragment and primers G+H amplify an expected 149 bp fragment from the T-DNA-like region of pTCAHAS.
Figure 4 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPOTINV. This involved a multiplexed PCR using primers I+J to amplify the 570 bp fragment from the pPOTINV T-DNA-like region and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is the intragenic vector pPOTINV, lane 6 is a no template control.
Figure 5 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 4. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is Agrobacterium strain A4T, lane 6 is a no template control.
Figure 6 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPETINV. This involved PCR using primers O+P to amplify the 447 bp fragment from the pPETINV T-DNA-like region (lanes 2-5) and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato (lanes 6-8). Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lanes 2 and 6 are the co-transformed hairy root line #24; lanes 3 and 7 are a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV; lane 4 is the intragenic vector pPETINV; lanes 5 and 8 are no template controls.
Figure 7 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 6. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 4 is Agrobacterium strain A4T, lane 5 is a no template control.
Figure 8 illustrates recombination between the POTLOXP sites mediated by Cre recombinase. Plasmid was isolated from E. coli strain 294-Cre transformed with pPOTLOXP2 and restricted with SalI. Expression of Cre recombinase was induced by raising the temperature from 23 °C to 37 °C. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lane 2 illustrates the expected 3.0 kb and 2.3 kb SalI fragments of unrecombined pPOTLOXP2 isolated from a culture maintained at 23 °C; lanes 3-8 illustrate the 3.0 kb and 1.5 kb SalI fragments expected from Cre-mediated recombination between the POTLOXP sites in six different colonies cultured at 37 °C.
Figure 9 illustrates recombination between the POTFRT sites mediated by FLP recombinase. Plasmid was isolated from E. coli strain 294-FLP transformed with pPOTFRT2 and restricted with SalI. Expression of FLP recombinase was induced by raising the temperature from 23 °C to 37 °C. Lanes 1 and 8 are the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker; lane 2 illustrates the expected 3.0 kb and 1.4 kb SalI fragments of unrecombined pPOTFRT2 isolated from a culture maintained at 23 °C; lanes 3-7 illustrate the 3.0 kb and 1.4 kb fragments, and the 1.1 kb SalI fragments expected from FLP-mediated recombination between the POTFRT sites in five different colonies cultured at 37 °C.
EXAMPLES
The invention will now be illustrated with reference to the following non-limiting examples.
Example 1
Identification of T-DNA border-like sequences in many plant species
Agrobacterium T-DNA borders contain the following consensus motif: 5'GRCAGGATATATNNNNNKSTMAWN3' (where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
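The degenerate consensus motif above can be scanned for in a sequence by converting each ambiguity code into a character class. The following Python sketch is illustrative only: the names `IUPAC_CLASSES`, `motif_to_regex` and `find_border_like` are hypothetical, only the ambiguity codes defined above are supported, and only the strand supplied is searched.

```python
import re

# Ambiguity codes as defined for the consensus motif in the text.
IUPAC_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "K": "[GT]", "S": "[CG]",
    "M": "[AC]", "W": "[AT]", "N": "[ACGT]",
}

BORDER_CONSENSUS = "GRCAGGATATATNNNNNKSTMAWN"


def motif_to_regex(motif: str) -> "re.Pattern":
    """Compile a degenerate motif into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC_CLASSES[base] for base in motif.upper())
    return re.compile("(?=" + body + ")")


def find_border_like(sequence: str, motif: str = BORDER_CONSENSUS) -> list:
    """Return 0-based start positions of motif matches on the given strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence: two flanking bases, a 24-base match, two flanking bases.
    example = "TT" + "GACAGGATATATACGTAGCTAATG" + "CC"
    print(find_border_like(example))
```

To search both strands, the same function could be applied to the reverse complement of the sequence.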
A search of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches", restricted to the EST databases, yielded multiple accession numbers for each of the motifs 5'GACAGGATATAT3' and 5'GGCAGGATATAT3', as shown in Table 1. The search was limited to Viridiplantae and the Expect value was 10000. Searches were also conducted in the EST Database of Japan (http://www.ddbj.nig.ac.jp) using Expect values of 10000 with the gap tool off.
Table 1. Plant species and DNA accession numbers in which a partial T-DNA border has been identified in EST sequences. All accession numbers were found from searches in the NCBI Genbank EST databases, except for those labelled A which were identified using the TIGR database and those labelled B which were found in the EST Database of Japan.
Note: 1 indicates 5'GRCAGGATAT(A); 2 indicates 5'GRCAGGATA3'; 3 indicates 5'GRCAGGAT3'; 4 indicates 5'GRCAGGA3'; and "+" indicates there are many more accessions than the example(s) listed.
Plant group and species 'GACAGGATATAT3' 'GGCAGGATATAT3' Dicotyledonous plants Camelliaceae 44 Camellia (tea) "CV013936 'CV066981 Chenopodiaceae Beta vulgaris (beet) bBI096344 BQ586429 Compositae Lactuca sativa (lettuce) 'BU005745, JBU005333, 'BU002323 + BBQ851200, BU011977 + Helianthus annuus (sunflower) BU0123919, BU024595, BU025229 + BU018078, BQ976399, BQ976382 + Zinnia elegans AU288531 AU303793 Convolvulaceae Ipomoea batatas (sweet potato) CB330857, CB330537, CB330346, CN330857 + JCB329905 Cruciferae Arabidopsis thaliana AV783572 CB264522 + Brassica oleracea 2CV973863 3CV973875 Brassica napus CD812561 'CN737684 Cucurbitaceae Cucumis sativus (cucumber) '00)86108 'CK085499 Ericaceae Vaccinium (blueberry) 2CV190833 'CF810976 Euphorbiaceae Hevea brasiliensis (rubber) 'CB376888 JCB376996 Manihot esculenta (cassava) 'CK644474 CK647455, CK646256, CK643162 Fagaceae Betula (birch) jCD276790 'CD278538 Lauraceae Persea americana (avocado) 'CK766454, !CV457849 CK754032, CK749566 Leguminosae Glycine max (soybean) CX705662, CX548491, CF922194 C0985845, CD398345, CD390961 + Lotus corniculatus BP046527 BP085776 + 45 Medicago truncatula CB891412 + CA921810 + Pisum sativum (pea) 2CD861031,2CD860484 2CD859175,2CD859173 Pisum sativum (pea) - chloroplast BX05395 Linaceae Linum usitatissimum (linseed, flax) CV478515 'CV478657 Malvaceae Gossypium arboreum (cotton) BQ403352 + BF270004 + Gossypium hirsutum (cotton) 'CA993786 C0491158, AI729959 Gossypium raimondii C0107555 Mesembryanthemaceae Mesembryanthemum crystallinum AW053481 + Moraceae Humulus lupulus (hop) 4CD527122 Musaceae Musa (banana) 4CV012662 Nymphaeaceae Nuphar advena CK753223 CD467171 Oleraceae Olea europaea (olive) 4C)K087200 Papaveraceae Eschscholzia californica CK755701 Pedaliaceae Sesamum indicum (sesame) BU669955 BU668412 Plumbaginaceae Plumbago zeylanica CB817698 Rosaceae Fragaria x ananassa (strawberry) '0X309734, C0817569 'C0817444, C0381912 Malus x domestica (apple) CN579782 Prunus domestica Prunus persica 
'AJ876058, 'AJ873533 'AJ826265, 'AJ827659 Prunus armeniaca CB819601 CK754032 + 46 Rubiaceae Coffea arabica (coffee) Coffea canephora (coffee) 'CF589163 'CF589153 Rutaceae Poncirus trifoliate CD574807, CD573690, CX672356 + CD574743 Citrus sinensis CN188154 + CB293973 Citrus (grapefruit) DN960139 'DN798417 Salicaeae Populus tremula CK108404 + Populus alba x tremula CF231132 + Populus tremula x tremuloides BU828279 + Populus sp. (poplar) CX659694, CX659667, CX658712 DN495050, DN484787, CV255709 + Solanaceae Capsicum annuum (pepper) Capsicum frutescens (pepper) C0911113, C0908671, CA515326 + CA524776, BM06678, BM066170 + Lycopersicon esculentum (tomato) AW217429 BM410480 + Lycopersicon penellii AW399741 Nicotiana benthamiana CK297930 + CK286647 + Nicotiana tabacum BL29276 Solanum tuberosum (potato) CN516032 + CK716936 + Sterculiaceae Theobroma cacao (cacao) 'CA797461, 'CA797357, 1CA797340 'CA795783 Umbelliferae Apium graveolens (celery) 'BU693260 'CN254199 Vitaceae Vitis vinifera (grape) CD715300, CD714256, CD009006 CB981221, CX017627, CN547415 + Monocotyledonous plants Amaryllidaceae 47 Lycoris longituba CN447505 CN448914 Gramineae Aegilops speltoides BF292132 Hordeum vulgare (barley) DN184623, DN184566, DN177451, CK567740 + CA031595, DN182886, DN177809, DN 177690 + Lolium multiflorum (ryegrass) AU249845 Oryza sativa (rice) CR278518, CR755701 + CK078411, CK068012, CK056225, CB682490 + Puccinellia tenuiflora CN486642 Saccharum (sugarcane) CF573560 Saccharum officinarum (sugarcane) CF576361, CA293676 + CA243368, CA235389, CA225160 + Sorghum bicolor (sorghum) CN131489, CN135330, CN124024 CD426230, CD424405 Sorghum propinquum BG051239 + Triticum aestivum (bread wheat) CV777378, CV776448, CV764325, CK216594 BJ319521, BJ318204, CK15202 8 + Triticum durum (pasta wheat) AJ716746 Secale cereale (rye) BF145815 BE586285 Zea mays (corn, maize, sweetcorn) DN221450, CF050111, CK370367, CF055801 + DN223506, DN222326, DN212047, CF057191 + Liliaceae Allium cepa (onion) CF452050+ 
CF445160 + Asparagus officinalis (asparagus) 'CV290039 CV289577 Palmae Elaeis guineesis (oil palm) 'CN601600 Gymnosperms Cycas rumphii CB090878 Picea glauca C0242116 C0484796, C0474325, C0257134, CK439582 + Picea sitchensis C0223865 C0226349, C0225190 Picea sp.
AC0251892, TC8803, TC10688 Pinus pinaster BX784355 + Pinus radiata JAA230194 iAA220891, 4AA220909 Pinus taeda CF666840+ CF672772 + Bryophytes Ceratodon purpureus AW098060 Physcomitrella patens BJ586625+ BJ585772 Tortula ruralis CN206945 Algae Chlamydomonas reinhardtii BM002336 + BI717507 +

The initial 5'GRCAGGATATAT3' of the T-DNA border-like motif is less likely to be identified in database searches than the shorter sequence 5'KSTMAWN3'. If the entire border sequence is to be formed from two EST sequences, as shown in Example 2, then a second BLAST search is undertaken using 5'KSTMAWN3' sequences from known T-DNA borders. Such sequences include 5'TGTCATG3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3' and 5'GGTAAAA3', which correspond to the following border sequences: 5'gacaggatatatgttcttgtcatg3' (pRi), 5'gacaggatatattggcgggtaaac3' (pTiT37 and pTiC58), 5'ggcaggatatatcgaggtgtaaaa3' (pTi15955), 5'ggcaggatatattgtggtgtaaac3' (pART27 lb) and 5'gacaggatatattggcgggtaaac3' (pART27 rb).
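The relationship between the 24-nucleotide borders and the short 5'KSTMAWN3' queries can be sketched as follows. This is an illustrative helper only: the function names are hypothetical, the split into 12 + 5 + 7 nucleotides follows the consensus 5'GRCAGGATATAT-NNNNN-KSTMAWN3', and the plasmid label "pTi15955" is a reading of the plasmid name given in the text.

```python
# Known border sequences as listed in the text (labels are dictionary keys only).
BORDERS = {
    "pRi": "gacaggatatatgttcttgtcatg",
    "pTiT37/pTiC58": "gacaggatatattggcgggtaaac",
    "pTi15955": "ggcaggatatatcgaggtgtaaaa",
    "pART27 lb": "ggcaggatatattgtggtgtaaac",
    "pART27 rb": "gacaggatatattggcgggtaaac",
}


def split_border(border):
    """Split a 24-nt border into (GRCAGGATATAT part, NNNNN spacer, KSTMAWN part)."""
    border = border.upper()
    if len(border) != 24:
        raise ValueError("expected a 24-nucleotide border sequence")
    return border[:12], border[12:17], border[17:]


def three_prime_queries(borders):
    """Collect the distinct 7-nt 3' ends to use as short BLAST queries."""
    return sorted({split_border(seq)[2] for seq in borders.values()})
```

Applied to the five borders above, `three_prime_queries(BORDERS)` gives four distinct 7-mers (GGTAAAC, TGTAAAA, TGTAAAC, TGTCATG). The fifth query listed in the text, 5'GGTAAAA3', is not the 3' end of any of these five borders.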
BLAST searches using these sequences produce multiple matches. For example, just within Solanum tuberosum (a plant whose genome has not been completely sequenced), a search (BLAST "search for short, nearly exact matches", Expect 20000 and descriptions 1000) for only 5'TGTAAAC3' in NCBI GenBank yields 997 exact matches, of which 985 are S. tuberosum ESTs (search performed 2 June 2004).
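The large number of matches for a 7-nucleotide query is consistent with simple chance expectations. As a rough sketch only, assuming independent and equally frequent bases (which real EST collections do not strictly satisfy), a given k-mer is expected about once per 4^k positions on each strand. The function name and the example database sizes are hypothetical.

```python
def expected_chance_matches(motif_length, database_bases, both_strands=True):
    """Expected number of exact chance matches to one fixed motif.

    Assumes uniform, independent base composition; edge effects at the
    ends of individual sequences are ignored.
    """
    per_position = 0.25 ** motif_length
    strands = 2 if both_strands else 1
    return database_bases * per_position * strands


# A 7-mer is expected once per 4**7 = 16384 bases on a single strand,
# whereas a 12-mer is expected once per 4**12 = 16777216 bases.
print(expected_chance_matches(7, 4 ** 7, both_strands=False))
print(expected_chance_matches(12, 4 ** 12, both_strands=False))
```

This is also why the short 5'KSTMAWN3' end is more readily found than the 12-nucleotide 5'GRCAGGATATAT3' portion of the motif.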
Alternatively, the sequences defined in Table 1 can be used for the design of "chimeric right borders" for plant-derived T-DNA-like sequences.

Example 2 Identification of T-DNA-like regions from plant genomes

The design of T-DNA-like regions for possible intragenic vectors was undertaken by searching plant EST databases for Agrobacterium border-like sequences. Limiting searches to EST sequences facilitates the design of intragenic vectors in two ways:
1. The base DNA making up the T-DNA-like region (including the T-DNA-like sequence and additional sequence at either one or both sides of the T-DNA-like sequence) does not involve regulatory elements such as promoters that may influence expression of inserted target genes; and
2. The DNA on which the T-DNA-like region is based is not derived from heterochromatic regions (non-coding, non-expressed, condensed DNA), as this may suppress activity of the genes intended for transfer.
BLAST searches were conducted as described by Altschul et al. (Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25: 3389-3402, 1997).
Border sequences used to search the databases:
Sequence motifs used to search the databases were 5'GACAGGATATAT3' or 5'GGCAGGATATAT3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3' and 5'GGTAAAA3'. Other known borders were also used as query sequences, these being:
5'GACAGGATATATGTTCTTGTCATG3' (pRi)
5'GACAGGATATATTGGCGGGTAAAC3' (pTiT37 and pTiC58)
5'GGCAGGATATATCGAGGTGTAAAA3' (pTi15955)
5'GGCAGGATATATTGTGGTGTAAAC3' (pART27 lb)
5'GACAGGATATATTGGCGGGTAAAC3' (pART27 rb)

Potato, Petunia and tomato vectors
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/
"blastn" and "search for short, nearly exact matches" were used to search the EST database. Expect values of 10000 or 20000 (dependent on word size) were used and the search was limited by entrez query to potato (Solanum), tomato (Lycopersicon) or Petunia. All Petunia EST sequences from the NCBI site were also downloaded in FASTA format and searched using the "find" tool in Microsoft Notepad.
Solanaceae genomics network - http://soldb.cit.cornell.edu/cgi-bin/tools/blast/simple.pl
BLAST settings included expect values of 10,000 (due to short sequences) and the default settings. All searches were done in EST databases. Unigene sequences were identified using the EST searches.
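The text search of downloaded FASTA files described above could equally be scripted. The following is a minimal sketch, not the procedure actually used: it parses FASTA text and reports exact matches to a motif on either strand, which a plain "find" on the text would miss for the reverse strand. The record names in the usage example are invented for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def parse_fasta(text):
    """Parse FASTA-formatted text into {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else "unnamed"
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def find_motif(records, motif):
    """Return (record, strand, 1-based start) for every exact match."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    queries = [("+", motif)]
    if rc != motif:  # avoid double-reporting palindromic motifs
        queries.append(("-", rc))
    hits = []
    for name, seq in records.items():
        for strand, query in queries:
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start + 1))
                start = seq.find(query, start + 1)
    return hits


# Hypothetical two-record example.
demo = parse_fasta(">est1 demo\nTTGACAGGATATATCC\n>est2\nAAATATATCCTGTCAA\n")
print(find_motif(demo, "GACAGGATATAT"))
```

Positions for "-" strand hits are reported on the forward strand of the record, at the start of the reverse-complemented motif.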
Pinus, Nicotiana, Medicago, apple and onion vectors
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/
BLAST was carried out as above with an Expect value of 10,000 and limited by entrez query to Pinus, Nicotiana, Medicago, apple or onion (Allium).
Rice vector
NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/. Settings were as above but limited by entrez query to rice or Oryza.
TIGR - http://tigrblast.tigr.org/tgi (searched unique gene indices). An Expect value of 10,000 and matrix blosum62 or blosum100 were used. All other values were the default settings. The searches identified some TC# sequences (tentative consensus sequences), and ESTs containing the region of interest were identified from these.
Staff - http://web.staff.or.jp/. The RGP EST database was used to search for ESTs containing the sequences of interest, using Expect values of 10000 and the remaining options at default settings.
Design of extended intragenic T-DNA-like regions

ESTs were identified that showed sequence identity to parts of the Agrobacterium border-like sequences. These identified EST sequences were then assessed for homology, length of sequence flanking the borders and unique restriction sites. This was carried out using DNAMAN (version 3.2, Lynnon BioSoft, copyright © 1994-1997). ESTs were adjoined (usually 3 ESTs) to give a T-DNA-like region containing two border sequences, unique restriction sites between the border sequences (that can be used as cloning sites) and extra plant EST sequence beyond the borders to minimize the opportunity for non-intragenic vector backbone sequences being transferred with the T-DNA-like region into plant genomes. Multiple intragenic T-DNA-like regions were designed and compared. Those designed to have the optimum sequence and useful unique restriction sites are presented below.

T-DNA-like region of a potato intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the potato genome sequence.
Nucleotides 6 - 334 are the reverse complement of nucleotides 315 - 643 of sgn-U179068.
Nucleotides 335 - 974 are nucleotides 131 - 770 of sgn-U174278.
Nucleotides 975 - 1265 are nucleotides 117 - 407 of CN216800.
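The coordinate mapping above can be checked mechanically: each stretch of the T-DNA-like region must be the same length as the EST interval it is taken from, and consecutive stretches must abut. The following is a minimal sketch with hypothetical names; coordinates are treated as 1-based and inclusive, and the two sgn identifiers are written as read from the text.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    region_start: int   # first nucleotide in the T-DNA-like region (1-based)
    region_end: int     # last nucleotide in the T-DNA-like region (inclusive)
    source: str         # EST or unigene identifier
    source_start: int
    source_end: int
    reverse_complement: bool


# Potato T-DNA-like region, as described in the three sentences above.
POTATO_SEGMENTS = [
    Segment(6, 334, "sgn-U179068", 315, 643, True),
    Segment(335, 974, "sgn-U174278", 131, 770, False),
    Segment(975, 1265, "CN216800", 117, 407, False),
]


def span(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def check_segments(segments):
    """Return a list of problems; an empty list means the map is consistent."""
    problems = []
    previous_end = None
    for seg in segments:
        if span(seg.region_start, seg.region_end) != span(seg.source_start, seg.source_end):
            problems.append(f"{seg.source}: length mismatch")
        if previous_end is not None and seg.region_start != previous_end + 1:
            problems.append(f"{seg.source}: not contiguous with previous segment")
        previous_end = seg.region_end
    return problems
```

For the potato region the three stretches are 329, 640 and 291 nucleotides long, and `check_segments(POTATO_SEGMENTS)` reports no problems.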
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 314 - 337 10 and the right border is nucleotides 957 - 980. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: A/Ill at 611 Agel at 518 ApaBl at 912 15 AsulsA.516 Aval at 357 Avail at 516 BamHl at 687 As/D102Iat514 20 C/rlOI at 518 Cfrl at 723 CM at 507 Cspl at 516 EcolW at 683 25 EcoRl at 340 Haelll at 725 HgiAl at 916 Maell at 405 Malll at 875 30 PinAl at 518 Scil at 359 Xbal at 433 Xhol at 357 Xholl at 687 52 There are also two EcoRW sites within the T-DNA-like region (698 and 853) that could be used as cloning sites. 1 GTCGACAGTA aaagttgcac ctggaataag gttttcattc ttcacaggag gcatctcact 61 ctttctagca ggtcttgaac GCTTAGATTG AACAGATGTA GGACTCACAT ctgatatgga 121 ggattcttga cttgtttcag cagcatcaga tgaagcttct gagacttcac ctgatccatc 181 atctgtagca gttgcttcta CTTCTTCCAC TGCTACATCA GTCTCAGTTG CTGATACTAT 241 AAGACCTCTT AATTTAGGTC GTAAAATGCA ACCAACTCTA AAATGGGGAA ACAATTTAAT 301 AGATGTTGAC agaggcagga tatattttgg ggtaaacggg AATTCTTCAG CAGTTGCTCG 361 AGGGAGATTG GCGGTGCTTT CAGCTCACCT TGCAGCTTCA CTCAACGTCT CCGATTTAAC 421 AACCTTCAAA CTTCTAGAAA CTTCCGGTGT ATCCGCCGTT TCCGGCGTTG CACCTCCGCC 481 GAATCTAAAA GGTGCGTTGA CGATCATCGA TGAGCGGACC GGTAAGAAGT ATCCGGTTCA 541 GGTTTCTGAG GATGGCACTA TCAAAGCCAC CGACTTAAAG AAGATAACAA CAGGACAGAA 601 TGATAAAGGT cttaagcttt ATGATCCAGG CTATCTCAAC ACAGCACCTG ttaggtcatc 661 AATATGCTAT ATAGATGGTG ATGCCGGGAT CCTTAGATAT CGAGGCTACC CTATTGAAGA 721 GCTGGCCGAG GGAAGTTCCT TCTTGGAAGT GGCATATCTT TTGTTGTATG GTAATTTACC 781 ATCTGAGAAC CAGTTAGCAG ACTGGGAGTT CACAGTTTCA CAGCATTCAG CGGTTCCACA 841 AGGACTCTTG GATATCATAC AGTCAATGCC CCATGATGCT CATCCAATGG GGGTTCTTGT 901 CAGTGCAATG AGTGCTCTTT CCGTTTTTCA TCCTGATGCA AATCCAGCTC tgagaggaca 961 GGATATATAC AAGTGTAAAC AATTTAAAAG CATATGGTGG CACTGCTCAA TATATGAGGT 1021 GGGCGCGAGA AGCAGGTACC AATGTGTCCT CATCAAGAGA TGCATTCTTT ACCAATCCAA 1081 CGGTCAAAGC ATACTACAAG TCTTTTGTCA AGGCTATTGT GACAAGAAAA AACTCTATAA 1141 
GTGGAGTTAA ATATTCAGAA GAGCCCGCCA TATTTGCGTG GGAACTCATA aatgagcctc
1201 GTTGTGAATC CAGTTCATCA GCTGCTGCTC TCCAGGCGTG GATAGCAGAG ATGGCTGGAT
1261 TTGTCGAC (SEQ ID NO:1)

T-DNA-like region of a petunia intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the petunia genome sequence.
Nucleotides 6-399 are the complete sequence of the 394 nucleotide fragment from sgn-e521144.
Nucleotides 400-855 are the reverse complement of nucleotides 85-540 from sgn-e534315. Nucleotides 856-1071 are the reverse complement of nucleotides 121-336 from sgn-u207691. 53 The T-DNA border-like sequences are shown in bold. The left border is nucleotides 347-370 and right border is nucleotides 844-867. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: Acclll at 392 Age I at 788 BbvI at 453 BspMll at 392 Bstlll at 453 C/r 101 at 788 Clal site at 398 Fnumi at 442 Mae I at 665 Nsil at 752 PinAl at 788 There are also two Nspl sites (RCATG/Y) within the T-DNA-like region (616 and 755) that could be used as cloning sites. The most useful restriction site for cloning into the T-DNA-like region is the Clal site which is shown in underlined bold. 1 GTCGActtta tgatcctggc tatctcaaca cagcgcctgt tcggtcatca atatgttata 61 TAGATGGTGA TGCCGGGATC CTTAGGTATC GAGGTTACCC TATTGAAGAG CTGGCTGAGG 121 GAAGCTCCTT CTTGGAAGTG GCTTATCTTT tattgtacgg TAATTTGCCA tctgagaacc 181 AGTTGGCAGA CTGTGAGTTC ACAGTTTCAC aacattcagc agttccacaa GGACTCCTTG 241 GATATCATAC agtcaatgcc ccatgatgct catccgatgg gtgttcttgt cagtgcaatg 301 agcgctcttt ctgtctttca ccctgatgcc aatccagctc TTAGGGGACA GGATATATAC 361 AAGTCTAAAC aaatgagaga taaacaaata gtccggatcg atacgtgaag atcaaaatga 421 aaaggggagg cgatagatta gcagcatgag cctatatttc tctcacaaaa attcccagat 481 ATTCGACACA ATAGCTCTAA CAACACTGAG cttttgatta CTTGGGTCAC TTCTTCATTT 541 CTCTATCGTC TGTTCAGTCT TTTCCTCTGA TTTAGTTTCT GCATCATAAG TTTTGCCAAA 601 GCCAAGTTCT GACATGTCTT GCTTTGCCAT CAAATTCTTC TCCATACGAC ACTCCAGGTA 661 CTTCCTAGAG AGGTGTCTAC ACTGCTCAGA TTTATGCCCA GCGGATTTTA GACAACTAAG 721 GTATTCCTTC TTCTCCACGT CACATAAATG CATGTGATCC AAAGGGAAAA CTCCTTTTTC 781 TGGTGGAfl.CC GGTCTCAATC CTCTATTTCC ACCAAATGCT CCCCCTGCAC TCATTACGGA 841 gatggcagga tatatgttct tgtcatggaa TAGGCCACTG CTTTCAGCTG TCTGGAGACC 901 GTGAAGTGTA CGTTGAGCCA CAGCCCATTG TGCTTCCCTC TCACCTTTTC CGTAATCCTT 961 CTTGGTTGTG AAGGCAGTCT TATTCTGCAT CATTGATTGC caggcgtcac CACTCAACGT 
1021 GTAACGGCTG ATGAATTTAA GAATATCAAG AGGGAAATAG GTGATAATTG TCGAC (SEQ ID NO:2)

T-DNA-like region of a tomato (Lycopersicon esculentum) intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the tomato genome sequence.
Nucleotides 5 - 537 are nucleotides 2-534 of SGN-E260320.
Nucleotides 538 - 976 are the reverse complement of nucleotides 79 - 517 of SGN-E291502.
Nucleotides 977 - 1188 are the reverse complement of nucleotides 1 - 212 of CK575027.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 375 - 398 and the right border nucleotides 960 - 983. The restriction sites and positions that could be used for cloning within the T-DNA are shown below (as calculated by DNAMAN): Alw26l at 881 .4/wNI at 876p 15 Bbvl at 798, 843 and 442 Bgtll at 740 BsplAQHl at 528 BspMl at 705 Bstlll at 798, 843 and 442 20 CM at 637 Eco7>ll at 881 EcoSll at 787 EcoM at 431 FnuAHl at 456, 787 and 832 25 Mael at 573, 711 and 744 Mfel at 444 MspAl at 455 Nde I at 511 A'spBII at 455 30 Nspl at 535 and 583 Pstl at 833 Pvull at 455 Rsal at 530 Styl at 427 55 Xcml at 925 1 gtcgacaaca ggacagaatg ataagggtct taagctttat gatccaggct atctcaacac 61 GGCACCTGTT AGGTCATCAA TATGTTATAT tgatggtgat gccgggatcc ttagatatcg 121 AGGCTACCCT ATTGAAGAGC tggccgaggg aagttccttc ttggaagtgg catatctttt 181 gttgtatggt aatttaccat ctgagaatca gttagcagac tgggagttca cagtttcaca 241 GCATTCAGCA GTTCCACAAG GACTCTTGGA TATCATACAG TCAATGCCAC ATGATGCTCA 301 TCCAATGGGG GTTCTTGTCA GTGCAATGAG TGCTCTTTCC GTTTTTCATC CTGATGCAAA 361 TCCAGCTCTG agagggcagg ATATATACAA GTCTAAACAA GTGAGAGATA AACAAATAGT 421 TCGGATCCTT GGCAAGGCAC CTACAATTGC TACAGCTGCT TACTTAAGAA TGGCTGGCAG 481 GCCACCTGTC CTTCCATCCA ACAATCTCTC ATATGCGGAG AACTTCTTGT ACATGCTTGC 541 TTCCTAC.ATC CTTTACATAA CTATCACTCA ACCTAGAAAC ATGCACCAAT CCATCCGTAA 601 AAGCTCCAAA ATCAATAAAA GCACCGAATG GCTGTATCGA TCTGACCTTT CCAGGAAAAG 661 TTGCACCTGG AATAAGGTCT TCATTCTTCA CAGGAGGCAT CTCACTCTTT CTAGCAGGTC 721 TTGAACGCTT AGACTGAACA GATCTAGGAC tcacatctga TACAGAGGAT tcttcactta 781 TTTCAGCAGC ATCAGATGAA GCTTCAGCAA CTCCACCAGA TCCATCATCT GCAGCAGTTG 841 CTTCTACTTC TTCCACTGCT ACATCGGTTT CAGTTGCTGA TACTACGAGA CCTCTTAATT 901 TATGTCGTAA AATGCAACCA ACTCTAAAAT GGGGAAACAA TTTAATAGAT GTTGACAGGG 961 GCAGGATATA TTTTGGTGTA AACCTGTTTC TTGCACTAAT CGTGCTTTGT CTTCCTCAGT 1021 TGGATAAGGC CACTTAGAAT GTGATTGCCA CCAAGCTTTC AACACAGATG TAGTATCACC 1081 AGGCAGTTTT CCTGCTCTTC TTTTGCGTAA AATTTCCTCT CTAATGTCAA CAATTTTTTC 1141 CTTATAACCC TGTTTGAGTT 
CATGCTTGAG TTCTTGCCTA ACACGCTCGT CGAC (SEQ ID NO:3)

T-DNA-like region of a Nicotiana benthamiana intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the N. benthamiana genome sequence.
Nucleotides 5 - 853 are nucleotides 111 - 959 of CK292156.
Nucleotides 854 - 1469 are the reverse complement of nucleotides 81 - 696 of CK286377.
Nucleotides 1470 - 1787 are nucleotides 285 - 602 of CN748849.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 566 -35 589 and the right border is nucleotides 1455 - 1478. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: Acclll at 611 56 AM at 654 Ahalll at 1160 BamHl at 614 Bsil at 1362 Asp14071 at 719 ffspMII at 611 Dral at 1160 £coNI at 622 Maell at 840 Nspl at 726 Seal at 921 Sspl at 1420 Vspl at 1085 Xholl at 614 There are also two Aval sites (773 and 1072), two Banl sites (627 and 906), two Haelll (672 and 1393), three Mae I sites (727, 1182 and 1355), and three MalV sites (616, 629 and 908) within the T-DNA-like region that could be used as cloning sites. 1 gtcgacctcg ccgcttcagt caatctctcc gattccaaac ttttagaaac ttccgttgta 61 tccgccactt ccgtcgtcgc gccgccgccg aatctaaaag gcgctttgac gatcatcgat 121 gagcgaaccg gtaagaggta tccagttcaa gtttcggagg aaggcactat caaagccacc 181 gacttgaaaa agataacagc aggacataat gataagggtc tcaagcttta tgatccggga 241 tatctcaaca cagcacctgt tcggtcatca atatgttata tagatggtga tgctggtatc 301 cttagatatc gaggttaccc aattgaagag ctggctgagg gaagttcctt cttggaagtg 361 gcttatcttt tgatgtatgg taatttacca tctgagaacc agttggcaga ttgggagttc 421 acagtttcac aacattcagc agttccacaa ggaatcatgg atattataca ttcgatgccc 481 catgatgctc atccaatggg tgttcttgtc agcgcaatga gtgctctttc tgtctttcat 541 cctgatgcca atccagctct gagaggacag gatatataca agtctaaaca agtgagagat 601 aaacaaatag tccggatcct tggcaaggca cctacaattg ctgcggctgc ttacttaaga 661 atggctggaa ggccacctgt ccttccatcc aacaatctct cttatgcaga gaacttcttg 721 tacatgctag attcattagg taataggtct tacaaaccca atcctcgact cgctcgggtg 781 ctcgacattc ttttcatatt acacgcggaa catgaaatga attgctctac tgctgcagca 841 cgtcatcttg cttaaatgca actgctctat tttgtgtcag aatttggtga aaaatgcact 901 gttttggcac caaaagttag tacttttgga caactttttg gtgaaccaaa atctgtccaa 961 aatgacttgt ttacctactt aaagaggtca ttttttcata ccaggggaca tccccgacat 1021 cccaggatac atagcttttg aaaaattttt tacactcaag aatacacaaa actcgggagc 57 1081 aaaattaata gctgaatgtt taatagtaag 
ctgaaacttg agagttttgg agtgagtttt 1141 ttgagagaaa ataacacttt aaaaaacaaa agtccataca gctagtttat agtttttctt 1201 tcactaaaga tgctgagttt tacggtttgt ttttggttgt tttgggttca atttattgct 1261 gtttttttta ctatttttac tgtcactgct gctgcatttt tgctactgct gtatttttgc 1321 tcttcaggta acctgagaag cttatttttt gatactagcc actcgtgttg tatttgtcct 1381 ttttaattta aggccaaata gtttcagttg tagaagtaat attttctcct ttcattagta 1441 aagttcaatt aaaaggcagg ATATATTGTG atgtaaacac cgtcctgaag tgtaccagct 1501 aaggacaagg gat caaagaa tttgccgcca gggtaaccct gttctccagt gaagttggca 1561 aagttctcag cagtcttgga ccatggagtt gcccactcta ctgactgaga ctcagggttg 1621 aagaagtcca cccacctttt gctctcaacc caccccatga gtagaagttg agtgccaaga 1681 agtgagccaa aagagaaagt gcaatagcac cagggtcagc accagcctcg aaccatggga 1741 tgccactcca ggcttgacca acaaagagcc caatactgca gccattgTCG AC (SEQ ID NO:4) T-DNA-like region of an apple intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the Sail sites that are underlined. The nucleotides in italics are not part of the apple genome sequence.
Nucleotides 5 - 246 are nucleotides 1 - 242 of CN862631.
Nucleotides 247 - 644 are the reverse complement of nucleotides 28-425 of CN942531. Nucleotides 645 - 943 are the reverse complement of nucleotides 1 - 299 of C0541348.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 229 - 252 and the right border is nucleotides 627 - 650. The restriction sites and positions that could be used for cloning within the T-DNA are shown below: AcclI at 392 Alul at 338 Alw26l at 593 Asul at 353 and 387 Aval at 270 Avail at 353 Banl at 415 BsaOl at 374 and 466 Bsp\2%6l at 418 As/D102I at 262 Clal at 466 58 EcoHl at 269 and 270 £coNI at 348 EcoRV at 438 FnuDlI at 392 Fokl at 619 and 235 Hhal at 394 HinVM at 392 Hpal at 565 MaeII at 449 Mae III at 509 MspA.ll at 400 MalV at 289 and 417 iVspBII at 400 Pvul at 466 RleAl at 269 ScrFI at 271 and 272 Sdul at 418 Smal at 272 Stul at 653 Thai at 392 Xholl at 586 and 818 Xmal at 270 Xorll at 466 1 GTCGACt aat gaggctttga tctaccacaa ggcttttcca atgccggcat tgtcatacaa 61 gtttcagaac acagactcac tttccggcca t gacacagat gatgctgcac agtttatctc 121 ttccgtttgt tggcgaggcc aaacctccac cttaattgct gcaaattcga cggggaatat 181 aaaaattttg gagatggttt gatgatctcc aaggtgattc ttgaatctgg CAGGATATAT 241 GGGGTGGTCA tcccacatcg agcggatcac ccgggagaag gtgaacggtt ccaccgtcaa 301 tgtcggcatc aaccccctcc aaggtcgtca tcaccaagct ccgcctcgac aaggaccgca 361 aagttctgct cgaccgcaaa ggccaagggc cgcgccgccg ctgacaagga caagggcacc 421 aagttcactg ccgaggatat catgcagaac gtcgattgat ttcgatcgat ttcatttcgg 481 tttgtgtttt tgttagttaa atgaaagtag taactgtcaa gt taagcact ttagtcggaa 541 tcacttttaa tttgaagtat gcgttaacgg atttggtgtt taatcggatc ttcgatttga 601 gacatggatg gatttgtgct ttttttgaca GGATATATTA TATTGTAAAC aggcctccct 661 c ag ac gat ac aaatgaaccc tcat gtaaat ttgtttcatt atttattctc attaacaatg 59 721 actaacacac aaatataaaa gaaataaatc acatttgggg tcttgtctgg acaacataga 781 gtttaccgtc cctgatcacg ccttcaatgt cttgtggaga tccatacagt tcctcgattg 841 cacttcctgc acgagcaatg ctagagagga ttgtcttgcg aaagtttccg tcaaccataa 901 gcgggtcaga tgagtagtcg accaagactt tctcttcctc gtc (SEQ ID NO:5) T-DNA-like region of a Medicago truncatula intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular 
Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the M. truncatula genome sequence.
Nucleotides 6 - 357 are nucleotides 2 - 353 of CA921810.
Nucleotides 358 - 694 are nucleotides 112 - 448 of AL375389.
Nucleotides 695 - 1055 are the reverse complement of nucleotides 2-362 of CF069972.
The T-DNA border-like sequence is shown in bold. The left border is nucleotides 339 - 362 and the right border is nucleotides 677 - 700. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: Accl at 403 AfllU at 465 20 Age I at 409 Asull at 522 5sa0I at 410 Bsml at 539 CfrlOl at 409 25 Csp45I at 522 Hpall at 410 MaeII at 391 Mboll at 511 Mspl at 410 30 NlalY at 585 Pin Al at 409 Rsal at 394 60 There are also two Ahalll sites (462 and 567), two Dral sites (462 and 567) and three Taql sites (522, 615 and 636) that could be used as cloning sites. 1 GTCGACtcat taaacaaata aaagaactat tcaaatgttt agcacatttg aaactgagtc 61 ag cataaaaa catttaccat gctagtttaa ctagttcaga cgcaaccaaa acaccatgaa 121 acttaattgc ataggaaagc accaacctgt ttcagcagca agacatgctg actacaggtc 181 aactactgtt tctgcataat aactatttat caactactaa tattccatgg tagaatag cc 241 atcaaaacca tcattgcgca gcagagggtc aaaaaggaac aatgattttc aacagttact 301 ccaaggataa ctgatgctgc agctggtaac aagttattGG CAGGATATAT TACCATGTAA 361 aattctaggc tatgtttaca aaaaaattga acgtacttaa tgtatacgac cggtaaagga 421 gaaaaaggaa gtataagtca cttaatttaa ttttttaact ttaaacatgt tttttaggag 481 gcacaattat aagttaaaaa tgtaaggaaa acctattttc ttcgaatata tagatttggc 541 attccatttt agttaaggag ttaatttaaa aatat gaaag taggagcctc tgttcgtaaa 601 atttgtgaaa aatgtcgatt gatacgcagg cgaggtcgaa ttatagtaat ttgttccaac 661 ccaataacaa atacaaggca GGATATATTT AACTGTAAAC gaccat gagc cctgtgctct 721 gcaggtgcat gccaatgaag ttgtttcaat gaataattca ttccatttat attaatatcg 781 cccacttttc cttcaaaatg caccccaata ctgtattggt ggttaacaag tgtggcattt 841 gtaggaaggt agtttctgtc taaggatttc aatacattgt tcacaacaat atttgtcatt 901 gcaagatcca ctggactctg tgctttccca tttgagcatg ctgcgaatga ttgttttagt 961 gttccccatt tcagaggacc atttggtcca atataactaa aattaaccga atcatgattt 1021 gctgaagtgc atagagccaa agcagctatg aaagg TCGAC (SEQ ID NO:6) T-DNA-like region of an onion intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 
1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the onion genome sequence.
Nucleotides 5 - 537 are nucleotides 4 - 536 of CF449263.
Nucleotides 538 - 1186 are nucleotides 94 - 742 of CF441521.
Nucleotides 1187 - 1503 are nucleotides 162-478 of CF452730.
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 520 -543 and the right border is nucleotides 1169 - 1192. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are: Accl at 2 and 739 Asul at 1046 61 Avail at 1046 Bell at 1124 Clal at 556 Dralll at 869 EcoSll at 910 and 975 Hindlll at 628 Mfel at 852 Ndel at 1006 PflMl at 649 Rsal at 999 Sapl at 580 Xbal at 578 1 gtcgacttcc ctttcctcta ctccacttgt ttctcgcttt ctctacttcc tttttctctc 61 ttttctttat atttattgct cagctgggat taattactgt catttattcc tcatatctat 121 tttattgaat taaaacggtt atttagctcg aggccttctc tcttattctt tgcttccaag 181 gagagagaat atggcgagtg gtagcaatca tcagcatggt ggaggaggaa gaagaagagg 241 cggaatgtta gtcgctgcga ccttgcttat tcttcctgcc attttcccca atttgtttgt 301 tcctcttccc tttgcttttg gtagttctgg cagcggtgca tctccttctc tcttctccga 361 atggaatgct cctaaaccta ggcatctctc tcttctgaaa gcagccattg agcgtgagat 421 ttctgacgaa caaaaatcag agctgtggtc tcccttgcct ccacagggat ggaaaccgtg 481 ccttgagact caatatagta gcgggctacc cagtagatcg ACAGGATATA T TGAAGTGTA 541 aaacaagatg ctgaatcgat tagcaatggt tcgctcttct agacttgctt ctcggataat 601 caatcctcag tttttgattc cttctcgaag cttccttgat ctccataaga tggtaaacaa 661 ggaggcgata aaaaaagaaa gggctagact tgctgatgag atgagcagag gatattttgc 721 ggatatggca gagattcgta tacatggtgg caagattgct atggcaaatg aaattcttat 781 tccatcaggg gaagcaatca aatttcctga tttgacagta aaattgtctg atgatagcag 841 tttgcattta ccaattgtat ctacacaaag tgctacaaat aacaatgcta aatccactcc 901 tgctgcctca ttgttgtgcc tttccttcag agcaagttca cagacaatgg ttgaatcatg 961 gactgttcct tttttggaca cttttaactc ttcagaagta caagcatatg aggtatcatt 1021 tttggattct tggtttttct cattcggacc aatcaagaga atgtttctta acatgacgaa 1081 gaaacccact gctactcagc ggaagattgg ttatttcatt tggtgatcac tatgatttta 1141 ggaagcagct tcaaattgta aatcttttga CAGGATATAT AT T ACTGTAA aaagtgaaga 1201 gagaaatgtg atatatgctg atgtttccat ggagaggggt gcatttcttg ttcaacaagc 1261 tatgagggct ttccatggaa agaatataga aagcgcaaaa tcaaggctta gtctttgcga 1321 
ggaggatatt cgtgggcagt tagagatgac agataacaaa ccagagttat attcacagct
1381 tggtgctgtc cttggaatgc taggagactg ctgtcgagga atgggtgata ctaatggtgc
1441 gattccatat tatgaagaga gtgtggaatt cctcttaaaa atgcctgcaa aagatcccga
1501 ggtCGAC (SEQ ID NO:7)

T-DNA-like region of a rice intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. This requires a partial digest due to a SalI site within the T-DNA-like region. The nucleotides in italics are not part of the rice genome sequence.
Nucleotides 6 - 634 are nucleotides 1 - 629 of CR287857.
Nucleotides 635 - 1258 are nucleotides 156 - 779 of AK100350.
Nucleotides 1259 - 1740 are nucleotides 222 - 703 of CB619781.
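The need for a partial digest of the rice region arises because the SalI recognition sequence (GTCGAC) occurs inside the region as well as at the flanking cloning sites. A minimal sketch of how such internal sites could be located is shown below; the function names are hypothetical, and the first and last occurrences are assumed to be the flanking cloning sites.

```python
SALI_SITE = "GTCGAC"  # SalI recognition sequence


def find_sites(seq, site=SALI_SITE):
    """Return 1-based start positions of every occurrence of site in seq."""
    seq = seq.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)
    return positions


def internal_sites(region, site=SALI_SITE):
    """Sites other than the first and last occurrence.

    Assumes the outermost two occurrences are the flanking cloning sites;
    any site returned here would call for a partial digest.
    """
    return find_sites(region, site)[1:-1]


# Nucleotides 901-960 of the rice T-DNA-like region (SEQ ID NO:8).
row_901 = "cgttatcctcagcgaattactattctcagaggaaaccacgaaagtcgacagatcactcaa"
print([900 + p for p in find_sites(row_901)])
```

The site found in this row starts at nucleotide 944 of the region, in agreement with the SalI position listed for the rice T-DNA-like region.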
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 616 — 639 and the right border is nucleotides 1247 - 1270. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are: Accl at 945 Acll at 716 20 Afia at 984 Ahalll at 893 Alw26l at 1102 Apal at 1166 Banll at 662 and 1166 25 BgRl at 1026 BsaOl at 1137 and 1200 BspU I at 771 Clal at 1200 Dral at 893 30 Drall at 1162 HgiAl at 675 Hindll at 946 Maell at 716, 842 and 1125 MseI at 892 and 985 63 Pssl at 1165 Pvul at 1137 and 1200 Sail at 944 SWaBI at 1126 Sspl at 1073 Xorll at 1137 and 1200 1 GTCGACGGGA attcgccatt atggccgggg gaagcttccc tgtcactact gcaggatcat 61 caaaagcaag ctagggcgca tcttcttgac actgaacctt ttgagcatgc atttggacca 121 aagggcaaga ggaaacgccc aaaactaatg gctcttgatt atgaatctct attgaagaaa 181 gctgatgatt ctcaaggtgc atttgaggat aagcatgcta cagcgaagtt gctgaaagag 241 gaagaggaag atggcttacg atacctagtc cggcacacaa tgtttgagaa gggacagagc 301 aaaagaattt ggggtgaact ctataaagtt attgactctt cagatgttgt cgtgcaggtg 361 ttggatgcca gggatccaat gggtactaga tgctaccatc tggagaaaca tctgaaggag 421 aatgccaagc acaaacactt ggtattctta ctaaataagt gtgatctagt acctgcttgg 481 g c c acaaaag gatggttgcg cactttatca aaggactatc ccaacctagc ataccat gca 541 agcatcaaca gttcatttgg caaaggatca cttctttcag tgttacggga ggatggacgc 601 cctgagagat gtgacgacag GATATATAGT GAGGTCATGC agtgcaagcc cctccccgag 661 cccgaggtca gagcactttg cgagaaggca aaagagatat tgatggagga gagcaacgtt 721 caacctgtaa agagtcctgt tacaatatgt ggtgatattc atgggcagtt tcatgacctt 781 gcagaactgt tccgaatcgg tggaaagtgc ccagatacaa actacttgtt tatgggagat 841 tacgtggatc gtggttatta ttctgttgaa actgtcacgc ttttggtggc tttaaaggtt 901 cgttatcctc agcgaattac tattctcaga ggaaaccacg aaagtcgaca gatcactcaa 961 gtttatggat tctatgacga gtgcttaagg aagtacggga atgcaaatgt gtggaaaact 1021 tttacagatc tcttcgatta cttccccttg acagcattgg ttgagtcaga aatattttgc 1081 ctgcatggtg gattatcgcc atccattgag acacttgata ac at ac g taa cttcgatcgt 1141 gtccaagaag ttccccatga agggcccatg 
tgtgatcttc tgtggtctga tccagacgat 1201 cgatgtggtt ggggtatttc tcctcgaggt gctggataca ccttcgggca GGATATATTG 1261 GCGGGTAAAC caattcctgg ttttcccgac aaaccctcga gaataaattc attctttgca 1321 gaaggatgtc aaactggtga caatggtgct ggttcctcgc aagagttgaa tggtcattgc 1381 aatggagaac ccagttgccc agagcaagga gttctgacca atggtggcaa cacgccctct 1441 ccaagcacac aatgctatga aaataagttt gcaacatcca ccaacggcaa ctattctatt 1501 gggaatggtg atacattatc tagcagcaac tcattacatg cgggcaaaca gaatgctggc 1561 tttacctata atggtttcaa tccaaaacct tacaaagaac catcaggaag caacacatat 1621 ctgaataata catgcaatgg taaaccatcg gaagataat c acaataaatg tgccccaaac 1681 ctgccggcaa aagattgcca agggggcatg ccattcttac atcgtggctt ccttctaagg 1741 TCGAC (SEQ ID NO:8) 64 T-DNA-like region of Pinus taeda intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the Sail sites that are underlined. The nucleotides in italics are not 5 part of the P. taeda genome sequence.
Nucleotides 1 - 333 are nucleotides 114 - 446 of BM133642.
Nucleotides 334 - 914 are nucleotides 81 - 661 of CF392877.
Nucleotides 915 - 1172 are nucleotides 138 - 395 of CX715693.
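The coordinate mapping above can be checked mechanically: each segment of the T-DNA-like region should be exactly as long as the accession interval it is taken from, and the segments should abut without gaps. The following is a minimal sketch only; the helper name and tuple layout are illustrative and are not part of the specification.

```python
# Segment table for the P. taeda T-DNA-like region, as quoted in the text.
# Tuple layout (illustrative): vector start, vector end, accession,
# accession start, accession end; all coordinates 1-based and inclusive.
SEGMENTS = [
    (1, 333, "BM133642", 114, 446),
    (334, 914, "CF392877", 81, 661),
    (915, 1172, "CX715693", 138, 395),
]


def check_segments(segments):
    """Return a list of inconsistencies; an empty list means none were found."""
    problems = []
    next_start = 1
    for v_start, v_end, accession, a_start, a_end in segments:
        if v_start != next_start:
            problems.append(f"{accession}: starts at {v_start}, expected {next_start}")
        vector_len = v_end - v_start + 1
        source_len = a_end - a_start + 1
        if vector_len != source_len:
            problems.append(
                f"{accession}: {vector_len} nt in vector vs {source_len} nt in source"
            )
        next_start = v_end + 1
    return problems


if __name__ == "__main__":
    for v_start, v_end, accession, _, _ in SEGMENTS:
        print(accession, v_end - v_start + 1, "nt")
    print("problems:", check_segments(SEGMENTS))
```

An empty problem list only shows that the quoted intervals are internally consistent; it does not confirm the sequence content itself.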
The T-DNA border-like sequences are shown in bold. The left border is nucleotides 314 — 337 and the right border is nucleotides 898 - 921. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are: Acclll at 442 Alw26l at 513 15 Asulat8l3 Avail at 813 BspMll at 442 Drall at 813 Dralll at 879 20 Ecolll at 876 £coNI at 808 EcoRV at 555 Fokl at 871 Haelll at 339 25 Hpall at 443 Mae I at 534 Mspl at 443 PmaCl at 876 PpuM] at 813 30 at 816 There are also to Accl sites (2 and 888) and two BspHI sites (587 and 806) that could be used as cloning sites. 65 1 gtcgacattc gcagcgatga tggataaact 61 ccacgtctgg attgctctcc 5 aggaattaga 121 gattgcaaaa atgaaaaaga caagcgtata 181 gctttttgta gtgataattg tgatagctat 10 2 41 tcgattattt ggtgaaaaaa tgtgttgaag 301 ccacagcagt GTTGGCAGGA ctcatggaag 3 61 aaagtaatgt gcagcctgtc 15 catggtcaat 421 ttcacgatct tgccgagctg aattatttgt 481 tcatgggtga ctatgttgac cttctagtgt 20 541 cattgaaggt gcgatatccc gagagtcgtc 601 agat tact ca agtatatgga aatgcaaacg 661 tctggaagat atttacagac 25 gtagaatcag 721 aaattttctg cttacatggt aacataag ga 781 attttgaccg tgttcaagaa ttgtggtcag 30 8 41 atccagatga tagatgtgga acatttggac 901 AGGATATATC TGAAGGTAAA cagagataat 9 61 tatgttggca ttgaatatga 35 gatgctgagg 1021 ttttaaaatt tctatcagct caagaatgga cgcttggcgc tttcattatc ttacagtcga gtcaagacga aaagagggag aggcaaatgc aattattggc ggaacttaaa gagaaagagt ttaacctcca atggtaatca ttattttgaa caaacttgaa aatgattgtg gacagttatt TATATCCAAT TGTAAAAGGC caaggagatt aagtgccctg ttacaatctg tggtgacata ttccggattg gtggaaagtg tccagatacg cgaggatact attcagtcga gactgtcact caacgaatta ccattcttcg aggaaatcat ttctatgatg aatgtttacg gaagtatgga ctttttgact acttcccatt gacagcactg ggtttgtcac cttccattga cacattagat gtgcctcatg aaggtcccat gtgcgatctt tggggaatct ctccacgtgg tgcagggtat ccgaagggat ccttcaaagt tgaccaatat taccaaccct gaaatattgt gctaaagaga taacgagcac atgatacata atatatgcca 66 1081 atggaatttc catttggctt taaaaaatga tttgtaatgt cacgtacatt agcatctaca 1141 aagaattgga ttgccttcat tcacttttcg 
TCGAC (SEQ ID NO:9)
Example 3 Identification of plant-derived sequences that function as a selectable marker in bacteria: complementation of deficient bacteria
In preferred intragenic vectors of the invention, the complete vector is made up entirely of plant-derived sequences. One desirable component for effective vector manipulation is a bacterial selectable marker. Preferred marker sequences include plant genes that complement bacterial mutants deficient in genes essential for their growth, such as amino acid biosynthesis genes. One such gene is acetohydroxyacid synthase.
Acetohydroxyacid synthase (AHAS) is an enzyme which catalyses the formation of acetolactate from pyruvate, the first step in valine, leucine and isoleucine biosynthesis. Furthermore, mutant forms of AHAS can confer resistance in plants to sulfonylurea herbicides and related compounds (Mazur and Falco, Annual Review of Plant Physiology and Plant Molecular Biology, 40: 441-470, 1989). For example, the Arabidopsis thaliana mutant AHAS gene confers resistance to the herbicide chlorsulfuron upon transformation into tobacco (Haughn et al., Molecular and General Genetics, 211: 266-271, 1988).
Wild-type AHAS GenBank details are:
LOCUS NM_114714 2270 bp mRNA linear PLN 19-FEB-2004
DEFINITION Arabidopsis thaliana acetolactate synthase, chloroplast / acetohydroxy-acid synthase (ALS) (At3g48560) mRNA, complete cds.
Located on chromosome 3.
A discontiguous megablast of the gene sequence above against all publicly available Viridiplantae genome sequences using the standard NCBI parameters shows that the AHAS gene is present in many plant species. Genes encoding acetohydroxyacid synthase are also found in both the E. coli genome and the Agrobacterium tumefaciens C58 genome (GenBank accessions NC_000913 and NC_003062 respectively). Furthermore, functional expression of plant AHAS genes in bacterial systems to complement deficiencies in AHAS has been well established. For example, the AHAS genes from Arabidopsis thaliana (Smith et al., Proceedings of the National Academy of Sciences, USA, 86: 4179-4183, 1989), Nicotiana tabacum (Kim and Chang, Journal of Biochemistry and Molecular Biology, 28: 265-270, 1995), and Brassica napus (Wiersma et al., Molecular and General Genetics, 224: 155-159, 1990) have been used to complement AHAS-deficient bacteria such as Escherichia coli and Salmonella typhimurium.
Furthermore, it will be understood by those with ordinary skill in the art that plant-derived sequences such as AHAS known to complement bacterial deficiencies can be placed under the control of plant promoters known to be transcriptionally active in bacteria. For example, Jacob et al. (Transgenic Research, 11: 291-303, 2002) describe several such plant promoters, one of which is the potato ST-LS1 promoter. In order to provide an example in which all the components of the present invention are derived from a single plant species, we have isolated the potato (Solanum tuberosum) AHAS gene. This gene can be used in the manner described above to provide a bacterial selectable marker gene to maintain vectors in bacteria.
Using potato {Solanum tuberosum) cultivar 'Iwa' genomic DNA as a template, various fragments of the AHAS gene were isolated based on primers designed from related species. These fragments were cloned, their DNA sequence determined, and a composite consensus 20 sequence generated for the potato AHAS gene. In order to generate the complete sequence for a single allele, the following primers flanking to coding region were designed: Primer Q: 5' T AGCC ATTTTGCCTCCTTTC3' Primer R: 5' CAACGGC AAACT AGAC AGAT AG A A3' A polymerase chain reaction was then performed with high fidelity Pwo polymerase with 25 primers Q and R to amplify a fragment using genomic DNA from potato cultivar 'Iwa' as a template. This product was A-tailed, and ligated into pGemT (Promega) following the manufacturers' instructions. The cloned AHAS allele was then sequenced using primers based on the consensus sequence anchored about every 400 bp along the cloned fragment. The following sequence for the coding region of a potato cultivar 'Iwa' AHAS allele (from 30 the start codon to the stop codon) was obtained: 1 atggcggctg ctgcctcacc atctccatgt ttctccaaaa ccctacctcc atcttcctcc 61 aaatcttcca ccattcttcc tagatctacc ttccctttcc acaatcaccc tcaaaaagcc 35 121 tcaccccttc atctcaccca cacccatcat catcgtcgtg gtttcgccgt ttccaatgtc 68 181 gtcatatcca ctaccaccca taacgacgtt tctgaacctg aaacattcgt ttcccgtttc 241 gcccctgacg aacccagaaa gggttgtgat gttcttgtgg aggcacttga aagggagggg 301 gttacggatg tatttgcgta cccaggaggt gcttctatgg agattcatca ggctttgaca 361 cgttcgaata ttattcgtaa tgtgctgcca cgtcatgagc aaggtggtgt gtttgctgca 421 gagggttacg cacgggcgac tgggttccct ggtgtttgca ttgctacctc tggtccggga 481 gctacgaatc ttgttagtgg tcttgcggat gctttgttgg atagtattcc gattgttgct 541 attacgggtc aagtgccgag gaggatgatt ggtactgatg cgtttcagga aacgcctatt 601 gttgaggtaa cgagatctat tacgaagcat aattatcttg ttatggatgt agaggatatt 661 cctagggttg ttcgtgaagc gttttttcta gcgaaatcgg gacggcctgg gccggttttg 721 attgatgtac ctaaggatat tcagcaacaa ttggtgatac ctaattggga 
tcagccaatg 781 aggttgcctg gttacatgtc taggttacct aaattgccta atgagatgct tttggaacaa 841 attattaggc tgatttcgga gtcgaagaag cctgttttgt atgtgggtgg tgggtgtttg 901 caatcaagtg aggagctgag acgatttgtg gagcttacgg gtattcctgt ggcgagtact 961 ttgatgggtc ttggagcttt tccaactggg gatgagcttt cccttcaaat gttgggtatg 1021 catgggactg tgtatgctaa ttatgctgtg gatggtagtg atttgttgct tgcatttggg 1081 gtgaggtttg atgatcgagt tactggtaaa ttggaagctt ttgctagccg agcgaaaatt 1141 gtccacattg atattgattc ggctgagatt g gaaagaaca agcaacctca tgtttccatt 1201 tgtgcagata tcaagttggc attacagggt ttgaattcca tattggaggg taaagaaggt 1261 aagctgaagt tggacttttc tgcttggaga caggagttaa cggaacagaa ggtgaagtac 1321 ccattgagtt ttaagacttt tggtgaagcc atccctccac aatatgctat tcaggttctt 1381 gatgagttaa ctaacggaaa tgccattatt agtactggtg tggggcaaca ccagatgtgg 1441 gctgcccaat actataagta caaaaagcca caccaatggt tgacatctgg tggattagga 1501 gcaatgggat ttggtttgcc tgctgcaata ggtgcggctg ttggaagacc gggtgagatt 1561 gtggttgaca ttgatggtga cgggagtttt atcatgaatg tgcaggagtt agcaacaatt 1621 aaggtggaga atctcccagt taagattatg ttgctgaata atcaacactt gggaatggtg 1681 gttcaatggg aggatcgatt ctataaggct aacagagcac acacttactt gggtgatcct 1741 gctaatgagg aagagatctt ccctaatatg ttgaaattcg cagaggcttg tggcgtacct 1801 gctgcaagag tgtcacacag ggatgatctt agagctgcca ttcaaaagat gttagacact 1861 cctgggccat acttgttgga tgtgattgta cctcatcagg agcacgttct acctatgatt 1921 cccagtggcg gtgctttcaa agatgtgatc acagagggtg atgggagacg ttcatattga (SEQ ID NO: 10) Example 4 Identification of a plant-derived sequences that function as plasmid origins of replication in bacteria Preferred intragenic vectors of the invention comprise an origin of replication that functions in E. coli and Agrobacterium tumefaciens.
Plant-derived bacterial origins of replication in this example are based on the smallest known prokaryotic replication origins of Colicin E plasmids (ColE plasmids), specifically ColE2-P9 (from Shigella sp.) and ColE3-CA38 (from E. coli). The minimal replication origins of these plasmids, named COLE2 and COLE3, require only one specific factor (Rep) to be provided in trans. Plasmids pBX243 and pBX343 provide Rep in trans for ColE2 and ColE3 respectively. The minimal origins also require host DNA polymerase I and other factors (see Yasueda et al., Molecular and General Genetics, 215: 209-216, 1989; Shinohara and Itoh, Journal of Molecular Biology, 257: 290-300, 1996).
There are two differences between the ColE2 and ColE3 origin sequences: one mismatch, and a deletion of a single nucleotide in ColE2 (or an insertion in ColE3). The deletion/insertion, not the mismatch, is responsible for determining the plasmid specificity in the interaction of the origins with the trans-acting factors.
Characteristic features of these sequences are two direct repeat sequences of 7 bp (5'CAPuATAA3') or of 9 bp (5'APyCAPuATAA3'). The 7 bp and 9 bp repeats are separated from each other by 7 bp and 5 bp respectively in ColE2, and by 8 bp and 6 bp respectively in ColE3.
ColE2 AGACCAGATAAGCCT TATCAGATAACAGCGCC (SEQ ID NO:11)
ColE3 AGACCAAATAAGCCTATATCAGATAACAGCGCC (SEQ ID NO:12)
The one-nucleotide mismatch (G/A) can be substituted without effect. Only T or A is acceptable in the insertion position, and the third-to-last position can be G or A.
It is likely that other changes can also be made that do not affect the composition of the two direct repeat sequences.
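The repeat spacing described above can be reproduced directly from the two origin sequences. The sketch below is illustrative only: the regular expressions are one way of writing the 7 bp (CAPuATAA) and 9 bp (APyCAPuATAA) repeat motifs, the alignment gap in ColE2 has been removed, and the function name is an assumption rather than anything defined in this specification.

```python
import re

# Minimal origins as given in the text (alignment gap in ColE2 removed).
COLE2 = "AGACCAGATAAGCCTTATCAGATAACAGCGCC"
COLE3 = "AGACCAAATAAGCCTATATCAGATAACAGCGCC"

REPEAT_7 = re.compile(r"CA[AG]ATAA")        # 5'CAPuATAA3'
REPEAT_9 = re.compile(r"A[CT]CA[AG]ATAA")   # 5'APyCAPuATAA3'


def repeat_spacing(origin, pattern):
    """Number of nucleotides between the two direct repeats in an origin."""
    hits = [m.span() for m in pattern.finditer(origin)]
    if len(hits) != 2:
        raise ValueError(f"expected two repeats, found {len(hits)}")
    (_, first_end), (second_start, _) = hits
    return second_start - first_end


if __name__ == "__main__":
    for name, origin in (("ColE2", COLE2), ("ColE3", COLE3)):
        print(name,
              "7 bp repeats:", repeat_spacing(origin, REPEAT_7), "bp apart;",
              "9 bp repeats:", repeat_spacing(origin, REPEAT_9), "bp apart")
```

The single-nucleotide insertion in ColE3 lies between the repeats, which is why both spacings increase by one relative to ColE2.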
Consensus sequences for ColE2 and ColE3 can be described as follows, where R = G or A (Pu), Y = C or T (Py) and W = A or T:
Consensus ColE2 AGAYCARATAAGCCT TAYCARATAACAGCRCC
Consensus ColE3 AGAYCARATAAGCCTWTAYCARATAACAGCRCC
Other minimal replication origins and Rep genes from other (Colicin E) plasmids could also be used when constructing plant-derived replication origins (Table 2).
Table 2. Replication origins from ColE plasmids that could be used to construct plant-derived replication origins.
Original Host       Plasmid          (Putative) minimal origin             NCBI Accession
Shigella sp         ColE2-P9         agaccagataa-gcct-tatcagataacagcgcc    D30054
Escherichia coli    ColE3-CA38       agaccaaataa-gcctatatcagataacagcgcc    D30055
Escherichia coli    ColE2-CA42       -ga--aaata--gcctatatcagataacagcgcc    D30056
Escherichia coli    ColE2-GEI602     -ga--aaata--gcctatatcagataacagcgcc    D30057
Escherichia coli    ColE2imm-K317    -ga--aaata--gcctatatcagataacagcgcc    D30058
Shigella sonnei     ColE4-CT9        agaccagataa-gcct-tatcagataacagcgcc    D30059
Shigella sonnei     ColE5-099        agaccaaataaaacctatatcagataacagcgct    D30060
Shigella sonnei     ColE6-CT14       agaccaaataaaacctatatcagataacagcgct    D30061
Escherichia coli    ColE7-K317       agaccaaataa-gcctatatcagataacagcgcc    D30062
Escherichia coli    ColE8-J          agaccaaataa-gcctatatcagataacagcgcc    D30063
Escherichia coli    ColE9-J          agaccaaataaaacctatatcagataacagcgct    D30064
Potato COLE2-like replication sequence
The ColE2 consensus sequence was used to search publicly available potato (Solanum tuberosum) DNA sequences. The potato COLE2-like replication sequence POTCOLE2 was constructed in silico from two sequences, accessions:
SGN U254575 nucleotides 359-721 correspond to POTCOLE2 1-363
TIGR EST494490 nucleotides 248-693 correspond to POTCOLE2 362-807
tggccacaaaacaagcgccaaacaacgagcaacaacaaatcaagattgcaccaaaactagaa
aattaaagaagagtatcaccccaaatgcgttactgttcacgacctcaaatcagaatctacag
atctctaaatccgatctccactgttgaattgcaagaaccagatgctgagaactctcagttca
aatttgagcacgatccaacggttaacgaagcggcaaactctgtctgaagcggactgcctgag
cagaaaatttccagaagcaaaaacgggattttctctttttctctcaatctctaaaacgaatc
tctcttgatttttctctcttgtgtttctgaaaataagaccaaataagccttatcagataaca
gcacctgaagcagctcatgtagcttgtcagcaccaggtcctggcctaaacactgtatcattg
ccacgcaaagagcacgggtctgctccaccatctgatgacccaatacaccm'gcacctggcga
aaacctatgtgtgcgccccaaaagttctttgtcaatcttgtctaggactggatcatcccttg
caaattttcgggcaaagggagcattgctcttggccattttgtcgaagttcttcatagttaga
gacatgggatgttgctttggtggactgtcccaagcaatgtaatgaagatcgtggcttattgc
tgtgtgccgaaatttctgagtgttgcaaatgacagtgtgaaaatatccctctggcgaagaga
caaaatttgtataatacataagcatagtccgtggaaagttatcccatccccatatgcagtact (SEQ ID NO: 13)
Within the POTCOLE2 replication sequence, the minimal origin sequence is underlined.
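The kind of consensus search described above can be sketched by expanding the degenerate symbols (R, Y, W) into a regular expression and scanning a sequence for matches. This is an illustration only: the consensus string is the ColE2 consensus as read from this specification with its alignment gap removed, the excerpt is a short stretch of POTCOLE2 spanning the origin, the function names are hypothetical, and only the forward strand is searched. It is not the BLAST procedure used in the examples.

```python
import re

# Degenerate symbols used in this specification: R = G/A, Y = C/T, W = A/T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}

# ColE2 consensus (assumed reading, gap removed) and a POTCOLE2 excerpt.
COLE2_CONSENSUS = "AGAYCARATAAGCCTTAYCARATAACAGCRCC"
POTCOLE2_EXCERPT = "ctgaaaataagaccaaataagccttatcagataacagcacctgaagcagc"


def consensus_to_regex(consensus):
    """Translate a degenerate consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def find_origins(sequence, consensus):
    """Return (1-based start, matched text) for each forward-strand match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    for start, hit in find_origins(POTCOLE2_EXCERPT, COLE2_CONSENSUS):
        print(f"match at position {start} of the excerpt: {hit}")
```

A fuller search would also scan the reverse complement and tolerate a small number of mismatches, as the assembled plant sequences in this example do.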
The 807 bp POTCOLE2 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTCOLE2).
The following primers were designed to PCR amplify the spectinomycin resistance gene of pART27 (Gleave, Plant Molecular Biology, 20: 1203-1207, 1992).
Primer S: 5'GTGTCGACAACTACGATACGG3' (SEQ ID NO: 14)
Primer T: 5'CGTAAGCTTGAACGAATTCTTAG3' (SEQ ID NO: 15)
Nucleotides underlined represent a SalI site (GTCGAC) in primer S and HindIII (AAGCTT) and EcoRI (GAATTC) sites in primer T. A 1661 bp fragment with the spectinomycin resistance gene was PCR amplified from pART27 using high fidelity Pwo polymerase. This fragment was ligated as a SalI to HindIII region into pUC57POTCOLE2 to give pUC57POTCOLE2SPEC and position the spectinomycin resistance gene immediately adjacent to the POTCOLE2 fragment.
The fragment corresponding to the spectinomycin resistance gene and POTCOLE2 was isolated as a 2.5 kb EcoRI fragment from pUC57POTCOLE2SPEC and self-circularised to generate pPOTCOLE2SPEC. The ligation was transformed into E. coli DH5α harbouring helper plasmid pBX243 (with an ampicillin resistance gene) and transformants were selected on LB plates supplemented with ampicillin and spectinomycin (100 µg/mL). Resulting colonies were picked, and plasmid DNA was isolated and analysed by restriction enzyme digest using BamHI and EcoRI. A BamHI/EcoRI double digest will release the Rep gene from pBX243 and will linearise pPOTCOLE2SPEC.
Successful bacterial propagation of pPOTCOLE2SPEC using the potato-derived origin of replication is evident as three bands on a gel at 3.9 kb, 2.5 kb, and 1.5 kb, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene. Control digests of pBX243 alone result in two bands, the 3.9 kb pBX243 backbone and the 1.5 kb Rep gene. Figure 1 provides confirmation of replication of pPOTCOLE2SPEC in bacteria mediated by the potato-derived origin of replication.
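The expected band pattern follows from simple arithmetic on circular molecules: a plasmid cut at n sites yields n fragments whose lengths sum to the plasmid length. The sketch below illustrates that reasoning only. The plasmid lengths are approximations inferred from the band sizes quoted above, and the cut-site positions are hypothetical placeholders chosen to reproduce those sizes; neither is taken from a plasmid map.

```python
def circular_digest(plasmid_length, cut_sites):
    """Fragment lengths (largest first) from cutting a circular molecule."""
    sites = sorted(set(cut_sites))
    if not sites:
        return [plasmid_length]  # uncut circle
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around the origin of the circular map.
    fragments.append(plasmid_length - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)


# Approximate sizes in bp; cut positions are hypothetical.
PBX243 = circular_digest(5400, [100, 1600])   # sites flanking the Rep gene
PPOTCOLE2SPEC = circular_digest(2500, [50])   # single site: linearised

test_lane = sorted(PBX243 + PPOTCOLE2SPEC, reverse=True)
control_lane = PBX243

if __name__ == "__main__":
    print("pBX243 + pPOTCOLE2SPEC:", test_lane)
    print("pBX243 only:", control_lane)
```

The additional 2.5 kb band in the test lane is the feature that distinguishes successful propagation of pPOTCOLE2SPEC from the control.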
COLE2-like sequences from the genomes of other plant species
A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases yielded COLE2-like sequences in the genomes of other plant species. Searches were made for the consensus ColE2 sequence:
AGAYCARATAAGCCTTAYCARATAACAGCRCC
It was possible to readily assemble COLE2-like sequences by joining two sequences from the same species, with a few mismatches unlikely to impact on the functionality of the origin of replication. In the examples below the position of the join between two EST sequences is indicated by "/" and mismatches to the consensus sequence are indicated in bold.
Species                          Sequence                              SEQ ID NO    Accessions
Consensus ColE2                  AGAYCARATAAGCCT TAYCARATAACAGCRCC
Allium cepa                      AGACCAAATAAGCTC/TATCAGATAGCAGCTGC     16           CF436111/CF452305
Beta vulgaris                    AGGCCAAATAAGCCT/TATCAGATAACAGCGCC     17           BQ589076/BQ590618
Medicago truncatula              AAATCAAATAAGCCTTATCA/GATAACAGCACC     18           TC111839/TC102142
Gossypium arboreum               AGACCAGATAATCCT/TAACAGATAACAGCGCC     19           BM358442/AW729597
Hordeum vulgare                  AGATCAGATAAGCCTTA/TCAGATAACAGCGCC     20           DN158808/AY924388
Sorghum bicolor                  GACCAGATAAGCATTATTAG/ATAACAGCGCC      21           CF427156/CX613542
Picea glauca                     AGACCTAATAAGCCT/TATCAGATAACTGTGCG     22           CO485190/CO235782
Theobroma cacao                  AGACCAAATAAGACTTA/TCAGATAACAGCACG     23           CA796667/CF974720
Mesembryanthemum crystallinum    AAACCAAATAAGCTTTA/TCAGATAACAGCACA     24           CA838853/BE036300
Petunia hybrida                  GCATCAGATAAGCCT/AACCAAATAACAGCAAC     25           NP1240078/TC390
Brassica napus                   AGACCAGATAAGACT/CATCAGATAACAACACA     26           CD814492/CD814199
Zea mays                         CCACCAGATAAGCCTT/ATCAGATAACAGTTGC     27           DN211845/DN232238
Pinus taeda                      AGTACAGATAAGCCTT/ACCAAATAACAACACC     28           DR019180/BQ699992
COLE3-like sequences from the genomes of plant species
A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases yielded COLE3-like sequences in the genomes of other plant species. Searches were made for the consensus ColE3 sequence:
AGAYCARATAAGCCTWTAYCARATAACAGCRCC
It was also possible to readily assemble COLE3-like sequences by joining two sequences from the same species, with a few mismatches unlikely to impact on the functionality of the replication origin. In the examples below the position of the join between two EST sequences is indicated by "/" and mismatches to the consensus sequence are indicated in bold.
Species                          Sequence                              SEQ ID NO    Accessions
Consensus ColE3                  AGAYCARATAAGCCTWTAYCARATAACAGCRCC
Allium cepa                      AGACCAAATAAGC/TTATATCAGATAGCAGCTGC    29           CF436111/CF452305
Vitis vinifera                   AGATCAGATAAGCCTTTA/TCAGATAACAGCCCC    30           CF207293/CF515867
Nicotiana tabacum                AGACCAAATAAGCA/TATATCAGATAACTTCGGA    31           TC1407/BP530912
Glycine max                      TGACCAAATAAGCTTATAT/CAGATAACAGAGTC    32           BE209626/AI959871
Saccharum officinarum            AGACCAAATAACCCTAAAT/CAGATAACAACGC     33           CA156596/CA092850
Secale cereale                   AGACCAAATAATCATAT/TTCAGATAACAGCGCC    34           BE704886/CD453313
Capsicum annuum                  AAACCAAATAAGCAAA/TATCAGATAACTTCGCA    35           CA525915/TC6186
Populus euphratica               GCACCAAATAAGCCAATA/TCAGATAACAGCTGC    36           AJ777378/AJ768273
Lotus japonicus                  CTACCAAATAAGCA/TATATCAGATAACAGCGTA    37           TC15168/AV774815
Medicago truncatula              AGATCAAATAAGCCTTTA/TCAGATAACAGCAGA    38           CR931730/AC148360
Note: the last example was derived from the nr database rather than an EST database.
Example 5 Identification of plant-derived sequences that function as a selectable marker in bacteria: operator-repressor titration
Antibiotics and antibiotic resistance genes are traditionally used for the selection and maintenance of recombinant plasmids in hosts such as E. coli and A. tumefaciens. Their continued use is undesirable in plant biotechnology, where the threat of horizontal transfer to other microbes exists. An alternative plasmid selection strategy based on the phenomenon of repressor titration was developed by Cobra Biomanufacturing Plc (WO 03/097838 A1; Williams et al., Nucleic Acids Research, 26: 2120-2124, 1998; Cranenburgh et al., Nucleic Acids Research, 29: e26, 2001; Cranenburgh et al., Journal of Molecular Microbiology and Biotechnology, 7: 197-203, 2004).
Background
The Operator-Repressor Titration (ORT) system enables selection and maintenance of plasmids that are free from expressed selectable marker genes and require only the short non-expressed lac operator for selection and maintenance.
E. coli ORT strain DH1lacdapD (genotype recA endA1 gyrA96 thi1 hsdR17 supE44 relA1 Δ(dapD)::kan hipA::lac-dapD) contains a chromosomal conditionally essential gene, dapD, under the control of the lac operator/promoter system. Under normal conditions, a repressor protein encoded by a second chromosomal gene binds to the chromosomal lac operator and prevents transcription of dapD, and cells lyse. Growth is permitted when an inducer (IPTG) is provided, i.e. on a nutrient agar plate. Alternatively, growth is also permitted when a plasmid containing a lac operator sequence is introduced into the cell. The repressor protein binds to the plasmid-borne operator sequence, derepressing the chromosomal operator and allowing dapD expression.
Two lac operator sequences have previously been shown to function as plasmid selectable elements:
LacO1 is 21 bp and is derived from the wild-type E. coli lac operon.
LacO is 20 bp and is an 'ideal' version of LacO1, being a perfect palindrome of the first 10 bp of LacO1.
LacO1: aattgtgagcggataacaatt (SEQ ID NO:39)
LacO: aattgtgagcgctcacaatt (SEQ ID NO:40)
Previous research conducted to understand the lac operator has involved the analysis of various operator analogues, or LacO1-like sequences, i.e. sequences that are able to titrate the lac repressor. As little as 13 bp of a 14 bp symmetrical consensus sequence TGTGAGCGCTCACA is able to bind the lac repressor (Simons et al., Proceedings of the National Academy of Sciences, USA, 81: 1624-1628, 1984). No work has been conducted to show that these LacO-like sequences will function as plasmid selectable elements, but it appears likely. With only 13-14 bp required, it is statistically probable that sequences capable of binding the lac repressor and acting as a plasmid selectable element will be found in all plant genomes.
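The relationship between the two operators can be verified with a few lines of code: LacO is the first 10 bp of LacO1 followed by the reverse complement of those 10 bp, which makes it identical to its own reverse complement. The sketch below is illustrative; the helper name is an assumption, and the sequences are those given above.

```python
# Translation table for lowercase DNA; sequences are handled in lowercase.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (returned in lowercase)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


LACO1 = "aattgtgagcggataacaatt"   # SEQ ID NO:39, 21 bp
LACO = "aattgtgagcgctcacaatt"     # SEQ ID NO:40, 20 bp
CORE = "tgtgagcgctcaca"           # 14 bp symmetrical consensus

half = LACO1[:10]
ideal = half + reverse_complement(half)

if __name__ == "__main__":
    print("rebuilt from LacO1 half-site:", ideal)
    print("equals LacO:", ideal == LACO)
    print("LacO is palindromic:", reverse_complement(LACO) == LACO)
    print("LacO1 is palindromic:", reverse_complement(LACO1) == LACO1)
    print("14 bp consensus within LacO:", CORE in LACO)
```

The same half-site logic explains why database searches for either 10 bp half of LacO, or for the last 11 bp of LacO1, can each be treated as a separate query.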
Search for LacO-like sequences in plant genomes A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, were made for the LacO sequence: AATTGTGAGCGCTCACAATT (SEQ ID N0:40) The following list gives NCBI accession numbers that have sequences identical to at least 10 bp that comprise one of the two inverted repeats that make up LacO: Dicotyledonous plants Camelliaceae Camellia sp. CVO14004 Chenopodiaceae Beta vulgaris CV301904 78 Chenopodium sp.
Compositae Lactuca sativa Helianthus annuus 5 Helianthus argophyllus Convolvulaceae Ipomoea batatas Ipomoea nil Cruciferae 10 Brassica rapa Brassica rapa pekinensis Brassica napus Thellungiella halophila Thellungiella salsuginea 15 Cucurbitaceae Cucumis sativus Ericaceae Vaccinium corymbosum Euphorbiaceae 20 Manihot esculenta Lauraceae Persea americana Leguminosae Arachis hypogaea 25 Cicer arietinum Glycine max Glycine soja Lotus corniculatus var. japonicus Phaseolus vulgaris 30 Phaseolus coccineus Linaceae Linum usitatissimum Malvaceae Gossypium hirsutum CN782052 BQ999659 CD848561 CF097400 CB330510 BJ577170 L33645 CV523215 CD837028 BM985805 DN777083 DN910885 CV191519 CK651449 CK758014 CXI 27972 CD051347 CX708729 BG041485 BP073115 CV542405 CA914174 CV478930 CV478503 BU671795 BU672101 BJ576074 BJ570027 CD038632 DR044140 79 Gossypium raimondii Myrtaceae Eucalyptus tereticornis Pedaliaceae 5 Sesamum indicum Plumbaginaceae Limonium bicolor Rosaceae Fragaria x ananassa 10 Malus x domestica Prunus persica Prunus armeniaca Prunus dulcis Pyrus communis 15 Rutaceae Citrus sinensis Citrus reticulata Salicaeae Populus tremula 20 Solanaceae Capsicum annuum Lycopersicon esculentum Medicago truncatula Petunia x hybrida 25 Solanum habrochaites Solanum tuberosum Sterculiaceae Theobroma cacao Vitaceae 30 Vitis vinifera Monocotyledonous plants Gramineae Avena sativa Hordeum vulgare C0131784 CD669011 BU668313 CX263567 AB208578 CV882575 AJ825706 CV048462 BI203104 AJ504986 DN618703 CF828122 BU823277 BM066713 BP883198 AL384864 CV300353 DN168862 DN849072 CF974287 CX127882 CN821127 CV062014 CN191267 BP876932 CV469139 80 Oryza sativa CR290368 Saccharum officinarum CA104782 Sorghum bicolor CX607714 Schedonorus arundinaceus CK802645 Triticum aestivum BQ578949 Triticum monococcum BQ801760 Zea mays C0526196 Liliaceae Allium cepa CF448121 Gymnosperms Pinus taeda DRO13559 Picea engelmannii x sitchensis C0213279 Picea glauca CK441720 Pinus pinaster BX676975 Pseudotsuga menziesii CN638414 
Bryophytes Marchantia polymorpha AU081717
Algae Chlamydomonas reinhardtii CF558875
Search for LacO1-like sequences in plant genomes
Searches of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches", searching within the EST databases, were made for GGATAACAATT.
The 21 bp LacOl sequence is identical in its first 10 bp to LacO. The following list gives accession numbers where at least the last 11 bp of the LacO 1 sequence, GGATAACAATT, are found: Dicotyledonous plants Chenopodiaceae Beta vulgaris CX779649 CF542856 Compositae Lactuca sativa BU004821 BU008839 81 Helianthus annuus Convolvulaceae Ipomoea nil Cruciferae 5 Brassica rapa Brassica napus Raphanus sativus Cucurbitaceae Citrullus lanatus 10 Leguminosae Cicer arietinum Glycine max Medicago trunculata Phaseolus vulgaris 15 Phaseolus coccineus Pisum sativum Linaceae Linum usitatissimum Malvaceae 20 Gossypium raimondii Gossypium hirsutum Rosaceae Malus x domestica Prunus persica 25 Prunus armeniaca Prunus americana Pyrus communis Rutaceae Citrus sinensis 30 Citrus Clementina Solanaceae Capsicum annuum Lycopersicon esculentum Nicotiana tabacum BU671786 BQ965452 BJ567255 CV433907 CV432343 CX195012 CD838296 AF051115 AI563425 CK148974 C0036432 BQ144942 CV533775 CA913133 CD860446 CA482669 C0128755 C0497326 CV997415 AJ823535 CV048921 CV458467 AJ504896 CX709893 CB543020 CN879093 CN882088 CK758903 CX675412 CX075530 CX298649 CA516533 BI926125 CN949741 BP535353 82 Solanum tuberosum Vitaceae Vitis vinifera Monocotyledonous plants Gramineae Avena sativa Hordeum vulgare Lolium multiflorum Oryza sativa 10 Saccharum officinarum Secale cereale Sorghum bicolor Triticum aestivum Zea mays 15 Liliaceae Allium cepa Gymnosperms Pinus taeda Picea glauca 20 Pinus pinaster Pseudotsuga menziesii CK719419 CD715798 CN820280 CK568615 AU247989 CF986696 CA279301 BE495021 CX615619 CK215572 CF046268 BE205560 CF449604 AW065199 DN614133 C0251715 C0241938 BX682941 CN640766 Potato LacOl sequence as a recombinant plasmid selectable element The 21 bp LacO 1 sequence was used to search publicly available potato (Solanum tuberosum) EST sequences. 
Sequences were found in NCBI accessions CV501815 and CK259105 joined in silico with Bgfll restriction enzyme recognition sites (agatct) added to termini to make 693 bp POTLACOl: agatctAATATTTACTTCTCCACTTAAACAAATACCCCAATCAGAATCACTAGCTGGCAGAT TCCTTGTCCTCTATTGACAGCAAACATAGACGTACATTATAGAGCCACCACAACATTAGACA AACATTCTTTAAACAAGAGGTGGATACTGCTTAGACrGCAGGCGCACCCTCTTTCGGTACTC CAGAACATCCTGAATAAACATATGATACCCTTCAGTTTGGGCAGGATCAGCAGGGTTTGGCT GATCTAACAAGTCCTGGATACCAACCAGTATCTGTTTCACGGTGATGGCTGGTCTCCACCCA 83 CTATCTTCATTGAGGATCGACAAGCAAACTGTTCCAGATGGATAGACATTGGGATGGAAAAA GCCTGGTGGGAATTTACACTTTGGCGGTTTACTCGGATAATCTTCACTGAAGTGAATTGTGA GCGGATAACAATTGGGGAAATCATTATGTAAATTCAACAAATATTTCAATTTATGCATTAGC AAATTGTTATCAGGATCTACCACATCAGGATTGTCTTCTATGCTACGCAGCTAGTCGAACTC 5 GACTCCCTCGTTGTCTTCCTGGTAAATCCGGTCGAATATATCTCGACGGATGTTTTCTCCGG TACGATCAATACAATTTTTTCAACGAAACTACTGATTCAGCTAAAGATACAGTGAACTGTAG CAGCTagatct (SEQ ID NO:41) LacOl sequence is underlined. First nucleotide of CK259105 underlined and in bold. Terminal Bglll restriction enzyme sites (agatct in lowercase) are not of potato sequence origin.
The 693 bp POTLACO1 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTLACO1).
POTLACO1 was excised from pUC57POTLACO1 with BglII and ligated into pBR322 previously linearised with BamHI. The resulting plasmid pBR322POTLACO1 was transformed into E. coli strain DH1lacdapD and colonies were selected using repressor titration. Plasmid DNA was isolated from selected colonies and digested with restriction enzyme PstI (see Figure 2). Linearised pBR322 is visualised as a band at 4.4 kb. PstI-digested pBR322POTLACO1 is visualised as two bands, one at 1.3 kb and one at 3.8 kb. The results indicate that POTLACO1 functions as a plasmid selectable element.
Onion LacO and LacOl sequences Onion-derived LacOl and LacO sequences, ALLLACO and ALLLACOl, have also been made in silico. 84 NCBI accessions CF448121 and CF450773 were used to generate a 756 bp ALLLACO {LacO sequence underlined): agatcttcgtcagcctacaactaccgacaatcccaaacccacatccgacgacgataactacg 5 aaaatagggaggtggattctgacggagcttcggattccgacgatgattgggaaggggttgag agcacggagttggatgagatttttagtgcggcgactacgtttatagctgcgactgctgcgga taagaattctgcaaaagtttcgaatgatctgcagctgcagttatatgggttttacaagattg ctactgaggggccttgtaccgttccccaaccttctgcacttaaaatgacagctcgtgccaag tggaatgcatggcagaaacttggttccatgcctcctgaagaagctatggagaagtacattgc 10 aattgtgagcgctcacaattgcttttacgctatgtatgacaatatggataatcatggtgggg cccagagagccccaatgaatcctcagcaaattccatttggaaattcattatatggagctggg tctggactcatccgaggtggcttgggtgcctatggagagagatttttaggttcaagctccga gtttatgcagagcaatataagtagatggttctccaaccctcagtattactttcaagtgaatg accagtatgtgaggaacaagttgaaagttgttttgtttccctttttacacagagggcattgg 15 acaaggatcactgaaccggttggtggcaggctttcttacaaacctccaatttttgacatcaa tgccccagatct (SEQ ID NO:42) NCBI accessions CF448121 and CF449604 were used to generate a 662 bp ALLLACOl 20 {LacOl sequence underlined): agatctggggcattgatgtcaaaaattggaggtttgtaagaaagcctgccaccaaccggttc agtgatccttgtccaatgccctctgtgtaaaaagggaaacaaaacaactttcaacttgttcc tcacatactggtcattcacttgaaagtaatactgagggttggagaaccatctacttatattg 25 ctctgcataaactcggagcttgaacctaaaaatctctctccataggcacccaagccacctcg gatgagtccagacccagctccatataatgaatttccaaatggaatttgctgaggattcattg gggctctctqgqccccaccatgattatccatattqtcatacatagcqtaaaagcaattgtga gcqqataacaattcatttcaaaagggaggaggaggaggacaacagagtcagggtcttacgct ttttgtgaaaggttttgatagctctcaagatccattcacgattcgtgatactcttcgatcgc 30 attttgagtcctgtggagagatttctcgtgtttcagttccaaaagattttgaaaccggcagc tccagggggattgcgtacattgatttcaatgaacaagagagttttaacaaagccctagaact gaatggatcagaaatagatggatactacctggttgttgatca (SEQ ID NO:43) 85 Other LacOl-like sequences from plant genomes It is highly likely that ZacOl-like sequences will also function as plasmid selectable elements. 
For example, the following sequence was also found.
gi|14495426|gb|BE205560.1| API26F NPI Onion cDNA library Allium cepa cDNA clone API26F similar to catalase, mRNA sequence.
Length = 628
Score = 28.2 bits (14), Expect = 0.34
Identities = 21/22 (95%), Gaps = 1/22 (4%)
Strand = Plus / Minus
Query: 1   aattgtgagcgg-ataacaatt 21
           |||||||||||| |||||||||
Sbjct: 477 aattgtgagcgggataacaatt 456
Example 6 Design and construction of an intragenic vector for Arabidopsis thaliana
The consensus T-DNA border sequence can be defined as:
5'GGCAGGATATATXXXXXTGTAAXX3'
Although other variants can include:
5'GACAGGATATATXXXXXGGTCAXX3'
(nucleotides in bold represent possible substitutions in some T-DNA borders).
Searches for such DNA sequences identified a single sequence in the A. thaliana genome corresponding to:
5'GACAGGATATATCGTGATGTCAAC3' (SEQ ID NO:44) [ex AL138652 from chromosome 3, bp 60629-60606]
This "T-DNA border" is remarkably similar to authentic T-DNA borders from Agrobacterium Ti or Ri plasmids, with all nucleotide substitutions occurring in variable regions:
5'GACAGGATATATGGTGATGTCACG3' pTiS4 (SEQ ID NO:45)
5'GACAGGATATATGTTCTTGTCATG3' pRi (TRrb) (SEQ ID NO:46)
5'GGCAGGATATATCGAGGTGTAAAA3' pTi15955 (TR lb) (SEQ ID NO:47)
5'GACAGGATATATTGGCGGGTAAAC3' pTiT37 (rb) (SEQ ID NO:48)
5'GACAGGATATATTGGCGGGTAAAC3' pTiC58 (rb) (SEQ ID NO:49)
(nucleotides in bold represent nucleotide substitutions)
The A. thaliana "T-DNA border" is from an open reading frame (nucleotides 59676-63206 from AL138652) for a putative protein of unknown function [i.e. no promoters and presumably not a heterochromatic region].
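The comparison above can be expressed as a single pattern that merges the two consensus forms given for T-DNA borders. The sketch below is an illustration under stated assumptions: the pattern treats X as any nucleotide and allows the two alternatives at positions 2, 18 and 21, which are the positions at which the two quoted consensus forms differ; the function name and the negative-control sequence are hypothetical.

```python
import re

# Merged from the two consensus forms quoted in the text:
#   GGCAGGATATATXXXXXTGTAAXX and GACAGGATATATXXXXXGGTCAXX
BORDER_PATTERN = re.compile(r"G[GA]CAGGATATAT[ACGT]{5}[TG]GT[AC]A[ACGT]{2}")

BORDERS = {
    "A. thaliana (SEQ ID NO:44)": "GACAGGATATATCGTGATGTCAAC",
    "pTiS4 (SEQ ID NO:45)": "GACAGGATATATGGTGATGTCACG",
    "pRi TRrb (SEQ ID NO:46)": "GACAGGATATATGTTCTTGTCATG",
    "pTi15955 TR lb (SEQ ID NO:47)": "GGCAGGATATATCGAGGTGTAAAA",
    "pTiT37 rb (SEQ ID NO:48)": "GACAGGATATATTGGCGGGTAAAC",
    "pTiC58 rb (SEQ ID NO:49)": "GACAGGATATATTGGCGGGTAAAC",
}


def is_border_like(seq):
    """True if a 24-nt sequence fits the merged border consensus."""
    return BORDER_PATTERN.fullmatch(seq.upper()) is not None


if __name__ == "__main__":
    for name, seq in BORDERS.items():
        print(f"{name}: {seq} -> {is_border_like(seq)}")
```

A genome-wide search would apply the same pattern with `finditer` to both strands; the A. thaliana border reported here lies on the reverse strand, as its descending coordinates (60629-60606) indicate.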
Examination of sequences flanking this "T-DNA border" reveals a 2838 bp fragment (nucleotides 59735-62572 from AL138652) with several unique restriction sites suitable as potential insertion sites for other genes and for Southern analysis of plants transformed using this vector.
If the "T-DNA border" found at nucleotides 60629-60606 is considered the "left border" of a binary vector, there are several unique restriction sites, including XbaI, between this left border and the first three nucleotides equivalent to a right border at positions 59735-59737. The right border beyond these three nucleotides can be provided by authentic right border sequences of non-plant origin, thereby resulting in a "chimeric right border".
In assembling an intragenic vector all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

The following primers were designed:
Primer A: 5'CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC3' (SEQ ID NO:50)
Primer B: 5'AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC3' (SEQ ID NO:51)
Primer C: 5'AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC3' (SEQ ID NO:52)
Primer D: 5'GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG3' (SEQ ID NO:53)

Restriction sites within the primers are indicated in italics: TCTAGA, XbaI site; CTCGAG, XhoI site; GTCGAC, SalI site.
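The primer design above can be checked with a few string operations. The sketch below is illustrative only and rests on one assumption: the italic runs that are garbled in this copy are read as the restriction sites named in the text (TCTAGA, CTCGAG, GTCGAC). With that reading, primers A and D are reverse complements of each other across the XbaI site, and primer B carries the reverse complement of the pTiT37 right border quoted in Example 6, consistent with the "chimeric right border" described there.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer sequences with the garbled italic runs read as the stated sites
# (an assumption made for this sketch).
PRIMERS = {
    "A": "CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC",
    "B": "AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC",
    "C": "AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC",
    "D": "GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG",
}

SITES = {"XbaI": "TCTAGA", "XhoI": "CTCGAG", "SalI": "GTCGAC"}

# Right border of pTiT37 as listed in Example 6.
PTIT37_RIGHT_BORDER = "GACAGGATATATTGGCGGGTAAAC"


def sites_in(primer: str):
    """Names of the three cloning sites found in a primer, sorted."""
    return sorted(name for name, site in SITES.items() if site in primer)


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))
```

Primers A and B then give an XbaI/XhoI fragment and primers C and D a SalI/XbaI fragment, which matches the digests described for the two PCR products.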
Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers A and B, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "right border" 703 bp fragment, which was subsequently restricted with XbaI and XhoI and ligated into pPROEX-1 restricted with the same endonucleases, to form pPROEX-1-rb.
Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers C and D, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "left border" 2216 bp fragment, which was subsequently restricted with XbaI and SalI and ligated into pPROEX-1-rb restricted with the same endonucleases, to form pPROEX-AtTD.
The 2864 bp Sail to Xhol fragment of pPROEX-AtTD was ligated to the 8004 bp Sail backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form pTCl. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing. The full sequence of pTCl is shown below and 30 comprises a 2838 bp DNA fragment derived from Arabidopsis thaliana (nucleotides 59735-62572 from AL138652) presented in italics. The right and left T-DNA borders are in bold and the unique Xbal site used for subsequent cloning is in bold and underlined. 88 i GTCGACGGAT CTTTTCCGCT GCATAACCCT GCTTCGGGGT CATTATAGCG ATTTTTTCGG 61 TATATCCATC CTTTTTCGCA CGATATACAG GATTTTGCCA AAGGGTTCGT GTAGACTTTC 121 CTTGGTGTAT CCAACGGCGT CAGCCGGGCA GGATAGGTGA AGTAGGCCCA CCCGCGAGCG 181 GGTGTTCCTT CTTCACTGTC CCTTATTCGC ACCTGGCGGT GCTCAACGGG AATCCTGCTC 241 TGCGAGGCTG GCCGGCTACC GCCGGCGTAA CAGATGAGGG CAAGCGGATG GCTGATGAAA 301 CCAAGCCAAC CAGGGGTGAT GCTGCCAACT TACTGATTTA GTGTATGATG GTGTTTTTGA 361 GGTGCTCCAG TGGCTTCTGT TTCTATCAGC TGTCCCTCCT GTTCAGCTAC TGACGGGGTG 421 GTGCGTAACG GCAAAAGCAC CGCCGGACAT CAGCGCTATC TCTGCTCTCA CTGCCGTAAA 481 ACATGGCAAC TGCAGTTCAC TTACACCGCT TCTCAACCCG GTACGCACCA GAAAATCATT 541 GATATGGCCA TGAATGGCGT TGGATGCCGG GCAACAGCCC GCATTATGGG CGTTGGCCTC 601 AACACGATTT TACGTCACTT AAAAAACTCA GGCCGCAGTC GGTAACCTCG CGCATACAGC 661 CGGGCAGTGA CGTCATCGTC TGCGCGGAAA TGGACGAACA GTGGGGCTAT GTCGGGGCTA 721 AATCGCGCCA GCGCTGGCTG TTTTACGCGT ATGACAGTCT CCGGAAGACG GTTGTTGCGC 781 ACGTATTCGG TGAACGCACT ATGGCGACGC TGGGGCGTCT TATGAGCCTG CTGTCACCCT 841 TTGACGTGGT GATATGGATG ACGGATGGCT GGCCGCTGTA TGAATCCCGC CTGAAGGGAA 901 AGCTGCACGT AATCAGCAAG CGATATACGC AGCGAATTGA GCGGCATAAC CTGAATCTGA 961 GGCAGCACCT GGCACGGCTG GGACGGAAGT CGCTGTCGTT CTCAAAATCG GTGGAGCTGC 1021 ATGACAAAGT CATCGGGCAT TATCTGAACA TAAAACAC TA TCAATAAGTT GGAGTCATTA 1081 CCCAACCAGG AAGGGCAGCC CACCTATCAA GGTGTACTGC CTTCCAGACG AACGAAGAGC 1141 GATTGAGGAA AAGGCGGCGG CGGCCGGCAT GAGCCTGTCG GCCTACCTGC 
TGGCCGTCGG 1201 CCAGGGCTAC AAAATCACGG GCGTCGTGGA CTATGAGCAC GTCCGCGAGC TGGCCCGCAT 1261 CAATGGCGAC CTGGGCCGCC TGGGCGGCCT GCTGAAACTC TGGCTCACCG ACGACCCGCG 1321 CACGGCGCGG TTCGGTGATG CCACGATCCT CGCCCTGCTG GCGAAGATCG AAGAGAAGCA 1381 GGACGAGCTT GGCAAGGTCA TGATGGGCGT GGTCCGCCCG AGGGCAGAGC CATGACTTTT 1441 TTAGCCGCTA AAACGGCCGG GGGGTGCGCG TGATTGCCAA GCACGTCCCC ATGCGCTCCA 1501 TCAAGAAGAG CGACTTCGCG GAGCTGGTAT TCGTGCAGGG CAAGATTCGG AATACCAAGT 1561 ACGAGAAGGA CGGCCAGACG GTCTACGGGA CCGACTTCAT TGCCGATAAG GTGGATTATC 1621 TGGACACCAA GGCACCAGGC GGGTCAAATC AGGAATAAGG GCACATTGCC CCGGCGTGAG 1681 TCGGGGCAAT CCCGCAAGGA GGGTGAATGA ATCGGACGTT TGACCGGAAG GCATACAGGC 1741 AAGAACTGAT CGACGCGGGG TTTTCCGCCG AGGATGCCGA AACCATCGCA AGCCGCACCG 1801 TCATGCGTGC GCCCCGCGAA ACCTTCCAGT CCGTCGGCTC GATGGTCCAG CAAGCTACGG 1861 CCAAGATCGA GCGCGACAGC GTGC.AACTGG CTCCCCCTGC CCTGCCCGCG CCATCGGCCG 1921 CCGTGGAGCG TTCGCGTCGT CTCGAACAGG AGGCGGCAGG TTTGGCGAAG TCGATGACCA 1981 TCGACACGCG AGGAACTATG ACGACCAAGA AGCGAAAAAC CGCCGGCGAG GACCTGGCAA 2041 AACAGGTCAG CGAGGCCAAG CAGGCCGCGT TGCTGAAACA CACGAAGCAG CAGATCAAGG 2101 AAATGCAGCT TTCCTTGTTC GATATTGCGC CGTGGCCGGA CACGATGCGA GCGATGCCAA 2161 ACGACACGGC CCGCTCTGCC CTGTTCACCA CGCGCAACAA GAAAATCCCG CGCGAGGCGC 2221 TGCAAAACAA GGTCATTTTC CACGTCAACA AGGACGTGAA GATCACCTAC ACCGGCGTCG 2281 AGCTGCGGGC CGACGATGAC GAACTGGTGT GGCAGCAGGT GTTGGAGTAC GCGAAGCGCA 40 2341 CCCCTATCGG CGAGCCGATC ACCTTCACGT TCTACGAGCT TTGCCAGGAC CTGGGCTGGT 2401 CGATCAATGG CCGGTATTAC ACGAAGGCCG AGGAATGCCT GTCGCGCCTA CAGGCGACGG 89 2461 CGATGGGCTT CACGTCCGAC CGCGTTGGGC ACCTGGAATC GGTGTCGCTG CTGCACCGCT 2521 TCCGCGTCCT GGACCGTGGC AAGAAAACGT CCCGTTGCCA GGTCCTGATC GACGAGGAAA 2581 TCGTCGTGCT GTTTGCTGGC GACCACTACA CGAAATTCAT ATGGGAGAAG TACCGCAAGC 2641 TGTCGCCGAC GGCCCGACGG ATGTTCGACT ATTTCAGCTC GCACCGGGAG CCGTACCCGC 2701 TCAAGCTGGA AACCTTCCGC CTCATGTGCG GATCGGATTC CACCCGCGTG AAGAAGTGGC 2761 GCGAGCAGGT CGGCGAAGCC TGCGAAGAGT TGCGAGGCAG CGGCCTGGTG GAACACGCCT 2821 GGGTCAATGA TGACCTGGTG CATTGCAAAC GCTAGGGCCT TGTGGGGTCA 
GTTCCGGCTG 2881 GGGGTTCAGC AGCCAGCGCT TTACTGGCAT TTCAGGAACA AGCGGGCACT GCTCGACGCA 2941 CTTGCTTCGC TCAGTATCGC TCGGGACGCA CGGCGCGCTC TACGAACTGC CGATAAACAG 3001 AGGATTAAAA TTGACAATTG TGATTAAGGC TCAGATTCGA CGGCTTGGAG CGGCCGACGT 3061 GCAGGATTTC CGCGAGATCC GATTGTCGGC CCTGAAGAAA GCTCCAGAGA TGTTCGGGTC 3121 CGTTTACGAG CACGAGGAGA AAAAGCCCAT GGAGGCGTTC GCTGAACGGT TGCGAGATGC 3181 CGTGGCATTC GGCGCCTACA TCGACGGCGA GATCATTGGG CTGTCGGTCT TCAAACAGGA 3241 GGACGGCCCC AAGGACGCTC ACAAGGCGCA TCTGTCCGGC GTTTTCGTGG AGCCCGAACA 3301 GCGAGGCCGA GGGGTCGCCG GTATGCTGCT GCGGGCGTTG CCGGCGGGTT TATTGCTCGT 3361 GATGATCGTC CGACAGATTC CAACGGGAAT CTGGTGGATG CGCATCTTCA TCCTCGGCGC 3421 ACTTAATATT TCGCTATTCT GGAGCTTGTT GTTTATTTCG GTCTACCGCC TGCCGGGCGG 3481 GGTCGCGGCG ACGGTAGGCG CTGTGCAGCC GCTGATGGTC GTGTTCATCT CTGCCGCTCT 3541 GCTAGGTAGC CCGATACGAT TGATGGCGGT CCTGGGGGCT ATTTGCGGAA CTGCGGGCGT 3601 GGCGCTGTTG GTGTTGACAC CAAACGCAGC GCTAGATCCT GTCGGCGTCG CAGCGGGCCT 3661 GGCGGGGGCG GTTTCCATGG CGTTCGGAAC CGTGCTGACC CGCAAGTGGC AACCTCCCGT 3721 GCCTCTGCTC ACCTTTACCG CCTGGCAACT GGCGGCCGGA GGACTTCTGC TCGTTCCAGT 3781 AGCTTTAGTG TTTGATCCGC CAATCCCGAT GCCTACAGGA ACCAATGTTC TCGGCCTGGC 3841 GTGGCTCGGC CTGATCGGAG CGGGTTTAAC CTACTTCCTT TGGTTCCGGG GGATCTCGCG 3901 ACTCGAACCT ACAGTTGTTT CCTTACTGGG CTTTCTCAGC CGGGATGGCG CTAAGAAGCT 3961 ATTGCCGCCG ATCTTCATAT GCGGTGTGAA ATACCGCACA GATGCGTAAG GAGAAAATAC 4021 CGCATCAGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG 4081 CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT 4141 AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC 4201 GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC 4261 TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA 4321 AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT 4381 CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG 4441 TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC 4501 GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG 
4561 GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC 4621 TTGAAGTGGT GGCCTAACTA "CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG 4681 CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC 4741 GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT 40 4801 CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT 4861 TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA 90 4 921 4981 5041 5101 5 5161 5221 5281 5341 5401 10 5461 5521 5581 5641 5701 15 5761 5821 5881 5941 6001 20 6061 6121 6181 6241 6301 25 6361 6421 6481 6541 6601 30 6661 6721 6781 6841 6901 35 6961 7021 7081 7141 7201 40 7261 7321 aaatgaagtt tgcttaatca tgactccccg gcaatgatac gccggaaggg AAACAAGTGG AGGTTTGCGA GCAGGTGGCG GCACCTATCC TTACATCTCG AGAGCTTGTC CATTGTTGTG GAAAAAGTGG TGTACCTGCC CACGTATAAA GGGAGCACAG AATTCAGGAG AGAGGTAGTT CGGCTCCGCA GACCGTAAGG GGCTTCCCCT CGACATCATT CAATGACATT GCTGACAAAA TGATCCGGTT CTCGCCGCCC GTACAGCGCA gcgcctgccg AGAAGATCGC CGAGATCACC CGCGGCGCGG GGTGGTTCTA TGTTTTAGTG TCCAAGCAAC CTCTGAGAGC TAGGCTTATA CAATGGGACA GGGGCGCGGC TGGCCAATTT GCCTGCTCTC GGCCGAGGGG ttaaatcaat GTGAGGCACC TCGTGTAGAT cgcgagaccc ccgagcgcag CAGCAACGGA TCCGCTGTGC GAAACATTGG GACCAAGGCT GATGATGACT GGGAAGATTG TCGCACACGC GCACTAAGCA TGCAATTTGT ACTAGACCTC GATGACGCCT TTAAACATCA GGCGTCATCG GTGGATGGCG CTTGATGAAA GGAGAGAGCG CCGTGGCGTT CTTGCAGGTA GCAAGAGAAC CCTGAACAGG GACTGGGCTG GTAACCGGCA GCCCAGTATC TTGGCCTCGC AAGGTAGTCG CTTAACTCAA AGCCTCGTAC GATGAAGCTC TACGACAACT AACTACGATA TATAGCGCAA ACGAACTTCT GTCTATGGCG TCGCTTGCCC TAATAAAATG CGCAGCCCCT ctaaagtata tatctcagcg AACTACGATA ACGCTCACCG aagtggtcct TTCGCAAACC CAGGCGTTAG atgctgagaa TTGAACTATC CTGATGAAGA AACTCAACTC ACCGAGGCAA GACAGCTCCT ACGCAAAATG AAGTCTCGAA AACAATTCAT TGAGGGAAGC AGCGCCATCT GCCTGAAGCC CAACGCGGCG AGATTCTCCG ATCCAGCTAA TCTTCGAGCC ATAGCGTTGC ATCTATTTGA GCGATGAGCG aaatcgcgcc agcccgtcat GCGCAGATCA GCAAATAATG GCGTTAGAGA TTGCGATGGC GTCTTCCCTA CCATAAGCAA ATAGTTCATC ATGGGTCTCG TTTCCACATC GCAAAGATGG TGACAGATAA 
TTAGGAGCTT GGGGGGATGG tatgagtaaa ATCTGTCTAT CGGGAGGGCT GCTCCAGATT gcaactttat TGTCACGCCT GCGTCATATG ccatttcatt TACCAGAAGT CTCTGCTTGC AACATGGAAC AGGAGTCGCG TGGCATACGA tggctttact CGAAACAGCG TCAAGCCGAC GGTGATCGCC CGAACCGACG ACACAGTGAT AGCTTTGATC CGCTGTAGAA GCGCGAACTG AGCCACGATC CTTGGTAGGT GGCGCTAAAT AAATGTAGTG GAAGGATGTC ACTTGAAGCT GTTGGAAGAA TCTAACAATT GCTGGGGAAG ATCGGGGCAG TGACTACTCC TTACGACAAT CAATTACGAC CACTTTCGCC TGGCAAAAGG GAGCTTCTGC CGGCCTGAAG GGCTGCCATT GAGGCCCGCG cttggtctga ttcgttcatc TACCATCTGG tatcagcaat ccgcctccat tttgtgccaa AAGATTTCGG gttcgtgaag GTGAGCCCCT TATGGCGCAT GATCTAGCCT CACAGTCTCA TTAGAGACAC CTCGGCGGCA ATGTACTGGT ACCGCTTCGC GAAGTATCGA TTGCTGGCCG ATTGATTTGC AACGACCTTT GTCACCATTG CAATTTGGAG GACATTGATC CCAGCGGCGG GAAACCTTAA CTTACGTTGT gctgccgact aggcaggctt TTTGTTCACT CGTTCAAGCC ACTATGCGCG GCACTTGCTG CCATCCAACT AG T C CAT CAA AATAGTCGCA GGCTACTACG ATGTTCTACA GGGGCATTGG ATCATGTATC TTTGGGGTGA TTAGCGGGCC cagttaccaa CATAGTTGCC CCCCAGTGCT AAACCAGCCA CCAGTCTATT AAGCCGCGCC TGATCCCTGA TGTTCGATGT ACCGGAAGGA TCATCGACCA CTATCGAACA TCGAATTTGC AAACGAACAA TTGACCTGTT ACTGGTTCTC GGCGCGGCTT CTCAACTATC TACATTTGTA TGGTTACGGT TGGAAACTTC TTGTGCACGA AATGGCAGCG TGGCTATCTT AGGAACTCTT CGCTATGGAA CCCGCATTTG GGGCAATGGA ATCTTGGACA ACGTGAAAGG GACGCCGCTT ATCTGTTGAA ACCTGCCAAT ACGACATTTC AT TACGACAA ACGGAAATCG TCATTGCCAA CCCCAAAAGG TCGTCATAAA TAAGCAACTA GGCCGTTCGC GGGAGGGTTC 91 7381 7441 7501 7561 7621 7681 7741 7801 7861 7921 7981 8041 8101 8161 8221 8281 8341 8401 8461 8521 8581 8641 8701 8761 8821 8881 8941 9001 9061 9121 9181 9241 9301 9361 9421 9481 9541 9601 gagaaggggg ggcacccccc ttcggcgtgc gcggtcacgc gccagggcgc agccctggtt aaaaacaagg tttataaata ttggtttaaa agcaggttaa aagacaggtt agcggtggcc gaaaaacggg cggaaaccct tgcaaatgct ggattttctg cctgtggaca gcccctcaaa tgtcaatagg tgcgcccctc atctgtcagc actctgcccc tcaagtgtca aggatcgcgc ccctcatctg tcagtagtcg cgcccctcaa gtgtcaatac cgcagggcac ttatccccag gcttgtccac atcatctgtg ggaaactcgc gtaaaatcag gcgttttcgc cgatttgcga ggctggccag ctccacgtcg 
ccggccgaaa tcgagcctgc ccctcatctg tcaacgccgc gccgggtgag tcggcccctc aagtgtcaac GTCCGCCCCT catctgtcag tgagggccaa GTTTTCCGCG aggtatccac AACGCCGGCG GCCGGCCGCG gtgtctcgca CACGGCTTCG ACGGCGTTTC TGGCGCGTTT GCAGGGCCAT AGACGGCCGC CAGCCCAGCG GCGAGGGCAA CCAGCCCGGT GAGCGTCGGA AAGGGTCGAG GTTTACCCGC CAATATATCC TGTCTATGTT TCACATGAAC ACGTGAATCT TCTTCAACAC GCCCACCTAA CCGCTCCTTT GCAGATAATC GACGGCGTCG AGTTGATGTG TGATCAACAT TACCAGAATT CCTTTCATCA GCTGAGTATC GGAATTGTTC TCTGCTTATT CCTCCATCCA CTGCATAGTT CCCTAGCTTG TCTCTGTAAT CATATGCTAC TTCATGTTCA CGGAACCTTT TACTATCTGC CTTCTCATAA GACATTCTTG ATTGCTTAGC ATCCCTGTAG TTGTAATCAT AAGGCATATT CTCATGCATA ACCTCACTTG CGTTGTCTCT AAGACCATAA TCATCTCTTG TACGCAAAAT TGAATCATTC GAATGATAAA CCTCTTGTCT ACCATCTTGA TATCTCATAT TGGCATAAAC TTTAACATCA CCACCATTAC GTCGTTGCAA ACGCTCATCA TCCAAGTAGA CTTGATCTCG GTCATCAAAA AGATATCTCC TGCCTCGAAG AGCTTCCTCA TCTTGCTTGC CAGCTGATGA TCTACTGACA TCAGGATGCA TCACCCCATA CGAATCAATT TCATGATCTC TTAGGAGTTG CTGGCTTTCA TAGGGCAAAT AGGCTTCCCT TCCGTCATTC GAGGACATTC CTTTACGCTC TAGAGCTCTA GCACCTCCTC GGTCCACAAT CTCTGCTTTG GTGACAGCAG GATACATCCT CTCATCAATG CCAGAGTCGT AGTACTTCAG TTGTTGTTTA TTGTAATGCT GATAAACATC CTTGCTTTCA TTATCCAAAT ACGCTTCATT TCTATCAATG AAGGCTACTC TCCTAAGCTC TAGCGCCTTG GCATCTCCAT GGTCTACTAT AATATCTGAC GAGTTGACAT CACGATATAT CCTGTCATCA ATGCCATAGT CATGATCTTT CTTAAGTTGT TGGCTTTCGT AATGCAGATA TGCATCCCCC CTTTTATAAT CCATGTATGA TTCCTCTCCA TCATCGAAGG ATCCTCTTCT ACGCTCAAGA GCTCTGGCTT CTTCCCCGTT TACAAGAATA TCTGATTTAT TGAGACTGGG ATGCATCATG GCAAAAGAGT TAGTTTCATG ATCTTTTAGG AGTTGCTGGC TTTCACTTTG AAAATATGCT TCCTTTCGAT CATTTAAGGA TACTCCTCTA TACCTTAGAC CTCTTGCATC TTCATGGTCT ACTAGAATAT CTGATCTGTT GACATCAGGA GGCATCAT GA CATAAGAGTC AGTTTCATAA TCGTTTAGGA GTTGCTGGTT TTCACATTGC AAGTATGCGT CCTTTTTATC ATTCAAGGAC ACTCCTCCAT ACCTCCGACC TCTGGCATCT TCATGGTCTA CCAGAATATC TGATTTGTTG ACATCGGGAT GCCTCATGAC GTAAGAGTCA GTTTCATGAT CGTTTAGGAG TCGCTGCCTT TCACATTGCA AGTATGCTTC CTTTTTATCA TTCAAGGAAA CTCCTCTATA CCTCCGACCT CTGCCATCTT CATGGTCTAC CAGAGTATCT GATTTGTTGA 
CATGGGGATG CATCATGCCA TAGGAGTTAG TTTCATAATC ATTTAGGAGT CTCTGTCTTT CACATTGCAT GTATGCTTCC TTTTTATCAT TCAAGGACCC TCCTCTATAC CTTAGACCTT TGGAATCTTC CCGGTCTTCC AGAGTATCTG ATTTGTTGAC ATCGGGATGC ATTATGCCAT AGGAGTTAGT TTCATAATCA TTTAGGAGTT GCTGGCTTTC ACATTGCAAG TAAGCTTCCC TTCTATCATT TAAGGACCCT CCTCTATACC 92 9841 ttagacctct ggaattttcc cgttcccagt ctgctagaat atctaatctg ttgacatcat 9901 caggataaat cttagcatca gagcgagagt cataatcttt ctccagttgt tggattttgt 9961 aattcagata aacatccttc ctttcattat ccaagtatgc tgcctttcct ttgtttaagc 10021 atcgtcgttg aagctgcact cctcttccat cctgatctac cacaatatct gccatatcta 10081 catcagaatg tatcctacca tcattgccag ggtgatactc gtttctcagt tcctgtcttt 10141 tatcatctga aaaaacagca tttctctcat tatccaaata tgcttccctt tctttattta 10201 agcaacgtgt tatgctctgt gaagctttat catctcgatc tactacagca tatggttcac 10261 tgaaatcaag attcttctta ctgtcaacac catcatatag ataatccttt ctcagtactt 10321 gacatttgtt atccaaataa acaacctttc tttcttcgtt acggaagttc ctgtagtgat 10381 catctcgttc atccaccact cttgatccca actccacaaa aggataatct tccttcacag 10441 actcataatg gtcagccatc ctctctttcc tgctaaactc aagatgggta tcggccgcat 10501 caacatcagc tatatttgaa ccacagacat gggattttga taaagatcct ctcctctgca 10561 taaaaagatc attctctcta gccacattat tgacctcatg cctaacactg ggaaactctc 10621 tcattgctat atcagagcct atatgataat tatcccgagc ttcatccact atctctttga 10681 cacacctgct cacaactggt gaatcatggt ctccacgact taaatctcta acttgttgat 10741 cccttggtga gtttctacca acataatcat cgacacgtct agtacgtaga acctgtggta 10801 cacaaagatt cccatcataa tcatgtcttc ttctctcatc ggaatcattc acacaaccga 10861 aagatcta (SEQ ID NO:54) A mutant form of the Arabidopsis thaliana acetohydroxyacid synthase gene conferring resistance to sulfonylurea herbicides such as chlorsulfuron was inserted into the T-DNA of pTCl. The 5.8 kb Xbal fragment from pGHl (Haughn et al. 1988, Molecular and General Genetics 211:266-271) was ligated into the unique Xbal site between the left and right T-25 DNA borders of pTCl to produce pTCAHAS. 
The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing.
Example 7
Transformation of Arabidopsis thaliana with an intragenic vector

The pTCAHAS binary vector was transformed into the disarmed Agrobacterium tumefaciens strain EHA105 (Hood et al. 1993, Transgenic Research, 2:208-218), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin and used to transform Arabidopsis thaliana 'Columbia' using the floral dip method (Clough and Bent, Plant Journal 16: 735-743, 1998).

The resulting self pollinated seed was screened in vitro on half-strength MS salts (Murashige and Skoog 1962, Physiologia Plantarum, 15: 473-497) supplemented with 10 µg/L chlorsulfuron. Seeds were also sown on a standard potting mix in a greenhouse and the germinated seedlings at the 3-4 true leaf stage were sprayed with a standard application of Glean (active ingredient chlorsulfuron) at a rate equivalent to 20 g/ha.
The recovered chlorsulfuron-resistant seedlings were confirmed as being transformed with the intragenic vector pTCAHAS by polymerase chain reactions on their genomic DNA across the junctions of the two XbaI sites adjoining the original T-DNA of pTCl and the 5.8 kb XbaI fragment inserted to form pTCAHAS. The following primers were used:
Primer E: 5'CATCCACTGCATAGTTCCC3' (SEQ ID NO:55)
Primer F: 5'GATGCGTTGATCTCTTCATCA3' (SEQ ID NO:56)
Primer G: 5'TCAACATCAATCCGAGTACG3' (SEQ ID NO:57)
Primer H: 5'AGAGATTGTGGACCGAGGAG3' (SEQ ID NO:58)

As illustrated in Figure 3, the expected 643 bp DNA fragment was PCR amplified from the binary vector pTCAHAS and three A. thaliana lines transformed with pTCAHAS using primers E+F designed to flank the XbaI site inside the right T-DNA border. Similarly, the expected 149 bp DNA fragment was PCR amplified from the same DNA sources using primers G+H designed to flank the XbaI site inside the left T-DNA border.
Example 8
Construction of additional intragenic vectors

In assembling the intragenic vector all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
A binary vector with a T-DNA composed of potato DNA

The 1268 bp sequence illustrated in Example 2 as a T-DNA-like region of a potato (Solanum tuberosum) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57POTINV).

The SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPOTINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
The pPOTINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al. 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m⁻² sec⁻¹; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPOTINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on the potato medium defined above and incubated under reduced light intensity (5-10 µmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPOTINV. The following primers were used:
Primer I: 5'GCTCACCTTGCAGCTTCACT3' (SEQ ID NO:59)
Primer J: 5'CAGAGCTGGATTTGCATCAG3' (SEQ ID NO:60)
to amplify an expected 570 bp DNA fragment from the T-DNA-like region of pPOTINV, and
Primer K: 5'GATGGCAGAAGGCGAAGATA3' (SEQ ID NO:61)
Primer L: 5'GAGCTGGTCTTTGAAGTCTCG3' (SEQ ID NO:62)
as an internal control to amplify an expected 1069 bp fragment from the endogenous potato actin gene.

The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV. The expected 570 bp DNA fragment was PCR amplified from the binary vector pPOTINV and from two of 80 hairy root lines tested using primers I and J (Figure 4). The DNA samples from the two hairy root lines positive for the T-DNA from pPOTINV and the control hairy root lines were also used for PCR using primers designed for the Agrobacterium virG gene:
Primer M: 5'GCGGTAGCCGACAG3' (SEQ ID NO:63)
Primer N: 5'GCGTCAAAGAAATA3' (SEQ ID NO:64)
The DNA samples from all hairy root lines failed to amplify PCR products using primers M and N (Figure 5). Furthermore, cultures of these hairy roots failed to grow bacteria when incubated in LB medium. These results establish the absence of associated Agrobacterium with the hairy roots. The 2.5% co-transformation frequency (2 of 80) of T-DNAs from pArA4b and pPOTINV was achieved despite selection for only hairy roots. This demonstrates that the pPOTINV binary vector is effective in transforming potatoes.
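The co-transformation frequency quoted above is a simple proportion of PCR-positive lines among the lines tested. The sketch below reproduces that arithmetic; the Wilson score interval is an addition for illustration only (the source reports no interval) and the function names are hypothetical.

```python
import math


def cotransformation_frequency(positive_lines: int, tested_lines: int) -> float:
    """Percentage of tested hairy root lines that were PCR-positive."""
    if tested_lines <= 0:
        raise ValueError("tested_lines must be positive")
    return 100.0 * positive_lines / tested_lines


def wilson_interval(positive_lines: int, tested_lines: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for the proportion, in percent.

    Not part of the source analysis; shown only to indicate how much
    uncertainty a count as small as 2 of 80 carries.
    """
    n = tested_lines
    p = positive_lines / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return 100.0 * (centre - half), 100.0 * (centre + half)


print(cotransformation_frequency(2, 80))  # 2.5
print(wilson_interval(2, 80))
```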
A binary vector with a T-DNA composed of petunia DNA

The 1507 bp sequence illustrated in Example 2 as a T-DNA-like region of a petunia (Petunia hybrida) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57PETINV).
The SalI fragment encompassing the T-DNA composed of petunia DNA from pUC57PETINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPETINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
The pPETINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al. 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Hofgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.
Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g l⁻¹ sucrose, 40 mg l⁻¹ ascorbic acid, 500 mg l⁻¹ casein hydrolysate and 7 g l⁻¹ agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m⁻² sec⁻¹; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPETINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on the potato medium defined above and incubated under reduced light intensity (5-10 µmol m⁻² sec⁻¹). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg l⁻¹ Timentin to prevent Agrobacterium overgrowth.
Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPETINV. The following primers were used:
Primer O: 5'GAGATAAACAAATAGTCCGGATCG3' (SEQ ID NO:65)
Primer P: 5'GGGAGCATTTGGTGGAAATAG3' (SEQ ID NO:66)
to amplify an expected 447 bp DNA fragment from the T-DNA-like region of pPETINV. The same DNA samples were also used in a PCR using primers K and L designed to amplify an expected 1069 bp fragment from the endogenous potato actin gene as an internal control.

The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV. The expected 447 bp DNA fragment from the T-DNA-like region of pPETINV was PCR amplified from the binary vector pPETINV and from one of 85 hairy root lines tested using primers O and P (Figure 6). The DNA sample from the hairy root line positive for the T-DNA from pPETINV failed to amplify a PCR product using primers M and N designed for the Agrobacterium virG gene (Figure 7). Furthermore, a culture of this hairy root line failed to grow bacteria when incubated in LB medium. These results establish the absence of associated Agrobacterium with the hairy root line positive for the T-DNA from pPETINV. The 1-2% co-transformation frequency (1 from 85) of T-DNAs from pArA4b and pPETINV was achieved despite selection for only hairy roots. Overall, these results demonstrate that the pPETINV binary vector is effective in transforming plants.

A binary vector with a T-DNA composed of onion DNA

The 1075 bp sequence illustrated in Example 2 as a T-DNA-like region of an onion (Allium cepa) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57ALLINV).
The SalI fragment encompassing the T-DNA composed of onion DNA from pUC57ALLINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pALLINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
Example 9
Design, Construction and Verification of Plant Derived Recombination Sites: loxP-like sites for recombination with Cre recombinase

BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.

1) Potato DNA fragment containing a loxP-like sequence - POTLOXP

A fragment containing a loxP-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ111407 and BQ045786). This fragment, named POTLOXP, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the loxP-like sequence shown in bold and light grey.
[POTLOXP sequence not legible in this copy] (SEQ ID NO:67)

Nucleotides 1-3: part of EcoRV restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-402: nucleotides 17-415 of NCBI accession BQ111407
Nucleotides 403-653: nucleotides 298-548 of NCBI accession BQ045786
Nucleotides 654-655: part of EcoRV restriction enzyme site (from the potato intragenic T-DNA)

The designed potato loxP-like sequence has 6 nucleotide mismatches from the native loxP sequence as illustrated in bold below.
[Alignment of the native loxP sequence (SEQ ID NO:68) with the potato loxP-like sequence; not legible in this copy]

The 655 bp POTLOXP sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
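The composition of the 655 bp POTLOXP fragment, as given in its coordinate table, can be checked for internal consistency: the segments should tile the fragment without gaps, and each EST-derived segment should be as long as the accession range it is copied from. A minimal sketch, with a hypothetical helper name, using only the coordinates stated in the text:

```python
# (fragment_start, fragment_end, source, source_start, source_end)
# Coordinates are 1-based and inclusive, as in the text.
POTLOXP_SEGMENTS = [
    (1, 3, "EcoRV site (pPOTINV)", None, None),
    (4, 402, "BQ111407", 17, 415),
    (403, 653, "BQ045786", 298, 548),
    (654, 655, "EcoRV site (potato intragenic T-DNA)", None, None),
]


def check_segments(segments, total_length: int) -> bool:
    """Raise ValueError on a gap, overlap or length mismatch; else True."""
    expected_start = 1
    for frag_start, frag_end, source, src_start, src_end in segments:
        if frag_start != expected_start:
            raise ValueError(f"gap or overlap before segment from {source}")
        if src_start is not None and (src_end - src_start) != (frag_end - frag_start):
            raise ValueError(f"length mismatch for {source}")
        expected_start = frag_end + 1
    if expected_start - 1 != total_length:
        raise ValueError("segments do not cover the stated total length")
    return True


print(check_segments(POTLOXP_SEGMENTS, 655))  # True
```

The four segments are 3, 399, 251 and 2 nucleotides long, which sums to the stated 655 bp.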
Initially the 1286 bp SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV (described in Example 8) was subcloned into pGEMT to form pGEMTPOTINV. POTLOXP was then cloned into pGEMTPOTINV twice, firstly as an XbaI to ClaI fragment, then subsequently as an EcoRV to EcoRV fragment. The POTLOXP inserts were verified using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTLOXP2.
loxP sequence ATAACTTCGTATAGCATACATTATACGAAGTTAT (SEQ ID NO:69)

The DNA sequence of the 2316 bp SalI fragment comprising the potato derived T-DNA region in pPOTLOXP2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTLOXP regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 2005-2028. Restriction sites illustrated in bold represent those used in cloning the POTLOXP regions into pGEMTPOTINV. Unique restriction sites in pPOTLOXP2 for cloning between POTLOXP sites are:
AflII C/TTAAG
AgeI A/CCGGT
BamHI G/GATCC
BstD102I GAG/CGG
CspI CG/GWCCG
PinAI A/CCGGT

GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT
TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT
TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT
AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC
TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC
AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC
GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTT
[...] CGAATTCC [...] AATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACTATC
AAAGCCACCGACTTAAAGAAGATAACAACAGGACAGAATGATAAAGGTCTTAAGCTTTATGA
TCCAGGCTATCTCAACACAGCACCTGTTAGGTCATCAATATGCTATATAGATGGTGATGCCG
GGATCCTTAGA [...]
[...]
gatgctcatccaatgggggttcttgtcagtgcaatgagtgctctttccgtttttcatcctga
TGCAAATCCAGCTCTGAGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGT
ggcactgctcaatatatgaggtgggcgcgagaagcaggtaccaatgtgtcctcatcaagaga
tgcattctttaccaatccaacggtcaaagcatactacaagtcttttgtcaaggctattgtga
caagaaaaaactctataagtggagttaaatattcagaagagcccgccatatttgcgtgggaa
ctcataaatgagcctcgttgtgaatccagttcatcagctgctgctctccaggcgtggatagc
agagatggctggatttgtcgac (SEQ ID NO:70)

The ability of this construct to undergo recombination between the POTLOXP sites was tested in vivo using Cre recombinase expressing Escherichia coli strain 294-Cre (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTLOXP2 was transformed into E. coli strain 294-Cre and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of Cre recombinase in E. coli strain 294-Cre, which effected recombination between the two POTLOXP sites in pPOTLOXP2. This was evident by a reduction in the size of pPOTLOXP2 from 5316 bp to 4480 bp. Plasmid isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 °C was restricted with SalI. All colonies tested produced the fragments of 3.0 kb and 1.5 kb expected when recombination between the POTLOXP sites has occurred (Figure 8).
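The size change reported above follows from how recombination between two directly repeated sites behaves: the DNA between the sites is excised together with one copy of the site, leaving a single copy behind. The sketch below models this on a toy linear string; the toy plasmid and the function name are hypothetical, and the native loxP sequence from this specification is used only as a stand-in for the repeated site.

```python
def excise_between_direct_repeats(plasmid: str, site: str) -> str:
    """Remove everything from the first copy of `site` up to (not including)
    the second copy, so that exactly one copy of the site remains.

    Simplified model: the plasmid is treated as a linear string and both
    sites are assumed to lie in the same orientation within it.
    """
    first = plasmid.find(site)
    second = plasmid.find(site, first + len(site)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("two copies of the site are required")
    return plasmid[:first] + plasmid[second:]


# Stand-in recombination site (native loxP as given in this specification).
SITE = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"

toy_plasmid = "AAAA" + SITE + "CCCCCC" + SITE + "GGGG"
recombined = excise_between_direct_repeats(toy_plasmid, SITE)
print(len(toy_plasmid) - len(recombined))  # intervening DNA plus one site copy

# Figures stated in the text: plasmid size and right border position
# before and after recombination.
excised_from_plasmid = 5316 - 4480
right_border_shift = 2005 - 1169
print(excised_from_plasmid, right_border_shift)
```

Both differences taken from the text come to 836 bp, so the stated plasmid sizes and the stated right border positions before and after recombination agree with each other.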
Recombination between the POTLOXP sites was further verified by DNA sequencing. Plasmid was isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 °C, then DNA sequenced across the SalI region inserted into pGEMT. The resulting sequence from two independent cultures is illustrated below and confirms that recombination is base pair faithful through the remaining POTLOXP site in plasmid preparations. Only the nucleotides in italics are not part of the potato genome sequences. The remaining POTLOXP region is shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1169-1192. Restriction sites illustrated in bold represent those remaining from cloning the POTLOXP regions into pPOTINV.
GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTT [...] GATATCATACAGTCAATGCCCCATGATGCTCATCCAATGGGGGTTCTTGTCAGT GCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTGAGAGGACAGGATAT ATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATATGAGGTGGGCGCGA GAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAATCCAACGGTCAAAGC ATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTATAAGTGGAGTTAAAT ATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTCGTTGTGAATCCAGT TCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTTGTCGAC (SEQ ID NO:71)
2) LoxP-like sequences from other species
Medicago truncatula (barrel medic) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Barrel medic loxP-like ATGACTTCGTATAATGTATGCTATACGAAGTGTG (SEQ ID NO:72)
Nucleotides 1-19: nucleotides 109-127 of NCBI accession CA919120
Nucleotides 20-34: nucleotides 14-28 of NCBI accession CA989265
The barrel medic loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
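The mismatch count quoted for a plant-derived site can be reproduced by comparing it position by position with the native sequence. The snippet below is a minimal sketch of that comparison using the two 34-nucleotide sequences given above (SEQ ID NO:68 and SEQ ID NO:72); the function name is illustrative and not part of the specification.

```python
# Position-by-position comparison of a loxP-like site with native loxP.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"          # SEQ ID NO:68
BARREL_MEDIC = "ATGACTTCGTATAATGTATGCTATACGAAGTGTG"  # SEQ ID NO:72


def mismatch_positions(reference: str, candidate: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be the same length")
    pairs = zip(reference.upper(), candidate.upper())
    return [pos for pos, (ref, cand) in enumerate(pairs, start=1) if ref != cand]


positions = mismatch_positions(LOXP, BARREL_MEDIC)
print(len(positions), "mismatches at positions", positions)
```

The same function can be applied unchanged to any of the other loxP-like or frt-like sequences listed in this specification, since all are 34 nucleotides long.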
Picea (spruce) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Spruce loxP-like ATACCTTCGTATAATGTATGCTATACAAAGAAAT (SEQ ID NO:73)
Nucleotides 1-15: nucleotides 226-240 of NCBI accession CO215992
Nucleotides 16-34: nucleotides 148-166 of NCBI accession CO255617
The spruce loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
Zea mays (maize) loxP-like sequence designed from 2 ESTs
LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Maize loxP-like GCCACTCCGTATAATGTATGCTATACGAAATGAT (SEQ ID NO:74)
Nucleotides 1-20: nucleotides 326-345 of NCBI accession CB278114
Nucleotides 21-34: nucleotides 11-27 of NCBI accession CD001443
The maize loxP-like site has 6 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
Example 10
Design, Construction and Verification of Plant Derived Recombination Sites: frt-like sites for recombination with FLP recombinase
BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.
1) Potato DNA fragment containing a frt-like sequence - POTFRT
A fragment containing a frt-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ513657 and BG098563). This fragment, named POTFRT, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the frt-like sequence is shown in bold and light grey.
cttAAG [...] GAATTC [...] GTTCCTATACTTTCTAGAGAATAGGAAG [...] (SEQ ID NO:75)
Nucleotides 1-3: part of BfrI restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-45: nucleotides 454 to 495 of NCBI accession BQ513657
Nucleotides 46-185: nucleotides 40 to 179 of NCBI accession BG098563
The designed potato frt-like sequence has 5 nucleotide mismatches from the native frt sequence as illustrated in bold below.
frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Potato frt-like sequence CCTGTTCCTATACTTTCTAGAGAATAGGAAGTTG (SEQ ID NO:77)
The 185 bp POTFRT sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
POTFRT was cloned twice into the T-DNA composed of potato DNA residing in the plasmid pGEMTPOTINV (described in Example 9), firstly as an EcoRI to AvrII fragment, then subsequently as a BfrI to BamHI fragment. The POTFRT inserts were confirmed using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTFRT2.
The DNA sequence of the 1432 bp SalI fragment comprising the potato derived T-DNA region in the resulting pPOTFRT2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1121-1144. Restriction sites illustrated in bold represent those used to clone the POTFRT regions into pGEMTPOTINV. Unique restriction sites in pPOTFRT2 for cloning between POTFRT sites are:
AgeI A/CCGGT
BstD102I GAG/CGG
ClaI AT/CGAT
CspI CG/GWCCG
PinAI A/CCGGT
gtcgacagtaaaagttgcacctggaataaggttttcattcttcacaggaggcatctcactct ttctagcaggtcttgaacgcttagattgaacagatgtaggactcacatctgatatggaggat tcttgacttgtttcagcagcatcagatgaagcttctgagacttcacctgatccatcatctgt agcagttgcttctacttcttccactgctacatcagtctcagttgctgatactataagacctc ttaatttaggtcgtaaaatgcaaccaactctaaaatggggaaacaatttaatagatgttgac AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTA [...] ct agaaacttccggtgtatccgccgtttccggcgttgcacctccgccgaatctaaaaggtgcgt TGACGATCATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACT GCTACCCTATTGAAGAGCTGGCCGAGGGAAGTTCCTTCTTGGAAGTGGCATATCTTTTGTTG TATGGTAATTTACCATCTGAGAACCAGTTAGCAGACTGGGAGTTCACAGTTTCACAGCATTC AGCGGTTCCACAAGGACTCTTGGATATCATACAGTCAATGCCCCATGATGCTCATCCAATGG GGGTTCTTGTCAGTGCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTG AGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATA TGAGGTGGGCGCGAGAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAAT CCAACGGTCAAAGCATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTAT AAGTGGAGTTAAATATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTC GTTGTGAATCCAGTTCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTT GTCGAC (SEQ ID NO:78)
The ability of this construct to undergo recombination between the POTFRT sites was tested in vivo using the FLP recombinase-expressing Escherichia coli strain 294-FLP (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTFRT2 was transformed into E. coli strain 294-FLP and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of FLP recombinase in E. coli strain 294-FLP, which effected recombination between the two POTFRT sites in pPOTFRT2. This was evident from a reduction in the size of pPOTFRT2 from 4432 bp to 4086 bp. Plasmid isolated from colonies of E. coli strain 294-FLP transformed with pPOTFRT2 and cultured at 37 °C was restricted with SalI. All colonies tested produced fragments of 3.0 kb, 1.4 kb and 1.1 kb. These three fragments represent the pGEMT backbone, the unrecombined POTFRT2 fragment, and the expected fragment from recombination between the POTFRT sites, respectively (Figure 9).
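The three band sizes reported for the SalI digest follow from the plasmid sizes given in the text. The short check below works through that arithmetic. It assumes, which the text does not state explicitly, that SalI releases the T-DNA region as a single fragment, that the backbone carries no further SalI site, and that the excised stretch lies wholly within the released fragment. The variable names are illustrative.

```python
# Sizes taken from the text (bp).
PLASMID_BP = 4432             # pPOTFRT2 before recombination
SALI_FRAGMENT_BP = 1432       # SalI fragment carrying the potato-derived T-DNA region
RECOMBINED_PLASMID_BP = 4086  # pPOTFRT2 after FLP-mediated recombination

# Derived sizes, under the assumptions stated in the lead-in.
backbone_bp = PLASMID_BP - SALI_FRAGMENT_BP             # pGEMT backbone
excised_bp = PLASMID_BP - RECOMBINED_PLASMID_BP         # lost between the POTFRT sites
recombined_fragment_bp = SALI_FRAGMENT_BP - excised_bp  # SalI fragment after excision


def to_kb(bp: int) -> float:
    """Round a size in bp to the nearest 0.1 kb, as gel bands are usually quoted."""
    return round(bp / 1000, 1)


# A culture holding both unrecombined and recombined plasmid shows all three bands.
bands_kb = [to_kb(backbone_bp), to_kb(SALI_FRAGMENT_BP), to_kb(recombined_fragment_bp)]
print("backbone:", backbone_bp, "bp; excised:", excised_bp, "bp")
print("expected bands (kb):", bands_kb)
```

The presence of the 1.4 kb band alongside the 1.1 kb band is what one would expect if recombination had not gone to completion in every plasmid copy.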
Recombination between the POTFRT sites was further verified by DNA sequencing. The 1.1 kb fragment from lane 3 of Figure 9 was gel purified and directly sequenced. The resulting sequence is illustrated below and confirms that recombination is base pair faithful through the remaining POTFRT site. The remaining POTFRT region is shaded. The left T-DNA border is illustrated in bold and positioned at 253-276. Restriction sites illustrated in bold represent those remaining from cloning the POTFRT regions into pGEMTPOTINV.
TTTCTAGCAAGTCTTGTACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGA TTCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTG TAGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCT CTTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGA CAGAGGCAGGATATATTTTGGGGTAAACGGGAA [...] TCCTATAGTTTCT [...] GGATCC [...] TAGATATCGAGGCTACCCTATTGAAGAGCTGGCCGAGGGAAGTTCC TTCTTGGAAGTGGCATATCTTTTGTTGTATGGTAATTTACCATCTGAGAACCAGTTAGCAGA CTGGGAGTTCACAGTTTCACAGCATTCAGCGGTTCCACAAGGACTCTTGGATATCATACAGT CAATGCCCCATGATGCTCATCCAATGGGGGTACTTGTCAGTGCAATGAGTGCTCTTTCCGTT TTT (SEQ ID NO:79)
2) Onion (Allium cepa) FRT-like fragment - ALLFRT
A fragment containing a frt-like sequence was designed from two EST sequences from onion (NCBI accessions CF434781 and CF445353). This fragment, named ALLFRT, is illustrated below. Restriction enzyme sites to allow cloning into the onion intragenic binary vector described in Example 8 are shown in bold and the frt-like sequence is illustrated in bold and light grey.
[...] TTAAT [...] CTTG [...] (SEQ ID NO:80)
Nucleotides 1-450: nucleotides 28-477 of NCBI accession CF434718
Nucleotides 451-875: nucleotides 105-529 of NCBI accession CF445383
The designed onion frt-like sequence has 7 nucleotide mismatches from the native frt sequence as illustrated in bold below.
Frt sequence gaagttcctatactttctagagaataggaacttc (SEQ ID NO:76)
Onion frt-like sequence CTTGTTCCTATACTCTCTGGAGAATAGGAACTGT (SEQ ID NO:81)
The 875 bp ALLFRT sequence can be cloned into pALLINV twice, once via flanking VspI sites into the NdeI site of pALLINV and subsequently via NheI and XbaI sites into the XbaI site of pALLINV. The orientation and identity of the ALLFRT insert can be verified by restriction enzyme analysis and DNA sequencing.
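Restriction enzyme analysis of this kind starts from knowing where each recognition site falls in the construct. The sketch below scans a sequence for recognition sites, expanding the IUPAC codes W and N and checking both strands so that non-palindromic sites are not missed. The recognition sequences are those printed in this specification for the pPOTFRT2 cloning sites, with the cut-position slash removed, and the test sequence is a stretch copied from SEQ ID NO:78. The enzyme name BstD102I is the reading assumed here for the site printed as GAG/CGG, and the function names are illustrative.

```python
import re

# IUPAC codes needed for the sites used here, as regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "W": "W", "N": "N"}

# Recognition sequences as printed in the text, without the cut-position slash.
SITES = {
    "AgeI": "ACCGGT",
    "BstD102I": "GAGCGG",  # enzyme name assumed; site printed as GAG/CGG
    "ClaI": "ATCGAT",
    "CspI": "CGGWCCG",
}


def reverse_complement(site: str) -> str:
    """Reverse complement of a recognition sequence, keeping IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


def find_sites(sequence: str, recognition: str) -> list[int]:
    """1-based top-strand start positions of a site on either strand."""
    hits = set()
    for form in {recognition, reverse_complement(recognition)}:
        pattern = "".join(IUPAC[base] for base in form)
        # The lookahead lets overlapping occurrences be reported as well.
        hits.update(m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence.upper()))
    return sorted(hits)


# Stretch copied from SEQ ID NO:78 (between the POTFRT regions).
SNIPPET = "TGACGATCATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGG"

for enzyme, recognition in SITES.items():
    print(enzyme, recognition, find_sites(SNIPPET, recognition))
```

Each of the four sites occurs once in this stretch. Confirming that a site is unique in a whole plasmid would require running the same scan over the complete plasmid sequence, which is not reproduced here.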
The DNA sequence of the 2896 bp SalI fragment comprising the onion derived T-DNA region in the resulting pALLFRT2 is illustrated below. Only the nucleotides in italics are not part of onion genome sequences. The ALLFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 520-543 and the right border positioned at 2490-2513. Restriction sites illustrated in bold represent those used to clone the ALLFRT regions into the onion T-DNA like sequence.
gtcgacttccctttcctctactccacttgtttcicgctttctctacttcctttttctctctt ttctttatatttattgctcagctgggattaattactgtcatttattcctcatatctatttta ttgaattaaaacggttatttagctcgaggccttctctcttattctttgcttccaaggagaga GAATATGGCGAGTGGTAGCAATCATCAGCATGGTGGAGGAGGAAGAAGAAGAGGCGGAATGT TAGTCGCTGCGACCTTGCTTATTCTTCCTGCCATTTTCCCCAATTTGTTTGTTCCTCTTCCC TTTGCTTTTGGTAGTTCTGGCAGCGGTGCATCTCCTTCTCTCTTCTCCGAATGGAATGCTCC TAAACCTAGGCATCTCTCTCTTCTGAAAGCAGCCATTGAGCGTGAGATTTCTGACGAACAAA AATCAGAGCTGTGGTCTCCCTTGCCTCCACAGGGATGGAAACCGTGCCTTGAGACTCAATAT AGTAGCGGGCTACCCAGTAGATCGACAGGATATATTCAAGTGTAAAACAAGATGCTGAATCG ATTAGCAATGGTTCGCT [...] CTAGACTTGCTTCTCGGATAATCAATCCTCAGTTTTTGATTCCTTCTCGAAGCTTCCTTG ATCTCCATAAGATGGTAAACAAGGAGGCGATAAAAAAAGAAAGGGCTAGACTTGCTGATGAG ATGAGCAGAGGATATTTTGCGGATATGGCAGAGATTCGTATACATGGTGGCAAGATTGCTAT GGCAAATGAAATTCTTATTCCATCAGGGGAAGCAATCAAATTTCCTGATTTGACAGTAAAAT TGTCTGATGATAGCAGTTTGCATTTACCAATTGTATCTACACAAAGTGCTACAAATAACAAT GCTAAATCCACTCCTGCTGCCTCATTGTTGTGCCTTTCCTTCAGAGCAAGTTCACAGACAAT GGTTGAATCATGGACTGTTCCTTTTTTGGACACTTTTAACTCTTCAGAAGTACAAGCA [...] TATGAGGTATCATTTTTGGATTCTTGGTT TTTCTCATTCGGACCAATCAAGAGAATGTTTCTTAACATGACGAAGAAACCCACTGCTACTC AGCGGAAGATTGGTTATTTCATTTGGTGATCACTATGATTTTAGGAAGCAGCTTCAAATTGT AAATCTTTTGACAGGATATATATTACTGTAAAAAGTGAAGAGAGAAATGTGATATATGCTGA TGTTTCCATGGAGAGGGGTGCATTTCTTGTTCAACAAGCTATGAGGGCTTTCCATGGAAAGA ATATAGAAAGCGCAAAATCAAGGCTTAGTCTTTGCGAGGAGGATATTCGTGGGCAGTTAGAG ATGACAGATAACAAACCAGAGTTATATTCACAGCTTGGTGCTGTCCTTGGAATGCTAGGAGA CTGCTGTCGAGGAATGGGTGATACTAATGGTGCGATTCCATATTATGAAGAGAGTGTGGAAT TCCTCTTAAAAATGCCTGCAAAAGATCCCGAGGTTGTACATACACTATCAGTTTCCTTGAAT AAAATTGGAGACCTGAAATACTACGAAGGAGATCTGCAGTCGAC (SEQ ID NO:82)
Restriction enzyme sites available for cloning between ALLFRT sequences include:
ApaBI GCANNNNN/TGC
BsiI C/TCGTG
BspMI ACCTGCNNNN/
DraIII CACNNN/GTG
HindIII A/AGCTT
MfeI C/AATTG
NheI G/CTAGC
PflMI CCANNNN/NTGG
ScaI AGT/ACT
SphI GCATG/C
XbaI T/CTAGA
3) Frt-like sequences from other species
Brassica napus (rape) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Rape frt-like sequence ACAGTTCCTATACTTTCTGGAGAATAGGAAGGTG (SEQ ID NO:83)
Nucleotides 1-14: nucleotides 397-410 of NCBI accession CD824140
Nucleotides 15-34: nucleotides 128-147 of NCBI accession CD825268
The rape frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Glycine max (soybean) frt-like sequence designed from 2 ESTs
Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Soybean frt-like sequence ACAGTTCCTATACTTTCTACAGAATAGGAACTTC (SEQ ID NO:84)
Nucleotides 1-19: nucleotides 84-102 of NCBI accession BE057270
Nucleotides 20-34: nucleotides 243-257 of NCBI accession BI970552
The soybean frt-like sequence has 3 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Triticum aestivum (wheat) frt-like sequence designed from 2 ESTs
Frt sequence gaagttcctatactttctagagaataggaacttc (SEQ ID NO:76)
Wheat frt-like sequence agagttcctatactttctagagaataggaacccc (SEQ ID NO:85)
Nucleotides 1-18: nucleotides 446-463 of NCBI accession CD877128
Nucleotides 19-34: nucleotides 1805-1820 of NCBI accession BT009538
The wheat frt-like sequence has 4 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Pinus taeda (loblolly pine) frt-like sequence designed from 2 ESTs
Frt sequence gaagttcctatactttctagagaataggaacttc (SEQ ID NO:76)
Loblolly pine frt-like sequence aaagttcctatactttctggagaataggaaaaca (SEQ ID NO:86)
Nucleotides 1-16: nucleotides 14-29 of NCBI accession AA556441
Nucleotides 17-34: nucleotides 764-781 of NCBI accession AF101785
The loblolly pine frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).
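Because each 34-nucleotide site is assembled from two EST segments, the coordinate ranges quoted for a design can be cross-checked: the segments should tile positions 1 to 34 without gaps, and each should be exactly as long as the EST range it is copied from. The sketch below is one way to run that check. The coordinates are transcribed from the text; the class and function names are illustrative. The maize loxP-like design is included because its second segment, as printed (site positions 21-34 against EST positions 11-27), does not pass the length check, which may reflect a typographical or extraction error in the printed coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    site_start: int  # first position within the designed 34-nt site
    site_end: int    # last position within the designed site
    est_start: int   # first position within the source EST
    est_end: int     # last position within the source EST
    accession: str   # NCBI accession, used only as a label


def check_segments(segments: list[Segment], site_length: int = 34) -> list[str]:
    """Return a list of inconsistencies; an empty list means the design tiles cleanly."""
    problems = []
    next_start = 1
    for seg in segments:
        if seg.site_start != next_start:
            problems.append(f"{seg.accession}: starts at {seg.site_start}, expected {next_start}")
        site_span = seg.site_end - seg.site_start + 1
        est_span = seg.est_end - seg.est_start + 1
        if site_span != est_span:
            problems.append(f"{seg.accession}: {site_span} nt in site but {est_span} nt in EST")
        next_start = seg.site_end + 1
    if next_start - 1 != site_length:
        problems.append(f"segments cover {next_start - 1} nt, expected {site_length}")
    return problems


# Coordinates as printed in the text.
DESIGNS = {
    "rape frt-like": [Segment(1, 14, 397, 410, "CD824140"), Segment(15, 34, 128, 147, "CD825268")],
    "soybean frt-like": [Segment(1, 19, 84, 102, "BE057270"), Segment(20, 34, 243, 257, "BI970552")],
    "wheat frt-like": [Segment(1, 18, 446, 463, "CD877128"), Segment(19, 34, 1805, 1820, "BT009538")],
    "loblolly pine frt-like": [Segment(1, 16, 14, 29, "AA556441"), Segment(17, 34, 764, 781, "AF101785")],
    "maize loxP-like": [Segment(1, 20, 326, 345, "CB278114"), Segment(21, 34, 11, 27, "CD001443")],
}

for name, segments in DESIGNS.items():
    print(name, check_segments(segments) or "consistent")
```

A check like this cannot say which of two conflicting numbers is wrong; it only flags designs whose printed coordinates should be compared against the original EST records.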
The above examples illustrate practice of the invention. It will be well understood by those with ordinary skill in the art that such DNA sequences can be assembled together to construct a complete vector composed entirely of plant DNA of the same or related species. It will also be appreciated by those skilled in the art that numerous variations and modifications may be made without departing from the spirit and scope of the invention.
RECEIVED at IPONZ on 11 March 2010

Claims (27)

CLAIMS:
1. A plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species, and are not identical to any recombinase recognition sequences from non-plant species.
2. The plant transformation vector of claim 1, in which the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
3. The plant transformation vector of claim 1, in which the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
4. The plant transformation vector of any preceding claim, comprising a selectable marker sequence flanked by the first and second recombinase recognition sequences.
5. The plant transformation vector of claim 4, in which the selectable marker sequence is derived from plants.
6. The plant transformation vector of any preceding claim, in which the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
7. The plant transformation vector of any preceding claim, comprising a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
8. The plant transformation vector of claim 7, in which the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
9. The plant transformation vector of any preceding claim in which the entire vector is constructed from fewer than 10 polynucleotide sequence fragments derived from plant species.
10. The plant transformation vector of any preceding claim which further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
11. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species.
12. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species which are interfertile.
13. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from the same plant species.
14. A method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using the vector of any one of claims 1 to 13.
15. A method of producing a plant cell or plant with a modified trait, the method comprising the steps of: (a) transforming a plant cell or plant with a vector of any one of claims 1 to 13, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and (b) obtaining a stably transformed plant cell or plant modified for the trait.
16. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant, the method comprising transformation of the plant with the vector of any one of claims 1 to 13.
17. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 13.
18. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 13.
19. The method of any one of claims 14 to 18, in which transformation is vir-mediated.
20. The method of any one of claims 14 to 18, in which transformation is Agrobacterium-mediated.
21. The method of any one of claims 14 to 18 in which transformation involves direct DNA uptake.
22. A plant cell or plant produced by a method of any one of claims 14 to 21.
23. A plant tissue, organ, propagule or progeny of the plant cell or plant of claim 22.
24. The plant transformation vector of any one of claims 1 to 13 substantially as herein described with reference to any example thereof.
25. The method of any one of claims 14 to 21 substantially as herein described with reference to any example thereof.
26. The plant cell or plant of claim 22 substantially as herein described with reference to any example thereof.
27. The plant tissue, organ, propagule or progeny of claim 23 substantially as herein described with reference to any example thereof.
NZ579038A 2005-09-07 2005-09-07 Vectors for transformation NZ579038A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
NZ579038A NZ579038A (en) 2005-09-07 2005-09-07 Vectors for transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
NZ579038A NZ579038A (en) 2005-09-07 2005-09-07 Vectors for transformation

Publications (1)

Publication Number Publication Date
NZ579038A true NZ579038A (en) 2010-04-30

Family

ID=42123100

Family Applications (1)

Application Number Title Priority Date Filing Date
NZ579038A NZ579038A (en) 2005-09-07 2005-09-07 Vectors for transformation

Country Status (1)

Country Link
NZ (1) NZ579038A (en)

Legal Events

Date Code Title Description
PSEA Patent sealed
RENW Renewal (renewal fees accepted)
RENW Renewal (renewal fees accepted)
ERR Error or correction

Free format text: THE OWNER HAS BEEN CORRECTED TO 3047947, THE NEW ZEALAND INSTITUTE FOR PLANT AND FOOD RESEARCH LIMITED, PRIVATE BAG 92169, VICTORIA STREET WEST, AUCKLAND 1142, NZ

Effective date: 20150723

LAPS Patent lapsed