AU2005252598B2 - Transformation vectors - Google Patents



Publication number
AU2005252598B2
AU2005252598B2 AU2005252598A AU2005252598A AU2005252598B2 AU 2005252598 B2 AU2005252598 B2 AU 2005252598B2 AU 2005252598 A AU2005252598 A AU 2005252598A AU 2005252598 A AU2005252598 A AU 2005252598A AU 2005252598 B2 AU2005252598 B2 AU 2005252598B2
Authority
AU
Australia
Prior art keywords
plant
sequence
dna
sequences
derived
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2005252598A
Other versions
AU2005252598B8 (en)
AU2005252598A1 (en)
Inventor
Samantha Jane Baldwin
Philippa Jane Barrell
Anthony John Conner
Johanna Maria Elisabeth Jacobs
Annemarie Suzanne Lokerse
Jan-Peter Hendrik Nap
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Zealand Institute for Plant and Food Research Ltd
Original Assignee
Nz Inst For Crop & Food Res
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from NZ533371A
Application filed by Nz Inst For Crop & Food Res
Publication of AU2005252598A1
Assigned to THE NEW ZEALAND INSTITUTE OF PLANT AND FOOD RESEARCH LIMITED (Alteration of Name(s) of Applicant(s) under S113). Assignors: NEW ZEALAND INSTITUTE FOR CROP & FOOD RESEARCH LIMITED
Publication of AU2005252598B2
Priority to AU2010257316A
Application granted
Publication of AU2005252598B8
Current legal status: Ceased
Anticipated expiration


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201: Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8202: Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation by biological means, e.g. cell mediated or natural vector
    • C12N15/8205: Agrobacterium mediated transformation
    • C12N15/8213: Targeted insertion of genes into the plant genome by homologous recombination

Description

WO 2005/121346 PCT/NZ2005/000117
TRANSFORMATION VECTORS
BACKGROUND ART
Over the past 20 years rapid scientific advances in molecular and cell biology have resulted in the development of technology to enable genetic engineering of plants (development of transformed plants, transgenic plants or GMOs). This offers new opportunities for the incorporation of genes into crop plants and represents a new technology platform for the next level of genetic gain in crop breeding.
An option provided by genetic engineering is the ability to extend the germplasm base available for crop improvement to any source of DNA, including that from other plants, microbes or animals. However, this cross-species transformation has raised ethical concerns with the public, especially when associated with food.
As this technology develops further, more genes are being identified from crop species which would be of benefit to agriculture and industry if they were transferred to other genotypes of the same crop, i.e. within-species transformation.
The use of such "within-species transformation" approaches for moving genes between genotypes within the existing gene pools available to plant breeders also has several advantages over traditional breeding:
1. Direct gene transfer to elite plants and cultivars without repeated backcrossing. This allows the efficient development of new plant lines without the many generations of hybridisation and selection usually required to recover the desired plant.
2. The transfer of single discrete genes, without the "linkage drag" associated with the transfer of many undefined and often undesirable neighboring genes in traditional plant breeding.
3. The specific design and development of new gene formulations. This can involve the matching of molecular switches (promoters) with the desired coding regions to target the expression of the new gene at a specific location within a plant.
Alternatively, "reverse genetics" approaches can be used to "knock-out" specific functions in plants. This can be achieved by positioning the coding region of a gene in the reverse orientation relative to the promoter responsible for "turning the gene on", or by arranging components of the coding region in an inverted repeat under control of the promoter.
In addition, moving genes between plants of the same species does not raise the same ethical concerns as cross-species transformation.
The application of genetic engineering requires the use of vectors for either Agrobacterium-mediated transformation or direct DNA uptake into plant cells. Agrobacterium-mediated transformation is the preferred method and requires the construction of modified T-DNA (transferred-DNA) on a vector (usually a binary vector).
However, the transformation requires the use of vector systems based on DNA sequences from other species (e.g. the T-DNA border regions, the DNA region into which target genes are inserted, selectable marker genes and sequences allowing such vectors to replicate in additional host systems); sequences that have usually been derived from bacterial systems.
The minimum requirement of a vector to perform Agrobacterium-mediated plant transformation is at least one T-DNA border region, although in practice transformation vector systems include other vector sequences as described above. Two T-DNA border regions are usually used, flanking the sequence of interest to be integrated into the plant genome. However, in most instances such border sequences or parts thereof also become integrated into the genome of the transformed plant.
T-DNA sequences have been identified as naturally occurring in the genomes of plants (White et al 1983, Nature 301: 348-350; Furner et al 1986, Nature 319: 422-427; Aoki et al 1994, Molecular and General Genetics 243: 706-710; Suzuki et al 2002, Plant Journal 32: 775-787).
Plant transformation vectors in which the Agrobacterium borders are replaced with plant-derived T-DNA border-like sequences have also been reported (WO 03/069980). If the T-DNA border-like sequences are chosen from a plant of the species to be transformed, this allows for the possibility of production of plants transformed with only their own DNA.
However, in practice integration is relatively unpredictable and often results in integration of other vector sequences from outside of the T-DNA borders and even transfer of the whole transformation vector, which includes many additional non-plant sequences.
It is an object of the invention to provide improved compositions and methods for plant transformation which reduce or eliminate the transfer of foreign DNA into the plant, or at least provide the public with a useful choice.
SUMMARY OF INVENTION
In one aspect the invention provides a plant transformation vector comprising:
a) a T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising two non-contiguous polynucleotide sequence fragments,
wherein all of the sequences of the T-DNA-like sequence are derived from plant species.
Also possible but less preferred is use of a similar T-DNA border-like sequence containing three or more polynucleotide sequence fragments derived from plant species.
Also disclosed is a plant transformation vector comprising:
a) a T-DNA-like sequence including at least one T-DNA border-like sequence
b) additional plant polynucleotide sequence on one or both sides of the T-DNA-like sequence
in which all of said sequences are derived from plants, preferably from the same plant species.
Advantageously the additional plant polynucleotide sequence is 5' to the left border when two T-DNA border-like sequences are used, or 5' to the single T-DNA border-like sequence when a single T-DNA border-like sequence is used.
Advantageously the said additional plant polynucleotide sequence is at least about 1 bp in length, preferably at least about 5 bp, preferably at least about 10 bp, preferably at least about 50 bp, preferably at least about 100 bp, preferably at least about 200 bp, preferably at least about 500 bp, more preferably at least about 1 kb.
In a preferred embodiment the T-DNA-like sequence includes two T-DNA border-like polynucleotide sequences flanking the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plants, preferably from the same plant species.
In a further embodiment the T-DNA-like sequence further comprises additional base polynucleotide sequence(s), the additional base polynucleotide sequence(s) being derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the T-DNA-like sequence includes first and second recombinase recognition site sequences, wherein all of said sequences are derived from plants, preferably from the same plant species.
In a further embodiment the first recombinase recognition site and the second recombinase recognition site are loxP-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the first recombinase recognition site and the second recombinase recognition site are frt-like sites derived from a plant species, preferably from the same plant species as the T-DNA border-like sequences.
In one embodiment the vector comprises a selectable marker sequence flanked by the first and second recombinase recognition site sequences. Preferably the selectable marker is operably linked to a constitutive promoter sequence. Preferably the selectable marker and/or the constitutive promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a further embodiment the vector comprises a recombinase sequence flanked by the first and second recombinase recognition site sequences. Preferably the recombinase is operably linked to an inducible promoter sequence. Preferably the recombinase and/or inducible promoter sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In preferred vectors, when the recombinase recognition sites are loxP-like sequences, the recombinase sequence is Cre, and when the recombinase recognition sites are frt-like sequences, the recombinase sequence is FLP.
Alternatively a negative selection marker may be flanked by the first and second recombinase recognition site sequences. Preferably the negative selection marker is CodA.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, contain regulatory elements, such as promoters, which may influence the expression of inserted genes of interest.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, contain introns, which may influence the expression of inserted genes of interest.
In a further embodiment neither the T-DNA border-like polynucleotide sequences, nor any base polynucleotide sequence of the T-DNA-like sequence, nor the first or second recombinase recognition site sequences, nor the plant polynucleotide sequence additional to the T-DNA-like sequence, are derived from heterochromatic regions of the genome from which they are derived.
In a further embodiment the polynucleotide encompassing the T-DNA border-like sequences, the base polynucleotide sequence of the T-DNA-like sequence and the plant polynucleotide sequence additional to the T-DNA-like sequence are constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
In a further embodiment the plant transformation vector of the invention further comprises an origin of replication sequence.
Preferably the origin of replication sequence is derived from a plant, preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the T-DNA-like sequence of the plant transformation vector of the invention comprises a selectable marker polynucleotide sequence for selection of a plant cell or plant harbouring the T-DNA-like sequence. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the plant transformation vector of the invention further comprises a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant, more preferably from the same plant species as the T-DNA border-like sequences and/or the base polynucleotide sequence of the T-DNA-like sequence, and/or the sequence additional to the T-DNA-like sequence.
In a further embodiment the selectable marker polynucleotide sequence for selection of a plant harbouring the T-DNA-like sequence also functions in selection of a bacterium harbouring the vector.
In a further embodiment the T-DNA-like sequence further comprises a genetic construct as herein defined. Preferably the genetic construct comprises a promoter polynucleotide sequence operably linked to a polynucleotide sequence of interest and a terminator polynucleotide sequence, wherein all of said polynucleotide sequences are derived from plants, preferably from the same plant species as the T-DNA border-like sequences.
In a preferred embodiment the polynucleotide sequence of the entire vector is derived from plant species, preferably from the same plant species.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GR-3' (wherein R = G or A); and
b) a chimeric T-DNA border-like sequence comprising at its 5' end the nucleotide sequence 5'-GR-3' from a), wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and
b) a chimeric T-DNA border-like sequence comprising at its 5' end the nucleotide sequence 5'-GRC-3' from a), wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
Also disclosed is a plant transformation vector including a T-DNA-like sequence, the T-DNA-like sequence comprising:
a) a polynucleotide sequence of at least about 20 bp in length derived from a plant species, comprising the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and
b) a chimeric T-DNA border-like sequence comprising at its 5' end the nucleotide sequence 5'-GRCA-3' from a), wherein the chimeric border is capable of functioning as a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant.
Advantageously, the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
Advantageously the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences. Alternatively the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences.
Advantageously the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants.
Advantageously, the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 sequence fragments derived from plants.
Advantageously the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
Advantageously, the T-DNA-like sequence includes, 5' to the chimeric T-DNA border-like sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
Advantageously the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
Advantageously the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
Advantageously the T-DNA-like sequence of the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
Advantageously all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from plant species.
More advantageously all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from plant species which are interfertile.
Most advantageously all of the polynucleotide sequence of the plant transformation vector, except for the chimeric T-DNA border-like sequence, is derived from the same plant species.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GR-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GR-3' from a) forms the 5' end of the border sequence.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRC-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRC-3' from a) forms the 5' end of the border sequence.
Also disclosed is a plant transformation vector including a chimeric sequence, the chimeric sequence comprising:
a) at the 5' end a plant-derived sequence of at least 20 bp in length including the nucleotide sequence 5'-GRCA-3' (wherein R = G or A); and
b) at the 3' end a border sequence capable of performing the function of a T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein the nucleotide sequence 5'-GRCA-3' from a) forms the 5' end of the border sequence.
Advantageously the plant-derived sequence of at least 20 bp in length is at least about 50 bp in length, more preferably at least about 100 bp in length, more preferably at least about 200 bp in length, more preferably at least about 500 bp in length, most preferably at least about 1 kb in length.
Advantageously the plant transformation vector includes, 5' to the border sequence, first and second recombinase recognition sequences derived from plant species.
Advantageously the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences.
Alternatively the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences.
Advantageously the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
Advantageously the selectable marker sequence is derived from plants.
Advantageously the polynucleotide of at least 20 bp in length and any recombinase recognition site sequences of the plant transformation vector are constructed from fewer than 10 fragments, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plant species.
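The 5'-GR-3', 5'-GRC-3' and 5'-GRCA-3' motifs recited above use the IUPAC ambiguity code R (G or A). As an illustration only, the following Python sketch shows one way such motifs might be located in a plant-derived sequence. The function names, the example sequences and the reading of the "at least 20 bp" condition (counting from the start of the supplied fragment through the end of the motif) are assumptions made for this sketch and are not taken from the specification.

```python
import re

# IUPAC symbols needed for the motifs in the text; R = G or A (purine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[GA]"}


def motif_positions(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif match."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


def candidate_junctions(plant_seq, motif="GRCA", min_len=20):
    """Hypothetical helper: keep motif positions where the fragment running from
    the start of plant_seq through the end of the motif is at least min_len bp."""
    return [p for p in motif_positions(plant_seq, motif)
            if p + len(motif) >= min_len]


if __name__ == "__main__":
    toy = "GACA" + "A" * 20 + "GGCA"      # made-up sequence, not real plant DNA
    print(motif_positions(toy, "GRCA"))    # all GRCA-type matches
    print(candidate_junctions(toy))        # matches preceded by enough sequence
```

A screen of this kind would only flag candidate positions; whether a given chimeric border actually functions in T-DNA transfer is an experimental question.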
Advantageously the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
Advantageously the plant transformation vector includes, 5' to the border sequence, a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide sequence, wherein the selectable marker sequence is derived from plant species.
Advantageously the plant transformation vector comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from a plant.
Advantageously the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the selectable marker polynucleotide sequence is also capable of functioning in selection of a bacterium harbouring the vector.
Advantageously the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plant species.
Advantageously all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species.
More advantageously all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from plant species which are interfertile.
Yet more advantageously all of the polynucleotide sequence of the plant transformation vector, except for the border sequence, is derived from the same plant species.
Also disclosed is a plant transformation vector comprising a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide, wherein the selectable marker sequence is derived from plant species.
Also disclosed is a plant transformation vector comprising first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
Advantageously the first recombinase recognition sequence and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
Alternatively the first recombinase recognition sequence and the second recombinase recognition sequence are frt-like sequences derived from plant species.
Advantageously the plant transformation vector comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences. Preferably the selectable marker sequence is derived from plants.
Advantageously the plant transformation vector further comprises an origin of replication polynucleotide sequence derived from plant species.
Advantageously the plant transformation vector further comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
Advantageously the selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector is also capable of functioning in selection of a plant cell or plant harbouring the selectable marker polynucleotide.
Also disclosed is a plant transformation vector comprising:
a) an origin of replication polynucleotide sequence, and
b) a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harboring the vector
in which all of said sequences are derived from plant species.
Advantageously the plant transformation vector further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
Advantageously the plant transformation vector is constructed from fewer than 10, preferably fewer than 9, preferably fewer than 8, preferably fewer than 7, preferably fewer than 6, preferably fewer than 5, preferably fewer than 4, preferably fewer than 3, most preferably 2 or 1 polynucleotide sequence fragments derived from plants.
Advantageously the plant transformation vector further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
Advantageously all of the polynucleotide sequence of the plant transformation vector is derived from plant species, more preferably from plant species which are interfertile and most preferably from the same plant species.
Also disclosed is a plant transformation vector comprising a selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the selectable marker sequence is derived from a plant. More preferably the vector also comprises an origin of replication sequence functional in bacteria, preferably in E. coli. Preferably the origin of replication sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Yet more preferably the vector further comprises a genetic construct as herein defined. Preferably the genetic construct sequence is derived from a plant, more preferably from the same plant species as the selectable marker polynucleotide sequence for selection of a bacterium harbouring the vector. Preferably the polynucleotide sequence of the entire vector is derived from plant species, most preferably from the same plant species.
In a further aspect the invention provides a method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using a transformation vector of the invention.
In a preferred embodiment any polynucleotide stably integrated into the plant cell or plant is derived from a plant.
Preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed. Most preferably any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed.
The invention also provides a method of modifying a trait in a plant cell or plant comprising:
(a) transforming a plant cell or plant with a vector of the invention, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait; and
(b) obtaining a stably transformed plant cell or plant modified for the trait.
In one embodiment transformation is vir gene-mediated.
In a further embodiment transformation is Agrobacterium-mediated.
In an alternative embodiment transformation involves direct DNA uptake.
Also disclosed is a method for modifying a plant cell or plant, comprising:
(a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by loxP-like recombinase recognition sites;
(b) selecting a plant cell or plant expressing the selectable marker flanked by loxP-like recombinase recognition sites;
(c) inducing the expression of the Cre gene in the plant cell or plant;
(d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
Also disclosed is a method for modifying a plant cell or plant, comprising:
(a) transforming a plant cell or plant with the vector of the invention comprising a selectable marker flanked by frt-like recombinase recognition sites;
(b) selecting a plant cell or plant expressing the selectable marker flanked by frt-like recombinase recognition sites;
(c) inducing the expression of the FLP gene in the plant cell or plant;
(d) culturing the plant cell or plant for sufficient time to allow excision of the selectable marker.
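The outcome of step (d) in the two methods above can be pictured with a toy string model. The Python sketch below assumes two identical recognition sites in direct orientation, with recombination between them removing the intervening marker and one copy of the site. The function name and the placeholder site and marker strings are hypothetical; they are not real loxP-like or frt-like sequences and the sketch says nothing about recombination efficiency.

```python
def excise_between_sites(t_dna, site):
    """Toy model of site-specific excision.

    Finds the first two copies of `site` in `t_dna` and removes the DNA between
    them together with the second copy, leaving a single site behind.
    Returns the input unchanged if fewer than two copies are present.
    """
    first = t_dna.find(site)
    if first == -1:
        return t_dna
    second = t_dna.find(site, first + len(site))
    if second == -1:
        return t_dna
    # Keep everything up to and including the first site, then resume
    # immediately after the second site.
    return t_dna[: first + len(site)] + t_dna[second + len(site):]


if __name__ == "__main__":
    site = "TTGACC"                          # placeholder recognition site
    construct = "AAA" + site + "CCCCGGGG" + site + "TTT"
    print(excise_between_sites(construct, site))
```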
The invention provides a plant modified by a method of the invention.
In a preferred embodiment the plant cell or plant modified is of the same species as the vector sequence used to modify it.
The invention also provides a plant cell or plant produced by a method of the invention.
In a preferred embodiment the plant cell or plant produced is of the same species as the vector sequence used to produce it.
The invention also provides a plant tissue, organ, propagule or progeny of the plant cell or plant of the invention.
WO 2005/121346 PCT/NZ2005/000117
DETAILED DESCRIPTION
The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length, and includes, as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers, fragments, genetic constructs, vectors and modified polynucleotides.
As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the inventive polypeptides and polynucleotides possess biological activities that are the same or similar to those of the inventive polypeptides or polynucleotides. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 5 nucleotide positions, preferably at least 10 nucleotide positions, preferably at least 20 nucleotide positions, preferably at least 50 nucleotide positions, more preferably at least 100 nucleotide positions, and most preferably over the entire length of a polynucleotide of the invention.
Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.5 [Nov 2002]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq may be utilized.
Polynucleotide sequence identity may also be calculated over the entire length of the overlap between candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/.
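As a rough illustration of the global-alignment approach referenced above, the following Python sketch implements a basic Needleman-Wunsch alignment with a linear gap penalty and reports percent identity over the length of the alignment. The scoring values (match +1, mismatch -1, gap -2) and the function names are assumptions chosen for the example; they are not the parameters of bl2seq, EMBOSS needle or GAP, and the figures it returns may differ from those programs.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings.

    Scoring values are illustrative assumptions, not program defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of alignment columns in which both sequences carry the same base."""
    aligned_a, aligned_b = needleman_wunsch(a.upper(), b.upper())
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("ACGT", "ACT")` aligns the sequences as `ACGT` over `AC-T` and counts three identical columns out of four. Whether gapped columns count toward the denominator is a design choice; here they do.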
The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences online at http://www.ebi.ac.uk/emboss/align/.
Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235.
Use of BLASTN as described above is preferred for use in the determination of sequence identity for polynucleotide variants according to the present invention.
Alternatively, variant polynucleotides of the present invention hybridize to the polynucleotide sequences disclosed herein, or complements thereof, under stringent conditions.
The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30 °C (for example, 10 °C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 48:1390). Typical stringent conditions for polynucleotide molecules of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65 °C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65 °C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65 °C.
With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10 °C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length) °C.
Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation".
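The temperature arithmetic in the hybridization conditions above can be sketched in a few lines of Python. The sketch assumes the commonly cited form of the Bolton and McCarthy relationship, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C); the 16.6 coefficient and the function names are assumptions taken from that common form rather than from the text above, so the numbers are indicative only.

```python
import math


def tm_long_duplex(gc_percent, na_molar):
    """Approximate Tm in degrees C for duplexes longer than about 100 bases.

    Assumes the commonly cited form 81.5 + 16.6*log10[Na+] + 0.41*(%G+C);
    the 16.6 coefficient is an assumption, not a value stated in the text.
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent


def short_duplex_reduction(length):
    """Approximate Tm reduction in degrees C for molecules under 100 bases,
    following the (500 / oligonucleotide length) rule of thumb given above."""
    return 500.0 / length


def hybridization_temperature(tm, degrees_below_tm):
    """Temperature obtained by working a chosen number of degrees below Tm."""
    return tm - degrees_below_tm
```

Under these assumptions, a 50% G+C duplex in 1 M Na+ gives `tm_long_duplex(50.0, 1.0)` of about 102 °C, and hybridizing 10 °C below that corresponds to roughly 92 °C; a 20-base oligonucleotide would have its Tm reduced by about 25 °C.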
Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
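The idea of a silent variation can be made concrete with a short Python sketch that translates two coding sequences with the standard genetic code and checks whether they encode the same polypeptide. The function names are hypothetical, and the sketch assumes in-frame DNA sequences written with the letters A, C, G and T only.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def synonymous_codons(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent_variant(seq_a, seq_b):
    """True if the sequences differ at the DNA level but encode the same protein."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)
```

Here `synonymous_codons("M")` and `synonymous_codons("W")` each return a single codon (ATG and TGG), which is why those two codons are the exceptions noted above, while `is_silent_variant("ATGGCTTGG", "ATGGCATGG")` is true because GCT and GCA both encode alanine.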
Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.5 [Nov 2002]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.
A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is at least 5 nucleotides in length. The fragments of the invention comprise at least 5 nucleotides, preferably at least 10 nucleotides, preferably at least 15 nucleotides, preferably at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 nucleotides of contiguous nucleotides of a polynucleotide of the invention.
The term "primer" refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the target.
The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence, that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein.
The term "polypeptide", as used herein, encompasses amino acid chains of any length, including full-length proteins, in which amino acid residues are linked by covalent peptide bonds. Polypeptides of the present invention may be purified natural products, or may be produced partially or wholly using recombinant or synthetic techniques. The term may refer to a polypeptide, an aggregate of a polypeptide such as a dimer or other multimer, a fusion polypeptide, a polypeptide fragment, a polypeptide variant, or derivative thereof.
The term "isolated" as applied to the polynucleotide sequences disclosed herein is used to refer to sequences that are removed from their natural cellular environment. An isolated molecule may be obtained by any method or combination of methods including biochemical, recombinant, and synthetic techniques.
The term "genetic construct" refers to a polynucleotide molecule, usually double-stranded DNA, which may have inserted into it another polynucleotide molecule (the insert polynucleotide molecule) such as, but not limited to, a cDNA molecule. A genetic construct may contain the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. The insert polynucleotide molecule may be derived from the host cell, or may be derived from a different cell or organism and/or may be a recombinant or synthetic polynucleotide. Once inside the host cell the genetic construct may become integrated in the host chromosomal DNA. The term "genetic construct" includes "expression construct" as herein defined. The genetic construct may be linked to a vector.
The term "expression construct" refers to a genetic construct that includes the necessary elements that permit transcribing the insert polynucleotide molecule, and, optionally, translating the transcript into a polypeptide. An expression construct typically comprises in a 5' to 3' direction:
a) a promoter functional in the host cell into which the construct will be transformed,
b) the polynucleotide to be transcribed and/or expressed, and
c) a terminator functional in the host cell into which the construct will be transformed.
The term "vector" refers to a polynucleotide molecule, usually double stranded DNA, which may include a genetic construct and be used to transport the genetic construct into a host cell.
The vector may be capable of replication in at least one additional host system, such as Escherichia coli or Agrobacterium tumefaciens.
The term "coding region" or "open reading frame" (ORF) refers to the sense strand of a genomic DNA sequence or a cDNA sequence that is capable of producing a transcription product and/or a polypeptide under the control of appropriate regulatory sequences. The coding sequence is identified by the presence of a 5' translation start codon and a 3' translation stop codon. When inserted into a genetic construct, a "coding sequence" is capable of being expressed when it is operably linked to promoter and terminator sequences.
"Operably-linked" means that the sequence to be expressed is placed under the control of regulatory elements that include promoters, tissue-specific regulatory elements, temporal regulatory elements, chemical-inducible regulatory elements, environment-inducible regulatory elements, enhancers, repressors and terminators.
The term "noncoding region" refers to untranslated sequences that are upstream of the translational start site and downstream of the translational stop site. These sequences are also referred to respectively as the 5' UTR and the 3' UTR. These regions include elements required for transcription initiation and termination and for regulation of translation efficiency.
Terminators are sequences that terminate transcription and are found in the 3' untranslated ends of genes downstream of the translated sequence. Terminators are important determinants of mRNA stability and in some cases have been found to have spatial regulatory functions.
The term "promoter" refers to nontranscribed cis-regulatory elements upstream of the coding region that regulate gene transcription. Promoters comprise cis-initiator elements which specify the transcription initiation site and conserved boxes such as the TATA box, and motifs that are bound by transcription factors.
A "transformed plant" refers to a plant which contains new genetic material as a result of genetic manipulation or transformation. The new genetic material may be derived from a plant of the same species or from a different species, in which case it can also be known as a "transgenic plant".
An "inverted repeat" is a sequence that is repeated, where the second half of the repeat is in the complementary strand, e.g.,
(5')GATCTA.......TAGATC(3')
(3')CTAGAT.......ATCTAG(5')
Read-through transcription will produce a transcript that undergoes complementary base pairing to form a hairpin structure provided that there is a 3-5 bp spacer between the repeated regions.
The terms "to alter expression of" and "altered expression" of a polynucleotide or polypeptide of the invention, are intended to encompass the situation where genomic DNA corresponding to a polynucleotide of the invention is modified thus leading to altered expression of a polynucleotide or polypeptide of the invention. Modification of the genomic DNA may be through genetic transformation or other methods known in the art for inducing mutations. The "altered expression" can be related to an increase or decrease in the amount of messenger RNA and/or polypeptide produced and may also result in altered activity of a polypeptide due to alterations in the sequence of a polynucleotide and polypeptide produced.
The term "loxP-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of a Cre recombinase recognition site. The loxP-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
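The inverted repeat defined above can be checked mechanically: the right-hand arm must be the reverse complement of the left-hand arm, with a spacer in between. The Python sketch below is a minimal illustration; the function names are hypothetical, and the `min_spacer` default of 3 simply reflects the lower end of the 3-5 bp spacer mentioned above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def split_inverted_repeat(seq, arm_len):
    """Return (left arm, spacer, right arm) if the outer arm_len bases of seq
    form an inverted repeat, otherwise None."""
    seq = seq.upper()
    if arm_len < 1 or len(seq) < 2 * arm_len:
        return None
    left, right = seq[:arm_len], seq[-arm_len:]
    if reverse_complement(left) != right:
        return None
    return left, seq[arm_len:len(seq) - arm_len], right


def can_form_hairpin(seq, arm_len, min_spacer=3):
    """True if seq is an inverted repeat whose spacer is at least min_spacer long."""
    parts = split_inverted_repeat(seq, arm_len)
    return parts is not None and len(parts[1]) >= min_spacer
```

With the arms from the example above, `split_inverted_repeat("GATCTAAACCTAGATC", 6)` returns the arms `GATCTA` and `TAGATC` separated by the spacer `AACC`. The sketch only tests sequence complementarity and says nothing about the stability of the resulting structure.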
A loxP-like sequence is between 24-100 bp in length, preferably 24-80 bp in length, preferably 24-70 bp in length, preferably 24-60 bp in length, preferably 24-50 bp in length, preferably 24-40 bp in length, preferably 24-34 bp in length, preferably 26-34 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
A loxP-like sequence preferably comprises the consensus motif
5' ATAACTTCGTATANNNNNNNNTATACGAAGTTAT 3'
(where N = any nucleotide), or similar sequences.
The term "frt-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an FLP recombinase recognition site. The frt-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two sequences derived from the genome of a plant.
An frt-like sequence is between 28-100 bp in length, preferably 28-80 bp in length, preferably 28-70 bp in length, preferably 28-60 bp in length, preferably 28-50 bp in length, preferably 28-40 bp in length, preferably 28-34 bp in length, preferably 30-34 bp in length, preferably 32-34 bp in length, preferably 34 bp in length.
An frt-like sequence preferably comprises the consensus motif
5' GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC 3'
(where W = A or T, N = any nucleotide).
The term "T-DNA border-like sequence" refers to a sequence derived from the genome of a plant which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant. The T-DNA border-like sequence may be comprised of one contiguous sequence derived from the genome of a plant or may be formed by combining two or more sequences derived from the genome of a plant.
A T-DNA border-like sequence is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A T-DNA border-like sequence preferably comprises the consensus motif:
5'GRCAGGATATATNNNNNKSTMAWN3'
(where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
The T-DNA border-like sequence of the invention is preferably at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to any Agrobacterium-derived T-DNA border sequence.
Although not preferred, a T-DNA border-like sequence of the invention may include a sequence naturally occurring in a plant which is modified or mutated to change the efficiency at which it is capable of integrating a linked polynucleotide sequence into the genome of a plant.
The term "T-DNA-like sequence" refers to a sequence derived from a plant genome which includes at one or both ends a T-DNA border-like sequence, or a chimeric T-DNA-border-like sequence as herein defined. A T-DNA-like sequence may include additional base sequence between the T-DNA border-like sequences, or to one side of a T-DNA border-like sequence. The base sequence of the T-DNA-like sequences of the invention preferably includes restriction sites or alternative cloning sites to facilitate insertion of further polynucleotide sequences.
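The consensus motifs given in this description for loxP-like, frt-like and T-DNA border-like sequences use degenerate nucleotide codes (R, K, S, M, W, N), so a candidate sequence can be screened by expanding each code into a character class. The Python sketch below shows one way this might be done with the standard `re` module; the function names are hypothetical, only the forward strand is searched, and the example sequences in the usage note are the commonly cited wild-type P1 loxP site (an assumption, not quoted from this description) and the chimeric right border described elsewhere in this document.

```python
import re

# Degenerate nucleotide codes used by the consensus motifs in this description
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "K": "[GT]", "M": "[AC]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}

LOXP_LIKE = "ATAACTTCGTATANNNNNNNNTATACGAAGTTAT"
FRT_LIKE = "GAAGTTCCTATACNNNNNNNNGWATAGGAACTTC"
BORDER_LIKE = "GRCAGGATATATNNNNNKSTMAWN"


def motif_to_regex(motif):
    """Compile a degenerate motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[code] for code in motif.upper())
    # The lookahead lets finditer report matches that overlap one another
    return re.compile("(?=(" + body + "))")


def find_motif(motif, seq):
    """Return (start, matched sequence) for every forward-strand match."""
    pattern = motif_to_regex(motif)
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

For instance, `find_motif(BORDER_LIKE, "GACAGGATATATTGGCGGGTAAAC")` reports a single match at position 0, and `find_motif(LOXP_LIKE, "ATAACTTCGTATAGCATACATTATACGAAGTTAT")` likewise matches the full 34 bp site. A match shows only that a sequence fits the motif, not that it functions as a recombination site or border.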
The term "chimeric T-DNA border-like sequence" refers to a sequence which can perform the function of an Agrobacterium T-DNA border sequence in integration of a polynucleotide sequence into the genome of a plant, wherein part of the sequence is derived from a plant and part of the sequence is derived from another source, such as Agrobacterium.
Upon plant transformation, it is well understood that T-DNA integration from the right border is very precise. Molecular cloning and sequencing across T-DNA/plant genomic DNA junctions has repeatedly established that T-DNA integration at the right border is highly conserved, with only the first few nucleotides of the right border being integrated into plant genomes (Gheysen, G., Angenon, G., van Montagu, M., Agrobacterium-mediated plant transformation: a scientifically intriguing story with significant applications, pp. 1-33, in Transgenic Plant Research, editor Lindsey, K., Harwood Academic Publishers, Amsterdam, 1998).
For this reason, when deriving a chimeric border for use as a "right border" in intragenic transformation, it is only necessary for the first few nucleotides (up to four nucleotides) to be of plant origin; i.e. 5'GRCA... 3'. The remaining DNA sequence of such right borders can be authentic sequences from Agrobacterium T-DNA borders.
It will be well understood by those skilled in the art that a DNA sequence of 5'GRCA3' will occur frequently in any genome. It is expected to be found at random once in every 256 nucleotides and is likely to be found on any other fragment useful for the construction of vectors for plant transformation.
The term "border sequence" refers to a sequence derived from a plant which can perform the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
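The frequency estimate above can be worked through under the simplifying assumption that the four bases are independent and equally frequent, which real genomes only approximate. The Python sketch below, with hypothetical function names, computes the expected spacing between chance occurrences of a motif on one strand. Under that model a fully specified tetramer such as 5'GACA3' is expected about once in every 256 positions, consistent with the figure above, while the degenerate 5'GRCA3', which accepts either G or A at the second position, would be expected about twice as often.

```python
from fractions import Fraction

# Number of bases each degenerate code can stand for
CODE_SIZES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "K": 2, "M": 2, "S": 2, "W": 2,
    "N": 4,
}


def match_probability(motif):
    """Chance that a random position matches the motif, assuming independent
    and equally frequent bases (a simplifying assumption)."""
    probability = Fraction(1)
    for code in motif.upper():
        probability *= Fraction(CODE_SIZES[code], 4)
    return probability


def expected_spacing(motif):
    """Average number of positions per chance occurrence on a single strand."""
    return 1 / match_probability(motif)
```

`expected_spacing("GACA")` evaluates to 256 and `expected_spacing("GRCA")` to 128. Either way the short plant-derived end is common enough that it should be available on most fragments of interest.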
A "border sequence" is between 10-100 bp in length, preferably 10-80 bp in length, preferably 10-70 bp in length, preferably 15-60 bp in length, preferably 15-50 bp in length, preferably 15-40 bp in length, preferably 15-30 bp in length, preferably 20-30 bp in length, preferably 21-30 bp in length, preferably 22-30 bp in length, preferably 23-30 bp in length, preferably 24-30 bp in length, preferably 25-30 bp in length, preferably 26-30 bp in length.
A "border sequence" preferably comprises the consensus motif:
5'GRCAGGATATATNNNNNKSTMAWN3'
(where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).
The term "border sequence" as used herein includes known Agrobacterium borders, including those disclosed herein.
The term "border sequence" also includes modified versions of known Agrobacterium sequences, which have been modified, for example by substitution, addition or deletion, to improve the efficiency at which they are capable of performing the function of an Agrobacterium T-DNA border sequence for integration of a polynucleotide sequence into the genome of a plant.
The terms "origin of replication derived from a plant" or "plant-derived origin of replication" or grammatical equivalents thereof refer to a sequence derived from a plant which can support replication of a vector in which it is included in a bacterium. The "plant-derived origins of replication" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived origins of replication" are composed of two sequence fragments derived from plants.
The plant-derived origin of replication may comprise the consensus motif:
AGALCANIATAAGCCT TALYCAKATAACAGCAJCC
Where R = G or A (Pu), Y = C or T (Py) and W = A or T
Alternatively the plant-derived origin of replication comprises the consensus motif:
AGANICA|ATAAGCCT T AAATAACAGCkCC
Where R = G or A (Pu), Y = C or T (Py) and W = A or T
The terms "selectable marker derived from a plant" or "plant-derived selectable marker" or grammatical equivalents thereof refer to a sequence derived from a plant which can enable selection of a plant cell harbouring the sequence or a sequence to which the selectable marker is linked. The "plant-derived selectable markers" may be composed of one, two or more sequence fragments derived from plants. Preferably the "plant-derived selectable markers" are composed of two sequence fragments derived from plants.
In one embodiment, the plant-derived selectable marker is at least 50%, more preferably at least 55%, more preferably at least 60%, more preferably at least 65%, more preferably at least 70%, more preferably at least 75%, more preferably at least 80%, more preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 99% identical to the sequence of SEQ ID NO: 10.
Alternatively the plant-derived selectable marker is at least 90%, preferably at least 95%, and most preferably 100% identical to SEQ ID NO:39 or SEQ ID NO:40.
Methods for transforming plant cells, plants and portions thereof with polynucleotides are described in Draper et al., 1988, Plant Genetic Transformation and Gene Expression. A Laboratory Manual. Blackwell Sci. Pub. Oxford, p. 365; Potrykus and Spangenberg, 1995, Gene Transfer to Plants. Springer-Verlag, Berlin; and Gelvin et al., 1993, Plant Molecular Biol. Manual. Kluwer Acad. Pub. Dordrecht. A review of transgenic plants, including transformation techniques, is provided in Galun and Breiman, 1997, Transgenic Plants.
Imperial College Press, London.
It will be well understood by those skilled in the art that the intragenic vectors of the invention can function in the place of the binary vectors for Agrobacterium-mediated transformation and as vectors for direct DNA uptake approaches.
The invention provides novel plant-derived loxP-like and frt-like recombinase recognition sequences, novel T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for producing transformed plant cells and plants, and plant cells and plants produced by the methods.
The majority of selectable markers for plant transformation are antibiotic or herbicide resistance genes; their presence in transgenic crop plants has given rise to public concerns about environmental safety.
Regardless of their origin, once a transgenic plant has been established marker genes are no longer required. Moreover it is desirable to remove the promoter and enhancer elements used to drive the expression of the marker genes as these may interfere with the expression of neighboring endogenous genes.
Previously, site-specific recombination systems have been elegantly used to excise precise sequences corresponding to selectable marker constructs in transgenic plants (reviewed by Gilbertson, L., Cre-lox recombination: Cre-ative tools for plant biotechnology, TRENDS in Biotechnology 21(12) 550-555, 2003).
Two such recombination systems are the Escherichia coli bacteriophage P1 Cre/loxP system and the Saccharomyces cerevisiae FLP/frt system, which require only a single-polypeptide recombinase, Cre or FLP, and minimal 34 bp DNA recombination sites, loxP or frt.
When two recombination sites in the same orientation flank integrated DNA such as a selectable marker, recombinase mediates a crossover between these sites effectively excising the intervening DNA.
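The outcome of excision between two sites in the same orientation can be pictured with a toy string-level model in Python. It is only a sketch: the function name is hypothetical, the site used in the usage note is the commonly cited wild-type loxP sequence (an assumption for illustration), and nothing about recombinase binding or reaction efficiency is modelled. Sites are treated as being in the same orientation when the identical sequence occurs twice on the same strand.

```python
def excise_between_sites(seq, site):
    """Remove the DNA between the first two same-orientation copies of site,
    leaving a single copy behind; return seq unchanged if fewer than two
    copies are found on this strand."""
    seq, site = seq.upper(), site.upper()
    first = seq.find(site)
    if first == -1:
        return seq
    second = seq.find(site, first + len(site))
    if second == -1:
        return seq
    # Keep everything up to and including the first site, then resume
    # immediately after the second site
    return seq[:first + len(site)] + seq[second + len(site):]
```

For a construct laid out as left flank, site, marker, site, right flank, the function returns left flank, site, right flank: the marker and one copy of the site are lost and one recombination site remains. A second site present only as its reverse complement is not found by this model, which mirrors the requirement that the sites be in the same orientation for excision.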
The recombinase enzyme can either be located next to the selectable marker gene so that it is in effect auto excised (Mlynarova, L and Nap J-P, A self-excising Cre recombinase allows efficient recombination of multiple ectopic heterospecific lox sites in transgenic tobacco, Transgenic Research, 12: 45-57, 2003), or it can be transiently expressed (Gleave, A.P, Mitra, D.S, Mudge, S.R and Morris, B.A.M. Selectable marker-free transgenic plants without sexual crossing: transient expression of cre recombinase and use of a conditional lethal dominant gene, Plant Molecular Biology, 40: 223-235, 1999).
Following excision only one recombination site remains.
The applicants have identified novel T-DNA border-like sequences from plant genomes and devised improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant cell or plant.
The invention provides T-DNA border-like sequences, T-DNA-like sequences, transformation vectors, methods for transforming plant cells and plants, and the plant cells and plants produced by the methods.
The applicants have also identified novel plant-derived loxP-like and frt-like recombinase recognition sequences from plant genomes and devised further improved methods for transformation which minimise or eliminate transfer of foreign DNA to the transformed plant.
It will be understood by those skilled in the art that corresponding recombinase sequences can be expressed in plants in order to facilitate recombination of the loxP-like and frt-like recombinase recognition sequences of the invention.
The invention provides methods which allow for within-species or "intragenic" as opposed to transgenic transformation of plants. Vectors useful for this approach can therefore be described as intragenic vectors. The invention provides such intragenic vectors and methods of using them to produce intragenic transformed plants without any foreign DNA.
It will be understood by those skilled in the art that DNA sequences used to construct such "intragenic vectors" are preferentially derived from DNA sequences (ESTs or cDNAs) known to be expressed in plant genomes. In this manner sequences derived from heterochromatic regions, promoters or introns can be avoided. The use of such sequences for the construction of intragenic vectors may influence the subsequent expression of genes of interest following their transfer to plants via intragenic vectors.
The invention provides novel T-DNA border-like sequences from several plant species (as shown in Example 1) formed by combining two to three fragments of genomic DNA, with all fragments being from a single plant species of interest or a closely related species. The common nature of such sequences in plant genomes is shown in Example 1.
The invention further provides isolated T-DNA-like sequences from several plant species as shown in Example 2. The T-DNA-like region sequences in Example 2 include the T-DNA-like sequences flanked (and delineated) by T-DNA border-like sequences (highlighted) and additional sequence on either one or both sides of the T-DNA-like sequence.
Plant-derived selectable marker sequences which are useful for selecting transformed plant cells and plants harbouring a particular T-DNA-like sequence include PPga22 (Zuo et al., Curr Opin Biotechnol. 13: 173-80, 2002), Cki1 (Kakimoto, Science 274: 982-985, 1996), Esr1 (Banno et al., Plant Cell 13: 2609-18, 2001), and dhdps-r1 (Ghislain et al., Plant Journal, 8: 733-743, 1995). It is also possible to use pigmentation markers to visually select transformed plant cells and plants, such as the R and C1 genes (Lloyd et al., Science, 258: 1773-1775, 1992; Bodeau and Walbot, Molecular and General Genetics, 233: 379-387, 1992). A preferred plant-derived selectable marker is the acetohydroxyacid synthase gene as shown in Example 6 and Example 7. Non-plant derived selectable markers are also described herein.
Preferred intragenic vectors of the invention contain a plant-derived selectable marker which functions in selection of bacteria harbouring the marker as described in Example 3 and Example 5.
The preferred intragenic vectors of the invention consist entirely of plant-derived polynucleotide sequence from the species to be transformed, or from closely related species, such as species interfertile with the plant to be transformed, considered to be within the germplasm pool accessible to traditional plant breeding. Such vectors preferably include a plant-derived origin of replication which is functional in bacteria, particularly in Agrobacterium species and preferably also in E. coli. The invention provides plant transformation vectors comprising such sequences. Preferred origin of replication sequences include those shown in Example 4.
The invention provides novel loxP-like and frt-like recombinase recognition sequences from several plant species as shown in Example 9 and Example 10.
Construction of a vector of the invention is described in Example 6 and Example 8. Plant transformation using these vectors is described in Example 7 and Example 8.
Example 6 and Example 7 also illustrate the construction and successful use of a vector with a chimeric T-DNA border-like sequence. In this instance the "right border" is composed of 5'GAC3' from the end of a sequence isolated from Arabidopsis thaliana, with the remainder of the chimeric T-DNA border-like sequence, 5'AGGATATATTGGCGGGTAAAC3', being derived from the binary vector pART27 (see sequence of pTCl in Example 6). Such chimeric T-DNA border-like sequences are preferably used as the right border when two border-like sequences are used to flank the T-DNA-like sequence. When vectors with only one border-like sequence are used, the plant-derived end (e.g. 5'GRC3') of the T-DNA border-like sequence must be contiguous with the plant-derived sequence(s) destined for integration into a plant genome.
The polynucleotide molecules of the invention can be isolated by using a variety of techniques known to those of ordinary skill in the art. By way of example, such polynucleotides can be isolated through use of the polymerase chain reaction (PCR) described in Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser, incorporated herein by reference. The polynucleotides of the invention can be amplified using primers, as defined herein, derived from the polynucleotide sequences of the invention.
Further methods for isolating polynucleotides of the invention include use of all, or portions of, the disclosed polynucleotide sequences as hybridization probes.
The technique of hybridizing labeled polynucleotide probes to polynucleotides immobilized on solid supports, such as nitrocellulose filters or nylon membranes, can be used to screen the genomic or cDNA libraries. Exemplary hybridization and wash conditions are: hybridization for 20 hours at 65°C in 5.0 X SSC, 0.5% sodium dodecyl sulfate, 1 X Denhardt's solution; washing (three washes of twenty minutes each at 55°C) in 1.0 X SSC, 1% (w/v) sodium dodecyl sulfate, and optionally one wash (for twenty minutes) in 0.5 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C. An optional further wash (for twenty minutes) can be conducted under conditions of 0.1 X SSC, 1% (w/v) sodium dodecyl sulfate, at 60°C.
The polynucleotide fragments of the invention may be produced by techniques well-known in the art such as restriction endonuclease digestion and oligonucleotide synthesis.
A partial polynucleotide sequence may be used, in methods well-known in the art, to identify the corresponding further contiguous polynucleotide sequence. Such methods include PCR-based methods, 5'RACE (Frohman MA, 1993, Methods Enzymol. 218: 340-56), hybridization-based methods and computer/database-based methods. Further, by way of example, inverse PCR permits acquisition of unknown sequences, flanking the polynucleotide sequences disclosed herein, starting with primers based on a known region (Triglia et al., 1988, Nucleic Acids Res 16, 8186, incorporated herein by reference). The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. Divergent primers are designed from the known region. In order to physically assemble full-length clones, standard molecular biology approaches can be utilized (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).
It will be understood by those skilled in the art that in order to produce intragenic vectors for further species it may be necessary to identify the sequences corresponding to essential or preferred elements of such vectors in other plant species. It will be appreciated by those skilled in the art that this may be achieved by identifying polynucleotide variants of the sequences disclosed. Many methods are known by those skilled in the art for isolating such variant sequences.
Variant polynucleotides may be identified using PCR-based methods (Mullis et al., Eds. 1994 The Polymerase Chain Reaction, Birkhauser).
Typically, the polynucleotide sequence of a primer, useful to amplify variants of polynucleotide molecules of the invention by PCR, may be based on a sequence encoding a conserved region of the corresponding amino acid sequence.
Further methods for identifying variant polynucleotides of the invention include use of all, or portions of, the polynucleotides disclosed herein as hybridization probes to screen plant genomic or cDNA libraries as described above. Typically probes based on a sequence encoding a conserved region of the corresponding amino acid sequence may be used.
Hybridisation conditions may also be less stringent than those used when screening for sequences identical to the probe.
The variant polynucleotide sequences of the invention may also be identified by computer-based methods well-known to those skilled in the art, using public domain sequence alignment algorithms and sequence similarity search tools to search sequence databases (public domain databases include Genbank, EMBL, Swiss-Prot, PIR and others). See, e.g., Nucleic Acids Res. 29: 1-10 and 11-16, 2001 for examples of online resources. Similarity searches retrieve and align target sequences for comparison with a sequence to be analyzed (i.e., a query sequence). Sequence comparison algorithms use scoring matrices to assign an overall score to each of the alignments.
An exemplary family of programs useful for identifying variants in sequence databases is the BLAST suite of programs (version 2.2.5 [Nov 2002]) including BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX, which are publicly available from (ftp://ftp.ncbi.nih.gov/blast/) or from the National Center for Biotechnology Information (NCBI), National Library of Medicine, Building 38A, Room 8N805, Bethesda, MD 20894 USA. The NCBI server also provides the facility to use the programs to screen a number of publicly available sequence databases. BLASTN compares a nucleotide query sequence against a nucleotide sequence database. BLASTP compares an amino acid query sequence against a protein sequence database. BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database. tBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames. tBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
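The relationship between the five BLAST programs described above and the types of query and database they accept can be summarised in a small lookup table. The following is an illustrative sketch only; it encodes the descriptions given above and does not call any BLAST software. The names `BlastProgram`, `BLAST_PROGRAMS` and `choose_program` are hypothetical.

```python
from typing import NamedTuple


class BlastProgram(NamedTuple):
    query: str                 # "nucleotide" or "protein"
    database: str              # "nucleotide" or "protein"
    query_translated: bool     # query translated in all reading frames
    database_translated: bool  # database translated in all reading frames


# Encodes the program descriptions given in the text above.
BLAST_PROGRAMS = {
    "BLASTN": BlastProgram("nucleotide", "nucleotide", False, False),
    "BLASTP": BlastProgram("protein", "protein", False, False),
    "BLASTX": BlastProgram("nucleotide", "protein", True, False),
    "tBLASTN": BlastProgram("protein", "nucleotide", False, True),
    "tBLASTX": BlastProgram("nucleotide", "nucleotide", True, True),
}


def choose_program(query: str, database: str, compare_as_protein: bool = False) -> str:
    """Pick the program matching the query and database types.

    For a nucleotide query against a nucleotide database there are two
    candidates; compare_as_protein selects tBLASTX over BLASTN.
    """
    for name, prog in BLAST_PROGRAMS.items():
        if prog.query != query or prog.database != database:
            continue
        uses_translation = prog.query_translated or prog.database_translated
        if query == database == "nucleotide" and uses_translation != compare_as_protein:
            continue
        return name
    raise ValueError(f"no BLAST program for {query!r} query vs {database!r} database")


print(choose_program("nucleotide", "protein"))
```

A table of this kind may be convenient when a screen for variants has to switch between nucleotide-level and protein-level comparison.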
The BLAST programs may be used with default parameters or the parameters may be altered as required to refine the screen.
The use of the BLAST family of algorithms, including BLASTN, BLASTP, and BLASTX, is described in the publication of Altschul et al., Nucleic Acids Res. 25: 3389-3402, 1997.
The "hits" to one or more database sequences by a queried sequence produced by BLASTN, BLASTP, BLASTX, tBLASTN, tBLASTX, or a similar algorithm, align and identify similar portions of sequences. The hits are arranged in order of the degree of similarity and the length of sequence overlap. Hits to a database sequence generally represent an overlap over only a fraction of the sequence length of the queried sequence.
The BLASTN, BLASTP, BLASTX, tBLASTN and tBLASTX algorithms also produce "Expect" values for alignments. The Expect value (E) indicates the number of hits one can "expect" to see by chance when searching a database of the same size containing random contiguous sequences. The Expect value is used as a significance threshold for determining whether the hit to a database indicates true similarity. For example, an E value of 0.1 assigned to a polynucleotide hit is interpreted as meaning that in a database of the size of the database screened, one might expect to see 0.1 matches over the aligned portion of the sequence with a similar score simply by chance. For sequences having an E value of 0.01 or less over aligned and matched portions, the probability of finding a match by chance in that database is 1% or less using the BLASTN, BLASTP, BLASTX, tBLASTN or tBLASTX algorithm.
To identify the polynucleotide variants most likely to be functional equivalents of the disclosed sequences, several further computer-based approaches are known to those skilled in the art.
Multiple sequence alignments of a group of related sequences can be carried out with CLUSTALW (Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680, http://www-iabme.u-strasbg.fr/BioInfo/ClustalW/Top.htm) or T-COFFEE (Cedric Notredame, Desmond G. Higgins, Jaap Heringa, T-Coffee: A novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. (2000) 302: 205-217) or PILEUP, which uses progressive, pairwise alignments (Feng and Doolittle, 1987, J. Mol. Evol. 25, 351).
Pattern recognition software applications are available for finding motifs or signature sequences. For example, MEME (Multiple EM for Motif Elicitation) finds motifs and signature sequences in a set of sequences, and MAST (Motif Alignment and Search Tool) uses these motifs to identify similar or the same motifs in query sequences. The MAST results are provided as a series of alignments with appropriate statistical data and a visual overview of the motifs found. MEME and MAST were developed at the University of California, San Diego.
PROSITE (Bairoch and Bucher, 1994, Nucleic Acids Res. 22, 3583; Hofmann et al., 1999, Nucleic Acids Res. 27, 215) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (www.expasy.org/prosite) contains biologically significant patterns and profiles and is designed so that it can be used with appropriate computational tools to assign a new sequence to a known family of proteins or to determine which known domain(s) are present in the sequence (Falquet et al., 2002, Nucleic Acids Res. 30, 235). Prosearch is a tool that can search SWISS-PROT and EMBL databases with a given sequence pattern or signature.
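The Expect value discussed above in connection with the BLAST algorithms can be related to a probability. The following minimal sketch assumes the usual statistical model for BLAST hits, in which the number of chance hits scoring at least as well as the observed hit follows a Poisson distribution with mean E, so that the probability of at least one such chance hit is 1 - exp(-E). This model is an assumption of the sketch and is not stated in the present specification; the function name is hypothetical.

```python
import math


def chance_hit_probability(e_value: float) -> float:
    """Probability of seeing at least one chance hit, given Expect value E.

    Assumes a Poisson model for chance hits: P = 1 - exp(-E).
    -expm1(-E) is used because it stays accurate for very small E.
    """
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    return -math.expm1(-e_value)


for e in (0.1, 0.01, 1e-5):
    print(f"E = {e:g}: P(chance hit) = {chance_hit_probability(e):.6g}")
```

For small E the probability is very close to E itself, which is consistent with the statement above that an E value of 0.01 or less corresponds to a chance probability of 1% or less.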
The function of a variant of a polynucleotide of the invention may be assessed by replacing the corresponding sequence in an intragenic vector with the variant sequence and testing the functionality of the vector in a host bacterial cell or in a plant transformation procedure as herein defined.
Methods for assembling and manipulating genetic constructs and vectors are well known in the art and are described generally in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987; Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing, 1987.
Numerous traits in plants may also be altered through methods of the invention. Such methods may involve the transformation of plant cells and plants, using a vector of the invention including a genetic construct designed to alter expression of a polynucleotide or polypeptide which modulates such a trait in plant cells and plants. Such methods also include the transformation of plant cells and plants with a combination of the construct of the invention and one or more other constructs designed to alter expression of one or more polynucleotides or polypeptides which modulate such traits in such plant cells and plants.
A number of plant transformation strategies are available (e.g. Birch, 1997, Ann Rev Plant Phys Plant Mol Biol, 48, 297). For example, strategies may be designed to increase expression of a polynucleotide/polypeptide in a plant cell, organ and/or at a particular developmental stage where/when it is normally expressed or to ectopically express a polynucleotide/polypeptide in a cell, tissue, organ and/or at a particular developmental stage which/when it is not normally expressed. The expressed polynucleotide/polypeptide may be derived from the plant species to be transformed or may be derived from a different plant species.
Transformation strategies may be designed to reduce expression of a polynucleotide/polypeptide in a plant cell, tissue, organ or at a particular developmental stage which/when it is normally expressed. Such strategies are known as gene silencing strategies.
Direct gene transfer involves the uptake of naked DNA by cells and its subsequent integration into the genome (Conner, A.J. and Meredith, C.P., Genetic manipulation of plant cells, pp. 653-688, in The Biochemistry of Plants: A Comprehensive Treatise, Vol 15, Molecular Biology, editor Marcus, A., Academic Press, San Diego, 1989; Petolino, J. Direct DNA delivery into intact cells and tissues, pp. 137-143, in Transgenic Plants and Crops, editors Khachatourians et al., Marcel Dekker, New York, 2002). The cells can include those of intact plants, pollen, seeds, intact plant organs, in vitro cultures of plants, plant parts, tissues and cells or isolated protoplasts. Those skilled in the art will understand that methods to effect direct DNA transfer may involve, but are not limited to: passive uptake; the use of electroporation; treatments with polyethylene glycol and related chemicals and their adjuncts; electrophoresis; cell fusion with liposomes or spheroplasts; microinjection; silicon carbide whiskers; and microparticle bombardment.
Genetic constructs for expression of genes in transgenic plants typically include promoters for driving the expression of one or more cloned polynucleotides, terminators and selectable marker sequences to detect presence of the genetic construct in the transformed plant.
The promoters suitable for use in the constructs of this invention are functional in a cell, tissue or organ of a monocot or dicot plant and include cell-, tissue- and organ-specific promoters, cell cycle specific promoters, temporal promoters, inducible promoters, constitutive promoters that are active in most plant tissues, and recombinant promoters.
Choice of promoter will depend upon the desired temporal and spatial expression of the cloned polynucleotide. The promoters may be those normally associated with a transgene of interest, or promoters which are derived from genes of other plants, viruses, and plant pathogenic bacteria and fungi. Those skilled in the art will, without undue experimentation, be able to select promoters that are suitable for use in modifying and modulating plant traits using genetic constructs comprising the polynucleotide sequences of the invention. Examples of constitutive promoters used in plants include the CaMV 35S promoter, the nopaline synthase promoter and the octopine synthase promoter, and the Ubi 1 promoter from maize. Plant promoters which are active in specific tissues, respond to internal developmental signals or external abiotic or biotic stresses are also described in the scientific literature. Exemplary promoters are described, e.g., in WO 02/00894, which is herein incorporated by reference.
Exemplary terminators that are commonly used in plant transformation genetic constructs include, e.g., the cauliflower mosaic virus (CaMV) 35S terminator, the Agrobacterium tumefaciens nopaline synthase or octopine synthase terminators, the Zea mays zein gene terminator, the Oryza sativa ADP-glucose pyrophosphorylase terminator and the Solanum tuberosum PI-II terminator.
Selectable markers commonly used in plant transformation include the neomycin phosphotransferase II gene (NPT II) which confers kanamycin resistance, the aadA gene, which confers spectinomycin and streptomycin resistance, the phosphinothricin acetyl transferase (bar gene) for Ignite (AgrEvo) and Basta (Hoechst) resistance, and the hygromycin phosphotransferase gene (hpt) for hygromycin resistance.
It will be understood by those skilled in the art that non-plant derived regulatory elements described above may be used in the intragenic vectors of the invention operably linked to selectable markers placed between the recombinase recognition sites.
Gene silencing strategies may be focused on the gene itself or regulatory elements which effect expression of the encoded polypeptide. "Regulatory elements" is used here in the widest possible sense and includes other genes which interact with the gene of interest.
Genetic constructs designed to decrease or silence the expression of a polynucleotide/polypeptide of the invention may include an antisense copy of a polynucleotide of the invention. In such constructs the polynucleotide is placed in an antisense orientation with respect to the promoter and terminator.
An "antisense" polynucleotide is obtained by inverting a polynucleotide or a segment of the polynucleotide so that the transcript produced will be complementary to the mRNA transcript of the gene, e.g.,
5'GATCTA 3' (coding strand) 3'CTAGAT 5' (antisense strand)
3'CUAGAU 5' mRNA 5'GAUCUA 3' antisense RNA
Genetic constructs designed for gene silencing may also include an inverted repeat as herein defined. The preferred approach to achieve this is via RNA-interference strategies using genetic constructs encoding self-complementary "hairpin" RNA (Wesley et al., 2001, Plant Journal, 27: 581-590).
The transcript formed may undergo complementary base pairing to form a hairpin structure. Usually a spacer of at least 3-5 bp between the repeated region is required to allow hairpin formation.
Another silencing approach involves the use of a small antisense RNA targeted to the transcript equivalent to an miRNA (Llave et al., 2002, Science 297, 2053). Use of such small antisense RNA corresponding to a polynucleotide of the invention is expressly contemplated.
The term genetic construct as used herein also includes small antisense RNAs and other such polynucleotides effecting gene silencing.
Transformation with an expression construct, as herein defined, may also result in gene silencing through a process known as sense suppression (e.g. Napoli et al., 1990, Plant Cell 2, 279; de Carvalho Niebel et al., 1995, Plant Cell, 7, 347). In some cases sense suppression may involve over-expression of the whole or a partial coding sequence but may also involve expression of a non-coding region of the gene, such as an intron or a 5' or 3' untranslated region (UTR). Chimeric partial sense constructs can be used to coordinately silence multiple genes (Abbott et al., 2002, Plant Physiol. 128(3): 844-53; Jones et al., 1998, Planta 204: 499-505).
The use of such sense suppression strategies to silence the expression of a polynucleotide of the invention is also contemplated.
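The complementarity underlying the antisense and inverted-repeat ("hairpin") constructs described above can be illustrated with a short sketch. It assumes DNA sequences written 5' to 3' using only the letters A, C, G and T; the spacer sequence and the function names are hypothetical and chosen for illustration only.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order
    (i.e. the partner strand written 3' to 5')."""
    return seq.upper().translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """The partner strand written 5' to 3'."""
    return complement(seq)[::-1]


def hairpin_insert(arm: str, spacer: str) -> str:
    """Inverted repeat: arm, then spacer, then the reverse complement of the arm.
    A transcript of this insert can fold back so that the two arms base-pair."""
    return arm.upper() + spacer.upper() + reverse_complement(arm)


coding = "GATCTA"                        # coding strand from the example above
print(complement(coding))                # antisense strand written 3' to 5'
print(reverse_complement(coding))        # the same strand written 5' to 3'
print(hairpin_insert(coding, "ACGTT"))   # hypothetical 5-nucleotide spacer
```

In practice the arms of a hairpin construct are far longer than six nucleotides; the short sequence is used here only so that the pairing can be checked by eye against the example above.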
The polynucleotide inserts in genetic constructs designed for gene silencing may correspond to coding sequence and/or non-coding sequence, such as promoter and/or intron and/or 5' or 3' UTR sequence, or the corresponding gene.
Other gene silencing strategies include dominant negative approaches and the use of ribozyme constructs (McIntyre, 1996, Transgenic Res, 5, 257).
Pre-transcriptional silencing may be brought about through mutation of the gene itself or its regulatory elements. Such mutations may include point mutations, frameshifts, insertions, deletions and substitutions.
The following are representative publications disclosing genetic transformation protocols that can be used to genetically transform the following plant species: onions (WO 00/44919); peas (Grant et al., 1995, Plant Cell Rep., 15, 254-258; Grant et al., 1998, Plant Science, 139: 159-164); petunia (Deroles and Gardner, 1988, Plant Molecular Biology, 11: 355-364); Medicago truncatula (Trieu and Harrison, 1996, Plant Cell Rep. 16: 6-11); rice (Alam et al., 1999, Plant Cell Rep. 18, 572); maize (US Patent Serial Nos. 5,177,010 and 5,981,840); wheat (Ortiz et al., 1996, Plant Cell Rep. 15, 877); tomato (US Patent Serial No. 5,159,135); potato (Kumar et al., 1996, Plant J. 9: 821); cassava (Li et al., 1996, Nat. Biotechnology 14, 736); lettuce (Michelmore et al., 1987, Plant Cell Rep. 6, 439); tobacco (Horsch et al., 1985, Science 227, 1229); cotton (US Patent Serial Nos. 5,846,797 and 5,004,863); grasses (US Patent Nos. 5,187,073 and 6,020,539); peppermint (Niu et al., 1998, Plant Cell Rep. 17, 165); citrus plants (Pena et al., 1995, Plant Sci. 104, 183); caraway (Krens et al., 1997, Plant Cell Rep, 17, 39); banana (US Patent Serial No. 5,792,935); soybean (US Patent Nos. 5,416,011; 5,569,834; 5,824,877; 5,563,04455 and 5,968,830); pineapple (US Patent Serial No. 5,952,543); poplar (US Patent No. 4,795,855); monocots in general (US Patent Nos. 5,591,616 and 6,037,522); brassica (US Patent Nos. 5,188,958; 5,463,174 and 5,750,871); and cereals (US Patent No. 6,074,877). It will be understood by those skilled in the art that the above protocols may be adapted, for example, for use with an alternative selectable marker for transformation.
The plant-derived sequences in the vectors of the invention may be derived from any plant species.
In one embodiment the plant-derived sequences in the vectors of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannii x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant-derived sequences in the vectors of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant-derived sequences in the vectors of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant-derived sequences in the vectors of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The plant cells and plants of the invention may be derived from any plant species.
In one embodiment the plant cells and plants of the invention are from gymnosperm species. Preferred gymnosperm genera include Cycas, Pseudotsuga, Pinus and Picea. Preferred gymnosperm species include Cycas rumphii, Pseudotsuga menziesii, Pinus radiata, Pinus taeda, Pinus pinaster, Picea engelmannii x sitchensis, Picea sitchensis and Picea glauca.
In a further embodiment the plant cells and plants of the invention are from bryophyte species. Preferred bryophyte genera include Marchantia, Tortula, Physcomitrella and Ceratodon. Preferred bryophyte species include Marchantia polymorpha, Tortula ruralis, Physcomitrella patens and Ceratodon purpureus.
In a further embodiment the plant cells and plants of the invention are from algae species. Preferred algae genera include Chlamydomonas. Preferred algae species include Chlamydomonas reinhardtii.
In a further embodiment the plant cells and plants of the invention are from angiosperm species. Preferred angiosperm genera include Aegilops, Allium, Amborella, Anopterus, Apium, Arabidopsis, Arachis, Asparagus, Atropa, Avena, Beta, Betula, Brassica, Camellia, Capsicum, Chenopodium, Cicer, Citrus, Citrullus, Coffea, Cucumis, Elaeis, Eschscholzia, Eucalyptus, Fagopyrum, Fragaria, Glycine, Gossypium, Helianthus, Hevea, Hordeum, Humulus, Ipomoea, Lactuca, Limonium, Linum, Lolium, Lotus, Lycopersicon, Lycoris, Malus, Manihot, Medicago, Mesembryanthemum, Musa, Nicotiana, Nuphar, Olea, Oryza, Persea, Petunia, Phaseolus, Pisum, Plumbago, Poncirus, Populus, Prunus, Puccinellia, Pyrus, Quintinia, Raphanus, Saccharum, Schedonorus, Secale, Sesamum, Solanum, Sorghum, Spinacia, Thellungiella, Theobroma, Triticum, Vaccinium, Vitis, Zea and Zinnia.
Preferred angiosperm species include Aegilops speltoides, Allium cepa, Amborella trichopoda, Anopterus macleayanus, Apium graveolens, Arabidopsis thaliana, Arachis hypogaea, Asparagus officinalis, Atropa belladonna, Avena sativa, Beta vulgaris, Brassica napus, Brassica rapa, Brassica oleracea, Capsicum annuum, Capsicum frutescens, Cicer arietinum, Citrullus lanatus, Citrus clementina, Citrus reticulata, Citrus sinensis, Coffea arabica, Coffea canephora, Cucumis sativus, Elaeis guineensis, Eschscholzia californica, Eucalyptus tereticornis, Fagopyrum esculentum, Fragaria x ananassa, Glycine max, Gossypium arboreum, Gossypium hirsutum, Gossypium raimondii, Helianthus annuus, Helianthus argophyllus, Hevea brasiliensis, Hordeum vulgare, Humulus lupulus, Ipomoea batatas, Ipomoea nil, Lactuca sativa, Limonium bicolor, Linum usitatissimum, Lolium multiflorum, Lotus corniculatus, Lycopersicon esculentum, Lycopersicon pennellii, Lycoris longituba, Malus x domestica, Manihot esculenta, Medicago truncatula, Mesembryanthemum crystallinum, Nicotiana benthamiana, Nicotiana tabacum, Nuphar advena, Olea europaea, Oryza sativa, Oryza minuta, Persea americana, Petunia hybrida, Phaseolus coccineus, Phaseolus vulgaris, Pisum sativum, Plumbago zeylanica, Poncirus trifoliata, Populus alba x tremula, Populus tremula x tremuloides, Populus tremula, Populus balsamifera x deltoides, Prunus americana, Prunus armeniaca, Prunus domestica, Prunus dulcis, Prunus persica, Puccinellia tenuiflora, Pyrus communis, Quintinia verdonii, Raphanus sativus, Saccharum officinarum, Schedonorus arundinaceus, Secale cereale, Sesamum indicum, Solanum habrochaites, Solanum nigrum, Solanum tuberosum, Sorghum bicolor, Sorghum propinquum, Spinacia oleracea, Thellungiella halophila, Thellungiella salsuginea, Theobroma cacao, Triticum aestivum, Triticum durum, Triticum monococcum, Vaccinium corymbosum, Vitis vinifera, Zea mays and Zinnia elegans.
Particularly preferred angiosperm genera include Solanum, Petunia and Allium. Particularly preferred angiosperm species include Solanum tuberosum, Petunia hybrida and Allium cepa.
The cells and plants of the invention may be grown in culture, in greenhouses or the field. They may be propagated vegetatively, as well as either selfed or crossed with a different plant strain and the resulting hybrids, with the desired phenotypic characteristics, may be identified. Two or more generations may be grown to ensure that the subject phenotypic characteristics are stably maintained and inherited. Plants resulting from such standard breeding approaches also form an aspect of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows PCR verification of the propagation of plasmid pPOTCOLE2SPEC in E. coli mediated by a potato-derived COLE2-like origin of replication. Lanes 1 and 2 are plasmid preparations restricted with a BamHI/EcoRI double digest from two independent transformation events of pPOTCOLE2SPEC into E. coli DH5 already possessing pBX243; they show 3.9 kb, 2.5 kb, and 1.5 kb fragments, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene respectively. Lane 3 is a plasmid preparation restricted with a BamHI/EcoRI double digest from a culture transformed with only pBX243 and shows 3.9 kb and 1.5 kb fragments, representing the pBX243 backbone and the pBX243 Rep gene. Lane 4 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker.
Figure 2 shows PCR verification of the potato-derived LacO1-like sequences functioning as a plasmid selectable element by operator-repressor titration. Lane 1 is the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker. Lanes 2-6 are plasmid preparations restricted with PstI from five independent transformation events of pBR322POTLACO1 into E. coli strain DH1lacdapD using repressor titration selection; they show the expected 1.3 kb and 3.8 kb fragments. Lane 7 is a plasmid preparation restricted with PstI following transformation of pBR322POTLACO1 into E. coli strain DH5α using ampicillin selection and also shows the expected 1.3 kb and 3.8 kb fragments. Lane 8 is linearised pBR322 visualised as a 4.4 kb fragment.
Figure 3 shows PCR verification of Arabidopsis thaliana 'Columbia' transformed with the intragenic vector pTCAHAS. Lanes 1&2, 3&4 and 5&6 are three A. thaliana lines transformed with the intragenic vector, lanes 1, 3, 5 using primers E+F, lanes 2, 4, 6 using primers G+H; lanes 8&9 are untransformed A. thaliana, lane 8 using primers E+F, lane 9 using primers G+H; lanes 10&11 are no template controls, lane 10 using primers E+F, lane 11 using primers G+H; lanes 12&13 are the intragenic vector pTCAHAS, lane 12 using primers E+F, lane 13 using primers G+H; lanes 7&14 are the 100 bp molecular ruler (170-8206, BioRad Laboratories, USA). Primers E+F amplify an expected 643 bp fragment and primers G+H amplify an expected 149 bp fragment from the T-DNA-like region of pTCAHAS.
Figure 4 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPOTINV. This involved a multiplexed PCR using primers I+J to amplify the 570 bp fragment from the pPOTINV T-DNA-like region and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is the intragenic vector pPOTINV, lane 6 is a no template control.
Figure 5 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 4.
This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lanes 1&7 are the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is the co-transformed hairy root line #74, lane 4 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 5 is Agrobacterium strain A4T, lane 6 is a no template control.
Figure 6 shows PCR verification of potato cultivar 'Iwa' transformed with the intragenic vector pPETINV. This involved PCR using primers O+P to amplify the 447 bp fragment from the pPETINV T-DNA-like region (lanes 2-5) and primers K+L to amplify the 1069 bp product from the endogenous actin gene of potato (lanes 6-8). Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lanes 2 and 6 are the co-transformed hairy root line #24; lanes 3 and 7 are a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV; lane 4 is the intragenic vector pPETINV; lanes 5 and 8 are no template controls.
Figure 7 shows PCR verification of the absence of Agrobacterium DNA in the samples used for PCR analysis in Figure 6. This involved a PCR using primers M+N to amplify the 590 bp of the Agrobacterium virG gene. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California), lane 2 is the co-transformed hairy root line #18, lane 3 is a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV, lane 4 is Agrobacterium strain A4T, lane 5 is a no template control.
Figure 8 illustrates recombination between the POTLOXP sites mediated by Cre recombinase. Plasmid was isolated from E. coli strain 294-Cre transformed with pPOTLOXP2 and restricted with SalI. Expression of Cre recombinase was induced by raising the temperature from 23°C to 37°C. Lane 1 is the 1 kb plus molecular ruler 10787-018 (Invitrogen, Carlsbad, California); lane 2 illustrates the expected 3.0 kb and 2.3 kb SalI fragments of unrecombined pPOTLOXP2 isolated from a culture maintained at 23°C; lanes 3-8 illustrate the 3.0 kb and 1.5 kb SalI fragments expected from Cre-mediated recombination between the POTLOXP sites in six different colonies cultured at 37°C.
Figure 9 illustrates recombination between the POTFRT sites mediated by FLP recombinase. Plasmid was isolated from E. coli strain 294-FLP transformed with pPOTFRT2 and restricted with SalI. Expression of FLP recombinase was induced by raising the temperature from 23 °C to 37 °C. Lanes 1 and 8 are the GeneRuler DNA ladder mix #SM0331 (Fermentas, Hanover, Maryland) size marker; lane 2 illustrates the expected 3.0 kb and 1.4 kb SalI fragments of unrecombined pPOTFRT2 isolated from a culture maintained at 23 °C; lanes 3-7 illustrate the 3.0 kb and 1.4 kb fragments, and the 1.1 kb SalI fragments expected from FLP-mediated recombination between the POTFRT sites, in five different colonies cultured at 37 °C.
EXAMPLES

The invention will now be illustrated with reference to the following non-limiting examples.

Example 1 Identification of T-DNA border-like sequences in many plant species

Agrobacterium T-DNA borders contain the following consensus motif:

5'GRCAGGATATATNNNNNKSTMAWN3'

(where R = G or A, K = T or G, S = G or C, M = C or A, W = A or T and N = any nucleotide).

A search of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches", searching within the EST databases, yielded multiple accession numbers for each of the motifs 5'GACAGGATATAT3' and 5'GGCAGGATATAT3', as shown in Table 1. The search was limited to Viridiplantae and the Expect value was 10000. Searches were also conducted in the EST Database of Japan (http://www.ddbj.nig.ac.jp) using Expect values of 10000 and with the gap tool off.

Table 1. Plant species and DNA accession numbers in which a partial T-DNA border has been identified in EST sequences. All accession numbers were found from searches in the NCBI GenBank EST databases, except for those labelled A, which were identified using the TIGR database, and those labelled B, which were found in the EST Database of Japan.

Note: 1 indicates 5'GRCAGGATAT(A); 2 indicates 5'GRCAGGATA3'; 3 indicates 5'GRCAGGAT3'; 4 indicates 5'GRCAGGA3'; and "+" indicates there are many more accessions than the example(s) listed.
30 Plant group and species 5'GACAGGATATAT3' 5'GGCAGGATATAT3' Dicotyledonous plants Camelliaceae WO 2005/121346 PCT/NZ2005/000117 Camellia (tea) 2 CV013936 ICV066981 Chenopodiaceae Beta vulgaris (beet) BBI096344 BQ586429 Compositae Lactuca sativa (lettuce) 'BU005745, 'BU005333, BBQ851200, BU011977 + 'BU002323 + Helianthus annuus (sunflower) -BU0123919, BU024595, BU018078, BQ976399, BU025229 + BQ976382 + Zinnia elegans AU288531 AU303793 Convolvulaceae Ipomoea batatas (sweet potato) CB330857, CB330537, 'CB329905 CB330346, CN330857 + Cruciferae Arabidopsis thaliana AV783572 CB264522 + Brassica oleracea 2 CV973863 3 CV973 875 Brassica napus CD812561 'CN737684 Cucurbitaceae Cucumis sativus (cucumber) 'CK086108 'CK085499 Ericaceae Vaccinium (blueberry) 2 CV190833 'CF810976 Euphorbiaceae Hevea brasiliensis (rubber) 'CB376888 3 CB376996 Manihot esculenta (cassava) 'CK644474 CK647455, CK646256, CK643162 Fagaceae Betula (birch) 3 CD276790 'CD278538 Lauraceae Persea americana (avocado) 'CK766454, 'CV457849 CK754032, CK749566 Leguminosae Glycine max (soybean) CX705662, CX548491, C0985845, CD398345, CF922194 CD390961 + Lotus corniculatus BP046527 BP085776 + WO 2005/121346 PCT/NZ2005/000117 Medicago truncatula CB891412 + CA921810 + Pisum sativum (pea) 2 CD861031, 2 CD860484 2 CD859175, 2 CD859173 Pisum sativumi (pea) - chloroplast BX05395 Linaceae Linum usitatissimum (linseed, CV478515 'CV478657 flax) Malvaceae Gossypium arboreun (cotton) BQ403352 + BF270004 + Gossypium hirsutum (cotton) 'CA993786 C0491158, A1729959 Gossypium raimondii C0107555 Mesembryanthemaceae Mesembryanthemum crystallinum AW053481 + Moraceae Humulus lupulus (hop) 4 CD527122 Musaceae Musa (banana) 4 CVO 12662 Nymphaeaceae Nuphar advena CK753223 CD467171 Oleraceae Olea europaea (olive) 4 0K087200 Papaveraceae Eschscholzia cahfornica CK755701 Pedaliaceae Sesamum indicum (sesame) BU669955 BU668412 Plumbaginaceae Plumbago zeylanica CB817698 Rosaceae Fragaria x ananassa (strawberry) 'CX309734, C0817569 'C0817444, 
C0381912 Malus x domestica (apple) CN579782 Prunus donestica Prunuspersica 'AJ876058, 'AJ873533 'AJ826265, 'AJ827659 Prunus armeniaca CB819601 CK754032 + WO 2005/121346 PCT/NZ2005/000117 Rubiaceae Coffea arabica (coffee) 'CF589163 'CF589153 Coffea canephora (coffee) Rutaceae Poncirus trifoliate CD574807, CD573690, CD574743 CX672356 + Citrus sinensis CN188154 + CB293973 Citrus (grapefruit) DN960139 'DN798417 Salicaeae Populus tremula CK108404 + Populus alba x tremula CF231132 + Populus tremula x tremuloides BU828279 + Populus sp. (poplar) CX659694, CX659667, DN495050, DN484787, CX658712 CV255709 + Solanaceae Capsicum annuum (pepper) C0911113, C0908671, CA524776, BM06678, Capsicum frutescens (pepper) CA515326 + BM066170 + Lycopersicon esculentum (tomato) AW217429 BM410480 + Lycopersicon penellii AW399741 Nicotiana benthamiana CK297930 + CK286647 + Nicotiana tabacum BL29276 Solanum tuberosum (potato) CN516032 + CK716936 + Sterculiaceae Theobroma cacao (cacao) 'CA797461, 'CA797357, 'CA795783 CA797340 Umbelliferae Apium graveolens (celery) 'BU693260 'CN254199 Vitaceae Vitis vinfera (grape) CD715300, CD714256, CB981221, CX017627, CD009006 CN547415 + Monocotyledonous plants Amaryllidaceae WO 2005/121346 PCT/NZ2005/000117 Lycoris longituba CN447505 CN448914 Gramineae Aegilops speltoides BF292132 Hordeum vulgare (barley) DN184623, DN184566, CA031595, DN182886, DN177451, CK567740 + DN177809, DN177690 + Lolium multiflorum (ryegrass) AU249845 Oryza sativa (rice) - CR278518, CR755701 + CK078411, CK068012, CK056225, CB682490 + Puccinellia tenuiflora CN486642 Saccharum (sugarcane) CF573560 Saccharum officinarum CF576361, CA293676 + CA243368, CA235389, (sugarcane) CA225160 + Sorghum bicolor (sorghum) CN131489, CN135330, CD426230, CD424405 CN124024 Sorghum propinquum BG051239 + Triticum aestivum (bread wheat) CV777378, CV776448, BJ319521, BJ318204, CV764325, CK216594 CK152028 + Triticum durum (pasta wheat) AJ716746 Secale cereale (rye) BF145815 BE586285 Zea mays (corn, maize, 
sweetcorn) DN221450, CF0501 11, DN223506, DN222326, CK370367, CF055801 + DN212047, CF057191 + Liliaceae Allium cepa (onion) CF452050 + CF445160 + Asparagus officinalis (asparagus) ICV290039 CV289577 Palmae Elaeis guineesis (oil palm) 'CN601600 Gymnosperms Cycas rumphii CB090878 Picea glauca C0242116 C0484796, C0474325, C0257134, CK439582 + Picea sitchensis C0223865 C0226349, C0225190 Picea sp. AC0251892, TC8803, TC10688 Pinus pinaster BX784355 + Pinus radiata 3 AA230194 3 AA220891, 4 AA220909 Pinus taeda CF666840 + CF672772+ Bryophytes Ceratodon purpureus AW098060 Physcomitrella patens BJ586625 + BJ585772 Tortula ruralis CN206945 Algae Chlamydomonas reinhardtii BM002336 + B1717507 +

The initial 5'GRCAGGATATAT3' of the T-DNA border-like motif is less likely to be identified in database searches than the shorter sequence 5'KSTMAWN3'. If the entire border sequence is formed using 2 EST sequences as shown in Example 2 of the patent application, then a second BLAST search is undertaken using 5'KSTMAWN3' from known T-DNA border sequences. Such sequences include 5'TGTCATG3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3' and 5'GGTAAAA3', which correspond to the following border sequences: 5'gacaggatatatgttcttgtcatg3' (pRi), 5'gacaggatatattggcgggtaaac3' (pTiT37 and pTiC58), 5'ggcaggatatatcgaggtgtaaaa3' (pTi15955), 5'ggcaggatatattgtggtgtaaac3' (pART27 lb) and 5'gacaggatatattggcgggtaaac3' (pART27 rb).

BLAST searches using these sequences produce multiple matches. For example, just within Solanum tuberosum (a plant whose genome has not been completely sequenced), a search (BLAST "search for short, nearly exact matches", Expect 20000 and descriptions 1000) for only 5'TGTAAAC3' in NCBI GenBank yields 997 exact matches, of which 985 are S. tuberosum ESTs (search performed 2 June 2004).

Alternatively, the sequences defined in Table 1 can be used for the design of "chimeric right borders" for plant-derived T-DNA-like sequences.
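The consensus motif of Example 1 can be read as a pattern over the IUPAC nucleotide codes. The following is a minimal sketch, not the method used in the searches above (which relied on BLAST): it assumes the consensus is simply translated into a regular expression and matched against a sequence on one strand. The function names and the `origin` parameter are illustrative; the test excerpt is nucleotides 301 - 360 of SEQ ID NO:1.

```python
import re

# Standard IUPAC nucleotide codes appearing in the consensus motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "K": "[GT]", "S": "[CG]",
    "M": "[AC]", "W": "[AT]", "N": "[ACGT]",
}

CONSENSUS = "GRCAGGATATATNNNNNKSTMAWN"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # Wrapping the pattern in a lookahead reports overlapping matches too.
    return re.compile("(?=(" + pattern + "))")


def find_borders(sequence, consensus=CONSENSUS, origin=1):
    """Return (1-based position, matched text) for each match on the given strand.

    `origin` is the coordinate of the first base of `sequence` (illustrative
    parameter, useful when scanning an excerpt of a longer sequence).
    """
    regex = consensus_to_regex(consensus)
    return [(origin + m.start(), m.group(1))
            for m in regex.finditer(sequence.upper())]


# Known border sequences quoted in Example 1.
KNOWN_BORDERS = {
    "pRi": "GACAGGATATATGTTCTTGTCATG",
    "pTiT37 and pTiC58": "GACAGGATATATTGGCGGGTAAAC",
    "pTi15955": "GGCAGGATATATCGAGGTGTAAAA",
    "pART27 lb": "GGCAGGATATATTGTGGTGTAAAC",
}

# Nucleotides 301 - 360 of SEQ ID NO:1 (potato T-DNA-like region).
POTATO_EXCERPT = "AGATGTTGACAGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCG"

if __name__ == "__main__":
    for name, border in KNOWN_BORDERS.items():
        print(name, bool(find_borders(border)))
    print(find_borders(POTATO_EXCERPT, origin=301))
```

A regular expression only finds exact consensus matches; the partial matches recorded in Table 1 would need a shortened pattern or an alignment tool.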
Example 2 Identification of T-DNA-like regions from plant genomes

The design of T-DNA-like regions for possible intragenic vectors was undertaken by searching plant EST databases for Agrobacterium border-like sequences. Limiting searches to EST sequences facilitates the design of intragenic vectors because:

1. The base DNA making up the T-DNA-like region (including the T-DNA-like sequence and additional sequence at either one or both sides of the T-DNA-like sequence) does not involve regulatory elements such as promoters that may influence expression of inserted target genes; and
2. The DNA on which the T-DNA-like region is based is not derived from heterochromatic regions (non-coding, non-expressed, condensed DNA), as this may suppress activity of the genes intended for transfer.

BLAST searches were conducted as described by Altschul et al. (Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25: 3389-3402, 1997).

Border sequences used to search the databases:

Sequence motifs used to search the databases were 5'GACAGGATATAT3' or 5'GGCAGGATATAT3', 5'TGTAAAC3', 5'GGTAAAC3', 5'TGTAAAA3', 5'GGTAAAA3'. Other known borders were also used as query sequences, these being:

5'GACAGGATATATGTTCTTGTCATG3' (pRi)
5'GACAGGATATATTGGCGGGTAAAC3' (pTiT37 and pTiC58)
5'GGCAGGATATATCGAGGTGTAAAA3' (pTi15955)
5'GGCAGGATATATTGTGGTGTAAAC3' (pART27 lb)
5'GACAGGATATATTGGCGGGTAAAC3' (pART27 rb)

Potato, Petunia and tomato vectors

NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/ "blastn" and "search for short, nearly exact matches" were used to search the EST database. Expect values of 10000 or 20000 (dependent on word size) were used and the search was limited by Entrez query to potato (Solanum), tomato (Lycopersicon), or Petunia. All Petunia EST sequences from the NCBI site were also downloaded in FASTA format and searched using the "find" tool in Microsoft Notepad.
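The text search of downloaded FASTA records described above can be approximated programmatically. The sketch below is an assumption-laden illustration rather than the procedure actually used: it joins wrapped sequence lines before searching, so a motif split across a line break is still found, and it also checks the reverse complement of each motif. The record names (`toy_est_1`, `toy_est_2`) and their sequences are invented for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (unknown bases become N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def parse_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {rec: "".join(parts) for rec, parts in records.items()}


def scan(records, motifs):
    """List (record, motif, strand, 1-based start on the record as written)."""
    hits = []
    for name, seq in records.items():
        for motif in motifs:
            for strand, target in (("+", motif), ("-", reverse_complement(motif))):
                start = seq.find(target)
                while start != -1:
                    hits.append((name, motif, strand, start + 1))
                    start = seq.find(target, start + 1)
    return hits


MOTIFS = ["GACAGGATATAT", "GGCAGGATATAT"]

# Hypothetical records; the first motif is deliberately split across two lines.
TOY_FASTA = """>toy_est_1 hypothetical record
AAAGACAGGA
TATATCCC
>toy_est_2 hypothetical record
TTTATATATCCTGCCAAA
"""

if __name__ == "__main__":
    for hit in scan(parse_fasta(TOY_FASTA), MOTIFS):
        print(hit)
```

Exact string matching of this kind finds only perfect matches; a palindromic motif would be reported once per strand.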
Solanaceae Genomics Network - http://soldb.cit.cornell.edu/cgi-bin/tools/blast/simple.pl BLAST settings included Expect values of 10,000 (due to short sequences) and otherwise the default settings. All searches were done in EST databases. Unigene sequences were identified using the EST searches.

Pinus, Nicotiana, Medicago, apple and onion vectors

NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/ BLAST was carried out as above with an Expect value of 10,000 and limited by Entrez query to Pinus, Nicotiana, Medicago, apple or onion (Allium).

Rice vector

NCBI BLAST - www.ncbi.nlm.nih.gov/BLAST/. Settings were as above but limited by Entrez query to rice or Oryza.

TIGR - http://tigrblast.tigr.org/tgi (searched unique gene indices). An Expect value of 10,000 and matrix blosum62 or blosum100 were used. All other values were the default settings. The searches identified some TC# sequences (tentative consensus sequences), and ESTs containing the region of interest were identified from these.

Staff - http://web.staff.or.jp/. The RGP EST database was used to search for ESTs containing the sequences of interest, using Expect values of 10000 and the remaining options at default settings.

Design of extended intragenic T-DNA-like regions

ESTs were identified that showed sequence identity to parts of the Agrobacterium border-like sequences. These identified EST sequences were then assessed for homology, length of sequence flanking the borders and unique restriction sites. This was carried out using DNAMAN (version 3.2, Lynnon BioSoft, copyright © 1994-1997). ESTs were adjoined (usually 3 ESTs) to give a T-DNA-like region containing two border sequences, unique restriction sites between the border sequences (that can be used as cloning sites) and extra plant EST sequence beyond the borders to minimize the opportunity for non-intragenic vector backbone sequences being transferred with the T-DNA-like region into plant genomes.
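The adjoining of EST fragments described above can be sketched as simple string assembly. This is an illustrative assumption, not the DNAMAN workflow: each fragment is given as a source sequence with 1-based inclusive coordinates and an orientation flag, and the function returns the assembled region together with the coordinates each fragment occupies. The two toy ESTs are invented so that a known border sequence (the pTiT37/pTiC58 border from Example 1) is formed across their junction; the SalI linkers and all names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def assemble(segments, left_linker="", right_linker=""):
    """Join EST fragments into one region.

    segments: list of (source, start, end, use_reverse_complement), with
    1-based inclusive coordinates on the source sequence.
    Returns (region, layout), where layout holds the 1-based (first, last)
    coordinates of each fragment within the assembled region.
    """
    region = left_linker
    layout = []
    for source, start, end, use_rc in segments:
        piece = source[start - 1:end].upper()
        if use_rc:
            piece = reverse_complement(piece)
        first = len(region) + 1
        region += piece
        layout.append((first, len(region)))
    return region + right_linker, layout


def spans_junction(region, border, junction_last):
    """True if `border` occurs in `region` and straddles the given junction.

    junction_last is the 1-based coordinate of the last base of the
    upstream fragment.
    """
    start = region.find(border) + 1  # 0 when the border is absent
    return start > 0 and start <= junction_last < start + len(border) - 1


# Hypothetical ESTs: toy_est_a ends with the first 10 nt of the border,
# toy_est_b (stored in the opposite orientation) supplies the remaining 14 nt.
BORDER = "GACAGGATATATTGGCGGGTAAAC"
toy_est_a = "TTTTTGACAGGATAT"
toy_est_b = "GGGGGGTTTACCCGCCAAT"

region, layout = assemble(
    [(toy_est_a, 1, 15, False), (toy_est_b, 1, 19, True)],
    left_linker="GTCGAC",   # SalI recognition sequence, used here as a toy linker
    right_linker="GTCGAC",
)

if __name__ == "__main__":
    print(region)
    print(layout)
    print(spans_junction(region, BORDER, layout[0][1]))
```

Recording the layout alongside the sequence keeps statements of the form "nucleotides X - Y are nucleotides A - B of accession Z" checkable against the assembled region.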
Multiple intragenic T-DNA-like regions were designed and compared. Those designed to have the optimum sequence and useful unique restriction sites are presented below.

T-DNA-like region of a potato intragenic vector

This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the potato genome sequence.

Nucleotides 6 - 334 are the reverse complement of nucleotides 315 - 643 of sgn-U179068.
Nucleotides 335 - 974 are nucleotides 131 - 770 of sgn-U174278.
Nucleotides 975 - 1265 are nucleotides 117 - 407 of CN216800.

The T-DNA border-like sequences are shown in bold. The left border is nucleotides 314 - 337 and the right border is nucleotides 957 - 980. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are:

AflII at 611
AgeI at 518
ApaBI at 912
AsuI at 516
AvaI at 357
AvaII at 516
BamHI at 687
BstD102I at 514
Cfr10I at 518
CfrI at 723
ClaI at 507
CspI at 516
EcoHI at 683
EcoRI at 340
HaeIII at 725
HgiAI at 916
MaeII at 405
NlaIII at 875
PinAI at 518
SciI at 359
XbaI at 433
XhoI at 357
XhoII at 687

There are also two EcoRV sites within the T-DNA-like region (698 and 853) that could be used as cloning sites.
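Positions such as those listed above can be cross-checked with a short script. The sketch below is not DNAMAN and makes two assumptions: a reported position is taken to be the last base 5' of the cut on the strand shown, and only a handful of well-known palindromic enzymes are included. Under that convention, the excerpt (nucleotides 331 - 370 of SEQ ID NO:1) gives EcoRI at 340 and XhoI at 357, in agreement with the list. The dictionary and function names are illustrative.

```python
# Recognition sequence and number of bases 5' of the cut on the top strand.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),   # G^AATTC
    "XhoI": ("CTCGAG", 1),    # C^TCGAG
    "ClaI": ("ATCGAT", 2),    # AT^CGAT
    "BamHI": ("GGATCC", 1),   # G^GATCC
    "EcoRV": ("GATATC", 3),   # GAT^ATC
}


def cut_positions(sequence, origin=1, enzymes=ENZYMES):
    """Map each enzyme to its cut positions (last base 5' of the cut).

    `origin` is the coordinate of the first base of `sequence`, so that an
    excerpt can be reported in the numbering of the full-length region.
    Only the given strand is scanned, which is sufficient for the
    palindromic sites used here.
    """
    sequence = sequence.upper()
    result = {}
    for name, (site, bases_before_cut) in enzymes.items():
        positions = []
        index = sequence.find(site)
        while index != -1:
            positions.append(origin + index + bases_before_cut - 1)
            index = sequence.find(site, index + 1)
        result[name] = positions
    return result


def unique_cutters(sequence, origin=1, enzymes=ENZYMES):
    """Return {enzyme: position} for enzymes that cut the sequence exactly once."""
    cuts = cut_positions(sequence, origin, enzymes)
    return {name: pos[0] for name, pos in cuts.items() if len(pos) == 1}


# Nucleotides 331 - 370 of SEQ ID NO:1 (potato T-DNA-like region).
POTATO_331_370 = "GGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTG"

if __name__ == "__main__":
    print(unique_cutters(POTATO_331_370, origin=331))
```

Uniqueness in this sketch refers only to the excerpt supplied; confirming that a site is unique in the resulting binary vector would require scanning the complete vector sequence.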
1 GTCGACAGTA AAAGTTGCAC CTGGAATAAG GTTTTCATTC TTCACAGGAG GCATCTCACT 5 61 CTTTCTAGCA GGTCTTGAAC GCTTAGATTG AACAGATGTA GGACTCACAT CTGATATGGA 121 GGATTCTTGA CTTGTTTCAG CAGCATCAGA TGAAGCTTCT GAGACTTCAC CTGATCCATC 181 ATCTGTAGCA GTTGCTTCTA CTTCTTCCAC TGCTACATCA GTCTCAGTTG CTGATACTAT 241 AAGACCTCTT AATTTAGGTC GTAAAATGCA ACCAACTCTA AAATGGGGAA ACAATTTAAT 301 AGATGTTGAC AGAGGCAGGA TATATTTTGG GGTAAACGGG AATTCTTCAG CAGTTGCTCG 10 361 AGGGAGATTG GCGGTGCTTT CAGCTCACCT TGCAGCTTCA CTCAACGTCT CCGATTTAAC 421 AACCTTCAAA CTTCTAGAAA CTTCCGGTGT ATCCGCCGTT TCCGGCGTTG CACCTCCGCC 481 GAATCTAAAA GGTGCGTTGA CGATCATCGA TGAGCGGACC GGTAAGAAGT ATCCGGTTCA 541 GGTTTCTGAG GATGGCACTA TCAAAGCCAC CGACTTAAAG AAGATAACAA CAGGACAGAA 601 TGATAAAGGT CTTAAGCTTT ATGATCCAGG CTATCTCAAC ACAGCACCTG TTAGGTCATC 15 661 AATATGCTAT ATAGATGGTG ATGCCGGGAT CCTTAGATAT CGAGGCTACC CTATTGAAGA 721 GCTGGCCGAG GGAAGTTCCT TCTTGGAAGT GGCATATCTT TTGTTGTATG GTAATTTACC 781 ATCTGAGAAC CAGTTAGCAG ACTGGGAGTT CACAGTTTCA CAGCATTCAG CGGTTCCACA 841 AGGACTCTTG GATATCATAC AGTCAATGCC CCATGATGCT CATCCAATGG GGGTTCTTGT 901 CAGTGCAATG AGTGCTCTTT CCGTTTTTCA TCCTGATGCA AATCCAGCTC TGAGAGGACA 20 961 GGATATATAC AAGTGTAAAC AATTTAAAAG CATATGGTGG CACTGCTCAA TATATGAGGT 1021 GGGCGCGAGA AGCAGGTACC AATGTGTCCT CATCAAGAGA TGCATTCTTT ACCAATCCAA 1081 CGGTCAAAGC ATACTACAAG TCTTTTGTCA AGGCTATTGT GACAAGAAAA AACTCTATAA 1141 GTGGAGTTAA ATATTCAGAA GAGCCCGCCA TATTTGCGTG GGAACTCATA AATGAGCCTC 1201 GTTGTGAATC CAGTTCATCA GCTGCTGCTC TCCAGGCGTG GATAGCAGAG ATGGCTGGAT 25 1261 TTGTCGAC (SEQ ID NO:1) T-DNA-like region of a petunia intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not 30 part of the petunia genome sequence. Nucleotides 6-399 are the complete sequence of the 394 nucleotide fragment from sgn e521144. Nucleotides 400-855 are the reverse complement of nucleotides 85-540 from sgn-e534315. 
Nucleotides 856-1071 are the reverse complement of nucleotides 121-336 from sgn-u207691. 35 The T-DNA border-like sequences are shown in bold. The left border is nucleotides 347-370 and right border is nucleotides 844-867. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: WO 2005/121346 PCT/NZ2005/000117 AccIII at 392 Agel at 788 BbvI at 453 BspMII at 392 5 Bst7lI at 453 Cfr10I at 788 ClaI site at 398 Fnu4HI at 442 MaeI at 665 10 NsiI at 752 PinAI at 788 There are also two NspI sites (RCATG/Y) within the T-DNA-like region (616 and 755) that could be used as cloning sites. The most useful restriction site for cloning into the T-DNA like region is the ClaI site which is shown in underlined bold. 15 1 GTCGACTTTA TGATCCTGGC TATCTCAACA CAGCGCCTGT TCGGTCATCA ATATGTTATA 61 TAGATGGTGA TGCCGGGATC CTTAGGTATC GAGGTTACCC TATTGAAGAG CTGGCTGAGG 121 GAAGCTCCTT CTTGGAAGTG GCTTATCTTT TATTGTACGG TAATTTGCCA TCTGAGAACC 181 AGTTGGCAGA CTGTGAGTTC ACAGTTTCAC AACATTCAGC AGTTCCACAA GGACTCCTTG 20 241 GATATCATAC AGTCAATGCC CCATGATGCT CATCCGATGG GTGTTCTTGT CAGTGCAATG 301 AGCGCTCTTT CTGTCTTTCA CCCTGATGCC AATCCAGCTC TTAGGGGACA GGATATATAC 361 AAGTCTAAAC AAATGAGAGA TAAACAAATA GTCCGGATCG ATACGTGAAG ATCAAAATGA 421 AAAGGGGAGG CGATAGATTA GCAGCATGAG CCTATATTTC TCTCACAAAA ATTCCCAGAT 481 ATTCGACACA ATAGCTCTAA CAACACTGAG CTTTTGATTA CTTGGGTCAC TTCTTCATTT 25 541 CTCTATCGTC TGTTCAGTCT TTTCCTCTGA TTTAGTTTCT GCATCATAAG TTTTGCCAAA 601 GCCAAGTTCT GACATGTCTT GCTTTGCCAT CAAATTCTTC TCCATACGAC ACTCCAGGTA 661 CTTCCTAGAG AGGTGTCTAC ACTGCTCAGA TTTATGCCCA GCGGATTTTA GACAACTAAG 721 GTATTCCTTC TTCTCCACGT CACATAAATG CATGTGATCC AAAGGGAAAA CTCCTTTTTC 781 TGGTGGAACC GGTCTCAATC CTCTATTTCC ACCAAATGCT CCCCCTGCAC TCATTACGGA 30 841 GATGGCAGGA TATATGTTCT TGTCATGGAA TAGGCCACTG CTTTCAGCTG TCTGGAGACC 901 GTGAAGTGTA CGTTGAGCCA CAGCCCATTG TGCTTCCCTC TCACCTTTTC CGTAATCCTT 961 CTTGGTTGTG AAGGCAGTCT TATTCTGCAT CATTGATTGC CAGGCGTCAC CACTCAACGT 1021 GTAACGGCTG ATGAATTTAA GAATATCAAG 
AGGGAAATAG GTGATAATTG TCGAC (SEQ ID NO:2) 35 T-DNA-like region of a tomato (Lycopersicon esculentuin) intragenic vector: WO 2005/121346 PCT/NZ2005/000117 This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the tomato genome sequence. Nucleotides 5 - 537 are nucleotides 2-534 of SGN-E260320. 5 Nucleotides 538 - 976 are the reverse complement of nucleotides 79 - 517 of SGN-E291502 Nucleotides 977 - 1188 are the reverse complement of nucleotides 1 - 212 of CK575027. The T-DNA border-like sequences are shown in bold. The left border is nucleotides 375 - 398 and the right border nucleotides 960 - 983. The restriction sites and positions that could be 10 used for cloning within the T-DNA are shown below (as calculated by DNAMAN): Alw26I at 881 AlwNI at 876p BbvI at 798, 843 and 442 BglI at 740 15 Bsp1407I at 528 BspMI at 705 Bst7lI at 798, 843 and 442 ClaI at 637 Eco31I at 881 20 Eco57I at 787 EcoNI at 431 Fnu4HI at 456, 787 and 832 MaeI at 573, 711 and 744 MfeI at 444 25 MspAI at 455 NdeI at 511 NspBII at 455 ~ NspI at 535 and 583 PstI at 833 30 PvuII at 455 RsaI at 530 Style at 427 XcmI at 925 WO 2005/121346 PCT/NZ2005/000117 1 GTCGACAACA GGACAGAATG ATAAGGGTCT TAAGCTTTAT GATCCAGGCT ATCTCAACAC 61 GGCACCTGTT AGGTCATCAA TATGTTATAT TGATGGTGAT GCCGGGATCC TTAGATATCG 121 AGGCTACCCT ATTGAAGAGC TGGCCGAGGG AAGTTCCTTC TTGGAAGTGG CATATCTTTT 5 181 GTTGTATGGT AATTTACCAT CTGAGAATCA GTTAGCAGAC TGGGAGTTCA CAGTTTCACA 241 GCATTCAGCA GTTCCACAAG GACTCTTGGA TATCATACAG TCAATGCCAC ATGATGCTCA 301 TCCAATGGGG GTTCTTGTCA GTGCAATGAG TGCTCTTTCC GTTTTTCATC CTGATGCAAA 361 TCCAGCTCTG AGAGGGCAGG ATATATACAA GTCTAAACAA GTGAGAGATA AACAAATAGT 421 TCGGATCCTT GGCAAGGCAC CTACAATTGC TACAGCTGCT TACTTAAGAA TGGCTGGCAG 10 481 GCCACCTGTC CTTCCATCCA ACAATCTCTC ATATGCGGAG AACTTCTTGT ACATGCTTGC 541 TTCCTACATC CTTTACATAA CTATCACTCA ACCTAGAAAC ATGCACCAAT CCATCCGTAA 601 AAGCTCCAAA 
ATCAATAAAA GCACCGAATG GCTGTATCGA TCTGACCTTT CCAGGAAAAG 661 TTGCACCTGG AATAAGGTCT TCATTCTTCA CAGGAGGCAT CTCACTCTTT CTAGCAGGTC 721 TTGAACGCTT AGACTGAACA GATCTAGGAC TCACATCTGA TACAGAGGAT TCTTCACTTA 15 781 TTTCAGCAGC ATCAGATGAA GCTTCAGCAA CTCCACCAGA TCCATCATCT GCAGCAGTTG 841 CTTCTACTTC TTCCACTGCT ACATCGGTTT CAGTTGCTGA TACTACGAGA CCTCTTAATT 901 TATGTCGTAA AATGCAACCA ACTCTAAAAT GGGGAAACAA TTTAATAGAT GTTGACAGGG 961 GCAGGATATA TTTTGGTGTA AACCTGTTTC TTGCACTAAT CGTGCTTTGT CTTCCTCAGT 1021 TGGATAAGGC CACTTAGAAT GTGATTGCCA CCAAGCTTTC AACACAGATG TAGTATCACC 20 1081 AGGCAGTTTT CCTGCTCTTC TTTTGCGTAA AATTTCCTCT CTAATGTCAA CAATTTTTTC 1141 CTTATAACCC TGTTTGAGTT CATGCTTGAG TTCTTGCCTA ACACGCTCGT CGAC (SEQ ID NO:3) T-DNA-like region of a Nicotiana benthamiana intragenic vector 25 This sequence can be ligated into pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203 1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not in the N. benthamiana genome sequence. Nucleotides 5 - 853 are nucleotides 111 - 959 of CK292156 Nucleotides 854 - 1469 are the reverse complement of nucleotides 8 1-696 of CK286377. 30 Nucleotides 1470 - 1787 are nucleotides 285 - 602 of CN748849. The T-DNA border-like sequences are shown in bold. The left border is nucleotides 566 589 and the right border is nucleotides 1455 - 1478. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: 35 AccIll at 611 AflII at 654 AhaIII at 1160 WO 2005/121346 PCT/NZ2005/000117 BamHI at 614 BsiI at 1362 Bsp1407I at 719 BspMII at 611 5 DraI at 1160 EcoNI at 622 MaeII at 840 NspI at 726 Scale at 921 10 Sspl at 1420 VspI at 1085 XhoII at 614 There are also two AvaI sites (773 and 1072), two BanI sites (627 and 906), two HaeIII (672 15 and 1393), three MaeI sites (727, 1182 and 1355), and three NaIV sites (616, 629 and 908) within the T-DNA-like region that could be used as cloning sites. 
1 GTCGACCTCG CCGCTTCAGT CAATCTCTCC GATTCCAAAC TTTTAGAAAC TTCCGTTGTA 61 TCCGCCACTT CCGTCGTCGC GCCGCCGCCG AATCTAAAAG GCGCTTTGAC GATCATCGAT 20 121 GAGCGAACCG GTAAGAGGTA TCCAGTTCAA GTTTCGGAGG AAGGCACTAT CAAAGCCACC 181 GACTTGAAAA AGATAACAGC AGGACATAAT GATAAGGGTC TCAAGCTTTA TGATCCGGGA 241 TATCTCAACA CAGCACCTGT TCGGTCATCA ATATGTTATA TAGATGGTGA TGCTGGTATC 301 CTTAGATATC GAGGTTACCC AATTGAAGAG CTGGCTGAGG GAAGTTCCTT CTTGGAAGTG 361 GCTTATCTTT TGATGTATGG TAATTTACCA TCTGAGAACC AGTTGGCAGA TTGGGAGTTC 25 421 ACAGTTTCAC AACATTCAGC AGTTCCACAA GGAATCATGG ATATTATACA TTCGATGCCC 481 CATGATGCTC ATCCAATGGG TGTTCTTGTC AGCGCAATGA GTGCTCTTTC TGTCTTTCAT 541 CCTGATGCCA ATCCAGCTCT GAGAGGACAG GATATATACA AGTCTAAACA AGTGAGAGAT 601 AAACAAATAG TCCGGATCCT TGGCAAGGCA CCTACAATTG CTGCGGCTGC TTACTTAAGA 661 ATGGCTGGAA GGCCACCTGT CCTTCCATCC AACAATCTCT CTTATGCAGA GAACTTCTTG 30 721 TACATGCTAG ATTCATTAGG TAATAGGTCT TACAAACCCA ATCCTCGACT CGCTCGGGTG 781 CTCGACATTC TTTTCATATT ACACGCGGAA CATGAAATGA ATTGCTCTAC TGCTGCAGCA 841 CGTCATCTTG CTTAAATGCA ACTGCTCTAT TTTGTGTCAG AATTTGGTGA AAAATGCACT 901 GTTTTGGCAC CAAAAGTTAG TACTTTTGGA CAACTTTTTG GTGAACCAAA ATCTGTCCAA 961 AATGACTTGT TTACCTACTT AAAGAGGTCA TTTTTTCATA CCAGGGGACA TCCCCGACAT 35 1021 CCCAGGATAC ATAGCTTTTG AAAAATTTTT TACACTCAAG AATACACAAA ACTCGGGAGC 1081 AAAATTAATA GCTGAATGTT TAATAGTAAG CTGAAACTTG AGAGTTTTGG AGTGAGTTTT 1141 TTGAGAGAAA ATAACACTTT AAAAAACAAA AGTCCATACA GCTAGTTTAT AGTTTTTCTT WO 2005/121346 PCT/NZ2005/000117 1201 TCACTAAAGA TGCTGAGTTT TACGGTTTGT TTTTGGTTGT TTTGGGTTCA ATTTATTGCT 1261 GTTTTTTTTA CTATTTTTAC TGTCACTGCT GCTGCATTTT TGCTACTGCT GTATTTTTGC 1321 TCTTCAGGTA ACCTGAGAAG CTTATTTTTT GATACTAGCC ACTCGTGTTG TATTTGTCCT 1381 TTTTAATTTA AGGCCAAATA GTTTCAGTTG TAGAAGTAAT ATTTTCTCCT TTCATTAGTA 5 1441 AAGTTCAATT AAAAGGCAGG ATATATTGTG ATGTAAACAC CGTCCTGAAG TGTACCAGCT 1501 AAGGACAAGG GATCAAAGAA TTTGCCGCCA GGGTAACCCT GTTCTCCAGT GAAGTTGGCA 1561 AAGTTCTCAG CAGTCTTGGA CCATGGAGTT GCCCACTCTA CTGACTGAGA CTCAGGGTTG 1621 AAGAAGTCCA CCCACCTTTT GCTCTCAACC CACCCCATGA 
GTAGAAGTTG AGTGCCAAGA 1681 AGTGAGCCAA AAGAGAAAGT GCAATAGCAC CAGGGTCAGC ACCAGCCTCG AACCATGGGA 10 1741 TGCCACTCCA GGCTTGACCA ACAAAGAGCC CAATACTGCA GCCATTGTCG AC (SEQ ID NO:4) T-DNA-like region of an apple intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 15 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the apple genome sequence. Nucleotides 5 - 246 are nucleotides 1 - 242 of CN862631. Nucleotides 247 - 644 are the reverse complement of nucleotides 28 - 425 of CN942531. Nucleotides 645 - 943 are the reverse complement of nucleotides 1 - 299 of C0541348. 20 The T-DNA border-like sequences are shown in bold. The left border is nucleotides 229 252 and the right border is nucleotides 627 - 650. The restriction sites and positions that could be used for cloning within the T-DNA are shown below: AccII at 392 25 AluI at 338 Alw26I at 593 AsuI at 353 and 387 AvaI at 270 AvaII at 353 30 BanI at 415 BsaOI at 374 and 466 Bsp12861 at 418 BstD102I at 262 ClaI at 466 35 EcoHI at 269 and 270 WO 2005/121346 PCT/NZ2005/000117 EcoNI at 348 EcoRV at 438 FnuDII at 392 FokI at 619 and 235 5 HhaI at 394 HinPlL at 392 Hpal at 565 MaeII at 449 MaeIII at 509 10 MspAlI at 400 NlaIV at 289 and 417 NspBII at 400 Pvul at 466 RleAI at 269 15 ScrFI at 271 and 272 SduI at 418 SmaI at 272 StuI at 653 ThaI at 392 20 XhoII at 586 and 818 XmaI at 270 XorII at 466 1 GTCGACTAAT GAGGCTTTGA TCTACCACAA GGCTTTTCCA ATGCCGGCAT TGTCATACAA 25 61 GTTTCAGAAC ACAGACTCAC TTTCCGGCCA TGACACAGAT GATGCTGCAC AGTTTATCTC 121 TTCCGTTTGT TGGCGAGGCC AAACCTCCAC CTTAATTGCT GCAAATTCGA CGGGGAATAT 181 AAAAATTTTG GAGATGGTTT GATGATCTCC AAGGTGATTC TTGAATCTGG CAGGATATAT 241 GGGGTGGTCA TCCCACATCG AGCGGATCAC CCGGGAGAAG GTGAACGGTT CCACCGTCAA 301 TGTCGGCATC AACCCCCTCC AAGGTCGTCA TCACCAAGCT CCGCCTCGAC AAGGACCGCA 30 361 AAGTTCTGCT CGACCGCAAA GGCCAAGGGC CGCGCCGCCG CTGACAAGGA CAAGGGCACC 421 AAGTTCACTG CCGAGGATAT CATGCAGAAC 
GTCGATTGAT TTCGATCGAT TTCATTTCGG 481 TTTGTGTTTT TGTTAGTTAA ATGAAAGTAG TAACTGTCAA GTTAAGCACT TTAGTCGGAA 541 TCACTTTTAA TTTGAAGTAT GCGTTAACGG ATTTGGTGTT TAATCGGATC TTCGATTTGA 601 GACATGGATG GATTTGTGCT TTTTTTGACA GGATATATTA TATTGTAAAC AGGCCTCCCT 35 661 CAGACGATAC AAATGAACCC TCATGTAAAT TTGTTTCATT ATTTATTCTC ATTAACAATG 721 ACTAACACAC AAATATAAAA GAAATAAATC ACATTTGGGG TCTTGTCTGG ACAACATAGA WO 2005/121346 PCT/NZ2005/000117 781 GTTTACCGTC CCTGATCACG CCTTCAATGT CTTGTGGAGA TCCATACAGT TCCTCGATTG 841 CACTTCCTGC ACGAGCAATG CTAGAGAGGA TTGTCTTGCG AAAGTTTCCG TCAACCATAA 901 GCGGGTCAGA TGAGTAGTCG ACCAAGACTT TCTCTTCCTC GTC (SEQ ID NO:5) 5 T-DNA-like region of a Medicago truncatula intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the M truncatula genome sequence. 10 Nucleotides 6-357 are nucleotides 2- 353 of CA921810. Nucleotides 358 - 694 are nucleotides 112 - 448 of AL375389. Nucleotides 695 - 1055 are the reverse complement of nucleotides 2-362 of CF069972. The T-DNA border-like sequence is shown in bold. The left border is nucleotides 339 - 362 15 and the right border is nucleotides 677 - 700. Unique restriction sites in the resulting binary vector that are located within the T-DNA-like region are: AccI at 403 AflIII at 465 AgeI at 409 20 AsuII at 522 Bsa0I at 410 BsmI at 539 Cfr10I at 409 Csp451 at 522 25 HpaII at 410 MaeII at 391 AboII at 511 MspI at 410 NlaIV at 585 30 PinAI at 409 Rsal at 394 There are also two AhaIII sites (462 and 567), two DraI sites (462 and 567) and three TaqI sites (522, 615 and 636) that could be used as cloning sites.
WO 2005/121346 PCT/NZ2005/000117 1 GTCGACTCAT TAAACAAATA AAAGAACTAT TCAAATGTTT AGCACATTTG AAACTGAGTC 61 AGCATAAAAA CATTTACCAT GCTAGTTTAA CTAGTTCAGA CGCAACCAAA ACACCATGAA 5 121 ACTTAATTGC ATAGGAAAGC ACCAACCTGT TTCAGCAGCA AGACATGCTG ACTACAGGTC 181 AACTACTGTT TCTGCATAAT AACTATTTAT CAACTACTAA TATTCCATGG TAGAATAGCC 241 ATCAAAACCA TCATTGCGCA GCAGAGGGTC AAAAAGGAAC AATGATTTTC AACAGTTACT 301 CCAAGGATAA CTGATGCTGC AGCTGGTAAC AAGTTATTGG CAGGATATAT TACCATGTAA 361 AATTCTAGGC TATGTTTACA AAAAAATTGA ACGTACTTAA TGTATACGAC CGGTAAAGGA 10 421 GAAAAAGGAA GTATAAGTCA CTTAATTTAA TTTTTTAACT TTAAACATGT TTTTTAGGAG 481 GCACAATTAT AAGTTAAAAA TGTAAGGAAA ACCTATTTTC TTCGAATATA TAGATTTGGC 541 ATTCCATTTT AGTTAAGGAG TTAATTTAAA AATATGAAAG TAGGAGCCTC TGTTCGTAAA 601 ATTTGTGAAA AATGTCGATT GATACGCAGG CGAGGTCGAA TTATAGTAAT TTGTTCCAAC 661 CCAATAACAA ATACAAGGCA GGATATATTT AACTGTAAAC GACCATGAGC CCTGTGCTCT 15 721 GCAGGTGCAT GCCAATGAAG TTGTTTCAAT GAATAATTCA TTCCATTTAT ATTAATATCG 781 CCCACTTTTC CTTCAAAATG CACCCCAATA CTGTATTGGT GGTTAACAAG TGTGGCATTT 841 GTAGGAAGGT AGTTTCTGTC TAAGGATTTC AATACATTGT TCACAACAAT ATTTGTCATT 901 GCAAGATCCA CTGGACTCTG TGCTTTCCCA TTTGAGCATG CTGCGAATGA TTGTTTTAGT 961 GTTCCCCATT TCAGAGGACC ATTTGGTCCA ATATAACTAA AATTAACCGA ATCATGATTT 20 1021 GCTGAAGTGC ATAGAGCCAA AGCAGCTATG AAAGGTCGAC (SEQ ID NO:6) T-DNA-like region of an onion intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 25 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the onion genome sequence. Nucleotides 5 - 537 are nucleotides 4 - 536 of CF449263. Nucleotides 538 - 1186 are nucleotides 94 - 742 of CF441521. Nucleotides 1187 - 1503 are nucleotides 162 - 478 of CF452730. 30 The T-DNA border-like sequences are shown in bold. The left border is nucleotides 520 543 and the right border is nucleotides 1169 - 1192. 
Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are: AccI at 2 and 739 35 Asu at 1046 Aval at 1046 BclI at 1124 WO 2005/121346 PCT/NZ2005/000117 Clalat 556 DraIII at 869 Eco57I at 910 and 975 HindJIJ at 628 5 Mfelat 852 NdeI at 1006 PfiMI at 649 RsaJ at 999 Sapl at 580 10 Xbalat578 1 GTCGACTTCC CTTTCCTCTA CTCCACTTGT TTCTCGCTTT CTCTACTTCC TTTTTCTCTC 61 TTTTCTTTAT ATTTATTGCT CAGCTGGGAT TAATTACTGT CATTTATTCC TCATATCTAT 15 121 TTTATTGAAT TAAAACGGTT ATTTAGCTCG AGGCCTTCTC TCTTATTCTT TGCTTCCAAG 181 GAGAGAGAAT ATGGCGAGTG GTAGCAATCA TCAGCATGGT GGAGGAGGAA GAAGAAGAGG 241 CGGAATGTTA GTCGCTGCGA CCTTGCTTAT TCTTCCTGCC ATTTTCCCCA ATTTGTTTGT 301 TCCTCTTCCC TTTGCTTTTG GTAGTTCTGG CAGCGGTGCA TCTCCTTCTC TCTTCTCCGA 361 ATGGAATGCT CCTAAACCTA GGCATCTCTC TCTTCTGAAA GCAGCCATTG AGCGTGAGAT 20 421 TTCTGACGAA CAAAAATCAG AGCTGTGGTC TCCCTTGCCT CCACAGGGAT GGAAACCGTG 481 CCTTGAGACT CAATATAGTA GCGGGCTACC CAGTAGATCG ACAGGATATA TTCAAGTGTA 541 AAACAAGATG CTGAATCGAT TAGCAATGGT TCGCTCTTCT AGACTTGCTT CTCGGATAAT 601 CAATCCTCAG TTTTTGATTC CTTCTCGAAG CTTCCTTGAT CTCCATAAGA TGGTAAACAA 661 GGAGGCGATA AAAAAAGAAA GGGCTAGACT TGCTGATGAG ATGAGCAGAG GATATTTTGC 25 721 GGATATGGCA GAGATTCGTA TACATGGTGG CAAGATTGCT ATGGCAAATG AAATTCTTAT 781 TCCATCAGGG GAAGCAATCA AATTTCCTGA TTTGACAGTA AAATTGTCTG ATGATAGCAG 841 TTTGCATTTA CCAATTGTAT CTACACAAAG TGCTACAAAT AACAATGCTA AATCCACTCC 901 TGCTGCCTCA TTGTTGTGCC TTTCCTTCAG AGCAAGTTCA CAGACAATGG TTGAATCATG 961 GACTGTTCCT TTTTTGGACA CTTTTAACTC TTCAGAAGTA CAAGCATATG AGGTATCATT 30 1021 TTTGGATTCT TGGTTTTTCT CATTCGGACC AATCAAGAGA ATGTTTCTTA ACATGACGAA 1081 GAAACCCACT GCTACTCAGC GGAAGATTGG TTATTTCATT TGGTGATCAC TATGATTTTA 1141 GGAAGCAGCT TCAAATTGTA AATCTTTTGA CAGGATATAT ATTACTGTAA AAAGTGAAGA 1201 GAGAAATGTG ATATATGCTG ATGTTTCCAT GGAGAGGGGT GCATTTCTTG TTCAACAAGC 1261 TATGAGGGCT TTCCATGGAA AGAATATAGA AAGCGCAAAA TCAAGGCTTA GTCTTTGCGA 35 1321 GGAGGATATT CGTGGGCAGT TAGAGATGAC AGATAACAAA CCAGAGTTAT ATTCACAGCT 1381 TGGTGCTGTC CTTGGAATGC TAGGAGACTG 
CTGTCGAGGA ATGGGTGATA CTAATGGTGC 1441 GATTCCATAT TATGAAGAGA GTGTGGAATT CCTCTTAAAA ATGCCTGCAA AAGATCCCGA 1501 GGTCGAC (SEQ ID NO:7) WO 2005/121346 PCT/NZ2005/000117 T-DNA-like region of a rice intragenic vector This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 5 1203-1207) backbone using the SalI sites that are underlined. This requires a partial digest due to a SalI site within the T-DNA like region. The nucleotides in italics are not part of the rice genome sequence. Nucleotides 6 - 634 are nucleotides 1 - 629 of CR287857. Nucleotides 635 - 1258 are nucleotides 156 - 779 of AK100350. 10 Nucleotides 1259 - 1740 are nucleotides 222 - 703 of CB619781. The T-DNA border-like sequences are shown in bold. The left border is nucleotides 616 639 and the right border is nucleotides 1247 - 1270. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are: 15 AccI at 945 AclI at 716 AflII at 984 AhaIII at 893 Alw26I at 1102 20 ApaI at 1166 BanII at 662 and 1166 BglII at 1026 BsaOI at 1137 and 1200 BspHI at 771 25 ClaI at 1200 DraI at 893 DraII at 1162 HgiAI at 675 HindII at 946 30 MaeII at 716, 842 and 1125 MseI at 892 and 985 PssI at 1165 Pvul at 1137 and 1200 SalI at 944 WO 2005/121346 PCT/NZ2005/000117 SnaBI at 1126 SspI at 1073 Xorll at 1137 and 1200 5 1 GTCGACGGGA ATTCGCCATT ATGGCCGGGG GAAGCTTCCC TGTCACTACT GCAGGATCAT 61 CAAAAGCAAG CTAGGGCGCA TCTTCTTGAC ACTGAACCTT TTGAGCATGC ATTTGGACCA 121 AAGGGCAAGA GGAAACGCCC AAAACTAATG GCTCTTGATT ATGAATCTCT ATTGAAGAAA 181 GCTGATGATT CTCAAGGTGC ATTTGAGGAT AAGCATGCTA CAGCGAAGTT GCTGAAAGAG 10 241 GAAGAGGAAG ATGGCTTACG ATACCTAGTC CGGCACACAA TGTTTGAGAA GGGACAGAGC 301 AAAAGAATTT GGGGTGAACT CTATAAAGTT ATTGACTCTT CAGATGTTGT CGTGCAGGTG 361 TTGGATGCCA GGGATCCAAT GGGTACTAGA TGCTACCATC TGGAGAAACA TCTGAAGGAG 421 AATGCCAAGC ACAAACACTT GGTATTCTTA CTAAATAAGT GTGATCTAGT ACCTGCTTGG 481 GCCACAAAAG GATGGTTGCG CACTTTATCA AAGGACTATC CCAACCTAGC ATACCATGCA 15 541 AGCATCAACA 
GTTCATTTGG CAAAGGATCA CTTCTTTCAG TGTTACGGGA GGATGGACGC 601 CCTGAGAGAT GTGACGACAG GATATATAGT GAGGTCATGC AGTGCAAGCC CCTCCCCGAG 661 CCCGAGGTCA GAGCACTTTG CGAGAAGGCA AAAGAGATAT TGATGGAGGA GAGCAACGTT 721 CAACCTGTAA AGAGTCCTGT TACAATATGT GGTGATATTC ATGGGCAGTT TCATGACCTT 781 GCAGAACTGT TCCGAATCGG TGGAAAGTGC CCAGATACAA ACTACTTGTT TATGGGAGAT 20 841 TACGTGGATC GTGGTTATTA TTCTGTTGAA ACTGTCACGC TTTTGGTGGC TTTAAAGGTT 901 CGTTATCCTC AGCGAATTAC TATTCTCAGA GGAAACCACG AAAGTCGACA GATCACTCAA 961 GTTTATGGAT TCTATGACGA GTGCTTAAGG AAGTACGGGA ATGCAAATGT GTGGAAAACT 1021 TTTACAGATC TCTTCGATTA CTTCCCCTTG ACAGCATTGG TTGAGTCAGA AATATTTTGC 1081 CTGCATGGTG GATTATCGCC ATCCATTGAG ACACTTGATA ACATACGTAA CTTCGATCGT 25 1141 GTCCAAGAAG TTCCCCATGA AGGGCCCATG TGTGATCTTC TGTGGTCTGA TCCAGACGAT 1201 CGATGTGGTT GGGGTATTTC TCCTCGAGGT GCTGGATACA CCTTCGGGCA GGATATATTG 1261 GCGGGTAAAC CAATTCCTGG TTTTCCCGAC AAACCCTCGA GAATAAATTC ATTCTTTGCA 1321 GAAGGATGTC AAACTGGTGA CAATGGTGCT GGTTCCTCGC AAGAGTTGAA TGGTCATTGC 1381 AATGGAGAAC CCAGTTGCCC AGAGCAAGGA GTTCTGACCA ATGGTGGCAA CACGCCCTCT 30 1441 CCAAGCACAC AATGCTATGA AAATAAGTTT GCAACATCCA CCAACGGCAA CTATTCTATT 1501 GGGAATGGTG ATACATTATC TAGCAGCAAC TCATTACATG CGGGCAAACA GAATGCTGGC 1561 TTTACCTATA ATGGTTTCAA TCCAAAACCT TACAAAGAAC CATCAGGAAG CAACACATAT 1621 CTGAATAATA CATGCAATGG TAAACCATCG GAAGATAATC ACAATAAATG TGCCCCAAAC 1681 CTGCCGGCAA AAGATTGCCA AGGGGGCATG CCATTCTTAC ATCGTGGCTT CCTTCTAAGG 35 1741 TCGAC (SEQ ID NO:8) T-DNA-like region of Pinus taeda intragenic vector WO 2005/121346 PCT/NZ2005/000117 This sequence can be ligated to the pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) backbone using the SalI sites that are underlined. The nucleotides in italics are not part of the P. taeda genome sequence. Nucleotides 1 - 333 are nucleotides 114 - 446 of BM133642. 5 Nucleotides 334 - 914 are nucleotides 81 - 661 of CF392877. Nucleotides 915 - 1172 are nucleotides 138 - 395 of CX715693. The T-DNA border-like sequences are shown in bold. 
The left border is nucleotides 314 - 337 and the right border is nucleotides 898 - 921. Unique restriction sites in the resulting binary vector that are located in the T-DNA-like region are:
AccIII at 442
Alw26I at 513
AsuI at 813
AvaII at 813
BspMII at 442
DraII at 813
DraIII at 879
Eco72I at 876
EcoNI at 808
EcoRV at 555
FokI at 871
HaeII at 339
HpaII at 443
MaeI at 534
MspI at 443
PmaCI at 876
PpuMI at 813
PssI at 816
There are also two AccI sites (2 and 888) and two BspHI sites (587 and 806) that could be used as cloning sites.
WO 2005/121346 PCT/NZ2005/000117 1 GTCGACATTC GCAGCGATGA CGCTTGGCGC TTTCATTATC TTACAGTCGA TGGATAAACT 61 CCACGTCTGG ATTGCTCTCC GTCAAGACGA AAAGAGGGAG AGGCAAATGC AGGAAT TAGA 5 121 GATTGCAAAA ATGAAAAAGA AATTATTGGC GGAACTTAAA GAGAAAGAGT CAAGCGTATA 181 GCTTTTTGTA GTGATAATTG TTAACCTCCA ATGGTAATCA TTATTTTGAA TGATAGCTAT 241 TCGATTATTT GGTGAAAAAA CAAACTTGAA AATGATTGTG GACAGTTATT 10 TGTGTTGAAG 301 CCACAGCAGT GTTGGCAGGA TATATCCAAT TGTAAAAGGC CAAGGAGATT CTCATGGAAG 361 AAAGTAATGT GCAGCCTGTC AAGTGCCCTG TTACAATCTG TGGTGACATA CATGGTCAAT 15 421 TTCACGATCT TGCCGAGCTG TTCCGGATTG GTGGAAAGTG TCCAGATACG AATTATTTGT 481 TCATGGGTGA CTATGTTGAC CGAGGATACT ATTCAGTCGA GACTGTCACT CTTCTAGTGT 541 CATTGAAGGT GCGATATCCC CAACGAATTA CCATTCTTCG AGGAAATCAT 20 GAGAGTCGTC 601 AGATTACTCA AGTATATGGA TTCTATGATG AATGTTTACG GAAGTATGGA AATGCAAACG 661 TCTGGAAGAT ATTTACAGAC CTTTTTGACT ACTTCCCATT GACAGCACTG GTAGAATCAG 25 721 AAATTTTCTG CTTACATGGT GGTTTGTCAC CTTCCATTGA CACATTAGAT AACATAAGGA 781 ATTTTGACCG TGTTCAAGAA GTGCCTCATG AAGGTCCCAT GTGCGATCTT TTGTGGTCAG 841 ATCCAGATGA TAGATGTGGA TGGGGAATCT CTCCACGTGG TGCAGGGTAT 30 ACATTTGGAC 901 AGGATATATC TGAAGGTAAA CCGAAGGGAT CCTTCAAAGT TGACCAATAT CAGAGATAAT 961 TATGTTGGCA TTGAATATGA TACCAACCCT GAAATATTGT GCTAAAGAGA GATGCTGAGG 35 1021 TTTTAAAATT TCTATCAGCT TAACGAGCAC ATGATACATA ATATATGCCA
CAAGAATGGA
1081 ATGGAATTTC CATTTGGCTT TAAAAAATGA TTTGTAATGT CACGTACATT AGCATCTACA
1141 AAGAATTGGA TTGCCTTCAT TCACTTTTCG TCGAC (SEQ ID NO:9)

Example 3 Identification of plant-derived sequences that function as a selectable marker in bacteria: complementation of deficient bacteria

In preferred intragenic vectors of the invention, the complete vector is made up entirely of plant-derived sequences. One desirable component for effective vector manipulation is a bacterial selectable marker. Preferred marker sequences include plant genes that complement bacterial mutants deficient in genes essential for their growth, such as amino acid biosynthesis genes. One such gene is acetohydroxyacid synthase (AHAS).

Acetohydroxyacid synthase is an enzyme which catalyses the formation of acetolactate from pyruvate, the first step in valine, leucine and isoleucine biosynthesis. Furthermore, mutant forms of AHAS can confer resistance in plants to sulfonylurea herbicides and related compounds (Mazur and Falco, Annual Review of Plant Physiology and Plant Molecular Biology, 40: 441-470, 1989). For example, the Arabidopsis thaliana mutant AHAS gene confers resistance to the herbicide chlorsulfuron upon transformation into tobacco (Haughn et al., Molecular and General Genetics, 211: 266-271, 1988).

Wild-type AHAS Genbank details are:
LOCUS NM_114714 2270 bp mRNA linear PLN 19-FEB-2004
DEFINITION Arabidopsis thaliana acetolactate synthase, chloroplast / acetohydroxy-acid synthase (ALS) (At3g48560) mRNA, complete cds.
Located on chromosome 3.

A discontiguous megablast of the gene sequence above against all publicly available Viridiplantae genome sequences using the standard NCBI parameters shows that the AHAS gene is present in many plant species. Genes encoding acetohydroxyacid synthase are found in both the E. coli genome and the Agrobacterium tumefaciens C58 genome (Genbank accessions NC000913 and NC003062 respectively).
Furthermore, functional expression of plant AHAS genes in bacterial systems to complement deficiencies in AHAS has been well established. For example, the AHAS genes from Arabidopsis thaliana (Smith et al., Proceedings of the National Academy of Science, USA, 86: 4179-4183, 1989), Nicotiana tabacum (Kim and Chang, Journal of Biochemistry and Molecular Biology, 28: 265-270, 1995), and Brassica napus (Wiersma et al., Molecular and General Genetics, 224: 155-159, 1990) have been used to complement AHAS-deficient bacteria such as Escherichia coli and Salmonella typhimurium.

Furthermore, it will be understood by those with ordinary skill in the art that plant-derived sequences such as AHAS known to complement bacterial deficiencies can be placed under the control of plant promoters known to be transcriptionally active in bacteria. For example, Jacob et al. (Transgenic Research, 11: 291-303, 2002) describe several such plant promoters, one of which is the potato ST-LS1 promoter. In order to provide an example in which all the components of the present invention are derived from a single plant species, we have isolated the potato (Solanum tuberosum) AHAS gene. This gene can be used in the manner described above to provide a bacterial selectable marker gene to maintain vectors in bacteria.

Using potato (Solanum tuberosum) cultivar 'Iwa' genomic DNA as a template, various fragments of the AHAS gene were isolated based on primers designed from related species. These fragments were cloned, their DNA sequence determined, and a composite consensus sequence generated for the potato AHAS gene.
In order to generate the complete sequence for a single allele, the following primers flanking to coding region were designed: Primer Q: 5'TAGCCATTTTGCCTCCTTTC3' Primer R: 5'CAACGGCAAACTAGACAGATAGAA3' 20 A polymerase chain reaction was then performed with high fidelity Pwo polymerase with primers Q and R to amplify a fragment using genomic DNA from potato cultivar 'Iwa' as a template. This product was A-tailed, and ligated into pGemT (Promega) following the manufacturers' instructions. The cloned AHAS allele was then sequenced using primers based on the consensus sequence anchored about every 400 bp along the cloned fragment. 25 The following sequence for the coding region of a potato cultivar 'Iwa' AHAS allele (from the start codon to the stop codon) was obtained: 1 ATGGCGGCTG CTGCCTCACC ATCTCCATGT TTCTCCAAAA CCCTACCTCC ATCTTCCTCC 30 61 AAATCTTCCA CCATTCTTCC TAGATCTACC TTCCCTTTCC ACAATCACCC TCAAAAAGCC 121 TCACCCCTTC ATCTCACCCA CACCCATCAT CATCGTCGTG GTTTCGCCGT TTCCAATGTC 181 GTCATATCCA CTACCACCCA TAACGACGTT TCTGAACCTG AAACATTCGT TTCCCGTTTC 241 GCCCCTGACG AACCCAGAAA GGGTTGTGAT GTTCTTGTGG AGGCACTTGA AAGGGAGGGG 301 GTTACGGATG TATTTGCGTA CCCAGGAGGT GCTTCTATGG AGATTCATCA GGCTTTGACA 35 361 CGTTCGAATA TTATTCGTAA TGTGCTGCCA CGTCATGAGC AAGGTGGTGT GTTTGCTGCA 421 GAGGGTTACG CACGGGCGAC TGGGTTCCCT GGTGTTTGCA TTGCTACCTC TGGTCCGGGA WO 2005/121346 PCT/NZ2005/000117 481 GCTACGAATC TTGTTAGTGG TCTTGCGGAT GCTTTGTTGG ATAGTATTCC GATTGTTGCT 541 ATTACGGGTC AAGTGCCGAG GAGGATGATT GGTACTGATG CGTTTCAGGA AACGCCTATT 601 GTTGAGGTAA CGAGATCTAT TACGAAGCAT AATTATCTTG TTATGGATGT AGAGGATATT 661 CCTAGGGTTG TTCGTGAAGC GTTTTTTCTA GCGAAATCGG GACGGCCTGG GCCGGTTTTG 5 721 ATTGATGTAC CTAAGGATAT TCAGCAACAA TTGGTGATAC CTAATTGGGA TCAGCCAATG 781 AGGTTGCCTG GTTACATGTC TAGGTTACCT AAATTGCCTA ATGAGATGCT TTTGGAACAA 841 ATTATTAGGC TGATTTCGGA GTCGAAGAAG CCTGTTTTGT ATGTGGGTGG TGGGTGTTTG 901 CAATCAAGTG AGGAGCTGAG ACGATTTGTG GAGCTTACGG GTATTCCTGT GGCGAGTACT 961 TTGATGGGTC TTGGAGCTTT TCCAACTGGG GATGAGCTTT CCCTTCAAAT 
GTTGGGTATG 10 1021 CATGGGACTG TGTATGCTAA TTATGCTGTG GATGGTAGTG ATTTGTTGCT TGCATTTGGG 1081 GTGAGGTTTG ATGATCGAGT TACTGGTAAA TTGGAAGCTT TTGCTAGCCG AGCGAAAATT 1141 GTCCACATTG ATATTGATTC GGCTGAGATT GGAAAGAACA AGCAACCTCA TGTTTCCATT 1201 TGTGCAGATA TCAAGTTGGC ATTACAGGGT TTGAATTCCA TATTGGAGGG TAAAGAAGGT 1261 AAGCTGAAGT TGGACTTTTC TGCTTGGAGA CAGGAGTTAA CGGAACAGAA GGTGAAGTAC 15 1321 CCATTGAGTT TTAAGACTTT TGGTGAAGCC ATCCCTCCAC AATATGCTAT TCAGGTTCTT 1381 GATGAGTTAA CTAACGGAAA TGCCATTATT AGTACTGGTG TGGGGCAACA CCAGATGTGG 1441 GCTGCCCAAT ACTATAAGTA CAAAAAGCCA CACCAATGGT TGACATCTGG TGGATTAGGA 1501 GCAATGGGAT TTGGTTTGCC TGCTGCAATA GGTGCGGCTG TTGGAAGACC GGGTGAGATT 1561 GTGGTTGACA TTGATGGTGA CGGGAGTTTT ATCATGAATG TGCAGGAGTT AGCAACAATT 20 1621 AAGGTGGAGA ATCTCCCAGT TAAGATTATG TTGCTGAATA ATCAACACTT GGGAATGGTG 1681 GTTCAATGGG AGGATCGATT CTATAAGGCT AACAGAGCAC ACACTTACTT GGGTGATCCT 1741 GCTAATGAGG AAGAGATCTT CCCTAATATG TTGAAATTCG CAGAGGCTTG TGGCGTACCT 1801 GCTGCAAGAG TGTCACACAG GGATGATCTT AGAGCTGCCA TTCAAAAGAT GTTAGACACT 1861 CCTGGGCCAT ACTTGTTGGA TGTGATTGTA CCTCATCAGG AGCACGTTCT ACCTATGATT 25 1921 CCCAGTGGCG GTGCTTTCAA AGATGTGATC ACAGAGGGTG ATGGGAGACG TTCATATTGA (SEQ ID NO:10) WO 2005/121346 PCT/NZ2005/000117 Example 4 Identification of a plant-derived sequences that function as plasmid origins of replication in bacteria Preferred intragenic vectors of the invention comprise an origin of replication that functions in 5 E coli and Agrobacteriun tumefaciens. Plant derived bacterial origins of replication in this example are based on the smallest known prokaryotic replication origins of Colicin E plasmids (ColE plasmids), specifically ColE2-P9 (from Shigella sp.) and ColE3-CA38 (from E. coli). The minimal replication origins of these 10 plasmids, named COLE2 and COLE3, require only 1 specific factor (Rep) to be provided in trans. Plasmids pBX243 and pBX343 provide Rep in trans for ColE2 and ColE3 respectively. 
The minimal origins also require host DNA polymerase I and other factors (see Yasueda et al., Molecular and General Genetics, 215: 209-216, 1989; Shinohara and Itoh, Journal of Molecular Biology, 257: 290-300, 1996).

There are 2 differences between ColE2 and ColE3 origin sequences: one mismatch and a deletion of a single nucleotide in ColE2 (or an insertion in ColE3). The deletion/insertion, not the mismatch, is responsible for determining the plasmid specificity in the interaction of the origins with the trans-acting factors.

Characteristic features of these sequences are two direct repeat sequences of 7 bp (5'CAPuATAA) or of 9 bp (APyCAPuATAA) which are separated from each other by 7 bp or 5 bp in ColE2 and by 8 bp or 6 bp in ColE3.

ColE2 AGACCAGATAAGCCT TATCAGATAACAGCGCC (SEQ ID NO:11)
ColE3 AGACCAAATAAGCCTATATCAGATAACAGCGCC (SEQ ID NO:12)

The one nucleotide mismatch G/A can be substituted without effect. Only T/A is acceptable in the insertion position and the third to last position can be G or an A. It is likely that other changes can also be made that do not affect the composition of the two direct repeat sequences. Consensus sequences for ColE2 and ColE3 can be described as:

R = G or A (Pu)
Y = C or T (Py)
W = A or T

Consensus ColE2 AGAMjCAKIATAAGCCT TA CAATAACAGCgCC
Consensus ColE3 AG CARATAAGCCT A A ACAGC CC

Other minimal replication origins and Rep genes from other (Colicin E) plasmids could also be used when constructing plant-derived replication origins (Table 2).

Table 2. Replication origins from ColE plasmids that could be used to construct plant derived replication origins.
15 Original Host Plasmid (putative) minimal origins NCBI Accession Shigella sp ColE2-P9 agaccagataa-gcct-tatcagataacagcgcc D30054 20 Escherichia coli ColE3-CA38 agaccaaataa-gcctatatcagataacagcgcc D30055 Escherichia coli ColE2-CA42 -ga-aaata--gcctatatcagataacagcgcc D30056 Escherichia coli ColE2-GE1602 -ga-aaata-gctatatcagataacagcgcc D30057 Escherichia coli ColE2imm-K317 -ga--aaata-gctatatcagataacagcgcc D30058 Shigella sonnei CoIE4-CT9 agaccagataa-gcct-tatcagataacagcgcc D30059 25 Shigella sonnei CoIE5-099 agaccaaataaaacctatatcagataacagcgct D30060 Shigella sonnei ColE6-CT14 agaccaaataaaacetatatcagataacagcgct D30061 Escherichia coli ColE7-K317 agaccaaataa-gcctatatcagataacagcgcc D30062 Escherichia coli ColE8-J agaccaaataa-gcctatatcagataacagcgcc D30063 Escherichia coli ColE9-J agaccaaataaaacctatatcagataacagcgct D30064 30 Potato COLE2-like replication sequence The Co1E2 consensus sequence was used to search publicly available potato (Solanum tuberosun) DNA sequences. The potato COLE2-like replication sequence POTCOLE2 was 35 constructed in silico from two sequences, accessions: WO 2005/121346 PCT/NZ2005/000117 SGN U254575 nucleotides 359-721 correspond to POTCOLE2 1-363 TIGR EST494490 nucleotides 248-693 correspond to POTCOLE2 362-807 5 TGGCCACAAAACAAGCGCCAAACAACGAGCAACAACAAATCAAGATTGCACCAAAACTAGAA AATTAAAGAAGAGTATCACCCCAAATGCGTTACTGTTCACGACCTCAAATCAGAATCTACAG ATCTCTAAATCCGATCTCCACTGTTGAATTGCAAGAACCAGATGCTGAGAACTCTCAGTTCA AATTTGAGCACGATCCAACGGTTAACGAAGCGGCAAACTCTGTCTGAAGCGGACTGCCTGAG CAGAAAATTTCCAGAAGCAAAAACGGGATTTTCTCTTTTTCTCTCAATCTCTAAAACGAATC 10 TCTCTTGATTTTTCTCTCTTGTGTTTCTGAAAATAAGACCAAATAAGCCTTATCAGATAACA GCACCTGAAGCAGCTCATGTAGCTTGTCAGCACCAGGTCCTGGCCTAAACACTGTATCATTG CCACGCAAAGAGCACGGGTCTGCTCCACCATCTGATGACCCAATACACCATGCACCTGGCGA AAACCTATGTGTGCGCCCCAAAAGTTCTTTGTCAATCTTGTCTAGGACTGGATCATCCCTTG CAAA-TTTTCGGGCAAAGGGAGCATTGCTCTTGGCCATTTTGTCGAAGTTCTTCATAGTTAGA 15 GACATGGGATGTTGCTTTGGTGGACTGTCCCAAGCAATGTAATGAAGATCGTGGCTTATTGC 
TGTGTGCCGAAATTTCTGAGTGTTGCAAATGACAGTGTGAAAATATCCCTCTGGCGAAGAGA
CAAAATTTGTATAATACATAAGCATAGTCCGTGGAAAGTTATCCCATCCCCATATGCAGTAC
T (SEQ ID NO:13)

The POTCOLE2 replication sequence, the minimal origin sequence, is underlined.

The 807 bp POTCOLE2 sequence was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTCOLE2).

The following primers were designed to PCR amplify the spectinomycin resistance gene of pART27 (Gleave, Plant Molecular Biology, 20: 1203-1207, 1992).

Primer S: 5'GTGTCGACAACTACGATACGG3' (SEQ ID NO:14)
Primer T: 5'CGTAAGCTTGAACGAATTCTTAG3' (SEQ ID NO:15)

Nucleotides underlined represent a SalI site in primer S and represent HindIII and EcoRI sites in primer T. A 1661 bp fragment with the spectinomycin resistance gene was PCR amplified from pART27 using high fidelity Pwo polymerase. This fragment was ligated as a SalI to HindIII region into pUC57POTCOLE2 to give pUC57POTCOLE2SPEC and position the spectinomycin resistance gene immediately adjacent to the POTCOLE2 fragment.

The fragment corresponding to the spectinomycin resistance gene and the POTCOLE2 was isolated as a 2.5 kb EcoRI fragment from pUC57POTCOLE2SPEC and self-circularised to generate pPOTCOLE2SPEC. The ligation was transformed into E. coli DH5α harbouring helper plasmid pBX243 (with an ampicillin resistance gene) and transformants were selected on L plates supplemented with ampicillin and spectinomycin (100 µg/mL). Resulting colonies were picked, plasmid DNA isolated and analysed by restriction enzyme digest using BamHI and EcoRI. A BamHI/EcoRI double digest will release the Rep gene from pBX243 and will linearise pPOTCOLE2SPEC.

Successful bacterial propagation of pPOTCOLE2SPEC using the potato-derived origin of replication is evident as three bands on a gel at 3.9 kb, 2.5 kb, and 1.5 kb, representing the pBX243 backbone, linearised pPOTCOLE2SPEC, and the pBX243 Rep gene.
Control digests of only pBX243 result in two bands, the 3.9 kb pBX243 backbone and the 1.5 kb Rep gene. Figure 1 provides confirmation of replication of pPOTCOLE2SPEC in bacteria mediated by the potato-derived origin of replication.

COLE2-like sequences from the genomes of other plant species

A search on NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded COLE2-like sequences in the genomes of other plant species. Searches were made for the consensus ColE2 sequence:

AGACANATAAGCCT T ACA TAACAGCIACC

It was possible to readily assemble COLE2-like sequences by joining two sequences from the same species with a few mismatches unlikely to impact on the functionality of the origin of replication. In the examples below the position of the join between two EST sequences is indicated by "/" and mismatches to the consensus sequence are indicated in bold.
Species Sequence WO 2005/121346 PCT/NZ2005/000117 Consensus ColE2 AGAMCA)CtATAAGCCT TAiCA'ATAACAGCgCC Allium cepa AGACCAAATAAGCTC/TATCAGATAGCAGCTGC (SEQ ID NO: 16) 5 CF436111/CF452305 Beta vulgaris AGGCCAAATAAGCCT/TATCAGATAACAGCGCC (SEQ ID NO:17) BQ589076/BQ590618 10 Medicago trunculata AAATCAAATAAGCCTTATCA/GATAACAGCACC (SEQ ID NO:18) TC111839/TC102142 15 Gossypium arboreum AGACCAGATAATCCT/TAACAGATAACAGCGCC (SEQ ID NO:19) BM358442 /AW729597 Hordeum vulgare AGATCAGATAAGCCTTA/TCAGATAACAGCGCC (SEQ ID 20 NO:20) DN158808/AV924388 Sorghum bicolor GACCAGATAAGCATTATTAG/ATAACAGCGCC (SEQ ID NO:21) 25 CF427156 /CX613542 Picea glauca AGACCTAATAAGCCT/TATCAGATAACTGTGCG (SEQ ID NO:22) C0485190/ CO235782 30 Theobroma cacao AGACCAAATAAGACTTA/TCAGATAACAGCACG (SEQ ID NO:23) CA796667/CF974720 WO 2005/121346 PCT/NZ2005/000117 Mesembryanthemum crystallinum CA838853 / BE036300 AAACCAAATAAGCTTTA/TCAGATAACAGCACA (SEQ ID NO:24) 5 Petunia hybrida GCATCAGATAAGCCT/AACCAAATAACAGCAAC (SEQ ID NO:25) NP1240078/TC390 Brassica napus AGACCAGATAAGACT/CATCAGATAACAACACA (SEQ ID 10 NO:26) CD814492/CD814199 Zea mays CCACCAGATAAGCCTT/ATCAGATAACAGTTGC (SEQ ID NO:27) 15 DN211845/DN232238 Pinus taeda AGTACAGATAAGCCTT/ACCAAATAACAACACC (SEQ ID NO:28) DRO 19180/ BQ699992 20 COLE3-like sequences from the genomes of plant species A search on NCBI GenBank (http://www.nebi.nlm.nih.gov/BLAST/) and TIGR database (http://tigrblast.tigr.org/tgi/) using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, yielded COLE3-like sequences in the genomes of 25 other plant species. Searches were made for the consensus ColE3 sequence: AGA-CAkIATAAGCCT(TAff!CAlTAACAGCRCC It was also possible to readily assemble COLE3-like sequences by joining two sequences 30 from the same species with a few mismatches unlikely to impact on the functionality of the replication origin. 
In the examples below the position of the join between two EST sequences is indicated by "/" and mismatches to the consensus sequence are indicated in bold.

Species Sequence
Consensus ColE3 AGARCA ATAAGCCTNTA9CAOITAACAGCRCC
Allium cepa AGACCAAATAAGC/TTATATCAGATAGCAGCTGC (SEQ ID NO:29) CF436111/CF452305
Vitis vinifera AGATCAGATAAGCCTTTA/TCAGATAACAGCCCC (SEQ ID NO:30) CF207293/CF515867
Nicotiana tabacum AGACCAAATAAGCA/TATATCAGATAACTTCGGA (SEQ ID NO:31) TC1407/BP530912
Glycine max TGACCAAATAAGCTTATAT/CAGATAACAGAGTC (SEQ ID NO:32) BE209626/AI959871
Saccharum officinarum AGACCAAATAACCCTAAAT/CAGATAACAACGC (SEQ ID NO:33) CA156596/CA092850
Secale cereale AGACCAAATAATCATAT/TTCAGATAACAGCGCC (SEQ ID NO:34) BE704886/CD453313
Capsicum annuum AAACCAAATAAGCAAA/TATCAGATAACTTCGCA (SEQ ID NO:35) CA525915/TC6186
Populus euphratica GCACCAAATAAGCCAATA/TCAGATAACAGCTGC (SEQ ID NO:36) AJ777378/AJ768273
Lotus japonicus CTACCAAATAAGCA/TATATCAGATAACAGCGTA (SEQ ID NO:37) TC15168/AV774815
Medicago truncatula AGATCAAATAAGCCTTTA/TCAGATAACAGCAGA (SEQ ID NO:38) CR931730/AC148360

Note: the last example was derived from nr database rather than an EST database.
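The comparisons described in this example amount to a position-by-position check of an assembled sequence against a reference origin. The following is a minimal Python sketch of that check and is not part of the original disclosure: the helper names `join_ests` and `differences` are hypothetical, the ColE2 origin is written with an assumed "-" at the single-nucleotide deletion, and the comparison is made against the ColE3-CA38 origin itself (SEQ ID NO:12) rather than against the degenerate consensus.

```python
# Sketch: compare replication-origin sequences position by position.
COLE2 = "AGACCAGATAAGCCT-TATCAGATAACAGCGCC"  # SEQ ID NO:11, "-" marks the deletion
COLE3 = "AGACCAAATAAGCCTATATCAGATAACAGCGCC"  # SEQ ID NO:12


def join_ests(assembled: str) -> str:
    """Remove the "/" that marks the join between two EST sequences."""
    return assembled.replace("/", "").replace(" ", "").upper()


def differences(query: str, reference: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, query base, reference base) for each difference."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, q, r)
        for i, (q, r) in enumerate(zip(query, reference))
        if q != r
    ]


if __name__ == "__main__":
    # The two differences between the ColE2 and ColE3 origins.
    print(differences(COLE2, COLE3))
    # An assembled plant sequence (Vitis vinifera, SEQ ID NO:30) against ColE3.
    vitis = join_ests("AGATCAGATAAGCCTTTA/TCAGATAACAGCCCC")
    print(differences(vitis, COLE3))
```

For the two bacterial origins, this comparison reports the G/A mismatch at position 7 and the deletion/insertion at position 16, which are the two differences described earlier in this example. Sequences of unequal length would first need to be aligned by hand, as done here for ColE2.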
Example 5 Identification of plant-derived sequences that function as a selectable marker in bacteria: operator-repressor titration

Antibiotics and antibiotic resistance genes are traditionally used for the selection and maintenance of recombinant plasmids in hosts such as E. coli and A. tumefaciens. Their continued use is undesirable in plant biotechnology, where the threat of horizontal transfer to other microbes exists. An alternative plasmid selection strategy based on the phenomenon of repressor titration was developed by Cobra Biomanufacturing Plc (WO 03/097838 A1; Williams et al., Nucleic Acids Research, 26: 2120-2124, 1998; Cranenburgh et al., Nucleic Acids Research, 29: e26, 2001; Cranenburgh et al., Journal of Molecular Microbiology and Biotechnology, 7: 197-203, 2004).

Background

The Operator-Repressor Titration (ORT) system enables selection and maintenance of plasmids that are free from expressed selectable marker genes and require only the short non-expressed lac operator for selection and maintenance.

E. coli ORT strain DH1lacdapD (genotype recA endA1 gyrA96 thi1 hsdR17 supE44 relA1 Δ(dapD)::kan hipA::lac-dapD) contains a chromosomal conditionally essential gene dapD under the control of the lac operator/promoter system. Under normal conditions, a repressor protein encoded by a second chromosomal gene binds to the chromosomal lac operator and prevents transcription of dapD, and cells lyse. Growth is permitted when an inducer (IPTG) is provided, i.e. on a nutrient agar plate. Alternatively, growth is also permitted when a plasmid containing a lac operator sequence is introduced into the cell. The repressor protein binds to the plasmid-borne operator sequence, derepressing the chromosomal operator and allowing dapD expression.

Two lac operator sequences have previously been shown to function as plasmid selectable elements: LacO1 is 21 bp and is derived from the wild-type E. coli lac operon.
LacO is 20 bp and is an 'ideal' version of LacO1, being a perfect palindrome of the first 10 bp of LacO1.
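The relationship between the two operators can be checked directly. The sketch below is illustrative only and is not part of the original disclosure (the function name `reverse_complement` is an assumption): it builds a 20 bp palindrome from the first 10 bp of LacO1 followed by their reverse complement, and compares the result with LacO.

```python
# Sketch: LacO as a perfect palindrome of the first 10 bp of LacO1.
LACO1 = "AATTGTGAGCGGATAACAATT"  # SEQ ID NO:39, 21 bp
LACO = "AATTGTGAGCGCTCACAATT"    # SEQ ID NO:40, 20 bp

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# First half of LacO1 plus its reverse complement gives the 'ideal' operator.
half = LACO1[:10]
ideal = half + reverse_complement(half)

if __name__ == "__main__":
    print(ideal == LACO)                     # the constructed palindrome is LacO
    print(reverse_complement(LACO) == LACO)  # LacO reads the same on both strands
```

A sequence that equals its own reverse complement is what is meant here by a perfect palindrome; LacO1 itself does not satisfy this test because its length is odd and its two halves differ.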
LacO1: AATTGTGAGCGGATAACAATT (SEQ ID NO:39)
LacO: AATTGTGAGCGCTCACAATT (SEQ ID NO:40)

Previous research conducted to understand the lac operator has involved the analysis of various operator analogues, or LacO1-like sequences, i.e. sequences that are able to titrate the lac repressor. As little as 13 bp of a 14 bp symmetrical consensus sequence TGTGAGCGCTCACA is able to bind the lac repressor (Simons et al., Proceedings of the National Academy of Science, USA, 81: 1624-1628, 1984). No work has been conducted to show that these LacO-like sequences will function as plasmid selectable elements but it appears likely. With only 13-14 bp required, it is statistically probable that sequences capable of binding the lac repressor and acting as a plasmid selectable element will be found in all plant genomes.

Search for LacO-like sequences in plant genomes

Searches of NCBI GenBank (http://www.ncbi.nlm.nih.gov/BLAST/) and the TIGR database (http://tigrblast.tigr.org/tgi/), using the BLAST tool "search for short, nearly exact matches" and searching within the EST databases, were made for the LacO sequence:

AATTGTGAGCGCTCACAATT (SEQ ID NO:40)

The following list gives NCBI accession numbers that have sequences identical to at least 10 bp that comprise one of the two inverted repeats that make up LacO:

Dicotyledonous plants
Camelliaceae
Camellia sp. CV014004
Chenopodiaceae
Beta vulgaris CV301904
Chenopodium sp.
CN782052 Compositae Lactuca sativa BQ999659 Helianthus annuus CD848561 BU671795 BU672101 Helianthus argophyllus CF097400 WO 2005/121346 PCT/NZ2005/000117 Convolvulaceae Ipomoea batatas CB330510 Ipomoea nil BJ577170 BJ576074 BJ570027 Cruciferae 5 Brassica rapa L33645 Brassica rapa pekinensis CV523215 Brassica napus CD837028 Thellungiella halophila BM985805 Thellungiella salsuginea DN777083 10 Cucurbitaceae Cucumis sativus DN910885 Ericaceae Vaccinium corymbosum CV191519 Euphorbiaceae 15 Manihot esculenta CK651449 Lauraceae Persea americana CK758014 Leguminosae Arachis hypogaea CX127972 CD038632 20 Cicer arietinum CD051347 Glycine max CX708729 Glycine soja BG041485 Lotus corniculatus var.japonicus BP073115 Phaseolus vulgaris CV542405 25 Phaseolus coccineus CA914174 Linaceae Linum usitatissimum CV478930 CV478503 Malvaceae Gossypium hirsutum DR044140 30 Gossypium raimondii C0131784 Myrtaceae Eucalyptus tereticornis CD669011 Pedaliaceae Sesamum indicum BU668313 WO 2005/121346 PCT/NZ2005/000117 Plumbaginaceae Limonium bicolor CX263567 Rosaceae Fragaria x ananassa AB208578 5 Malus x domestica CV882575 Prunuspersica AJ825706 Prunus armeniaca CV048462 Prunus dulcis B1203104 Pyrus communis AJ504986 10 Rutaceae Citrus sinensis DN618703 CN191267 Citrus reticulata CF828122 Salicaeae Populus tremula BU823277 15 Solanaceae Capsicum annuum BM066713 Lycopersicon esculentum BP883198 BP876932 Medicago truncatula AL3 84864 Petunia x hybrida CV300353 20 Solanum habrochaites DN168862 Solanum tuberosum DN849072 CV469139 Sterculiaceae Theobroma cacao CF974287 Vitaceae 25 Vitis vin'fera CX127882 Monocotyledonous plants Gramineae Avena sativa CN821127 Hordeum vulgare CV062014 30 Oryza sativa CR290368 Saccharum officinarum CA104782 Sorghum bicolor CX607714 Schedonorus arundinaceus CK802645 Triticum aestivum BQ578949 WO 2005/121346 PCT/NZ2005/000117 Triticum monococcum BQ801760 Zea mays C0526196 Liliaceae Allium cepa CF448121 5 Gymnosperms Pinus taeda DR013559 Picea engelmannii x sitchensis 
C0213279 Picea glauca CK441720 Pinus pinaster BX676975 10 Pseudotsuga menziesii CN638414 Bryophytes Marchantia polymorpha AU081717 Algae Chlamydomonas reinhardtii CF558875 15 Search for LacOi-like sequences in plant genomes A search on NCBJ GenBank (http://www.ncbi.nlmn.nih.gov/BLAST/) and TJGR database (hff-tp://tigrblast.tigr.or,-/tgi/) using the BLAST tool "search for short, nearly exact matches" 20 and searching within the EST databases, were made for GGATAACAATT. The 21 bp LacOlI sequence is identical in its first 10 bp to LacO. The following list gives accession numbers where at least the last 11 bp of the LacO 1 sequence, GGATAACAATT, are found: 25 Dicotyledonous plants Chenopodiaceae Beta vulgaris CX779649 CF542856 Compositae Lactuca sativa BU00482 1 BU008 839 30 Helianthus annuus BU671786 BQ965452 Convolvulaceae Ipomoea nil BJ567255 Cruciferae Brassica rapa CV433907 CV432343 WO 2005/121346 PCT/NZ2005/000117 Brassica napus CX195012 CD838296 Raphanus sativus AF051115 Cucurbitaceae Citrullus lanatus A1563425 5 Leguminosae Cicer arietinum CK148974 Glycine max C0036432 CX709893 Medicago trunculata BQ144942 Phaseolus vulgaris CV533775 CB543020 10 Phaseolus coccineus CA913133 Pisum sativum CD860446 Linaceae Linum usitatissimum CA482669 Malvaceae 15 Gossypium raimondii C0128755 Gossypium hirsutum C0497326 Rosaceae Malus x domestica CV997415 CN879093 CN882088 Prunus persica AJ823535 20 Prunus armeniaca CV048921 Prunus americana CV458467 CK758903 Pyrus communis AJ504896 Rutaceae Citrus sinensis CX675412 CX075530 25 Citrus clementina CX298649 Solanaceae Capsicum annuum CA516533 Lycopersicon esculentum B1926125 Nicotiana tabacum CN949741 BP535353 30 Solanum tuberosum CK719419 Vitaceae Vitis vinfera CD715798 Monocotyledonous plants Gramineae WO 2005/121346 PCT/NZ2005/000117 Avena sativa CN820280 Hordeum vulgare CK568615 Lolium multiflorum AU247989 Oryza sativa CF986696 5 Saccharum officinarum CA279301 Secale cereale BE495021 Sorghum bicolor CX615619 Triticum aestivum 
CK215572 Zea mays CF046268 10 Liliaceae Allium cepa BE205560 CF449604 Gymnosperms Pinus taeda AW065199 DN614133 Picea glauca C0251715 C0241938 15 Pinus pinaster BX682941 Pseudotsuga menziesii CN640766 20 Potato LacOl sequence as a recombinant plasmid selectable element The 21 bp LacOl sequence was used to search publicly available potato (Solanum tuberosum) EST sequences. Sequences were found in NCBI accessions CV501815 and CK259105 joined in silico with BglII restriction enzyme recognition sites (agatct) added to termini to make 693 bp POTLACOl: 25 agat ctAATATTTACTTCTCCACTTAAACAAATACCCCAATCAGAAT CACTAGCTGGCAGAT TCCTTGTCCTCTATTGACAGCAAACATAGACGTACATTATAGAGCCACCACAACATTAGACA AACATTCTTTAAACAAGAGGTGGATACTGCTTAGACTGCAGGCGCACCCTCTTTCGGTACTC CAGAACAT CCTGAATAAACATATGATACCCTTCAGTTTGGGCAGGATCAGCAGGGTTTGGCT 30 GATCTAACAAGTCCTGGATACCAACCAGTATCTGTTTCACGGTGATGGCTGGTCTCCACCCA CTATCTTCATTGAGGATCGACAAGCAAACTGTTCCAGATGGATAGACATTGGGATGGAAAAA GCCTGGTGGGAATTTACACTTTGGCGGTTTACTCGGATAATCTTCACTGAAGTGAATTGTGA GCGGATAACAATTGGGGAAATCATTATGTAAATTCAACAAATATTTCAATTTATGCATTAGC
AAATTGTTATCAGGATCTACCACATCAGGATTGTCTTCTATGCTACGCAGCTAGTCGAACTC
WO 2005/121346 PCT/NZ2005/000117 GACTCCCTCGTTGTCTTCCTGGTAAATCCGGTCGAATATATCTCGACGGATGTTTTCTCCGG TACGATCAATACAATTTTTTCAACGAAACTACTGATTCAGCTAAAGATACAGTGAACTGTAG CAGCTagatct (SEQ ID NO:41) 5 LacOl sequence is underlined. First nucleotide of CK259105 underlined and in bold. Terminal BglII restriction enzyme sites (agatct in lowercase) are not of potato sequence origin. 10 The 693 bp POTLACO 1 sequence was synthesised by Genscript Corporation (Piscatawa, NJ, www.genscript.com) and supplied cloned into the SmaI site of pUC57 (pUC57POTLACO1). POTLACOI was excised from pUC57POTLACO1 with BglI and ligated into pBR322 previously linearised with BanHI. The resulting plasmid pBR322POTLACO1 was 15 transformed into E. coli strain DHl lacdapD and colonies were selected using repressor titration. Plasmid DNA was isolated from selected colonies and digested with restriction enzyme PstI (see Figure 2). Linearised pBR322 is visualised as a band at 4.4 kb. Pstl digested pBR322POTLACO1 is visualised as two bands, one at 1.3 kb and one at 3.8 kb. The results indicate that POTLACO1 functions as plasmid selectable element. 20 Onion LacO and LacOl sequences Onion-derived LacOl and LacO sequences, ALLLACO and ALLLACO1, have also been made in silico. 
25 NCBI accessions CF448121 and CF450773 were used to generate a 756 bp ALLLACO (LacO sequence underlined): agatcttcgtcagcctacaactaccgacaatcccaaacccacatccgacgacgataactacg aaaatagggaggtggattctgacggagcttcggattccgacgatgattgggaaggggttgag 30 agcacggagttggatgagatttttagtgcggcgactacgtttatagctggactgctgcgga taagaattctgcaaaagtttcgaatgatctgcagctgcagttatatgggttttacaagattg ctactgaggggccttgtaccgttccccaaccttctgcacttaaaatgacagctcgtgccaag tggaatgcatggcagaaacttggttccatgcctcctgaagaagctatggagaagtacattgc WO 2005/121346 PCT/NZ2005/000117 aattgtgagcgctcacaattgcttttacgctatgtatgacaatatggataatcatggtgggg cccagagagccccaatgaatcctcagcaaattccatttggaaattcattatatggagctggg tctggactcatccgaggtggcttgggtgcctatggagagagatttttaggttcaagctccga gtttatgcagagcaatataagtagatggttctccaaccctcagtattactttcaagtgaatg 5 accagtatgtgaggaacaagttgaaagttgttttgtttccctttttacacagagggcattgg acaaggatcactgaaccggttggtggcaggctttcttacaaacctccaatttttgacatcaa tgccccagatct (SEQ ID NO:42) 10 NCBI accessions CF448121 and CF449604 were used to generate a 662 bp ALLLACO1 (LacO 1 sequence underlined): agatctggggcattgatgtcaaaaattggaggtttgtaagaaagcctgccaccaaccggttc agtgatccttgtccaatgccctctgtgtaaaaagggaaacaaaacaactttcaacttgttcc 15 tcacatactggtcattcacttgaaagtaatactgagggttggagaaccatctacttatattg ctctgcataaactcggagcttgaacctaaaaatctctctccataggcacccaagccacctcg gatgagtccagacccagctccatataatgaatttccaaatggaatttgctgaggattcattg gggctctctgggccccaccatgattatccatattgtcatacatagcgtaaaagcaattgtga gcggataacaattcatttcaaaagggaggaggaggaggacaacagagtcagggtcttacgct 20 ttttgtgaaaggttttgatagctctcaagatccattcacgattcgtgatactcttcgatcgc attttgagtcctgtggagagatttctcgtgtttcagttccaaaagattttgaaaccggcagc tccagggggattgcgtacattgatttcaatgaacaagagagttttaacaaagccctagaact gaatggatcagaaatagatggatactacctggttgttgatca (SEQ ID NO:43) 25 Other LacOl-like sequences from plant genomes It is highly likely that LacO 1-like sequences will also function as plasmid selectable elements. For example, the following sequence was also found. 
gi|14495426|gb|BE205560.1| API26F NPI Onion cDNA library Allium cepa cDNA clone API26F similar to catalase, mRNA sequence.
Length = 628

Score = 28.2 bits (14), Expect = 0.34
Identities = 21/22 (95%), Gaps = 1/22 (4%)
Strand = Plus / Minus

Query: 1   aattgtgagcgg-ataacaatt 21
           |||||||||||| |||||||||
Sbjct: 477 aattgtgagcgggataacaatt 456

Example 6 Design and construction of an intragenic vector for Arabidopsis thaliana

The consensus T-DNA border sequence can be defined as:

5'GGCAGGATATATXXXXXTGTAAXX3'

although other variants can include:

5'GACAGGATATATXXXXXGGTCAXX3'

(nucleotides in bold represent possible substitutions in some T-DNA borders).

Searches for such DNA sequences identified a single sequence in the A. thaliana genome corresponding to:

5'GACAGGATATATCGTGATGTCAAC3' (SEQ ID NO:44) [ex AL138652 from chromosome 3, bp 60629-60606]

This "T-DNA border" is remarkably similar to authentic T-DNA borders from Agrobacterium Ti or Ri plasmids, with all nucleotide substitutions occurring in variable regions:

5'GACAGGATATATGGTGATGTCACG3' pTiS4 (SEQ ID NO:45)
5'GACAGGATATATGTTCTTGTCATG3' pRi (TR rb) (SEQ ID NO:46)
5'GGCAGGATATATCGAGGTGTAAAA3' pTi15955 (TR lb) (SEQ ID NO:47)
5'GACAGGATATATTGGCGGGTAAAC3' pTiT37 (rb) (SEQ ID NO:48)
5'GACAGGATATATTGGCGGGTAAAC3' pTiC58 (rb) (SEQ ID NO:49)

(nucleotides in bold represent nucleotide substitutions)

The A. thaliana "T-DNA border" is from an open reading frame (nucleotides 59676-63206 from AL138652) for a putative protein of unknown function [i.e. no promoters and presumably not a heterochromatic region].
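The search for border-like sequences described above can be sketched as a pattern match over both strands. This is only an illustration, not the search tool used in the specification. It assumes that the three positions at which the two quoted consensus forms differ (positions 2, 18 and 21) may vary independently, and that X stands for any nucleotide. The names BORDER and find_borders and the 29 nt test sequence are hypothetical.

```python
import re

# Both consensus forms quoted in the text, merged into one pattern.
# Assumption: positions 2, 18 and 21 vary independently; X is any nucleotide.
BORDER = re.compile(r"G[GA]CAGGATATAT[ACGT]{5}[TG]GT[AC]A[ACGT]{2}")
_OVERLAPPING = re.compile(f"(?=({BORDER.pattern}))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_borders(seq):
    """List (strand, 1-based position of the motif's 5' end, motif) on both strands."""
    seq = seq.upper()
    hits = [("+", m.start() + 1, m.group(1)) for m in _OVERLAPPING.finditer(seq)]
    rc = reverse_complement(seq)
    for m in _OVERLAPPING.finditer(rc):
        # Convert a position on the reverse strand back to forward numbering.
        hits.append(("-", len(seq) - m.start(), m.group(1)))
    return hits


# Border sequences as listed in the text (SEQ ID NO:44 to 48).
KNOWN = {
    "A. thaliana": "GACAGGATATATCGTGATGTCAAC",
    "pTiS4": "GACAGGATATATGGTGATGTCACG",
    "pRi (TR rb)": "GACAGGATATATGTTCTTGTCATG",
    "pTi15955 (TR lb)": "GGCAGGATATATCGAGGTGTAAAA",
    "pTiT37 (rb)": "GACAGGATATATTGGCGGGTAAAC",
}

for name, border in KNOWN.items():
    print(name, bool(BORDER.fullmatch(border)))

# Hypothetical test sequence carrying the A. thaliana border on the reverse strand.
toy = "GG" + reverse_complement(KNOWN["A. thaliana"]) + "AAA"
print(find_borders(toy))
```

A reverse-strand hit is reported by the forward coordinate of the motif's 5' end, which mirrors the way the A. thaliana border is quoted high-to-low as bp 60629-60606.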
Examination of the sequences flanking this "T-DNA border" reveals a 2838 bp fragment (nucleotides 59735-62572 from AL138652) with several unique restriction sites suitable as potential insertion sites for other genes and for Southern analysis of plants transformed using this vector.

If the "T-DNA border" found at nucleotides 60629-60606 is considered the "left border" of a binary vector, there are several unique restriction sites, including XbaI, between this left border and the first three nucleotides equivalent to a right border at positions 59735-59737. The right border beyond these three nucleotides can be provided by authentic right border sequences of non-plant origin, thereby resulting in a "chimeric right border".

In assembling an intragenic vector, all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

The following primers were designed:

Primer A: 5'CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC3' (SEQ ID NO:50)
Primer B: 5'AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC3' (SEQ ID NO:51)
Primer C: 5'AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC3' (SEQ ID NO:52)
Primer D: 5'GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG3' (SEQ ID NO:53)

Restriction sites within the primers are indicated in italics: TCTAGA - XbaI site, CTCGAG - XhoI site, GTCGAC - SalI site.

Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers A and B, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "right border" 703 bp fragment, which was subsequently restricted with XbaI and XhoI and ligated into pPROEX-1 restricted with the same endonucleases, to form pPROEX-1-rb.
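The relationships between the four primers can be checked with a short script. The sketch below uses the primer sequences as printed (primers B and C joined across the line breaks of the listing). It reports which of the three named restriction sites each primer carries, tests whether primers A and D are reverse complements of each other, and tests whether primer B carries the reverse complement of the pTiT37 right border (SEQ ID NO:48). The helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Primer sequences as listed (SEQ ID NO:50 to 53).
PRIMERS = {
    "A": "CCGAGGAGGTGCTAGAGCTCTAGAGCGTAAAGGAATGTCC",
    "B": "AAAGGCTCGAGGTTTACCCGCCAATATATCCTGTCTATGTTTCACATGAACACGTGAATCTTC",
    "C": "AAAGGGTCGACTAGATCTTTCGGTTGTGTGAATGATTCCGATGAGAGAAGAAGAC",
    "D": "GGACATTCCTTTACGCTCTAGAGCTCTAGCACCTCCTCGG",
}

# Recognition sequences named in the text. All three are palindromic,
# so searching one strand is sufficient.
SITES = {"XbaI": "TCTAGA", "XhoI": "CTCGAG", "SalI": "GTCGAC"}

PTIT37_RIGHT_BORDER = "GACAGGATATATTGGCGGGTAAAC"  # SEQ ID NO:48


def sites_in(primer):
    """Names of the listed restriction sites present in a primer, sorted."""
    return sorted(name for name, site in SITES.items() if site in primer)


for name, seq in PRIMERS.items():
    print(name, sites_in(seq))

print("A and D are reverse complements:",
      reverse_complement(PRIMERS["A"]) == PRIMERS["D"])
print("B carries the pTiT37 right border:",
      reverse_complement(PTIT37_RIGHT_BORDER) in PRIMERS["B"])
```

Both checks come out true for the sequences as printed. Primers A and D therefore anneal to the same stretch around the XbaI site on opposite strands, and the authentic right border sequence enters the construct through the tail of primer B.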
Using Arabidopsis thaliana 'Columbia' genomic DNA as a template and primers C and D, a polymerase chain reaction was performed using high fidelity Pwo polymerase to amplify a "left border" 2216 bp fragment, which was subsequently restricted with XbaI and SalI and ligated into pPROEX-1-rb restricted with the same endonucleases, to form pPROEX-AtTD.

The 2864 bp SalI to XhoI fragment of pPROEX-AtTD was ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form pTC1. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing. The full sequence of pTC1 is shown below and comprises a 2838 bp DNA fragment derived from Arabidopsis thaliana (nucleotides 59735-62572 from AL138652), presented in italics. The right and left T-DNA borders are in bold and the unique XbaI site used for subsequent cloning is in bold and underlined.

1 GTCGACGGAT CTTTTCCGCT GCATAACCCT GCTTCGGGGT CATTATAGCG ATTTTTTCGG 61 TATATCCATC CTTTTTCGCA CGATATACAG GATTTTGCCA AAGGGTTCGT GTAGACTTTC 121 CTTGGTGTAT CCAACGGCGT CAGCCGGGCA GGATAGGTGA AGTAGGCCCA CCCGCGAGCG 181 GGTGTTCCTT CTTCACTGTC CCTTATTCGC ACCTGGCGGT GCTCAACGGG AATCCTGCTC 241 TGCGAGGCTG GCCGGCTACC GCCGGCGTAA CAGATGAGGG CAAGCGGATG GCTGATGAAA 301 CCAAGCCAAC CAGGGGTGAT GCTGCCAACT TACTGATTTA GTGTATGATG GTGTTTTTGA 361 GGTGCTCCAG TGGCTTCTGT TTCTATCAGC TGTCCCTCCT GTTCAGCTAC TGACGGGGTG 421 GTGCGTAACG GCAAAAGCAC CGCCGGACAT CAGCGCTATC TCTGCTCTCA CTGCCGTAAA 481 ACATGGCAAC TGCAGTTCAC TTACACCGCT TCTCAACCCG GTACGCACCA GAAAATCATT 541 GATATGGCCA iGAATGGCGT TGGATGCCGG GCAACAGCCC GCATTATGGG CGTTGGCCTC 601 AACACGATTT TACGTCACTT AAAAAACTCA GGCCGCAGTC GGTAACCTCG CGCATACAGC 661 CGGGCAGTGA CGTCATCGTC TGCGCGGAAA TGGACGAACA GTGGGGCTAT GTCGGGGCTA 721 AATCGCGCCA GCGCTGGCTG TTTTACGCGT ATGACAGTCT CCGGAAGACG GTTGTTGCGC 781 ACGTATTCGG TGAACGCACT ATGGCGACGC TGGGGCGTCT TATGAGCCTG CTGTCACCCT 841 TTGACGTGGT GATATGGATG ACGGATGGCT GGCCGCTGTA TGAATCCCGC
CTGAAGGGAA 901 AGCTGCACGT AATCAGCAAG CGATATACGC AGCGAATTGA GCGGCATAAC CTGAATCTGA 961 GGCAGCACCT GGCACGGCTG GGACGGAAGT CGCTGTCGTT CTCAAAATCG GTGGAGCTGC 1021 ATGACAAAGT CATCGGGCAT TATCTGAACA TAAAACACTA TCAATAAGTT GGAGTCATTA 35 1081 CCCAACCAGG AAGGGCAGCC CACCTATCAA GGTGTACTGC CTTCCAGACG AACGAAGAGC 1141 GATTGAGGAA AAGGCGGCGG CGGCCGGCAT GAGCCTGTCG GCCTACCTGC TGGCCGTCGG 1201 CCAGGGCTAC AAAATCACGG GCGTCGTGGA CTATGAGCAC GTCCGCGAGC TGGCCCGCAT 1261 CAATGGCGAC CTGGGCCGCC TGGGCGGCCT GCTGAAACTC TGGCTCACCG ACGACCCGCG nAs WO 2005/121346 PCT/NZ2005/000117 1321 CACGGCGCGG TTCGGTGATG CCACGATCCT CGCCCTGCTG GCGAAGATCG AAGAGAAGCA 1381 GGACGAGCTT GGCAAGGTCA TGATGGGCGT GGTCCGCCCG AGGGCAGAGC CATGACTTTT 1441 TTAGCCGCTA AAACGGCCGG GGGGTGCGCG TGATTGCCAA GCACGTCCCC ATGCGCTCCA 1501 TCAAGAAGAG CGACTTCGCG GAGCTGGTAT TCGTGCAGGG CAAGATTCGG AATACCAAGT 5 1561 ACGAGAAGGA CGGCCAGACG GTCTACGGGA CCGACTTCAT TGCCGATAAG GTGGATTATC 1621 TGGACACCAA GGCACCAGGC GGGTCAAATC AGGAATAAGG GCACATTGCC CCGGCGTGAG 1681 TCGGGGCAAT CCCGCAAGGA GGGTGAATGA ATCGGACGTT TGACCGGAAG GCATACAGGC 1741 AAGAACTGAT CGACGCGGGG TTTTCCGCCG AGGATGCCGA AACCATCGCA AGCCGCACCG 1801 TCATGCGTGC GCCCCGCGAA ACCTTCCAGT CCGTCGGCTC GATGGTCCAG CAAGCTACGG 10 1861 CCAAGATCGA GCGCGACAGC GTGCAACTGG CTCCCCCTGC CCTGCCCGCG CCATCGGCCG 1921 CCGTGGAGCG TTCGCGTCGT CTCGAACAGG AGGCGGCAGG TTTGGCGAAG TCGATGACCA 1981 TCGACACGCG AGGAACTATG ACGACCAAGA AGCGAAAAAC CGCCGGCGAG GACCTGGCAA 2041 AACAGGTCAG CGAGGCCAAG CAGGCCGCGT TGCTGAAACA CACGAAGCAG CAGATCAAGG 2101 AAATGCAGCT TTCCTTGTTC GATATTGCGC CGTGGCCGGA CACGATGCGA GCGATGCCAA 15 2161 ACGACACGGC CCGCTCTGCC CTGTTCACCA CGCGCAACAA GAAAATCCCG CGCGAGGCGC 2221 TGCAAAACAA GGTCATTTTC CACGTCAACA AGGACGTGAA GATCACCTAC ACCGGCGTCG 2281 AGCTGCGGGC CGACGATGAC GAACTGGTGT GGCAGCAGGT GTTGGAGTAC GCGAAGCGCA 2341 CCCCTATCGG CGAGCCGATC ACCTTCACGT TCTACGAGCT TTGCCAGGAC CTGGGCTGGT 2401 CGATCAATGG CCGGTATTAC ACGAAGGCCG AGGAATGCCT GTCGCGCCTA CAGGCGACGG 20 2461 CGATGGGCTT CACGTCCGAC CGCGTTGGGC ACCTGGAATC GGTGTCGCTG CTGCACCGCT 2521 TCCGCGTCCT 
GGACCGTGGC AAGAAAACGT CCCGTTGCCA GGTCCTGATC GACGAGGAAA 2581 TCGTCGTGCT GTTTGCTGGC GACCACTACA CGAAATTCAT ATGGGAGAAG TACCGCAAGC 2641 TGTCGCCGAC GGCCCGACGG ATGTTCGACT ATTTCAGCTC GCACCGGGAG CCGTACCCGC 2701 TCAAGCTGGA AACCTTCCGC CTCATGTGCG GATCGGATTC CACCCGCGTG AAGAAGTGGC 25 2761 GCGAGCAGGT CGGCGAAGCC TGCGAAGAGT TGCGAGGCAG CGGCCTGGTG GAACACGCCT 2821 GGGTCAATGA TGACCTGGTG CATTGCAAAC GCTAGGGCCT TGTGGGGTCA GTTCCGGCTG 2881 GGGGTTCAGC AGCCAGCGCT TTACTGGCAT TTCAGGAACA AGCGGGCACT GCTCGACGCA 2941 CTTGCTTCGC TCAGTATCGC TCGGGACGCA CGGCGCGCTC TACGAACTGC CGATAAACAG 3001 AGGATTAAAA TTGACAATTG TGATTAAGGC TCAGATTCGA CGGCTTGGAG CGGCCGACGT 30 3061 GCAGGATTTC CGCGAGATCC GATTGTCGGC CCTGAAGAAA GCTCCAGAGA TGTTCGGGTC 3121 CGTTTACGAG CACGAGGAGA AAAAGCCCAT GGAGGCGTTC GCTGAACGGT TGCGAGATGC 3181 CGTGGCATTC GGCGCCTACA TCGACGGCGA GATCATTGGG CTGTCGGTCT TCAAACAGGA 3241 GGACGGCCCC AAGGACGCTC ACAAGGCGCA TCTGTCCGGC GTTTTCGTGG AGCCCGAACA 3301 GCGAGGCCGA GGGGTCGCCG GTATGCTGCT GCGGGCGTTG CCGGCGGGTT TATTGCTCGT 35 3361 GATGATCGTC CGACAGATTC CAACGGGAAT CTGGTGGATG CGCATCTTCA TCCTCGGCGC 3421 ACTTAATATT TCGCTATTCT GGAGCTTGTT GTTTATTTCG GTCTACCGCC TGCCGGGCGG 3481 GGTCGCGGCG ACGGTAGGCG CTGTGCAGCC GCTGATGGTC GTGTTCATCT CTGCCGCTCT 3541 GCTAGGTAGC CCGATACGAT TGATGGCGGT CCTGGGGGCT ATTTGCGGAA CTGCGGGCGT 3601 GGCGCTGTTG GTGTTGACAC CAAACGCAGC GCTAGATCCT GTCGGCGTCG CAGCGGGCCT 40 3661 GGCGGGGGCG GTTTCCATGG CGTTCGGAAC CGTGCTGACC CGCAAGTGGC AACCTCCCGT 3721 GCCTCTGCTC ACCTTTACCG CCTGGCAACT GGCGGCCGGA GGACTTCTGC TCGTTCCAGT WO 2005/121346 PCT/NZ2005/000117 3781 AGCTTTAGTG TTTGATCCGC CAATCCCGAT GCCTACAGGA ACCAATGTTC TCGGCCTGGC 3841 GTGGCTCGGC CTGATCGGAG CGGGTTTAAC CTACTTCCTT TGGTTCCGGG GGATCTCGCG 3901 ACTCGAACCT ACAGTTGTTT CCTTACTGGG CTTTCTCAGC CGGGATGGCG CTAAGAAGCT 3961 ATTGCCGCCG ATCTTCATAT GCGGTGTGAA ATACCGCACA GATGCGTAAG GAGAAAATAC 5 4021 CGCATCAGGC GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG 4081 CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT 4141 AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG 
CCAGGAACCG TAAAAAGGCC 4201 GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC 4261 TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA 10 4321 AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT 4381 CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG 4441 TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC 4501 GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG 4561 GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC 15 4621 TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG 4681 CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC 4741 GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT 4801 CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT 4861 TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA 20 4921 AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA 4981 TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 5041 TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 5101 GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCCA 5161 GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT 25 5221 AAACAAGTGG CAGCAACGGA TTCGCAAACC TGTCACGCCT TTTGTGCCAA AAGCCGCGCC 5281 AGGTTTGCGA TCCGCTGTGC CAGGCGTTAG GCGTCATATG AAGATTTCGG TGATCCCTGA 5341 GCAGGTGGCG GAAACATTGG ATGCTGAGAA CCATTTCATT GTTCGTGAAG TGTTCGATGT 5401 GCACCTATCC GACCAAGGCT TTGAACTATC TACCAGAAGT GTGAGCCCCT ACCGGAAGGA 5461 TTACATCTCG GATGATGACT CTGATGAAGA CTCTGCTTGC TATGGCGCAT TCATCGACCA 30 5521 AGAGCTTGTC GGGAAGATTG AACTCAACTC AACATGGAAC GATCTAGCCT CTATCGAACA 5581 CATTGTTGTG TCGCACACGC ACCGAGGCAA AGGAGTCGCG CACAGTCTCA TCGAATTTGC 5641 GAAAAAGTGG GCACTAAGCA GACAGCTCCT TGGCATACGA TTAGAGACAC AAACGAACAA 5701 TGTACCTGCC TGCAATTTGT ACGCAAAATG TGGCTTTACT CTCGGCGGCA TTGACCTGTT 5761 CACGTATAAA ACTAGACCTC AAGTCTCGAA CGAAACAGCG ATGTACTGGT ACTGGTTCTC 35 5821 GGGAGCACAG GATGACGCCT AACAATTCAT 
TCAAGCCGAC ACCGCTTCGC GGCGCGGCTT 5881 AATTCAGGAG TTAAACATCA TGAGGGAAGC GGTGATCGCC GAAGTATCGA CTCAACTATC 5941 AGAGGTAGTT GGCGTCATCG AGCGCCATCT CGAACCGACG TTGCTGGCCG TACATTTGTA 6001 CGGCTCCGCA GTGGATGGCG GCCTGAAGCC ACACAGTGAT ATTGATTTGC TGGTTACGGT 6061 GACCGTAAGG CTTGATGAAA CAACGCGGCG AGCTTTGATC AACGACCTTT TGGAAACTTC 40 6121 GGCTTCCCCT GGAGAGAGCG AGATTCTCCG CGCTGTAGAA GTCACCATTG TTGTGCACGA 6181 CGACATCATT CCGTGGCGTT ATCCAGCTAA GCGCGAACTG CAATTTGGAG AATGGCAGCG WO 2005/121346 PCT/NZ2005/000117 6241 CAATGACATT CTTGCAGGTA TCTTCGAGCC AGCCACGATC GACATTGATC TGGCTATCTT 6301 GCTGACAAAA GCAAGAGAAC ATAGCGTTGC CTTGGTAGGT CCAGCGGCGG AGGAACTCTT 6361 TGATCCGGTT CCTGAACAGG ATCTATTTGA GGCGCTAAAT GAAACCTTAA CGCTATGGAA 6421 CTCGCCGCCC GACTGGGCTG GCGATGAGCG AAATGTAGTG CTTACGTTGT CCCGCATTTG 5 6481 GTACAGCGCA GTAACCGGCA AAATCGCGCC GAAGGATGTC GCTGCCGACT GGGCAATGGA 6541 GCGCCTGCCG GCCCAGTATC AGCCCGTCAT ACTTGAAGCT AGGCAGGCTT ATCTTGGACA 6601 AGAAGATCGC TTGGCCTCGC GCGCAGATCA GTTGGAAGAA TTTGTTCACT ACGTGAAAGG 6661 CGAGATCACC AAGGTAGTCG GCAAATAATG TCTAACAATT CGTTCAAGCC GACGCCGCTT 6721 CGCGGCGCGG CTTAACTCAA GCGTTAGAGA GCTGGGGAAG ACTATGCGCG ATCTGTTGAA 10 6781 GGTGGTTCTA AGCCTCGTAC TTGCGATGGC ATCGGGGCAG GCACTTGCTG ACCTGCCAAT 6841 TGTTTTAGTG GATGAAGCTC GTCTTCCCTA TGACTACTCC CCATCCAACT ACGACATTTC 6901 TCCAAGCAAC TACGACAACT CCATAAGCAA TTACGACAAT AGTCCATCAA ATTACGACAA 6961 CTCTGAGAGC AACTACGATA ATAGTTCATC CAATTACGAC AATAGTCGCA ACGGAAATCG 7021 TAGGCTTATA TATAGCGCAA ATGGGTCTCG CACTTTCGCC GGCTACTACG TCATTGCCAA 15 7081 CAATGGGACA ACGAACTTCT TTTCCACATC TGGCAAAAGG ATGTTCTACA CCCCAAAAGG 7141 GGGGCGCGGC GTCTATGGCG GCAAAGATGG GAGCTTCTGC GGGGCATTGG TCGTCATAAA 7201 TGGCCAATTT TCGCTTGCCC TGACAGATAA CGGCCTGAAG ATCATGTATC TAAGCAACTA 7261 GCCTGCTCTC TAATAAAATG TTAGGAGCTT GGCTGCCATT TTTGGGGTGA GGCCGTTCGC 7321 GGCCGAGGGG CGCAGCCCCT GGGGGGATGG GAGGCCCGCG TTAGCGGGCC GGGAGGGTTC 20 7381 GAGAAGGGGG GGCACCCCCC TTCGGCGTGC GCGGTCACGC GCCAGGGCGC AGCCCTGGTT 7441 AAAAACAAGG TTTATAAATA TTGGTTTAAA AGCAGGTTAA AAGACAGGTT AGCGGTGGCC 
7501 GAAAAACGGG CGGAAACCCT TGCAAATGCT GGATTTTCTG CCTGTGGACA GCCCCTCAAA 7561 TGTCAATAGG TGCGCCCCTC ATCTGTCAGC ACTCTGCCCC TCAAGTGTCA AGGATCGCGC 7621 CCCTCATCTG TCAGTAGTCG CGCCCCTCAA GTGTCAATAC CGCAGGGCAC TTATCCCCAG 25 7681 GCTTGTCCAC ATCATCTGTG GGAAACTCGC GTAAAATCAG GCGTTTTCGC CGATTTGCGA 7741 GGCTGGCCAG CTCCACGTCG CCGGCCGAAA TCGAGCCTGC CCCTCATCTG TCAACGCCGC 7801 GCCGGGTGAG TCGGCCCCTC AAGTGTCAAC GTCCGCCCCT CATCTGTCAG TGAGGGCCAA 7861 GTTTTCCGCG AGGTATCCAC AACGCCGGCG GCCGGCCGCG GTGTCTCGCA CACGGCTTCG 7921 ACGGCGTTTC TGGCGCGTTT GCAGGGCCAT AGACGGCCGC CAGCCCAGCG GCGAGGGCAA 30 7981 CCAGCCCGGT GAGCGTCGGA AAGGGTCGAG GTTTACCCGC CAATATATCC TGTCTATGTT 8041 TCACATGAAC ACGTGAATCT TCTTCAACAC GCCCACCTAA CCGCTCCTTT GCAGATAATC 8101 GACGGCGTCG AGTTGATGTG TGATCAACAT TACCAGAATT CCTTTCATCA GCTGAGTATC 8161 GGAATTGTTC TCTGCTTATT CCTCCATCCA CTGCATAGTT CCCTAGCTTG TCTCTGTAAT 8221 CATATGCTAC TTCATGTTCA CGGAACCTTT TACTATCTGC CTTCTCATAA GACATTCTTG 35 8281 ATTGCTTAGC ATCCCTGTAG TTGTAATCAT AAGGCATATT CTCATGCATA ACCTCACTTG 8341 CGTTGTCTCT AAGACCATAA TCATCTCTTG TACGCAAAAT TGAATCATTC GAATGATAAA 8401 CCTCTTGTCT ACCATCTTGA TATCTCATAT TGGCATAAAC TTTAACATCA CCACCATTAC 8461 GTCGTTGCAA ACGCTCATCA TCCAAGTAGA CTTGATCTCG GTCATCAAAA AGATATCTCC 8521 TGCCTCGAAG AGCTTCCTCA TCTTGCTTGC CAGCTGATGA TCTACTGACA TCAGGATGCA 40 8581 TCACCCCATA CGAATCAATT TCATGATCTC TTAGGAGTTG CTGGCTTTCA TAGGGCAAAT 8641 AGGCTTCCCT TCCGTCATTC GAGGACATTC CTTTACGCTC TAGAGCTCTA GCACCTCCTC WO 2005/121346 PCT/NZ2005/000117 8701 GGTCCACAAT CTCTGCTTTG GTGACAGCAG GATACATCCT CTCATCAATG CCAGAGTCGT 8761 AGTACTTCAG TTGTTGTTTA TTGTAATGCT GATAAACATC CTTGCTTTCA TTATCCAAAT 8821 ACGCTTCATT TCTATCAATG AAGGCTACTC TCCTAAGCTC TAGCGCCTTG GCATCTCCAT 8881 GGTCTACTAT AATATCTGAC GAGTTGACAT CACGATATAT CCTGTCATCA ATGCCATAGT 5 8941 CATGATCTTT CTTAAGTTGT TGGCTTTCGT AATGCAGATA TGCATCCCCC CTTTTATAAT 9001 CCATGTATGA TTCCTCTCCA TCATCGAAGG ATCCTCTTCT ACGCTCAAGA GCTCTGGCTT 9061 CTTCCCCGTT TACAAGAATA TCTGATTTAT TGAGACTGGG ATGCATCATG CCAAAAGAGT 9121 TAGTTTCATG ATCTTTTAGG 
AGTTGCTGGC TTTCACTTTG AAAATATGCT TCCTTTCGAT 9181 CATTTAAGGA TACTCCTCTA TACCTTAGAC CTCTTGCATC TTCATGGTCT ACTAGAATAT 10 9241 CTGATCTGTT GACATCAGGA GGCATCATGA CATAAGAGTC AGTTTCATAA TCGTTTAGGA 9301 GTTGCTGGTT TTCACATTGC AAGTATGCGT CCTTTTTATC ATTCAAGGAC ACTCCTCCAT 9361 ACCTCCGACC TCTGGCATCT TCATGGTCTA CCAGAATATC TGATTTGTTG ACATCGGGAT 9421 GCCTCATGAC GTAAGAGTCA GTTTCATGAT CGTTTAGGAG TCGCTGCCTT TCACATTGCA 9481 AGTATGCTTC CTTTTTATCA TTCAAGGAAA CTCCTCTATA CCTCCGACCT CTGCCATCTT 15 9541 CATGGTCTAC CAGAGTATCT GATTTGTTGA CATGGGGATG CATCATGCCA TAGGAGTTAG 9601 TTTCATAATC ATTTAGGAGT CTCTGTCTTT CACATTGCAT GTATGCTTCC TTTTTATCAT 9661 TCAAGGACCC TCCTCTATAC CTTAGACCTT TGGAATCTTC CCGGTCTTCC AGAGTATCTG 9721 ATTTGTTGAC ATCGGGATGC ATTATGCCAT AGGAGTTAGT TTCATAATCA TTTAGGAGTT 9781 GCTGGCTTTC ACATTGCAAG TAAGCTTCCC TTCTATCATT TAAGGACCCT CCTCTATACC 20 9841 TTAGACCTCT GGAATTTTCC CGTTCCCAGT CTGCTAGAAT ATCTAATCTG TTGACATCAT 9901 CAGGATAAAT CTTAGCATCA GAGCGAGAGT CATAATCTTT CTCCAGTTGT TGGATTTTGT 9961 AATTCAGATA AACATCCTTC CTTTCATTAT CCAAGTATGC TGCCTTTCCT TTGTTTAAGC 10021 ATCGTCGTTG AAGCTGCACT CCTCTTCCAT CCTGATCTAC CACAATATCT GCCATATCTA 10081 CATCAGAATG TATCCTACCA TCATTGCCAG GGTGATACTC GTTTCTCAGT TCCTGTCTTT 25 10141 TATCATCTGA AAAAACAGCA TTTCTCTCAT TATCCAAATA TGCTTCCCTT TCTTTATTTA 10201 AGCAACGTGT TATGCTCTGT GAAGCTTTAT CATCTCGATC TACTACAGCA TATGGTTCAC 10261 TGAAATCAAG ATTCTTCTTA CTGTCAACAC CATCATATAG ATAATCCTTT CTCAGTACTT 10321 GACATTTGTT ATCCAAATAA ACAACCTTTC TTTCTTCGTT ACGGAAGTTC CTGTAGTGAT 10381 CATCTCGTTC ATCCACCACT CTTGATCCCA ACTCCACAAA AGGATAATCT TCCTTCACAG 30 10441 ACTCATAATG GTCAGCCATC CTCTCTTTCC TGCTAAACTC AAGATGGGTA TCGGCCGCAT 10501 CAACATCAGC TATATTTGAA CCACAGACAT GGGATTTTGA TAAAGATCCT CTCCTCTGCA 10561 TAAAAAGATC ATTCTCTCTA GCCACATTAT TGACCTCATG CCTAACACTG GGAAACTCTC 10621 TCATTGCTAT ATCAGAGCCT ATATGATAAT TATCCCGAGC TTCATCCACT ATCTCTTTGA 10681 CACACCTGCT CACAACTGGT GAATCATGGT CTCCACGACT TAAATCTCTA ACTTGTTGAT 35 10741 CCCTTGGTGA GTTTCTACCA ACATAATCAT CGACACGTCT AGTACGTAGA ACCTGTGGTA 10801 
CACAAAGATT CCCATCATAA TCATGTCTTC TTCTCTCATC GGAATCATTC ACACAACCGA 10861 AAGATCTA (SEQ ID NO:54)

A mutant form of the Arabidopsis thaliana acetohydroxyacid synthase gene conferring resistance to sulfonylurea herbicides such as chlorsulfuron was inserted into the T-DNA of pTC1. The 5.8 kb XbaI fragment from pGH1 (Haughn et al. 1988, Molecular and General Genetics 211:266-271) was ligated into the unique XbaI site between the left and right T-DNA borders of pTC1 to produce pTCAHAS. The orientation of the two fragments was determined by restriction patterns and confirmed by DNA sequencing.
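The sizes quoted for pTC1 can be cross-checked with simple arithmetic. The sketch below assumes that the sequence listing numbers each row by the position of its first nucleotide, so that the final row, labelled 10861 and holding 8 nucleotides, ends at the last base of the plasmid. The function name is hypothetical.

```python
def inclusive_length(first, last):
    """Length of an interval given inclusive 1-based coordinates, in either orientation."""
    return abs(last - first) + 1


# Coordinates in AL138652 quoted in the text.
arabidopsis_fragment = inclusive_length(59735, 62572)
border = inclusive_length(60629, 60606)  # quoted high-to-low, i.e. on the reverse strand

# pART27 SalI backbone plus the SalI to XhoI fragment of pPROEX-AtTD.
ptc1_from_parts = 8004 + 2864

# Assumption: the last listing row starts at 10861 and holds "AAGATCTA".
ptc1_from_listing = 10861 + len("AAGATCTA") - 1

# Portion of the 2864 bp fragment that is not the 2838 bp A. thaliana fragment.
remainder = 2864 - arabidopsis_fragment

print("A. thaliana fragment (bp):", arabidopsis_fragment)
print("border length (nt):", border)
print("pTC1 from parts (bp):", ptc1_from_parts)
print("pTC1 from listing (bp):", ptc1_from_listing)
print("remainder (bp):", remainder)
```

Under that assumption the two estimates of the pTC1 size agree, and the coordinate span reproduces the stated 2838 bp.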
Example 7 Transformation of Arabidopsis thaliana with an intragenic vector

The pTCAHAS binary vector was transformed into the disarmed Agrobacterium tumefaciens strain EHA105 (Hood et al. 1993, Transgenic Research, 2:208-218), using the freeze-thaw method (Höfgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin and used to transform Arabidopsis thaliana 'Columbia' using the floral dip method (Clough and Bent, Plant Journal 16: 735-743, 1998).

The resulting self-pollinated seed was screened in vitro on half-strength MS salts (Murashige and Skoog 1962, Physiologia Plantarum, 15: 473-497) supplemented with 10 µg/L chlorsulfuron. Seeds were also sown on a standard potting mix in a greenhouse, and the germinated seedlings at the 3-4 true leaf stage were sprayed with a standard application of Glean (active ingredient chlorsulfuron) at a rate equivalent to 20 g/ha.

The recovered chlorsulfuron-resistant seedlings were confirmed as being transformed with the intragenic vector pTCAHAS by polymerase chain reactions on their genomic DNA across the junctions of the two XbaI sites adjoining the original T-DNA of pTC1 and the 5.8 kb XbaI fragment inserted to form pTCAHAS. The following primers were used:

Primer E: 5'CATCCACTGCATAGTTCCC3' (SEQ ID NO:55)
Primer F: 5'GATGCGTTGATCTCTTCATCA3' (SEQ ID NO:56)
Primer G: 5'TCAACATCAATCCGAGTACG3' (SEQ ID NO:57)
Primer H: 5'AGAGATTGTGGACCGAGGAG3' (SEQ ID NO:58)

As illustrated in Figure 3, the expected 643 bp DNA fragment was PCR amplified from the binary vector pTCAHAS and from three A. thaliana lines transformed with pTCAHAS using primers E+F, designed to flank the XbaI site inside the right T-DNA border. Similarly, the expected 149 bp DNA fragment was PCR amplified from the same DNA sources using primers G+H, designed to flank the XbaI site inside the left T-DNA border.
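The logic of a junction PCR, in which a product of the expected size appears only when both primer sites lie on the same molecule in the right orientation, can be sketched as a simple in silico PCR. This is a minimal illustration that considers exact matches on a linear template only. The template below is hypothetical (a 100 nt spacer between the primer E site and the primer F site), because the sequence of the 5.8 kb insert is not given here, so it does not reproduce the 643 bp product.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Product length for a primer pair, or None if either primer has no exact site.

    The forward primer must match the template strand as given and the reverse
    primer must match the opposite strand downstream of it.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return end + len(reverse_site) - start


PRIMER_E = "CATCCACTGCATAGTTCCC"    # SEQ ID NO:55
PRIMER_F = "GATGCGTTGATCTCTTCATCA"  # SEQ ID NO:56

# Hypothetical templates: one with both primer sites, one lacking the primer F site.
SPACER = "T" * 100
with_insert = "ACGT" + PRIMER_E + SPACER + reverse_complement(PRIMER_F) + "ACGT"
without_insert = "ACGT" + PRIMER_E + SPACER + "ACGT"

print(amplicon_length(with_insert, PRIMER_E, PRIMER_F))
print(amplicon_length(without_insert, PRIMER_E, PRIMER_F))
```

The second template stands in for DNA that carries only one of the two primer sites, for which no product is predicted.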
Example 8 Construction of additional intragenic vectors

In assembling the intragenic vectors, all plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

A binary vector with a T-DNA composed of potato DNA

The 1268 bp sequence illustrated in Example 2 as a T-DNA-like region of a potato (Solanum tuberosum) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57POTINV).

The SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPOTINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.

The pPOTINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al. 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Höfgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.

Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g/L sucrose, 40 mg/L ascorbic acid, 500 mg/L casein hydrolysate and 7 g/L agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m-2 sec-1; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPOTINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on the potato medium defined above and incubated under reduced light intensity (5-10 µmol m-2 sec-1). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg/L Timentin to prevent Agrobacterium overgrowth.

Hairy roots were selected on MS medium without growth regulators. Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPOTINV. The following primers were used:

Primer I: 5'GCTCACCTTGCAGCTTCACT3' (SEQ ID NO:59)
Primer J: 5'CAGAGCTGGATTTGCATCAG3' (SEQ ID NO:60)

to amplify an expected 570 bp DNA fragment from the T-DNA-like region of pPOTINV, and

Primer K: 5'GATGGCAGAAGGCGAAGATA3' (SEQ ID NO:61)
Primer L: 5'GAGCTGGTCTTTGAAGTCTCG3' (SEQ ID NO:62)

as an internal control to amplify an expected 1069 bp fragment from the endogenous potato actin gene. The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPOTINV. The expected 570 bp DNA fragment was PCR amplified from the binary vector pPOTINV and from two of 80 hairy root lines tested using primers I and J (Figure 4).
The DNA samples from the two hairy root lines positive for the T-DNA from pPOTINV and from the control hairy root lines were also used for PCR using primers designed for the Agrobacterium virG gene:

Primer M: 5'GCGGTAGCCGACAG3' (SEQ ID NO:63)
Primer N: 5'GCGTCAAAGAAATA3' (SEQ ID NO:64)

The DNA samples from all hairy root lines failed to amplify PCR products using primers M and N (Figure 5). Furthermore, cultures of these hairy roots failed to grow bacteria when incubated in LB medium. These results establish the absence of Agrobacterium associated with the hairy roots. The 2.5% co-transformation frequency (2 of 80) of T-DNAs from pArA4b and pPOTINV was achieved despite selection for only hairy roots. This demonstrates that the pPOTINV binary vector is effective in transforming potatoes.

A binary vector with a T-DNA composed of petunia DNA

The 1507 bp sequence illustrated in Example 2 as a T-DNA-like region of a petunia (Petunia hybrida) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57PETINV).
The SalI fragment encompassing the T-DNA composed of petunia DNA from pUC57PETINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pPETINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.

The pPETINV binary vector was transformed into the Agrobacterium strain A4T, also known as C58C1 (pArA4b) (Petit et al. 1983, Molecular and General Genetics, 190:204-214), using the freeze-thaw method (Höfgen & Willmitzer 1988, Nucleic Acids Research, 16: 9877). The Agrobacterium was cultured overnight in LB broth supplemented with 300 mg/L spectinomycin for co-cultivation with leaves from in vitro cultured potato plants.

Virus-free potato plants of cultivar Iwa were multiplied in vitro on MS salts and vitamins (Murashige and Skoog, 1962, Physiologia Plantarum, 15: 473-497), plus 30 g/L sucrose, 40 mg/L ascorbic acid, 500 mg/L casein hydrolysate and 7 g/L agar, adjusted to pH 5.8 with 0.1 M KOH. Plants were routinely subcultured as 2-3 node segments every three to four weeks and incubated at 26 °C under cool white fluorescent lamps (80-100 µmol m-2 sec-1; 16 h photoperiod). Leaves were excised from the in vitro plants, cut in half, dipped for about 30 sec in the liquid culture of Agrobacterium strain A4T harbouring pPETINV, then blotted dry on sterile filter paper. These leaf segments were then cultured on the potato medium defined above and incubated under reduced light intensity (5-10 µmol m-2 sec-1). Two days later, the leaf segments were transferred to the same medium supplemented with 200 mg/L Timentin to prevent Agrobacterium overgrowth.

Hairy roots were selected on MS medium without growth regulators.
Genomic DNA isolated from these hairy roots was screened via PCR to identify those derived from co-transformation with pArA4b and pPETINV. The following primers were used:

Primer O: 5'GAGATAAACAAATAGTCCGGATCG3' (SEQ ID NO:65)
Primer P: 5'GGGAGCATTTGGTGGAAATAG3' (SEQ ID NO:66)

to amplify an expected 447 bp DNA fragment from the T-DNA-like region of pPETINV. The same DNA samples were also used in a PCR using primers K and L, designed to amplify an expected 1069 bp fragment from the endogenous potato actin gene as an internal control. The expected 1069 bp fragment was amplified using primers K and L from all hairy root lines, including a control hairy root line transformed with Agrobacterium strain A4T without the binary vector pPETINV. The expected 447 bp DNA fragment from the T-DNA-like region of pPETINV was PCR amplified from the binary vector pPETINV and from one of 85 hairy root lines tested using primers O and P (Figure 6). The DNA sample from the hairy root line positive for the T-DNA from pPETINV failed to amplify a PCR product using primers M and N designed for the Agrobacterium virG gene (Figure 7). Furthermore, a culture of this hairy root line failed to grow bacteria when incubated in LB medium. These results establish the absence of Agrobacterium associated with the hairy root line positive for the T-DNA from pPETINV. The co-transformation frequency of approximately 1.2% (1 of 85) of T-DNAs from pArA4b and pPETINV was achieved despite selection for only hairy roots. Overall, these results demonstrate that the pPETINV binary vector is effective in transforming plants.

A binary vector with a T-DNA composed of onion DNA

The 1075 bp sequence illustrated in Example 2 as a T-DNA-like region of an onion (Allium cepa) intragenic vector was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57 (pUC57ALLINV).
The SalI fragment encompassing the T-DNA composed of onion DNA from pUC57ALLINV was isolated by restriction, then ligated to the 8004 bp SalI backbone of the binary vector pART27 (Gleave 1992, Plant Molecular Biology, 20: 1203-1207) to form the binary vector pALLINV. The orientation of the two fragments was determined by PCR analysis across the junctions of the two SalI sites and DNA sequencing.
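The co-transformation frequencies reported in this example for pPOTINV and pPETINV follow directly from the counts of PCR-positive hairy root lines. A minimal sketch of the calculation, with a hypothetical function name:

```python
def cotransformation_frequency(positive_lines, lines_tested):
    """Percentage of tested hairy root lines that also carry the binary vector T-DNA."""
    if lines_tested <= 0:
        raise ValueError("lines_tested must be positive")
    if not 0 <= positive_lines <= lines_tested:
        raise ValueError("positive_lines must lie between 0 and lines_tested")
    return 100.0 * positive_lines / lines_tested


# Counts reported in the text.
potinv = cotransformation_frequency(2, 80)   # pPOTINV: 2 of 80 lines
petinv = cotransformation_frequency(1, 85)   # pPETINV: 1 of 85 lines

print(f"pPOTINV: {potinv:.2f}%")
print(f"pPETINV: {petinv:.2f}%")
```

With so few positive lines these percentages are coarse estimates, which is consistent with the text quoting the pPETINV figure only as a range.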
Example 9 Design, Construction and Verification of Plant Derived Recombination Sites: loxP-like sites for recombination with Cre recombinase

BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.

1) Potato DNA fragment containing a loxP-like sequence - POTLOXP

A fragment containing a loxP-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ111407 and BQ045786). This fragment, named POTLOXP, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the loxP-like sequence is shown in bold and light grey.

gatAT[...]GTCTAGAGCCT[...]CAGTAGAACCGAATTCGTATAGCATACATTATACGAAGGCATCTCTGTAGCAT[...]ATCGATAtc (SEQ ID NO:67)
([...] marks stretches of the 655 bp sequence that are illegible in this copy.)

Nucleotides 1-3: part of EcoRV restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-402: nucleotides 17-415 of NCBI accession BQ111407
Nucleotides 403-653: nucleotides 298-548 of NCBI accession BQ045786
Nucleotides 654-655: part of EcoRV restriction enzyme site (from the potato intragenic T-DNA)

The designed potato loxP-like sequence has 6 nucleotide mismatches from the native loxP sequence, as illustrated in bold below.
loxP sequence ATAACTTCGTATAGCATACATTATACGAAGTTAT (SEQ ID NO:68)
Potato loxP-like CCGAATTCGTATAGCATACATTATACGAAGGCAT (SEQ ID NO:69)

The 655 bp POTLOXP sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

Initially the 1286 bp SalI fragment encompassing the T-DNA composed of potato DNA from pUC57POTINV (described in Example 8) was subcloned into pGEMT to form pGEMTPOTINV. POTLOXP was then cloned into pGEMTPOTINV twice, firstly as an XbaI to ClaI fragment, then subsequently as an EcoRV to EcoRV fragment. The POTLOXP inserts were verified using restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTLOXP2.

The DNA sequence of the 2316 bp SalI fragment comprising the potato derived T-DNA region in pPOTLOXP2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTLOXP regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 2005-2028. Restriction sites illustrated in bold represent those used in cloning the POTLOXP regions into pGEMTPOTINV.
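The stated count of 6 mismatches between the potato loxP-like site and native loxP can be checked position by position. In the sketch below the potato site is taken from the legible loxP-like run in the POTLOXP listing (CCGAATTCGTATAGCATACATTATACGAAGGCAT). The division of loxP into two 13 bp arms flanking an 8 bp spacer is the commonly described layout of the site and is stated here as an assumption, not something taken from this specification. The function and variable names are hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"         # SEQ ID NO:68, as written here
POTATO_LOXP = "CCGAATTCGTATAGCATACATTATACGAAGGCAT"  # read from the POTLOXP listing


def mismatch_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


positions = mismatch_positions(LOXP, POTATO_LOXP)

# Assumption: 13 bp arm, 8 bp spacer, 13 bp arm.
SPACER = range(14, 22)
in_spacer = [p for p in positions if p in SPACER]

print("mismatch positions:", positions)
print("number of mismatches:", len(positions))
print("mismatches inside the spacer:", in_spacer)
```

For these two sequences every mismatch falls in the outer ends of the arms and the central spacer is unchanged.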
Unique restriction sites in pPOTLOXP2 for cloning between POTLOXP sites are:

AflII C/TTAAG
AgeI A/CCGGT
BamHI G/GATCC
BstD102I GAG/CGG
CspI CG/GWCCG
PinAI A/CCGGT

GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT
TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT
TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT
AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC
TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC
AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC
GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTTC
[...]CCGAATTCGTATAGCATACATTATACGAAGGCAT[...]
ATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACTATC
AAAGCCACCGACTTAAAGAAGATAACAACAGGACAGAATGATAAAGGTCTTAAGCTTTATGA
TCCAGGCTATCTCAACACAGCACCTGTTAGGTCATCAATATGCTATATAGATGGTGATGCCG
GGATCC[...]
[...]GCAAATCCAGCTCTGAGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGT
GGCACTGCTCAATATATGAGGTGGGCGCGAGAAGCAGGTACCAATGTGTCCTCATCAAGAGA
TGCATTCTTTACCAATCCAACGGTCAAAGCATACTACAAGTCTTTTGTCAAGGCTATTGTGA
CAAGAAAAAACTCTATAAGTGGAGTTAAATATTCAGAAGAGCCCGCCATATTTGCGTGGGAA
CTCATAAATGAGCCTCGTTGTGAATCCAGTTCATCAGCTGCTGCTCTCCAGGCGTGGATAGC
AGAGATGGCTGGATTTGTCGAC (SEQ ID NO:70)
([...] marks shaded POTLOXP stretches that are illegible in this copy.)

The ability of this construct to undergo recombination between the POTLOXP sites was tested in vivo using the Cre recombinase-expressing Escherichia coli strain 294-Cre (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTLOXP2 was transformed into E. coli strain 294-Cre and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of Cre recombinase in E.
coli strain 294-Cre, which effected recombination between the two POTLOXP sites in pPOTLOX2. This was evident by a reduction in the size of pPOTLOXP2 15 from 5316 bp to 4480 pb. Plasmid isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 "C, was restricted with SalI. All colonies tested produced the fragments of 3.0 kb and 1.5 kb expected when recombination between the POTLOXP sites has occurred (Figure 8). 20 Recombination between the POTLOXP sites was further verified by DNA sequencing. Plasmid was isolated from colonies of E. coli strain 294-Cre transformed with pPOTLOXP2 and cultured at 37 C, then DNA sequenced across the SalI region inserted into pGEMT. The resulting sequence from two independent cultures is illustrated below and confirms that recombination is base pair faithful through the remaining POTLOXP site in plasmid 25 preparations. Only the nucleotides in italics are not part of the potato genome sequences. The remaining POTLOXP region is shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1169-1192. Restriction sites illustrated in bold represent those remaining from cloning the POTLOXP regions into pPOTINV. 30 GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT
AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC
WO 2005/121346 PCT/NZ2005/000117 TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC AGAGGCAGGATATATTTTGGGGTAAACGGGAATTCTTCAGCAGTTGCTCGAGGGAGATTGGC GGTGCTTTCAGCTCACCTTGCAGCTTCACTCAACGTCTCCGATTTAACAACCTTCAAACTTO TAA' TTTk 4#A 5 ACCGAATTCG! 10 ATAGCATACATTATACGAAGGCATCT A T C- ATCGATATCATACAGTCAATGCCCCATGATGCTCATCCAATGGGGGTTCTTGTCAGT 15 GCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTGAGAGGACAGGATAT ATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATATGAGGTGGGCGCGA GAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAATCCAACGGTCAAAGC ATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTATAAGTGGAGTTAAAT ATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTCGTTGTGAATCCAGT 20 TCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTTGTCGAC (SEQ ID NO:71) 2) LoxP-like sequences from other species 25 Medicago trunculata (barrel medic) loxP-like sequence designed from 2 ESTs LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68) Barrel medic loxP-like ATGACTTCGTATAATGTATGCTATACGAAGTGTG (SEQ ID NO:72) 30 Nucleotides 1-19 Nucleotides 109-127 of NCBI accession CA919120 Nucleotides 20-34 Nucleotides 14-28 of NCBI accession CA989265 The barrel medic loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).
Picea (spruce) loxP-like sequence designed from 2 ESTs

LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Spruce loxP-like ATACCTTCGTATAATGTATGCTATACAAAGAAAT (SEQ ID NO:73)

Nucleotides 1-15: nucleotides 226-240 of NCBI accession CO215992
Nucleotides 16-34: nucleotides 148-166 of NCBI accession CO255617

The spruce loxP-like site has 4 nucleotide mismatches from the native loxP sequence (illustrated above in bold).

Zea mays (maize) loxP-like sequence designed from 2 ESTs

LoxP ATAACTTCGTATAATGTATGCTATACGAAGTTAT (SEQ ID NO:68)
Maize loxP-like GCCACTCCGTATAATGTATGCTATACGAAATGAT (SEQ ID NO:74)

Nucleotides 1-20: nucleotides 326-345 of NCBI accession CB278114
Nucleotides 21-34: nucleotides 11-27 of NCBI accession CD001443

The maize loxP-like site has 6 nucleotide mismatches from the native loxP sequence (illustrated above in bold).

Example 10
Design, Construction and Verification of Plant Derived Recombination Sites: frt-like sites for recombination with FLP recombinase

BLAST searches were conducted of publicly available plant DNA sequences from NCBI, SGN and TIGR databases.

1) Potato-DNA fragment containing a frt-like sequence - POTFRT

A fragment containing a frt-like sequence was designed from two EST sequences from potato (Solanum tuberosum) (NCBI accessions BQ513657 and BG098563). This fragment, named POTFRT, is illustrated below. Restriction enzyme sites used for DNA cloning into the potato intragenic T-DNA described in Example 8 are shown in bold and the frt-like sequence is shown in bold and light grey.
cttAAG GGAAT TCT CCTCGTTCCTATACTTTCTAGAGAATAGGAAGT (SEQ ID NO:75)

Nucleotides 1-3: part of BfrI restriction enzyme site (from the potato intragenic vector pPOTINV)
Nucleotides 4-45: nucleotides 454 to 495 of NCBI accession BQ513657
Nucleotides 46-185: nucleotides 40 to 179 of NCBI accession BG098563

The designed potato frt-like sequence has 5 nucleotide mismatches from the native frt sequence as illustrated in bold below.

frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Potato frt-like sequence TCTGTTCCTATACITTT TAGAIGAATAGQAGTd (SEQ ID NO:77)

The 185 bp POTFRT sequence illustrated above was synthesised by Genscript Corporation (Piscataway, NJ, www.genscript.com) and supplied cloned into pUC57. All plasmid constructions were performed using standard molecular biology techniques of plasmid isolation, restriction, ligation and transformation into Escherichia coli strain DH5α (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press, 1987).

POTFRT was cloned into the T-DNA composed of potato DNA residing in the plasmid pGEMTPOTINV (described in Example 9) twice, first as an EcoRI to AvrII fragment, then as a BfrI to BamHI fragment. The POTFRT inserts were confirmed by restriction enzyme analysis and DNA sequencing. The resulting plasmid was named pPOTFRT2.

The DNA sequence of the 1432 bp SalI fragment comprising the potato derived T-DNA region in the resulting pPOTFRT2 is illustrated below. Only the nucleotides in italics are not part of potato genome sequences. The POTFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 314-337 and the right border positioned at 1121-1144. Restriction sites illustrated in bold represent those used to clone the POTFRT regions into pGEMTPOTINV.
Unique restriction sites in pPOTFRT2 for cloning between POTFRT sites are:

AgeI A/CCGGT
BstD102I GAG/CGG
ClaI AT/CGAT
CspI CG/GWCCG
PinAI A/CCGGT

GTCGACAGTAAAAGTTGCACCTGGAATAAGGTTTTCATTCTTCACAGGAGGCATCTCACTCT
TTCTAGCAGGTCTTGAACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGAT
TCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTGT
AGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCTC
TTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGAC
AGAGGCAGGATATATTTTGGGGTAAACGGGAATTC TAdicc-fCTGTTCCTATACTTTCTA
GAGAATAGG7AAGTTCT-"IAAO'IT' 'TCCTG',ATIdC@( C ,'AT,, A AdCGCA GT GTaT C AGATT TCA GGTGAT TCCACT T TG TARA AT CT
AGAAACTTCCGGTGTATCCGCCGTTTCCGGCGTTGCACCTCCGCCGAATCTAAAAGGTGCGT
TGACGATCATCGATGAGCGGACCGGTAAGAAGTATCCGGTTCAGGTTTCTGAGGATGGCACT
ATCAAAGCCACCGACTTAAAGAAGATAACAACAGGACAGAATGATAAAGGTCTTAGAAATT
-GAATTCT GCTTCGTTCCTATACTTTCTAGGAATAGGAAGTT.T-7TG
CCATAATTTG ,bC-TGATCCTTAGATATCGAG
GCTACCCTATTGAAGAGCTGGCCGAGGGAAGTTCCTTCTTGGAAGTGGCATATCTTTTGTTG
TATGGTAATTTACCATCTGAGAACCAGTTAGCAGACTGGGAGTTCACAGTTTCACAGCATTC
AGCGGTTCCACAAGGACTCTTGGATATCATACAGTCAATGCCCCATGATGCTCATCCAATGG
GGGTTCTTGTCAGTGCAATGAGTGCTCTTTCCGTTTTTCATCCTGATGCAAATCCAGCTCTG
AGAGGACAGGATATATACAAGTGTAAACAATTTAAAAGCATATGGTGGCACTGCTCAATATA
TGAGGTGGGCGCGAGAAGCAGGTACCAATGTGTCCTCATCAAGAGATGCATTCTTTACCAAT
CCAACGGTCAAAGCATACTACAAGTCTTTTGTCAAGGCTATTGTGACAAGAAAAAACTCTAT
AAGTGGAGTTAAATATTCAGAAGAGCCCGCCATATTTGCGTGGGAACTCATAAATGAGCCTC
GTTGTGAATCCAGTTCATCAGCTGCTGCTCTCCAGGCGTGGATAGCAGAGATGGCTGGATTT
GTCGAC (SEQ ID NO:78)

The ability of this construct to undergo recombination between the POTFRT sites was tested in vivo using FLP recombinase expressing Escherichia coli strain 294-FLP (Buchholz et al., 1996, Nucleic Acids Research 24 (15) 3118-3119). The binary vector pPOTFRT2 was transformed into E. coli strain 294-FLP and maintained by selection with 100 mg/l ampicillin and incubation at 23 °C. Raising the temperature to 37 °C induces expression of FLP recombinase in E. coli strain 294-FLP, which effected recombination between the two POTFRT sites in pPOTFRT2. This was evident from a reduction in the size of pPOTFRT2 from 4432 bp to 4086 bp. Plasmid isolated from colonies of E. coli strain 294-FLP transformed with pPOTFRT2 and cultured at 37 °C was restricted with SalI. All colonies tested produced fragments of 3.0 kb, 1.4 kb and 1.1 kb. These three fragments represent the pGEMT backbone, the unrecombined POTFRT2 fragment, and the expected fragment from recombination between the POTFRT sites, respectively (Figure 9).

Recombination between the POTFRT sites was further verified by DNA sequencing.
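The sizes reported for the Cre and FLP tests are mutually consistent, which can be checked with simple arithmetic. The short Python sketch below is illustrative only: the function name `excision_summary` and its field names are assumptions introduced here, and it assumes that the excised segment lies entirely between the two T-DNA borders, so that the right border shifts by the full length removed. All input figures are taken from the text.

```python
def excision_summary(plasmid_before, plasmid_after, sali_before, right_border):
    """Derive fragment sizes after recombinase-mediated excision (all values in bp)."""
    excised = plasmid_before - plasmid_after   # DNA lost between the two sites
    backbone = plasmid_before - sali_before    # vector outside the SalI fragment
    sali_after = sali_before - excised         # SalI fragment after excision
    # Assumption: the right border lies downstream of both recombination sites.
    rb_after = (right_border[0] - excised, right_border[1] - excised)
    return {
        "excised_bp": excised,
        "backbone_bp": backbone,
        "sali_after_bp": sali_after,
        "right_border_after": rb_after,
    }


# pPOTLOXP2: 5316 bp -> 4480 bp, 2316 bp SalI fragment, right border at 2005-2028
loxp = excision_summary(5316, 4480, 2316, (2005, 2028))
# pPOTFRT2: 4432 bp -> 4086 bp, 1432 bp SalI fragment, right border at 1121-1144
frt = excision_summary(4432, 4086, 1432, (1121, 1144))

print(loxp)
print(frt)
```

For pPOTLOXP2 this gives an 836 bp excision, a 1480 bp SalI fragment (the 1.5 kb band) and a right border at 1169-1192, matching the positions quoted for the recombined sequence. For pPOTFRT2 it gives a 346 bp excision and a 1086 bp SalI fragment (the 1.1 kb band). Both plasmids imply the same 3000 bp backbone, in line with the 3.0 kb band.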
The 1.1 kb fragment from lane 3 of Figure 9 was gel purified and directly sequenced. The resulting sequence is illustrated below and confirms that recombination is base pair faithful through the remaining POTFRT site. The remaining POTFRT region is shaded. The left T-DNA border is illustrated in bold and positioned at 253-276. Restriction sites illustrated in bold represent those remaining from cloning the POTFRT regions into pGEMTPOTINV.

TTTCTAGCAAGTCTTGTACGCTTAGATTGAACAGATGTAGGACTCACATCTGATATGGAGGA
TTCTTGACTTGTTTCAGCAGCATCAGATGAAGCTTCTGAGACTTCACCTGATCCATCATCTG
TAGCAGTTGCTTCTACTTCTTCCACTGCTACATCAGTCTCAGTTGCTGATACTATAAGACCT
CTTAATTTAGGTCGTAAAATGCAACCAACTCTAAAATGGGGAAACAATTTAATAGATGTTGA
CAGAGGCAGGATATATTTTGGGGTAAACGGGAATNNTAGCC
T TCTGTTCCTATACTTTCT AGAGAATAGGAAG AAC C TTt A T TAGff.f
GA'tCTTAGATATCGAGGCTACCCTATTGAAGAGCTGGCCGAGGGAAGTTCC
TTCTTGGAAGTGGCATATCTTTTGTTGTATGGTAATTTACCATCTGAGAACCAGTTAGCAGA
CTGGGAGTTCACAGTTTCACAGCATTCAGCGGTTCCACAAGGACTCTTGGATATCATACAGT
CAATGCCCCATGATGCTCATCCAATGGGGGTACTTGTCAGTGCAATGAGTGCTCTTTCCGTT
TTT (SEQ ID NO:79)

2) Onion (Allium cepa) FRT-like fragment - ALLFRT

A fragment containing a frt-like sequence was designed from two EST sequences from onion (NCBI accessions CF434781 and CF445353). This fragment, named ALLFRT, is illustrated below. Restriction enzyme sites to allow cloning into the onion intragenic binary vector described in Example 8 are shown in bold and the frt-like sequence is illustrated in bold and light grey.

ATT AATCCCACCTGCGACAATTCCAGTGCTCTTCAGAA ____RTTAA GA GC','.TG J1 TGGZ{GATCCATCACTGAAGTAGTACCGATGOCTTTGAGTGATAAGCAAACCA AGCAGGCGTTCTECAGT0GTATG TCTAGCACT- CAOAGTGCATTGdTA CA CTTATCGGGCGGGTTTTTC'CTCCZAAAF C-ACCCAIGGA CC7AAATC(CATIGTTCCTCCCTTTCTT 30 ej§TtQC~CrOTCACAGCTACT7TTCCT~TAAfdAEGATCGCTCTTCdJ dA . - TCAAATTCTACCCGGGCTCTACAAAACCTAAGCGT"CGATGA( Q GAAAGAAC AAGAA T T TCTIT&TGGATAGTEGCTTC§ACA OTTG TTCCTATACTCTCTGGAGAATAGGAACTGTATATCATCCTTTCT7TGAAAGGArGG AAAG CAAGT A ATATTTGGATd x5 I- JiTTAAT CTC TAGA (SEQ ID NO:80)

Nucleotides 1-450: nucleotides 28-477 of NCBI accession CF434718
Nucleotides 451-875: nucleotides 105-529 of NCBI accession CF445383

The designed onion frt-like sequence has 7 nucleotide mismatches from the native frt sequence as illustrated in bold below.

Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Onion frt-like sequence CTTGTTCCTATACTCTCTGGAGAATAGGAACTGT (SEQ ID NO:81)

The 875 bp ALLFRT sequence can be cloned into pALLINV twice, once via flanking VspI sites into the NdeI site of pALLINV and subsequently via NheI and XbaI sites into the XbaI site of pALLINV.
The correct orientation of the ALLFRT insert can be confirmed by restriction enzyme analysis and DNA sequencing.

The DNA sequence of the 2896 bp SalI fragment comprising the onion derived T-DNA region in the resulting pALLFRT2 is illustrated below. Only the nucleotides in italics are not part of onion genome sequences. The ALLFRT regions are shaded. The T-DNA borders are shown in bold, with the left border positioned at 520-543 and the right border positioned at 2490-2513. Restriction sites illustrated in bold represent those used to clone the ALLFRT regions into the onion T-DNA like sequence.

GTCGACTTCCCTTTCCTCTACTCCACTTGTTTCTCGCTTTCTCTACTTCCTTTTTCTCTCTT
TTCTTTATATTTATTGCTCAGCTGGGATTAATTACTGTCATTTATTCCTCATATCTATTTTA
TTGAATTAAAACGGTTATTTAGCTCGAGGCCTTCTCTCTTATTCTTTGCTTCCAAGGAGAGA
GAATATGGCGAGTGGTAGCAATCATCAGCATGGTGGAGGAGGAAGAAGAAGAGGCGGAATGT
TAGTCGCTGCGACCTTGCTTATTCTTCCTGCCATTTTCCCCAATTTGTTTGTTCCTCTTCCC
TTTGCTTTTGGTAGTTCTGGCAGCGGTGCATCTCCTTCTCTCTTCTCCGAATGGAATGCTCC
TAAACCTAGGCATCTCTCTCTTCTGAAAGCAGCCATTGAGCGTGAGATTTCTAGAACAAA
AATCAGAGCTGTGGTCTCCCTTGCCTCCACAGGGATGGAAACCGTGCCTTGAGACTCAATAT
AGTAGCGGGCTACCCAGTAGATCGACAGGATATATTCAAGTGTAAAACAAGATGCTGAATCG
ATTAGCAATGGTTCGCTCTTCTAGCTCTTTCCTTAACAGAAT
10 14" tr C2,G T- TTG TCCTATACTCCTGGGdTdATctA ACTCNAAGA GCTC
CTCTAGACTTGCTTCTCGGATAATCAATCCTCAGTTTTTGATTCCTTCTCGAAGCTTCCTTG
ATCTCCATAAGATGGTAAACAAGGAGGCGATAAAAAAAGAAAGGGCTAGACTTGCTGATGAG
ATGAGCAGAGGATATTTTGCGGATATGGCAGAGATTCGTATACATGGTGGCAAGATTGCTAT
GGCAAATGAAATTCTTATTCCATCAGGGGAAGCAATCAAATTTCCTGATTTGACAGTAAAAT
TGTCTGATGATAGCAGTTTGCATTTACCAATTGTATCTACACAAAGTGCTACAAATAACAAT
GCTAAATCCACTCCTGCTGCCTCATTGTTGTGCCTTTCCTTCAGAGCAAGTTCACAGACAAT
GGTTGAATCATGGACTGTTCCTTTTTTGGACACTTTTAACTCTTCAGAAGTACAAGCA4TAi
25 CCACCTGCAAT4 CAACATCAOATGA CTA7A QGT T CT-\TGTATCTC T/GCCT TGCAGCA~GTGCA TGTGCATT$ GGGCGGTTTT TCCA ATCGACCCAAAACATGTCCTCGCTTTCT C TCCCA GACACTGTCTTCTAAAAAGA ATCGCT CTTCATCA P 30 TCTACCGC CT ~7i7ATCCA CCTCAAGACACTTTGAA>TAG C UV____ AACqATTGTCCT CAAGAACCAAGAT TCATCCTST CTCGGA GTAGTTCACACAC TG T CCT ,TACTCTCT GGAGAATAGGAACTGTA TCAC T CGAATAGGAGATGGAGATAA GGNG6TTOCI EANACAAAA1CAAGATAIATATIGITGT TGA CT T .
Z AATCGACTATGAGTACCT~TCCTGGGAAACTGGAGGAGATGCAATAGTTATACdC
- ,~TATGAGGTATCATTTTTGGATTCTTGGTT
TTTCTCATTCGGACCAATCAAGAGAATGTTTCTTACATGACGAAIGAAACCCACTGCTACTC
AGCGGAAGATTGGTTATTTCATTTGGTGATCACTATGATTTTAGGAGCAGCTTCATTGT
AAATCTTTTGACAGGATATATATTACTGTAAJ\GTGAGAGAGA\JATGTGATATATGCTGA
TGTTTCCATGGAGAGGGGTGCATTTCTTGTTCACAAGCTATGAGGGCTTTCCATGGAGA
ATATAGAAAGCGCAAA.ATCAAGGCTTAGTCTTTGCGAGGAGGATA.TTCGTGGGCAGTTAGAG
ATGACAGATAACAAACCAGAGTTATATTCACAGCTTGGTGCTGTCCTTGGAAJTGCTAGGAGA
CTGCTGTCGAGGAAT GGGTGATACTAATGGTGCGATTCCATATTATGAJAGAGAGTGTGGAAT
TCCTCTTAAAAATGCCTGCAAAAGATCCCGAGGTTGTACATACACTATCAGTTTCCTTGAAT
AAAATTGGAGACCTGAAATACTACGAAGGAGATCTGCAGTCGAC (SEQ ID NO:82)

Restriction enzyme sites available for cloning between ALLFRT sequences include:

ApaBI GCANNNNN/TGC
BsiI C/TCGTG
BspMI ACCTGCNNNN/
DraIII CACNNN/GTG
HindIII A/AGCTT
MfeI C/AATTG
NheI G/CTAGC
PflMI CCANNNN/NTGG
ScaI AGT/ACT
SphI GCATG/C
XbaI T/CTAGA

3) Frt-like sequences from other species

Brassica napus (rape) frt-like sequence designed from 2 ESTs

Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Rape frt-like sequence ACAGTTCCTATACTTTCTGGAGAATAGGAAGGTG (SEQ ID NO:83)

Nucleotides 1-14: nucleotides 397-410 of NCBI accession CD824140
Nucleotides 15-34: nucleotides 128-147 of NCBI accession CD825268

The rape frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).

Glycine max (soybean) frt-like sequence designed from 2 ESTs

Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Soybean frt-like sequence ACAGTTCCTATACTTTCTACAGAATAGGAACTTC (SEQ ID NO:84)

Nucleotides 1-19: nucleotides 84-102 of NCBI accession BE057270
Nucleotides 20-34: nucleotides 243-257 of NCBI accession BI970552

The soybean frt-like sequence has 3 nucleotide mismatches from the native frt sequence (illustrated above in bold).
Triticum aestivum (wheat) frt-like sequence designed from 2 ESTs

Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Wheat frt-like sequence AGAGTTCCTATACTTTCTAGAGAATAGGAACCCC (SEQ ID NO:85)

Nucleotides 1-18: nucleotides 446-463 of NCBI accession CD877128
Nucleotides 19-34: nucleotides 1805-1820 of NCBI accession BT009538

The wheat frt-like sequence has 4 nucleotide mismatches from the native frt sequence (illustrated above in bold).

Pinus taeda (loblolly pine) frt-like sequence designed from 2 ESTs

Frt sequence GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC (SEQ ID NO:76)
Loblolly pine frt-like sequence AAAGTTCCTATACTTTCTGGAGAATAGGAAAACA (SEQ ID NO:86)

Nucleotides 1-16: nucleotides 14-29 of NCBI accession AA556441
Nucleotides 17-34: nucleotides 764-781 of NCBI accession AF101785

The loblolly pine frt-like sequence has 6 nucleotide mismatches from the native frt sequence (illustrated above in bold).

The above examples illustrate practice of the invention. It will be well understood by those with ordinary skill in the art that such DNA sequences can be assembled together to construct a complete vector composed entirely of plant DNA of the same or related species. It will also be appreciated by those skilled in the art that numerous variations and modifications may be made without departing from the spirit and scope of the invention.
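The mismatch counts quoted for the loxP-like and frt-like sites from other species can be cross-checked by comparing each plant-derived site with the native 34 bp site position by position. The Python sketch below is illustrative only: the function name `count_mismatches` and the dictionary layout are assumptions made for this example, while the sequences are copied from the listings above (SEQ ID NOs 68, 72 to 74, 76 and 83 to 86).

```python
# Native recombinase recognition sites as given in the text.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # SEQ ID NO:68
FRT = "GAAGTTCCTATACTTTCTAGAGAATAGGAACTTC"   # SEQ ID NO:76

# Plant-derived sites paired with the native site they are compared against.
PLANT_SITES = {
    "barrel medic loxP-like": (LOXP, "ATGACTTCGTATAATGTATGCTATACGAAGTGTG"),
    "spruce loxP-like": (LOXP, "ATACCTTCGTATAATGTATGCTATACAAAGAAAT"),
    "maize loxP-like": (LOXP, "GCCACTCCGTATAATGTATGCTATACGAAATGAT"),
    "rape frt-like": (FRT, "ACAGTTCCTATACTTTCTGGAGAATAGGAAGGTG"),
    "soybean frt-like": (FRT, "ACAGTTCCTATACTTTCTACAGAATAGGAACTTC"),
    "wheat frt-like": (FRT, "AGAGTTCCTATACTTTCTAGAGAATAGGAACCCC"),
    "loblolly pine frt-like": (FRT, "AAAGTTCCTATACTTTCTGGAGAATAGGAAAACA"),
}


def count_mismatches(native, candidate):
    """Count positions at which two equal-length sequences differ."""
    if len(native) != len(candidate):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(native, candidate) if a != b)


results = {
    name: count_mismatches(native, candidate)
    for name, (native, candidate) in PLANT_SITES.items()
}

for name, mismatches in results.items():
    print(f"{name}: {mismatches} mismatches")
```

Run as written, this reproduces the counts stated in the text: 4, 4 and 6 for the barrel medic, spruce and maize loxP-like sites, and 6, 3, 4 and 6 for the rape, soybean, wheat and loblolly pine frt-like sites.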

Claims (25)

  2. The plant transformation vector of claim 1 in which the T-DNA-like sequence includes two T-DNA border-like polynucleotide sequences, one at each end of the T-DNA-like sequence, both T-DNA border-like polynucleotide sequences being derived from plant species.
  3. The plant transformation vector of any preceding claim in which the T-DNA-like sequence further comprises additional base polynucleotide sequence, the additional base polynucleotide sequence being derived from plant species.
  4. The plant transformation vector of any preceding claim in which T-DNA-like sequence includes first and second recombinase recognition sequences, wherein the recombinase recognition sequences are derived from plant species.
  5. The plant transformation vector of claim 4 in which the first recombinase recognition site and the second recombinase recognition sequence are loxP-like sequences derived from a plant species.
  6. The plant transformation vector of claim 4 in which the first recombinase recognition sequence and the second recombinase recognition sequences are frt-like sequences derived from plant species.
  7. The plant transformation vector of any one of claims 4 to 6 which comprises a selectable marker sequence flanked by the first and second recombinase recognition sequences.
  8. The plant transformation vector of claim 7 in which the selectable marker sequence is derived from plants.
  9. The plant transformation vector of any preceding claim in which the polynucleotide encompassing the T-DNA border-like sequence(s), any base polynucleotide sequence, any recombinase recognition site sequences and any plant polynucleotide sequence additional to the T-DNA-like sequence are constructed from fewer than 10 polynucleotide sequence fragments derived from plant species.
  10. The plant transformation vector of any preceding claim which further comprises an origin of replication polynucleotide sequence derived from plant species.
  11. The plant transformation vector of any preceding claim in which the T-DNA-like sequence comprises a selectable marker polynucleotide sequence capable of functioning in selection of a plant cell or plant harbouring the T-DNA-like sequence, wherein the selectable marker sequence is derived from plant species.
  12. The plant transformation vector of any preceding claim which comprises a selectable marker polynucleotide sequence capable of functioning in selection of a bacterium harbouring the vector, wherein the selectable marker sequence is derived from plant species.
  13. The plant transformation vector of claim 1 in which the selectable marker polynucleotide sequence capable of functioning in selection of a plant harbouring the T-DNA-like sequence is also capable of functioning in selection of a bacterium harbouring the vector.
  14. The plant transformation vector of any preceding claim in which the T-DNA-like sequence further comprises a genetic construct as herein defined, wherein all polynucleotide sequences of the genetic construct are derived from plants.
  15. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species.
  16. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from plant species which are interfertile.
  17. The plant transformation vector of any preceding claim in which the polynucleotide sequence of the entire vector is derived from the same plant species.
  18. A method of producing a transformed plant cell or plant, the method comprising the step of transformation of the plant cell or plant using the vector of any one of claims 1 to 17.
  19. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant, the method comprising transformation of the plant with the vector of any one of claims 1 to 17.
  20. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant interfertile with the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 17.
  21. A method of producing a transformed plant cell or plant in which any polynucleotide stably integrated into the plant cell or plant is derived from a plant of the same species as the plant or plant cell to be transformed, the method comprising transformation of the plant with the vector of any one of claims 1 to 17.
  22. A method of modifying a trait in a plant cell or plant comprising: (a) transforming of a plant cell or plant with a vector of any one of claims 1 to 17, the vector comprising a genetic construct capable of altering expression of a gene which influences the trait and (b) obtaining a stably transformed plant cell or plant modified for the trait.
  23. The method of any one of claims 18 to 22 in which transformation is vir gene-mediated.
  24. The method of any one of claims 18 to 22 in which transformation is Agrobacterium mediated.
  25. The method of any one of claims 18 to 22 in which transformation involves direct DNA uptake.
  26. A plant cell or plant produced by a method of any one of claims 18 to 22.
  27. A plant tissue, organ, propagule or progeny of the plant cell or plant of claim 26.
  28. A plant transformation vector comprising: a) T-DNA-like sequence including at least one T-DNA border-like sequence, the T-DNA border-like sequence comprising three or more non-contiguous polynucleotide sequence fragments, wherein all of the sequences of the T-DNA-like sequence are derived from plant species.
  29. The plant transformation vector of claim 28 in which the T-DNA border-like sequence comprises three non-contiguous polynucleotide sequence fragments.
  30. A plant transformation vector as claimed in any one of claims 1 to 17 substantially as herein described with reference to any example thereof.
  31. A method as claimed in any one of claims 18 to 25, substantially as herein described with reference to any example thereof.
  32. A plant cell as claimed in claim 26, substantially as herein described with reference to any example thereof.
  33. A plant tissue, organ, propagule or progeny as claimed in claim 27, substantially as herein described with reference to any example thereof.
AU2005252598A 2004-06-08 2005-06-08 Transformation vectors Ceased AU2005252598B8 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2010257316A AU2010257316B2 (en) 2004-06-08 2010-12-21 Transformation Vectors

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
NZ533371A NZ533371A (en) 2004-06-08 2004-06-08 Vectors for plant transformation contaning mostly or exclusively genetic material from plants useful with Agrobacterium transformation system
NZ533372 2004-06-08
NZ53337204 2004-06-08
NZ533371 2004-06-08
NZ53833005 2005-02-18
NZ538331 2005-02-18
NZ538330 2005-02-18
NZ53833105 2005-02-18
PCT/NZ2005/000117 WO2005121346A1 (en) 2004-06-08 2005-06-08 Transformation vectors

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2010257316A Division AU2010257316B2 (en) 2004-06-08 2010-12-21 Transformation Vectors

Publications (3)

Publication Number Publication Date
AU2005252598A1 AU2005252598A1 (en) 2005-12-22
AU2005252598B2 true AU2005252598B2 (en) 2010-10-28
AU2005252598B8 AU2005252598B8 (en) 2011-05-12

Family

ID=35503070

Family Applications (2)

Application Number Title Priority Date Filing Date
AU2005252598A Ceased AU2005252598B8 (en) 2004-06-08 2005-06-08 Transformation vectors
AU2010257316A Ceased AU2010257316B2 (en) 2004-06-08 2010-12-21 Transformation Vectors

Country Status (3)

Country Link
EP (1) EP1766029A4 (en)
AU (2) AU2005252598B8 (en)
WO (1) WO2005121346A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8137961B2 (en) 2004-09-08 2012-03-20 J.R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
US7601536B2 (en) 2004-09-08 2009-10-13 J. R. Simplot Company Plant-specific genetic elements and transfer cassettes for plant transformation
US20070250948A1 (en) * 2006-03-07 2007-10-25 J.R. Simplot Company Plant-specific genetic-elements and transfer cassettes for plant transformation
JP5684473B2 (en) 2006-07-19 2015-03-11 モンサント テクノロジー エルエルシー Use of multiple transformation enhancer sequences to improve plant transformation efficiency
AU2010211450B2 (en) * 2009-01-15 2015-05-14 The New Zealand Institute For Plant And Food Research Limited Plant transformation using DNA minicircles
AU2018253628B2 (en) * 2016-04-27 2019-07-25 Nexgen Plants Pty Ltd Construct and vector for intragenic plant transformation
BR112018072154A2 (en) * 2016-04-27 2019-03-19 Nexgen Plants Pty Ltd recombinant genetic constructs, methods for producing a recombinant genetic construct and for genetically enhancing a plant, vector, host cell and plants

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040107455A1 (en) * 2002-02-20 2004-06-03 J.R. Simplot Company Precise breeding

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1261725A2 (en) * 2000-02-08 2002-12-04 Sakata Seed Corporation Methods and constructs for agrobacterium-mediated plant transformation
AU7752001A (en) * 2000-06-28 2002-01-08 Sungene Gmbh And Co Kgaa Binary vectors for improved transformation of plant systems
WO2002081711A1 (en) * 2001-04-06 2002-10-17 Cropdesign N.V. The use of double and opposite recombination sites for the single step cloning of two dna segments
WO2003069980A2 (en) * 2002-02-20 2003-08-28 J.R. Simplot Company Precise breeding

Also Published As

Publication number Publication date
EP1766029A1 (en) 2007-03-28
WO2005121346A1 (en) 2005-12-22
AU2010257316B2 (en) 2013-03-21
AU2005252598B8 (en) 2011-05-12
AU2005252598A1 (en) 2005-12-22
AU2010257316A1 (en) 2011-01-27
EP1766029A4 (en) 2008-07-23

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)
TH Corrigenda

Free format text: IN VOL 24, NO 43, PAGE(S) 4973 UNDER THE HEADING APPLICATIONS ACCEPTED - NAME INDEX UNDER THE NAME THE NEW ZEALAND INSTITUTE OF PLANT AND FOOD RESEARCH LIMITED, APPLICATION NO. 2005252598, UNDER INID (71) CORRECT THE APPLICANT NAME TO THE NEW ZEALAND INSTITUTE FOR PLANT AND FOOD RESEARCH LIMITED

Free format text: IN VOL 23, NO 21, PAGE(S) 8385 UNDER THE HEADING CHANGE OF NAMES(S) OF APPLICANT(S), SECTION 104 - 2005 UNDER THE NAME THE NEW ZEALAND INSTITUTE OF PLANT AND FOOD RESEARCH LIMITED, APPLICATION NO. 2005252598, UNDER INID (71) CORRECT THE APPLICANT NAME TO THE NEW ZEALAND INSTITUTE FOR PLANT AND FOOD RESEARCH LIMITED

MK14 Patent ceased section 143(a) (annual fees not paid) or expired