CA2362897A1 - Plant centromeres - Google Patents

Plant centromeres

Info

Publication number
CA2362897A1
Authority
CA
Canada
Prior art keywords
seq
recombinant dna
dna construct
gene
centromere
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002362897A
Other languages
French (fr)
Inventor
Daphne Preuss
Gregory Copenhaver
Kevin Keith
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chicago
Original Assignee
The University Of Chicago
Daphne Preuss
Gregory Copenhaver
Kevin Keith
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The University Of Chicago, Daphne Preuss, Gregory Copenhaver, Kevin Keith
Priority to CA2904855A (CA2904855A1)
Publication of CA2362897A1
Legal status: Abandoned

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213 Targeted insertion of genes into the plant genome by homologous recombination
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H1/00 Processes for modifying genotypes; Plants characterised by associated natural traits
    • A01H1/06 Processes for producing mutations, e.g. treatment with chemicals or with radiation
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Wood Science & Technology (AREA)
  • Organic Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Zoology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Cell Biology (AREA)
  • Physics & Mathematics (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Botany (AREA)
  • Developmental Biology & Embryology (AREA)
  • Environmental Sciences (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides for the identification and cloning of functional plant centromeres in Arabidopsis. This will permit construction of stably inherited minichromosomes which can serve as vectors for the construction of transgenic plant and animal cells. In addition, information on the structure and function of these regions will prove valuable in isolating additional centromeric and centromere-related genetic elements and polypeptides from other species.

Description

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME _ OF _
NOTE: For additional volumes, please contact the Canadian Patent Office.

PLANT CHROMOSOME COMPOSITIONS AND METHODS

BACKGROUND OF THE INVENTION
This application claims the priority of U.S. Provisional Application Ser. No. 60/125,219, filed March 18, 1999, U.S. Provisional Application Ser. No. 60/127,409, filed April 1, 1999, U.S. Provisional Application Ser. No. 60/134,770, filed May 18, 1999, U.S. Provisional Application Ser. No. 60/153,584, filed September 13, 1999, U.S. Provisional Application Ser. No. 60/154,603, filed September 17, 1999 and U.S. Provisional Application Ser. No. --/---,---, filed December 16, 1999, each of which disclosures is specifically incorporated herein by reference in its entirety.
The government owns rights in the invention pursuant to U.S. Department of Agriculture Grant No. 96-35304-3491, National Science Foundation Grant No.

and Grant No. DOEDE-FG05-920822072 from the Consortium for Plant Biotechnology.
I. Field of the Invention
The present invention relates generally to the field of molecular biology. More particularly, it concerns plant chromosome compositions and methods for using the same.
II. Description of Related Art
Two general approaches are used for introduction of new genetic information ("transformation") into cells. One approach is to introduce the new genetic information as part of another DNA molecule, referred to as a "vector," which can be maintained as an independent unit (an episome) apart from the chromosomal DNA molecule(s). Episomal vectors contain all the necessary DNA sequence elements required for DNA replication and maintenance of the vector within the cell. Many episomal vectors are available for use in bacterial cells (for example, see Maniatis et al., 1982). However, only a few episomal vectors that function in higher eukaryotic cells have been developed. The available higher eukaryotic episomal vectors are based on naturally occurring viruses and most function only in mammalian cells (Willard, 1997). In higher plant systems, the only known double-stranded DNA virus that replicates through a double-stranded intermediate, and upon which an episomal vector could be based, is the gemini virus, although the gemini virus is limited to an approximately 800 bp insert. Although an episomal plant vector based on the Cauliflower Mosaic Virus has been developed, its capacity to carry new genetic information also is limited (Brisson et al., 1984).
The other general method of genetic transformation involves integration of introduced DNA sequences into the recipient cell's chromosomes, permitting the new information to be replicated and partitioned to the cell's progeny as a part of the natural chromosomes. The most common form of integrative transformation is called "transfection" and is frequently used in mammalian cell culture systems. Transfection involves introduction of relatively large quantities of deproteinized DNA into cells. The introduced DNA usually is broken and joined together in various combinations before it is integrated at random sites into the cell's chromosome (see, for example, Wigler et al., 1977). Common problems with this procedure are the rearrangement of introduced DNA sequences and unpredictable levels of expression due to the location of the transgene in the genome, or so-called "position effect variation" (Shingo et al., 1986). Further, unlike episomal DNA, integrated DNA cannot normally be precisely removed. A more refined form of integrative transformation can be achieved by exploiting naturally occurring viruses that integrate into the host's chromosomes as part of their life cycle, such as retroviruses (see Cepko et al., 1984). In mouse, homologous integration has recently become common, although it is significantly more difficult to use in plants (Lam et al., 1996).
The most common genetic transformation method used in higher plants is based on the transfer of bacterial DNA into plant chromosomes that occurs during infection by the phytopathogenic soil bacterium Agrobacterium (see Nester et al., 1984). By substituting genes of interest for the naturally transferred bacterial sequences (called T-DNA), investigators have been able to introduce new DNA into plant cells. However, even this more "refined" integrative transformation system is limited in three major ways. First, DNA sequences introduced into plant cells using the Agrobacterium T-DNA system are frequently rearranged (see Jones et al., 1987). Second, the expression of the introduced DNA sequences varies between individual transformants (see Jones et al., 1980). This variability is presumably caused by rearranged sequences and the influence of surrounding sequences in the plant chromosome (i.e., position effects), as well as methylation of the transgene. A third drawback of the Agrobacterium T-DNA system is the reliance on a "gene addition" mechanism: the new genetic information is added to the genome (i.e., all the genetic information a cell possesses) but does not replace information already present in the genome.
One attractive alternative to commonly used methods of transformation is the use of an artificial chromosome. Artificial chromosomes are man-made linear or circular DNA molecules constructed from cis-acting DNA sequence elements that are responsible for the proper replication and partitioning of natural chromosomes (see Murray et al., 1983). Desired elements include: (1) Autonomous Replication Sequences (ARS) (these have properties of replication origins, which are the sites for initiation of DNA replication), (2) Centromeres (site of kinetochore assembly and responsible for proper distribution of replicated chromosomes at mitosis or meiosis), and (3) Telomeres (specialized DNA structures at the ends of linear chromosomes that function to stabilize the ends and facilitate the complete replication of the extreme termini of the DNA molecule).
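The three element classes listed above can be read as a simple design checklist. The sketch below is illustrative only: the `Element` type, the `missing_elements` helper, and the rule that a linear molecule needs two telomeres (one per end) while a circular molecule needs none are assumptions made for this example and are not taken from the disclosure.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One cis-acting element on a hypothetical artificial chromosome design."""
    kind: str  # "ARS", "centromere", or "telomere"
    name: str  # free-text label, illustrative only


def missing_elements(elements, linear=True):
    """Return {kind: number still needed} for the element classes listed above.

    Assumption: a linear molecule needs one telomere per end (two in total),
    and a circular molecule needs none.
    """
    have = Counter(e.kind for e in elements)
    required = {"ARS": 1, "centromere": 1, "telomere": 2 if linear else 0}
    return {kind: need - have[kind]
            for kind, need in required.items()
            if have[kind] < need}


design = [Element("ARS", "example ARS"), Element("centromere", "example CEN")]
print(missing_elements(design, linear=True))   # telomeres still missing
print(missing_elements(design, linear=False))  # nothing missing for a circle
```

The check covers only the presence of each element class; whether a given element functions in a particular host is an experimental question that this sketch does not address.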
At present, the essential chromosomal elements for construction of artificial chromosomes have been precisely characterized only from lower eukaryotic species. ARSs have been isolated from unicellular fungi, including Saccharomyces cerevisiae (brewer's yeast) and Schizosaccharomyces pombe (see Stinchcomb et al., 1979 and Hsiao et al., 1979). An ARS behaves like a replication origin, allowing DNA molecules that contain the ARS to be replicated as an episome after introduction into the cell nuclei of these fungi. Plasmids containing these sequences replicate, but in the absence of a centromere they are partitioned randomly into daughter cells.

WO 00/55325 PCT/US00/07392
Artificial chromosomes have been constructed in yeast using the three cloned essential chromosomal elements. Murray et al. (1983) disclose a cloning system based on the in vitro construction of linear DNA molecules that can be transformed into yeast, where they are maintained as artificial chromosomes. These yeast artificial chromosomes (YACs) contain cloned genes, origins of replication, centromeres and telomeres and are segregated in daughter cells with high fidelity when the YAC is at least 100 kb in length. Smaller CEN-containing vectors may be stably segregated, however, when in circular form.
None of the essential components identified in unicellular organisms, however, function in higher eukaryotic systems. For example, a yeast CEN sequence will not confer stable inheritance upon vectors transformed into higher eukaryotes. While such DNA fragments can be readily introduced, they do not stably exist as episomes in the host cell. This has seriously hampered efforts to produce artificial chromosomes in higher organisms.
In one case, a plant artificial chromosome was discussed (Richards et al., U.S. Patent No. 5,270,201). However, this vector was based on plant telomeres, as a functional plant centromere was not disclosed. While telomeres are important in maintaining the stability of chromosomal termini, they do not encode the information needed to ensure stable inheritance of an artificial chromosome. It is well documented that centromere function is crucial for stable chromosomal inheritance in almost all eukaryotic organisms (reviewed in Nicklas, 1988). For example, broken chromosomes that lack a centromere (acentric chromosomes) are rapidly lost from cell lines, while fragments that have a centromere are faithfully segregated. The centromere accomplishes this by attaching, via centromere binding proteins, to the spindle fibers during mitosis and meiosis, thus ensuring proper gene segregation during cell divisions.
In contrast to the detailed studies done in S. cerevisiae and S. pombe, little is known about the molecular structure of functional centromeric DNA of higher eukaryotes. Ultrastructural studies indicate that higher eukaryotic kinetochores, which are specialized complexes of proteins that form on the chromosome during late prophase, are large structures (mammalian kinetochore plates are approximately 0.3 µm in diameter) which possess multiple microtubule attachment sites (reviewed in Rieder, 1982). It is therefore possible that the centromeric DNA regions of these organisms will be correspondingly large, although the minimal amount of DNA necessary for centromere function may be much smaller.
While the above studies have been useful in elucidating the structure and function of centromeres, they have failed to provide a cloned centromere from a higher eukaryotic organism. The extensive literature indicating both the necessity of centromeres for stable inheritance of chromosomes, and the non-functionality of yeast centromeres in higher organisms, demonstrates that cloning of a functional centromere from a higher eukaryote is a necessary first step in the production of artificial chromosomes suitable for use in higher plants and animals. The production of artificial chromosomes with centromeres which function in higher eukaryotes would overcome many of the problems associated with the prior art and represent a significant breakthrough in biotechnology research.
SUMMARY OF THE INVENTION
In one aspect of the invention, a method is provided for the identification of plant centromeres. In one embodiment of the invention, the method may comprise tetrad analysis. Briefly, tetrad analysis measures the recombination frequency between genetic markers and a centromere by analyzing all four products of individual meioses. A particular advantage arises from the quartet (qrt1) mutation in Arabidopsis, which causes the four products of pollen mother cell meiosis in Arabidopsis to remain attached. The quartet mutation may also find use in accordance with the invention in species other than Arabidopsis. For example, several naturally occurring plant species are also known to release pollen clusters, including water lilies, cattails, heath (Ericaceae and Epacridaceae), evening primrose (Onagraceae), sundew (Droseraceae), orchids (Orchidaceae), and acacias (Mimosaceae) (Preuss 1994; Smyth 1994). None of these species, however, has been developed into an experimental system, thus severely limiting their use for genetic analysis. However, it is contemplated by the inventors that a quartet mutation could be introduced into a host plant to enable the use of tetrad analysis in potentially any species.
When used to pollinate a flower, one tetrad can result in the formation of four seeds, and the plants from these seeds can be analyzed genetically. With unordered tetrads, however, such as those produced by Arabidopsis, genetic mapping using tetrad analysis requires that two markers be scored simultaneously.
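Scoring two markers in an unordered tetrad, as described above, amounts to sorting each tetrad into one of three classes: parental ditype (PD), nonparental ditype (NPD), or tetratype (TT). The following is a minimal sketch under stated assumptions. The function names and allele labels are hypothetical, and the distance rule (half the tetratype frequency, expressed in centimorgans) is a standard textbook approximation that holds only when one of the two markers is tightly centromere-linked; the text above does not give a formula.

```python
from collections import Counter


def classify_tetrad(products, parents):
    """Classify one tetrad scored at two markers as 'PD', 'NPD', or 'TT'.

    products: the four meiotic products, each an (allele_1, allele_2) tuple.
    parents:  the two parental genotypes, as tuples of the same form.
    """
    if len(products) != 4:
        raise ValueError("a tetrad has exactly four products")
    counts = Counter(products)
    parental = set(parents)
    if sorted(counts.values()) == [2, 2]:
        if set(counts) == parental:
            return "PD"   # only the two parental genotypes are present
        if not set(counts) & parental:
            return "NPD"  # only the two recombinant genotypes are present
    if sorted(counts.values()) == [1, 1, 1, 1]:
        return "TT"       # all four genotypes are present once
    raise ValueError("tetrad does not segregate 2:2 at both markers")


def tetratype_frequency(tetrads, parents):
    """Fraction of tetrads that are tetratype."""
    classes = [classify_tetrad(t, parents) for t in tetrads]
    return classes.count("TT") / len(classes)


def centromere_distance_cM(tetrads, parents):
    """Approximate marker-to-centromere distance in cM.

    Assumption: the other marker is tightly centromere-linked, so each
    tetratype reflects one crossover between the test marker and its
    centromere, and only half of the products are recombinant.
    """
    return 50.0 * tetratype_frequency(tetrads, parents)


parents = [("A", "B"), ("a", "b")]
pd = [("A", "B"), ("A", "B"), ("a", "b"), ("a", "b")]
tt = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")]
print(centromere_distance_cM([pd, pd, pd, tt], parents))
```

Tetrads that do not segregate 2:2 at both markers are rejected rather than guessed at, since they would otherwise distort the tetratype frequency.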
In another aspect, the invention provides a recombinant DNA construct comprising a plant centromere. The recombinant DNA construct may additionally comprise any other desired sequences, for example, a telomere, including a plant telomere such as an Arabidopsis thaliana telomere, or alternatively, a yeast or any other type of telomere. One may also desire to include an autonomous replicating sequence (ARS), such as a plant ARS, including an Arabidopsis thaliana ARS. Still further, one may wish to include a structural gene on the construct, or multiple genes (for example, two, three, four, five, six, seven, eight, nine, ten, fifteen, twenty, twenty-five, fifty, one hundred, two hundred, five hundred, one thousand), up to and including the maximum number of structural genes (roughly 5000) which can physically be placed on the recombinant DNA construct. Examples of structural genes one may wish to use include a selectable or screenable marker gene, an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, a growth factor gene and a seed storage gene. In one embodiment of the invention, the construct is capable of expressing the structural gene, for example, in a prokaryote or eukaryote, including a lower eukaryote, or a higher eukaryote such as a plant.
In yet another aspect, the invention provides a recombinant DNA construct comprising a plant centromere and which is a plasmid. The plasmid may contain any desired sequences, such as an origin of replication, including an origin of replication that functions in bacteria, such as E. coli and Agrobacterium, or in plants or yeast, for example, such as S. cerevisiae. The plasmid may also comprise a selection marker, which may function in bacteria, including E. coli and Agrobacterium, as well as a selection marker that functions in plants or yeast, such as S. cerevisiae.
In still yet another aspect, the invention provides a recombinant DNA construct comprising a plant centromere and which is capable of being maintained as a chromosome, wherein the chromosome is transmitted in dividing cells. The plant centromere may be from any plant.
In still yet another aspect, the invention provides a plant centromere which is further defined as an Arabidopsis thaliana centromere. In yet another embodiment of the invention, the plant centromere is an Arabidopsis thaliana chromosome 1 centromere, and may still further be defined as flanked by the genetic markers T22C23-T7 and T3P8-SP6, or still further as flanked by the genetic markers T22C23-T7 and T5D18, T22C23-T7 and T3L4, T5D18 and T3P8-SP6, T5D18 and T3L4, and T3L4 and T3P8-SP6. In yet another embodiment of the invention, the plant centromere comprises an Arabidopsis thaliana chromosome 2 centromere. The chromosome 2 centromere may comprise, for example, from about 100 to about 611,000, about 500 to about 611,000, about 1,000 to about 611,000, about 10,000 to about 611,000, about 20,000 to about 611,000, about 40,000 to about 611,000, about 80,000 to about 611,000, about 150,000 to about 611,000, or about 300,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209, including comprising the nucleic acid sequence of SEQ ID NO:209. The centromere may also be defined as comprising from about 100 to about 50,959, about 500 to about 50,959, about 1,000 to about 50,959, about 5,000 to about 50,959, about 10,000 to about 50,959, about 20,000 to about 50,959, about 30,000 to about 50,959, or about 40,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210, and may comprise the nucleic acid sequence of SEQ ID NO:210. The centromere may comprise sequences from both SEQ ID NOS:209 and 210, including the aforementioned fragments, or the entirety of SEQ ID NOS:209 and 210. In particular embodiments, the inventors contemplate that a 3' fragment of SEQ ID NO:209 can be fused to a 5' fragment of SEQ ID NO:210, optionally including one or more 180 bp repeat sequences disposed therebetween.
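The fusion described above (a 3' fragment of one sequence joined to a 5' fragment of another, with optional repeat units between them) can be sketched as a string operation. This is an illustration under assumptions, not the inventors' procedure: `fuse_fragments` and its parameters are hypothetical names, both inputs are taken to be written 5' to 3' on the same strand, and the short toy strings stand in for SEQ ID NO:209 and SEQ ID NO:210, which are not reproduced in this section.

```python
def fuse_fragments(upstream, downstream, n_three_prime, n_five_prime,
                   repeat_unit="", n_repeats=0):
    """Join a 3' fragment of `upstream` to a 5' fragment of `downstream`.

    The 3' fragment is the last `n_three_prime` bases of `upstream`; the
    5' fragment is the first `n_five_prime` bases of `downstream`.
    `n_repeats` copies of `repeat_unit` are placed between the two.
    """
    if not 0 < n_three_prime <= len(upstream):
        raise ValueError("3' fragment length is out of range")
    if not 0 < n_five_prime <= len(downstream):
        raise ValueError("5' fragment length is out of range")
    if n_repeats < 0:
        raise ValueError("n_repeats must not be negative")
    return (upstream[-n_three_prime:]
            + repeat_unit * n_repeats
            + downstream[:n_five_prime])


# Toy placeholders only; real inputs would be the full sequences and a
# repeat unit of about 180 bp.
left = "AAAACCCC"
right = "GGGGTTTT"
print(fuse_fragments(left, right, 4, 4, repeat_unit="AT", n_repeats=2))
```

Setting `n_repeats=0` gives the direct fusion without any intervening repeat.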
In still yet another aspect, the invention provides an Arabidopsis thaliana chromosome 3 centromere. In one embodiment of the invention, the centromere may be further defined as flanked by the genetic markers T9G9-SP6 and T5M14-SP6, and still further defined as flanked by a pair of genetic markers selected from the group consisting of T9G9-SP6 and T14H20, T9G9-SP6 and T7K14, T9G9-SP6 and T21P20, T14H20 and T7K14, T14H20 and T21P20, T14H20 and T5M14-SP6, T7K14 and T5M14-SP6, T7K14 and T21P20, and T21P20 and T5M14-SP6.
In still yet another aspect, the invention provides an Arabidopsis thaliana chromosome 4 centromere. In certain embodiments of the invention, the centromere may comprise from about 100 to about 1,082,000, about 500 to about 1,082,000, about 1,000 to about 1,082,000, about 5,000 to about 1,082,000, about 10,000 to about 1,082,000, about 50,000 to about 1,082,000, about 100,000 to about 1,082,000, about 200,000 to about 1,082,000, about 400,000 to about 1,082,000, or about 800,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211, including comprising the nucleic acid sequence of SEQ ID NO:211. The centromere may also be defined as comprising from about 100 to about 163,317, about 500 to about 163,317, about 1,000 to about 163,317, about 5,000 to about 163,317, about 10,000 to about 163,317, about 30,000 to about 163,317, about 50,000 to about 163,317, about 80,000 to about 163,317, or about 120,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212, and may be defined as comprising the nucleic acid sequence of SEQ ID NO:212. The centromere may comprise sequences from both SEQ ID NOS:211 and 212, including the aforementioned fragments, or the entirety of SEQ ID NOS:211 and 212. In particular embodiments, the inventors contemplate that a 3' fragment of SEQ ID NO:211 can be fused to a 5' fragment of SEQ ID NO:212, optionally including one or more 180 bp repeat sequences disposed therebetween.
In yet another embodiment, there is provided an Arabidopsis thaliana chromosome 1, 3 or 5 centromere selected from the nucleic acid sequence given by SEQ ID NO:184, SEQ ID NO:185, SEQ ID NO:186, SEQ ID NO:187, SEQ ID NO:188, SEQ ID NO:189, SEQ ID NO:190, SEQ ID NO:191, SEQ ID NO:192, SEQ ID NO:193, SEQ ID NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID NO:199, SEQ ID NO:200, SEQ ID NO:201, SEQ ID NO:202, SEQ ID NO:203, SEQ ID NO:204, SEQ ID NO:205, SEQ ID NO:206, SEQ ID NO:207, SEQ ID NO:208, or fragments thereof. In one embodiment, the construct comprises at least 100 base pairs, up to and including the full length, of one of the preceding sequences. In addition, the construct may include 1 or more 180 base pair repeats.
In still yet another aspect, the invention provides an Arabidopsis thaliana chromosome 5 centromere. The centromere may be further defined as flanked by the genetic markers F13K20-T7 and CUE1, and still further defined as flanked by a pair of genetic markers selected from the group consisting of F13K20-T7 and T18M4, F13K20-T7 and T18F2, F13K20-T7 and T24I20, T18M4 and T18F2, T18M4 and T24I20, T18M4 and CUE1, T18F2 and T24I20, T18F2 and CUE1, and T24I20 and CUE1.
In still yet another aspect, the invention provides a recombinant DNA construct comprising a plant centromere, and further defined as comprising n copies of a repeated nucleotide sequence, wherein n is at least 2. Potentially any number of repeat copies capable of physically being placed on the recombinant construct could be included on the construct, including about 5, 10, 15, 20, 30, 50, 75, 100, 150, 200, 300, 400, 500, 750, 1,000, 1,500, 2,000, 3,000, 5,000, 7,500, 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000 and about 100,000, including all ranges in-between such copy numbers. In one embodiment the repeated nucleotide sequence may be isolatable from the nucleic acid sequence given by SEQ ID NO:184, SEQ ID NO:185, SEQ ID NO:186, SEQ ID NO:187, SEQ ID NO:188, SEQ ID NO:189, SEQ ID NO:190, SEQ ID NO:191, SEQ ID NO:192, SEQ ID NO:193, SEQ ID NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID NO:199, SEQ ID NO:200, SEQ ID NO:201, SEQ ID NO:202, SEQ ID NO:203, SEQ ID NO:204, SEQ ID NO:205, SEQ ID NO:206, SEQ ID NO:207, SEQ ID NO:208, SEQ ID NO:209, SEQ ID NO:210, SEQ ID NO:211 or SEQ ID NO:212. Examples of such sequences that could be used are given in FIGS. 23A-23D. The length of the repeat used may vary, but will preferably range from about 20 bp to about 250 bp, from about 50 bp to about 225 bp, from about 75 bp to about 210 bp, from about 100 bp to about 205 bp, from about 125 bp to about 200 bp, from about 150 bp to about 195 bp, from about 160 bp to about 190 bp and from about 170 bp to about 185 bp, including about 180 bp.
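The nested length ranges and the copy-number condition (n of at least 2) stated above lend themselves to a short check. The sketch below is illustrative: `PREFERRED_RANGES_BP`, `narrowest_range`, and `build_repeat_array` are hypothetical names, the bounds are treated as exact and inclusive even though the text qualifies each with "about", and the repeat unit is a placeholder rather than a sequence from FIGS. 23A-23D.

```python
# Preferred repeat-length ranges in bp, listed from widest to narrowest,
# as enumerated in the paragraph above.
PREFERRED_RANGES_BP = [
    (20, 250), (50, 225), (75, 210), (100, 205),
    (125, 200), (150, 195), (160, 190), (170, 185),
]


def narrowest_range(length_bp):
    """Return the narrowest listed range containing length_bp, or None.

    Assumption: bounds are inclusive and exact, although the source
    qualifies every bound with "about".
    """
    for low, high in reversed(PREFERRED_RANGES_BP):
        if low <= length_bp <= high:
            return (low, high)
    return None


def build_repeat_array(repeat_unit, n):
    """Concatenate n head-to-tail copies of repeat_unit (n must be >= 2)."""
    if n < 2:
        raise ValueError("the construct is defined with n of at least 2")
    return repeat_unit * n


unit = "A" * 180  # placeholder of the stated typical length, not a real repeat
print(narrowest_range(len(unit)))
print(len(build_repeat_array(unit, 3)))
```

A length outside every listed range returns `None` rather than raising, since the text presents the ranges as preferences and not as limits.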
In conjunction with SEQ ID NOS:209, 210, 211 and 212, the repeats may be included as part of centromeric structures. The number of repeats may vary and include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 300, 400, 500 or more.
In still yet another aspect, the invention provides a minichromosome vector comprising a plant centromere and a telomere sequence. Any additional desired sequences may be added to the minichromosome, such as an autonomous replicating sequence, a second telomere sequence and a structural gene. One or more of the foregoing sequences may be added, up to the maximum number of such sequences that can physically be placed on the minichromosome. The minichromosome may comprise any of the centromere compositions disclosed herein. In one embodiment of the invention, the minichromosome may comprise a nucleic acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21. The minichromosome also may contain "negative" selectable markers which confer susceptibility to an antibiotic, herbicide or other agent, thereby allowing for selection against plants, plant cells or cells of any other organism of interest containing a minichromosome. The minichromosome also may include genes which control the copy number of the minichromosome within a cell. One or more structural genes also may be included in the minichromosome. Specifically contemplated as being useful will be as many structural genes as may be inserted into the minichromosome while still maintaining a functional vector. This may include one, two, three, four, five, six, seven, eight, nine or more structural genes.
In still yet another aspect, the invention provides a cell transformed with a recombinant DNA construct comprising a plant centromere. The cell may be of any type, including a prokaryotic cell or eukaryotic cell. Where the cell is a eukaryotic cell, the cell may be, for example, a yeast cell or a higher eukaryotic cell, such as a plant cell. The plant cell may be from a dicotyledonous plant, such as tobacco, tomato, potato, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis, or may be a monocotyledonous plant cell, such as wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane. In one embodiment of the invention, the plant centromere is an Arabidopsis thaliana centromere, and the cell may be an Arabidopsis thaliana cell. The recombinant DNA construct may comprise additional sequences, such as a telomere, an autonomous replicating sequence (ARS), a structural gene, or a selectable or screenable marker gene, including as many of such sequences as may physically be placed on said recombinant DNA construct. In one embodiment of the invention, the cell is further defined as capable of expressing said structural gene. In another embodiment of the invention, a plant is provided comprising the aforementioned cells.
In still yet another aspect, the invention provides a method of preparing a transgenic plant cell comprising contacting a starting plant cell with a recombinant DNA construct comprising a plant centromere, whereby said starting plant cell is transformed with said recombinant DNA construct. The recombinant DNA construct may comprise any desired sequences, including as many structural genes as can physically be placed on said recombinant DNA construct. In particular embodiments, the centromere is an Arabidopsis thaliana centromere, and the plant cell may be an Arabidopsis thaliana cell.
In still yet another aspect, the invention provides a transgenic plant comprising a minichromosome vector, wherein the vector comprises a plant centromere and a telomere sequence. The minichromosome vector may further comprise an autonomous replicating sequence, a second telomere sequence, or a structural gene, such as an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. As many of such sequences may be included as can physically be placed on the minichromosome. The minichromosome vector may further comprise a nucleic acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21. The transgenic plant may be any type of plant, such as a dicotyledonous plant, for example, tobacco, tomato, potato, pea, carrot, cauliflower, broccoli, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis, or may be a monocotyledonous plant, such as wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane.
In still yet another aspect, the invention provides a method of producing a minichromosome vector comprising: (a) obtaining a first vector and a second vector, wherein said first vector or said second vector comprises a selectable or screenable marker, an origin of replication, a telomere, and a plant centromere, and wherein said first vector and said second vector each comprise a site for site-specific recombination; and (b) contacting said first vector with said second vector to allow site-specific recombination to occur between said site for site-specific recombination on said first vector and said site for site-specific recombination on said second vector to create a minichromosome vector comprising said selectable or screenable marker, said origin of replication, said telomere and said plant centromere. The contacting may be done in vitro or in vivo, including wherein the contacting is carried out in a prokaryotic cell such as an Agrobacterium or E. coli cell, or in a lower eukaryotic cell, such as a yeast cell. The contacting may still further be carried out in a higher eukaryotic cell, such as a plant cell, including an Arabidopsis thaliana cell. The contacting may be done in the presence of potentially any recombinase, including Cre, Flp, Gin, Pin, Sre, pinD, Int-B13, and R. The first vector or second vector may comprise border sequences for Agrobacterium-mediated transformation. In one embodiment of the invention, the plant centromere is an Arabidopsis thaliana centromere. The telomere may be a plant telomere. Any plant selectable or screenable marker could be used, including GFP, GUS, BAR, PAT, HPT or NPTII.
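The recombination of step (b) can be pictured with a small sketch. The following Python fragment is illustrative only and is not part of the disclosed method: it treats each circular vector as an ordered list of element names, assumes that each vector carries exactly one recombination site (labeled "lox" here), and joins the two circles at that site to give a single cointegrate circle. The function names and element labels are hypothetical.

```python
def rotate_to_site(vector, site):
    """Rewrite a circular vector so that its single recombination site comes first."""
    if vector.count(site) != 1:
        raise ValueError("this sketch assumes exactly one recombination site per vector")
    i = vector.index(site)
    return vector[i:] + vector[:i]


def cointegrate(first, second, site="lox"):
    """Join two circular vectors at their recombination sites.

    Recombination between two circles that each carry one site gives one
    larger circle in which the two parental sequences are separated by
    two copies of the site.
    """
    return rotate_to_site(first, site) + rotate_to_site(second, site)


# Hypothetical element layout; which vector carries which element is arbitrary.
first_vector = ["lox", "selectable_marker", "origin_of_replication"]
second_vector = ["telomere", "lox", "plant_centromere"]

minichromosome = cointegrate(first_vector, second_vector)
print(minichromosome)
```

In this toy model the product carries every element named in step (b), and the two copies of the site remain in the cointegrate, which is why such a product could in principle be resolved again by the same recombinase.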
In still yet another aspect, a method is provided of screening a candidate centromere sequence for plant centromere activity, said method comprising the steps of: (a) obtaining an isolated nucleic acid sequence comprising a candidate centromere sequence; (b) integratively transforming plant cells with said isolated nucleic acid; and (c) screening for centromere activity of said candidate centromere sequence. In the method, the screening may comprise observing a phenotypic effect present in the integratively transformed plant cells or plants comprising the plant cells, wherein the phenotypic effect is absent in a control plant cell not integratively transformed with said isolated nucleic acid sequence, or a plant comprising said control plant cell. Types of phenotypic effects that could be screened for include reduced viability, reduced efficiency of said transforming, genetic instability in the integratively transformed nucleic acid, aberrant plant sectors, increased ploidy, aneuploidy, and increased integrative transformation in distal or centromeric chromosome regions. The isolated nucleic acid sequence may comprise a bacterial artificial chromosome, which may be further defined as a binary bacterial artificial chromosome. The integratively transforming may comprise use of any type of transformation, such as Agrobacterium-mediated transformation. In one embodiment of the invention, the control plant cell has been integratively transformed with a nucleic acid sequence other than a candidate centromere sequence.
In still yet another aspect, the invention provides a recombinant DNA construct comprising an Arabidopsis polyubiquitin 11 promoter, wherein the promoter comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180. In further embodiments of the invention, the promoter may comprise from about 75 to about 2,000, from about 125 to about 2,000, from about 200 to about 2,000, from about 400 to about 2,000, from about 800 to about 2,000, from about 1,000 to about 2,000, or from about 1,500 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180, or may comprise the nucleic acid sequence of SEQ ID NO:180. The promoter-containing construct may comprise any additional desired sequences, for example, that of an enhancer, a telomere sequence, a plant centromere sequence, an ARS, or a structural gene, including an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. In one embodiment of the invention, the promoter may be operably linked to the 5' end of the structural gene.
In still yet another aspect, the invention provides a recombinant DNA construct comprising an Arabidopsis 40S ribosomal protein S16 promoter, wherein said promoter comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182. In particular embodiments of the invention, the promoter may comprise from about 75 to about 2,000, from about 125 to about 2,000, from about 200 to about 2,000, from about 400 to about 2,000, from about 800 to about 2,000, from about 1,000 to about 2,000 or from about 1,500 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182, or may comprise the nucleic acid sequence of SEQ ID NO:182. The promoter-containing construct may comprise any additional desired sequences, for example, that of an enhancer, a telomere sequence, a plant centromere sequence, an ARS, or a structural gene, including an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. In one embodiment of the invention, the promoter may be operably linked to the 5' end of the structural gene.
In still yet another aspect, the invention provides a recombinant DNA construct comprising an Arabidopsis polyubiquitin 11 3' regulatory sequence including the terminator sequence, wherein the 3' regulatory sequence comprises from about 25 to about 2001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181. In one embodiment of the invention, the 3' regulatory sequence may be further defined as comprising from about 75 to about 2001, from about 125 to about 2001, from about 200 to about 2001, from about 400 to about 2001, from about 800 to about 2001, or from about 1,000 to about 2001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181, and may comprise the nucleic acid sequence of SEQ ID NO:181. The recombinant sequence may further comprise any other sequence, for example, an enhancer, a telomere sequence, a plant centromere sequence, an ARS, and a structural gene, including an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. In one embodiment of the invention, the terminator may be operably linked to the 3' end of the structural gene.
In still yet another aspect, the invention provides a recombinant DNA construct comprising an Arabidopsis 40S ribosomal protein S16 3' regulatory sequence including the terminator sequence, wherein the 3' regulatory sequence comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183. In particular embodiments of the invention, the 3' regulatory sequence may comprise from about 75 to about 2,000, from about 125 to about 2,000, from about 200 to about 2,000, from about 400 to about 2,000, from about 800 to about 2,000, or from about 1,000 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183, and may comprise the nucleic acid sequence of SEQ ID NO:183. The recombinant sequence may further comprise any other sequence, for example, an enhancer, a telomere sequence, a plant centromere sequence, an ARS, and a structural gene, including an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. In one embodiment of the invention, the terminator may be operably linked to the 3' end of the structural gene.
In still yet another aspect, the invention provides methods for expressing foreign genes in plants, plant cells or cells of any other organism of interest. The foreign genes may be from any organism, including plants, animals and bacteria. It is further contemplated that minichromosomes could be used to simultaneously transfer multiple foreign genes to a plant comprising entire biochemical or regulatory pathways.
In yet another embodiment of the invention, it is contemplated that the minichromosomes can be used as DNA cloning vectors. Such a vector could be used in plant and animal sequencing projects. The current invention may be of particular use in the cloning of sequences which are "unclonable" in yeast and bacteria, but which may be easier to clone in a plant-based system.
In still yet another aspect of the invention, it is contemplated that the minichromosomes disclosed herein may be used to clone functional segments of DNA such as origins of DNA replication, telomeres, telomere-associated genes, nuclear matrix attachment regions (MARs), scaffold attachment regions (SARs), boundary elements, enhancers, silencers, promoters, recombinational hot-spots and centromeres. This embodiment may be carried out by cloning DNA into a defective minichromosome which is deficient for one or more types of functional elements. Sequences which complement such deficient elements would cause the minichromosome to be stably inherited. A selectable or screenable marker on the minichromosome could then be used to select for viable minichromosome-containing cells which contain cloned functional elements of the type that were non-functional in the defective minichromosome.
In still yet another aspect of the invention, the sequences disclosed herein may be used for the isolation of centromeric sequences from plants other than Arabidopsis. Such techniques may employ, for example, hybridization or sequence-based analysis. In one embodiment of the invention, the centromere may be isolated from agriculturally important species such as, for example, vegetable crops, including artichokes, kohlrabi, arugula, leeks, asparagus, lettuce (e.g., head, leaf, romaine), bok choy, malanga, broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew, cantaloupe), brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower, okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage, peppers, collards, potatoes, cucumber plants (marrows, cucumbers), pumpkins, cucurbits, radishes, dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic, spinach, green onions, squash, greens, beet (sugar beet and fodder beet), sweet potatoes, Swiss chard, horseradish, tomatoes, kale, turnips, and spices. Alternatively, centromeres could be isolated from fruit and vine crops such as apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince, almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries, boysenberries, cranberries, currants, loganberries, raspberries, strawberries, blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate, pineapple, tropical fruits, pomes, melon, mango, papaya, and lychee.
In still yet another aspect of the invention, centromeres could be isolated in accordance with the invention from field crop plants, such as evening primrose, meadow foam, corn (field, sweet, popcorn), hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye, wheat, etc.), sorghum, tobacco, kapok, leguminous plants (beans, lentils, peas, soybeans), oil plants (rape, mustard, poppy, olives, sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants (cotton, flax, hemp, jute), lauraceae (cinnamon, camphor), or plants such as coffee, sugarcane, tea, and natural rubber plants. Still other examples of plants from which centromeres could be isolated include bedding plants such as flowers, cactus, succulents and ornamental plants, as well as trees such as forest (broad-leaved trees and evergreens, such as conifers), fruit, ornamental, and nut-bearing trees, as well as shrubs and other nursery stock.
In still yet another aspect of the invention, the minichromosome vectors described herein may be used to perform efficient gene replacement studies. At present, gene replacement has been detected on only a few occasions in plant systems and has only been detected at low frequency in mammalian tissue culture systems (see Thomas et al., 1986; Smithies et al., 1985). The reason for this is the high frequency of illegitimate nonhomologous recombination events relative to the frequency of homologous recombination events (the latter are responsible for gene replacement). Artificial chromosomes may participate in homologous recombination preferentially. Since the artificial chromosomes remain intact upon delivery, no recombinogenic broken ends will be generated to serve as substrates for the extremely efficient illegitimate recombination machinery. Thus, the artificial chromosome vectors disclosed by the present invention will be maintained in the nucleus through meiosis and available to participate in homology-dependent meiotic recombination. In addition, because in principle artificial chromosomes of any length could be constructed using the teaching of the present invention, the vectors could be used to introduce extremely long stretches of DNA from the same or any other organism into cells. Specifically contemplated inserts include those from about several base pairs to one hundred megabase pairs, including about 1 kb, 25 kb, 50 kb, 100 kb, 125 kb, 150 kb, 200 kb, 300 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 1.25 Mb, 1.5 Mb, 2 Mb, 3 Mb, 5 Mb, 10 Mb, 25 Mb, 50 Mb and 100 Mb.
In still yet another aspect, the present invention provides methods for the construction of minichromosome vectors for the genetic transformation of plant cells, uses of the vectors, and organisms transformed by them. Standard reference works setting forth the general principles of recombinant DNA technology include Lewin, 1985. Other works describe methods and products of genetic engineering. See, e.g., Maniatis et al., 1982; Watson et al., 1983; Setlow et al., 1979; and Dillon et al., 1985.
In still yet another aspect, the invention provides a method of preparing a transgenic cell. In one embodiment of the invention, the method comprises the steps of: a) obtaining a nucleic acid molecule comprising Arabidopsis thaliana centromere DNA having the following characteristics: 1) mapping to a location on an Arabidopsis thaliana chromosome defined by a pair of genetic markers selected from the group consisting of: mi342 and T27K12, mi310 and g4133, atpox and ATA, mi233 and mi167, and F13K20-t7 and CUE1, and 2) sorts DNA to the spindle poles in meiosis I in a pattern indicating the disjunction of homologous chromosomes; b) preparing a recombinant construct comprising said nucleic acid molecule; and c) transforming a recipient cell with said recombinant construct.
The cell may be, for example, a lower eukaryotic cell including a yeast cell, or may be a higher eukaryotic cell. Where the cell is a higher eukaryotic cell, the cell may be an animal or plant cell. In one embodiment of the invention, the cell is not an Arabidopsis thaliana cell. In another embodiment of the invention, the Arabidopsis thaliana centromere is defined by the marker pair mi342 and T27K12, which may be further defined by the genetic marker pair T22C23-t7 and T3P8-sp6; and / or is defined by the marker pair mi310 and g4133, which may be further defined by the genetic marker pair F5J15-sp6 and T15D9; and / or is defined by the marker pair atpox and ATA, which may be further defined by the genetic marker pair T9G9-sp6 and T5M14-sp6; and / or is defined by the marker pair mi233 and mi167, which may be further defined by the genetic marker pair T24H24.30k3 and F13H14-t7; and / or is defined by the genetic marker pair F13K20-t7 and CUE1, which may be further defined by a genetic marker pair selected from the group consisting of F13K20-T7 and T18M4, F13K20-T7 and T18F2, F13K20-T7 and T24I20, T18M4 and T18F2, T18M4 and T24I20, T18M4 and CUE1, T18F2 and T24I20, T18F2 and CUE1, and T24I20 and CUE1.
In one embodiment of the invention, the transforming may comprise use of a method selected from the group consisting of: Agrobacterium-mediated transformation, protoplast transformation, electroporation, or particle bombardment. The recombinant construct may comprise desired elements, including a telomere, such as an Arabidopsis thaliana or yeast telomere. The recombinant construct may also comprise an autonomous replicating sequence (ARS), for example, an Arabidopsis thaliana ARS. The recombinant construct may also comprise a prokaryotic or eukaryotic selectable or screenable marker gene. Also desired to include with a recombinant construct may be one or more structural genes. Exemplary structural genes include a gene selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a seed storage gene, a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene. The method may further comprise the step of regenerating a transgenic plant from said cell.
In still yet another aspect, the invention provides a method of identifying a nucleic acid molecule capable of conferring centromere activity comprising the steps of: a) obtaining a nucleic acid molecule comprising Arabidopsis thaliana centromere DNA, wherein the Arabidopsis thaliana centromere is defined by a pair of genetic markers selected from the group consisting of mi342 and T27K12, mi310 and g4133, atpox and ATA, mi233 and mi167, and F13K20-t7 and T17M11-sp6; b) preparing a recombinant construct that comprises the nucleic acid molecule; and c) determining the ability of the recombinant construct to demonstrate a stable inheritance pattern. In the method, the ability to demonstrate a stable inheritance pattern may be determined by preparing a recombinant cell that comprises the recombinant construct. In another embodiment of the invention, the Arabidopsis thaliana centromere is defined by the marker pair mi342 and T27K12, which may be further defined by the genetic marker pair T22C23-t7 and T3P8-sp6; and / or is defined by the marker pair mi310 and g4133, which may be further defined by the genetic marker pair F5J15-sp6 and T15D9; and / or is defined by the marker pair atpox and ATA, which may be further defined by the genetic marker pair T9G9-sp6 and T5M14-sp6; and / or is defined by the marker pair mi233 and mi167, which may be further defined by the genetic marker pair T24H24.30k3 and F13H14-t7; and / or is defined by the genetic marker pair F13K20-t7 and CUE1, which may be further defined by a genetic marker pair selected from the group consisting of F13K20-T7 and T18M4, F13K20-T7 and T18F2, F13K20-T7 and T24I20, T18M4 and T18F2, T18M4 and T24I20, T18M4 and CUE1, T18F2 and T24I20, T18F2 and CUE1, and T24I20 and CUE1.
In one embodiment of the invention, the recombinant construct is not chromosomally integrated. Said obtaining may comprise obtaining a BAC or YAC clone comprising said Arabidopsis thaliana centromere DNA. The DNA may be obtained by a method that includes the use of pulsed-field gel electrophoresis, and may be obtained by a method that includes positional cloning. In another embodiment of the invention, the positional cloning may comprise identifying a contiguous set of clones comprising said Arabidopsis thaliana centromere DNA, wherein said set of clones is flanked by a pair of genetic markers selected from the group consisting of mi342 and T27K12, mi310 and g4133, atpox and ATA, mi233 and mi167, and F13K20-t7 and T17M11-sp6.
The contiguous set of clones may span the Arabidopsis thaliana centromere. The recombinant construct may comprise a selectable or screenable marker and said step of determining may comprise determining a phenotype conferred by the selectable or screenable marker. The determining may comprise, for example, determining the ability of the recombinant construct to demonstrate a stable inheritance pattern in mitosis and / or meiosis. In still another embodiment, the invention provides a transgenic cell prepared by a method provided by the invention. Also provided by the invention are a transgenic plant, plant parts and tissue cultures comprising the transgenic cell. In another embodiment of the invention, the Arabidopsis thaliana centromere is defined by the marker pair mi342 and T27K12, which may be further defined by the genetic marker pair T22C23-t7 and T3P8-sp6; and / or is defined by the marker pair mi310 and g4133, which may be further defined by the genetic marker pair F5J15-sp6 and T15D9; and / or is defined by the marker pair atpox and ATA, which may be further defined by the genetic marker pair T9G9-sp6 and T5M14-sp6; and / or is defined by the marker pair mi233 and mi167, which may be further defined by the genetic marker pair T24H24.30k3 and F13H14-t7; and / or is defined by the genetic marker pair F13K20-t7 and CUE1, which may be further defined by a genetic marker pair selected from the group consisting of F13K20-T7 and T18M4, F13K20-T7 and T18F2, F13K20-T7 and T24I20, T18M4 and T18F2, T18M4 and T24I20, T18M4 and CUE1, T18F2 and T24I20, T18F2 and CUE1, and T24I20 and CUE1.
In still yet another aspect of the invention, a centromere used in accordance with the invention is not from Arabidopsis, for example, not from Arabidopsis thaliana. Similarly, a plant or plant cell comprising a centromere composition in accordance with the invention may also be from a plant other than Arabidopsis.
BRIEF DESCRIPTION OF THE DRAWINGS
The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein. The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
FIG. 1. Centromere mapping with unordered tetrads: A cross of two parents (AABB x aabb), in which "A" is on the centromere of one chromosome, and "B" is linked to the centromere of a second chromosome. At meiosis, the A and B chromosomes assort independently, resulting in equivalent numbers of parental ditype (PD) and nonparental ditype (NPD) tetrads (recombinant progeny are shown in gray). Tetratype tetrads (TT) result only from a crossover between "B" and the centromere.
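The tetrad classes described for FIG. 1 can be tallied as in the following Python sketch, which is offered as an illustration rather than as the procedure actually used. It assumes that each tetrad is given as four two-letter genotypes for markers A and B, that the parental combinations are "AB" and "ab", and that the map distance between "B" and its centromere is estimated with the standard tetrad-analysis approximation of one half of the tetratype frequency. The function names and example data are hypothetical.

```python
from collections import Counter

PARENTAL = {"AB", "ab"}
NONPARENTAL = {"Ab", "aB"}


def classify_tetrad(members):
    """Classify one unordered tetrad as PD, NPD or TT."""
    if len(members) != 4:
        raise ValueError("a tetrad has exactly four members")
    kinds = set(members)
    if kinds == PARENTAL:
        return "PD"
    if kinds == NONPARENTAL:
        return "NPD"
    if kinds == PARENTAL | NONPARENTAL:
        return "TT"
    raise ValueError(f"unexpected tetrad composition: {members}")


def centromere_distance_cm(tetrads):
    """Estimate the B-to-centromere distance in cM as 100 * (1/2) * TT / total.

    Valid only under the FIG. 1 assumption that marker A sits on its own
    centromere, so that every tetratype reflects a crossover between B
    and its centromere.
    """
    counts = Counter(classify_tetrad(t) for t in tetrads)
    return 100.0 * 0.5 * counts["TT"] / len(tetrads)


# Hypothetical data set: 4 PD, 4 NPD and 2 TT tetrads.
example_tetrads = (
    [["AB", "AB", "ab", "ab"]] * 4
    + [["Ab", "Ab", "aB", "aB"]] * 4
    + [["AB", "Ab", "aB", "ab"]] * 2
)
print(centromere_distance_cm(example_tetrads))
```

Roughly equal PD and NPD counts, as in the example data, are what independent assortment of the two chromosomes predicts; the one-half factor reflects that only two of the four members of a tetratype tetrad are recombinant between the marker and its centromere.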
FIG. 2. Low resolution map location of Arabidopsis centromeres. Trisomic mapping was used to determine the map position of centromeres on four of the five Arabidopsis chromosomes (Koornneef, 1983; Sears et al., 1970). For chromosome 4, useful trisomic strains were not obtained. With the methods of Koornneef and Sears et al., 1983 (which rely on low-resolution deletion mapping), the centromere on chromosome 1 was found to lie between the two visible markers, tt1 and ch1, that are separated by 5 cM. Centromere positions on the other chromosomes are mapped to a lower resolution.
FIG. 3. Physical maps of the genetically-defined Arabidopsis centromeres. Each centromeric region is drawn to scale: physical sizes are derived from DNA sequencing (chromosomes II and IV) or from estimates based on BAC fingerprinting (Marra et al., 1999; Mozo et al., 1999) (chromosomes I, III, and V). Indicated for each chromosome are positions of markers (above), the number of tetratype / total tetrads at those markers (below), the boundaries of the centromere (thick black bars), and the name of contigs derived from fingerprint analysis (Marra et al., 1999; Mozo et al., 1999). For each contig, more than two genetic markers, developed from the database of BAC-end sequences (http://www.tigr.org/tdb/at/abe/bac_end_search.html), were scored. PCR primers corresponding to these sequences were used to identify size or restriction site polymorphisms in the Columbia and Landsberg ecotypes (Bell and Ecker, 1994; Konieczny and Ausubel, 1993); primer sequences are available (http://genome-www.Stanford.edu/Arabidopsis/aboutcaps.html). Tetratype tetrads resulting from treatments that stimulate crossing over (boxes); positions of markers in centimorgans (cM) shared with the recombinant inbred (RI) map (ovals) (http://nasc.nott.ac.uk/new_ri_map.html; Somerville and Somerville, 1999); and sequences bordering gaps in the physical map that correspond to 180 bp repeats (open circles) (Round et al., 1997), 5S rDNA (black circles) or 160 bp repeats (gray circles) are indicated (Copenhaver et al., 1999).
FIG. 4. Exemplary list of seed stock used for tetrad analysis in Arabidopsis thaliana. The individual strains are identified by the strain number (column B). The tetrad member number (column A) indicates the tetrad source (i.e., T1 indicates seeds from tetrad number 1, and the numbers -1, -2, -3, or -4 indicate individual members of the tetrad). The strains listed have been deposited with the Arabidopsis Biological Resources Center (ABRC) at Ohio State University under the name of Daphne Preuss.
FIG. 5. Marker information for centromere mapping. DNA polymorphisms used to localize the centromeres are indicated by chromosome (Column 1). The name of each marker is shown in Column 2; the name of the markers used by Copenhaver et al. to position centromeres is given in Column 3 and marker type is indicated in Column 4. CAPS (Co-dominant Amplified Polymorphic Sites) are markers that can be amplified with PCR and detected by digesting with the appropriate restriction enzyme (also indicated in Column 3). SSLPs (Simple Sequence Length Polymorphisms) detect polymorphisms by amplifying different length PCR products. Column 5 notes if the marker is available on public web sites (e.g., http://genome-www.stanford.edu/Arabidopsis). For those markers that are not available on public web sites, the sequences of the forward and reverse primers used to amplify the marker are listed in columns 6 and 7, respectively.
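The two marker types named for FIG. 5 differ in how a polymorphism is read out, which the following Python sketch illustrates under simplifying assumptions. The amplicon sequences are invented for illustration and are not taken from the disclosure; a restriction enzyme is modeled as a recognition string plus a cut offset, and only one strand is considered. The function names are hypothetical.

```python
def digest(amplicon, site, cut_offset=0):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position of the cut within the recognition site
    (for example 1 for an enzyme that cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        if cut > start:
            fragments.append(cut - start)
            start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


def caps_polymorphic(allele_1, allele_2, site, cut_offset=0):
    """CAPS readout: alleles differ in their restriction fragment pattern."""
    return sorted(digest(allele_1, site, cut_offset)) != sorted(
        digest(allele_2, site, cut_offset)
    )


def sslp_polymorphic(allele_1, allele_2):
    """SSLP readout: alleles differ in PCR product length."""
    return len(allele_1) != len(allele_2)


# Invented amplicons standing in for two ecotype alleles of one marker.
columbia = "ACGTACGTACGT" + "GAATTC" + "TTGCATTGCA"   # carries the site
landsberg = "ACGTACGTACGT" + "GAGTTC" + "TTGCATTGCA"  # site lost by one base change

print(digest(columbia, "GAATTC", cut_offset=1))
print(digest(landsberg, "GAATTC", cut_offset=1))
print(caps_polymorphic(columbia, landsberg, "GAATTC", cut_offset=1))
print(sslp_polymorphic(columbia, landsberg))
```

In this example the two products have the same length, so the difference would be invisible as an SSLP and is revealed only after digestion, which is the situation a CAPS marker is meant to handle.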
FIG. 6. Scoring PCR-based markers for tetrad analysis. The genotype of the progeny from one pollen tetrad (T2) was determined for two genetic markers (S0392 and nga76). Analysis of the four progeny plants (T2-1 through T2-4) using PCR and gel electrophoresis allows the genotype of the plant to be determined, and the genotype of the pollen parent to be inferred.
FIG. 7A-7N. Exemplary minichromosome vectors: The vectors shown in FIG. 7A, FIG. 7B, FIG. 7E, FIG. 7F, FIG. 7I and FIG. 7J have an E. coli origin of replication which can be high copy number, low copy number or single copy. In FIGS. 7A-7N, the vectors include a multiple cloning site which can contain recognition sequences for conventional restriction endonucleases with 4-8 bp specificity as well as recognition sequences for very rare cutting enzymes such as, for example, I-Ppo I, I-Ceu I, PI-Tli, PI-Psp I, Not I, and PI-Sce I. In FIG. 7A-7N, the centromere is flanked by Lox sites which can act as targets for the site-specific recombinase Cre. FIG. 7A. Shows an E. coli-plant circular shuttle vector with a plant ARS. FIG. 7B. Shows a plant circular vector without a plant ARS. The vector relies on a plant origin of replication function found in other plant DNA sequences such as selectable or screenable markers. FIG. 7C. Shows a yeast-plant circular shuttle vector with a plant ARS. The yeast ARS is included twice, once on either side of the multiple cloning site, to ensure that large inserts are stable. FIG. 7D. Shows a yeast-plant circular shuttle vector without a plant ARS. The vector relies on a plant origin of replication function found in other plant DNA sequences such as selectable markers. The yeast ARS is included twice, once on either side of the multiple cloning site, to ensure that large inserts are stable. FIG. 7E. Shows an E. coli-Agrobacterium-plant circular shuttle vector with a plant ARS. Vir functions for T-DNA transfer would be provided in trans by using the appropriate Agrobacterium strain. FIG. 7F. Shows an E. coli-Agrobacterium-plant circular shuttle vector without a plant ARS. The vector relies on a plant origin of replication function found in other plant DNA sequences such as selectable markers. Vir functions for T-DNA transfer would be provided in trans by using the appropriate Agrobacterium strain. FIG. 7G. Shows a linear plant vector with a plant ARS. The linear vector could be assembled in vitro and then transferred into the plant by, for example, mechanical means such as microprojectile bombardment, electroporation, or PEG-mediated transformation. FIG. 7H. Shows a linear plant vector without a plant ARS. The linear vector could be assembled in vitro and then transferred into the plant by, for example, mechanical means such as microprojectile bombardment, electroporation, or PEG-mediated transformation. FIGS. 7I-7N. The figures are identical to FIGS. 7A-7F, respectively, with the exception that they do not contain plant telomeres. These vectors will remain circular once delivered into the plant cell and therefore do not require telomeres to stabilize their ends.
FIG. 8. Sequence features at CEN2 (A) and CEN4 (B). Central bars depict annotated genomic sequence of indicated BAC clones: black, genetically-defined centromeres; white, regions flanking the centromeres. Sequences corresponding to genes and repetitive features, filled boxes (above and below the bars, respectively), are defined as in FIG. 12A-T: predicted nonmobile genes, red; genes carried by mobile elements, black; nonmobile pseudogenes, pink; pseudogenes carried by mobile elements, gray; retroelements, yellow; transposons, green; previously defined centromeric repeats, dark blue; 180 bp repeats, pale blue. Chromosome-specific centromere features include a large mitochondrial DNA insertion (orange; CEN2), and a novel array of tandem repeats (purple; CEN4). Gaps in the physical maps (//), unannotated regions (hatched boxes), and expressed genes (filled circles) are shown.
FIG. 9. Method for converting a BAC clone (or any other bacterial clone) into a minichromosome. A portion of the conversion vector will integrate into the BAC clone (or other bacterial clone of interest) either through non-homologous recombination (transposable element mediated) or by the action of a site-specific recombinase system, such as Cre-Lox or FLP-FRT.
FIG. 10. Method for analysis of dicentric chromosomes in Arabidopsis. BiBAC vectors containing centromere fragments (~100 kb) are integrated into the Arabidopsis genome using Agrobacterium-mediated transformation procedures and studied for adverse effects due to formation of dicentric chromosomes. 1) BiBACs containing centromere fragments are identified using standard protocols. 2) Plant transformation. 3) Analysis of defects in growth and development of plants containing dicentric chromosomes.
FIG. 11A-G. Method for converting a BAC clone (or any other bacterial clone) into a minichromosome. The necessary selectable markers and origins of replication for propagation of genetic material in E. coli, Agrobacterium and Arabidopsis, as well as the necessary genetic loci for Agrobacterium-mediated transformation into Arabidopsis, are cloned into a conversion vector. Using Cre/loxP recombination, the conversion vectors are recombined into BACs containing centromere fragments to form minichromosomes.
FIG. 12A-T. Properties of centromeric regions on chromosomes II and IV. (Top) Drawing of genetically-defined centromeres (gray shading; CEN2, left; CEN4, right), adjacent pericentromeric DNA, and a distal segment of each chromosome, scaled in Mb as determined by DNA sequencing (gaps in the gray shading correspond to gaps in the physical maps). Positions in cM on the RI map (http://nasc.nott.ac.uk/new_ri_map.html) and physical distances in Mb, beginning at the northern telomere and at the centromeric gap, are shown. (Bottom) The density of each feature (FIGS. 12A-12T) is plotted relative to the position on the chromosome in Mb. (FIGS. 12A, 12K) cM positions for markers on the RI map (solid squares) and a curve representing the genomic average of 1 cM/221 kb (dashed line). A single crossover within CEN4 in the RI mapping population (http://nasc.nott.ac.uk/new_ri_map.html; Somerville and Somerville, 1999) may reflect a difference between male meiotic recombination monitored here and recombination in female meiosis. (FIGS. 12B-12E and FIGS. 12L-12O) The % of DNA occupied by repetitive elements was calculated for a 100 kb window with a sliding interval of 10 kb. (FIGS. 12B, 12L) 180 bp repeats; (FIGS. 12C, 12M) sequences with similarity to retroelements, including del, Ta1, Ta11, copia, Athila, LINE, Ty3, TSCL, 106B (Athila-like), Tat1, LTRs and Cinful; (FIGS. 12D, 12N) sequences with similarity to transposons, including Tag1, En/Spm, Ac/Ds, Tam1, MuDR, Limpet, MITES and Mariner; (FIGS. 12E, 12O) previously described centromeric repeats including 163A, 164A, 164B, 278A, 11B7RE, mi167, pAT27, 160-, 180- and 500-bp repeats, and telomeric sequences (Murata et al., 1997; Heslop-Harrison et al., 1999; Brandes et al., 1997; Franz et al., 1998; Wright et al., 1996; Konieczny et al., 1991; Pelissier et al., 1995; Pelissier et al., 1996; Voytas and Ausubel, 1988; Chye et al., 1997; Tsay et al., 1993; Richards et al., 1991; Simoens et al., 1988; Thompson et al., 1996). (FIGS. 12F, 12P) % adenosine + thymidine was calculated for a 50 kb window with a sliding interval of 25 kb. (FIGS. 12G-12J, 12Q-12T) The number of predicted genes or pseudogenes was plotted over a window of 100 kb with a sliding interval of 10 kb. (FIGS. 12G, 12I, 12Q, 12S) Predicted genes (FIGS. 12G, 12Q) and pseudogenes (FIGS. 12I, 12S) typically not found on mobile DNA elements; (FIGS. 12H, 12J, 12R, 12T) predicted genes (FIGS. 12H, 12R) and pseudogenes (FIGS. 12J, 12T) often carried on mobile DNA, including reverse transcriptase, transposase, and retroviral polyproteins. Dashed lines indicate regions in which sequencing or annotation is in progress. Annotation was obtained from GenBank records (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html), from the AGAD database (http://www.tigr.org/tdb/at/agad/), and by BLAST comparisons to the database of repetitive Arabidopsis sequences (http://nucleus.cshl.org/protarab/AtRepBase.htm); though updates to annotation records may change individual entries, the overall structure of the region will not be significantly altered.
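The sliding-window density calculation described for FIGS. 12B-12E and 12L-12O (a 100 kb window moved in 10 kb steps) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the interval representation (half-open, non-overlapping, in bp), and the toy coordinates are hypothetical and are not taken from the specification.

```python
def window_coverage(features, region_length, window=100_000, step=10_000):
    """Return (window_start, percent_covered) for each sliding window.

    features: list of (start, end) half-open intervals in bp. They are
    assumed not to overlap one another (merge annotations beforehand).
    """
    results = []
    start = 0
    while start + window <= region_length:
        end = start + window
        covered = 0
        for f_start, f_end in features:
            # Length of the part of this feature that lies inside the window.
            overlap = min(end, f_end) - max(start, f_start)
            if overlap > 0:
                covered += overlap
        results.append((start, 100.0 * covered / window))
        start += step
    return results


# Hypothetical toy annotation: two repeat blocks in a 200 kb region.
repeats = [(20_000, 50_000), (120_000, 130_000)]
profile = window_coverage(repeats, region_length=200_000)
```

The same routine would serve for the 50 kb window and 25 kb step used for the A + T plots if the intervals were replaced by per-base composition counts; that variant is not shown here.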
FIG. 13. Methods for converting a BAC clone containing centromere DNA into a minichromosome for introduction into plant cells. The specific elements described are provided for exemplary purposes and are not limiting. A) Diagram of the BAC clone, noting the position of the centromere DNA (red), a site-specific recombination site (for example, lox P), and the F origin of replication. B) Conversion vector containing selectable and color markers (for example, 35S-Bar, nptII, LAT52-GUS, Scarecrow-GFP), telomeres, a site-specific recombination site (for example, lox P), antibiotic resistance markers (for example, amp or spc/str), Agrobacterium T-DNA borders (Agro Left and Right) and origin of replication (RiA4). C) The product of site-specific recombination with the Cre recombinase at the lox P sites yields a circular product with centromeric DNA and markers flanked by telomeres. D) Minichromosome immediately after transformation into plants; subsequently, the left and right borders will likely be removed by the plant cell and additional telomeric sequence added by the plant telomerase.
FIG. 14A-B. Conservation of centromere DNA. BAC clones (bars) used to sequence CEN2 (FIG. 14A) and CEN4 (FIG. 14B) are indicated; arrows denote the boundaries of the genetically-defined centromeres. PCR primer pairs yielding products from only Columbia (filled circles) or from both Landsberg and Columbia (open circles); BACs encoding DNA with homology to the mitochondrial genome (gray bars); 180 bp repeats (gray boxes); unsequenced DNA (dashed lines); and gaps in the physical map (double slashes) are shown.
FIG. 15A-B. Primers used to analyze conservation of centromere sequences in the A. thaliana Columbia and Landsberg ecotypes. FIG. 15A: Primers used for amplification of chromosome 2 sequences. FIG. 15B: Primers used for amplification of chromosome 4 sequences.
FIG. 16. Sequences common to CEN2 and CEN4. Genetically-defined centromeres (bold lines), sequenced (thin lines), and unannotated (dashed lines) BAC clones are displayed as in FIG. 14A, B. Repeats AtCCS1 (A. thaliana centromere conserved sequence) and AtCCS2 (closed and open circles, respectively), AtCCS3 (triangles), and AtCCS4-7 (4-7, respectively) are indicated (GenBank Accession numbers AF204874 to AF204880), and were identified using BLAST 2.0 (http://blast.wustl.edu).
FIG. 17. Sequenced BAC clones from centromere 2. The sequenced BAC clones are indicated by the horizontal lines near the top of the figure (see for example T14A4). The red box denotes the boundaries of centromere 2, and for the BAC clones that comprise the centromere, GenBank Accession numbers are given in the lower right panel. The contiguous sequences within the red box are given by SEQ ID NO:209 and SEQ ID NO:210. Horizontal lines below the sequenced clones indicate additional BAC clones; sequenced end points of these BACs are indicated with a closed circle. Clones with one or more endpoints that are undetermined are indicated by red text.
FIG. 18. Sequenced BAC clones from centromere 4. The sequenced BAC clones from centromere 4 are indicated by the horizontal lines near the top of the figure (see for example T24M8). The red box denotes the boundaries of centromere 4, and for the BAC clones that comprise the centromere, GenBank Accession numbers are given in the lower right panel. The contiguous sequences within the red box are given by SEQ ID NO:211 and SEQ ID NO:212. Horizontal lines below the sequenced clones indicate additional BAC clones; sequenced end points of these BACs are indicated with a closed circle. Clones with one or more endpoints that are undetermined are indicated by red text.
FIG. 19. Sequence tiling path of centromeres 1, 3, and 5. The boundaries of these centromeres were determined as described in Copenhaver et al. (1999). Contig numbers refer to the fingerprint contigs assembled by Marra et al. (1999). Some of these clones have been sequenced and accession numbers are provided (see attached list). In other cases, sequencing will be finished by the Arabidopsis genome project.
FIG. 20. Position of DNA from centromere 2 carried in BiBAC vectors. Clones were placed on the physical map by fingerprint and PCR analysis and comparison with the sequenced BAC clones.
FIG. 21. Exemplary methods for adding selectable or screenable markers to BiBAC clones. The desired marker is flanked by transposon borders and incubated with the BiBAC in the presence of transposase. Subsequently, the BiBAC is introduced into plants. Often these BiBACs may integrate into a natural chromosome, creating a dicentric chromosome which may have altered stability and may cause chromosome breakage, resulting in novel chromosome fragments.
FIG. 22. Assay of chromosome stability. The stability of natural chromosomes, constructed minichromosomes, or dicentric chromosomes can be assessed by monitoring the assortment of color markers through cell division. The markers are linked to the centromere in modified BAC or BiBAC vectors and introduced into plants. Regulation of the marker gene by an appropriate promoter determines which tissues will be assayed. For example, root-specific promoters, such as SCARECROW, make it possible to monitor assortment in files of root cells; post-meiotic pollen-specific promoters such as LAT52 allow monitoring of assortment through meiosis; and general promoters such as the 35S Cauliflower mosaic virus promoter make it possible to monitor assortment in many other plant tissues. Qualitative assays assess the general pattern of stability and measure the size of sectors corresponding to marker loss, while quantitative assays require knowledge of cell lineage and allow the number of chromosome loss events to be calculated during mitosis and meiosis.
FIG. 23A-D. Sequence alignments for 180 bp repeats from centromeres 1-4. The left hand column indicates the BAC source of the repeat copy and an arbitrarily assigned number given to the sequence. For example, the designation f12g6-1 indicates a repeat copy from BAC number f12g6 and arbitrarily given a repeat number of 1. The nucleic acid sequences of the BACs containing the repeat copies, designated f12g6, f5a13, t25f15, t12j2, t14c8, t6c20, f21i2, and f6h8, are given by SEQ ID NO:184, SEQ ID NO:191, SEQ ID NO:189, SEQ ID NO:20, SEQ ID NO:206, SEQ ID NO:186, SEQ ID NO:208 and SEQ ID NO:207, respectively. FIG. 23A. Alignment of 180 bp repeats from centromere 1. FIG. 23B. Alignment of 180 bp repeats from centromere 2. FIG. 23C. Alignment of 180 bp repeats from centromere 3. FIG. 23D. Alignment of 180 bp repeats from centromere 4.

DETAILED DESCRIPTION OF THE INVENTION
The inventors have overcome the deficiencies in the prior art by providing, for the first time, the nucleic acid sequence of a plant chromosome. The significance of this achievement relative to the prior art is exemplified by the general lack of detailed information in the art regarding the centromeres of multicellular organisms in general. To date, the most extensive and reliable characterization of centromere sequences has come from studies of lower eukaryotes such as S. cerevisiae and S. pombe, where the ability to analyze centromere functions has provided a clear picture of the desired DNA sequences. The S. cerevisiae centromere consists of three essential regions, CDEI, CDEII, and CDEIII, totaling only 125 bp, or approximately 0.006 to 0.06% of each yeast chromosome (Carbon et al., 1990; Bloom, 1993). S. pombe centromeres are up to 100 kB in length and consist of repetitive elements that comprise 1 to 3% of each chromosome (Baum et al., 1994). Subsequent studies, using tetrad analysis to follow the segregation of artificial chromosomes, demonstrated that less than 1/5 of the naturally occurring S. pombe centromere is sufficient for centromere function (Baum et al., 1994).
In contrast, the centromeres of mammals and other higher eukaryotes are poorly defined. Although DNA fragments that hybridize to centromeric regions in higher eukaryotes have been identified, little is known regarding the functionality of these sequences (see Tyler-Smith et al., 1993). In many cases centromere repeats correlate with centromere location, with probes to the repeats mapping both cytologically and genetically to centromere regions. Many of these sequences are tandemly-repeated satellite elements and dispersed repeated sequences in arrays ranging from 300 kB to 5000 kB in length (Willard, 1990). To date, only one of these repeats, a 171 bp element known as the alphoid satellite, has been shown by in situ hybridization to be present at each human centromere (Tyler-Smith et al., 1993). Whether repeats themselves represent functional centromeres remains controversial, as other genomic DNA is required to confer inheritance upon a region of DNA (Willard, 1997). Alternatively, the positions of some higher eukaryotic centromeres have been estimated by analyzing the segregation of chromosome fragments. This approach is imprecise, however, because only a limited set of fragments can be obtained, and because normal centromere function is influenced by surrounding chromosomal sequences (for example, see Koornneef, 1983; FIG. 2).
A more precise method for mapping centromeres that can be used in intact chromosomes is tetrad analysis (Mortimer et al., 1981), which provides a functional definition of a centromere in its native chromosomal context. At present, the only centromeres that have been mapped in this manner are from lower eukaryotes, including the yeasts Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Kluyveromyces lactis (Carbon et al., 1990; Hegemann et al., 1993). In these systems, accurate mapping of the centromeres made it possible to clone centromeric DNA using a chromosome walking strategy (Clarke et al., 1980). Subsequently, artificial chromosome assays were used to define the centromere sequences more precisely (Hegemann et al., 1993; Baum et al., 1994).
Attempts to develop a reliable centromeric assay in mammals have yielded ambiguous results. For example, Hadlaczky et al. (1991) identified a 14 kB human fragment that can, at low frequency, result in de novo centromere formation in a mouse cell line. In situ hybridization studies, however, have shown that this fragment is absent from naturally occurring centromeres, calling into question the reliability of this approach for testing centromere function (Tyler-Smith et al., 1993). Similarly, transfection of alphoid satellites into cell lines results in the formation of new chromosomes, yet these chromosomes also contain host sequences that could contribute centromere activity (Haaf et al., 1992; Willard, 1997). Further, the novel chromosomes can have alphoid DNA spread throughout their length yet have only a single centromeric constriction, indicating that a block of alphoid DNA alone may be insufficient for centromere function (Tyler-Smith et al., 1993).
Although plant centromeres can be visualized easily in condensed chromosomes, they have not been characterized as extensively as centromeres from yeast or mammals. Genetic characterization has relied on segregation analysis of chromosome fragments, and in particular on analysis of trisomic strains that carry a genetically marked, telocentric fragment (for example, see Koornneef, 1983; FIG. 2). In addition, repetitive elements have been identified that are either genetically (Richards et al., 1991) or physically (Alfenito et al., 1993; Maluszynska et al., 1991) linked to a centromere. In no case, however, has the functional significance of these sequences been tested.
Cytology in Arabidopsis thaliana has served to correlate centromere structure with repeat sequences. A fluorescent dye, DAPI, allows visualization of centromeric chromatin domains in metaphase chromosomes. A fluorescence in situ hybridization (FISH) probe based on 180 bp pAL1 repeat sequences colocalized with the DAPI signature near the centromeres of all five Arabidopsis chromosomes (Maluszynska et al., 1991; Martinez-Zapater et al., 1986). Although a functional role for pAL1 has been proposed, more recent studies have failed to detect this sequence near the centromeres in species closely related to Arabidopsis thaliana (Maluszynska et al., 1993). These results are particularly troubling because one of the species tested, A. pumila, is thought to be an amphidiploid, derived from a cross between A. thaliana and another close relative (Maluszynska et al., 1991; Price et al., 1994). Another repetitive sequence, pAtT12, has been genetically mapped to within 5 cM of the centromere on chromosome 1 and to the central region of chromosome 5 (Richards et al., 1991), although its presence on other chromosomes has not been established. Like pAL1, a role for pAtT12 in centromere function remains to be demonstrated.
Because kinetochores constitute a necessary link between centromeric DNA and the spindle apparatus, the proteins that are associated with these structures recently have been the focus of intense investigation (Bloom, 1993; Earnshaw, 1991). Human autoantibodies that bind specifically in the vicinity of the centromere have facilitated the cloning of centromere-associated proteins (CENPs; Rattner, 1991), and at least one of these proteins belongs to the kinesin superfamily of microtubule-based motors (Yen, 1991). Yeast centromere-binding proteins also have been identified, both through genetic and biochemical studies (Bloom, 1993; Lechner et al., 1991).
The centromeres of Arabidopsis thaliana have been mapped using trisomic strains, where the segregation of chromosome fragments (Koornneef, 1983) or whole chromosomes (Sears et al., 1970) was used to localize four of the centromeres to within 5, 12, 17 and 38 cM, respectively (FIG. 2). These positions have not been refined by more recent studies because the method is limited by the difficulty of obtaining viable trisomic strains (Koornneef, 1983). These factors introduce significant error into the calculated position of the centromere, and in Arabidopsis, where 1 cM corresponds roughly to 200 kB (Koornneef, 1987; Hwang et al., 1991), this method did not map any of the centromeres with sufficient precision to make chromosome walking strategies practical. Mapping of the Arabidopsis genome was also discussed by Hauge et al. (1991).
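To make the scale of this imprecision concrete, the genetic intervals quoted above can be converted to approximate physical sizes using the rough figure of 1 cM per 200 kB. The snippet below is only a back-of-the-envelope sketch; the constant and function names are illustrative, and the conversion assumes a uniform recombination rate, which the text itself indicates does not hold near centromeres.

```python
KB_PER_CM = 200  # approximate genome-wide value quoted in the text


def interval_kb(cm, kb_per_cm=KB_PER_CM):
    """Convert a genetic interval in cM to an approximate size in kB."""
    return cm * kb_per_cm


# Centromere intervals from trisomic mapping, in cM.
intervals_cm = [5, 12, 17, 38]
intervals_kb = [interval_kb(cm) for cm in intervals_cm]
```

Under this assumption, even the tightest interval (5 cM) spans on the order of 1 Mb, which is consistent with the statement that chromosome walking was not practical.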
I. Tetrad Analysis
With tetrad analysis, the recombination frequency between genetic markers and a centromere can be measured directly (FIG. 1). This method requires analysis of all four products of individual meioses, and it has not been applied previously to multicellular eukaryotes because their meiotic products typically are dissociated. Identification of the quartet mutation makes tetrad analysis possible for the first time in a higher eukaryotic system (Preuss et al., 1994). The quartet (qrt1) mutation causes the four products of pollen mother cell meiosis in Arabidopsis to remain attached. When used to pollinate a flower, one tetrad can result in the formation of four seeds, and the plants from these seeds can be analyzed genetically.
With unordered tetrads, such as those produced by S. cerevisiae or Arabidopsis, genetic mapping using tetrad analysis requires that two markers be scored simultaneously (Whitehouse, 1950). Tetrads fall into different classes depending on whether the markers are in a parental (nonrecombinant) or nonparental (recombinant) configuration (FIG. 1). A tetrad with only nonrecombinant members is referred to as a parental ditype (PD); one with only recombinant members as a nonparental ditype (NPD); and a tetrad with two recombinant and two nonrecombinant members as a tetratype (TT) (Perkins, 1953). If two genetic loci are on different chromosomes, and thus assort independently, the frequency of tetratype (crossover products) versus parental or nonparental ditype assortment (noncrossover products) depends on the frequency of crossover between each of the two loci and their respective centromeres.
Tetratype tetrads arise only when a crossover has occurred between a marker in question and its centromere. Thus, to identify genes that are closely linked to the centromere, markers are examined in a pair-wise fashion until the TT frequency approaches zero. The genetic distance (in centimorgans, cM) between the markers and their respective centromeres is defined by the function [(1/2)TT]/100 (Mortimer et al., 1981). Because positional information obtained by tetrad analysis is a representation of physical distance between two points, as one approaches the centromere the chance of a recombination event declines.
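The classification of unordered tetrads and the tetratype-based distance estimate can be sketched in code as follows. This is an illustrative sketch, not the inventors' procedure. It assumes that each tetrad member is scored at two markers as "L" (Landsberg) or "C" (Columbia), that the parental combinations are ("L", "L") and ("C", "C"), and that one of the two markers is tightly centromere-linked, so that half the tetratype frequency, expressed as a percentage, approximates the distance of the other marker from its centromere. The function names and the toy data are hypothetical.

```python
from collections import Counter


def classify_tetrad(tetrad, parental=(("L", "L"), ("C", "C"))):
    """Classify one tetrad as PD, NPD, or TT.

    tetrad: four (marker_a, marker_b) genotype pairs, one per meiotic product.
    Tetrads with 1 or 3 recombinant members (for example from gene
    conversion or a scoring error) are returned as "unclassified".
    """
    recombinant = sum(1 for member in tetrad if tuple(member) not in parental)
    return {0: "PD", 2: "TT", 4: "NPD"}.get(recombinant, "unclassified")


def tetratype_distance_cm(classes):
    """Half the tetratype frequency among classified tetrads, in percent."""
    counts = Counter(classes)
    scored = counts["PD"] + counts["NPD"] + counts["TT"]
    if scored == 0:
        raise ValueError("no classifiable tetrads")
    return 0.5 * counts["TT"] / scored * 100


# Hypothetical toy data set of four tetrads.
tetrads = [
    [("L", "L"), ("L", "L"), ("C", "C"), ("C", "C")],  # parental ditype
    [("L", "C"), ("L", "C"), ("C", "L"), ("C", "L")],  # nonparental ditype
    [("L", "L"), ("L", "C"), ("C", "L"), ("C", "C")],  # tetratype
    [("C", "C"), ("L", "L"), ("C", "C"), ("L", "L")],  # parental ditype
]
classes = [classify_tetrad(t) for t in tetrads]
distance = tetratype_distance_cm(classes)
```

In practice many more tetrads would be scored, and marker pairs would be compared until the tetratype frequency approaches zero, as described above.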
Tetrad analysis has been used to genetically track centromeres in yeasts and other fungi in which products of a single meiosis can be collected. The budding yeast Saccharomyces cerevisiae lacks mitotic condensation and thus cytogenetics (Hegemann et al., 1993), yet due to tetrad analysis, has served as the vehicle of discovery for centromere function. Meiosis is followed by the generation of four spores held within an ascus and these can be directly assayed for gene segregation.
The recessive qrt1 mutation makes it possible to perform tetrad analysis in Arabidopsis by causing the four products of meiosis to remain attached (Preuss et al., 1994; and Smythe, 1994; both incorporated herein by reference). As previously shown, within each tetrad, genetic loci segregate in a 2:2 ratio (FIG. 6). Individual tetrads can be manipulated onto flowers with a fine brush (at a rate of 20 tetrads per hour), and in 30% of such crosses, four viable seeds can be obtained (Preuss et al., 1994).
Mapping centromeres with high precision requires a dense genetic map, and although the current Arabidopsis map contains many visible markers, it would be laborious to cross each into the qrt1 background. Alternatively, hundreds of DNA polymorphisms can be introduced simultaneously by crossing two different strains, both containing the qrt1 mutation. A dense RFLP map (Chang et al., 1988) and PCR-based maps (Konieczny et al., 1993; Bell et al., 1994) have been generated in Arabidopsis from crosses of the Landsberg and Columbia strains (Arabidopsis map and genetic marker data are available from the Internet at http://genome-www.stanford.edu/Arabidopsis and http://cbil.humgen.upenn.edu/atgc/sslp_info/sslp.html). These strains differ by 1% at the DNA sequence level and have colinear genetic maps (Chang et al., 1988; Koornneef, 1987).
Centromere mapping with tetrad analysis requires simultaneous analysis of two markers, one of which must be centromere-linked (FIG. 1). To identify these centromere-linked markers, markers distributed across all 5 chromosomes were scored and compared in a pairwise fashion. Initially, genetic markers that can be scored by PCR analysis were tested (Konieczny et al., 1993; Bell et al., 1994). Such markers are now sufficiently dense to map any locus, and as additional PCR-detectable polymorphisms are identified they are incorporated into the analyses. In addition, as described in FIG. 5, new CAPS and SSLP markers useful for mapping the centromere can be readily identified.
A collection of Arabidopsis tetrad sets was prepared by the inventors for use in tetrad analysis. To date, progeny plants from >1,000 isolated tetrad seed sets have been germinated and leaf tissue collected and stored from each of the tetrad progeny plants. The leaf tissue from individual plants was used to make DNA for PCR-based marker analysis. The plants also were allowed to self-fertilize and the seed they produced was collected. From each of these individual seed sets, seedlings can be germinated and their tissues utilized for making genomic DNA. Tissue pooled from multiple seedlings is useful for making Southern genomic DNA blots for the analysis of restriction fragment length polymorphisms (RFLPs). An exemplary list of the seed stock of informative individuals used for tetrad analysis is given in FIG. 4.
II. Mapping Strategy
Previous DNA fingerprint and hybridization analysis of two bacterial artificial chromosome (BAC) libraries had led to the assembly of physical maps covering nearly all single-copy portions of the Arabidopsis genome (Marra et al., 1999). However, the presence of repetitive DNA near the Arabidopsis centromeres, including 180 bp repeats, retroelements, and middle repetitive sequences, complicated efforts to anchor centromeric BAC contigs to particular chromosomes (Murata et al., 1997; Heslop-Harrison et al., 1999; Brandes et al., 1997; Franz et al., 1998; Wright et al., 1996; Konieczny et al., 1991; Pelissier et al., 1995; Voytas and Ausubel, 1988; Chye et al., 1997; Tsay et al., 1993; Richards et al., 1991; Simoens et al., 1988; Thompson et al., 1996; Pelissier et al., 1996).
The inventors used genetic mapping to unambiguously assign these unanchored contigs to specific centromeres, scoring polymorphic markers in 48 plants with crossovers informative for the entire genome (Copenhaver et al., 1998). In this manner, several centromeric contigs were connected to the physical maps of the chromosome arms (see EXAMPLE 6), and a large set of DNA markers defining centromere boundaries was generated. DNA sequence analysis confirmed the structure of the contigs for chromosomes II and IV (Lin et al., 1999).
CEN2 and CEN4 were selected in particular for analysis. Both reside on structurally similar chromosomes with 3.5 Mb rDNA arrays on their distal tips, with regions measuring 3 and 2 Mb, respectively, between the rDNA and centromeres, and 16 and 13 Mb regions on their long arms (Copenhaver and Pikaard, 1996).
The virtually complete and annotated sequence of chromosomes II and IV was used to conduct an analysis of centromeres at the nucleotide level (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html). The sequence composition was analyzed within the genetically-defined centromere boundaries and compared to the adjacent pericentromeric regions (FIGS. 12A-T). Analysis of the two centromeres facilitated comparisons of sequence patterns and identification of conserved sequence elements.
The centromere sequences were found to harbour 180 bp repeat sequences. These sequences were found to reside in the gaps of each centromeric contig (FIG. 3; FIGS. 12B, 12L), with few repeats and no long arrays elsewhere in the genome. BAC clones near these gaps have end sequences corresponding to repetitive elements that likely constitute the bulk of the DNA between the contigs, including 180 bp repeats, 5S rDNA or 160 bp repeats (FIG. 3). Fluorescent in situ hybridization has shown these repetitive sequences are abundant components of Arabidopsis centromeres (Murata et al., 1997; Heslop-Harrison et al., 1999; Brandes et al., 1997). Genetic mapping and pulsed-field gel electrophoresis indicate that many 180 bp repeats reside in long arrays measuring between 0.4 and 1.4 Mb in the centromeric regions (Round et al., 1997); sequence analysis revealed additional interspersed copies near the gaps. The inventors specifically contemplate the use of such 180 bp repeats for the construction of minichromosomes.
The annotated sequence of chromosomes II and IV identified regions with homology to middle repetitive DNA, both within the functional centromeres and in the adjacent regions (FIGS. 12B-12E and 12L-12O). In a 4.3 Mb sequenced region that includes CEN2 and a 2.8 Mb sequenced region that includes CEN4, retrotransposon homology was found to account for >10% of the DNA sequence, with a maximum of 62% and 70%, respectively (FIGS. 12C, 12M). Sequences with similarity to transposons or middle repetitive elements were found to occupy a similar zone, but were less common (29% and 11% maximum density for chromosomes II and IV, respectively) (FIGS. 12D-12E and FIGS. 12N-12O). Finally, unlike in the case of Drosophila and Neurospora centromeres (Sun et al., 1997;
Cambareri et al., 1998), low complexity DNA, including microsatellites, homopolymer tracts, and AT-rich isochores, was not found to be enriched in the centromeres of Arabidopsis. Near CEN2, simple repeat sequence densities were comparable to those on the distal chromosome arms, occupying 1.5% of the sequence within the centromere, 3.2% in the flanking regions, and ranging from 20 to 319 bp in length (71 bp on average).
Except for an insertion of mitochondrial DNA at CEN2, the DNA in and around the centromeres did not contain any large regions that deviated significantly from the genomic average of ~64% A + T (FIGS. 12F, 12P) (Bevan et al., 1999).
Unlike the 180 bp repeats, all other repetitive elements near CEN2 and CEN4 were less abundant within the genetically-defined centromeres than in the flanking regions. The high concentration of repetitive elements outside of the functional centromere domain suggests they may be insufficient for centromere activity. Thus, identifying segments of the Arabidopsis genome that are enriched in these repetitive sequences does not pinpoint the regions that provide centromere function; a similar situation may occur in the genomes of other higher eukaryotes.
The repetitive DNA flanking the centromeres may play an important role, forming an altered chromatin conformation that serves to nucleate or stabilize centromere structure. Alternatively, other mechanisms could result in the accumulation of repetitive elements near centromeres. Though evolutionary models predict repetitive DNA accumulates in regions of low recombination (Charlesworth et al., 1986; Charlesworth et al., 1994), many Arabidopsis repetitive elements are more abundant in the recombinationally active pericentromeric regions than in the centromeres themselves. Instead, retroelements and other transposons may preferentially insert into regions flanking the centromeres or be eliminated from the rest of the genome at a higher rate.
III. Centromere Compositions
Certain aspects of the present invention concern isolated nucleic acid segments and recombinant vectors comprising a plant centromere. In one embodiment of the invention, the plant centromere is an Arabidopsis thaliana centromere. In a further embodiment of the invention, nucleic acid sequences comprising an A. thaliana chromosome 2 centromere are provided. The sequence of the Arabidopsis thaliana chromosome 2 centromere is exemplified by the nucleic acid sequences of SEQ ID NO:209 and SEQ ID NO:210. As shown in FIG. 17, the nucleic acid sequences of SEQ ID NO:209 and SEQ ID NO:210 flank a series of 180 bp repeats in centromere 2 of A. thaliana. As such, the chromosome 2 centromere may further be defined as comprising n number of repeats linked to a nucleic acid sequence included in SEQ ID NO:209 or SEQ ID NO:210, or sequences isolated from both of those sequences. In particular embodiments of the invention, the number of repeats (n) is about 2, 4, 8, 15, 25, 40, 70, 100, 200, 400, 600, 800, 1,000, 1,500, 2,000, 4,000, 6,000, 8,000, 10,000, 30,000, 50,000 or about 100,000. The actual repeat sequence used may vary. Representative samples of repeat sequences that could be used are given in FIGS. 23A-23D and included in the nucleic acid sequences given by SEQ ID NOs 184-208. The length of the repeat used may also vary, and may include repeats of, for example, about 10 bp, 20 bp, 40 bp, 60 bp, 80 bp, 100 bp, 120 bp, 140 bp, 150 bp, 160 bp, 170 bp, 180 bp, 190 bp, or about 200 bp or larger, or a repeat sequence, for example, as listed in FIG. 23A-FIG. 23D and included in the nucleic acid sequences given by SEQ ID NOs 184-208.
Isolated segments of the nucleic acid sequences of SEQ ID NO:209 and SEQ ID NO:210 are also contemplated to be of use with the invention, either with or without being linked to a series of repeats. Particularly, contiguous nucleic acid segments of about 100, 200, 400, 800, 1,500, 3,000, 5,000, 7,500, 10,000, 15,000, 25,000, 40,000, 75,000, 100,000, 125,000, 150,000, 250,000, 350,000, 450,000, 600,000, 700,000 and about 800,000 bp of the nucleic acid sequences of SEQ ID NO:209 or SEQ ID NO:210 specifically form part of the instant invention. In particular embodiments of the invention, such nucleic acid sequences may be linked to n number of repeated sequences, for example, where n is 2, 4, 8, 15, 25, 40, 70, 100, 200, 400, 600, 800, 1,000, 1,500, 2,000, 4,000, 6,000, 8,000, 10,000, 50,000 or about 100,000. The repeat sequence may comprise, for example, about 10 bp, 20 bp, 40 bp, 60 bp, 80 bp, 100 bp, 120 bp, 140 bp, 150 bp, 160 bp, 170 bp, 180 bp, 190 bp, or about 200 bp or a larger segment of contiguous nucleotides of, for example, a repeat listed in FIG. 23A-FIG. 23D and included in the nucleic acid sequences given by SEQ ID NOs 184-208.
In another embodiment of the invention, nucleic acid sequences comprising an A. thaliana chromosome 4 centromere are provided. The sequence of the Arabidopsis thaliana chromosome 4 centromere is exemplified by the nucleic acid sequences of SEQ ID NO:211 and SEQ ID NO:212. As shown in FIG. 18, the nucleic acid sequences of SEQ ID NO:211 and SEQ ID NO:212 in Arabidopsis flank a series of repeated sequences. As such, the chromosome 4 centromere may further be defined as comprising n number of repeats linked to a nucleic acid sequence included in SEQ ID NO:211 or SEQ ID NO:212, or sequences from both SEQ ID NO:211 and SEQ ID NO:212. In particular embodiments of the invention, the number of repeats (n) is about 2, 4, 8, 15, 25, 40, 70, 100, 200, 400, 600, 800, 1,000, 1,500, 2,000, 4,000, 6,000, 8,000, 10,000, 50,000, or about 100,000. The actual repeat sequence used may vary. Representative samples of repeat sequences that could be used are given in FIGS. 23A-23D, wherein these sequences are included in the nucleic acid sequences given by SEQ ID NOs 184-208. The length of the repeat used may also vary, and may include repeats of, for example, about 10 bp, 20 bp, 40 bp, 60 bp, 80 bp, 100 bp, 120 bp, 140 bp, 150 bp, 160 bp, 170 bp, 180 bp, 190 bp, or about 200 bp or larger.
Isolated segments of the nucleic acid sequences of SEQ ID NO:211 and SEQ ID NO:212 are also contemplated to be of use with the invention, either with or without being linked to a series of repeated sequences. Particularly, contiguous nucleic acid segments of about 100, 200, 400, 800, 1,500, 3,000, 5,000, 7,500, 10,000, 15,000, 25,000, 40,000, 75,000, 100,000, 125,000, 150,000, 250,000, 350,000, 450,000, 600,000, and 700,000 bp of the nucleic acid sequences of SEQ ID NO:211 or SEQ ID NO:212 specifically form part of the instant invention. In particular embodiments of the invention, such nucleic acid sequences may be linked to n number of repeated sequences, for example, where n is 2, 4, 8, 15, 25, 40, 70, 100, 200, 400, 600, 800, 1,000, 1,500, 2,000, 4,000, 6,000, 8,000, 10,000, 50,000 or about 100,000. The repeat sequence may comprise, for example, about 10 bp, 20 bp, 40 bp, 60 bp, 80 bp, 100 bp, 120 bp, 140 bp, 150 bp, 160 bp, 170 bp, 180 bp, 190 bp, or about 200 bp or a larger segment of contiguous nucleotides of the sequence of SEQ ID NO:184-208.
Also provided by the invention are regulatory regions from the Arabidopsis polyubiquitin 11 gene, including promoter and terminator sequences thereof. The nucleic acid sequences of these regulatory regions are exemplified by the nucleic acid sequences of SEQ ID NO:180 and SEQ ID NO:181. Also included with such sequences are contiguous stretches of from about 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 300, 500, 750, 1,000, 1,500, and about 2,000 nucleotides of the nucleic acid sequence of SEQ ID NO:180 and SEQ ID NO:181. In particular embodiments of the invention, it may be desirable to operably link the Arabidopsis polyubiquitin 11 promoter sequences to the 5' end of a coding sequence. It may also be desirable to operably link the Arabidopsis polyubiquitin 11 terminator sequence to the 3' end of a coding sequence.
Still further provided by the invention are regulatory regions from the Arabidopsis 40S ribosomal protein S16 gene, including promoter and terminator sequences thereof.
The nucleic acid sequences of these regulatory regions are exemplified by the nucleic acid sequences of SEQ ID NO:182 and SEQ ID NO:183. Also included with such sequences are contiguous stretches of from about 10, 15, 20, 25, 30, 40, 50, 75, 100, 125, 150, 200, 300, 500, 750, 1,000, 1,500, and about 2,000 nucleotides of the nucleic acid sequence of SEQ ID NO:182 and SEQ ID NO:183. In particular embodiments of the invention, it may be desirable to operably link the Arabidopsis 40S ribosomal protein S16 gene sequences to the 5' end of a coding sequence. It may also be desirable to operably link the Arabidopsis 40S ribosomal protein S16 gene sequence to the 3' end of a coding sequence.
Still further provided by the invention are gene sequences and related regulatory elements and sequences with other functions from centromere regions. In particular, the invention includes the centromere sequences given by SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21, as well as lengths of about 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 110, 125, 150, 175, 200, 250, 300, 350, 400, 500, 550, 590, 1,000, and about 1,500 contiguous nucleotides of these sequences, up to and including the full length of the sequences.
Centromere-containing nucleic acid sequences may be provided with other sequences for the creation and use of recombinant minichromosomes. Such nucleic acid sequences specifically within the scope of the invention include the nucleic acid sequences listed in the sequence listing provided herewith.
The present invention concerns nucleic acid segments, isolatable from A. thaliana cells, that are enriched relative to total genomic DNA or other nucleic acids and are capable of conferring centromere activity to a recombinant molecule when incorporated into the host cell. As used herein, the term "nucleic acid segment" refers to a nucleic acid molecule that has been purified from total genomic nucleic acids of a particular species. Therefore, a nucleic acid segment conferring centromere function refers to a nucleic acid segment that contains centromere sequences yet is isolated away from, or purified free from, total genomic nucleic acids of A. thaliana. Included within the term "nucleic acid segment" are nucleic acid segments and smaller fragments of such segments, and also recombinant vectors, including, for example, BACs, YACs, plasmids, cosmids, phage, viruses, and the like.
Similarly, a nucleic acid segment comprising an isolated or purified centromeric sequence refers to a nucleic acid segment including centromere sequences and, in certain aspects, regulatory sequences, isolated substantially away from other naturally occurring sequences, or other nucleic acid sequences. In this respect, the term "gene" is used for simplicity to refer to a functional nucleic acid segment, protein, polypeptide or peptide encoding unit. As will be understood by those in the art, this functional term includes both genomic sequences, cDNA sequences and smaller engineered gene segments that may express, or may be adapted to express, proteins, polypeptides or peptides.

"Isolated substantially away from other sequences" means that the sequences of interest, in this case centromere sequences, are included within the genomic nucleic acid clones provided herein. Of course, this refers to the nucleic acid segment as originally isolated, and does not exclude genes or coding regions later added to the segment by the hand of man.
In particular embodiments, the invention concerns isolated nucleic acid segments and recombinant vectors incorporating nucleic acid sequences that encode a centromere functional sequence that includes a contiguous sequence from the centromeres of the current invention. In certain other embodiments, the invention concerns isolated nucleic acid segments and recombinant vectors that include within their sequence a contiguous nucleic acid sequence from an A. thaliana centromere. Again, nucleic acid segments that exhibit centromere function activity will be most preferred.
The nucleic acid segments of the present invention, regardless of the length of the sequence itself, may be combined with other nucleic acid sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably. It is therefore contemplated that a nucleic acid fragment of almost any length may be employed, with the total length preferably being limited by the ease of preparation and use in the intended recombinant DNA protocol.
(i) Primers and Probes
In addition to their use in the construction of recombinant constructs, including minichromosomes, the nucleic acid sequences disclosed herein may find a variety of other uses. For example, the centromere sequences described herein may find use as probes or primers in nucleic acid hybridization embodiments. As such, it is contemplated that nucleic acid segments that comprise a sequence region that consists of at least a 14 nucleotide long contiguous sequence that has the same sequence as, or is complementary to, a 14 nucleotide long contiguous DNA segment of a centromere sequence of the current invention, for example, of the sequences given by SEQ ID NOS:1-212, and particularly SEQ ID NOS:1-21 and SEQ ID NOS:180-212, will find particular utility. Longer contiguous identical or complementary sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1,000, 2,000, 5,000 bp, etc., including all intermediate lengths and up to and including the full-length sequence of the sequences given in SEQ ID NOS:1-212, also will be of use in certain embodiments.
As described in detail herein, the ability of such nucleic acid probes to specifically hybridize to centromeric sequences will enable them to be of use in detecting the presence of similar, partially complementary sequences from other plants or animals. However, other uses are envisioned, including the use of the centromeres for the preparation of mutant species primers, or primers for use in preparing other genetic constructions.
Nucleic acid fragments having sequence regions consisting of contiguous nucleotide stretches of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or even of 101-200 nucleotides or so, identical or complementary to a centromere sequence of the current invention, including the sequences given in SEQ ID NOS:1-212, are particularly contemplated as hybridization probes for use in, e.g., Southern and Northern blotting. Smaller fragments will generally find use in hybridization embodiments, wherein the length of the contiguous complementary region may be varied, such as between about 10-14 and about 100 or 200 nucleotides, but larger contiguous complementarity stretches also may be used, according to the length of the complementary sequences one wishes to detect.
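The probe criterion described above (a contiguous stretch of at least 14 nucleotides that is identical or complementary to a target sequence) can be illustrated with a minimal Python sketch. This is not part of the disclosure; the function names and the default threshold parameter `min_len` are hypothetical and chosen only for illustration, and the sketch handles only unambiguous A/C/G/T bases.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def longest_shared_stretch(probe: str, target: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    probe, target = probe.upper(), target.upper()
    best = 0
    # Longest-common-substring dynamic programming, one row at a time.
    prev = [0] * (len(target) + 1)
    for p in probe:
        curr = [0] * (len(target) + 1)
        for j, t in enumerate(target, start=1):
            if p == t:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def is_candidate_probe(probe: str, target: str, min_len: int = 14) -> bool:
    """True if probe is identical or complementary to target over >= min_len nt."""
    return (
        longest_shared_stretch(probe, target) >= min_len
        or longest_shared_stretch(reverse_complement(probe), target) >= min_len
    )
```

A 15-nucleotide window copied from a target, or the reverse complement of that window, satisfies the check, whereas an 8-nucleotide oligomer cannot. The check says nothing about hybridization specificity within a whole genome.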
Of course, fragments may also be obtained by other techniques such as, e.g., by mechanical shearing or by restriction enzyme digestion. Small nucleic acid segments or fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, as is commonly practiced using an automated oligonucleotide synthesizer. Also, fragments may be obtained by application of nucleic acid reproduction technology, such as the PCR™ technology of U.S. Patents 4,683,195 and 4,683,202 (each incorporated herein by reference), by introducing selected sequences into recombinant vectors for recombinant production, and by other recombinant DNA techniques generally known to those of skill in the art of molecular biology.
Accordingly, the centromere sequences of the current invention may be used for their ability to selectively form duplex molecules with complementary stretches of DNA fragments. Depending on the application envisioned, one will desire to employ varying conditions of hybridization to achieve varying degrees of selectivity of probe towards target sequence. For applications requiring high selectivity, one will typically desire to employ relatively stringent conditions to form the hybrids, e.g., one will select relatively low salt and/or high temperature conditions, such as provided by about 0.02 M to about 0.15 M NaCl at temperatures of about 50°C to about 70°C. Such selective conditions tolerate little, if any, mismatch between the probe and the template or target strand, and would be particularly suitable for isolating centromeric DNA segments. Nucleic acid sequences hybridizing under these conditions and the conditions below to the nucleic acid sequences provided by the invention, including those given by SEQ ID NOS:1-212, form a part of the invention. Detection of nucleic acid segments via hybridization is well-known to those of skill in the art, and the teachings of U.S. Patents 4,965,188 and 5,176,995 (each specifically incorporated herein by reference in its entirety) are exemplary of the methods of hybridization analyses. Teachings such as those found in the texts of Maloy et al., 1991; Segal, 1976; Prokop, 1991; and Kuby, 1994, are particularly relevant.
Of course, for some applications, for example, where one desires to prepare mutants employing a mutant primer strand hybridized to an underlying template or where one seeks to isolate centromere function-conferring sequences from related species, functional equivalents, or the like, less stringent hybridization conditions will typically be needed in order to allow formation of the heteroduplex. In these circumstances, one may desire to employ conditions such as about 0.15 M to about 0.9 M salt, at temperatures ranging from about 20°C to about 55°C. Cross-hybridizing species can thereby be readily identified as positively hybridizing signals with respect to control hybridizations. In any case, it is generally appreciated that conditions can be rendered more stringent by the addition of increasing amounts of formamide, which serves to destabilize the hybrid duplex in the same manner as increased temperature or decreased salt. Thus, hybridization conditions can be readily manipulated, and thus will generally be a method of choice depending on the desired results.
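The qualitative relationship described above among salt concentration, formamide, and duplex stability can be illustrated with a widely used empirical approximation for the melting temperature (Tm) of long DNA:DNA hybrids. The formula and its constants are a textbook rule of thumb and are not taken from this disclosure; the function names are hypothetical, and the estimate is only a rough guide over moderate salt concentrations.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def estimate_tm_celsius(seq: str, na_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Rule-of-thumb Tm for a long DNA:DNA hybrid (assumed formula, see lead-in).

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/length
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent(seq)
        - 0.61 * formamide_percent
        - 500.0 / len(seq)
    )


# Hypothetical 100-nt probe with 50% G+C content.
probe = "ATGC" * 25
for salt in (0.02, 0.15):
    print(f"{salt} M Na+: Tm is roughly {estimate_tm_celsius(probe, salt):.1f} C")
```

Under this approximation, lowering the salt concentration or adding formamide lowers the estimated Tm, so a fixed wash temperature sits closer to the Tm and tolerates less mismatch, which is consistent with the description of stringency given in the text.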
In certain embodiments, it will be advantageous to employ nucleic acid sequences of the present invention in combination with an appropriate means, such as a label, for determining hybridization. A wide variety of appropriate indicator means are known in the art, including fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable signal. In preferred embodiments, one will likely desire to employ a fluorescent label or an enzyme tag, such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other environmentally undesirable reagents. In the case of enzyme tags, colorimetric indicator substrates are known that can be employed to provide a means, visible to the human eye or spectrophotometrically, to identify specific hybridization with complementary nucleic acid-containing samples.
In general, it is envisioned that the hybridization probes described herein will be useful both as reagents in solution hybridization as well as in embodiments employing a solid phase. In embodiments involving a solid phase, the test DNA (or RNA) is adsorbed or otherwise affixed to a selected matrix or surface. This fixed, single-stranded nucleic acid is then subjected to specific hybridization with selected probes under desired conditions. The selected conditions will depend on the particular circumstances based on the particular criteria required (depending, for example, on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the hybridized surface so as to remove nonspecifically bound probe molecules, specific hybridization is detected, or even quantitated, by means of the label.
(ii) Large Nucleic Acid Segments
Using the markers flanking each centromere (see FIG. 3), it may be possible to purify a contiguous DNA fragment that contains both flanking markers and the centromere encoded between those markers. In order to carry this out, very large DNA fragments up to the size of an entire chromosome are prepared by embedding Arabidopsis tissues in agarose using, for example, the method described by Copenhaver et al. (1995). These large pieces of DNA can be digested in the agarose with any restriction enzyme. Those restriction enzymes which will be particularly useful for isolating intact centromeres include enzymes which yield very large DNA fragments. Such restriction enzymes include those with specificities greater than six base pairs such as, for example, Asc I, Bae I, BbvC I, Fse I, Not I, Pac I, Pme I, PpuM I, Rsr II, SanD I, Sap I, SexA I, Sfi I, Sgf I, SgrA I, Sbf I, Srf I, Sse8387 I, Sse8647 I, Swa I, UbaD I, and UbaE I, or any other enzyme that cuts at a low frequency within the Arabidopsis genome, and specifically within the centromeric region. Alternatively, a partial digest with a more frequent cutting restriction enzyme could be used.
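The reason that enzymes with longer recognition sequences yield larger fragments can be sketched numerically. The following Python example is an illustration only and is not part of the disclosure: it assumes independent, random base composition, which real genomes do not follow exactly, and the function name and the `gc_fraction` parameter are hypothetical.

```python
def expected_spacing_bp(site: str, gc_fraction: float = 0.5) -> float:
    """Expected average distance (bp) between occurrences of a recognition site.

    Assumes each base is drawn independently, with G and C each at
    gc_fraction / 2 and A and T each at (1 - gc_fraction) / 2.
    """
    if not 0.0 < gc_fraction < 1.0:
        raise ValueError("gc_fraction must be between 0 and 1")
    p_gc = gc_fraction / 2.0
    p_at = (1.0 - gc_fraction) / 2.0
    probability = 1.0
    for base in site.upper():
        if base in "GC":
            probability *= p_gc
        elif base in "AT":
            probability *= p_at
        else:
            raise ValueError(f"unsupported base: {base}")
    return 1.0 / probability


# A six-base site (EcoRI, GAATTC) versus an eight-base site (NotI, GCGGCCGC).
print(expected_spacing_bp("GAATTC"))    # 4096.0
print(expected_spacing_bp("GCGGCCGC"))  # 65536.0
```

With equal base frequencies, a six-base site is expected about once every 4^6 = 4,096 bp and an eight-base site about once every 4^8 = 65,536 bp. Passing a lower `gc_fraction` shows that a G+C-rich site becomes rarer still in an A+T-rich genome.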
Alternatively, large DNA fragments spanning some or all of a centromere could be produced using RecA-Assisted Restriction Endonuclease (RARE) cleavage (Ferrin, 1991). In order to carry this out, very large DNA fragments up to the size of an entire chromosome are prepared by embedding Arabidopsis tissues in agarose using, for example, the method described by Copenhaver et al. (1995). Single stranded DNA oligomers with sequences homologous to sites flanking the region of DNA to be purified are made to form triple stranded complexes with the agarose embedded DNA using the recombinase enzyme RecA. The DNA is then treated with a site specific methylase such as, for example, Alu I methylase, BamH I methylase, dam methylase, EcoR I methylase, Hae III methylase, Hha I methylase, Hpa II methylase, or Msp I methylase. The methylase will modify all the sites specified by its recognition sequence except those within the triplex region protected by the RecA/DNA oligomer complex. The RecA/DNA oligomer complexes are then removed from the agarose embedded DNA and the DNA is then cleaved with the restriction enzyme corresponding to the methylase used; for example, if EcoRI methylase was used then EcoRI restriction endonuclease would be used to perform the cleavage. Only those sites protected from modification will be subject to cleavage by the restriction endonuclease. Thus, by choosing targets flanking the centromeric regions that contain the recognition sequence of a site specific methylase/restriction endonuclease pair, RARE can be used to cleave the entire region from the rest of the chromosome. It is important to note that this method can be used to isolate a DNA fragment of unknown composition by using sequence information flanking it. Thus, this method may be used to isolate the DNA contained within any gaps in the physical map for the centromeres. The DNA isolated by this method can then be sequenced.
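The logic of RARE cleavage described above (sites covered by a RecA/oligomer complex escape methylation and are therefore the only sites later cut) can be sketched as a toy simulation in Python. This is an illustrative model only: it works on a single strand represented as a string, the function names and the interval representation of protected regions are hypothetical, and it assumes the EcoRI methylase/endonuclease pair with cleavage after the first base of GAATTC.

```python
import re


def site_positions(seq: str, site: str = "GAATTC") -> list[int]:
    """Start index of every occurrence of the recognition site."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def rare_cleave(seq: str, protected: list[tuple[int, int]],
                site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut only at sites lying wholly inside a protected (unmethylated) interval.

    protected holds half-open (start, end) intervals covered by the
    RecA/oligomer complexes during the methylation step.
    """
    cuts = []
    for pos in site_positions(seq, site):
        covered = any(start <= pos and pos + len(site) <= end
                      for start, end in protected)
        if covered:  # this site was shielded from the methylase, so it is cut
            cuts.append(pos + cut_offset)
    fragments, last = [], 0
    for cut in cuts:
        fragments.append(seq[last:cut])
        last = cut
    fragments.append(seq[last:])
    return fragments


# Three sites; only the outer two are protected, so the middle one stays intact.
dna = "AAAA" + "GAATTC" + "CCCC" + "GAATTC" + "TTTT" + "GAATTC" + "GGGG"
print(rare_cleave(dna, protected=[(2, 12), (22, 32)]))
```

In this toy example the central fragment is released intact even though it contains an internal recognition site, which mirrors how the method can excise a region of unknown composition using only flanking sequence information.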
The large DNA fragments produced by digestion with restriction enzymes or by RARE cleavage are then separated by size using pulsed-field gel electrophoresis (PFGE) (Schwartz et al., 1982). Specifically, Contour-clamped Homogeneous Electric Field (CHEF) electrophoresis (a variety of PFGE) can be used to separate DNA molecules as large as 10 Mb (Chu et al., 1985). Large DNA fragments resolved on CHEF gels can then be analyzed using standard Southern hybridization techniques to identify and measure the size of those fragments which contain both centromere flanking markers and, therefore, the centromere. After determining the size of the centromere containing fragment by comparison with known size standards, the region from the gel that contains the centromere fragment can be cut out of a duplicate gel. This centromeric DNA can then be analyzed, sequenced, and used in a variety of applications, as described below, including the construction of minichromosomes. As indicated in detail below, minichromosomes can be constructed by attaching telomeres and selectable markers to the centromere fragment cut from the agarose gel using standard techniques which allow DNA ligation within the gel slice. Plant cells can then be transformed with this hybrid DNA molecule using the techniques described herein below.
IV. Recombinant Constructs Comprising Centromere Sequences
In light of the instant disclosure it will be possible for those of ordinary skill in the art to construct the recombinant DNA constructs described herein. Useful construction methods are well-known to those of skill in the art (see, for example, Maniatis et al., 1982). As constructed, the minichromosome will preferably include an autonomous replication sequence (ARS) functional in plants, a centromere functional in plants, and a telomere functional in plants.
The basic elements in addition to a plant centromere that may be used in constructing a minichromosome vector are known to those of skill in the art. For example, one type of telomere sequence that could be used is an Arabidopsis telomere, which consists of head to tail arrays of the monomer repeat CCCTAAA totaling a few (for example 3-4) kb in length. The telomeres of Arabidopsis, like those of other organisms, vary in length and do not appear to have a strict length requirement. An example of a cloned telomere can be found in GenBank accession number (Richards and Ausubel, 1988). Yeast telomere sequences have also been described (see, e.g., Louis, 1994; GenBank accession number S70807). Additionally, a method for isolating a higher eukaryotic telomere from Arabidopsis thaliana was described by Richards and Ausubel (1988).
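To give a sense of scale for the telomere arrays described above, the following Python sketch builds a head-to-tail array of the CCCTAAA monomer that reaches a requested length. It is an illustration only, not a construction protocol; the function name and the rounding-up behavior are assumptions.

```python
import math

TELOMERE_MONOMER = "CCCTAAA"  # monomer repeat as given in the text


def telomere_array(target_bp: int, monomer: str = TELOMERE_MONOMER) -> str:
    """Head-to-tail array of whole monomers, at least target_bp long."""
    copies = math.ceil(target_bp / len(monomer))
    return monomer * copies


array_3kb = telomere_array(3000)
print(len(array_3kb) // len(TELOMERE_MONOMER), "copies,", len(array_3kb), "bp")
```

Because the monomer is 7 bp long, an array of about 3 kb corresponds to roughly 429 copies and one of about 4 kb to roughly 572 copies.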
It is commonly believed that higher eukaryotes do not possess a specific sequence that is used as a replication origin, but instead replicate their DNA from random sites distributed along the chromosome. In Arabidopsis, it is thought that the cell will form origins of replication about once every 70 kb (Van't Hof, 1978). Thus, because higher eukaryotes have origins of replication at potentially random positions on each chromosome, it is not possible to describe a specific origin sequence, but it may generally be assumed that a segment of plant DNA of a sufficient size will be recognized by the cell and origins will be generated on the construct. For example, any piece of Arabidopsis genomic DNA larger than 70 kb would be expected to contain an ARS. By including such a segment of DNA on a recombinant vector, ARS function may be provided to the vector. Additionally, many S. cerevisiae autonomous replicating sequences have been sequenced and could be used to fulfill the ARS function. One example is the Saccharomyces cerevisiae autonomously replicating sequence ARS131A (GenBank number L25319). Many origins of replication have also been sequenced and cloned from E. coli and could be used with the invention, for example, the ColE1 origin of replication (Ohmori and Tomizawa, 1979; GenBank number V00270). One Agrobacterium origin that could be used is RiA4. The localization of origins of replication in the plasmids of Agrobacterium rhizogenes strain A4 was described by Jouanin et al. (1985).
(i) Considerations in the Preparation of Recombinant Constructs
In addition to the basic elements, positive or negative selectable plant markers (e.g., antibiotic or herbicide resistance genes), and a cloning site for insertion of foreign DNA may be included. In addition, a visible marker, such as green fluorescent protein, also may be desirable. In order to propagate the vectors in E. coli, it is necessary to convert the linear molecule into a circle by addition of a stuffer fragment between the telomeres. Inclusion of an E. coli plasmid replication origin and selectable marker also may be preferred. It also may be desirable to include Agrobacterium sequences to improve replication and transfer to plant cells. The inventors have described a number of exemplary minichromosome constructs in FIGS. 7A-7H, although it will be apparent to those of skill in the art that many changes may be made in the order and types of elements present in these constructs and still obtain a functional minichromosome within the scope of the instant invention.
Artificial plant chromosomes which replicate in yeast also may be constructed to take advantage of the large insert capacity and stability of repetitive DNA inserts afforded by this system (see Burke et al., 1987). In this case, yeast ARS and CEN sequences may be added to the vector. The artificial chromosome is maintained in yeast as a circular molecule using a stuffer fragment to separate the telomeres.

A fragment of DNA, from any source whatsoever, may be purified and inserted into a minichromosome at any appropriate restriction endonuclease cleavage site. The DNA segment usually will include various regulatory signals for the expression of proteins encoded by the fragment. Alternatively, regulatory signals resident in the minichromosome may be utilized.
The techniques and procedures required to accomplish insertion are well-known in the art (see Maniatis et al., 1982). Typically, this is accomplished by incubating a circular plasmid or a linear DNA fragment in the presence of a restriction endonuclease such that the restriction endonuclease cleaves the DNA molecule. Endonucleases preferentially break the internal phosphodiester bonds of polynucleotide chains. They may be relatively unspecific, cutting polynucleotide bonds regardless of the surrounding nucleotide sequence. However, the endonucleases which cleave only a specific nucleotide sequence are called restriction enzymes. Restriction endonucleases generally internally cleave DNA molecules at specific recognition sites, making breaks within "recognition" sequences that in many, but not all, cases exhibit two-fold symmetry around a given point. Such enzymes typically create double-stranded breaks.
Many of these enzymes make a staggered cleavage, yielding DNA fragments with protruding single-stranded 5' or 3' termini. Such ends are said to be "sticky" or "cohesive" because they will hydrogen bond to complementary 3' or 5' ends. As a result, the end of any DNA fragment produced by an enzyme, such as EcoRI, can anneal with any other fragment produced by that enzyme. This property allows splicing of foreign genes into plasmids, for example. Some restriction endonucleases that may be particularly useful with the current invention include HindIII, PstI, EcoRI, and BamHI.
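The notion of cohesive ends can be made concrete with a small Python sketch that derives the single-stranded overhang left by each of the four enzymes named above. The recognition sequences and top-strand cut positions in the table are standard values for these enzymes rather than data from this disclosure, and the function names are hypothetical.

```python
ENZYMES = {
    # name: (recognition site, cut index on the top strand)
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "PstI": ("CTGCAG", 5),     # CTGCA^G
}


def overhang(enzyme: str) -> tuple[str, str]:
    """Return (end type, overhang sequence) for a palindromic cutter."""
    site, top_cut = ENZYMES[enzyme]
    # On a palindromic site the bottom-strand cut mirrors the top-strand cut.
    bottom_cut = len(site) - top_cut
    if top_cut < bottom_cut:
        return ("5'", site[top_cut:bottom_cut])
    if top_cut > bottom_cut:
        return ("3'", site[bottom_cut:top_cut])
    return ("blunt", "")


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Ends anneal when both enzymes leave the same self-complementary overhang."""
    return overhang(enzyme_a) == overhang(enzyme_b)


for name in ENZYMES:
    print(name, overhang(name))
```

In this sketch EcoRI, BamHI, and HindIII each leave a four-base 5' overhang, PstI leaves a four-base 3' overhang, and ends produced by the same enzyme are reported as compatible while, for example, EcoRI and BamHI ends are not.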
Some endonucleases create fragments that have blunt ends, that is, that lack any protruding single strands. An alternative way to create blunt ends is to use a restriction enzyme that leaves overhangs, but to fill in the overhangs with a polymerase, such as Klenow, thereby resulting in blunt ends. When DNA has been cleaved with restriction enzymes that cut across both strands at the same position, blunt end ligation can be used to join the fragments directly together. The advantage of this technique is that any pair of ends may be joined together, irrespective of sequence.
Those nucleases that preferentially break off terminal nucleotides are referred to as exonucleases. For example, small deletions can be produced in any DNA molecule by treatment with an exonuclease which starts from each 3' end of the DNA and chews away single strands in a 3' to 5' direction, creating a population of DNA molecules with single-stranded fragments at each end, some containing terminal nucleotides. Similarly, exonucleases that digest DNA from the 5' end or enzymes that remove nucleotides from both strands have often been used. Some exonucleases which may be particularly useful in the present invention include Bal31, S1, and ExoIII. These nucleolytic reactions can be controlled by varying the time of incubation, the temperature, and the enzyme concentration needed to make deletions. Phosphatases and kinases also may be used to control which fragments have ends which can be joined. Examples of useful phosphatases include shrimp alkaline phosphatase and calf intestinal alkaline phosphatase. An example of a useful kinase is T4 polynucleotide kinase.
Once the source DNA sequences and vector sequences have been cleaved and modified to generate appropriate ends, they are incubated together with enzymes capable of mediating the ligation of the two DNA molecules. Particularly useful enzymes for this purpose include T4 ligase, E. coli ligase, or other similar enzymes. The action of these enzymes results in the sealing of the linear DNA to produce a larger DNA molecule containing the desired fragment (see, for example, U.S. Patent Nos. 4,237,224; 4,264,731; 4,273,875; 4,322,499 and 4,336,336, which are specifically incorporated herein by reference).
It is to be understood that the termini of the linearized plasmid and the termini of the DNA fragment being inserted must be complementary or blunt in order for the ligation reaction to be successful. Suitable complementarity can be achieved by choosing appropriate restriction endonucleases (i.e., if the fragment is produced by the same restriction endonuclease or one that generates the same overhang as that used to linearize the plasmid, then the termini of both molecules will be complementary). As discussed previously, in one embodiment of the invention, at least two classes of the vectors used in the present invention are adapted to receive the foreign oligonucleotide fragments in only one orientation. After joining the DNA segment to the vector, the resulting hybrid DNA can then be selected from among the large population of clones or libraries.
A method useful for the molecular cloning of DNA sequences includes in vitro joining of DNA segments, fragmented from a source of high molecular weight genomic DNA, to vector DNA molecules capable of independent replication. The cloning vector may include plasmid DNA (see Cohen et al., 1973), phage DNA (see Thomas et al., 1974), SV40 DNA (see Nussbaum et al., 1976), yeast DNA, E. coli DNA and, most significantly, plant DNA.
A variety of processes are known which may be utilized to effect transformation, i.e., the inserting of heterologous DNA sequences into a host cell, whereby the host becomes capable of efficient expression of the inserted sequences.
(i) Regulatory Elements
In one embodiment of the invention, constructs may include a plant promoter, for example, the CaMV 35S promoter (Odell et al., 1985), or others such as CaMV 19S (Lawton et al., 1987), nos (Ebert et al., 1987), Adh (Walker et al., 1987), sucrose synthase (Yang & Russell, 1990), α-tubulin, actin (Wang et al., 1992), cab (Sullivan et al., 1989), PEPCase (Hudspeth & Grula, 1989) or those associated with the R gene complex (Chandler et al., 1989). Tissue specific promoters such as root cell promoters (Conkling et al., 1990) and tissue specific enhancers (Fromm et al., 1989) are also contemplated to be useful, as are inducible promoters such as ABA- and turgor-inducible promoters. In particular embodiments of the invention, a Lat52 promoter may be used (Twell et al., 1991). A particularly useful tissue specific promoter is the SCARECROW (Scr) root-specific promoter (DiLaurenzio et al., 1996).
As the DNA sequence between the transcription initiation site and the start of the coding sequence, i.e., the untranslated leader sequence, can influence gene expression, one may also wish to employ a particular leader sequence.
It is envisioned that a functional gene could be introduced under the control of novel promoters or enhancers, etc., or perhaps even homologous or tissue specific (for example, root-, collar/sheath-, whorl-, stalk-, earshank-, kernel- or leaf-specific) promoters or control elements. In particular embodiments of the invention, the functional gene may be in an antisense orientation relative to the promoter.
(ii) Terminators
It may also be desirable to link a functional gene to a 3' end DNA sequence that acts as a signal to terminate transcription and allow for the poly-adenylation of the mRNA produced by coding sequences. Such a terminator may be the native terminator of the functional gene or, alternatively, may be a heterologous 3' end. Examples of terminators that could be used with the invention are those from the nopaline synthase gene of Agrobacterium tumefaciens (nos 3' end) (Bevan et al., 1983), the terminator for the T7 transcript from the octopine synthase gene of Agrobacterium tumefaciens, and the 3' end of the protease inhibitor I or II genes from potato or tomato.
(iii) Marker Genes
It may be desirable to use one or more marker genes in accordance with the invention. Such markers may be adapted for use in prokaryotic, lower eukaryotic or higher eukaryotic systems, or may be capable of use in any combination of the foregoing classes of organisms. By employing a selectable or screenable marker protein, one can provide or enhance the ability to identify transformants. "Marker genes" are genes that impart a distinct phenotype to cells expressing the marker protein and thus allow such transformed cells to be distinguished from cells that do not have the marker. Such genes may encode either a selectable or screenable marker, depending on whether the marker confers a trait which one can "select" for by chemical means, i.e., through the use of a selective agent (e.g., a herbicide, antibiotic, or the like), or whether it is simply a trait that one can identify through observation or testing, i.e., by "screening" (e.g., the green fluorescent protein). Of course, many examples of suitable marker proteins are known to the art and can be employed in the practice of the invention.
Included within the terms selectable or screenable markers also are genes which encode a "secretable marker" whose secretion can be detected as a means of identifying or selecting for transformed cells. Examples include markers which are secretable antigens that can be identified by antibody interaction, or even secretable enzymes which can be detected by their catalytic activity. Secretable proteins fall into a number of classes, including small, diffusible proteins detectable, e.g., by ELISA; small active enzymes detectable in extracellular solution (e.g., α-amylase, β-lactamase, phosphinothricin acetyltransferase); and proteins that are inserted or trapped in the cell wall (e.g., proteins that include a leader sequence such as that found in the expression unit of extensin or tobacco PR-S).
With regard to selectable secretable markers, the use of a gene that encodes a protein that becomes sequestered in the cell wall, and which protein includes a unique epitope, is considered to be particularly advantageous. Such a secreted antigen marker would ideally employ an epitope sequence that would provide low background in plant tissue, a promoter-leader sequence that would impart efficient expression and targeting across the plasma membrane, and would produce protein that is bound in the cell wall and yet accessible to antibodies. A normally secreted wall protein modified to include a unique epitope would satisfy all such requirements.
1. Selectable Markers
Many selectable marker genes may be used in accordance with the invention including, but not limited to, neo (Potrykus et al., 1985), which provides kanamycin resistance and can be selected for using kanamycin, G418, paromomycin, etc.; bar, which confers bialaphos or phosphinothricin resistance; a mutant EPSP synthase protein (Hinchee et al., 1988) conferring glyphosate resistance; a nitrilase such as bxn from Klebsiella ozaenae which confers resistance to bromoxynil (Stalker et al., 1988); a mutant acetolactate synthase (ALS) which confers resistance to imidazolinone, sulfonylurea or other ALS inhibiting chemicals (European Patent Application 154,204, 1985); a methotrexate resistant DHFR (Thillet et al., 1988); a dalapon dehalogenase that confers resistance to the herbicide dalapon; or a mutated anthranilate synthase that confers resistance to 5-methyl tryptophan. Where a mutant EPSP synthase is employed, additional benefit may be realized through the incorporation of a suitable chloroplast transit peptide, CTP (U.S. Patent No. 5,188,642) or OTP (U.S. Patent No. 5,633,448) and use of a modified maize EPSPS (PCT Application WO 97/04103).
An illustrative embodiment of selectable markers capable of being used in systems to select transformants are those that encode the enzyme phosphinothricin acetyltransferase, such as the bar gene from Streptomyces hygroscopicus or the pat gene from Streptomyces viridochromogenes. The enzyme phosphinothricin acetyl transferase (PAT) inactivates the active ingredient in the herbicide bialaphos, phosphinothricin (PPT). PPT inhibits glutamine synthetase (Murakami et al., 1986; Twell et al., 1989), causing rapid accumulation of ammonia and cell death. The use of bar as a selectable marker gene and for the production of herbicide-resistant rice plants from protoplasts was described by Rathore et al. (1993).
A number of S. cerevisiae marker genes are also known and could be used with the invention, such as, for example, the HIS4 gene (Donahue et al., 1982; GenBank number J01331). An example of an E. coli marker gene which has been cloned and sequenced and could be used in accordance with the invention is the Ap gene, which confers resistance to beta-lactam antibiotics such as ampicillin (nucleotides 4618 to 5478 of GenBank accession number U66885).
2. Screenable Markers
Screenable markers that may be employed include a β-glucuronidase (GUS) or uidA gene which encodes an enzyme for which various chromogenic substrates are known; an R-locus gene, which encodes a product that regulates the production of anthocyanin pigments (red color) in plant tissues (Dellaporta et al., 1988); a β-lactamase gene (Sutcliffe, 1978), which encodes an enzyme for which various chromogenic substrates are known (e.g., PADAC, a chromogenic cephalosporin); a xylE gene (Zukowsky et al., 1983) which encodes a catechol dioxygenase that can convert chromogenic catechols; an α-amylase gene (Ikuta et al., 1990); a tyrosinase gene (Katz et al., 1983) which encodes an enzyme capable of oxidizing tyrosine to DOPA and dopaquinone, which in turn condenses to form the easily-detectable compound melanin; a β-galactosidase gene, which encodes an enzyme for which there are chromogenic substrates; a luciferase (lux) gene (Ow et al., 1986), which allows for bioluminescence detection; an aequorin gene (Prasher et al., 1985) which may be employed in calcium sensitive bioluminescence detection; or a gene encoding green fluorescent protein (Sheen et al., 1995; Haseloff et al., 1997; Reichel et al., 1996; Tian et al., 1997; WO 97/41228).
Genes from the maize R gene complex can also be used as screenable markers. The R gene complex in maize encodes a protein that acts to regulate the production of anthocyanin pigments in most seed and plant tissue. Maize strains can have one, or as many as four, R alleles which combine to regulate pigmentation in a developmental and tissue specific manner. Thus, an R gene introduced into such cells will cause the expression of a red pigment and, if stably incorporated, can be visually scored as a red sector. If a maize line carries dominant alleles for genes encoding the enzymatic intermediates in the anthocyanin biosynthetic pathway (C2, A1, A2, Bz1 and Bz2), but carries a recessive allele at the R locus, transformation of any cell from that line with R will result in red pigment formation. Exemplary lines include Wisconsin 22, which contains the rg-Stadler allele, and TR112, a K55 derivative which is r-g, b, Pl. Alternatively, any genotype of maize can be utilized if the C1 and R alleles are introduced together.
Another screenable marker contemplated for use in the present invention is firefly luciferase, encoded by the lux gene. The presence of the lux gene in transformed cells may be detected using, for example, X-ray film, scintillation counting, fluorescent spectrophotometry, low-light video cameras, photon counting cameras or multiwell luminometry. It also is envisioned that this system may be developed for populational screening for bioluminescence, such as on tissue culture plates, or even for whole plant screening. The gene which encodes green fluorescent protein (GFP) is contemplated as a particularly useful reporter gene (Sheen et al., 1995; Haseloff et al., 1997; Reichel et al., 1996; Tian et al., 1997; WO 97/41228). Expression of green fluorescent protein may be visualized in a cell or plant as fluorescence following illumination by particular wavelengths of light.
3. Negative Selectable Markers
Introduction of genes encoding traits that can be selected against may be useful for eliminating minichromosomes from a cell or for selecting against cells which comprise a particular minichromosome. An example of a negative selectable marker which has been investigated is the enzyme cytosine deaminase (Stouggard, 1993). In the presence of this enzyme the compound 5-fluorocytosine is converted to 5-fluorouracil, which is toxic to plant and animal cells. Therefore, cells comprising a minichromosome with this gene could be directly selected against. Other genes that encode proteins that render the plant sensitive to a certain compound will also be useful in this context. For example, T-DNA gene 2 from Agrobacterium tumefaciens encodes a protein that catalyzes the conversion of α-naphthalene acetamide (NAM) to α-naphthalene acetic acid (NAA), which renders plant cells sensitive to high concentrations of NAM (Depicker et al., 1988).

V. Isolation of Centromeres From Plants
The inventors have provided, for the first time, the nucleic acid sequence of a plant centromere. This will allow one of skill in the art to obtain centromere sequences from potentially any species. The inventors specifically provide herein below a number of methods which may be employed to isolate such centromeres.
(i) Utilization of Conserved Sequences
Numerous of the centromere sequences identified by the inventors were also shown by the inventors to be highly conserved (see, e.g., Example 5B, Table 3, and Table 4). The novel finding of the inventors that a number of genes reside within the Arabidopsis centromere can therefore be used to find syntenic genes in other organisms (i.e., evolutionarily conserved relationships in gene order from species to species). For example, the sequence of each Arabidopsis gene can be used to search through sequence databases from other plants. An exemplary list of such sequences that could be used is a sequence given by SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21. Also useful would be the genes listed in Tables 3 and 4. Finding identical or similar genes would identify candidates that may reside within or near centromeric regions. Mapping these genes using linked markers would identify potential centromeric regions.
Where hybridization is used to obtain centromere sequences, it may be desirable to use less stringent hybridization conditions to allow formation of a heteroduplex. In these circumstances, one may desire to employ conditions such as about 0.15 M to about 0.9 M salt, at temperatures ranging from about 20°C to about 55°C. Cross-hybridizing species can thereby be readily identified as positively hybridizing signals with respect to control hybridizations. In any case, it is generally appreciated that conditions can be rendered more stringent by the addition of increasing amounts of formamide, which serves to destabilize the hybrid duplex in the same manner as increased temperature or decreased salt. Thus, hybridization conditions can be readily manipulated, and thus will generally be a method of choice depending on the desired results.
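The qualitative relationship described above (more salt stabilizes the duplex; formamide destabilizes it) can be illustrated with a commonly quoted empirical approximation for the melting temperature of long DNA duplexes. The following Python sketch is illustrative only: the formula, its coefficients, the function name estimate_tm and the example probe values are assumptions introduced here and are not taken from the present disclosure.

```python
import math


def estimate_tm(na_molar, gc_percent, length_bp, formamide_percent=0.0):
    """Rough melting temperature (deg C) of a long DNA:DNA duplex.

    Assumed empirical approximation (not from the source text):
        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC)
             - 0.61*(%formamide) - 500/length
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)   # higher salt raises Tm
        + 0.41 * gc_percent             # GC-rich duplexes are more stable
        - 0.61 * formamide_percent      # formamide lowers Tm
        - 500.0 / length_bp             # short duplexes are less stable
    )


if __name__ == "__main__":
    # Hypothetical 180 bp probe with 40% GC, at the two salt extremes
    # mentioned in the text, with and without 20% formamide.
    for salt in (0.15, 0.9):
        for formamide in (0.0, 20.0):
            tm = estimate_tm(salt, 40.0, 180, formamide)
            print(f"salt={salt} M formamide={formamide}% Tm~{tm:.1f} C")
```

Under this approximation, lowering salt or adding formamide lowers the estimated melting temperature, so a fixed hybridization temperature becomes more stringent, consistent with the behavior stated above.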
(ii) Identification of Centromere-Associated Characteristics
The second method takes advantage of the unique DNA properties that the inventors have discovered at the Arabidopsis centromere and adjacent pericentromere regions. The centromeres are composed of long arrays of 180 bp repeats flanked by regions that are 10-70% retroelements, up to 15% pseudogenes and up to 29% transposons (see FIGS. 12A-T). This is unique to the centromere since retroelements, transposons and pseudogenes are very rare outside the centromere and pericentromere region. Furthermore, gene density decreases from an average of a gene every 4.5 kb on the chromosomal arm down to one in 150 kb at the centromere. This unique centromere composition could be exploited in a number of ways to find centromere regions in other species, for example:
1) Markers specific for retroelements, transposons, repeat DNA elements and pseudogenes can be devised to genetically map regions which are dense with similar elements.
2) The second method involves in situ hybridization, and preferably, fluorescent in situ hybridization (FISH). Fluorescently labeled DNA probes consisting of retroelements, transposons and/or repetitive DNA native to a particular species can be combined with microscopy to identify parts of a chromosome with a similar percentage of DNA elements as that found at the Arabidopsis centromere.
3) Utilizing sequence databases, regions of genomes that have increased numbers of repetitive DNA, pseudogenes, retroelements and transposons, similar to the composition of the Arabidopsis centromere identified by the inventors, can be used to identify regions of an organism's chromosome that are centromeric.
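As an illustration of the database-driven approach in item 3, the following Python sketch scans an annotated chromosome in fixed windows and flags windows that combine low gene density with a high fraction of repeat-class elements. It is a minimal sketch under stated assumptions: the Feature record, the feature labels, the window size and both thresholds are hypothetical choices made for this example (loosely motivated by the figures of one gene per 150 kb and at least 10% retroelements given above), and repeat-class features are assumed not to overlap one another.

```python
from dataclasses import dataclass

# Hypothetical annotation labels treated as repeat-class elements.
REPEAT_KINDS = {"retroelement", "transposon", "pseudogene", "repeat"}


@dataclass(frozen=True)
class Feature:
    start: int  # 0-based, inclusive
    end: int    # exclusive
    kind: str   # "gene" or one of REPEAT_KINDS


def window_stats(features, chrom_length, window=150_000):
    """Return (start, end, gene_count, repeat_fraction) for each window."""
    stats = []
    for w_start in range(0, chrom_length, window):
        w_end = min(w_start + window, chrom_length)
        genes = 0
        covered = 0
        for f in features:
            overlap = min(f.end, w_end) - max(f.start, w_start)
            if overlap <= 0:
                continue
            if f.kind == "gene":
                # Count each gene once, in the window holding its start.
                if w_start <= f.start < w_end:
                    genes += 1
            elif f.kind in REPEAT_KINDS:
                covered += overlap
        stats.append((w_start, w_end, genes, covered / (w_end - w_start)))
    return stats


def candidate_windows(stats, max_genes=1, min_repeat_fraction=0.10):
    """Windows with few genes and many repeat-class bases (assumed cutoffs)."""
    return [
        (start, end)
        for start, end, genes, repeat_fraction in stats
        if genes <= max_genes and repeat_fraction >= min_repeat_fraction
    ]


if __name__ == "__main__":
    # Toy chromosome: a gene-rich arm followed by a repeat-rich region.
    toy = [Feature(i, i + 2_000, "gene") for i in range(0, 150_000, 5_000)]
    toy.append(Feature(160_000, 220_000, "retroelement"))
    toy.append(Feature(200_000, 202_000, "gene"))
    print(candidate_windows(window_stats(toy, 300_000)))
```

Windows flagged in this way would only be candidates; they would still need to be confirmed by mapping or hybridization as described in the surrounding text.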

(iii) Utilization of Centromere-Associated Proteins
The third method involves immunoprecipitating known centromere proteins or kinetochore proteins and analyzing bound DNA. Antibodies specific to centromere proteins can be incubated with proteins extracted from cells. Extracts can be native or previously treated to cross-link DNA to proteins. The antibodies and bound proteins can be purified away from the protein extracts and the DNA isolated. The DNA can then be used as a probe for FISH (as discussed above) or to probe libraries to find neighboring centromere sequences.
1. Centromere-Associated Protein Specific Antibodies
By identifying, for the first time, centromere-associated genes, the inventors have enabled the production of antibodies to the proteins encoded by such centromere-associated genes. The antibodies may be either monoclonal or polyclonal which bind to centromere-associated proteins of the current invention. The centromere-associated protein targets of the antibodies include proteins which bind to the centromere region.
Further, it is specifically contemplated that these centromere-associated protein specific antibodies would allow for the further isolation and characterization of the centromere-associated proteins. For example, proteins may be isolated which are encoded by the centromeres. Recombinant production of such proteins provides a source of antigen for production of antibodies.
Alternatively, the centromere may be used as a ligand to isolate, using affinity methods, centromere binding proteins. Once isolated, these proteins can be used as antigens for the production of polyclonal and monoclonal antibodies. A variation on this technique has been demonstrated by Rattner (1991), by cloning of centromere-associated proteins through the use of antibodies which bind in the vicinity of the centromere.
Means for preparing and characterizing antibodies are well known in the art (see, e.g., Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988; incorporated herein by reference). The methods for generating monoclonal antibodies (mAbs) generally begin along the same lines as those for preparing polyclonal antibodies.
Briefly, a polyclonal antibody is prepared by immunizing an animal with an immunogenic composition in accordance with the present invention and collecting antisera from that immunized animal. A wide range of animal species can be used for the production of antisera. Typically the animal used for production of antisera is a rabbit, a mouse, a rat, a hamster, a guinea pig or a goat. A rabbit is a preferred choice for production of polyclonal antibodies because of the ease of handling, maintenance and relatively large blood volume.
As is well known in the art, a given composition may vary in its immunogenicity. It is often necessary therefore to boost the host immune system, as may be achieved by coupling a peptide or polypeptide immunogen to a carrier. Exemplary and preferred carriers are keyhole limpet hemocyanin (KLH) and bovine serum albumin (BSA). Other albumins such as ovalbumin, mouse serum albumin or rabbit serum albumin also can be used as carriers. Means for conjugating a polypeptide to a carrier protein are well known in the art and include glutaraldehyde, m-maleimidobenzoyl-N-hydroxysuccinimide ester, carbodiimide and bis-diazotized benzidine.
As is also well known in the art, the immunogenicity of a particular immunogen composition can be enhanced by the use of non-specific stimulators of the immune response, known as adjuvants. Exemplary and preferred adjuvants include complete Freund's adjuvant (a non-specific stimulator of the immune response containing killed Mycobacterium tuberculosis), incomplete Freund's adjuvants and aluminum hydroxide adjuvant.
The amount of immunogen composition used in the production of polyclonal antibodies varies upon the nature of the immunogen as well as the animal used for immunization. A variety of routes can be used to administer the immunogen (subcutaneous, intramuscular, intradermal, intravenous and intraperitoneal). The production of polyclonal antibodies may be monitored by sampling blood of the immunized animal at various points following immunization. A second, booster, injection also may be given. The process of boosting and titering is repeated until a suitable titer is achieved. When a desired level of immunogenicity is obtained, the immunized animal can be bled and the serum isolated and stored, and/or the animal can be used to generate mAbs.
Monoclonal antibodies may be readily prepared through use of well-known techniques, such as those exemplified in U.S. Patent 4,196,265, incorporated herein by reference. Typically, this technique involves immunizing a suitable animal with a selected immunogen composition, e.g., a purified or partially purified minichromosome-associated protein, polypeptide or peptide. The immunizing composition is administered in a manner effective to stimulate antibody producing cells. Rodents such as mice and rats are preferred animals; however, the use of rabbit, sheep, or frog cells also is possible. The use of rats may provide certain advantages (Goding, 1986), but mice are preferred, with the BALB/c mouse being most preferred as this is most routinely used and generally gives a higher percentage of stable fusions.
Following immunization, somatic cells with the potential for producing antibodies, specifically B lymphocytes (B cells), are selected for use in the mAb generating protocol. These cells may be obtained from biopsied spleens, tonsils or lymph nodes, or from a peripheral blood sample. Spleen cells and peripheral blood cells are preferred, the former because they are a rich source of antibody-producing cells that are in the dividing plasmablast stage, and the latter because peripheral blood is easily accessible. Often, a panel of animals will have been immunized and the spleen of the animal with the highest antibody titer will be removed and the spleen lymphocytes obtained by homogenizing the spleen with a syringe. Typically, a spleen from an immunized mouse contains approximately 5 x 10^7 to 2 x 10^8 lymphocytes.
The antibody-producing B lymphocytes from the immunized animal are then fused with cells of an immortal myeloma cell, generally one of the same species as the animal that was immunized. Myeloma cell lines suited for use in hybridoma-producing fusion procedures preferably are non-antibody-producing, have high fusion efficiency, and have enzyme deficiencies that render them incapable of growing in certain selective media which support the growth of only the desired fused cells (hybridomas).
Any one of a number of myeloma cells may be used, as are known to those of skill in the art (Goding, 1986; Campbell, 1984). For example, where the immunized animal is a mouse, one may use P3-X63/Ag8, X63-Ag8.653, NS1/1.Ag 4 1, Sp210-Ag14, FO, NSO/U, MPC-11, MPC11-X45-GTG 1.7 and S194/5XX0 Bul; for rats, one may use R210.RCY3, Y3-Ag 1.2.3, IR983F and 4B210; and U-266, GM1500-GRG2, LICR-LON-HMy2 and UC729-6 are all useful in connection with human cell fusions.
One preferred murine myeloma cell is the NS-1 myeloma cell line (also termed P3-NS-1-Ag4-1), which is readily available from the NIGMS Human Genetic Mutant Cell Repository by requesting cell line repository number GM3573. Another mouse myeloma cell line that may be used is the 8-azaguanine-resistant mouse murine myeloma SP2/0 non-producer cell line.
Methods for generating hybrids of antibody-producing spleen or lymph node cells and myeloma cells usually comprise mixing somatic cells with myeloma cells in a 2:1 ratio, though the ratio may vary from about 20:1 to about 1:1, respectively, in the presence of an agent or agents (chemical or electrical) that promote the fusion of cell membranes. Fusion methods using Sendai virus have been described (Kohler et al., 1975; 1976), as have those using polyethylene glycol (PEG), such as 37% (v/v) PEG (Gefter et al., 1977). The use of electrically induced fusion methods also is appropriate (Goding, 1986).
Fusion procedures usually produce viable hybrids at low frequencies, about 1 x 10^-6 to 1 x 10^-8. However, this does not pose a problem, as the viable, fused hybrids are differentiated from the parental, unfused cells (particularly the unfused myeloma cells that would normally continue to divide indefinitely) by culturing in a selective medium.
The selective medium is generally one that contains an agent that blocks the de novo synthesis of nucleotides in the tissue culture media. Exemplary and preferred agents are aminopterin, methotrexate, and azaserine. Aminopterin and methotrexate block de novo synthesis of both purines and pyrimidines, whereas azaserine blocks only purine synthesis. Where aminopterin or methotrexate is used, the media is supplemented with hypoxanthine and thymidine as a source of nucleotides (HAT medium). Where azaserine is used, the media is supplemented with hypoxanthine.
The preferred selection medium is HAT. Only cells capable of operating nucleotide salvage pathways are able to survive in HAT medium. The myeloma cells are defective in key enzymes of the salvage pathway, e.g., hypoxanthine phosphoribosyl transferase (HPRT), and they cannot survive. The B-cells can operate this pathway, but they have a limited life span in culture and generally die within about two weeks.
Therefore, the only cells that can survive in the selective media are those hybrids formed from myeloma and B-cells.
This culturing provides a population of hybridomas from which specific hybridomas are selected. Typically, selection of hybridomas is performed by culturing the cells by single-clone dilution in microtiter plates, followed by testing the individual clonal supernatants (after about two to three weeks) for the desired reactivity. The assay should be sensitive, simple and rapid, such as radioimmunoassays, enzyme immunoassays, cytotoxicity assays, plaque assays, dot immunobinding assays, and the like.
The selected hybridomas would then be serially diluted and cloned into individual antibody-producing cell lines, which clones can then be propagated indefinitely to provide mAbs. The cell lines may be exploited for mAb production in two basic ways. A sample of the hybridoma can be injected (often into the peritoneal cavity) into a histocompatible animal of the type that was used to provide the somatic and myeloma cells for the original fusion. The injected animal develops tumors secreting the specific monoclonal antibody produced by the fused cell hybrid. The body fluids of the animal, such as serum or ascites fluid, can then be tapped to provide mAbs in high concentration.
The individual cell lines also could be cultured in vitro, where the mAbs are naturally secreted into the culture medium from which they can be readily obtained in high concentrations. mAbs produced by either means may be further purified, if desired, using filtration, centrifugation and various chromatographic methods such as HPLC or affinity chromatography.
2. ELISAs and Immunoprecipitation
ELISAs may be used in conjunction with the invention, for example, in identifying expression of a centromere-associated protein in a candidate centromere sequence. Such an assay could thereby facilitate the isolation of centromeres from species other than Arabidopsis. By identifying conserved, centromere-associated coding sequences, the inventors have provided the essential tools for such a screen.
In an ELISA assay, proteins or peptides comprising minichromosome-encoded protein antigen sequences are immobilized onto a selected surface, preferably a surface exhibiting a protein affinity such as the wells of a polystyrene microtiter plate. After washing to remove incompletely adsorbed material, it is desirable to bind or coat the assay plate wells with a nonspecific protein that is known to be antigenically neutral with regard to the test antisera such as bovine serum albumin (BSA), casein or solutions of milk powder. This allows for blocking of nonspecific adsorption sites on the immobilizing surface and thus reduces the background caused by nonspecific binding of antisera onto the surface.

After binding of antigenic material to the well, coating with a non-reactive material to reduce background, and washing to remove unbound material, the immobilizing surface is contacted with the antisera or clinical or biological extract to be tested in a manner conducive to immune complex (antigen/antibody) formation. Such conditions preferably include diluting the antisera with diluents such as BSA, bovine gamma globulin (BGG) and phosphate buffered saline (PBS)/Tween®. These added agents also tend to assist in the reduction of nonspecific background. The layered antisera is then allowed to incubate for from about 2 to about 4 hours, at temperatures preferably on the order of about 25° to about 27°C. Following incubation, the antisera-contacted surface is washed so as to remove non-immunocomplexed material. A preferred washing procedure includes washing with a solution such as PBS/Tween®, or borate buffer.
Following formation of specific immunocomplexes between the test sample and the bound antigen, and subsequent washing, the occurrence and even amount of immunocomplex formation may be determined by subjecting same to a second antibody having specificity for the first. To provide a detecting means, the second antibody will preferably have an associated enzyme that will generate color or light development upon incubating with an appropriate chromogenic substrate. Thus, for example, one will desire to contact and incubate the antisera-bound surface with a urease or peroxidase-conjugated anti-human IgG for a period of time and under conditions which favor the development of immunocomplex formation (e.g., incubation for 2 hours at room temperature in a PBS-containing solution).
After incubation with the second enzyme-tagged antibody, and subsequent to washing to remove unbound material, the amount of label is quantified by incubation with a chromogenic substrate such as urea and bromocresol purple or 2,2'-azino-di-(3-ethyl-benzthiazoline)-6-sulfonic acid (ABTS) and H2O2, in the case of peroxidase as the enzyme label. Quantitation is then achieved by measuring the degree of color generation, e.g., using a visible spectra spectrophotometer.
3. Western Blots
Centromere-associated antibodies may find use in immunoblot or western blot analysis, for example, for the identification of proteins immobilized onto a solid support matrix, such as nitrocellulose, nylon or combinations thereof. In conjunction with immunoprecipitation, followed by gel electrophoresis, these may be used as a single step reagent for use in detecting antigens against which secondary reagents used in the detection of the antigen cause an adverse background. This is especially useful when the antigens studied are immunoglobulins (precluding the use of immunoglobulins binding bacterial cell wall components), the antigens studied cross-react with the detecting agent, or they migrate at the same relative molecular weight as a cross-reacting signal. Immunologically-based detection methods for use in conjunction with Western blotting, including enzymatically-, radiolabel-, or fluorescently-tagged secondary antibodies against the protein moiety, are considered to be of particular use in this regard.
(iv) Genetic Mapping Based Approaches
The genetic mapping techniques outlined here for the identification of centromeres in Arabidopsis may find use in other species. In one aspect, this may comprise actual use of the mapping data provided herein, based on synteny between Arabidopsis chromosomes and those of other species. Further, new mapping data may be obtained using the techniques described herein. For example, in any plant that makes tetrads, the detailed methodology described herein for tetrad analysis could be used for the isolation of centromeres. Briefly, tetrad analysis measures the recombination frequency between genetic markers and a centromere by analyzing all four products of individual meiosis. A particular advantage arises from the quartet (qrt1) mutation in Arabidopsis, which causes the four products of pollen mother cell meiosis in Arabidopsis to remain attached.
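The arithmetic behind tetrad analysis can be sketched as follows. This Python example classifies each tetrad for a pair of markers as parental ditype (PD), nonparental ditype (NPD) or tetratype (T), and then estimates a marker-to-centromere distance from the tetratype frequency. It is a minimal sketch under stated assumptions: the function names and data layout are hypothetical, the reference marker is assumed to be tightly linked to its own centromere, and the simple estimate of half the tetratype frequency is the textbook approximation for short distances rather than the specific procedure of the present disclosure.

```python
def classify_tetrad(spores, parental):
    """Classify one tetrad for two markers.

    spores   -- four (allele_at_marker_1, allele_at_marker_2) tuples
    parental -- set holding the two parental allele combinations
    Returns "PD", "NPD" or "T".
    """
    if len(spores) != 4:
        raise ValueError("a tetrad must contain exactly four products")
    recombinant = sum(1 for s in spores if s not in parental)
    if recombinant == 0:
        return "PD"    # all four products are parental
    if recombinant == 4:
        return "NPD"   # all four products are recombinant
    if recombinant == 2:
        return "T"     # two parental and two recombinant products
    raise ValueError("aberrant tetrad (e.g., gene conversion)")


def centromere_distance_cm(pd, npd, tt):
    """Approximate marker-to-centromere distance in centimorgans.

    Assumes the second marker of the pair is tightly centromere-linked,
    so a tetratype reflects one crossover between the test marker and
    its centromere; only half the products of such a meiosis are
    recombinant, hence the factor of 1/2.
    """
    total = pd + npd + tt
    if total == 0:
        raise ValueError("no tetrads scored")
    return 100.0 * 0.5 * tt / total


if __name__ == "__main__":
    parental = {("A", "B"), ("a", "b")}
    tetrad = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")]
    print(classify_tetrad(tetrad, parental))
    print(centromere_distance_cm(pd=40, npd=40, tt=20))
```

Markers whose tetratype frequency approaches zero relative to a centromere-linked reference would be the ones lying closest to a centromere.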

Several naturally occurring plant species in addition to Arabidopsis are known to release pollen clusters, including water lilies, cattails, heath (Ericaceae and Epacridaceae), evening primrose (Onagraceae), sundews (Droseraceae), orchids (Orchidaceae), and acacias (Mimosaceae) (Preuss 1994, Smyth 1994). However, none of these species has been developed into an experimental system, limiting their use for genetic analysis. However, it is contemplated by the inventors that the cloning and introduction of the quartet mutation, or an antisense copy of a non-mutated Quartet gene, could allow the use of tetrad analysis in potentially any species.
Southern genomic DNA blots in combination with RFLP analysis may be used to map centromeres with a high degree of resolution. The stored seedling tissue provides the necessary amount of DNA for analysis of the restriction fragments.
Southern blots are hybridized to probes labeled by radioactive or non-radioactive methods.
It may, in many cases, be desired to identify new polymorphic DNA markers which are closely linked to the target region. In some cases this can be readily done. For example, in many plant genomes, a polymorphic Sau3A site can be found for about every 8 to 20 kb surveyed. Subtractive methods are available for identifying such polymorphisms (Rosenberg et al., 1994), and these subtractions may be performed using DNA from selected, centromeric YAC or BAC clones. Screens for RFLP markers potentially linked to centromeres also can be performed using DNA fragments from a centromere-linked YAC clone to probe blots of genomic DNA from a target organism that has been digested with a panel of restriction enzymes.
To be certain that an entire centromeric region has been cloned, clones, or a series of clones, are identified that hybridize to markers on either side of each centromere. These efforts can be complicated by the presence of repetitive DNA in the centromere, as well as by the potential instability of centromere clones. Thus, identification of large clones with unique sequences that will serve as useful probes simplifies a chromosome walking strategy.
Blot hybridization allows comparison of the structure of the clones with that of genomic DNA, and thus determines whether the clones have suffered deletions or rearrangements. The centromeric clones identified are useful for hybridization experiments that can be used to determine whether they share common sequences, whether they localize in situ to the cytologically defined centromeric region, and whether they contain repetitive sequences thought to map near Arabidopsis centromeres (Richards et al., 1991; Maluszynska et al., 1991).
Exemplary methods for conducting PFGE and YAC genome analysis have been described (Ecker, 1990). A large insert YAC library for genome mapping in Arabidopsis thaliana was described in Creusot (1995). The analysis of clones carrying repeated DNA sequences in two YAC libraries of Arabidopsis thaliana DNA was discussed by Schmidt et al. (1994). The construction and characterization of a yeast artificial chromosome library of Arabidopsis was described by Grill and Somerville (1991).
A particularly useful type of clone is the bacterial artificial chromosome (BAC), as data has suggested that YAC clones may sometimes not span centromeres (Willard, 1997). The construction and characterization of a bacterial artificial chromosome library from, for example, Arabidopsis thaliana has been described (Choi et al., 1995). The complementation of plant mutants with large genomic DNA fragments can be achieved using transformation-competent minichromosome vectors, thereby speeding positional cloning (Liu et al., 1999). The construction and characterization of the IGF Arabidopsis BAC library was described by Mozo et al. (1998). A complete BAC-based physical map of the Arabidopsis thaliana genome has been described (Mozo et al., 1998).
VI. Site Specific Integration and Excision of Nucleic Acid Segments
It is specifically contemplated by the inventors that one could employ techniques for the site-specific integration or excision of nucleic acid segments for the construction of minichromosomes (see, e.g., Example 8B, below). Such techniques also could be used for the site-specific integration or excision of transgenes which are introduced into a plant, including minichromosome vectors.
Site-specific integration or excision of nucleic acid molecules can be achieved by means of homologous recombination (see, for example, U.S. Patent No. 5,527,695, specifically incorporated herein by reference in its entirety). Homologous recombination is a reaction between any pair of DNA sequences having a similar sequence of nucleotides, where the two sequences interact (recombine) to form a new recombinant DNA species. The frequency of homologous recombination increases as the length of the shared nucleotide DNA sequences increases, and is higher with linearized plasmid molecules than with circularized plasmid molecules. Homologous recombination can occur between two DNA sequences that are less than identical, but the recombination frequency declines as the divergence between the two sequences increases.
Introduced DNA sequences can be targeted via homologous recombination by linking a DNA molecule of interest to sequences sharing homology with endogenous sequences of the host cell. Once the DNA enters the cell, the two homologous sequences can interact to insert the introduced DNA at the site where the homologous genomic DNA sequences were located. Therefore, the choice of homologous sequences contained on the introduced DNA will determine the site where the introduced DNA is integrated via homologous recombination. For example, if the DNA sequence of interest is linked to DNA sequences sharing homology to a single copy gene of a host plant cell, the DNA sequence of interest will be inserted via homologous recombination at only that single specific site. However, if the DNA sequence of interest is linked to DNA sequences sharing homology to a multicopy gene of the host eukaryotic cell, then the DNA sequence of interest can be inserted via homologous recombination at each of the specific sites where a copy of the gene is located.
DNA can be inserted into a host chromosome or vector by a homologous recombination reaction involving either a single reciprocal recombination (resulting in the insertion of the entire length of the introduced DNA) or through a double reciprocal recombination (resulting in the insertion of only the DNA located between the two recombination events). For example, if one wishes to insert a foreign gene into the genomic site where a selected gene is located, the introduced DNA should contain sequences homologous to the selected gene. A single homologous recombination event would then result in the entire introduced DNA sequence being inserted into the selected gene. Alternatively, a double recombination event can be achieved by flanking each end of the DNA sequence of interest (the sequence intended to be inserted into the genome) with DNA sequences homologous to the selected gene. A homologous recombination event involving each of the homologous flanking regions will result in the insertion of the foreign DNA. Thus only those DNA sequences located between the two regions sharing genomic homology become integrated into the genome.
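The difference between the two outcomes can be illustrated with a toy model in which DNA molecules are lists of named segments. This is only a sketch under simplifying assumptions: homology is reduced to matching segment names, the single-crossover donor is treated as a circular plasmid, and all function and segment names are hypothetical.

```python
def single_crossover(genome, target, plasmid):
    """One reciprocal crossover inserts the entire circular plasmid.

    The plasmid is opened at its copy of `target` and spliced into the
    genomic copy, so the homology ends up duplicated on both sides of
    the inserted DNA.
    """
    i = genome.index(target)
    j = plasmid.index(target)
    rest_of_plasmid = plasmid[j + 1:] + plasmid[:j]
    return genome[:i] + [target] + rest_of_plasmid + [target] + genome[i + 1:]

def double_crossover(genome, left, right, donor):
    """Two crossovers, one in each flank, insert only the DNA between them.

    `left` and `right` name the two homologous flanking regions, which
    must occur in that order in both the genome and the donor.
    """
    g_left, g_right = genome.index(left), genome.index(right)
    d_left, d_right = donor.index(left), donor.index(right)
    return genome[:g_left + 1] + donor[d_left + 1:d_right] + genome[g_right:]
```

For example, `single_crossover(["up", "sel", "down"], "sel", ["sel", "foreign", "vector"])` carries the vector backbone into the genome along with the foreign gene, whereas `double_crossover(["up", "sel5", "sel3", "down"], "sel5", "sel3", ["vec", "sel5", "foreign", "sel3", "vec"])` leaves the backbone behind.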
Although introduced sequences can be targeted for insertion into a specific site via homologous recombination, in higher eukaryotes homologous recombination is a relatively rare event compared to random insertion events. In plant cells, foreign DNA molecules find homologous sequences in the cell's genome and recombine at a frequency of approximately 0.5-4.2 x 10^-4. Thus any transformed cell that contains an introduced DNA sequence integrated via homologous recombination will also likely contain numerous copies of randomly integrated introduced DNA sequences. Therefore, it may be desirable to use more precise mechanisms for site-specific recombination. A preferred manner for carrying out site-specific recombination comprises use of a site-specific recombinase system. In general, a site-specific recombinase system consists of three elements: two pairs of DNA sequences (first and second site-specific recombination sequences) and a specific enzyme (the site-specific recombinase). The site-specific recombinase will catalyze a recombination reaction only between two site-specific recombination sequences.
A number of different site-specific recombinase systems could be employed in accordance with the instant invention, including, but not limited to, the Cre/lox system of bacteriophage P1 (Hoess et al., 1982; U.S. Patent No. 5,658,772, specifically incorporated herein by reference in its entirety), the FLP/FRT system of yeast (Golic and Lindquist, 1989), the Gin recombinase of phage Mu (Maeser and Kahmann, 1991), the Pin recombinase of E. coli (Enomoto et al., 1983), the recombinase encoded by the sre gene (ORF469) and which is capable of mediating integration of the R4 phage genome (Matsuura et al., 1996), the site-specific recombinase encoded by pinD of Shigella dysenteriae (Tominaga, 1997), the site-specific recombinase encoded in the major 'pathogenicity island' of Salmonella typhi (Zhang et al., 1997), the Int-B13 site-specific recombinase of the bacteriophage P4 integrase family (Ravatn et al., 1998), as well as the R/RS system of the pSR1 plasmid (Araki et al., 1992). The bacteriophage P1 Cre/lox and the yeast FLP/FRT systems constitute two particularly useful systems for site-specific recombination. In these systems, a recombinase (Cre or FLP) will interact specifically with its respective site-specific recombination sequence (lox or FRT, respectively) to invert or excise the intervening sequences. The sequence for each of these two systems is relatively short (34 bp for lox and 47 bp for FRT) and therefore convenient for use with transformation vectors.
The FLP/FRT recombinase system has been demonstrated to function efficiently in plant cells, but could also be used in, for example, a bacterial cell or in vitro. The performance of the FLP/FRT system indicates that FRT site structure and the amount of the FLP protein present affect excision activity. In general, short incomplete FRT sites lead to higher accumulation of excision products than the complete full-length FRT sites. The systems can catalyze both intra- and intermolecular reactions, indicating their utility for DNA excision as well as integration reactions. The recombination reaction is reversible and this reversibility can compromise the efficiency of the reaction in each direction.
Altering the structure of the site-specific recombination sequences is one approach to remedying this situation. The site-specific recombination sequence can be mutated in a manner that the product of the recombination reaction is no longer recognized as a substrate for the reverse reaction, thereby stabilizing the integration or excision event.

In the Cre-lox system, discovered in bacteriophage P1, recombination between loxP sites occurs in the presence of the Cre recombinase (see, e.g., U.S. Patent No. 5,658,772, specifically incorporated herein by reference in its entirety). This system has been utilized to excise a gene located between two lox sites which had been introduced into a yeast genome (Sauer, 1987). Cre was expressed from an inducible yeast GAL1 promoter and this Cre gene was located on an autonomously replicating yeast vector.
Since the lox site is an asymmetrical nucleotide sequence, lox sites on the same DNA molecule can have the same or opposite orientation with respect to each other. Recombination between lox sites in the same orientation results in a deletion of the DNA segment located between the two lox sites and a connection between the resulting ends of the original DNA molecule. The deleted DNA segment forms a circular molecule of DNA. The original DNA molecule and the resulting circular molecule each contain a single lox site. Recombination between lox sites in opposite orientations on the same DNA molecule results in an inversion of the nucleotide sequence of the DNA segment located between the two lox sites. In addition, reciprocal exchange of DNA segments proximate to lox sites located on two different DNA molecules can occur. All of these recombination events are catalyzed by the product of the Cre coding region.
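The orientation rule described above lends itself to a small model. The following is an illustrative sketch only: it assumes a DNA molecule can be represented as a list of (name, strand) tuples with exactly two lox sites, and the function name and result layout are hypothetical. It covers the two intramolecular outcomes (excision and inversion) and not the intermolecular exchange.

```python
LOX = "lox"

def cre_recombine(dna):
    """Model Cre-mediated recombination between two lox sites on one molecule.

    `dna` is a list of (name, strand) tuples, with strand +1 or -1 and
    exactly two elements named "lox". Same orientation gives excision of
    the intervening segment as a circle; opposite orientation gives
    inversion of the intervening segment.
    """
    sites = [i for i, (name, _) in enumerate(dna) if name == LOX]
    if len(sites) != 2:
        raise ValueError("expected exactly two lox sites")
    i, j = sites
    between = dna[i + 1:j]
    if dna[i][1] == dna[j][1]:
        # Direct repeat: the original molecule keeps one lox site and
        # the excised circle carries the other.
        return {
            "event": "excision",
            "linear": dna[:i + 1] + dna[j + 1:],
            "circle": between + [dna[j]],
        }
    # Inverted repeat: reverse the order and flip the strand of each
    # intervening segment; both lox sites stay in place.
    inverted = [(name, -strand) for name, strand in reversed(between)]
    return {
        "event": "inversion",
        "linear": dna[:i + 1] + inverted + dna[j:],
        "circle": None,
    }
```

Because the excised circle and the shortened molecule each retain one lox site, the same model run in reverse would describe re-integration, which is the reversibility noted for these systems.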
VII. Transformed Host Cells and Transgenic Plants
Methods and compositions for transforming a bacterium, a yeast cell, a plant cell, or an entire plant with one or more minichromosomes are further aspects of this disclosure. A transgenic bacterium, yeast cell, plant cell or plant derived from such a transformation process or the progeny and seeds from such a transgenic plant also are further embodiments of the invention.
Means for transforming bacteria and yeast cells are well known in the art. Typically, means of transformation are similar to those well known means used to transform other bacteria or yeast such as E. coli or Saccharomyces cerevisiae. Methods for DNA transformation of plant cells include Agrobacterium-mediated plant transformation, protoplast transformation (as used herein "protoplast transformation" includes PEG-mediated transformation, electroporation and protoplast fusion transformation), gene transfer into pollen, injection into reproductive organs, injection into immature embryos and particle bombardment. Each of these methods has distinct advantages and disadvantages. Thus, one particular method of introducing genes into a particular plant strain may not necessarily be the most effective for another plant strain, but it is well known in the art which methods are useful for a particular plant strain.
There are many methods for introducing transforming DNA segments into cells, but not all are suitable for delivering DNA to plant cells. Suitable methods are believed to include virtually any method by which DNA can be introduced into a cell, such as by Agrobacterium infection, direct delivery of DNA such as, for example, by PEG-mediated transformation of protoplasts (Omirulleh et al., 1993), by desiccation/inhibition-mediated DNA uptake, by electroporation, by agitation with silicon carbide fibers, by acceleration of DNA coated particles, etc. In certain embodiments, acceleration methods are preferred and include, for example, microprojectile bombardment and the like.
Technology for introduction of DNA into cells is well-known to those of skill in the art. Four general methods for delivering a gene into cells have been described: (1) chemical methods (Graham et al., 1973; Zatloukal et al., 1992); (2) physical methods such as microinjection (Capecchi, 1980), electroporation (Wong et al., 1982; Fromm et al., 1985; U.S. Patent No. 5,384,253) and the gene gun (Johnston et al., 1994; Fynan et al., 1993); (3) viral vectors (Clapp 1993; Lu et al., 1993; Eglitis et al., 1988a; 1988b); and (4) receptor-mediated mechanisms (Curiel et al., 1991; 1992; Wagner et al., 1992).
(i) Electroporation
The application of brief, high-voltage electric pulses to a variety of animal and plant cells leads to the formation of nanometer-sized pores in the plasma membrane. DNA is taken directly into the cell cytoplasm either through these pores or as a consequence of the redistribution of membrane components that accompanies closure of the pores. Electroporation can be extremely efficient and can be used both for transient expression of cloned genes and for establishment of cell lines that carry integrated copies of the gene of interest. Electroporation, in contrast to calcium phosphate-mediated transfection and protoplast fusion, frequently gives rise to cell lines that carry one, or at most a few, integrated copies of the foreign DNA.
The introduction of DNA by means of electroporation is well-known to those of skill in the art. In this method, certain cell wall-degrading enzymes, such as pectin-degrading enzymes, are employed to render the target recipient cells more susceptible to transformation by electroporation than untreated cells. Alternatively, recipient cells are made more susceptible to transformation by mechanical wounding. To effect transformation by electroporation one may employ either friable tissues such as a suspension culture of cells, or embryogenic callus, or alternatively, one may transform immature embryos or other organized tissues directly. One would partially degrade the cell walls of the chosen cells by exposing them to pectin-degrading enzymes (pectolyases) or mechanically wounding in a controlled manner. Such cells would then be recipient to DNA transfer by electroporation, which may be carried out at this stage, and transformed cells then identified by a suitable selection or screening protocol dependent on the nature of the newly incorporated DNA.
(ii) Microprojectile Bombardment
A further advantageous method for delivering transforming DNA segments to plant cells is microprojectile bombardment. In this method, particles may be coated with nucleic acids and delivered into cells by a propelling force. Exemplary particles include those comprised of tungsten, gold, platinum, and the like.

An advantage of microprojectile bombardment, in addition to it being an effective means of reproducibly stably transforming monocots, is that neither the isolation of protoplasts (Christou et al., 1988) nor the susceptibility to Agrobacterium infection is required. An illustrative embodiment of a method for delivering DNA into maize cells by acceleration is a Biolistics Particle Delivery System, which can be used to propel particles coated with DNA or cells through a screen, such as a stainless steel or Nytex screen, onto a filter surface covered with plant cells cultured in suspension. The screen disperses the particles so that they are not delivered to the recipient cells in large aggregates. It is believed that a screen intervening between the projectile apparatus and the cells to be bombarded reduces the size of projectile aggregates and may contribute to a higher frequency of transformation by reducing damage inflicted on the recipient cells by projectiles that are too large.
For the bombardment, cells in suspension are preferably concentrated on filters or solid culture medium. Alternatively, immature embryos or other target cells may be arranged on solid culture medium. The cells to be bombarded are positioned at an appropriate distance below the macroprojectile stopping plate. If desired, one or more screens also are positioned between the acceleration device and the cells to be bombarded. Through the use of techniques set forth herein one may obtain up to 1,000 or more foci of cells transiently expressing a marker gene. The number of cells in a focus which express the exogenous gene product 48 hours post-bombardment often range from 1 to 10 and average 1 to 3.
In bombardment transformation, one may optimize the prebombardment culturing conditions and the bombardment parameters to yield the maximum numbers of stable transformants. Both the physical and biological parameters for bombardment are important in this technology. Physical factors are those that involve manipulating the DNA/microprojectile precipitate or those that affect the flight and velocity of either the macro- or microprojectiles. Biological factors include all steps involved in manipulation of cells before and immediately after bombardment, the osmotic adjustment of target cells to help alleviate the trauma associated with bombardment, and also the nature of the transforming DNA, such as linearized DNA or intact supercoiled plasmids. It is believed that pre-bombardment manipulations are especially important for successful transformation of immature embryos.
Accordingly, it is contemplated that one may wish to adjust various of the bombardment parameters in small scale studies to fully optimize the conditions. One may particularly wish to adjust physical parameters such as gap distance, flight distance, tissue distance, and helium pressure. One also may minimize the trauma reduction factors (TRFs) by modifying conditions which influence the physiological state of the recipient cells and which may therefore influence transformation and integration efficiencies. For example, the osmotic state, tissue hydration and the subculture stage or cell cycle of the recipient cells may be adjusted for optimum transformation. The execution of other routine adjustments will be known to those of skill in the art in light of the present disclosure.
(iii) Agrobacterium-Mediated Transfer
Agrobacterium-mediated transfer is a widely applicable system for introducing genes into plant cells because the DNA can be introduced into whole plant tissues, thereby bypassing the need for regeneration of an intact plant from a protoplast. The use of Agrobacterium-mediated plant integrating vectors to introduce DNA into plant cells is well known in the art. See, for example, the methods described (Fraley et al., 1985; Rogers et al., 1987). Advances in Agrobacterium-mediated transfer now allow introduction of large segments of DNA (Hamilton, 1997; Hamilton et al., 1996).
Using conventional transformation vectors, chromosomal integration is required for stable inheritance of the foreign DNA. However, the vector described herein may be used for transformation with or without integration, as the centromere function required for stable inheritance is encoded within the minichromosome. In particular embodiments, transformation events in which the minichromosome is not chromosomally integrated may be preferred, in that problems with site-specific variations in expression and insertional mutagenesis may be avoided.
The integration of the Ti-DNA is a relatively precise process resulting in few rearrangements. The region of DNA to be transferred is defined by the border sequences, and intervening DNA is usually inserted into the plant genome as described (Spielmann et al., 1986; Jorgensen et al., 1987). Modern Agrobacterium transformation vectors are capable of replication in E. coli as well as Agrobacterium, allowing for convenient manipulations as described (Klee et al., 1985). Moreover, recent technological advances in vectors for Agrobacterium-mediated gene transfer have improved the arrangement of genes and restriction sites in the vectors to facilitate construction of vectors capable of expressing various polypeptide coding genes. The vectors described (Rogers et al., 1987) have convenient multi-linker regions flanked by a promoter and a polyadenylation site for direct expression of inserted polypeptide coding genes and are suitable for present purposes. In addition, Agrobacterium containing both armed and disarmed Ti genes can be used for the transformations. In those plant strains where Agrobacterium-mediated transformation is efficient, it is the method of choice because of the facile and defined nature of the gene transfer.
Agrobacterium-mediated transformation of leaf disks and other tissues such as cotyledons and hypocotyls appears to be limited to plants that Agrobacterium naturally infects. Agrobacterium-mediated transformation is most efficient in dicotyledonous plants. Few monocots appear to be natural hosts for Agrobacterium, although transgenic plants have been produced in asparagus and more significantly in maize using Agrobacterium vectors as described (Bytebier et al., 1987; U.S. Patent No. 5,591,616, specifically incorporated herein by reference). Therefore, commercially important cereal grains such as rice, corn, and wheat must usually be transformed using alternative methods. However, as mentioned above, the transformation of asparagus using Agrobacterium also can be achieved (see, for example, Bytebier et al., 1987).
Agrobacterium-mediated transfer may be made more efficient through the use of a mutant that is defective in integration of the Agrobacterium T-DNA but competent for delivery of the DNA into the cell (Mysore et al., 2000a). Additionally, even in Arabidopsis ecotypes and mutants that are recalcitrant to Agrobacterium root transformation, germ-line transformation may be carried out (Mysore et al., 2000b).
A transgenic plant formed using Agrobacterium transformation methods typically contains a single gene on one chromosome. Such transgenic plants can be referred to as being hemizygous for the added gene. A more accurate name for such a plant is an independent segregant, because each transformed plant represents a unique T-DNA integration event.
More preferred is a transgenic plant that is homozygous for the added foreign DNA; i.e., a transgenic plant that contains two copies of a transgene, one gene at the same locus on each chromosome of a chromosome pair. A homozygous transgenic plant can be obtained by sexually mating (selfing) an independent segregant transgenic plant that contains a single added transgene, germinating some of the seed produced and analyzing the resulting plants produced for enhanced activity relative to a control (native, non-transgenic) or an independent segregant transgenic plant.
Even more preferred is a plant in which the minichromosome has not been chromosomally integrated. Such a plant may be termed 2n + x, where 2n is the diploid number of chromosomes and where x is the number of minichromosomes. Initially, transformants may be 2n + 1, i.e., having 1 additional minichromosome. In this case, it may be desirable to self the plant or to cross the plant with another 2n + 1 plant to yield a plant which is 2n + 2. The 2n + 2 plant is preferred in that it is expected to pass the minichromosome through meiosis to all its offspring.
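As a rough illustration of why selfing a 2n + 1 plant can yield 2n + 2 progeny, the expected copy-number classes can be computed from a per-gamete transmission rate. The sketch below rests on assumptions that the text does not state: male and female gametes are taken to transmit the single minichromosome independently and at the same rate `t`, with 1/2 used as an idealized default. The function name is hypothetical.

```python
from fractions import Fraction

def selfing_distribution(t=Fraction(1, 2)):
    """Expected progeny classes from selfing a 2n + 1 plant.

    `t` is the assumed probability that a gamete carries the
    minichromosome. Returns a dict mapping minichromosome copy number
    (0, 1, or 2) to its expected frequency among the progeny.
    """
    t = Fraction(t)
    if not 0 <= t <= 1:
        raise ValueError("transmission rate must lie between 0 and 1")
    return {
        0: (1 - t) ** 2,     # 2n: neither gamete carried it
        1: 2 * t * (1 - t),  # 2n + 1: exactly one gamete carried it
        2: t ** 2,           # 2n + 2: both gametes carried it
    }
```

Under the idealized rate of 1/2, a quarter of the progeny would be 2n + 2. A measured transmission rate could be substituted for `t` to estimate how many progeny need to be screened.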
It is to be understood that two different transgenic plants also can be mated to produce offspring that contain two independently segregating added, exogenous minichromosomes. Selfing of appropriate progeny can produce plants that are homozygous for both added, exogenous minichromosomes that encode a polypeptide of interest. Back-crossing to a parental plant and out-crossing with a non-transgenic plant also are contemplated.
(iv) Other Transformation Methods
Transformation of plant protoplasts can be achieved using methods based on calcium phosphate precipitation, polyethylene glycol treatment, electroporation, and combinations of these treatments (see, e.g., Potrykus et al., 1985; Lorz et al., 1985; Fromm et al., 1986; Uchimiya et al., 1986; Callis et al., 1987; Marcotte et al., 1988).
Application of these systems to different plant strains for the purpose of making transgenic plants depends upon the ability to regenerate that particular plant strain from protoplasts. Illustrative methods for the regeneration of cereals from protoplasts are described (Fujimura et al., 198; Toriyama et al., 1986; Yamada et al., 1986; Abdullah et al., 1986).
To transform plant strains that cannot be successfully regenerated from protoplasts, other ways to introduce DNA into intact cells or tissues can be utilized. For example, regeneration of cereals from immature embryos or explants can be effected as described (Vasil 1988). In addition, "particle gun" or high-velocity microprojectile technology can be utilized (Vasil 1992).
Using that latter technology, DNA is carried through the cell wall and into the cytoplasm on the surface of small metal particles as described (Klein et al., 1987; Klein et al., 1988; McCabe et al., 1988). The metal particles penetrate through several layers of cells and thus allow the transformation of cells within tissue explants.
Protoplast fusion, for example, could be used to integrate a minichromosome constructed in a host cell, such as a yeast cell, and then fuse those cells to plant protoplasts. The chromosomes lacking plant centromeres (such as yeast chromosomes in this example) would be eliminated by the plant cell while the minichromosome would be stably maintained. Numerous examples of protocols for protoplast fusion that could be used with the invention have been described (see, e.g., Negrutiu et al., 1992, and Peterson).
Liposome fusion could be used to introduce a recombinant construct comprising a centromere, such as a minichromosome, by, for example, packaging the recombinant construct into small droplets of lipids (liposomes) and then fusing these liposomes to plant protoplasts, thus delivering the AC into the plant cell (see Lurquin and Rollo, 1993).
VIII. Exogenous Genes for Expression in Plants
One particularly important advance of the present invention is that it provides methods and compositions for expression of exogenous genes in plant cells. One advance of the constructs of the current invention is that they enable the introduction of multiple genes, potentially representing an entire biochemical pathway. Significantly, the current invention allows for the transformation of plant cells with a minichromosome comprising a number of structural genes. Another advantage is that more than one minichromosome could be introduced, allowing combinations of genes to be moved and shuffled.
Moreover, the ability to eliminate a minichromosome from a plant would provide additional flexibility, making it possible to alter the set of genes contained within a plant. Further, by using site-specific recombinases, it should be possible to add genes to an existing minichromosome once it is in a plant.
Added genes often will be genes that direct the expression of a particular protein or polypeptide product, but they also may be non-expressible DNA segments, e.g., transposons such as Ds that do not direct their own transposition. As used herein, an "expressible gene" is any gene that is capable of being transcribed into RNA (e.g., mRNA, antisense RNA, etc.) or translated into a protein, expressed as a trait of interest, or the like, etc., and is not limited to selectable, screenable or non-selectable marker genes. The inventors also contemplate that, where both an expressible gene that is not necessarily a marker gene is employed in combination with a marker gene, one may employ the separate genes on either the same or different DNA segments for transformation. In the latter case, the different vectors are delivered concurrently to recipient cells to maximize cotransformation.
The choice of the particular DNA segments to be delivered to the recipient cells often will depend on the purpose of the transformation. One of the major purposes of transformation of crop plants is to add some commercially desirable, agronomically important traits to the plant. Such traits include, but are not limited to, herbicide resistance or tolerance; insect resistance or tolerance; disease resistance or tolerance (viral, bacterial, fungal, nematode); stress tolerance and/or resistance, as exemplified by resistance or tolerance to drought, heat, chilling, freezing, excessive moisture, salt stress; oxidative stress; increased yields; food content and makeup; physical appearance; male sterility; drydown; standability; prolificacy; starch quantity and quality; oil quantity and quality; protein quality and quantity; amino acid composition; and the like. One may desire to incorporate one or more genes conferring any such desirable trait or traits, such as, for example, a gene or genes encoding herbicide resistance.
In certain embodiments, the present invention contemplates the transformation of a recipient cell with minichromosomes comprising more than one exogenous gene. As used herein, an "exogenous gene" is a gene not normally found in the host genome in an identical context. By this, it is meant that the gene may be isolated from a different species than that of the host genome, or alternatively, isolated from the host genome but operably linked to one or more regulatory regions which differ from those found in the unaltered, native gene. Two or more exogenous genes also can be supplied in a single transformation event using either distinct transgene-encoding vectors, or using a single vector incorporating two or more gene coding sequences. For example, plasmids bearing the bar and aroA expression units in either convergent, divergent, or colinear orientation, are considered to be particularly useful. Further preferred combinations are those of an insect resistance gene, such as a Bt gene, along with a protease inhibitor gene such as pinII, or the use of bar in combination with either of the above genes. Of course, any two or more transgenes of any description, such as those conferring herbicide, insect, disease (viral, bacterial, fungal, nematode) or drought resistance, male sterility, drydown, standability, prolificacy, starch properties, oil quantity and quality, or those increasing yield or nutritional quality may be employed as desired.
(i) Herbicide Resistance
The genes encoding phosphinothricin acetyltransferase (bar and pat), glyphosate tolerant EPSP synthase genes, the glyphosate degradative enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a dehalogenase enzyme that inactivates dalapon), herbicide resistant (e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes (encoding a nitrilase enzyme that degrades bromoxynil) are good examples of herbicide resistant genes for use in transformation. The bar and pat genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which inactivates the herbicide phosphinothricin and prevents this compound from inhibiting glutamine synthetase enzymes. The enzyme 5-enolpyruvylshikimate 3-phosphate synthase (EPSP Synthase) is normally inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate). However, genes are known that encode glyphosate-resistant EPSP synthase enzymes. These genes are particularly contemplated for use in plant transformation. The deh gene encodes the enzyme dalapon dehalogenase and confers resistance to the herbicide dalapon. The bxn gene codes for a specific nitrilase enzyme that converts bromoxynil to a non-herbicidal degradation product.
(ii) Insect Resistance
Potential insect resistance genes that can be introduced include Bacillus thuringiensis crystal toxin genes or Bt genes (Watrud et al., 1985). Bt genes may provide resistance to lepidopteran or coleopteran pests such as European Corn Borer (ECB). Preferred Bt toxin genes for use in such embodiments include the CryIA(b) and CryIA(c) genes. Endotoxin genes from other species of B. thuringiensis which affect insect growth or development also may be employed in this regard.
It is contemplated that preferred Bt genes for use in the transformation protocols disclosed herein will be those in which the coding sequence has been modified to effect increased expression in plants, and more particularly, in monocot plants. Means for preparing synthetic genes are well known in the art and are disclosed in, for example, U.S. Patent No. 5,500,365 and U.S. Patent No. 5,689,052, each of the disclosures of which are specifically incorporated herein by reference in their entirety. Examples of such modified Bt toxin genes include a synthetic Bt CryIA(b) gene (Perlak et al., 1991), and the synthetic CryIA(c) gene termed 1800b (PCT Application WO 95/06128). Some examples of other Bt toxin genes known to those of skill in the art are given in Table 1 below.
Table 1: Bacillus thuringiensis δ-Endotoxin Genesᵃ

New Nomenclature    Old Nomenclature    GenBank Accession
Cry1Aa              CryIA(a)            M11250
Cry1Ab              CryIA(b)            M13898
Cry1Ac              CryIA(c)            M11068
Cry1Ad              CryIA(d)            M73250
Cry1Ae              CryIA(e)            M65252
Cry1Ba              CryIB               X06711
Cry1Bb              ET5                 L32020
Cry1Bc              PEG5                Z46442
Cry1Bd              CryE1               U70726
Cry1Ca              CryIC               X07518
Cry1Cb              CryIC(b)            M97880
Cry1Da              CryID               X54160
Cry1Db              PrtB                Z22511
Cry1Ea              CryIE               X53985
Cry1Eb              CryIE(b)            M73253
Cry1Fa              CryIF               M63897
Cry1Fb              PrtD                Z22512
Cry1Ga              PrtA                Z22510
Cry1Gb              CryH2               U70725
Cry1Ha              PrtC                Z22513
Cry1Hb                                  U35780
Cry1Ia              CryV                X62821
Cry1Ib              CryV                U07642
Cry1Ja              ET4                 L32019
Cry1Jb              ET1                 U31527
Cry2Aa              CryIIA              M31738
Cry2Ab              CryIIB              M23724
Cry2Ac              CryIIC              X57252
Cry3A               CryIIIA             M22472
Cry3Ba              CryIIIB             X17123
Cry3Bb              CryIIIB2            M89794
Cry3C               CryIIID             X59797
Cry4A               CryIVA              Y00423
Cry4B               CryIVB              X07423
Cry5Aa              CryVA(a)            L07025
Cry5Ab              CryVA(b)            L07026
Cry6A               CryVIA              L07022
Cry6B               CryVIB              L07024
Cry7Aa              CryIIIC             M64478
Cry7Ab              CryIIICb            U04367
Cry8A               CryIIIE             U04364
Cry8B               CryIIIG             U04365
Cry8C               CryIIIF             U04366
Cry9A               CryIG               X58120
Cry9B               CryIX               X75019
Cry9C               CryIH               Z37527
Cry10A              CryIVC              M12662
Cry11A              CryIVD              M31737
Cry11B              Jeg80               X86902
Cry12A              CryVB               L07027
Cry13A              CryVC               L07023
Cry14A              CryVD               U13955
Cry15A              34kDa               M76442
Cry16A              cbm71               X94146
Cry17A              cbm71               X99478
Cry18A              CryBP1              X99049
Cry19A              Jeg65               Y08920
Cyt1Aa              CytA                X03182
Cyt1Ab              CytM                X98793
Cyt2A               CytB                Z14147
Cyt2B               CytB                U52043

ᵃAdapted from: http://epunix.biols.susx.ac.uk/Home/Neil_Crickmore/Bt/index.html

Protease inhibitors also may provide insect resistance (Johnson et al., 1989), and will thus have utility in plant transformation. The use of a protease inhibitor II gene, pinII, from tomato or potato is envisioned to be particularly useful. Even more advantageous is the use of a pinII gene in combination with a Bt toxin gene, the combined effect of which has been discovered to produce synergistic insecticidal activity. Other genes which encode inhibitors of the insect's digestive system, or those that encode enzymes or co-factors that facilitate the production of inhibitors, also may be useful. This group may be exemplified by oryzacystatin and amylase inhibitors such as those from wheat and barley.
Also, genes encoding lectins may confer additional or alternative insecticide properties. Lectins (originally termed phytohemagglutinins) are multivalent carbohydrate-binding proteins which have the ability to agglutinate red blood cells from a range of species. Lectins have been identified recently as insecticidal agents with activity against weevils, ECB and rootworm (Murdock et al., 1990; Czapla & Lang, 1990). Lectin genes contemplated to be useful include, for example, barley and wheat germ agglutinin (WGA) and rice lectins (Gatehouse et al., 1984), with WGA being preferred.
Genes controlling the production of large or small polypeptides active against insects when introduced into the insect pests, such as, e.g., lytic peptides, peptide hormones and toxins and venoms, form another aspect of the invention. For example, it is contemplated that the expression of juvenile hormone esterase, directed towards specific insect pests, also may result in insecticidal activity, or perhaps cause cessation of metamorphosis (Hammock et al., 1990).
Transgenic plants expressing genes which encode enzymes that affect the integrity of the insect cuticle form yet another aspect of the invention. Such genes include those encoding, e.g., chitinase, proteases, lipases and also genes for the production of nikkomycin, a compound that inhibits chitin synthesis, the introduction of any of which is contemplated to produce insect resistant plants. Genes that code for activities that affect insect molting, such as those affecting the production of ecdysteroid UDP-glucosyl transferase, also fall within the scope of the useful transgenes of the present invention.
Genes that code for enzymes that facilitate the production of compounds that reduce the nutritional quality of the host plant to insect pests also are encompassed by the present invention. It may be possible, for instance, to confer insecticidal activity on a plant by altering its sterol composition. Sterols are obtained by insects from their diet and are used for hormone synthesis and membrane stability. Therefore alterations in plant sterol composition by expression of novel genes, e.g., those that directly promote the production of undesirable sterols or those that convert desirable sterols into undesirable forms, could have a negative effect on insect growth and/or development and hence endow the plant with insecticidal activity. Lipoxygenases are naturally occurring plant enzymes that have been shown to exhibit anti-nutritional effects on insects and to reduce the nutritional quality of their diet. Therefore, further embodiments of the invention concern transgenic plants with enhanced lipoxygenase activity which may be resistant to insect feeding.
Tripsacum dactyloides is a species of grass that is resistant to certain insects, including corn root worm. It is anticipated that genes encoding proteins that are toxic to insects or are involved in the biosynthesis of compounds toxic to insects will be isolated from Tripsacum and that these novel genes will be useful in conferring resistance to insects. It is known that the basis of insect resistance in Tripsacum is genetic, because said resistance has been transferred to Zea mays via sexual crosses (Branson and Guss, 1972). It is further anticipated that other cereal, monocot or dicot plant species may have genes encoding proteins that are toxic to insects which would be useful for producing insect resistant plants.
Further genes encoding proteins characterized as having potential insecticidal activity also may be used as transgenes in accordance herewith. Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder et al., 1987) which may be used as a rootworm deterrent; genes encoding avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda et al., 1987) which may prove particularly useful as a corn rootworm deterrent; ribosome inactivating protein genes; and even genes that regulate plant structures. Transgenic plants including anti-insect antibody genes and genes that code for enzymes that can convert a non-toxic insecticide (pro-insecticide) applied to the outside of the plant into an insecticide inside the plant also are contemplated.
(iii) Environment or Stress Resistance
Improvement of a plant's ability to tolerate various environmental stresses such as, but not limited to, drought, excess moisture, chilling, freezing, high temperature, salt, and oxidative stress, also can be effected through expression of novel genes. It is proposed that benefits may be realized in terms of increased resistance to freezing temperatures through the introduction of an "antifreeze" protein such as that of the Winter Flounder (Cutler et al., 1989) or synthetic gene derivatives thereof. Improved chilling tolerance also may be conferred through increased expression of glycerol-3-phosphate acetyltransferase in chloroplasts (Wolter et al., 1992). Resistance to oxidative stress (often exacerbated by conditions such as chilling temperatures in combination with high light intensities) can be conferred by expression of superoxide dismutase (Gupta et al., 1993), and may be improved by glutathione reductase (Bowler et al., 1992). Such strategies may allow for tolerance to freezing in newly emerged fields as well as extending later maturity higher yielding varieties to earlier relative maturity zones.
It is contemplated that the expression of novel genes that favorably affect plant water content, total water potential, osmotic potential, and turgor will enhance the ability of the plant to tolerate drought. As used herein, the terms "drought resistance" and "drought tolerance" are used to refer to a plant's increased resistance or tolerance to stress induced by a reduction in water availability, as compared to normal circumstances, and the ability of the plant to function and survive in lower-water environments. In this aspect of the invention it is proposed, for example, that the expression of genes encoding for the biosynthesis of osmotically-active solutes, such as polyol compounds, may impart protection against drought. Within this class are genes encoding for mannitol-1-phosphate dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase (Kaasen et al., 1992). Through the subsequent action of native phosphatases in the cell or by the introduction and coexpression of a specific phosphatase, these introduced genes will result in the accumulation of either mannitol or trehalose, respectively, both of which have been well documented as protective compounds able to mitigate the effects of stress. Mannitol accumulation in transgenic tobacco has been verified and preliminary results indicate that plants expressing high levels of this metabolite are able to tolerate an applied osmotic stress (Tarczynski et al., 1992, 1993).
Similarly, the efficacy of other metabolites in protecting either enzyme function (e.g., alanopine or propionic acid) or membrane integrity (e.g., alanopine) has been documented (Loomis et al., 1989), and therefore expression of genes encoding for the biosynthesis of these compounds might confer drought resistance in a manner similar to or complementary to mannitol. Other examples of naturally occurring metabolites that are osmotically active and/or provide some direct protective effect during drought and/or desiccation include fructose, erythritol (Coxson et al., 1992), sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984; Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988; Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline (Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and Bohnert, 1992). Continued canopy growth and increased reproductive fitness during times of stress will be augmented by introduction and expression of genes such as those controlling the osmotically active compounds discussed above and other such compounds. Currently preferred genes which promote the synthesis of an osmotically active polyol compound are genes which encode the enzymes mannitol-1-phosphate dehydrogenase, trehalose-6-phosphate synthase and myoinositol O-methyltransferase.

It is contemplated that the expression of specific proteins also may increase drought tolerance. Three classes of Late Embryogenic Proteins have been assigned based on structural similarities (see Dure et al., 1989). All three classes of LEAs have been demonstrated in maturing (i.e., desiccating) seeds. Within these 3 types of LEA proteins, the Type-II (dehydrin-type) have generally been implicated in drought and/or desiccation tolerance in vegetative plant parts (i.e., Mundy and Chua, 1988; Piatkowski et al., 1990; Yamaguchi-Shinozaki et al., 1992). Recently, expression of a Type-III LEA (HVA-1) in tobacco was found to influence plant height, maturity and drought tolerance (Fitzpatrick, 1993). In rice, expression of the HVA-1 gene influenced tolerance to water deficit and salinity (Xu et al., 1996). Expression of structural genes from all three LEA groups may therefore confer drought tolerance. Other types of proteins induced during water stress include thiol proteases, aldolases and transmembrane transporters (Guerrero et al., 1990), which may confer various protective and/or repair-type functions during drought stress. It also is contemplated that genes that affect lipid biosynthesis and hence membrane composition might also be useful in conferring drought resistance on the plant.
Many of these genes for improving drought resistance have complementary modes of action. Thus, it is envisaged that combinations of these genes might have additive and/or synergistic effects in improving drought resistance in plants. Many of these genes also improve freezing tolerance (or resistance); the physical stresses incurred during freezing and drought are similar in nature and may be mitigated in similar fashion. Benefit may be conferred via constitutive expression of these genes, but the preferred means of expressing these novel genes may be through the use of a turgor-induced promoter (such as the promoters for the turgor-induced genes described in Guerrero et al., 1990 and Shagan et al., 1993 which are incorporated herein by reference). Spatial and temporal expression patterns of these genes may enable plants to better withstand stress.
It is proposed that expression of genes that are involved with specific morphological traits that allow for increased water extractions from drying soil would be of benefit. For example, introduction and expression of genes that alter root characteristics may enhance water uptake. It also is contemplated that expression of genes that enhance reproductive fitness during times of stress would be of significant value. For example, expression of genes that improve the synchrony of pollen shed and receptiveness of the female flower parts, i.e., silks, would be of benefit. In addition it is proposed that expression of genes that minimize kernel abortion during times of stress would increase the amount of grain to be harvested and hence be of value.
Given the overall role of water in determining yield, it is contemplated that enabling plants to utilize water more efficiently, through the introduction and expression of novel genes, will improve overall performance even when soil water availability is not limiting. By introducing genes that improve the ability of plants to maximize water usage across a full range of stresses relating to water availability, yield stability or consistency of yield performance may be realized.
(iv) Disease Resistance
It is proposed that increased resistance to diseases may be realized through introduction of genes into plants, for example, into monocotyledonous plants such as maize. It is possible to produce resistance to diseases caused by viruses, bacteria, fungi and nematodes. It also is contemplated that control of mycotoxin producing organisms may be realized through expression of introduced genes.
Resistance to viruses may be produced through expression of novel genes. For example, it has been demonstrated that expression of a viral coat protein in a transgenic plant can impart resistance to infection of the plant by that virus and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et al., 1988; Abel et al., 1986). It is contemplated that expression of antisense genes targeted at essential viral functions may also impart resistance to viruses. For example, an antisense gene targeted at the gene responsible for replication of viral nucleic acid may inhibit replication and lead to resistance to the virus. It is believed that interference with other viral functions through the use of antisense genes also may increase resistance to viruses. Further, it is proposed that it may be possible to achieve resistance to viruses through other approaches, including, but not limited to, the use of satellite viruses.
It is proposed that increased resistance to diseases caused by bacteria and fungi may be realized through introduction of novel genes. It is contemplated that genes encoding so-called "peptide antibiotics," pathogenesis related (PR) proteins, toxin resistance, and proteins affecting host-pathogen interactions such as morphological characteristics will be useful. Peptide antibiotics are polypeptide sequences which are inhibitory to growth of bacteria and other microorganisms. For example, the classes of peptides referred to as cecropins and magainins inhibit growth of many species of bacteria and fungi. It is proposed that expression of PR proteins in monocotyledonous plants such as maize may be useful in conferring resistance to bacterial disease. These genes are induced following pathogen attack on a host plant and have been divided into at least five classes of proteins (Bol, Linthorst, and Cornelissen, 1990). Included amongst the PR proteins are β-1,3-glucanases, chitinases, and osmotin and other proteins that are believed to function in plant resistance to disease organisms. Other genes have been identified that have antifungal properties, e.g., UDA (stinging nettle lectin) and hevein (Broakaert et al., 1989; Barkai-Golan et al., 1978). It is known that certain plant diseases are caused by the production of phytotoxins. It is proposed that resistance to these diseases would be achieved through expression of a novel gene that encodes an enzyme capable of degrading or otherwise inactivating the phytotoxin. It also is contemplated that expression of novel genes that alter the interactions between the host plant and pathogen may be useful in reducing the ability of the disease organism to invade the tissues of the host plant, e.g., an increase in the waxiness of the leaf cuticle or other morphological characteristics.
(v) Plant Agronomic Characteristics
Two of the factors determining where crop plants can be grown are the average daily temperature during the growing season and the length of time between frosts. Within the areas where it is possible to grow a particular crop, there are varying limitations on the maximal time it is allowed to grow to maturity and be harvested. For example, a variety to be grown in a particular area is selected for its ability to mature and dry down to harvestable moisture content within the required period of time with maximum possible yield. Therefore, crops of varying maturities are developed for different growing locations. Apart from the need to dry down sufficiently to permit harvest, it is desirable to have maximal drying take place in the field to minimize the amount of energy required for additional drying post-harvest. Also, the more readily a product such as grain can dry down, the more time there is available for growth and kernel fill. It is considered that genes that influence maturity and/or dry down can be identified and introduced into plant lines using transformation techniques to create new varieties adapted to different growing locations or the same growing location, but having improved yield to moisture ratio at harvest. Expression of genes that are involved in regulation of plant development may be especially useful.
It is contemplated that genes may be introduced into plants that would improve standability and other plant growth characteristics. Expression of novel genes in plants which confer stronger stalks, improved root systems, or prevent or reduce ear droppage would be of great value to the farmer. It is proposed that introduction and expression of genes that increase the total amount of photoassimilate available by, for example, increasing light distribution and/or interception would be advantageous. In addition, the expression of genes that increase the efficiency of photosynthesis and/or the leaf canopy would further increase gains in productivity. It is contemplated that expression of a phytochrome gene in crop plants may be advantageous. Expression of such a gene may reduce apical dominance, confer semidwarfism on a plant, and increase shade tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for increased plant populations in the field.

(vi) Nutrient Utilization
The ability to utilize available nutrients may be a limiting factor in growth of crop plants. It is proposed that it would be possible to alter nutrient uptake, tolerate pH extremes, mobilization through the plant, storage pools, and availability for metabolic activities by the introduction of novel genes. These modifications would allow a plant such as maize to more efficiently utilize available nutrients. It is contemplated that an increase in the activity of, for example, an enzyme that is normally present in the plant and involved in nutrient utilization would increase the availability of a nutrient. An example of such an enzyme would be phytase. It is further contemplated that enhanced nitrogen utilization by a plant is desirable. Expression of a glutamate dehydrogenase gene in plants, e.g., E. coli gdhA genes, may lead to increased fixation of nitrogen in organic compounds. Furthermore, expression of gdhA in plants may lead to enhanced resistance to the herbicide glufosinate by incorporation of excess ammonia into glutamate, thereby detoxifying the ammonia. It also is contemplated that expression of a novel gene may make a nutrient source available that was previously not accessible, e.g., an enzyme that releases a component of nutrient value from a more complex molecule, perhaps a macromolecule.
(vii) Male Sterility
Male sterility is useful in the production of hybrid seed. It is proposed that male sterility may be produced through expression of novel genes. For example, it has been shown that expression of genes that encode proteins that interfere with development of the male inflorescence and/or gametophyte result in male sterility. Chimeric ribonuclease genes that express in the anthers of transgenic tobacco and oilseed rape have been demonstrated to lead to male sterility (Mariani et al., 1990).
A number of mutations were discovered in maize that confer cytoplasmic male sterility. One mutation in particular, referred to as T cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A DNA sequence, designated TURF-13 (Levings, 1990), was identified that correlates with T cytoplasm. It is proposed that it would be possible, through the introduction of TURF-13 via transformation, to separate male sterility from disease sensitivity. As it is necessary to be able to restore male fertility for breeding purposes and for grain production, it is proposed that genes encoding restoration of male fertility also may be introduced.
(viii) Improved Nutritional Content
Genes may be introduced into plants to improve the nutrient quality or content of a particular crop. Introduction of genes that alter the nutrient composition of a crop may greatly enhance the feed or food value. For example, the protein of many grains is suboptimal for feed and food purposes, especially when fed to pigs, poultry, and humans. The protein is deficient in several amino acids that are essential in the diet of these species, requiring the addition of supplements to the grain. Limiting essential amino acids may include lysine, methionine, tryptophan, threonine, valine, arginine, and histidine. Some amino acids become limiting only after corn is supplemented with other inputs for feed formulations. The levels of these essential amino acids in seeds and grain may be elevated by mechanisms which include, but are not limited to, the introduction of genes to increase the biosynthesis of the amino acids, decrease the degradation of the amino acids, increase the storage of the amino acids in proteins, or increase transport of the amino acids to the seeds or grain.
The protein composition of a crop may be altered to improve the balance of amino acids in a variety of ways including elevating expression of native proteins, decreasing expression of those with poor composition, changing the composition of native proteins, or introducing genes encoding entirely new proteins possessing superior composition.
The introduction of genes that alter the oil content of a crop plant may also be of value. Increases in oil content may result in increases in metabolizable-energy-content and density of the seeds for use in feed and food. The introduced genes may encode enzymes that remove or reduce rate-limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes may include, but are not limited to, those that encode acetyl-CoA carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well known fatty acid biosynthetic activities. Other possibilities are genes that encode proteins that do not possess enzymatic activity such as acyl carrier protein. Genes may be introduced that alter the balance of fatty acids present in the oil providing a more healthful or nutritive feedstuff. The introduced DNA also may encode sequences that block expression of enzymes involved in fatty acid biosynthesis, altering the proportions of fatty acids present in crops.
Genes may be introduced that enhance the nutritive value of the starch component of crops, for example by increasing the degree of branching, resulting in improved utilization of the starch in livestock by delaying its metabolism.
Additionally, other major constituents of a crop may be altered, including genes that affect a variety of other nutritive, processing, or other quality aspects. For example, pigmentation may be increased or decreased.
Feed or food crops may also possess insufficient quantities of vitamins, requiring supplementation to provide adequate nutritive value. Introduction of genes that enhance vitamin biosynthesis may be envisioned including, for example, vitamins A, E, B12, choline, and the like. Mineral content may also be sub-optimal. Thus genes that affect the accumulation or availability of compounds containing phosphorus, sulfur, calcium, manganese, zinc, and iron among others would be valuable.
Numerous other examples of improvements of crops may be used with the invention. The improvements may not necessarily involve grain, but may, for example, improve the value of a crop for silage. Introduction of DNA to accomplish this might include sequences that alter lignin production such as those that result in the "brown midrib" phenotype associated with superior feed value for cattle.
In addition to direct improvements in feed or food value, genes also may be introduced which improve the processing of crops and improve the value of the products resulting from the processing. One use of crops is via wetmilling. Thus novel genes that increase the efficiency and reduce the cost of such processing, for example by decreasing steeping time, may also find use. Improving the value of wetmilling products may include altering the quantity or quality of starch, oil, corn gluten meal, or the components of gluten feed. Elevation of starch may be achieved through the identification and elimination of rate limiting steps in starch biosynthesis or by decreasing levels of the other components of crops resulting in proportional increases in starch.
Oil is another product of wetmilling, the value of which may be improved by introduction and expression of genes. Oil properties may be altered to improve its performance in the production and use of cooking oil, shortenings, lubricants or other oil-derived products or improvement of its health attributes when used in the food-related applications. Novel fatty acids also may be synthesized which upon extraction can serve as starting materials for chemical syntheses. The changes in oil properties may be achieved by altering the type, level, or lipid arrangement of the fatty acids present in the oil. This in turn may be accomplished by the addition of genes that encode enzymes that catalyze the synthesis of novel fatty acids and the lipids possessing them or by increasing levels of native fatty acids while possibly reducing levels of precursors.
Alternatively, DNA sequences may be introduced which slow or block steps in fatty acid biosynthesis resulting in the increase in precursor fatty acid intermediates. Genes that might be added include desaturases, epoxidases, hydratases, dehydratases, and other enzymes that catalyze reactions involving fatty acid intermediates. Representative examples of catalytic steps that might be blocked include the desaturations from stearic to oleic acid and oleic to linolenic acid resulting in the respective accumulations of stearic and oleic acids. Another example is the blockage of elongation steps resulting in the accumulation of C8 to C12 saturated fatty acids.

(i.r) Prncl«ctio» nr As.s«»rlatro» Ot'Clrt'Jricals or Biologicals It may further be considered that a trans~enic plant prepared in accordance with the invention may be used for the production or manufacturing of useful hiolo~ical compounds that were either not produced at all. or not produced at the same level. in the corn plant previously. Alternatively, plants produced in accordance with the invention may be made to metabolize certain compounds. such as hazardous wastes, thereby allowing bioremediation of these compounds.
The novel plants producing these compounds are made possible by the introduction and expression of one or potentially many genes with the constructs provided by the invention. The vast array of possibilities includes, but is not limited to, any biological compound which is presently produced by any organism, such as proteins, nucleic acids, primary and intermediary metabolites, carbohydrate polymers, enzymes for uses in bioremediation, enzymes for modifying pathways that produce secondary plant metabolites such as flavonoids or vitamins, enzymes that could produce pharmaceuticals, and enzymes that could produce compounds of interest to the manufacturing industry such as specialty chemicals and plastics. The compounds may be produced by the plant, extracted upon harvest and/or processing, and used for any presently recognized useful purpose such as pharmaceuticals, fragrances, and industrial enzymes, to name a few.
(x) Non-Protein-Expressing Sequences
DNA may be introduced into plants for the purpose of expressing RNA transcripts that function to affect plant phenotype yet are not translated into protein. Two examples are antisense RNA and RNA with ribozyme activity. Both may serve possible functions in reducing or eliminating expression of native or introduced plant genes. However, as detailed below, DNA need not be expressed to affect the phenotype of a plant.

1. Antisense RNA
Genes may be constructed or isolated which, when transcribed, produce antisense RNA that is complementary to all or part(s) of a targeted messenger RNA(s). The antisense RNA reduces production of the polypeptide product of the messenger RNA. The polypeptide product may be any protein encoded by the plant genome. The aforementioned genes will be referred to as antisense genes. An antisense gene may thus be introduced into a plant by transformation methods to produce a novel transgenic plant with reduced expression of a selected protein of interest. For example, the protein may be an enzyme that catalyzes a reaction in the plant. Reduction of the enzyme activity may reduce or eliminate products of the reaction, which include any enzymatically synthesized compound in the plant such as fatty acids, amino acids, carbohydrates, nucleic acids and the like. Alternatively, the protein may be a storage protein, such as a zein, or a structural protein, the decreased expression of which may lead to changes in seed amino acid composition or plant morphological changes, respectively. The possibilities cited above are provided only by way of example and do not represent the full range of applications.
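By way of illustration only, the complementarity between an antisense transcript and its target may be sketched in a few lines of Python. The function name and the nine-base target fragment below are hypothetical and are not taken from any sequence disclosed herein; the sketch simply computes the reverse complement of a messenger RNA, with both strands written 5' to 3'.

```python
# Illustrative sketch only; names and the example fragment are hypothetical.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the antisense RNA (reverse complement) of an mRNA, written 5'->3'."""
    mrna = mrna.upper()
    if set(mrna) - set("ACGU"):
        raise ValueError("expected an RNA sequence containing only A, C, G and U")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return mrna.translate(_COMPLEMENT)[::-1]


target_fragment = "AUGGCUUCA"  # made-up target, not a disclosed sequence
print(antisense_rna(target_fragment))
```

An antisense gene would in practice be complementary to all or part of a much longer transcript; the short fragment serves only to make the base pairing easy to check by hand.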
2. Ribozymes
Genes also may be constructed or isolated which, when transcribed, produce RNA enzymes (ribozymes) which can act as endoribonucleases and catalyze the cleavage of RNA molecules with selected sequences. The cleavage of selected messenger RNAs can result in the reduced production of their encoded polypeptide products. These genes may be used to prepare novel transgenic plants which possess them. The transgenic plants may possess reduced levels of polypeptides including, but not limited to, the polypeptides cited above.
Ribozymes are RNA-protein complexes that cleave nucleic acids in a site-specific fashion. Ribozymes have specific catalytic domains that possess endonuclease activity (Kim and Cech, 1987; Gerlach et al., 1987; Forster and Symons, 1987). For example, a large number of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often cleaving only one of several phosphoesters in an oligonucleotide substrate (Cech et al., 1981; Michel and Westhof, 1990; Reinhold-Hurek and Shub, 1992). This specificity has been attributed to the requirement that the substrate bind via specific base-pairing interactions to the internal guide sequence ("IGS") of the ribozyme prior to chemical reaction.
Ribozyme catalysis has primarily been observed as part of sequence-specific cleavage/ligation reactions involving nucleic acids (Joyce, 1989; Cech et al., 1981). For example, U.S. Patent 5,354,855 reports that certain ribozymes can act as endonucleases with a sequence specificity greater than that of known ribonucleases and approaching that of the DNA restriction enzymes.
Several different ribozyme motifs have been described with RNA cleavage activity (Symons, 1992). Examples include sequences from the Group I self splicing introns including Tobacco Ringspot Virus (Prody et al., 1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981), and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences from these and related viruses are referred to as hammerhead ribozymes based on a predicted folded secondary structure.
Other suitable ribozymes include sequences from RNase P with RNA cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U.S. Patents 5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes (U.S. Patent 5,625,047). The general design and optimization of ribozyme directed RNA cleavage activity has been discussed in detail (Haseloff and Gerlach, 1988; Symons, 1992; Chowrira et al., 1994; Thompson et al., 1995).
The other variable in ribozyme design is the selection of a cleavage site on a given target RNA. Ribozymes are targeted to a given sequence by virtue of annealing to a site by complementary base pair interactions. Two stretches of homology are required for this targeting. These stretches of homologous sequences flank the catalytic ribozyme structure defined above. Each stretch of homologous sequence can vary in length from 7 to 15 nucleotides. The only requirement for defining the homologous sequences is that, on the target RNA, they are separated by a specific sequence which is the cleavage site. For hammerhead ribozymes, the cleavage site is a dinucleotide sequence on the target RNA: a uracil (U) followed by either an adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et al., 1995). The frequency of this dinucleotide occurring in any given RNA is statistically 3 out of 16. Therefore, for a given target messenger RNA of 1,000 bases, 187 dinucleotide cleavage sites are statistically possible.
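The arithmetic behind this estimate may be sketched as follows. The Python below is offered as a minimal illustration, not as a design tool: the function names and the short test sequence are hypothetical, and the expected count assumes that the four bases occur independently and with equal frequency, which real transcripts need not satisfy.

```python
from typing import List


def hammerhead_sites(rna: str) -> List[int]:
    """Return 0-based positions of each U that is followed by A, C or U."""
    rna = rna.upper().replace("T", "U")  # tolerate DNA-style input
    return [i for i in range(len(rna) - 1)
            if rna[i] == "U" and rna[i + 1] in "ACU"]


def expected_site_count(length: int) -> float:
    """Expected number of UA/UC/UU dinucleotides in a random RNA.

    Assumes independent, equiprobable bases: 3 of the 16 possible
    dinucleotides qualify, and a sequence of `length` bases contains
    `length - 1` overlapping dinucleotide positions.
    """
    return max(length - 1, 0) * 3 / 16


print(hammerhead_sites("GUACUUGUG"))  # made-up sequence
print(expected_site_count(1000))
```

For a 1,000 base target the estimate comes to roughly 187 candidate sites, in line with the figure given above; each candidate would still need flanking stretches of 7 to 15 nucleotides chosen on either side of it.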
Designing and testing ribozymes for efficient cleavage of a target RNA is a process well known to those skilled in the art. Examples of scientific methods for designing and testing ribozymes are described by Chowrira et al. (1994) and Lieber and Strauss (1995), each incorporated by reference. The identification of operative and preferred sequences for use in down regulating a given gene is simply a matter of preparing and testing a given sequence, and is a routinely practiced "screening" method known to those of skill in the art.
3. Induction of Gene Silencing
It also is possible that genes may be introduced to produce novel transgenic plants which have reduced expression of a native gene product by the mechanism of co-suppression. It has been demonstrated in tobacco, tomato, and petunia (Goring et al., 1991; Smith et al., 1990; Napoli et al., 1990; van der Krol et al., 1990) that expression of the sense transcript of a native gene will reduce or eliminate expression of the native gene in a manner similar to that observed for antisense genes. The introduced gene may encode all or part of the targeted native protein but its translation may not be required for reduction of levels of that native protein.
4. Non-RNA-Expressing Sequences
DNA elements, including those of transposable elements such as Ds, Ac, or Mu, may be inserted into a gene to cause mutations. These DNA elements may be inserted in order to inactivate (or activate) a gene and thereby "tag" a particular trait. In this instance the transposable element does not cause instability of the tagged mutation, because the utility of the element does not depend on its ability to move in the genome. Once a desired trait is tagged, the introduced DNA sequence may be used to clone the corresponding gene, e.g., using the introduced DNA sequence as a PCR primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta et al., 1988). Once identified, the entire gene(s) for the particular trait, including control or regulatory regions where desired, may be isolated, cloned and manipulated as desired. The utility of DNA elements introduced into an organism for purposes of gene tagging is independent of the DNA sequence and does not depend on any biological activity of the DNA sequence, i.e., transcription into RNA or translation into protein. The sole function of the DNA element is to disrupt the DNA sequence of a gene.
It is contemplated that unexpressed DNA sequences, including novel synthetic sequences, could be introduced into cells as proprietary "labels" of those cells and plants and seeds thereof. It would not be necessary for a label DNA element to disrupt the function of a gene endogenous to the host organism, as the sole function of this DNA would be to identify the origin of the organism. For example, one could introduce a unique DNA sequence into a plant and this DNA element would identify all cells, plants, and progeny of these cells as having arisen from that labeled source. It is proposed that inclusion of label DNAs would enable one to distinguish proprietary germplasm, or germplasm derived from such, from unlabelled germplasm.
Another possible element which may be introduced is a matrix attachment region element (MAR), such as the chicken lysozyme A element (Stief, 1989), which can be positioned around an expressible gene of interest to effect an increase in overall expression of the gene and diminish position dependent effects upon incorporation into the plant genome (Stief et al., 1989; Phi-Van et al., 1990).
5. Other
Other examples of non-protein expressing sequences specifically envisioned for use with the invention include tRNA sequences, for example, to alter codon usage, and rRNA variants, for example, which may confer resistance to various agents such as antibiotics.
IX. Biological Functional Equivalents
Modification and changes may be made in the centromeric DNA segments of the current invention and still obtain a functional molecule with desirable characteristics. The following is a discussion based upon changing the nucleic acids of a centromere to create an equivalent, or even an improved, second-generation molecule.
In particular embodiments of the invention, mutated centromeric sequences are contemplated to be useful for increasing the utility of the centromere. It is specifically contemplated that the function of the centromeres of the current invention may be based upon the secondary structure of the DNA sequences of the centromere and/or the proteins which interact with the centromere. By changing the DNA sequence of the centromere, one may alter the affinity of one or more centromere-associated protein(s) for the centromere and/or the secondary structure of the centromeric sequences, thereby changing the activity of the centromere. Alternatively, changes may be made in the centromeres of the invention which do not affect the activity of the centromere. Changes in the centromeric sequences which reduce the size of the DNA segment needed to confer centromere activity are contemplated to be particularly useful in the current invention, as would changes which increased the fidelity with which the centromere was transmitted during mitosis and meiosis.
X. Plants
The term "plant," as used herein, refers to any type of plant. The inventors have provided below an exemplary description of some plants that may be used with the invention. However, the list is not in any way limiting, as other types of plants will be known to those of skill in the art and could be used with the invention.

A common class of plants exploited in agriculture are vegetable crops, including artichokes, kohlrabi, arugula, leeks, asparagus, lettuce (e.g., head, leaf, romaine), bok choy, malanga, broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew, cantaloupe), Brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower, okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage, peppers, collards, potatoes, cucumber plants (marrows, cucumbers), pumpkins, cucurbits, radishes, dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic, spinach, green onions, squash, greens, beet (sugar beet and fodder beet), sweet potatoes, Swiss chard, horseradish, tomatoes, kale, turnips, and spices.
Other types of plants frequently finding commercial use include fruit and vine crops such as apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince, almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries, boysenberries, cranberries, currants, loganberries, raspberries, strawberries, blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate, pineapple, tropical fruits, pomes, melon, mango, papaya, and lychee.
Many of the most widely grown plants are field crop plants such as evening primrose, meadow foam, corn (field, sweet, popcorn), hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye, wheat, etc.), sorghum, tobacco, kapok, leguminous plants (beans, lentils, peas, soybeans), oil plants (rape, mustard, poppy, olives, sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants (cotton, flax, hemp, jute), lauraceae (cinnamon, camphor), or plants such as coffee, sugarcane, tea, and natural rubber plants.
Still other examples of plants include bedding plants such as flowers, cactus, succulents and ornamental plants, as well as trees such as forest (broad-leaved trees and evergreens, such as conifers), fruit, ornamental, and nut-bearing trees, as well as shrubs and other nursery stock.
XI. Definitions
As used herein, the terms "autonomous replicating sequence" or "ARS" or "origin of replication" refer to an origin of DNA replication recognized by proteins that initiate DNA replication.
As used herein, the terms "binary BAC" or "binary bacterial artificial chromosome" refer to a bacterial vector that contains the T-DNA border sequences necessary for Agrobacterium mediated transformation (see, for example, Hamilton et al., 1996; Hamilton, 1997; and Liu et al., 1999).
As used herein, the term "candidate centromere sequence" refers to a nucleic acid sequence which one wishes to assay for potential centromere function.
As used herein, a "centromere" is any DNA sequence that confers an ability to segregate to daughter cells through cell division. In one context, this sequence may produce a segregation efficiency to daughter cells ranging from about 1% to about 100%, including about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or about 95% of daughter cells. Variations in such a segregation efficiency may find important applications within the scope of the invention; for example, mini-chromosomes carrying centromeres that confer 100% stability could be maintained in all daughter cells without selection, while those that confer 1% stability could be temporarily introduced into a transgenic organism, but be eliminated when desired. In particular embodiments of the invention, the centromere may confer stable segregation of a nucleic acid sequence, including a recombinant construct comprising the centromere, through mitotic or meiotic divisions, including through both mitotic and meiotic divisions. A plant centromere is not necessarily derived from plants, but has the ability to promote DNA segregation in plant cells.
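The practical consequence of differing segregation efficiencies may be illustrated with a simple model, offered here as an assumption and not as a measured result: if a construct is passed to a given daughter cell with a probability equal to the segregation efficiency at each division, independently of earlier divisions and in the absence of selection, then the chance that a cell lineage still carries the construct after n divisions is the efficiency raised to the power n. The function name below is hypothetical.

```python
def fraction_retaining(efficiency: float, divisions: int) -> float:
    """Probability that a lineage retains the construct after `divisions` divisions.

    Assumed model: each division transmits the construct to the followed
    daughter cell with probability `efficiency`, independently of the others.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return efficiency ** divisions


for eff in (1.0, 0.95, 0.5, 0.01):
    print(eff, fraction_retaining(eff, 10))
```

Under this model a construct with 100% efficiency persists indefinitely, whereas one with 1% efficiency is lost from nearly every lineage within a division or two, consistent with the temporary introduction contemplated above.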

As used herein, the term "centromere-associated protein" refers to a protein encoded by a sequence of the centromere or a protein which is encoded by host DNA and binds with relatively high affinity to the centromere.
As used herein, "eukaryote" refers to living organisms whose cells contain nuclei. A eukaryote may be distinguished from a "prokaryote," which is an organism which lacks nuclei. Prokaryotes and eukaryotes differ fundamentally in the way their genetic information is organized, as well as in their patterns of RNA and protein synthesis.
As used herein, the term "expression" refers to the process by which a structural gene produces an RNA molecule, typically termed messenger RNA (mRNA). The mRNA is typically, but not always, translated into polypeptide(s).
As used herein, the term "genome" refers to all of the genes and DNA sequences that comprise the genetic information within a given cell of an organism. Usually, this is taken to mean the information contained within the nucleus, but also includes the organelles.
As used herein, the term "higher eukaryote" means a multicellular eukaryote, typically characterized by its more complex physiological mechanisms and relatively large size. Generally, complex organisms such as plants and animals are included in this category. Preferred higher eukaryotes to be transformed by the present invention include, for example, monocot and dicot angiosperm species, gymnosperm species, fern species, plant tissue culture cells of these species, animal cells and algal cells. It will of course be understood that prokaryotes and eukaryotes alike may be transformed by the methods of this invention.
As used herein, the term "host" refers to any organism that is the recipient of a replicable plasmid, or expression vector comprising a plant chromosome. Ideally, host strains used for cloning experiments should be free of any restriction enzyme activity that might degrade the foreign DNA used. Preferred examples of host cells for cloning, useful in the present invention, are bacteria such as Escherichia coli, Bacillus subtilis, Pseudomonas, Streptomyces, Salmonella, and yeast cells such as S. cerevisiae. Host cells which can be targeted for expression of a minichromosome may be plant cells of any source and specifically include Arabidopsis, maize, rice, sugarcane, sorghum, barley, soybeans, tobacco, wheat, tomato, potato, citrus, or any other agronomically or scientifically important species.
As used herein, the term "hybridization" refers to the pairing of complementary RNA and DNA strands to produce an RNA-DNA hybrid, or alternatively, the pairing of two DNA single strands from genetically different or the same sources to produce a double stranded DNA molecule.
As used herein, the term "linker" refers to a DNA molecule, generally up to 50 or 60 nucleotides long and synthesized chemically, or cloned from other vectors. In a preferred embodiment, this fragment contains one, or preferably more than one, restriction enzyme site for a blunt-cutting enzyme and a staggered-cutting enzyme, such as BamHI. One end of the linker fragment is adapted to be ligatable to one end of the linear molecule and the other end is adapted to be ligatable to the other end of the linear molecule.
As used herein, a "library" is a pool of random DNA fragments which are cloned. In principle, any gene can be isolated by screening the library with a specific hybridization probe (see, for example, Young et al., 1977). Each library may contain the DNA of a given organism inserted as discrete restriction enzyme-generated fragments or as randomly sheared fragments into many thousands of plasmid vectors. For purposes of the present invention, E. coli, yeast, and Salmonella plasmids are particularly useful when the genome inserts come from other organisms.

As used herein, the term "lower eukaryote" refers to a eukaryote characterized by a comparatively simple physiology and composition, and most often unicellularity. Examples of lower eukaryotes include flagellates, ciliates, and yeast.
As used herein, a "minichromosome" is a recombinant DNA construct including a centromere and capable of transmission to daughter cells. The stability of this construct through cell division could range from about 1% to about 100%, including about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and about 95%. The minichromosome construct may be a circular or linear molecule. It may include elements such as one or more telomeres, ARS sequences, and genes. The number of such sequences included is only limited by the physical size limitations of the construct itself. It could contain DNA derived from a natural centromere, although it may be preferable to limit the amount of DNA to the minimal amount required to obtain a segregation efficiency in the range of 1-100%. The minichromosome may be inherited through mitosis or meiosis, or through both meiosis and mitosis. As used herein, the term minichromosome specifically encompasses and includes the terms "plant artificial chromosome" or "PLAC," and all teachings relevant to a PLAC or plant artificial chromosome specifically apply to constructs within the meaning of the term minichromosome.
As used herein, by "minichromosome-encoded protein" it is meant a polypeptide which is encoded by a sequence of a minichromosome of the current invention. This includes sequences such as selectable markers, telomeres, etc., as well as those proteins encoded by any other selected functional genes on the minichromosome.
A "180 base pair repeat" is defined as any one of the specific repeats disclosed in SEQ ID NOS: 184-212, or a "consensus" sequence derived therefrom. Thus, a given "180 base pair repeat" may include more or less than 180 base pairs, and may reflect a sequence not represented by any of the specific sequences provided herein.
As used herein, the term "plant" includes plant cells, plant protoplasts, plant calli, and the like, as well as whole plants regenerated therefrom.
As used herein, the term "plasmid" or "cloning vector" refers to a closed covalently circular extrachromosomal DNA or linear DNA which is able to autonomously replicate in a host cell and which is normally nonessential to the survival of the cell. A wide variety of plasmids and other vectors are known and commonly used in the art (see, for example, Cohen et al., U.S. Patent No. 4,468,464, which discloses examples of DNA plasmids, and which is specifically incorporated herein by reference).
As used herein, a "probe" is any biochemical reagent (usually tagged in some way for ease of identification) used to identify or isolate a gene, a gene product, a DNA segment or a protein.
As used herein, the term "recombination" refers to any genetic exchange that involves breaking and rejoining of DNA strands.
As used herein, the term "regulatory sequence" refers to any DNA sequence that influences the efficiency of transcription or translation of any gene. The term includes, but is not limited to, sequences comprising promoters, enhancers and terminators.
As used herein, a "selectable marker" is a gene whose presence results in a clear phenotype, and most often a growth advantage for cells that contain the marker. This growth advantage may be present under standard conditions, altered conditions such as elevated temperature, or in the presence of certain chemicals such as herbicides or antibiotics. Use of selectable markers is described, for example, in Broach et al. (1979). Examples of selectable markers include the thymidine kinase gene, the cellular adenine-phosphoribosyltransferase and the dihydrofolate reductase genes, hygromycin phosphotransferase genes, the bar gene and neomycin phosphotransferase genes, among others. Preferred selectable markers in the present invention include genes whose expression confers antibiotic or herbicide resistance to the host cell, sufficient to enable the maintenance of a vector within the host cell, and which facilitate the manipulation of the plasmid into new host cells. Of particular interest in the present invention are proteins conferring cellular resistance to ampicillin, chloramphenicol, tetracycline, G-418, bialaphos, and glyphosate, for example.
As used herein, a "screenable marker" is a gene whose presence results in an identifiable phenotype. This phenotype may be observable under standard conditions, altered conditions such as elevated temperature, or in the presence of certain chemicals used to detect the phenotype.
As used herein, the term "site-specific recombination" refers to any genetic exchange that involves breaking and rejoining of DNA strands at a specific DNA sequence.
As used herein, a "structural gene" is a sequence which codes for a polypeptide or RNA and includes 5' and 3' ends. The structural gene may be from the host into which the structural gene is transformed or from another species. A structural gene will preferably, but not necessarily, include one or more regulatory sequences which modulate the expression of the structural gene, such as a promoter, terminator or enhancer. A structural gene will preferably, but not necessarily, confer some useful phenotype upon an organism comprising the structural gene, for example, herbicide resistance. In one embodiment of the invention, a structural gene may encode an RNA sequence which is not translated into a protein, for example a tRNA or rRNA gene.
As used herein, the term "telomere" refers to a sequence capable of capping the ends of a chromosome, thereby preventing degradation of the chromosome end, ensuring replication and preventing fusion to other chromosome sequences.
As used herein, the terms "transformation" or "transfection" refer to the acquisition in cells of new DNA sequences through the chromosomal or extra-chromosomal addition of DNA. This is the process by which naked DNA, DNA coated with protein, or whole minichromosomes are introduced into a cell, resulting in a potentially heritable change.
XII. Examples
The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
EXAMPLE 1
Generation of an Arabidopsis thaliana Mapping Population
To generate a pollen donor plant, two parental lines carrying qrt1 were crossed to one another. The qrt1-1 allele was in the Landsberg ecotype background and the qrt1-2 allele was in the Columbia ecotype background. The Landsberg ecotype was readily discernible from the Columbia ecotype because it carries a recessive mutation, erecta, which causes the stems to thicken, inflorescences to be more compact, and the leaves to be more rounded and smaller than wildtype. To utilize this as a marker of a donor plant, qrt1-2 pollen was crossed onto a qrt1-1 female stigma. The F1 progeny were heterozygous at all molecular markers, yet the progeny retain the quartet phenotype of a tetrad of fused pollen grains. In addition, progeny display the ERECTA phenotype of the Columbia plant. This visible marker serves as an indication that the crossing was successful in generating plants segregating ecotype specific markers. Further testing was done on the donor plants by performing PCR analysis to ensure that progeny were heterozygous at molecular loci.
Because the pollen grains cannot be directly assayed for marker segregation, and because of the desire to create a long-term resource available for multiple marker assays, it was necessary to cross individual tetrads generated by the donor plant. This created sets of progeny plants which yielded both large quantities of tissue and seed. These crosses were accomplished efficiently by generating a recipient plant homozygous for male sterility (ms1). The recessive mutant ms1 was chosen to guard against the possibility of the recipient plant self-fertilizing and the progeny being mistaken for tetrad plants. Because the homozygous plant does not self, a stock of seed generated by a heterozygous male sterility 1 plant needs to be maintained, from which sterile recipient plants can be selected.

Tetrad Pollinations
Tetrad pollinations were carried out as follows. A mature flower was removed from the donor plant and tapped upon a glass microscope slide to release mature tetrad pollen grains. This slide was then placed under a 20-40x Zeiss dissecting microscope. To isolate individual tetrad pollen grains, a small wooden dowel was used, to which an eyebrow hair was mounted with rubber cement. Using the light microscope, a tetrad pollen unit was chosen and touched to the eyebrow hair. The tetrad preferentially adhered to the eyebrow hair and was thus lifted from the microscope slide and transported to the recipient plant stigmatic surface. The transfer was carried out without the use of the microscope, and the eyebrow hair with adhering tetrad was then placed against the recipient stigmatic surface and the hair was manually dragged across the stigma surface. The tetrad then preferentially adhered to the stigma of the recipient and the cross pollination was completed.

Initially, 57 tetrad seed sets, consisting of 3-4 seeds each, were collected. Plants were grown from these tetrad seed sets, and tissue was collected. DNA was extracted from a small portion of the stored tissue for PCR based segregation analysis. Additionally, the segregation of the visible erecta phenotype was scored. When the plants set seed, the seed was collected as a source for the larger amounts of DNA required to analyze RFLP segregation by Southern blotting.

Preparation and Analysis of Centromere-Spanning Contigs
Previously, DNA fingerprint and hybridization analysis of two bacterial artificial chromosome (BAC) libraries led to the assembly of physical maps covering nearly all single-copy portions of the Arabidopsis genome (Marra et al., 1999). However, the presence of repetitive DNA near the Arabidopsis centromeres, including 180 bp repeats, retroelements, and middle repetitive sequences, complicated efforts to anchor centromeric BAC contigs to particular chromosomes (Murata et al., 1997; Heslop-Harrison et al., 1999; Brandes et al., 1997; Franz et al., 1998; Wright et al., 1996; Konieczny et al., 1991; Pelissier et al., 1995; Voytas and Ausubel, 1988; Chye et al., 1997; Tsay et al., 1993; Richards et al., 1991; Simoens et al., 1988; Thompson et al., 1996; Pelissier et al., 1996). The inventors used genetic mapping to unambiguously assign these unanchored contigs to specific centromeres, scoring polymorphic markers in 48 plants with crossovers informative for the entire genome (Copenhaver et al., 1998). In this manner, several centromeric contigs were connected to the physical maps of the chromosome arms (see EXAMPLE 6 and Table 4), and a large set of DNA markers defining centromere boundaries was generated. DNA sequence analysis confirmed the structure of the contigs for chromosomes II and IV (Lin et al., 1999).
CEN2 and CEN4 were selected in particular for analysis. Both reside on structurally similar chromosomes with 3.5 Mb rDNA arrays on their distal tips, with regions measuring 3 and 2 Mb, respectively, between the rDNA and centromeres, and 16 and 13 Mb regions on their long arms (Copenhaver and Pikaard, 1996).

The virtually complete and annotated sequence of chromosomes II and IV was used to conduct an analysis of centromeres at the nucleotide level (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html). The sequence composition was analyzed within the genetically-defined centromere boundaries and compared to the adjacent pericentromeric regions (FIGS. 12A-T). Analysis of the two centromeres facilitated comparisons of sequence patterns and identification of conserved sequence elements.
The centromere sequences were found to harbour 180 bp repeat sequences. These sequences were found to reside in the gaps of each centromeric contig (FIG. 3, FIGS. 12B, 12L), with few repeats and no long arrays elsewhere in the genome. BAC clones near these gaps have end sequences corresponding to repetitive elements that likely constitute the bulk of the DNA between the contigs, including 180 bp repeats, 5S rDNA or 160-bp repeats (FIG. 3). Fluorescent in situ hybridization has shown these repetitive sequences are abundant components of Arabidopsis centromeres (Murata et al., 1997; Heslop-Harrison et al., 1999; Brandes et al., 1997). Genetic mapping and pulsed-field gel electrophoresis indicate that many 180 bp repeats reside in long arrays measuring between 0.4 and 1.4 Mb in the centromeric regions (Round et al., 1997); sequence analysis revealed additional interspersed copies near the gaps. The inventors specifically contemplate the use of such 180 bp repeats for the construction of minichromosomes.
The annotated sequence of chromosomes II and IV identified regions with homology to middle repetitive DNA, both within the functional centromeres and in the adjacent regions (FIGS. 12B-12E and 12L-12O).
In a 4.3 Mb sequenced region that includes CEN2 and a 2.8 Mb sequenced region that includes CEN4, retrotransposon homology was found to account for >10% of the DNA sequence, with a maximum of 62% and 70%, respectively (FIGS. 12C, 12M).
Sequences with similarity to transposons or middle repetitive elements were found to occupy a similar zone, but were less common (29% and 11% maximum density for chromosomes II and IV, respectively) (FIGS. 12D-12E and FIGS. 12N-12O). Finally, unlike in the case of Drosophila and Neurospora centromeres (Sun et al., 1997; Cambareri et al., 1998), low complexity DNA, including microsatellites, homopolymer tracts, and AT-rich isochores, was not found to be enriched in the centromeres of Arabidopsis. Near CEN2, simple repeat sequence densities were comparable to those on the distal chromosome arms, occupying 1.5% of the sequence within the centromere, 3.2% in the flanking regions, and ranging from 20 to 319 bp in length (71 bp on average).
Except for an insertion of mitochondrial DNA at CEN2, the DNA in and around the centromeres did not contain any large regions that deviated significantly from the genomic average of ~64% A + T (FIGS. 12F, 12P) (Bevan et al., 1999).
Unlike the 180 bp repeats, all other repetitive elements near CEN2 and CEN4 were less abundant within the genetically-defined centromeres than in the flanking regions. The high concentration of repetitive elements outside of the functional centromere domain suggests they may be insufficient for centromere activity.
Thus, identifying segments of the Arabidopsis genome that are enriched in these repetitive sequences does not pinpoint the regions that provide centromere function; a similar situation may occur in the genomes of other higher eukaryotes.
The repetitive DNA flanking the centromeres may play an important role, forming an altered chromatin conformation that serves to nucleate or stabilize centromere structure. Alternatively, other mechanisms could result in the accumulation of repetitive elements near centromeres. Though evolutionary models predict repetitive DNA accumulates in regions of low recombination (Charlesworth et al., 1986; Charlesworth et al., 1994), many Arabidopsis repetitive elements are more abundant in the recombinationally active pericentromeric regions than in the centromeres themselves.
Instead, retroelements and other transposons may preferentially insert into regions flanking the centromeres or be eliminated from the rest of the genome at a higher rate.
Genetic Mapping of Centromeres

To map centromeres, F1 plants which were heterozygous for hundreds of polymorphic DNA markers were generated by crossing quartet mutants from the Landsberg and Columbia ecotypes (Chang et al., 1988; Ecker, 1994; Konieczny and Ausubel, 1993). In tetrads from these plants, genetic markers segregate in a 2:2 ratio (FIG. 6; Preuss et al., 1994). The segregation of markers was then determined in plants which were generated by crossing pollen tetrads from the F1 plants onto a Landsberg homozygote. The genotype of the pollen grains within a tetrad was inferred from the genotype of the progeny. Initially, seeds were generated from greater than 100 successful tetrad pollinations, and tissue and seeds were collected from 57 of these.
This provided sufficient material for PCR, as well as seeds necessary for producing the large quantities of tissue required for Southern hybridization and RFLP mapping. In order to obtain a more precise localization of the centromeres, the original tetrad population was increased from 57 tetrads to over 1,000 tetrads.
PCR analysis was performed to determine marker segregation. To account for the contribution of the Landsberg background from the female parent, one Landsberg complement from each of the four tetrad plants was subtracted. As shown in FIG. 5, markers from sites spanning the entire genome were used for pair-wise comparisons of all other markers. Tetratypes indicate a crossover between one or both markers and their centromeres, whereas ditypes indicate the absence of crossovers (or presence of a double crossover).
Thus, at every genetic locus, the resulting diploid progeny was either L/C or C/C.
The map generated with these plants is based solely on male meioses, unlike the existing map, which represents an average of recombination in both males and females.
Therefore, several well-established genetic distances were recalculated, which will determine whether recombination frequencies are significantly altered.
The large quantities of genetic data generated by the analysis must be compared pair-wise to perform tetrad analysis. All of the data were managed in a Microsoft Excel spreadsheet format, assigning Landsberg alleles a value of "1" and Columbia alleles a value of "0". Within a tetrad, the segregation of markers on one chromosome was compared to centromere-linked reference loci on a different chromosome (see Table 2 below). Multiplying the values of each locus by an appropriate reference, and adding the results for each tetrad, easily distinguished PD, NPD, and TT tetrads with values of 2, 0, and 1, respectively.
Monitoring the position of crossovers in this population identified chromosomal regions that could be separated by recombination from centromeres (tetratype), as well as regions that always cosegregated with centromeres (ditype) (Copenhaver et al., 1998; Copenhaver et al., 1999). Tetratype frequencies decrease to zero at the centromere; consequently, centromere boundaries were defined as the positions that exhibited small but detectable numbers of tetratype patterns. By scoring the segregation of centromere-linked markers in approximately 400 tetrads, centromeres 1-5 were localized to regions on the physical map corresponding to contigs of 550, 1445, 1600, 1790 and 1770 kb, respectively (FIG. 3). Additionally, for each centromeric interval, a number of useful recombinants were identified. The results of the analysis indicated that centromeres reside within large domains that restrict recombination machinery activity and that the transition between these domains and the surrounding recombination-proficient DNA is markedly abrupt.
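The boundary definition above (the nearest positions that still show a small number of tetratypes, flanking a region with none) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `centromere_boundaries`, the input layout, and the marker names and counts in the example are hypothetical and are not data from this analysis. The sketch also assumes the zero-tetratype markers form one contiguous run.

```python
def centromere_boundaries(markers):
    """Return the markers flanking the zero-tetratype (ditype-only) region.

    markers: list of (name, tetratype_count) tuples ordered along one
    chromosome. Hypothetical helper; assumes one contiguous zero run.
    """
    zero = [i for i, (_, tt) in enumerate(markers) if tt == 0]
    if not zero:
        raise ValueError("no marker cosegregates completely with the centromere")
    first, last = zero[0], zero[-1]
    if zero != list(range(first, last + 1)):
        raise ValueError("zero-tetratype markers are not contiguous")
    # Boundaries are the nearest markers that still show tetratypes.
    left = markers[first - 1][0] if first > 0 else None
    right = markers[last + 1][0] if last + 1 < len(markers) else None
    return left, right


# Hypothetical marker names and tetratype counts, for illustration only.
example = [("m1", 12), ("m2", 3), ("m3", 0), ("m4", 0), ("m5", 2), ("m6", 9)]
print(centromere_boundaries(example))  # ('m2', 'm5')
```

Scoring more tetrads can only move these boundaries inward, because a marker that shows no tetratypes in a small sample may show one in a larger sample.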
Table 2: Scoring protocol for tetratypes.

Individual members   Locus 1 x         Locus 2 x         Locus 3 x
of a tetrad          Reference Locus   Reference Locus   Reference Locus
A                    1 x 1 = 1         0 x 1 = 0         0 x 1 = 0
B                    1 x 1 = 1         0 x 1 = 0         1 x 1 = 1
C                    0 x 0 = 0         1 x 0 = 0         0 x 0 = 0
D                    0 x 0 = 0         1 x 0 = 0         1 x 0 = 0
Sum                  2                 0                 1
                     PD                NPD               TT
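The multiply-and-sum scoring of Table 2 can be sketched in a few lines of Python. This is a minimal illustration, not the inventors' spreadsheet: the function name `classify_tetrad` and the list-based input are hypothetical, and the sketch assumes that both loci segregate 2:2 within the tetrad, with Landsberg alleles coded 1 and Columbia alleles coded 0.

```python
def classify_tetrad(locus, reference):
    """Classify one tetrad as PD, NPD, or TT from 0/1 allele codes.

    locus, reference: sequences of four 0/1 values (1 = Landsberg,
    0 = Columbia), one per tetrad member, in the same member order.
    Hypothetical helper; assumes 2:2 segregation at both loci.
    """
    for alleles in (locus, reference):
        if len(alleles) != 4 or any(a not in (0, 1) for a in alleles):
            raise ValueError("expected four allele codes, each 0 or 1")
        if sum(alleles) != 2:
            raise ValueError("expected 2:2 segregation")
    # Multiply each member's locus value by its reference value, then add.
    score = sum(l * r for l, r in zip(locus, reference))
    return {2: "PD", 0: "NPD", 1: "TT"}[score]


reference = [1, 1, 0, 0]  # centromere-linked reference locus, members A-D
print(classify_tetrad([1, 1, 0, 0], reference))  # Locus 1 of Table 2: PD
print(classify_tetrad([0, 0, 1, 1], reference))  # Locus 2 of Table 2: NPD
print(classify_tetrad([0, 1, 0, 1], reference))  # Locus 3 of Table 2: TT
```

The 2:2 check matters: without it, a mis-scored tetrad could produce a sum that looks like a valid class.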
Analysis of polymorphisms corresponding to 180 bp repeats (RCEN markers; Round et al., 1997) confirmed that these repeats map within the genetically-defined centromeres. Polymorphisms associated with the 180 bp repeats were analyzed by pulsed-field gel electrophoresis as described previously (Round et al., 1997).
Segregation of these polymorphisms in tetrads with informative crossovers confirmed complete linkage of a 180 bp repeat array at each centromere. In genetic units, the centromere intervals averaged 0.44 cM (% recombination = 1/2 tetratype frequency), reflecting recombination rates at least 10-30 fold below the genomic average of 221 kb/cM (Somerville and Somerville, 1999; http://nasc.nott.ac.uk/new_ri_map.html).
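The relation quoted above (% recombination = 1/2 tetratype frequency) and the comparison against 221 kb/cM can be written out as a short calculation. The Python sketch below is illustrative only: the function names and the example counts and interval length are hypothetical and are not data from this study.

```python
def map_distance_cm(n_tetratype, n_tetrads):
    """Genetic distance in cM: 1/2 x tetratype frequency x 100."""
    if n_tetrads <= 0 or not 0 <= n_tetratype <= n_tetrads:
        raise ValueError("need n_tetrads > 0 and 0 <= n_tetratype <= n_tetrads")
    return 50.0 * n_tetratype / n_tetrads


def kb_per_cm(interval_kb, distance_cm):
    """Physical length per unit of genetic distance for an interval."""
    if distance_cm <= 0:
        raise ValueError("distance_cm must be positive")
    return interval_kb / distance_cm


# Hypothetical counts: 4 tetratype tetrads among 400 scored tetrads.
d = map_distance_cm(4, 400)    # 0.5 cM
ratio = kb_per_cm(1000.0, d)   # 2000.0 kb/cM for a hypothetical 1000 kb interval
fold = ratio / 221.0           # fold difference relative to 221 kb/cM
print(d, ratio, fold)
```

A larger kb/cM value means less recombination per unit of physical length, so the fold value expresses how strongly recombination is suppressed relative to the genomic average.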
The low recombination frequencies typically observed near higher eukaryotic centromeres may be due to DNA modifications or unusual chromatin states (Choo, 1998; Puechberty, 1999; Mahtani and Willard, 1998; Charlesworth et al., 1986; Charlesworth et al., 1994). To modify these states, and thus improve centromere mapping resolution by raising recombination frequencies, F1 Landsberg/Columbia plants were treated with one of a series of compounds known to cause DNA damage, modify chromatin structure, or alter DNA modifications. F1 Landsberg qrt1 / Columbia qrt1 plants were grown under 24 hour light in 1" square pots and treated with methanesulfonic acid ethyl ester (0.05%), 5-aza-2'-deoxycytidine (25 or 100 mg/l), Zeocin (1 ug/ml), methanesulfonic acid methyl ester (75 ppm), cis-diamminedichloro-platinum (20 ug/ml), mitomycin C (10 mg/l), n-nitroso-n-ethylurea (100 uM), n-butyric acid (20 uM), trichostatin A (10 uM), or 3-methoxybenzamide (2 mM). Plants were watered and flower-bearing stems were immersed in these solutions. Alternatively, plants were exposed to 350 nm UV (7 or 10 seconds), or heat shock (38 or 42°C for 2 hours). Pollen tetrads from these plants were used to pollinate Landsberg stigmas 3-5 days after each treatment; the F1 plants were subsequently subjected to additional treatments (up to 5 times per plant, every 3-5 days).
Tetrads from treated plants were crossed to Landsberg stigmas, and progeny from 8-107 tetrads subjected to each treatment were recovered and analyzed, yielding >600 additional tetrads. These tetrads exhibited higher recombination in regions immediately flanking the centromeres (1.6 vs. 3.4% recombination in untreated and treated plants, respectively), although the sample size was insufficient to determine if any individual treatment had a profound effect. The map locations of centromeres were refined on chromosomes 2 to 5 (FIG. 1), yielding intervals spanned by contigs of 880, 1150, 1260, and 1070 kb, respectively, with all tetrads consistently localizing centromere functions to the same region (Copenhaver et al., 1999).
Efforts to increase recombination yielded a large number of tetrads with crossovers near the centromeres; these crossovers clustered within a narrow region at the centromere boundaries. Five crossovers occurred over a 70 kb region near CEN2, and 7 over a 200 kb region near CEN1, yet no crossovers were detected in the adjacent centromeric intervals of 880 and 550 kb, respectively (FIG. 3). Thus, the centromeres were found within large domains that restrict recombination machinery activity; the transition between these domains and surrounding, recombination-proficient DNA is remarkably abrupt (FIG. 12A and K). Although analysis of more tetrads would yield additional recombination events, the observed distribution of crossovers indicates that centromere positions would not be significantly refined.

Sequence Analysis of Arabidopsis Centromeres

A. Abundance of genes in the centromeric regions

Expressed genes are located within 1 kb of essential centromere sequences in S. cerevisiae, and multiple copies of tRNA genes reside within an 80 kb fragment necessary for centromere function in S. pombe (Kuhn et al., 1991). In contrast, genes are thought to be relatively rare in the centromeres of higher eukaryotes, though there are notable exceptions. The Drosophila light, concertina, responder, and rolled loci all map to the centromeric region of chromosome 2, and translocations that remove light from its native heterochromatic context inhibit gene expression. In contrast, many Drosophila and human genes that normally reside in euchromatin become inactive when they are inserted near a centromere. Thus, genes that reside near centromeres likely have special control elements that allow expression (Karpen, 1994; Lohe and Hilliker, 1995). The sequences of Arabidopsis CEN2 and CEN4, provided herein, provide a powerful resource for understanding how gene density and expression correlate with centromere position and associated chromatin.
Annotation of chromosomes II and IV (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html) identified many genes within and adjacent to CEN2 and CEN4 (FIG. 8, FIGS. 12A-12T). The density of predicted genes on Arabidopsis chromosome arms averages 25 per 100 kb, and in the repeat-rich regions flanking CEN2 and CEN4 this decreases to 9 and 7 genes per 100 kb, respectively (Bevan et al., 1999). Many predicted genes also reside within the recombination-deficient, genetically-defined centromeres. Within CEN2, there were predicted genes per 100 kb; while CEN4 was strikingly different, with 12 genes per 100 kb.
There was strong evidence that several of the predicted centromeric genes are transcribed. The phosphoenolpyruvate gene (CUE1) defines one CEN5 border; mutations in this gene cause defects in light-regulated gene expression (Li et al., 1995). Within the sequenced portions of CEN2 and CEN4, 17% (27/160) of the predicted genes shared >95% identity with cloned cDNAs (ESTs), with three-fold more matches in CEN4 than in CEN2 (http://www.tigr.org/tdb/ac/agad/). Twenty-four of these genes have multiple exons, and four correspond to single-copy genes with known functions. A list of the predicted genes identified is given in Table 3, below. Additional genes encoded within the boundaries of CEN4 are listed in Table 4. The identification of these genes is significant in that the genes may themselves contain unique regulatory elements or may reside in genomic locations flanking unique control or regulatory elements involved in centromere function or gene expression. In particular, the current inventors contemplate use of these genes, or DNA sequences 0 to 5 kb upstream or downstream of these sequences, for insertion into a gene of choice in a minichromosome. It is expected that such elements could potentially yield beneficial regulatory controls of the expression of these genes, even when in the unique environment of a centromere.
To investigate whether the remaining 23 genes were uniquely encoded at the centromere, a search was made in the database of annotated genomic Arabidopsis sequences. With the exception of two genes, no homologs with >95% identity were found elsewhere in the 80% of the genome that has been sequenced. The number of independent cDNA clones that correspond to a single-copy gene provides an estimate of the level of gene expression. On chromosome II, predicted genes with high quality matches to the cDNA database (>95% identity) match an average of four independent cDNA clones (range 1-78). Within CEN2 and CEN4, 11/27 genes exceed this average (Table 3). Finally, genes encoded at CEN2 and CEN4 are not members of a single gene family, nor do they correspond to genes predicted to play a role in centromere functions, but instead have diverse roles.
Many genes in the Arabidopsis centromeric regions are nonfunctional due to early stop codons or disrupted open reading frames, but few pseudogenes were found on the chromosome arms. Though a large fraction of these pseudogenes have homology to mobile elements, many correspond to genes that are typically not mobile (FIGS. 12I-J and FIGS. 12S-T). Within the genetically-defined centromeres there were 1.0 (CEN2) and 0.7 (CEN4) of these nonmobile pseudogenes per 100 kb; the repeat-rich regions bordering the centromeres have 1.5 and 0.9 per 100 kb, respectively. The distributions of pseudogenes and transposable elements are overlapping, indicating that DNA insertions in these regions contributed to gene disruptions.
Table 3: Predicted genes within CEN2 and CEN4 that correspond to the cDNA database.

Putative function                               GenBank protein accession   # of EST matches*
Unknown                                         AAC69124                    1
SH3 domain protein                              AAD15528                    5
Unknown                                         AAD15529                    1
unknown†                                        AAD37022                    1
RNA helicase‡                                   AAC26676                    2
40S ribosomal protein S16                       AAD22696                    9
Unknown                                         AAD36948                    1
Unknown                                         AAD36947                    4
leucyl tRNA synthetase                          AAD36946                    4
aspartic protease                               AAD29758                    6
Peroxisomal membrane protein (PPM2)§            AAD29759                    5
5'-adenylylsulfate reductase§                   AAD29775                    14
symbiosis-related protein                       AAD29776                    3
ATP synthase gamma chain 1 (APC1)§              AAD489                      3
protein kinase and EF hand                      AAD03453                    3
ABC transporter                                 AAD03441                    1
Transcriptional regulator                       AAD03444                    14
Unknown                                         AAD03446                    12
human PCF11p homolog                            AAD03447                    6
NSF protein                                     AAD17345
1,3-beta-glucan synthase                        AAD48971                    2
pyridine nucleotide-disulphide oxidoreductase   AAD48975                    4
Polyubiquitin (UBQ11)§                          AAD48980                    72
wound induced protein                           AAD48981                    6
short chain dehydrogenase/reductase             AAD489_59                   7
SL15†                                           AAD48939
WD40-repeat protein                             AAD48948

* Independent cDNAs with >95% identity. † related gene present in non-centromeric DNA. ‡ potentially associated with a mobile DNA element. § characterized gene (B. Tugal, 1999; J.F. Gutierrez-Marcos, 1996; N. Inohara, 1991; J. Callis, 1995).
Table 4: List of additional genes encoded within the boundaries of CEN4.

Putative Function                                         GenBank accession   Nucleotide Position
3'(2'),5'-Bisphosphate Nucleotidase                       AC012392            71298 - 73681
Transcriptional regulator                                 AC012392            80611 - 81844
Equilibrative nucleoside transporter                      AC012392            88570 - 90739
Equilibrative nucleoside transporter 1                    AC012392            94940 - 96878
Equilibrative nucleoside transporter 1                    AC012392            98929 - 101019
Equilibrative nucleoside transporter 1                    AC012392            113069 -
unknown                                                   AC012392            122486 - 124729
4-coumarate--CoA ligase                                   AC012392            126505 - 128601
ethylene responsive protein                               AC012392            130044 -
Oxygen-evolving enhancer protein precursor                AC012392            134147 - 135224
Kinesin                                                   AC012392            137630 - 141536
receptor-like protein kinase                              AC012392            141847 - 144363
LpxD-like protein                                         AC012392            144921 - 146953
hypersensitivity induced protein                          AC012392            147158 -
ubiquitin                                                 AC012392            149057 - 149677
unknown                                                   AC012392            150254 - 15
ubiquitin-like protein                                    AC012392            153514 - 154470
ubiquitin-like protein                                    AC012392            155734 - 156513
ubiquitin-like protein                                    AC012392            156993 - 157382
unknown                                                   AC012392            159635 - 165559
unknown                                                   AC012392            166279 - 166920
unknown                                                   AC012392            167724 - 170212
ubiquitin-like protein                                    AC012392            176819 - 178066
polyubiquitin (UBQ10)                                     AC012392            180613 -
phosphatidylinositol-3,4,5-triphosphate binding protein   AC012477            893Ei4 - 91291
Mitochondrial ATPase                                      AC012477            94302 - 94677
RING-H2 finger protein                                    AC012477            9?2 - 96142
unknown                                                   AC012477            104717 - 105196
Mitochondrial ATPase                                      AC012477            10i7~8 - 10659
ferredoxin--NADP+ reductase                               AC012477            1074 1 - 109095
unknown                                                   AC012477            109868 - 1 U3
snoRNP-associated protein                                 AC012477            111841 -
UV-damaged DNA binding factor                             AC012477            114900 - 121275
Glucan endo-1,3-Beta-Glucosidase precursor                AC012477            122194 - 122895
D123-like protein                                         AC012477            125886 - 126887
Adrenodoxin Precursor                                     AC012477            127660 - 129246
N7 like-protein                                           AC012477            129718 - 131012
N7 like-protein                                           AC012477            131868 - 133963
N7 like-protein                                           AC012477            1342 I ~
N7 like-protein                                           AC012477            13966 - 140864

characterized gene (J. Callis, 1995).

B. Conservation of centromeric DNA

To investigate the conservation of CEN2 and CEN4 sequences, PCR primer pairs were designed that correspond to unique regions in the Columbia sequence and used to survey the centromeric regions of Landsberg and Columbia at ~20 kb intervals (FIGS. 14A, B). The primers used for the analysis are listed in FIGS. 15A, B. Amplification products of the appropriate length were obtained in both ecotypes for most primer pairs (85%), indicating that the amplified regions were highly similar. In the remaining cases, primer pairs amplified Columbia, but not Landsberg DNA, even at very low stringencies. In these regions, additional primers were designed to determine the extent of nonhomology. In addition to a large insertion of mitochondrial DNA in CEN2, two other non-conserved regions were identified (FIGS. 14A, B). Because this DNA is absent from Landsberg centromeres, it is unlikely to be required for centromere function; consequently, the relevant portion of the centromeric sequence is reduced to 577 kb (CEN2) and 1250 kb (CEN4). The high degree of sequence conservation between Landsberg and Columbia centromeres indicated that the inhibition of recombination frequencies was not due to large regions of nonhomology, but instead was a property of the centromeres themselves.
C. Sequence similarity between CEN2 and CEN4

In order to discern centromere function, a search was conducted for novel sequence motifs shared between CEN2 and CEN4, excluding from the comparison retroelements, transposons, characterized centromeric repeats, and coding sequences resembling mobile genes. After masking simple repetitive sequences, including homopolymer tracts and microsatellites, contigs of unique sequence measuring 417 kb and 851 kb for CEN2 and CEN4, respectively, were compared with BLAST (http://blast.wustl.edu).
The comparison showed that the complex DNA within the centromere regions was not homologous over the entire sequence length. However, 16 DNA segments in CEN2 matched 11 regions in CEN4 with >60% identity (FIG. 16). The sequences were grouped into families of related sequences, and were designated AtCCS1-7 (Arabidopsis thaliana centromere conserved sequences 1-7). These sequences were not previously known to be repeated in the Arabidopsis genome. The sequences comprised a total of 17 kb (4%) of CEN2 DNA, had an average length of 1017 bp, and had an A + T content of 65%. Based on similarity, the matching sequences were sorted into groups, including two families containing S sequences each (AtCCS1 and AtCCS2: SEQ ID NOS:1-14), 3 sequences from a small family encoding a putative open reading frame (AtCCS3: SEQ ID NOS:21-22), and 4 sequences found once within the centromeres (AtCCS4-AtCCS7: SEQ ID NOS:15-20), one of which (AtCCS6: SEQ ID NO:17) corresponds to predicted CEN2 and CEN4 proteins with similarity throughout their exons and introns (FIG. 16).

Searches of the Arabidopsis genomic sequence database demonstrated that AtCCS1 - AtCCS5 were moderately repeated sequences that appear in centromeric and pericentromeric regions. The remaining sequences were present only in the genetically-defined centromeres. Similar comparisons of all 16 S. cerevisiae centromeres defined a consensus consisting of a conserved 8 bp CDEI motif, an AT-rich 85 bp CDEII element, and a 26 bp CDEIII region with 7 highly conserved nucleotides (Fleig et al., 1995). In contrast, surveys of the three S. pombe centromeres revealed conservation of overall centromere structure, but no universally conserved motifs (Clark, 1998).

Mapping Results: Arabidopsis Chromosomes 1-5

The centromere on chromosome 1 was mapped between mi342 (56.7 cM) and T27K12 (59.1 cM). A more refined position places the centromere between the markers T22C23-t7 (~58.5 cM) and T3P8-sp6 (~59.1 cM). Contained within this interval are the previously described markers EKR1V and RCEN1.
The centromere on chromosome 2 was mapped between mi310 (18.6 cM) and x4133 (23.8 cM). A more refined position places the centromere between the markers FSJ15-sp6 (~19.1 cM) and T15D9 (~19.3 cM). The following sequenced (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html) BAC (bacterial artificial chromosome) clones are known to span the region between the markers FSJ15-sp6 and T15D9: T13E11, F27C21, F9A16, T5M2, T17H1, T18C6, T5E7, T12J2, F27B22, T6C20, T14C8, F7B19, and T15D9.
There is a gap in BAC coverage between T12J2 and F27B22. RARE cleavage, pulsed-field gels, or DNA sequence tiling will be used to isolate DNA in the gap for sequencing.

The centromere on chromosome 3 was mapped between atpox (48.6 cM) and ATA (53.5 cM). A more refined position places the centromere between the markers T9G9-sp6 (~53.1 cM) and TAM14-sp6 (~53.3 cM). Contained within this interval is the previously described marker RCEN3.
The centromere on chromosome 4 was mapped between mi233 (18.8 cM) and mi167 (21.5 cM). A more refined position places the centromere between the markers T24H24.30k3 (~20.3 cM) and F13H14-t7 (~21.0 cM). The following sequenced (http://www.ncbi.nlm.nih.gov/Entrez/nucleotide.html) BAC (bacterial artificial chromosome) clones are known to span the region between the markers FSJ15-sp6 and T6A13-sp6: T27D20, T19B17, T26N6, F4H6, T19J18, T4B21, T1J1, T32N4, C17L7, C6L9, F6H8, F21I2, F14G16, and F28D6.
There is a gap in BAC coverage between F2I12 and F14G16. RARE cleavage, pulsed-field gels, or DNA sequence tiling will be used to isolate DNA in the gap for sequencing.
The centromere on chromosome 5 was mapped between nga76 (71.6 cM) and PhyC (74.3 cM). A more refined position places the centromere between the markers F13K20-t7 (~69.4 cM) and CUE1 (~69.5 cM). Contained within this interval are the publicly available markers um579D, mi291b, and CMS1.
A table listing the BAC clones known to reside within the centromeres on chromosomes I-V, as well as GenBank accession numbers for the sequences of these clones, is given below, in Table 5 and Table 6.
Genetic positions (i.e., cM values) correspond to the Lister and Dean Recombinant Inbred Genetic map, available on-line at http://nasc.nott.ac.uk/new_ri_map.html. Markers are available at http://genome-www.stanford.edu/Arabidopsis/aboutcaps.html.
Table 5: BAC clones residing within A. thaliana centromeres and associated GenBank accession numbers

CENTROMERE 1 GENBANK ACCESSION #

F24P1 823044' F13J4 AL08696T and AL086966*

F7G10 AL083686' and AL083685*

F17A20 823767*

F13G14 AL086828* and AL086827' F13018 AL087175* and AL087174' F24A15 AQ011599* and B98125* and B98124*

F25C4 823065*and 823064' F3A6 none F2007 B22665 and B22664*

F16K23 897718' and 825748' and B23714*

F8L2 AL084364'and AL084363' F6C2 AL083089' and AL083088' F1H9 AL080601"and AL080600' F27022 AQ01 1488* and B25518*

F15P3 B97045* and 822971' and B22970*

F2406 B23041*

F20P22 AQ251396* and AQ251287' F2C1 AL081001' and AL081000' F 15F 11 B23547*

F1 F24 AL080554* and AL080553*

F6J1 AL083277*

F26H20 none F16J24 none F19M18 AQ011034*

F20K7 AG~251392* and AQ251282*

F 1266 AC007781 t F23F21 none F28G17 none F28G 13 none F27A14 AQ251243* and AG~251137' F28G9 B23346* and 823345' F21 F1 B95997* and B22704*

F16K24 B97719* and B25749*

F20C15 AQ251381' and AG~251272*

F9G18 AL084752* and AL084751* and B26534*

F10G23 AL085268* and AL085267*

F22016 AQ250131' and AQ249777* and B96460' and B96459* and B12588* and B08235*

~F23P24 AQ011594* and B98116* and 8981 15' !F24A9 AQ010513* and B96134* and B96133*

F26B21 825313*

I F28019 B25706*

j F 19J21 AQ011011 ' F28E 13 B25592* and 825591 ' F24G 19 828443*

F15H9 822577' and 822576' F28A11 825540' F26N17 825374' F15J24 AQ011049* and 897568' F25J4 B23109* and 823108' F28P16 AQ011538* and 825713' F12E11 AL086267* and AL086266*

F28G8 823344' and 823343' F22L3 B22875* and B22874*

F25C2 B23063*

F22B13 829456'and B28433*

F13114 AL086945* and AL086944*

F11 L16 AL085969' and AL085968*

F25B1 B23057* and 823056' F26H18 AQ010880* and AQ010879*

F20P4 822672' F11 K13 AL085923* and AL085922*

F 1965 AQ251104' F15F7 822200'and 822199' F16C4 898549' and 898548' and 823399' GAP

F 19M 16 AQ011032' F22M21 896432' and 896431 *

F27K16 none F21K24 897937' F13P3 AL087187' and AL087186*

F15P18 none F28G 19 825637' F5E5 I AL082645* and AL082644*

F5K9 AL082841' and AL082840*

F5E 12 AL082657' and AL082656*

F21 N 15 861476' F5L13 AL082880* and AL082879*

F17L20 B23905* and B23904*

F14K1 AL087586* and AL087585*

F16J4 B98573*

F15M18 none F14116 AL087535' and AL087534*

F21K13 none F16E23 none F14~5 AL087748' and AL087747*

F20G9 822553' and B22552*

F27119 AQ01 1427' and B25464*

F1118 AL080658' F16C8 898552' and B22985*

F2001 822655' F13H12 AL086902' and AL086901*

F13B12 none F27D7 none I F21 B 16 824625' F8F1 AL084170' and AL084169*

F9A 12 none F22111 824855' and B24854*

~F16N17 825774' and B23737*

F 17H 11 823833' 'F15A12 none F20M21 none F19E19 824191' F25015 825275' and 825274' F27J13 AQ011435' and 825468*

F15J7 822603*and B22602*

F13J1 AL086961* and AL086960*

F9D18 AC007183t F9M8 AL084923*

F519 AL082775*

F3L22 AL081822* and AL081821 F5P23 AL083021*

F10023 AL085527' and AL085526' F20J1 AQ010790* and 822625' F7K22 AL083828' and AL083827*

F6J23 AL083299* and AL083298*

T4121 AC022456t F116 AL080639' and AL080638*

F28B8 AQ010984"

F20B1 B22488* and B22487*

F26F14 None F18C13 B28362*

F20K13 AQ011116* and 824519"

F10K7 AL085379*

F12B23 AL086177' F9121 AL084816" and AL084815"

F17120 B23850* and 823849' GAP

T15D9 AC007120 _ F6H5 AL083229' and AL083228*

F6G 13 AL083215* and AL083214*

F21 G23 B97922* and B24664*

F3J24 B19129* and B12732*

F2517 AQ010570* and 823104' F14012 B22064* and B22004*

F1010 AL080869* and AL080868*

F11N16 AL086039* and AL086038*

F19M19 none F301 AL081890* and AL081889*

F1 D9 B21602* and 821631 ' and AQ248831 * and AL080449' and AL080450*

F8F8 none F23A15 none F201 AL081375* and AL081374*

F711 AL083741 * and AL083740*

F25D24 B25156* and B25155*

F10L19 AL085429' and AL085428*

F28J14 825860*and 825859*

F17D19 823796* and B23795*

F2701 1 B25508*

F27P23 AO011498* and 825537' F11N11 B28323* and 828322' F 16117 B97693*

GAP

F1L15 AL080750* and AL080749*

F2A9 AL080941 ' and AL080940*

F2D1 AL081028* and AL081027*

F2D22 AL081046' and AL081045*

F208 AL081387* and AL081386*

F2014 AL081393' and AL081392*

F3G24 none F9A7 AL084546* and AL084545*

F10N9 AL085473* and AL085472 *

T1415 AL088212* and AL088211'B19832* and B19707*
and T1J6 AL088233* and AL088232*B19834* and 819709' and T2G 13 AL088663' and AL088662*

T6D10 AL090573* and AL090572*B27383* and B27382*
and and 819977" and B19790*

T7K14 AL091315' and B27422*
and 827421 * and B20115*
and B19895' T8012 B21405* and B21348"

T9J24 AL092268* and AL092267*820132" and 81991 and 1 ' T9K2 AL092269' and B20133*
and B19912*

T10F10 AL092618* and AL092617' and B20076* and B19918*

T15N4 AL095818' and AL095817* and B20044* and 819856' T16C1 AL095981 * and AL095980' T16F22 AL096108*

T16M9 AL096289' and AL096288* and B20053* and 819865' T18P7 860875' and 860874' T21124 B62398* and 820320" and B20288*

T22E7 861351 * and 820426' and B20394*

T2419 B67385* and B67384* and 820450' and B20419*

T2405 B67422* and 820454' T25C15 AQ225286* and B67937* and 820460 T25 F 15 AC009529t T26J6 B76816* and 876815' T28G 19 AC009328t GAP

F6K8 AL083310* and AL083309' F25M24 825253' F25F9 823085' F28F20 825620"

~F16C22 897681 * and 823646' -!.. -.

F24M20 825096' F27B5 823236' and 823235' F21 A 14 AC016828t T18B3 AC011624t F12P5 AL086610* and AL086609' F22N7 AQ251226*

F21 N 12 B24707*

F7N6 none F12E16 none F21J13 AQ251199* and AQ011170*

F9B18 AL084600* and AL084599' F20J23 AQ011113* and 824515' F1 G6 AL080561 * and AL080560' and AQ251107 *

F704 AL083940* and AL083939' F1 D4 AL080441 * and AL080440' and B22163*

F19P10 AQ251376* and AQ251268' F4P10 AL082481 *and AL082480' F9123 AL084818* and AL084817' IF3118 AL081711 * and AL081710*

F13K14 AL087018* and AL087017' F13K8 AL087008* and AL087007' F13J3 AL086965* and AL086964' F20F5 822533' F1 K22 AL080723* and AL080722*

F3H 19 AL081679* and AL081678' F23M13 B98039*

-1~1-F23N10 B98054* and B98053*

F8M14 AL084410* and AL084409' F7C16 AL083567* and AL083566*

F26D5 none F10J2 AL085340*

F16L6 B23418*

F26P16 B25396* and B25395*

GAP

F28D17    none
F27E12    AQ251248* and AQ251142* and AQ011376* and AQ011375*
F4M19     AL082399* and AL082398*
F3O21     AL081924* and AL081923*
F3I14     AL081705* and AL081704*
F20C5     AQ251382* and AQ251273*
F14B7     AL087267* and AL087266*
F14K13    AL087604* and AL087603*
F21L14    B97938* and B24690*
F23O12    B98080* and B98079*
F14G1     AL087450* and AL087449*
F19I17    AQ225333*
F7C3      AL083548* and AL083547*
F4I11     AL082258* and AL082257*
F7J17     AL083789* and AL083788*
F18L6     B22332* and B22331*
F16N18    B25775*
F28J6     B23358*
F7C6      AL083554* and AL083553*
F28C1     B23304* and B23303*
F18I17    B24063*
F10P16    AL085555* and AL085554*
F24G17    none
F4K4      AL082320* and AL082319*
F26B15    B25309* and B25308*
F12P9     AL086614* and AL086613*
F8C3      AL084070* and AL084069*
F25D21    B25153* and B25152*
F27C7     AQ010648* and AQ010647* and B23240*
F23G13    none
F15B16    AL087857* and AL087856*
C17L7     none
C6L9      none
GAP

F3F24     AC018632
F13K20    AL087030* and AL087029*
F6L19     none
F18F14    B10562*
F22D5     AQ251214*
F12P18    none
F6C14     none
GAP
F28N5     B23377*
F2C13     none
F12P1     AL086602*
F9K2      AL084855*
F13D7     AL086757* and AL086756*
F4C11     AL082053* and AL082052*
F28G24    none
F7C4      AL083550* and AL083549*
F4B15     AL082023* and AL082022*
F19I11    AQ010999*
F3M22     AL081848* and AL081847*
F1M22     AL080803* and AL080802*
F21A22    B24614* and B24613*
F8P23     AL084535* and AL084534*
F17M7     B22216* and B22215*
F21B21    B24632*
F17G22    B23828* and B23827*
F11P4     AL086088* and AL086087*
F14J11    AL087566* and AL087565*
F7J19     AL083792* and AL083791*
F20G20    none
F27H14    AQ251251* and AQ251145*

F25E10    none
F24I23    B25815* and B25066*
T3D5      AL089085* and AL089084*
T17G5     AL096632* and AL096631
F20C16    B24433*
F27M22    none
F27K1     B23257*
F21N24    B61479* and B24716*
F11F13    AL085745* and AL085744*
F5O15     AL082980* and AL082979*
F8G15     AL084218* and AL084217*
F9A17     B12265* and B10646*
F25E19    none
F24C5     AQ010525* and AQ010524*
F27L2     AQ010708* and B96166*
F10A6     AL085056* and AL085055*
F23B23    AQ011184*
F1E3      AL080482* and AL080481* and B22171* and B22170*
GAP
F20J17    AQ011108* and B24510*
F21O22    B24736* and B24735*
F26O21    none
F25M11    B25245* and B25244*
F18F8     B26318* and B22290*
F17M12    B23910*
F22M20    B96430*
F9K6      AL084860*
F13J20    AL086992* and AL086991*
F12E24    AL086282* and AL086281
F26K6     AQ010623* and AQ010622*

F12L5     AL086477* and AL086476*
F11B6     AL085606* and AL085605*
F3N7      AL081864* and AL081863*
F10J11    none
F11F9     AL085739* and AL085738*
F3G22     AL081647* and AL081646*
F15E15    B23535*
F10K18    AL085397* and AL085396*
F5B20     AL082559* and AL082558*
F1F13     AL080535*
F26M13    none
F18D9     B26307* and B22283*
F28D1     B23312*
F13C19    AL086736* and AL086735*
F28I1     none
F26D1     B23180*
F16J19    B97706* and B25740*
F2D20     AL081042*
F22N6     B98712* and B98711
F27K3     AQ010703*
F19I24    AQ011005*
F19J19    none
F24E18    AQ011661* and AQ011660* and B25052*
F27K6     AQ010706* and AQ010705* and B96164* and B23259*
F25L7     AQ010583*
F28M5     B23516* and B23371
F18L3     none
F14C23    AL087326* and AL087325*
F11C6     AL085640* and AL085639*
F6O24     AL083442* and AL083441
F1M8      AL080782* and AL080781
F16J23    B97710* and B23709*
F18O9     B98639* and B98638* and B98691* and B22349*
F26L23    AQ011321* and AQ011320*
F3B13     AL081491* and AL081490*
F22D12    B24795*
F1G16     none
F10M21    AL085461*
F2A14     AL080946* and AL080945*
F13M20    AL087096* and AL087095*
F19J6     none
F9O15     AL085006* and AL085005*
F5A6      AL082510* and AL082509*
F17D12    B97751* and B23790*
F11C12    AL085648* and AL085647*
F26P20    B25400* and B25399*
F13I18    AL086953* and AL086952*
F2I22     B12725* and B08590*
F21B11    B24621* and B24620*
F28A24    AQ011507* and B25554*
F13O14    AL087167* and AL087166*

F14A22    AL087257* and AL087256*
F21G14    B97912*
F18M12    B09450* and B09052*
F3D18     AL081552*
F28K14    B25874* and B25873*
F28L21    B25895* and B25894*
F1D3      AL080439* and AL080438*
F16O19    B97731*
F15I15    AQ251156* and AQ251026*
F27G1     AQ010677* and B23247* and B23246*
F22C19    B97947*
F1E16     AQ251175*
F12P2     AL086604* and AL086603*
F15O18    B23621* and B23620*
F13D8     AL086759* and AL086758*
F23J22    AQ011543* and AQ011257*
F3K18     none
F17O22    AQ251082*
F25A22    B25136*
F15G12    AQ251153* and AQ251023*
F23A7     B95912* and B95911*
F26L22    AQ011319* and AQ011318* and B62693*
F11B20    AL085623* and AL085622*
T28K13    B61711*
T19L12    B61940* and B61939*
F25A15    AQ251405* and AQ251342*
F22H10    AQ251219*
F3N13     AL081870* and AL081869*
F27F24    AQ251249* and AQ251143*
F27J18    AQ011439*
F20K22    AQ011121* and B24528*
F2J19     AL081240* and AL081239* and B26437*
F9F4      AL084708* and AL084707* and B30281
F8P17     AL084523* and AL084522*
F7E14     AL083629* and AL083628*
F19N2     none
F27G5     AQ010682* and AQ010681*

* = partial (BAC end) sequence
† = full sequence in more than one part

Table 6: Fully sequenced BAC clones containing A. thaliana centromere sequences*
Clone†              GenBank Accession No.   Date of Availability‡        Comment

F28L22              AC007505                Feb 7, 2000; May 6
T32E20              AC020646                Feb 10, 2000; Jan 8
F12G6               AC007781                Jun 11, 1999                 3 unordered pieces
F9D18               AC007183                Mar 30, 1999                 6 unordered pieces
T4I21               AC022456                Feb 28, 2000; Feb 3, 2000
F5A13               AC008046                Feb 8, 2000; Jul 14

CENTROMERE
T13E11              AC006217                Dec 17, 1999; Dec 24
F27C21              AC006527                Dec 17, 1999; Feb 5
F9A16               AC007662                Dec 17, 1999; May 27, 1999
T5M2                AC007730                Dec 17, 1999; Jun 5
T17H1               AC007143                Dec 17, 1999; Mar 17
T18C6               AC007729                Dec 17, 1999; Jun 5
T5E7                AC006225                Dec 17, 1999; Jun 5
T12J2               AC004483                Dec 17, 1999; Jul 17
GAP
T6C20               AC005898                Mar 20, 1999; Dec 7          10 unordered pieces
T14C8               AC006219                Dec 17, 1999; Feb 9
F7B19               AC006586                Dec 17, 1999; Feb 19
T15D9               AC007120                Dec 17, 1999; Mar 19
entire chromosome II  AE002093              Dec 17, 1999; Dec 16

CENTROMERE
T25F15              AC009529                Dec 3, 1999; Aug 16          2 unordered pieces
F23H6               AC0 f I 62              Nov 24, 1999; Oct 8
T28G19              AC009328                Oct 26, 1999; Aug 16         16 unordered pieces
T18B3               AC011624                Nov 18, 1999; Oct 8          14 unordered pieces
T26P13              AC009261                Nov 3, 1999; Aug 10
T14A11              AC012327                Nov 20, 1999; Oct 23
T4P3                AC009992                Oct 21, 1999; Sep 9
F21A14              AC016828                Jan 13, 2000; Dec 3          6 unordered pieces
T27B3               AL137079                Jan 21, 2000
F26B15              AL138645                Feb 2, 2000
T14K23              AL132909                Nov 12, 1999
T32A11              AL138653                Feb 2, 2000
T27D20              AF076274                Aug 3, 1998
T19B17              AF069441                Jun 3, 1999
T26N6               AF076243                Ma 11, 1999
F4H6                AF074021                Ma 11, 1999
T19J18              AF149414                Aug 13, 1999
T4B21               AF118223                Aug 10, 1999; Jan 7
T1J1                AF128393                Nov 12, 1999
T32N4               AF162444                Aug 13, 1999
C17L7               AC012392                Oct 27, 1999
C6L9                AC012477                Nov 6, 1999
T1J24               AF147263                Aug 13, 1999
F6H8                AF178045                Aug 19, 1999
F21I2               AF147261                May 11, 1999
GAP
F14G16              AF147260                Aug 13, 1999
F28D6               AF147262                Aug 13, 1999
entire chromosome IV  http://websvr.mips.biochem.mpg.de/proj/thal/chr4_announcement/   Dec 17, 1999

CENTROMERE
F3F24               AC018632                Dec 15, 1999
F23C8               AC018928                Dec 24, 1999

* The sequences for clones from centromeres 1, 3 and 5 are given in SEQ ID NOS:184-208. Sequences for contigs including the centromere 2 and 4 clones are given by SEQ ID NOS:209-212.
† BAC clone number designations are given. The centromere number origin of the clone is as indicated.
‡ Where a second date is given, the second date indicates the date for the revised sequence.

Constructing BAC Vectors for Testing Centromere Function

A BAC clone may be retrofitted with one or more plant telomeres and selectable markers together with the DNA elements necessary for Agrobacterium transformation (FIG. 9). This method will provide a means to deliver any BAC clone into plant cells and to test it for centromere function.

The method works in the following way. The conversion vector contains a retrofitting cassette. The retrofitting cassette is flanked by Tn10, Tn5, Tn7, Mu or other transposable elements and contains an origin of replication and a selectable marker for Agrobacterium, a plant telomere array followed by T-DNA right and left borders followed by a second plant telomere array and a plant selectable marker (FIG. 9). The conversion vector is transformed into an E. coli strain carrying the target BAC. The transposable elements flanking the retrofitting cassette then mediate transposition of the cassette randomly into the BAC clone. The retrofitted BAC clone can now be transformed into an appropriate strain of Agrobacterium and then into plant cells, where it can be tested for high-fidelity meiotic and mitotic transmission, which would indicate that the clone contained a complete functional plant centromere.

Construction of Plant Minichromosomes

Minichromosomes are constructed by combining the previously isolated essential chromosomal elements. Exemplary minichromosome vectors include those designed to be "shuttle vectors"; i.e., they can be maintained in a convenient host (such as E. coli, Agrobacterium or yeast) as well as plant cells.

A. General Techniques for Minichromosome Construction

A minichromosome can be maintained in E. coli or other bacterial cells as a circular molecule by placing a removable stuffer fragment between the telomeric sequence blocks. The stuffer fragment is a dispensable DNA sequence, bordered by unique restriction sites, which can be removed by restriction digestion of the circular DNAs to create linear molecules with telomeric ends. The linear minichromosome can then be isolated by, for example, gel electrophoresis. In addition to the stuffer fragment and the plant telomeres, the minichromosome contains a replication origin and selectable marker that can function in bacteria to allow the circular molecules to be maintained in bacterial cells. The minichromosomes also include a plant selectable marker, a plant centromere, and a plant ARS to allow replication and maintenance of the DNA molecules in plant cells. Finally, the minichromosome includes several unique restriction sites where additional DNA sequence inserts can be cloned. The most expeditious method of physically constructing such a minichromosome, i.e., ligating the various essential elements together for example, will be apparent to those of ordinary skill in this art.

A number of minichromosome vectors have been designed by the current inventors and are disclosed herein for the purpose of illustration (FIGS. 7A-7H). These vectors are not limiting, however, as it will be apparent to those of skill in the art that many changes and alterations may be made and still obtain a functional vector.
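
The circular-to-linear conversion described in section A can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the element labels (`telomere`, `stuffer`, `plant_ARS` and so on), the helper names and the example element order are hypothetical and are not taken from the vectors of FIGS. 7A-7H. The sketch treats the vector as an ordered ring of elements, checks that the essential elements listed above are present, and cuts out the stuffer so that the two telomere blocks become the ends of the linear molecule.

```python
from typing import List, Set

# Illustrative element labels (assumptions, not identifiers from the patent).
ESSENTIAL: Set[str] = {
    "telomere", "centromere", "plant_ARS", "plant_marker",
    "bacterial_origin", "bacterial_marker",
}


def missing_elements(circular_map: List[str]) -> Set[str]:
    """Return the essential elements that are absent from a vector map."""
    return ESSENTIAL - set(circular_map)


def linearize(circular_map: List[str]) -> List[str]:
    """Excise the stuffer from a circular map and return the linear order.

    The ring is opened where the stuffer sits, so the telomere blocks that
    flanked the stuffer end up at the two ends of the linear molecule.
    """
    if circular_map.count("stuffer") != 1:
        raise ValueError("expected exactly one stuffer fragment")
    cut = circular_map.index("stuffer")
    # Read the ring starting just after the stuffer and wrapping around.
    linear = circular_map[cut + 1:] + circular_map[:cut]
    if len(linear) < 2 or linear[0] != "telomere" or linear[-1] != "telomere":
        raise ValueError("stuffer must lie between the two telomere blocks")
    return linear


# Hypothetical element order for a circular shuttle vector.
vector = [
    "bacterial_origin", "bacterial_marker", "telomere", "stuffer",
    "telomere", "plant_marker", "centromere", "plant_ARS",
]
linear_vector = linearize(vector)
print("missing:", missing_elements(vector))
print("linear order:", linear_vector)
```

The model ignores sequence, orientation and restriction-site details; it only captures the ordering constraint that the stuffer must sit between the telomeric blocks for digestion to yield telomeric ends.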
B. Modified Technique for Minichromosome Construction

A two-step method was developed for construction of minichromosomes, which allows adding essential elements to BAC clones containing centromeric DNA. These procedures can take place in vivo, eliminating problems of chromosome breakage that often happen in the test tube. The details and advantages of the techniques are as follows:

1.) One plasmid can be created that contains markers, origins and border sequences for Agrobacterium transfer, markers for selection and screening in plants, plant telomeres, and a loxP site or other site useful for site-specific recombination in vivo or in vitro. The second plasmid can be an existing BAC clone, isolated from the available genomic libraries (FIG. 11A).
2.) The two plasmids are mixed, either within a single E. coli cell or in a test tube, and the site-specific recombinase cre is introduced. This will cause the two plasmids to fuse at the loxP sites (FIG. 11B).
3.) If deemed necessary, useful restriction sites (AseI/PacI or NotI) are included to remove excess material (for example, other selectable markers or replication origins).
4.) Variations include vectors with or without a KanR gene (FIGS. 11B, 11C), with or without a LAT52 GUS gene, with a LAT52 GFP gene, and with a GUS gene under the control of other plant promoters (FIGS. 11C, 11D and 11E).
C. Method for Preparation of Stable Non-Integrated Minichromosomes

A technique has been developed to ensure that minichromosomes do not integrate into the host genome (FIG. 11F). In particular, minichromosomes must be maintained as distinct elements separate from the host chromosomes. To ensure that the introduced minichromosome does not integrate, the inventors envision a variety that would encode a lethal plant gene (such as diphtheria toxin or any other gene product that, when expressed, causes lethality in plants). This gene could be located between the right Agrobacterium border and the telomere. Minichromosomes that enter a plant nucleus and integrate into a host chromosome would result in lethality. However, if the minichromosome remains separate, and further, if the ends of this construct are degraded up to the telomeres, then the lethal gene would be removed and the cells would survive.
In Vivo Screen of Centromere Activity by the Analysis of Dicentric Chromosomes

A method was designed for the screening of centromere activity (FIG. 10). In the method, plants are first transformed with binary BAC clones that contain DNA from the genetically-defined centromeric regions. When this DNA integrates into the host chromosomes, the integration is expected to result in a chromosome with two centromeres. This is an unstable situation which often leads to chromosome breakage, as single chromosomes harboring two or more functional centromeres will often break at junctions between the two centromeres when pulled towards opposite poles during mitotic and meiotic events. This can lead to severe growth defects and inviable progeny when genes important or essential for cellular and developmental processes are disrupted by the breakage event. Therefore, regions having centromere function could be identified by looking for clones that exhibit, upon introduction into a host plant, any of the following predicted properties: reduced efficiencies of transformation; causation of genetic instability when integrated into natural chromosomes such that the transformed plants show aberrant sectors and increased lethality; a difficulty to maintain, particularly when the transformed plants are grown under conditions that do not select for maintenance of the transgenes; a tendency to integrate into the genome at the distal tips of chromosomes or at the centromeric regions. In contrast, clones comprising non-centromeric DNA will be expected to integrate in a more random pattern. The resulting distribution and pattern of integration can be confirmed by sequencing the ends of the inserted DNA.
The screen is performed by identifying clones of greater than 100 kb that encode centromere DNA in a BiBAC library (binary bacterial artificial chromosomes) (Hamilton, 1997). This is done by screening filters comprising a BiBAC genomic library for clones that encode DNA from the centromeres (FIG. 10, step 1). The BiBAC vector is used because it can contain large inserts of Arabidopsis genomic material and also encodes the binary sequences needed for Agrobacterium-mediated transformation. The centromere sequence containing BiBAC vectors are then directly integrated into chromosomes by Agrobacterium-mediated transformation (FIG. 10, step 2). As a control, BiBAC constructs containing non-centromeric DNA also are used for transformation. BiBACs harboring sequences with centromere function will result in forming dicentric chromosomes. Progeny from transformed plants will be analyzed for nonviability and gross morphological differences that can be attributed to chromosomal breaks due to the formation of dicentric chromosomes (FIG. 10, step 5). Non-centromere sequences are expected to show little phenotypic difference from wildtype plants.

Refined Centromere Mapping with Treatment for Increased Recombination

In order to achieve a more refined map position for the centromeres in Arabidopsis thaliana, various chemical and environmental treatments were used to stimulate recombination. The treatments were used on pollen donors in crosses performed to create the tetrad sets of plants (see EXAMPLE 2). Pollen donor plants were planted individually in 1 inch square pots and grown under 24 hr light in a growth room until flowering. Flowering plants were then dipped in one of the following solutions and watered with 50 ml of the same solution.

[Table listing the chemical treatment solutions; contents not recoverable from the source text.]

Following treatment, plants were then returned to the growth room and grown under standard conditions for 2-5 days. Pollen was then collected from newly opened flowers and used to pollinate receptive stigmas as described in Example 2. Then the pollen donor plants were again treated as described above and used in another round of pollination. Pollen donor plants were typically subjected to 5-10 rounds of treatment and pollen collection.

Treatments were also performed using non-chemical agents. As above, the treatments were used to achieve more refined map positions for the centromeres in Arabidopsis by stimulating recombination in additional pollen donor plants. The treatments were as follows:
Table 8: Non-Chemical Treatment Agents.

TREATMENT               TREATMENT PARAMETERS
heat shock              about 35°C to about 48°C, and preferably about 42°C
UV exposure (350 nm)    about 1 second to about 50 seconds, and preferably about 7 seconds
Gamma radiation         about 0.1 kRads to about 20 kRads, and preferably about 10 kRads
Magnetic field          about 1 to 20 Tesla for 1 h to continuous
cold stress             about -10 to 15°C for 1 min to continuous

Heat shock treatments were performed by placing the pots containing the pollen donor plants in shallow dishes filled with water (to prevent desiccation), and placing the plant-containing dishes in incubators of the appropriate temperature. UV exposure was performed by placing the pollen donor plants in a BioRad UV chamber and illuminating the plants at the appropriate wavelength for varying amounts of time. Both the UV and heat shock plants were subjected to several rounds of treatment and pollen collection. Plants exposed to a gamma radiation source (Cobalt-60) were treated only once and then discarded to prevent the accumulation of deleterious chromosomal rearrangements.
Following treatment, plants were then returned to the growth room and grown under standard conditions for 2-5 days. Pollen was then collected from newly opened flowers and used to pollinate receptive stigmas as described in Example 2. Then the pollen donor plants were again treated as described above and used in another round of pollination. Pollen donor plants were typically subjected to 5-10 rounds of treatment and pollen collection. The results are shown in Table 9 below.
Table 9: Results of Recombination After Treatments

Treatment                           Tetrads   Obs   Exp   (O-E)²/E = X²
n-butyric acid                      43        11    2.5   28.9**
UV exposure 350 nm                  57        12    3.2   24.2**
Methanesulfonic acid ethyl ester    10        5     0.6   32.2**
5-aza-2'-deoxycytidine              68        16    3.9   37.5**
heat shock                          23        7     1.3   25.0**
3-methoxybenzamide                  44        8     2.5   12.1**
Zeocin                              106       14    6.0   10.6**
Untreated                           384       22    N/A   N/A
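
The X² column of Table 9 can be recomputed from the observed and expected counts with the formula in the column header, (O-E)²/E. The short Python sketch below does this for each treated row. Two points are assumptions rather than statements from the text: the 5% significance level (critical value 3.841 for one degree of freedom) is chosen here for illustration, since the table does not say which level the double asterisk denotes, and the dictionary and function names are made up for the example.

```python
# Observed and expected counts copied from Table 9: (Obs, Exp).
TABLE9 = {
    "n-butyric acid": (11, 2.5),
    "UV exposure 350 nm": (12, 3.2),
    "Methanesulfonic acid ethyl ester": (5, 0.6),
    "5-aza-2'-deoxycytidine": (16, 3.9),
    "heat shock": (7, 1.3),
    "3-methoxybenzamide": (8, 2.5),
    "Zeocin": (14, 6.0),
}

# Chi-square critical value for df = 1 at p = 0.05.
# The 5% level is an assumption; the table does not state the level used.
CRITICAL_DF1_P05 = 3.841


def chi_square(observed: float, expected: float) -> float:
    """Single-class chi-square statistic, (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


results = {name: chi_square(obs, exp) for name, (obs, exp) in TABLE9.items()}

for name, x2 in results.items():
    flag = "significant" if x2 > CRITICAL_DF1_P05 else "not significant"
    print(f"{name:34s} X2 = {x2:6.2f}  {flag}")
```

The recomputed values agree with the tabulated ones to within rounding; for example, Zeocin gives 64/6, about 10.67, against the tabulated 10.6.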

** indicates significant by X² (df=1)

Facilitation of Genetic Introgression

It is also contemplated by the inventors that one could employ techniques or treatments which stimulate recombination to facilitate introgression. Introgression describes a breeding technique whereby one or more desired traits are transferred into one strain (A) from another (B); the trait is then isolated in the genetic background of the desired strain (A) by a series of backcrosses to the same strain (A). The number of backcrosses required to isolate the desired trait in the desired genetic background is dependent on the frequency of recombination in each backcross.

Backcrossing transfers a specific desirable trait from one source to an inbred or other plant that lacks that trait. This can be accomplished, for example, by first crossing a superior inbred (A) (recurrent parent) to a donor inbred (non-recurrent parent), which carries the appropriate gene(s) for the trait in question, for example, a construct prepared in accordance with the current invention. The progeny of this cross are first selected for the desired trait to be transferred from the non-recurrent parent, then the selected progeny are mated back to the superior recurrent parent (A). After five or more backcross generations with selection for the desired trait, the progeny are hemizygous for loci controlling the characteristic being transferred, but are like the superior parent for most or almost all other genes. The last backcross generation would be selfed to give progeny which are pure breeding for the gene(s) being transferred, i.e. one or more transformation events.
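
The statement that progeny resemble the recurrent parent "for most or almost all other genes" after five or more backcrosses follows from simple halving of the donor contribution each generation. The sketch below is a textbook expectation for loci unlinked to the selected trait, not a calculation given in this disclosure; the function name is hypothetical. It counts the initial cross as backcross 0 (the F1, 50% recurrent parent) and each later backcross as halving the remaining donor fraction.

```python
def recurrent_parent_fraction(backcrosses: int) -> float:
    """Expected recurrent-parent share of the genome at unlinked loci.

    backcrosses = 0 is the initial F1 (50%); each further backcross to the
    recurrent parent halves the remaining donor contribution.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)


for n in range(7):
    print(f"BC{n}: {recurrent_parent_fraction(n):.4f}")
```

Under this model five backcrosses give about 98.4% recurrent-parent genome. Donor DNA linked to the selected locus is lost more slowly than this, which is why the recombination frequency in each backcross affects how many generations are needed.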
Therefore, through a series of breeding manipulations, a selected transgene may be moved from one line into an entirely different line without the need for further recombinant manipulation. Transgenes are valuable in that they typically behave genetically as any other gene and can be manipulated by breeding techniques in a manner identical to any other corn gene. Therefore, one may produce inbred plants which are true breeding for one or more transgenes. By crossing different inbred plants, one may produce a large number of different hybrids with different combinations of transgenes. In this way, plants may be produced which have the desirable agronomic properties frequently associated with hybrids ("hybrid vigor"), as well as the desirable characteristics imparted by one or more transgene(s).

Breeding also can be used to transfer an entire minichromosome from one plant to another plant. For example, by crossing a first plant having a minichromosome to a second plant lacking the minichromosome, progeny of any generation of this cross may be obtained having the minichromosome, or any additional number of desired minichromosomes. Through a series of backcrosses, a plant may be obtained that has the genetic background of the second plant but has the minichromosome from the first plant.

* * *

All of the compositions and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
REFERENCES

The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

Abdullah et al., Biotechnology, 4:1087, 1986.
Abel et al., Science, 232:738-743, 1986.
Alfenito et al., "Molecular characterization of a maize B chromosome centric sequence," Genetics, 135:589-597, 1993.
Araki et al., "Site-specific recombinase, R, encoded by yeast plasmid pSR1," J. Mol. Biol., 225:25-37, 1992.
Barkai-Golan et al., Arch. Microbiol., 116:119-124, 1978.
Baum et al., "The centromeric K-type repeat and the central core are together sufficient to establish a functional Schizosaccharomyces pombe centromere," Mol. Bio. Cell., 5:747-761, 1994.
Bell et al., "Assignment of 30 microsatellite loci to the linkage map of Arabidopsis," Genomics, 19:137-144, 1994.
Bernal-Lugo and Leopold, Plant Physiol., 98:1207-1210, 1992.
Berzal-Herranz et al., Genes and Devel., 6:129-134, 1992.
Bevan et al., Nucleic Acids Research, 11(2):369-385, 1983.
Bevan et al., BioEssays, 21:110, 1999.
Blackman et al., Plant Physiol., 100:225-230, 1992.
Bloom, "The centromere frontier: Kinetochore components, microtubule-based motility, and the CEN-value paradox," Cell, 73:621-624, 1993.
Bol et al., Annu. Rev. Phytopath., 28:113-138, 1990.
Bowler et al., Ann Rev. Plant Physiol., 43:83-116, 1992.
Brandes et al., Chrom. Res., 5:238, 1997.
Branson and Guss, Proceedings North Central Branch Entomological Society of America, 27:91-95, 1972.
Brisson et al., Nature, 310:511, 1984.
Broach et al., Gene, 8:121-133, 1979.
Broakaert et al., Science, 245:1100-1102, 1989.
Burke et al., Science, 236:806-812, 1987.
Bytebier et al., Proc. Natl Acad. Sci. USA, 84:5345, 1987.
Callis et al., Genes and Development, 1:1183, 1987.
Cambareri et al., Mol. Cell. Biol., 18:5465, 1998.
Campbell (ed.), In: Avermectin and Abamectin, 1989.
Campbell, "Monoclonal Antibody Technology, Laboratory Techniques in Biochemistry and Molecular Biology," Vol. 13, Burden and Von Knippenberg, Eds. pp. 75-83, Elsevier, Amsterdam, 1984.
Capecchi, "High efficiency transformation by direct microinjection of DNA into cultured mammalian cells," Cell, 22(2):479-488, 1980.
Carbon et al., In: Recombinant Molecules: Impact on Science and Society (Raven Press), 335-378, 1977.
Carbon et al., "Centromere structure and function in budding and fission yeasts," New Biologist, 2:10-19, 1990.
Carpenter et al., "The control of the distribution of meiotic exchange in Drosophila melanogaster," Genetics, 101:81-90, 1982.
Cech et al., "In vitro splicing of the ribosomal RNA precursor of Tetrahymena: involvement of a guanosine nucleotide in the excision of the intervening sequence," Cell, 27:487-496, 1981.
Chandler et al., The Plant Cell, 1:1175-1183, 1989.
Chang et al., "Restriction fragment length polymorphism linkage map for Arabidopsis thaliana," Proc. Natl Acad. Sci. USA, 85:6856-6860, 1988.
Charlesworth et al., Nature, 371:315, 1994.
Charlesworth, C.H. Langley, W. Stephan, 112:947, 1986.
Chepko, Cell, 37:1053, 1984.
Choi et al., Plant Mol Biol Rep, 13:124-29, 1995.
Choo, K.H.A., Genome Res., 8:81, 1998.

Chowrira et al., "In vitro and in vivo comparison of hammerhead, hairpin, and hepatitis delta virus self-processing ribozyme cassettes," J. Biol. Chem., 269:25856-25864, 1994.
Chu et al., "Separation of large DNA molecules by contour-clamped homogeneous electric fields," Science, 234:1582-1585, 1986.
Chye et al., Plant Mol. Biol., 35:893, 1997.
Clapp, "Somatic gene therapy into hematopoietic cells. Current status and future implications," Clin. Perinatol., 20(1):155-168, 1993.
Clark, L., Curr. Op. Gen. & Dev., 8:212, 1998.
Clarke et al., "Isolation of a yeast centromere and construction of functional small circular chromosomes," Nature, 287:504-509, 1980.
Cohen et al., Proc. Nat'l Acad. Sci. USA, 70:3240, 1973.
Conkling et al., Plant Physiol., 93:1203-1211, 1990.
Copenhaver and Pikaard, "RFLP and physical mapping with an rDNA-specific endonuclease reveals that nucleolus organizer regions of Arabidopsis thaliana adjoin the telomeres on chromosomes 2 and 4," Plant J., 9:259-276, 1996.
Copenhaver et al., "Use of RFLPs larger than 100 kbp to map position and internal organization of the nucleolus organizer region on chromosome 2 in Arabidopsis thaliana," Plant J., 7:273-286, 1995.
Copenhaver et al., Proc. Natl. Acad. Sci., 95:247, 1998.
Copenhaver et al., Science, 286:2468-2474, 1999.
Copenhaver and Preuss, Plant Biology, 2:104-108, 1999.
Coxson et al., Biotropica, 24:121-133, 1992.
Creusot et al., Plant Journal, 8:763-70, 1995.
Cristou et al., Plant Physiol., 87:671-674, 1988.
Cuozzo et al., Bio/Technology, 6:549-553, 1988.
Curiel et al., "Adenovirus enhancement of transferrin-polylysine-mediated gene delivery," Proc. Natl Acad. Sci. USA, 88(19):8850-8854, 1991.
Curiel et al., "High-efficiency gene transfer mediated by adenovirus coupled to DNA-polylysine complexes," Hum. Gen. Ther., 3(2):147-154, 1992.
Cutler et al., J. Plant Physiol., 135:351-354, 1989.
Czapla and Lang, J. Econ. Entomol., 83:2480-2485, 1990.
Davies et al., Plant Physiol., 93:88-595, 1990.
Dellaporta et al., In: Chromosome Structure and Function: Impact of New Concepts, 18th Stadler Genetics Symposium, 11:263-282, 1988.
Depicker et al., Plant Cell Reports, 7:63-66, 1988.
DiLaurenzio et al., Cell, 86:423-33, 1996.
Dillon et al., Recombinant DNA Methodology, 198.
Donahue et al., "The nucleotide sequence of the HIS4 region of yeast," Gene, Apr;18(1):47-59, 1982.
Dure et al., Plant Molecular Biology, 12:475-486, 1989.
Earnshaw et al., "Proteins of the inner and outer centromere of mitotic chromosomes," Genome, 31:541-552, 1989.
Earnshaw, "When is a centromere not a kinetochore?," J. Cell Sci., 99:1-4, 1991.
Ebert et al., Proc. Nat'l Acad. Sci. USA, 84:5745-5749, 1987.
Ecker, JR, Genomics, 19:137-144.
Ecker, Methods, 1:186-94, 1990.
Eglitis et al., "Retroviral vectors for introduction of genes into mammalian cells," Biotechniques, 6(7):608-614, 1988.
Eglitis et al., "Retroviral-mediated gene transfer into hemopoietic cells," Adv. Exp. Med. Biol., 241:19-27, 1988.
Enomoto et al., "Mapping of the pin locus coding for a site-specific recombinase that causes flagellar-phase variation in Escherichia coli K-12," J. Bacteriol., 156:663-668, 1983.
Erdmann et al., J. Gen. Microbiology, 138:363-368, 1992.
Ferrin et al., "Selective cleavage of human DNA: RecA-Assisted Restriction Endonuclease (RARE) cleavage," Science, 254:1494-1497, 1991.
Fitzpatrick, Gen. Engineering News, 22:7, 1993.
Fleig, U. et al., "Functional selection for the centromere DNA from yeast chromosome VIII," Nuc. Acids. Res., 23:922-924, 1991.
Forster and Symons, "Self-cleavage of plus and minus RNAs of a virusoid and a structural model for the active sites," Cell, 49:211-220, 1987.
- I 6f~-Fraley et crl.. Bintechrrolo~y, 3:629. 1985.
Franz et al., Plant J., 13:867, 1998.
Fromm et al., Nature, 312:791-793, 1986.
Fromm et al., "Expression of genes transferred into monocot and dicot plant cells by electroporation," Proc. Natl Acad. Sci. USA 82(17):5824-5828, 1985.
Fujimura et al., Plant Tissue Culture Letters, 2:74, 1985.
Fynan et al., "DNA vaccines: protective immunizations by parenteral, mucosal, and gene gun inoculations," Proc. Nat'l Acad. Sci. USA 90(24):11478-11482, 1993.
Gatehouse et al., J. Sci. Food Agric., 35:373-380, 1984.
Gefter et al., Somatic Cell Genet. 3:231-236, 1977.
Gerlach et al., "Construction of a plant disease resistance gene from the satellite RNA of tobacco ringspot virus," Nature (London), 328:802-805, 1987.
Goding, "Monoclonal Antibodies: Principles and Practice," pp. 60-74, 2nd Edition, Academic Press, Orlando, FL, 1986.
Golic and Lindquist, "The FLP recombinase of yeast catalyses site-specific recombination in the Drosophila genome," Cell, 59:499-509, 1989.
Goring et al., Proc. Natl. Acad. Sci. USA, 88:1770-1774, 1991.
Graham et al., "Transformation of rat cells by DNA of human adenovirus 5," Virology 54(2):536-539, 1973.
Grill and Somerville, Mol Gen Genet, 226:484-490, 1991.
Guerrero et al., Plant Molecular Biology, 15:11-26, 1990.
Gupta et al., Proc. Natl. Acad. Sci. USA, 90:1629-1633, 1993.
Gutierrez-Marcos et al., Proc. Natl. Acad. Sci. USA, 93: f 3377, 1996.
Haaf et al., "Integration of human α-satellite DNA into simian chromosomes: centromere protein binding and disruption of normal chromosome segregation," Cell, 70:681-696, 1992.
Hadlaczky et al., "Centromere formation in mouse cells cotransformed with human DNA and a dominant marker gene," Proc. Natl Acad. Sci. USA, 88:8106-8110, 1991.
Hamilton et al., "Stable transfer of intact high molecular weight DNA into plant chromosomes," Proc Natl Acad Sci U S A 93(18):9975-9, 1996.
Hamilton, "A binary-BAC system for plant transformation with high-molecular-weight DNA," Gene, 4;200(1-2):107-16, 1997.
Hammock et al., Nature, 344:458-461, 1990.
Haseloff et al., Proc. Nat'l Acad. Sci. USA 94(6):2122-2127, 1997.
Hauge et al., Symp Soc Exp Biol, 45:45-56, 1991.
Hegemann et al., "The centromere of budding yeast," Bioessays, 15(7):451-460, 1993.
Hemenway et al., The EMBO J., 7:1273-1280, 1988.
Heslop-Harrison et al., Plant Cell, 11:31, 1999.
Hilder et al., Nature, 330:160-163, 1987.
Hinchee et al., Bio/technol., 6:915-922, 1988.
Hoess et al., Proc Natl Acad Sci, 79:3398-402, 1982.
Hsiao et al., J. Proc. Nat'l Acad. Sci. USA, 76:3829-3833, 1979.
Hudspeth and Grula, Plant Mol. Biol., 12:579-589, 1989.
Hwang et al., "Identification and map position of YAC clones comprising one-third of the Arabidopsis genome," The Plant Journal, 1:367-374, 1991.
Ikeda et al., J. Bacteriol., 169:5615-5621, 1987.
Ikuta et al., Bio/technol., 8:241-242, 1990.
Inohara et al., J. Biol. Chem., 266:7333, 1991.
Johnston et al., "Gene gun transfection of animal cells and genetic immunization," Methods Cell. Biol. 43(A):353-365, 1994.
Jones, Embo J., 4:2411-2418, 198.
Jones, Mol. Gen. Genet., 207:478, 1987.
Jorgensen et al., Mol. Gen. Genet., 207:471, 1987.
Jouanin et al., Mol Gen Genet, 201:370-4, 1985.
Joyce, "RNA evolution and the origins of life," Nature, 338:217-244, 1989.
Kaasen et al., J. Bacteriology, 174:889-898, 1992.
Karpen, Curr. Op. Gen. & Dev., 4:281, 1994.
Karsten et al., Botanica Marina, 35:11-19, 1992.
Katz et al., J. Gen. Microbiol., 129:2703-2714, 1983.
Kim and Cech, "Three dimensional model of the active site of the self-splicing rRNA precursor of Tetrahymena," Proc. Natl. Acad. Sci. USA, 84:8788-8792, 1987.

Klee et al., Bio/Technology 3:637-642, 1985.
Klein et al., Nature, 327:70-73, 1987.
Klein et al., Proc. Nat'l Acad. Sci. USA, 85:8502-8505, 1988.
Kohler et al., Eur. J. Immunol. 6:511-519, 1976.
Kohler et al., Nature 256:495-497, 1975.
Konieczny et al., "A procedure for mapping Arabidopsis mutations using codominant ecotype-specific PCR-based markers," The Plant Journal, 4:403-410, 1993.
Konieczny et al., Genetics, 127:801, 1991.
Koorneef et al., Genetica, 61:41-46, 1983.
Koorneef, "Linkage map of Arabidopsis thaliana (2n=10)," In: SJ O'Brien, ed., Genetic Maps 1987: A compilation of linkage and restriction maps of genetically studied organisms, 724-745, 1987.
Koorneef, "The use of telotrisomics for centromere mapping in Arabidopsis thaliana (L.) Heynh.," Genetica, 62:33-40, 1983.
Koster and Leopold, Plant Physiol., 88:829-832, 1988.
Kuby, J., Immunology, 2nd Edition, W. H. Freeman & Company, NY, 1994.
Kuhn et al., Proc. Natl. Acad. Sci., 88:1306, 1991.
Kyte et al., "A simple method for displaying the hydropathic character of a protein," J. Mol. Biol. 157(1):105-132, 1982.
Lawton et al., Plant Mol. Biol. 9:315-324, 1987.
Lechner et al., "A 240 kd multisubunit protein complex, CBF3, is a major component of the budding yeast centromere," Cell, 64:717-725, 1991.
Lee and Saier, J. of Bacteriol., 153:685, 1983.
Levings, Science, 250:942-947, 1990.
Lewin, Genes II, John Wiley & Sons, Publishers, N.Y., 1985.
Li et al., Plant Cell, 7:1599, 1995.
Li et al., Proc. Natl. Acad. Sci., 87:4580-4584, 1990.
Lieber and Strauss, "Selection of efficient cleavage sites in target RNA by using a ribozyme expression library," Mol. Cell. Biol., 15:540-551, 1995.
Lin, X., S. Kaul, S. Rounsley, T.P. Shea, M.-I. Benito, C.D. Town, C.Y. Fujii, T. Mason, C.L. Bowman, M. Barnstead, T. Feldblyum, C.R. Buell, K.A. Ketchum, C.M. Ronning, H. Koo, K. Moffat, L. Cronin, M. Shen, G. Pai, S. Van Aken, L. Umayam, L. Tallon, J. Gill, M.D. Adams, A.J. Carrera, T.H. Creasy, H.M. Goodman, C.R. Somerville, G.P. Copenhaver, D. Preuss, W.C. Nierman, O. White, J.A. Eisen, S. Salzberg, C.M. Fraser, and J.C. Venter, "Sequence and Analysis of Chromosome 2 of Arabidopsis thaliana," Nature 402:761-768, 1999.
Liu, YG., Shirano, Y., Fukaki, H., Yanai, Y., Tasaka, M., Tabata, S., Shibata, D., Proc. Natl Acad Sci USA 96:6535-40, 1999.
Lohe and Hilliker, Curr. Op. Gen. & Dev., 5:746, 1995.
Loomis et al., J. Expt. Zoology, 252:9-15, 1989.
Lorz et al., Mol. Gen. Genet., 199:178, 1985.
Louis, EJ, "Corrected sequence for the right telomere of Saccharomyces cerevisiae chromosome III," Yeast, 10(2):271-4, 1994.
Lu et al., "High efficiency retroviral mediated gene transduction into single isolated immature and replatable CD34(3+) hematopoietic stem/progenitor cells from human umbilical cord blood," J. Exp. Med. 178(6):2089-2096, 1993.
Maeser and Kahmann, "The GIN recombinase of phage Mu can catalyse site-specific recombination in plant protoplasts," Mol. Gen. Genet., 230:170-176, 1991.
Mahtani, M.M. and Willard, H.F., Genome Res. 8:100, 1998.
Maloy, S. R., "Experimental Techniques in Bacterial Genetics," Jones and Bartlett.
Prokop, A., and Bajpai, R. K., "Recombinant DNA Technology I," Ann. N. Y. Acad. Sci., vol. 646, 1991.
Maluszynska et al., "Molecular cytogenetics of the genus Arabidopsis: In situ localization of rDNA sites, chromosome numbers and diversity in centromeric heterochromatin," Annals Botany, 71:479-484, 1993.
Maluszynska et al., "Localization of tandemly repeated DNA sequences in Arabidopsis thaliana," Plant Jour., 1(2):159-166, 1991.
Maniatis et al., "Molecular Cloning: a Laboratory Manual," Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Marcotte et al., Nature, 335:454, 1988.
Mariani et al., Nature, 347:737-741, 1990.
Marra et al., Nature Genet. 22:265, 1999.

Martinez-Zapater et al., Mol. Gen. Genet., 204:417-423, 1986.
Matsuura et al., Journal of Bacteriology, 178:3374-6, 1996.
McCabe et al., Biotechnology, 6:923, 1988.
Michel and Westhof, "Modeling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis," J. Mol. Biol., 216:585-610, 1990.
Mortimer et al., "Genetic mapping in Saccharomyces cerevisiae," Life Cycle and Inheritance, In: The Molecular Biology of the Yeast Saccharomyces, 11-26, 1981.
Mozo et al., Mol Gen Genet, 258:562-70, 1998.
Mozo et al., Nature Genet. 22:271, 1999.
Mundy and Chua, The EMBO J., 7:2279-2286, 1988.
Murakami et al., Mol. Gen. Genet., 205:42-50, 1986.
Murata et al., Plant J., 12:31, 1997.
Murdock et al., Phytochemistry, 29:85-89, 1990.
Murray et al., Nature, 305:189-193, 1983.
Mysore et al., "An Arabidopsis histone H2A mutant is deficient in Agrobacterium T-DNA integration," Proc Natl Acad Sci U S A 18;97(2):948-53, 2000a.
Mysore et al., "Arabidopsis ecotypes and mutants that are recalcitrant to Agrobacterium root transformation are susceptible to germ-line transformation," Plant J 21(1):9-16, 2000b.
Napoli, Lemieux, Jorgensen, "Introduction of a chimeric chalcone synthase gene into petunia results in reversible co-suppression of homologous genes in trans," Plant Cell, 2:279-289, 1990.
Negrutiu, I., Hinnisdaels, S., Cammaerts, D., Cherdshewasart, W., Gharti-Chhetri, G., and Jacobs, M., "Plant protoplasts as genetic tool: selectable markers for developmental studies," Int. J. Dev. Biol. 36:73-84, 1992.
Nester, Ann. Rev. Plant Phys., 35:387-413, 1984.
Nicklas, "The forces that move chromosomes in mitosis," Annu. Rev. Biophys. Biophys. Chem., 17:431-39, 1988.
Nussbaum et al., Proc. Nat'l Acad. Sci USA, 73:1068, 1976.
Odell et al., Nature, 313:810-812, 1985.

Ohmori and Tomizawa, "Nucleotide sequence of the region required for maintenance of colicin E1 plasmid," Mol Gen Genet, Oct 3;176(2):161-70, 1979.
Omirulleh et al., Plant Molecular Biology, 21:415-428, 1993.
Ow et al., Science, 234:856-859, 1986.
Palukaitis et al., "Characterization of a viroid associated with avocado sunblotch disease," Virology, 99:145-151, 1979.
Pelissier et al., Genetica, 97:141, 1996.
Pelissier et al., Plant Mol. Biol., 26:441, 1995.
Perkins, "The detection of linkage in tetrad analysis," Genetics, 38:187-197, 1953.
Perlak et al., Proc. Natl. Acad. Sci. USA, 88:3324-3328, 1991.
Perriman et al., "Extended target-site specificity for a hammerhead ribozyme," Gene, 113:157-163, 1992.
Peterson et al., "Production of transgenic mice with yeast artificial chromosomes," Trends Genet. 13:61-66, 1997.
Phi-Van et al., Mol. Cell. Biol., 10:2302-2307, 1990.
Piatkowski et al., Plant Physiol., 94:1682-1688, 1990.
Potrykus et al., Mol. Gen. Genet., 199:183-188, 1985.
Prasher et al., Biochem. Biophys. Res. Commun., 126(3):1259-1268, 1985.
Preuss et al., "Tetrad analysis possible in Arabidopsis with mutation of the QUARTET (QRT) genes," Science, 264:1458, 1994.
Price et al., "Systematic relationships of Arabidopsis: a molecular and morphological perspective," in: Somerville, C. and Meyerowitz, E. (eds.) Arabidopsis, Cold Spring Harbor Press, NY, 1995.
Prody et al., "Autolytic processing of dimeric plant virus satellite RNA," Science, 231:1577-1580, 1986.
Prokop et al., Ann. N. Y. Acad. Sci., 646, 1991.
Puechberty, J., Genomics 56:247, 1999.
Rathore et al., Plant Mol Biol, 21:871-84, 1993.
Rattner, "The structure of the mammalian centromere," Bioessays, 13(2):51-56, 1991.
Ravatn et al., Journal of Bacteriology, 180:5505-14, 1998.
Reed et al., J. Gen. Microbiology, 130:1-4, 1984.

Reichel et al., Proc. Nat'l Acad. Sci. USA, 93(12):5888-5893, 1996.
Reinhold-Hurek and Shub, "Self-splicing introns in tRNA genes of widely divergent bacteria," Nature, 357:173-176, 1992.
Rensburg et al., J. Plant Physiol., 141:188-194, 1993.
Richards and Ausubel, "Isolation of a higher eukaryotic telomere from Arabidopsis thaliana," Cell, 8;53(1):127-36, 1988.
Richards et al., "The centromere region of Arabidopsis thaliana chromosome 1 contains telomere-similar sequences," Nucleic Acids Research, 19(12):3351-3357, 1991.
Rieder, "The formation, structure and composition of the mammalian kinetochore and kinetochore fiber," Int. Rev. Cytol, 79:1-58, 1982.
Rogers et al., Meth. in Enzymol., 153:253-277, 1987.
Rosenberg et al., "RFLP subtraction: A method for making libraries of polymorphic markers," Proc. Natl Acad. Sci. USA, 91:6113-6117, 1994.
Round et al., Genome Res, 7:1053, 1997.
Sauer, "Functional expression of the cre-lox site-specific recombination system in the yeast Saccharomyces cerevisiae," Mol. and Cell. Biol., 7:2087-2096, 1987.
Schmidt et al., Plant Journal, 5:735-44, 1994.
Schwartz et al., Cold Spring Harbor Symp. Quant. Biol., 47, 195-198, 1982.
Sears et al., "Cytogenetic studies in Arabidopsis thaliana," Can. J. Genet. Cytol., 12:217-233, 1970.
Segel, "Biochemical Calculations," 2nd Edition, John Wiley & Sons, New York, 1976.
Setlow et al., Genetic Engineering: Principles and Methods, 1979.
Shagan and Bar-Zvi, Plant Physiol., 101:1397-1398, 1993.
Shapiro, In: Mobile Genetic Elements, 1983.
Sheen et al., Plant Journal, 8(5):777-784, 199.
Shineo et al., Mol. Cell. Biol., 6:1787, 1986.
Simoens et al., Nuc. Acids Res., 16:6753, 1988.
Smith, Watson, Bird, Ray, Schuch, Grierson, "Expression of a truncated tomato polygalacturonase gene inhibits expression of the endogenous gene in transgenic plants," Mol. Gen. Genet., 224:447-481, 1990.
Smithies et al., Nature, 317:230-234, 1985.

Smythe, "Pollen clusters," Current Biology, 4:851-853, 1994.
Somerville, C. and Somerville, S., Science 286:380, 1999.
Spielmann et al., Mol. Gen. Genet., 205:34, 1986.
Stalker et al., Science, 242:419-422, 1985.
Stiefel et al., Nature, 341:313, 1989.
Stinchcomb et al., Nature, 282:39-43, 1979.
Stougaard, The Plant Journal, 3:755-761, 1993.
Sullivan, Christensen, Quail, Mol. Gen. Genet., 215(3):431-440, 1989.
Sun et al., Cell, 91:1007, 1997.
Sutcliffe, Proc. Nat'l Acad. Sci. USA, 75:3737-3741, 1978.
Symington et al., Cell, 52:237-240, 1988.
Symons, "Avocado sunblotch viroid: primary sequence and proposed secondary structure," Nucl. Acids Res., 9:6527-6537, 1981.
Symons, "Small catalytic RNAs," Annu. Rev. Biochem., 61:641-671, 1992.
Tarczynski et al., "Expression of a bacterial mtlD gene in transgenic tobacco leads to production and accumulation of mannitol," Proc. Natl. Acad. Sci. USA, 89:1-5, 1992.
Tarczynski et al., "Stress Protection of Transgenic Tobacco by Production of the Osmolyte Mannitol," Science, 259:508-510, 1993.
Thillet et al., J. Biol. Chem., 263:12500-12508, 1988.
Thomas et al., Cell, 44:419-428, 1986.
Thomas et al., Proc. Natl Acad. Sci. USA, 71:4579, 1974.
Thompson et al., "Decreased expression of BRCA1 accelerates growth and is often present during sporadic breast cancer progression," Nature Genet., 9:444-450, 1995.
Thompson et al., Nuc. Acids Res., 24:3017, 1996.
Tian, Sequin, Charest, Plant Cell Rep., 16:267-271, 1997.
Tominaga, Microbiology, 143:2017-63, 1997.
Toriyama et al., Theor. Appl. Genet., 73:16, 1986.
Tsay et al., Science, 260:342, 1993.
Tugal et al., Plant Physiol., 120:309, 1999.
Twell et al., Genes Dev 5:496-507, 1991.
Twell et al., Plant Physiol 91:1270-1274, 1989.
Tyler-Smith et al., "Mammalian chromosome structure," Current Biology, 3:390-397, 1993.
Uchimiya et al., Mol. Gen. Genet., 204:204, 1986.
Van der Krol, Mur, Beld, Mol, Stuitje, "Flavonoid genes in petunia: addition of a limiting number of copies may lead to a suppression of gene expression," Plant Cell, 2:291-99, 1990.
Van't Hof, Kuniyuki, Bjerkens, "The size and number of replicon families of chromosomal DNA of Arabidopsis thaliana," Chromosoma, 68:269-285, 1978.
Vasil et al., "Herbicide-resistant fertile transgenic wheat plants obtained by microprojectile bombardment of regenerable embryogenic callus," Biotechnology, 10:667-674, 1992.
Vasil, Biotechnology, 6:397, 1988.
Vernon and Bohnert, The EMBO J., 11:2077-2085, 1992.
Voytas and Ausubel, Nature, 336:242, 1988.
Wagner et al., "Coupling of adenovirus to transferrin-polylysine/DNA complexes greatly enhances receptor-mediated gene delivery and expression of transfected genes," Proc. Nat'l Acad. Sci. USA 89(13):6099-6103, 1992.
Walker et al., Proc. Nat'l Acad. Sci. USA, 84:6624-6628, 1987.
Wang et al., Molecular and Cellular Biology, 12(8):3399-3406, 1992.
Watrud et al., In: Engineered Organisms and the Environment, 1985.
Watson et al., Recombinant DNA: A Short Course, 1986.
Weinsink et al., Cell, 3:315-325, 1974.
Wevrick et al., "Partial deletion of alpha satellite DNA associated with reduced amounts of the centromere protein CENP-B in a mitotically stable human chromosome rearrangement," Mol Cell Biol., 10:6374-6380, 1990.
Whitehouse, Nature, No. 4205:893, 1950.
Wigler et al., Cell, 11:223, 1977.
Willard, H., Nature Genetics 15:345-354, 1997.
Willard, H., "Centromeres of mammalian chromosomes," Trends Genet., 6:410-416, 1990.

Wolter et al., The EMBO J., 4685-4692, 1992.
Wong et al., "Electric field mediated gene transfer," Biochim. Biophys. Res. Commun. 107(2):584-587, 1982.
Wright et al., Genetics, 142:569, 1996.
Xiang and Guerra, Plant Physiol., 102:287-293, 1993.
Xu et al., Plant Physiol., 110:249-257, 1996.
Yamada et al., Plant Cell Rep., 4:85, 1986.
Yamaguchi-Shinozaki et al., Plant Cell Physiol., 33:217-224, 1992.
Yang and Russell, Proc. Nat'l Acad. Sci. USA, 87:4144-4148, 1990.
Yen, Embo J. 10(5):1245-1254, 1991.
Young et al., In: Eukaryotic Genetic Systems ICN-UCLA Symposia on Molecular and Cellular Biology, VII, 315-331, 1977.
Yuan and Altman, "Selection of guide sequences that direct efficient cleavage of mRNA by human ribonuclease P," Science, 263:1269-1273, 1994.
Yuan et al., "Targeted cleavage of mRNA by human RNase P," Proc. Natl. Acad. Sci. USA, 89:8006-8010, 1992.
Zatloukal et al., "Transferrinfection: a highly efficient way to express gene constructs in eukaryotic cells," Ann. N. Y. Acad. Sci., 660:136-153, 1992.
Zhang et al., Gene, 202:139-46, 1997.
Zukowsky et al., Proc. Nat'l Acad. Sci. USA, 80:1101-1105, 1983.

WO 00/55325 PCT/US00/07392
SEQUENCE LISTING
<110> PREUSS, DAPHNE
COPENHAVER, GREGORY
KEITH, KEVIN
<120> CHROMOSOME COMPOSITIONS AND METHODS
<130> ARCD:309PZ6
<140> UNKNOWN
<141> 1999-12-10
<160> 179
<170> PatentIn Ver. 2.0
<210> 1
<211> 1038
<212> DNA
<213> Arabidopsis thaliana <400> 1 tgactatgtg atatggttca aattacctat aactactctc tcaaataaga gatcaattgc 60 agttttttag gatcgaattc acggagttct tttgttcaaa cagtgagtta aatgtcgaga 120 ttaagctagc aggatatgat tgaaaataaa agagaacaaa gtaagaaaac agcagattga 180 ttttgttgta aacgatttaa taaagagcta ggaacagggt attctcacga aactattggt 240 tagtagatct aatgaaagct aggttgtgat caaactaCtc ttaaactcaa actctaatta 300 tggaacaaca ggtaggcgtg ccgcgaaact ccctatatct atagctaata ataaccggag 360 aagccgagaa actatcaacc taaatatgca ttcttaacga gttcaattgt tcatcttact 420 agataggccg attcttatta cacacctata aaccagactc atcaaataat agatccaatt 480 acagatacct atgatgggca tatctagtgt ctggattcaa gatctagtta attactctag 540 atctagcatt aagcatagat gaagaactct acagataacc tagcagaggg ggcaatctac 600 taaaccatat gaatccctaa tgaaaaaccc tattcctaac aagcagatta ctcagacata 660 ttggatggag caaacaacat aattgacctt agcttttgct ccaaaatgtc tccttatctc 720 cattgttgtc ccattgcata aaatacctga aaagacacca aaaagactcg agagataaca 780 taacgactca aaatcctata cctaaaacat ggatacaatc agtaaaaatc gggttatatc 840 aactccccga gacttagctt ttgcttcccc tcaaacaaaa cacaaaagca aaacccgtgg 900 aagaggtttt gaaaacaaag gaactcccaa cattctctag cctattgcca tgatcatcca 960 aactaagtcc atatgcctaa caagtctaat caaatcctaa ccaacatgta cttctctgat 1020 tgatttttcc agttcttt 1032 <210> 2 <211> 601 <212> DNA
<213> Arabidopsis thaliana <400> 2 tgatatggtt caaattacct ataactactc tctcaaataa gagatcaatt gcagtttttt 60 aggatcgaat tcacggagtt cttttgttca aacagtgagt taaatgtcga gattaagcta 120 gcaggatatg attgaaaata aaagagaaca aagtaagaaa acagcagatt gattttgttg 180 taaacgattr aataaagagc taggaacagg gtattctcac gaaactattg gttagtagat 240 ctaatgaaag ctaggttgtg atcaaactat tcttaaactc aaactctaat tatggaacaa 300 caggtaggcg tgccgcgaaa ctccctatat ctatagctaa taataaccgg agaagccgag 360 aaactatcaa cctaaatatg cattcttaac gagttcaatt gttcatctta ctagataggc 420 cgattcttac tacacaccta taaaccagac tcatcaaata atagatccaa ttacagatac 480 ctatgatggg catatctagt gtctggattc aagatctagt taattactct aga~c~agca 54C

ttaagcatag atgaagaact ctacagataa cctagcagar_,ggggcaatct actaaaccat 600 a 601 <210> 3 <211> 885 <212> DNA
<213> Arabidopsis thaliana <400> 3 tttttttgtc cacacaatga gttgaatgtc aagattaagc tagtagagat tgattgtaat 60 aagaagtaaa caaagtaaaa agacaacgga ttgattggtt gtaaacgata aaataaagag 120 gtaggaacaa ggtattctca ggagactatt ggttagtaga tctaatgaaa gctaggttgt 180 tatcgaacca ttattaaaca caaattttaa ttatggaata accggtggtg ttctgcaaaa 240 cttttgtgcc tatagctaag aataaccgca gaagccgaga gatctttaac ctaaacatgc 300 attctaaacg agttcaattg ttcaccttag tatataggcc gattcttatt acacacctat 360 aaaccagact catcaaataa tagatccaac tacatatacc tatggtgggt atatctagtg 420 tctggattca agatctagtt aattactcta gatctagcat taagattaat tctacacata 480 atttagcaag ggggtgatct actaaaccat atgaatccct aatgaaaaac tcaattccta 540 acaagaaact actcagacag attgattgaa acaaacaaca taaatgaata agaaagcata 600 aacacaacaa ataaaattag ggaatgaaag gatctcttca ctgtaatgag aactgaatga 660 atctctgaag aacaacggat gattagctta tgtctctctg aaaataggga ttaaaaactt 720 gataaaagga acttaggtct aaacaatgac ctttaaaact atatataaac cctataaaac 780 gtccagggac taataatgca aatagggaag tcttttgggg caaatttcca cttttgtaaa 840 cttgaaagcg tattggactt ttctgggccg aaactggtgt cgatc 8g5 <210> 4 <211> 1072 <212> DNA
<213> Arabidopsis thaliana <400> 4 tatcttgata tggttcaaat taccctaaga actactctct caaataagag atccattgcg 60 gtatttaagg atcgaattcc acaaagttct tttcttcaaa caataagttc aatgtcaaga 120 ttaagctaga agggtatgat cgaaataata agaaaacaaa ggaagaaaac agtagattgt 180 ttcgttgtaa acgattaaat aaaaagctag gaacagggta ttctcatgaa actattggtt 240 agtagatcta atgaaagcta ggttgttatc gaaccattct taaactcaaa ctctaattat 300 ggaataactg gtggtgttcc gcaaaactcc ctataattat agctaagaat aaccggagaa 360 ttcgagagat tattaaccta aatatgcatt cttaacgagt tcgattgttc accttagtag 420 ataggccaat ttttattaca cacctataaa ccaggctcat caaataatag atccaactac 480 agatacctat ggtggacata tctattgtct ggattcaaga tctagttaat tactctagat 540 ctagcattaa gcataatcaa agatgaagaa ttctacagat aacctagcaa aggggaaaaa 600 ttactaaacc atatgaatcc ctagtgagaa accctattcc taacaagcag attactcaga 660 catattgatt gaagcgaaca acataattgt gtatgaaagg tccaaaatcg tccttagctt 720 ccttttcctt acctcttgct cgaaatgtct cctcatctcc attgttgtcc cgttgcacag 780 aatacctgaa aagacactac aaagactcga gaaataacat aaagactcaa aatccattac 840 caaacacata gataaaatcg gtgaaaatag gatatatcaa ctccccaaga cttagctttt 900 gcttgccctc aagcaaaaca caaaagtcga acccgtggaa gagattttga aaacaaagga 96C
actcccaaca tcctctagac tattgccatg atcatccaaa ctaaatccac atgcctagca 1020 agtctaatca aatcctaacc aacatgtact tctctaccca agctttgtaa tt 1072 <210> 5 <211> 591 <212> DNA
<213> Arabidopsis thaliana <400> 5 tgatatggtt caaattaccc caagaactac tctctcaaat aagagatcca ttgcggtatt 50 taaggatcga attccacaaa gttcttttct tcaaacaata agttcaatgt caagattaag 120 ctagaagggt atgatcgaaa taataagaaa acaaaggaag aaaacagtag attgtttcgt 180 tgtaaacgat taaataaaaa gctaggaaca gggtattctc atgaaactat tggttaatag 240 atctaatgaa agctaggttg ttatcgaacc: attcttaaac tcaaactcta attatggaat 300 aactggtggt gttccgcaaa actccctata attatagcta agaataaccg gagaattcga 360 gagattatta acctaaatat gcattcttaa cgagttcgat tgttcacctt agtagatagg 420 ccaattttta ttacacacct ataaaccagg ctcatcaaat aatagatcca actacagata 480 cctatggtgg acatatctat tgtctggatt caagatctag ttaattactc tagatctagc 540 attaagcata atcaaagatg aagaattcta cagataacct agcaaagggg a 591 <210> 6 <211> 650 <212> DNA
<213> Arabidopsis thaliana <400> 6 taagaactat tatctcaaat atttaaggat cgaatttcac aaagtttttt tgttcacaca 60 atgaattaaa tatcgagatt aagctagtag ggaatgattg aaaataaagg agaacaaagt 120 aaaaagacag cagattaatt ggttgtaaac gattaaataa agagttagga acatggtatt 180 ctcaggaaac tattggttag tagatctaat gaaagctagg ttgttatcga accattctta 240 aactcaaact ctaattatgc aataaccggt ggtgttccgc caaactccct atgcttatag 300 ctaagaataa ccggagaagc cgagagatct ttaacctaaa catgcattct aaacgagttc 360 aattgttcac cttactagat agaccgattc ttattacaca cctataaacc aggttaatca 420 aataataaat ccaattacag atacctatgg tgggcatatc tattgtctgg cttcaagatc 480 tagttaattt ctgtagatct accattaagt ataatcaaag atgaagaatt ctacagataa 540 cctagaaaag gaggcaatat actaaaccat atgaatcccc aatgagaaac cctattccta 600 acaagcaaac tactcagaca tattgaatga aacaaacaac ataattgagt 650 <210> 7 <211> 856 <212> DNA
<213> Arabidopsis thaliana <400> 7 taaatatgca tttttaatga gttcgtttgt tcaccttagt agataggccg attcttatta 60 cacacctaaa aaccagactc atcaaataat agatccaact acatatactt atggtgggca 120 tatctattgt ctggattcaa gatctagtta gttactctag atctagcatt aagcataatt 180 aaagatgaag aattctacag ataacctagc aaagggggca atctactaaa ctatatgaat 240 ccctaatgag aaaccctatt cctaacaagc agactact_ca gacatattga ttgaaggaaa 300 caacataatt gagtatgaaa acataaacac ggcaaataga tttaagggaa agaagggatc 360 tcttcactgt attaggaact gaatcaatct ctgaaaacac tcgatgaata gcttatgtct 420 ctcagtaaca gggtttgcaa aaagcttgat aaaaaacttg ataatgaaaa cttaggtcta 480 aacaatgtat atacaccctc taaaaacgtc tagggactaa taatgtaaat agaaaagttt 540 tctagggcaa atttcctctt ctgtaaactt gaaagcgtct aggactttgc tgggccgaaa 600 ctggtgtcga tcgacactag gagtgtgtcg atcgacactc ctcttgattc gtgaaaccaa 660 agtcgtcctt accttacttt ttcttagctt ttgctccaaa atgtctcctt atctccattg 720 ttgtcccact gcatagaata cctgaaaaga caccaaaaag actcgagaaa taacataaag 780 actcaagatc ctatacctaa aacatagata aaatcagtta aaataaggat atatcaatca 840 ccacaatcta catatt 856 <210> 8 <211> 736 <212> DNA
<213> Arabidopsis thaliana <400> 8 aactatgatt ttagagtaac cgatggcgtt ccgcgaaact c:cctatgctt atagctaaga 60 ataaccggag aagccgagag atctttaact taaacatgca ;:tattatcaa atttgattag 120 ttcacctagt atctaaacca gagcccttat atgagcctac ctgttctttc ttaaatgcct 180 aggctcatct atgatagatc aaatagcaaa tacctatggt gggcatacct attatctaat 240 atcaagttct agttagctac tctagaacta ccaataagaa caattaagat gaagaatcat 300 atagataacc tagcaagggg caatctacta aatcatctaa <xtctctaatg agaaacccta 360 aacctaacaa gtggattact aagacatgat caaagaaaca caaatcatat tctgaataag 420 aaataaatga tgaaaataac aagagaaaag agtaagaaag atccaaaagg gagttttcac 480 aggtttttgc tctccaaagt acaaaagaga tccaggaaat ~acctcccaa agcttacggg 540 tctaaaacaa tgacctaaaa actatatata tgtcttaaaa acatgatggg ccttaattaa 600 acataggaga aagttctggg ccgaatttgg aaatctccaa aacatcaata agttgcgtct 650 cgaaattgca gtcaggtatt agtgttgctc gacactaggg gtggtgtcgt tcgacaccca 720 cgtgcatttt cgtctc 736 <210> 9 <211> 679 <222> DNA ' <213> Arabidopsis thaliana <400> 9 tatcgcaccg ttctcgaact caaactatga ttttagagta acctgtggtg ttccgcgaaa 60 ctctctatac ttagagctaa gataaccgaa gaagccgaga aatcttatac taaaccatgc 120 atttttatca agaatgatta gttcacctag tatctaaact agagcccttc tatgagccta 180 tctgttcttt cttaaatgcc taggcccatc tatgatggat caaatagcaa gtacctatgg 240 tggacatacc tattatctaa tatcaagttc tagtcagcta ctctataact agcattaaga 300 acaatcaaga taaagaactc tacagataac ctagcaaggg ggcaatctac taaatcatct 360 aaatccctaa tgagaaaccc taaacctaac aagtgaatta ctcagacatg atcaaagaaa 420 cacaaatcat agtctgaata aggaatcaat aatagcaaga aaaagagtaa agaagatctt 480 ctccaaaggg agtctccaca gggttttgct cccgaagtac aaaaaaatag agaatatcct 540 ttccaagctt agatctaaac aatgacccta aaaacctaat tatatgtcta aaacacgtga 600 tgggccttca ttaaacatag gagaaagttc tgggccgaat ttggaaatct tcaaaacatc 660 aataagttgt gtctcgaaa 679 <210> 10 <211> 1198 <212> DNA
<213> Arabidopsis thaliana <400> 10 aagacatatc aatgtgctat gtgatatagg tcaaattaat cataagacca ctctctcaaa 60 taaagaggtc aattgcagcg cttagggatc gaataacaaa gttctcggat cacgcaatag 120 actaacggca aatcgaatta tgctaagtaa aatagtaaat aaaataaaga gaaacaaaag 180 tatgcaatag caatcgattg gatggttgtg aaacaagata agaaaagcgt caggcttagg 240 ctattctcag gaaatagatg atagtagatc tagaaatagc taggttatta tcgcaccgtt 300 ctcgaactca aactatgatt ttagagtaac ctgtggtgtt ccgcgaaact ctctatactt 360 agagctaaga taaccgaaga agccgagaaa tcttatacta aaccatgcat ttttatcaag 420 aatgattagt tcacctagta tctaaactag agccctccta tgagcctatc tgttctttct 480 taaatgccta ggcccatcta tgatggatca aatagcaagt acctatggtg gacataccta 540 ttatctaata tcaagttcta gttagctact ctataactag cattaagaac aatcaagata 600 aagaactcta cagataacct agcaaggggg caatccacta aatcatctaa atccctaatg 660 agaaacccta aacctaacaa gtgaattact cagacatgat caaagaaaca caaatcatag 720 tctgaataag gaatcaataa tagcaagaaa aagagtaaag aagatcttct ccaaagggag 780 tctccacagg gttttgctcc cgaagtacaa aaaaatagag aatatccttt ccaagcttag 840 atctaaacaa tgaccctaaa aacctaatta tatgtctaaa acacgtgatg ggccttcatC 9C0 aaacatagga gaaagttctg ggccgaattt ggaaatcttc aaaacatcaa taagttgtgt 960 ctcgaaattg ccgtcaagta atggtttcgc tcgacaccta agttcatttt cgtctccgga 1020 ttgtttctgc agcttaaatt ctctgttttc ctccagaatg ctccattatc tccaaatgaa 1080 tccaaacatg taaagacctg aaaaggacta gaaaagactc tagaaataac aattagactc 1190 taaaacctat atctaaaaca tacttaaatt aggaaaaaca gggatatatc acttatat 1198 <210> 11 <211> 696 <212> DNA
<213> Arabidopsis thaliana <400> 11 tccaactatg attttagagt aaccgatggc gttccgcgaa actscctatg cttatagcta 60 agaataaccg gagaagccga gagatcttta acttaaacnt gcattattat caaatttgat 120 tagttcacct agtatccaaa ccagagccct tatatgagcc tacctgttct ttcttaaatg 180 cctaggctca Cctatgatag atcaaatagc aaatacctat ygtgggcata cctattatct 240 aatatcaagt tctagttagc tactctagaa ctaccaataa gaacaattaa gatgaagaat 300 catatagata acctagcaag gggcaatcta ctaaatcatc taaatctcta atgagaaacc 360 ctaaacctaa caagtggatt actaagacat gatcaaagaa acacaaatca tattctgaat 420 aagaaataaa tgatgaaaat aacaagagaa aagagtaaga aagatccaaa agggagtttt 480 cacaggtttt tgctctccaa agtacaaaag agatccaggg aatagcctcc caaagcttac 540 gggtctaaaa caatgaccta aaaactatat atatgtctta aaaacatgat gggccttaa~ 600 taaacatagg agaaagttct gggccgaatt tggaaatctt caaaacatca ataagttgcg 660 tctcgaaatt gcagtcaggt attagtgttg ctcgac 696 <210> 12 <211> 670 <212> DNA
<213> Arabidopsis thaliana <400> 12 agttgatata gctcaaattg ccttaagctt actccctcaa ttaagagatc gtcgttagca 60 cttaagggtc gaattccatt gagctctcga tgttcacaca atagacttat gatattgttt 120 aataagctaa atgaagtaat tgaatattaa aggcaaataa gcaagtaaat gagttgtaga 180 ttcaagtgat taaagcgtca ggtctaagga attatctcgg gagatagata aattgtagat 240 ctagataata tcaggattgt tatcgcaccg ttctcatact caaactataa ttctagagta 300 atcagtggcg ttccgcaaaa ctctctatac ttacaactaa gataaccgga gaagccgaga 360 aatcctatgc taaagcatgc attgttaata agcttgaata gttcacctag tatctaaacc 420 agagcccttc tatgaaccta cctgtttttt cttaaatgcc taggctcatc tatgatggtt 480 caaataacaa atacctatgg cgggaatacc tattatctaa tatcaagctc taggtgatca 540 atctaaaact agcattaaga ataatcaaga tgaagaacta taagaataat cctaagggct 600 tttcgatcta ctaatccatc taaatcccta ttgagactcc gtagacccaa caaggtgatt 660 actcaaacat 670 <210> 13 <211> 687 <212> DNA
<213> Arabidopsis thaliana <400> 13 tcgatccccg gcaacggcgc caaatttgat atagctcaaa tcgccttaag cttactccct 60 caaataagag ttgtcgttag cacttaagtg tcgaattcca ctgagctctc gatgttcaca 120 caatagaatt atgatgttat taaataatct agacaaagta attgaatgta aaagacaa~t 180 aaccaagtaa acgaagtgta gattcaagtg attaaagcgt cgggtctaag gaattgtctc 240 gggagataga taaattgtag atctagctaa tataaggatt gttatggcac cgttctcaaa 300 ctcaaactaa gattctagag taaccggtga tgttccacaa aactctcttt acttagagct 360 aagataaccg gagaaaccga gaaatcttat actaaagcat gcattgttat caagcttgac 920 tagttcacct agtatctaaa ccagagccct tctatgagcc tacatgttct ttcttaaatg 480 cctaaactca tctatgatag ttcaaacaac aagtacctat ggtgggcata cctattatct 540 aatatcacgt tctaggtgat caatctaaaa ctagcaataa gaataatcaa gatgaagaac 500 tataagaata atcttaaggg gttttcgatc tactaatcca tctaaatccc tattgagact 660 ccctaaaccc aacaaggtga ttactca 687 <210> 19 <211> 802 <212> DNA
<213> Arabidopsis thaliana <400> 14 tctcaatctg aaggtacctg aaaaacaaga gaccaataga caaagaaata cgtgaatacg 60 tggtagaaaa gttgaattta gacttaaaag gacacctagt ccaatggtga actaaaagag 120 aacttcgcca acggggccaa atttgatata gctcaaattg ccttaagctt actccgtcaa 180 ttaagagatc atcgttagca cttaaggatc gaattccatt gagctctcga tgttcacaca 240 atagacttat gatgttgttt aataagctaa atgaagtaat tgaatattaa aggcaaacaa 300 gcaagtaaat gagttgtaca ttcaagtgat taaagagtca ggtctaagga attgtctcc~ 360 gagatagata aattgtagat ctagataata tcaggattgc catcgcaccg ttcccaaact 420 caaactataa ttctagagta acagtggcgt tccgcgaaac tctctatgct tacaactaag 480 ataaccggag aagccgagaa atccgatgct aaagcatgca ttgttaataa gcttgattag 540 ttcacctagt atctaaacca gagcccttct atgagcctac ctgttctttc ttaaatgtct 600 aggctcatct atgatggttc aaataacaaa tacctatggc gggaatacct attatctaat 660 ataaagttct aggtgatcaa tctaaaacta gctataagaa taatcaagat gaagaactct 720 aagaataatc ctaaggggtt ttcgatctac taatccatct aaatccctat tgagactccc 780 tagacccaac aaggtgat=a ct gp2 <210> 15 <211> 821 <212> DNA
<213> Arabidopsis thaliana <400> 15 acaaagtctt aatagtacct gttttaaata taatagagaa gattttataa aaacgatgga 60 aacaagtctg gtattgatgt tttccgttct catcaacaac ttcacctatt tcagctcgtt 120 gcattcgttc aaggaattga gtcctgaaac agtagcaaaa aaagaaggaa ataaatagcc 180 aagaataaaa ttatttataa tacactaaac agttaagaga taatgaaaat ataaacgttc 240 ttacgtgatg cgatccatgt taatctctgg gtaactttaa ttgaatgtaa ttcttgaagc 300 accattatgt gtgattagac attgacgacc taaaatattt atgtttttat tatatgcatt 360 agctataaaa aaaacatatg ar_gagaagag agttaaattt accttcatga tcagcaacgt 420 tccaaaatcg aagcacgcac gttataggat tcataccgaa ggtccactga taaccaaggt 480 gtagaccaag ttccctccac aaacgcacaa acaaatcaca agtatatcca cttgcaacac 540 atgtaagagt ttttccccca at.aagaatat cttaattatt tctctctaac atctaagttc 600 tataaattaa gccacctata aagataacaa tgcttactat gtttctacat taaaatatat 660 aaccaaaaat atgtcgaact atatagtcgg aaatactaac caagtatttt taagctcaaa 720 ctcgagaaat tgagaaagtc gaggttttcg gaatattgag gaacataatc tggtcttgtt 780 atattcccaa ctcgcacg~c aacaccataa gcatctacaa a 821 <210> 16 <211> 672 <212> DNA
<213> Arabidopsis thaliana <400> 16 cctcccaagt gagcttgttt aaagtcatta gcttgactcc tttaatcatc aagaagctcc 60 tggagatgaa ccgatggcac ctctggaagg atttggtctg ctaagtattt cttgagcctt 120 tgaccattta ctgtaaaatc tccactctta ccagctagag tgactgctcc ataaggacgg 180 acctcagtga tgtagaaggg gccagaccat ctggatttaa gttttcctgg aaagagtttc 240 aagcgagagt tgaatagcag cacctgatca ccaaccttga aatccttaat gatgatcttc 300 ttgtcatgga aaagcttggt tctctccttg tagattttag aactctcata agcttctaga 360 cggatctcat cgaggttac~ tagttggatc aatcgcttct cctcagcagt ttttatgtca 420 aagttcaaaa gttttaccgc caacattgct ttgtactcga gctcaacggg tagatgacat 480 gattttccat agagaagat-_ gaaaggagtt gtacctatgg gggtcttgaa ggctgtcctg 540 taagcccata atgcatcatc tagctttaca gaccagtctt tccttgtaat cccaacagtc 600 ttttctagaa ttgtttttG~ ct~cctatcg gagatctcaa cctgcccgct tgtctgtgga 660 tctatgacca gc 672 <210> 17 <211> 2954 <212> DNA
<213> Arabidopsis t:~a_iana <400> 17 taacatgaaa atattatc~c atgtatctta taatacaaac tttctgcaat cttcttaaga 60 acatggctaa atagcaaaca tcgctatcaa ttggtgaatt taaaaaacaa agagtcactg 120 attacataag aacatccgcg gttggatggc agctttcgca cttgtaagac tcaagtagta 180 cgttcttctg gagaaaaggc attgaacgtt tcgatagcgt actctttagc tgcccctcgc 240 atcctctctt cgtctagtt.- agcacaatag tccatagcgt taactgaatg tagctcaagt 300 aagatcacac aagcaattcc aa~:aaggctt gaaaatcaca cgtgaaatat atatgtcaca 360 cagtcaaata tgaaatac=~ at<:ganacaa gataatgaaa gacaaccata cctaacgacg 420 aacaattaaa ctgtaagt-_g ggtctactca ttgtcataag atcgagcctc tacgttagca 480 WO 00/~s325 PCT/US00/07392 ttcttccgtt tacaccatgc ttttgaataa ggtacaagag catgatttta tgtgttctat 540 gtacttcgcc aactgctcgt ctttcaacac ttcactcttg cagtctaata caatgatttt 600 ctatttcatc aaatcgatga caagcccaac ccaatgctgc ctctcaacct gcattgggca 660 gtagatgaca tcaacctcat caaaccattt tggtcgcacc atggatggga ctcttgtaag 720 cgtggtgaag taaaaaactc tcttacctga agctgcacaa aatttgttgt attggcgtcg 780 tagctctgac aagaaacttg atggtagaaa atcaaaatgc agcggcttct ttttatcccg 840 ccgcatctga atgaacctca ctaaaagatc tggtccttgt ggaagtcact ttttttatta 900 gcacacacct ttaatactta gttttctact aaaaaagcag tgagcacaca gttttatttg 960 aaacaacaca gattgcatgt ataacttact acttcatcta gaagtgtttg cgcctcgtaa 1020 atagttagaa attcagtatt ctccattaca acgccatcag cctgaaaata agatctacat 1080 ggtaccatat accagcaagt tagttgcgaa agcaaataag ttcttctgaa aatatagggg 1140 ctataaatac ttacactctc cgttctattt gtttaagaaa acatttcact ctctctgcac 1200 ttggaacagc gaatcggtca tatggaccaa tggtggtgga tttggaggta tcttctcccc 1260 ttttcctggc ctaacaacat gaactttagt cgaccctcta acaggtgttt gtcgctgctg 1320 taccggaatg gtaggccgac tctctcttct gttgtacatg acaggactgt ccctaatccg 1380 gtcgctcgac cgcaacacta tgacttttgt ggggaatgtc ttcttcttac cttttggagg 1440 taacctgctg atgtcagcaa caatttcttg cgttaggcca acaaaacgtt tggggcaata 1500 acatcacaat catcttccaa gttcttttct gcatcaatat cctgcataca tatacccatg 1560 tacatagttt agcatatata catcatttac ctatagttta acatctttgt catggcttac 1620 cgggttggtt gcatgcaatt 
cttctttgtt tctaatctgt aatattcaat tagagaaagt 1680 aagtgggttt cattcaaaaa atataagtgc tagtggatta aatatacaga cggatatagt 1740 aataccttac tgcttgacac ctgctccttc tctttagctc ttttaggtgg ccgggtagat 1800 tgggcgtcca tatcagaaat ggattttttt ctccgtcctt tttgagaagc agcgtcacct 1860 cagacctagt agtacatcac atttctatgt catatttact cattaaaaca aatgagtaac 1920 agtctttacc gtatgctaac ctcctcaaga gatactcttg ctgccatctg acacacttaa 1980 tccatcagtt atgccacaaa cacccgctac actgtcactt cctccaggta gccaaatatc 2040 aagttcatcc ggctgggaac caacatgcgt atgccttggc tggtcagagc ttctatgttc 2100 ctatttcatg aattgcaatt gtaacataag tatattatgt cacttaaaat ggaatggaac 2160 ataaatgttc agttttctaa gacatataca acaaatcaag gtaacttaac ctttgtagag 2220 catgactctg ctacacattt gttgcattcc cctttttgag catgttgtat attttgccca 2280 cttaccagtt ccctaacaga ttcagcaatc attgccacca attctttctt atgcttatct 2340 agttgagcat caaagtatgc cttcaagtca actgtgttcg cctgttgaat cttgagaaaa 2400 cgtttttcaa attcatcagc tgcctctttg atcaatgcac tactactatg ttgcctttcc 2460 ccagcaatcg ggacatcatg ctgcaaatct tctctagcct aaaaactgtc atcatttttt 2520 tgtggttgtc catcatcaac aggaacatca acacccatgc cgtcttcaca tacaccggtt 2580 gcaccctctt gtctctcttc tgcagcacca ccatggacat cagcctccat ctcatcttca 2640 ccatcaatct tagacgaatt cacaccatca tctgcaagaa ctcatcatca gatggttgta 2700 caatatcaca agcgtccatg ccaccactcc aactgttatg ctcgtatggg tgctcctcta 2760 ttattaaaga tagcaggtgt gatacttctt catcctcctc ctcatcagat aaagataagt 2820 cgatatcttt tgcaagtagt ttttcattca gcagtgatgt gacccagtcc tacaatataa 2880 agttgaattg gatagttaag gcatgcacat acaattttaa taccaaaatc aagtagctga 2940 catacaatat aaag 2954 <210> 18 <211> 1129 <212> DNA
<213> Arabidopsis thaliana <400> 18 catgcgggaa cccttgtatg gctcttgtat cttggggaag ttccttggtt gttctttgtt 60 ttgcgcttct aattgtagaa agaaacgagg ttctgcccat ggatatttaa gaaatgcaac 120 aacatcttta accatctcta catgttcagg attaagattg ttcccgcaca tgtcaggcaa 180 agaattccat caactattag aagcaaggct aggcgaaacc gggttagcgg gtctttgtat 240 ttcttcccca tagaaagctt tcctacaatc cattttgaag ttgaccttct ctctttacca 300 aatagtgact tgtagaagct tgtaggattc tgaaccgctt tcacattccc tctagttctt 360 gcatccttct tgttgaagtc agctttgctg aagttaagac cagtcaccag ctcaaaatca 420 tcaatgcaaa agtgaattgg tgtcccaaca aacaaccacc gcatctcata cttcttattt 480 gtcaccaact gtctagatag aatgtaatgc aagaacatta cagaaaatcg atgggttccc 540 atttgcaaga tacttcgaaa tttggaacca tccaaattct gcaaaatgct cttcatctag 600 tgcagccttt ataatctcaa tccacctaag tgtaaagtaa ttgtttatcc tctttcgacc 660 ttcgggttca gaacaaagcc gaaagaaacg ctttgtcaac tcatcaccc:c cc<3tttaaca 720 gataccctga aattcaaagg taccattatc acttcttttc gcttacgaga tacccaaaac 780 agatgtatac aaatcatgta atctaaactg gcactaaatg ttcaataact caccatatag 840 gtggttaaac cacacatgca atatgtagcc atctttccat tagtttccta gtcgcaactc 900 aaattcgacg attatatagt ccccgccaca aattgcacac ccggaagttg cttactcgac 960 ttcaccgtca ccagccctgt cgccgttatg cgaaaccccc aaatcgaaaa accggatacc 1020 cttgcatcgc cactctaacg gtgtcgatct atatcactct tcgaaacttt cccaagttgc 1080 ccgttttatg tatccacgct ttattttggg tattgtaaat tctctgcaa 1129 <210> 19 <211> 713 <212> DNA
<213> Arabidopsis thaliana <400> 19 ctaggttatg aacccacgct acgcatgggt ttatttgtat aacttttaat aattttttgt 60 gtgattagat tttttattga aagttttgaa agatatgttt ttatatgtct acttgttgtt 120 aatattataa actctttgaa ttttaaaatt accatgacaa atagttacaa tttaaataca 180 taactaaaaa taactttaat acaactttta cgtttacaac atttatcaaa tgaacatttt 240 tttggttcat gactctctct ttatctttag atgtttagga taacgacgtt ttttacgtat 300 gaaataaatt gttgtatgga aaataatttt gacgactttc tcatcttggt acccgcaaat 360 ttactatcgt ttttcgttta tcttcgctcc attagttagt atagctcttc taaaaataag 420 atattatcta gaagtacatt cataacctca tctacacttt cttttaacaa ctcatacttt 480 tctgactgtc ttttcaatat ggatatgaat ttctgtcttc acaatttttt acagagagtt 540 atcactcttc ctatcttttt gcaagcactc gttttatttg tttcattttt tcttacagtg 600 tctctttcca tcctcttatg atttgtctaa gcaatatgtg tcaaactttc taaaattttc 660 acatgcttca tattttttct tcgaatatct cttccttttc ttgttagtat aag 713 <210> 20 <211> 1023 <212> DNA
<213> Arabidopsis thaliana <400> 20 ttatctttca aaacaaaaaa aaacttcctt ttgaaagata tggagcctat ccacaagctc 60 actgatttga atgtaaactt gaccgaatgg aaggtttatg tgaagatcct atctatctgg 120 aatcatcctc caaagaatca tggtgaagtt acaaccatga ttttggttga ttaaaaggtt 180 tggatccttt tattttttgg ttcatttatc aaattttctt tgttacgatt gaaatagagg 240 cttttttctt ctttaaaaga gagttttttc ttttcttctt acctgggtac tcgaatagat 300 gcaaccatac cttaaaaaaa ctataaatat cctttctgac ccattctcaa gccggacatc 360 tgctttcatc tttctgattt tcgagtcatt tatccaacaa acagagttaa ttattctttg 420 tttcatgtcc aaatcaagtt tatttgggga acaagtgttt agcctgttct agtaattaag 480 aaaagtaatt tctttgattt catttttccc atagacatta agtatccttc gtcttaggaa 540 gttgagtatg tcactggtaa gtgtatagag aatttaagtt gttatcattt aaatctgctt 600 atatttgtta gtatgtgatg catttaattt ttatataatc tatcaacatc ttctgttcca 660 cttagacgct atgggtgtgg tgtcaaatat ctctgctatt aagaattttc catttgttgg 720 tcatcagggt gagactgact ataagtacat gtatgtctct ttcgacattg tggacactat 780 gtaagttgtt gtatttatag gtgtataaag aatcctgctg tcttatgaat atgaaactta 840 gaagaaataa gtattgcaga ttgaatcata ttgtctaggg acagaaaatc aagtgctttc 900 ttgtggaaaa atgttgtgaa ctgtttgtta taaagtggac taaacgtatt gttcagtttc 960 atatagcaat aaaccaattc ttgcaattgt ccgcttctag agaaacacag aggttgaggg 1020 t to <210> 21 <211> 1955 <212> DNA
<213> Arabidopsis thaliana <400> ~l taacagaaat aagtataact ataactatac ttttatgtaa tcgttgcaaa ctttgatagg 60 agtttg~ttg gagtttcttt acgtcatatt tgtaaacttt gttaggagtt taacaaaaag 120 taagatagag aaaaatccaa cgggaacaca ataattaaat aggttcaaaa cataagcaaa 180 gtttaacaaa aaaaacaatt tgcaatgaca aagtttggag aaacttaaaa gaagttcaaa 240 acataattta aagaagaata agagctgcca ggtcaaactc ctaaatattt tcaatcactt 300 ccatcctttg gttctctgga acttcataat aatcttcgaa accgcggctg taaaagagaa 360 aaataaaaat ttttattgat taattgatgt tatggttaac taatacaatt aaaaaatgtt 420 aaaagtctta catgtaactt cgaatatcta tctcagggaa agtggttgga ttgatgtaaa 480 ttcttgaaca tccaaattca ctcatcaaaa cattctcacc tgttacaaaa tttttattta 540 aagttatata atacaaataa aaatcataag taaataggat tacaatagta ttgaaaataa 600 ctaaccctca atctcagcaa ttctccaaaa cctcacatgc ctctgaactg atacatttcg 660 agaattgtgt aacaaagatt tcacagcatc ttccaacagc atagcatttc attttctctc 720 ccctaaccaa tatgataaaa cattacacta agcaatacat ttttaagctc aattttcaga 780 cggaagacac attactaaat atacttataa acactaaaac ttacatatta cccaaaagat 840 cgaaagagac atatctcgat tcatagtcgg tttcaccctg gcgacaaaca aatgagaact 900 tcttaatagc cgagatattt gacactacac ccaacattgc atctaagtat agaataatat 960 attaatagag tagttcaaaa aaataataac ataaactact aacaaaaatt cattctttaa 1020 atgagaagaa catagattga ctaaaaactt acccgtgata taatcccaat cctcaagaca 1080 tggatactta aggtttatgg gaaaaatgaa atcaaagaaa tctcttcacc agttccggaa 1140 gaggcaaaac acttgtttcc caaatgaact tgatgtggaa acaaaacgaa gaatacctaa 1200 ccctttcttg aggaacaacg acccgaaagt cagagatgtg aatccatgtg tcaggaattg 1260 catctacacg attattctga aaaaattccc agataactca aagattaaat aatttaaaat 1320 gatctgtttc aataataaca aagtaaactt ggattaaaag aacgaaacaa tcaaaaatac 1380 aaaccttgtc atcatgaaga atcatggttg taatttcacc atgactgttt ggaggatgat 1440 tccagattga caggatcttc acgttaatct tccaattagt cgagttatcc ctcaaatcag 1500 tgagtttgtg aacagacatc atgtttatcc aaggaagttt ttttttttgt ttgaaagata 1560 agagttaaac ctaattgttt taggcttgag gtttttgatt ttatataatt acgaaaagat 1620 acaatgatgg acagttgatg tgtaaaaagg aaacgaattg tccacaacca 
ataggatttc 1680 ttataggtta gaacaataat gttgatattt atttcttagc ctgttactaa aaaattgtaa 1740 ttatttaaag gagaatgtat gaaaaaatag gaataggttt gtttttcaag attctcatga 1800 ttaatcatat ttaataggat tgattattta gtgtacaaaa tctttcctaa ttctgatt~t 1860 gttggttttt tttgtttaaa atgaccaaaa gtgtttatag ttctccgtct gattaacatt 1920 gtaatataaa agaatatgtt tcttaatcat aaata 1955 <210> 22 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 22 cgccaaagac tacgaaatga tc 22 <210> 23 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 23 ataatagata aagagcccca cac 23
<210> 24 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 24 gggtctggtt agccgtgaag 20 <210> 25 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 25 gttttactta gtccaatggt ag 22 <210> 26 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 26 aaatggccaa cgatcagaag aatag 25 <210> 27 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 27 gaagtccggc atgttatcac ccaag 25 <210> 28 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 28 caagtcgcaa acggaaaatg 20 <210> 29 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 29 aaactacgcc taaccactat tctc 24 <210> 30 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 30 gaagtacagc ggctcaaaaa gaag 24 <210> 31
<211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 31 ttgctgccat gtaataccta agtg 24 <210> 32 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 32 gttgacttgt atttgatttc tttttc 26 <210> 33 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 33 cgagtgattt ccttttgcta cc 22 <210> 34 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 34 aagataaagc agcgaatgtg tc 22 <210> 35 <211> 24 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 35 cgaaagccgt aactagataa taag 24 <210> 36 <211> 20 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 36 taccagcata caggagaacg 20 <210> 37 <211> 22 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 37 cctgattgca gttttattta cc 22 <210> 38 <211> 21 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 38 tccataccta agttccacaa c 21 <210> 39 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 39 aggggcgagt aaatcaatc 19 <210> 40 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 40 gaagtgcgga tctgtttgaa g 21 <210> 41 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 41 ataaaaagcc ggagatggtt g 21 <210> 42 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 42 attcatgagt gcaaagggta gag 23 <210> 43 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 43 ctcagccaaa gaatcaagta gag 23 <210> 44 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 44 aagcttcatt ctgtggtttt g 21 <210> 45 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 45 agaatcctta gccgtcctg 19 <210> 46 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 46 gccttggatg atcagtggtg 20 <210> 47 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 47 agcccttgga tcatattctt tagc 24 <210> 48 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 48 ggctactggt caaatcattc 20 <210> 49 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 49 gaatctttgc aaacgagtgg 20 <210> 50 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 50 gcggctgatg atctccacct c 21 <210> 51 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 51 ttaccccgca ggaaaaagta tg 22 <210> 52 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 52 acttcatcac ttgcgggact g 21 <210> 53 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 53 ggcccaagaa gcccacaaca c 21 <210> 54 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 54 accgcaagtg tggctgttc 19 <210> 55 <211> 27 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 55 ctattctaga agattgttag gagttac 27 <210> 56 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 56 atgcctattt agccttttta tag 23 <210> 57 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 57 cgtctgtatg gattcgtagc 20 <210> 58 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 58 tgagaggtgc aaaatcataa cag 23 <210> 59
<211> 16 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 59 accgcgtcgt tggagc 16 <210> 60 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 60 ggccgcgtaa gaggagac 18 <210> 61 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 61 aaactgatat tgtagatgtg tattcg 26 <210> 62 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 62 cgttcgaagc gtttgttc 18 <210> 63 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 63 attacagttt tgcctagaag atgg 24 <210> 64 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 64 aagttgattt tctactgttt atttag 26 <210> 65 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 65 catcgtcata tggcttgttc 20 <210> 66 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 66 taacgttccc acatgagc 18 <210> 67 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 67 aactctgtac ctgctgga 18 <210> 68 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 68 catctccatg aaggtgaata g 21 <210> 69 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 69 aagttatgca aaacgttatg acg 23 <210> 70 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 70 gagcccttct atgagcctac ctgttc 26 <210> 71 <211> 30 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 71 agagatcccc tgttactaaa gcctattctg 30 <210> 72 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 72 atatttcgtc gatcgtgttt g 21 <210> 73 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 73 gtgcc;.cagg gacttcac 18 <210> 74 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 74 ggtaacagcc ttcactcgcc 20 <210> 75 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 75 aaagacttgt atttgggatt tg 22 <210> 76 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 76 tctttccctt aatctatctg ttgtg 25 <210> 77 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 77 aaacgattgt tttcctgcag tg 22 <210> 78 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 78 tctctgtgct ttctctttcc tgac 24 <210> 79 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 79 gcaatgctac cgctctgata g 21 <210> 80 <211> 25 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 80 ttgtttttct aggttttgtt gtaag 25 <210> 81 <211> 20 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 81 atgctgcgat gtttgtaagg 20 <210> 82 <211> 20 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 82 agtcgatgtc taggctcttc 20 <210> 83 <211> 22 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 83 cttccatttc ttgatccagt tc 22 <210> 84 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 84 actaaggccc gtgttgacgt ccctc 25 <210> 85 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 85 aaccgcttcc cattcgtctt c 21 <210> 86 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 86 ggcgaccttg gacctgtata cg 22 <210> 87 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 87 aaccgccatt ttcatttcta tc 22 <210> 88 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 88 taggggacat atcaaaccaa c 21 <210> 89 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 89 gtctaaaacc atcttcacca tact 24 <210> 90 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 90 atgcctaact attcgctgac 20 <210> 91 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 91 ttctgtagtt ctttgtgagt gc 22 <210> 92 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 92 ggcattaatt gggaaggtc 19 <210> 93 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 93 tataacatca aaagcggtca tcag 24 <210> 94 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 94 gcattaaaga caaaaagccc 20 <210> 95 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 95 cgttgacccc gagaagatta c 21 <210> 96 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 96 ttcgggaatc atggtctaca ag 22 <210> 97 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 97 tgtcacatac acggtttctc ttag 24 <210> 98 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 98 caagcttcat ggggactag 19 <210> 99 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 99 taatacggga caatctacaa cac 23 <210> 100 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 100 ctaattgtaa cggagaagag ag 22 <210> 101 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 101 aagcatgtta cgtgggattg 20 <210> 102 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 102 caaatgatgt ctggtctatc ttc 23 <210> 103 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 103 aatttaaaag gaatcagaga actac 25 <210> 104 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 104 atgatcaaag ggggacgagg 20 <210> 105 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 105 aaggaaacac caccaaacga aaac 24 <210> 106 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 106 gagacagagg atttggaac 19 <210> 107 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 107 gaaaccctct cctcaaac 18 <210> 108 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 108 cccctcccgc cctaaaccta c 21 <210> 109 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 109 ttccg~=taca tggccttcta ccttg 25 <210> 110 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 110 cgtattcccc tgaaaagtga cctg 24 <210> 111 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 111 acatccggcc ttcccattg 19 <210> 112 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 112 attcttttgc tttatgggac ttc 23 <210> 113 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 113 aaacatgctg cagcttgatt ag 22 <210> 114 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 114 aggacgatga tacgcttgtg gag 23 <210> 115 <211> 21 <212> DNA
<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 115 atcatgggga cgctgctttt c 21 <210> 116 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 116 ttggttttaa ggctttggtg tagg 24 <210> 117 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 117 atgcgcagaa gagacgatga tag 23 <210> 118 <211> 28 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 118 gtttaaattt ttatgtcatg tctgtttc 28 <210> 119 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 119 ctttgggcga tgtaggagta g 21 <210> 120 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 120 cgcgacctta gccttgttgt g 21 <210> 121 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 121 tgtgggcagg gtaatggatg 20 <210> 122 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 122 atatccggct ccgaacttgt gg 22 <210> 123 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 123 ccgcgagatg gatgtgatga c 21 <210> 124 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 124 tgagggggct gacatttctt c 21 <210> 125 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 125 ttcccccgag gcgactgac 19 <210> 126 <211> 21 <212> DNA
<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 126 tcggttgggg atagaaaatg g 21 <210> 127 <211> 23 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 127 gtggcacgat cgtatgagtt agc 23 <210> 128 <211> 24 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 128 ctctcatcga ccctcactct caag 24 <210> 129 <211> 27 <212> DNA

<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 129 agtcccaaca aaaccaaaaa cataaac 27 <210> 130 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 130 ggcctccatg ctaccaacaa c 21 <210> 131 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 131 cacaaaatgc cacccctact acc 23 <210> 132 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 132 tggcagcaga gttatttgac gag 23 <210> 133 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 133 atgcgcgact gaaggacacc 20 <210> 134 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 134 ggcctgccca taaacctg 18 <210> 135 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 135 ccgctgtgga acctgaaag 19 <210> 136 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 136 aaacgccgcc aaaatcagaa c 21 <210> 137 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 137 acaaccttag ccccatccat tc 22 <210> 138 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 138 ctgcgagcga ggtcaatg 18 <210> 139 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 139 gcagccgtgt ggatggag 18 <210> 140 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 140 aatcaattgg tttctacttt ttag 24 <210> 141 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 141 aactccgact gaaggtatag c 21 <210> 142 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 142 tttgcaccgc ctatgttacc 20 <210> 143 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 143 gaggacgttt tgcagagtg 19 <210> 144 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 144 tcgactagat ttattatttc tctcag 26 <210> 145 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 145 tttggcttga ctctgtgaac 20 <210> 146 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 146 gcgaattcct tgccactaag 20 <210> 147 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 147 aagaagaaga ggaggaagaa gatgtc 26 <210> 148 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 148 agtggacgcc ttcttcaatg tg 22 <210> 149 <211> 19 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 149 tggtccgtcg tagggcaac 19 <210> 150 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 150 cttcacgctg ccttcactct c 21 <210> 151 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 151 gatacgctcg ttcccactcg 20 <210> 152 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 152 caaaaccaaa tccgcgaaga ac 22 <210> 153 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 153 agtggccagc cttcttaaca tacc 24 <210> 154 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 154 tttgtgcaat ttattagggt ag 22 <210> 155 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 155 atttgca_qaa gctgaagttg gtc 23 <210> 156 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 156 ccgtggtcga gagttgagtt agtc 24 <210> 157 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 157 acccggagta gtttttcagt gttc 24 <210> 158 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 158 agcttcgata acaaactcac c 21 <210> 159 <211> 26 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 159 agaagataaa tcaactaaac aaaatg 26 <210> 160 <211> 25 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 160 aacgcttatc ctctttctct tttac 25 <210> 161 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 161 acggttgccc atcttatcag tg 22 <210> 162 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 162 tctcgttctg atggctcctg tg 22 <210> 163 <211> 23 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 163 gtgtaaccgg tgatactctc gcc 23 <210> 164 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 164 cgacgaagca gtggaggaac 20 <210> 165 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 165 gcgagaaaac gtgaagagat ag 22 <210> 166 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 166 agctactacc cgaatgtgaa tc 22 <210> 167 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 167 ttggtgtgtt aagaagagtg g 21 <210> 168 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 168 taggacgcaa atcagagaag 20 <210> 169 <211> 24 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 169 ctaatcatgt gtctttaggc tatc 24 <210> 170 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 170 ttgggctggc gtggaatc 18 <210> 171 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 171 agggcagaaa gcgtcagg 18 <210> 172 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 172 gctgcgaagg ttgaatgaag 20 <210> 173 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 173 tcgccgggaa aaacagtaac 20 <210> 174 <211> 21 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 174 caccgacgtt atctgggaaa g 21 <210> 175 <211> 28 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 175 aaaagttagg tagtaggaaa gaaagaag 28 <210> 176 <211> 21 <212> DNA
<213> Artificial Sequence <220>

<223> Description of Artificial Sequence: Synthetic Primer <400> 176 gagcgtgctt ttggagtttt g 21 <210> 177 <211> 22 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 177 aaccctagat cgcccttttt tc 22 <210> 178 <211> 20 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 178 gactcatatg tggcgttttc 20 <210> 179 <211> 18 <212> DNA
<213> Artificial Sequence <220>
<223> Description of Artificial Sequence: Synthetic Primer <400> 179 aggattcact ggcggttg 18
SEQ ID NO: 180 Arabidopsis thaliana UBQ11 upstream regulatory sequences
cataacacaat~~ctttgtccaaagaaagaca~a~aagggct«tttttgtttgtttgtlaaa~~tgtga~_~cat~_~' cat~,~, tgggtcaagtitagatcattca~aggagataaccatgaagaguctacacttcgtccattaacaaaatcatca~
attatt ccttcctaattacta~tcataccctaaatattcacttacattaatttclggtattctctutg~tagt~etcgctagatt tnttgg~tcaaaattaaacttcatctactttatttcagtaagatcatgtttcattgtgagaatgcca~ttcaataacca ttaacacaatctcttgaccattttttttttggtttaattgatacttaaacaaaactc~aattggttcaaagctcaaatt ~=
elacatga«ataagccaaactcaaaactgaaatacugcclaaaacacacagattagcaaaaatcatcatc~=~taatct ccccaataaccaaatcaaaactagttttttcaacaaaaaagaaaattgtaatcgctatggctccgtagatcgtaaaaca a gatccacaggttcctatcagtaagcgaaacacgtacatttgtttatgtctaatatagcgtcgagttaggatcattgccc c aaaaaaaagagctaccttcgacgtggcatcagacaatttctacaatttagaagaaaacaaaaagaaaaaaccctcagcg g acccttgtcaaatcaaccacacagcagacacagaacgaattcagtcacgctagctgttgtgtaaagtgtcatacggcac t tcctatcttttgggattgcttc«catcgggtttataacaatcaaaactcgtaaccaaacattaaacgatttcaagtaaa cagagaaccaaaagaagaaaaaactaagatagaagaaaacgttgttgaagatatgagaaagagggagaagagaatctgt a tttatatagagggtatttgtggcggcataacttaaagccctagcgtgtacgaggcggcatataacttgaaaccctaacg t gtacgaggcggcataacttcaaacccaagcgtttgatcgag«accatatagcccattaagagcctatatgtat«tggg ctgaataaaatatta«gggccagtagattgtggtagtagaaagagaagagctctaactacaaagactactagtct«gt tccltctacgatgattggeacagaaaccaaaccattggttcaa caaatgtggtcccacgtgtooatctcatg~ctcta~g gaaaccac«ggatgacgctgcaaggagttgatcitlatgctgcggttccgtt«gtgttgtgcatataatc«agagaa aggtactgtcttctcatgtactcttgttttataggtttttcctclcttttttaagaaaagigtgtataattgtgttaca a atg«~cagatggagaegaaattgagtttgaatcaaacaaatcagagtttagtgaa~ga~aaggctctg«catatcaga actt~osaatcttQCtgatagactcagcatcttgcagtctcctctacatctcattgtattgtcccct~tttttgctt~
ca GCv.
catgtttttgaaa~~t~cttaaaacattgtttcaaattcgcactgcactttactggcacacttataaggtgatattg«
la tcacacactgtcaagaagttaaaaccattagatgttgacaggactgaacattagatagagactctgtac«ctcacactc atcaaactaagcaeatcatataaaga~acaacgat~_acaaaagtgaatcgaaagagataattgcaataaa~aataagt tc oatttgttg«tgaaagaaaagagagatgaaagaaacggaaacatagtcgatcaattatttatcagaggctgtacatggc cccaaaacalaaaccaccaaagtactagatgaaacgatacatgaacttggttcagtaaccalaagagagagagacaagg t
SEQ ID NO: 181 Arabidopsis thaliana UBQ11 downstream regulatory sequences
cataacacaatgct«gtccaaagaaagacagagaa~ggcttttttttgttt~ttt~ttaaagt~tgaggcatg,'citg g tgggtcaa~«tai=atcattcagaggagataaccatgaagagttctacac«c~tccattaacaaaatcatca~atta«
ccucctaattacta~tcatacc:ctaaatattcacitaca«aat«etggtauctcttttggta~ts~tc~ctagatt tt«t«g~tcaaaattaaacttcatctactttatttcagtaagatcat~tttcattgl~a~aat~ccasttcaataacca ttaacacaatctctt~accattttttttttggtttaattgatacttaaacaaaactc~aatt~_gucaaagctcaaatt ~, ~tacatgattataaeccaaactcaaaactgaaatactigcctaaaacacacagaltagcaaaaatcatcatc~'gtaat ct ccccaataaccaaatcaaaactagttttttcaacaaaaaagaaaattgtaatcgctatg;ctccgtagatcgtaaaaca a gatccacaggttcctatcagtaagcgaaacacgtacatttgtttatgtctaatatagcgtcgagttaggatcatteccc c aaaaaaaa~a~,ctaccttcgacgtggcatcagacaatttctacaattta~aagaaaacaaaaagaaaaaac~ctcagc g~~
acccttgtcaaatcaaccacacagcagacacagaacgaattcagtcacgctagctg«gtgtaaagtgtcatacggcact tcctatcttttggga«gcttcttcatcgggtttataacaatcaaaactcgtaaccaaacattaaacgatttcaagtaaa cagagaaccaaaagaagaaaaaactaagatagaagaaaacgttgttgaa~_atatgagaaaga~=ggagaa~'agaatc t''ta tttatataga~g~ta«t~t~gcggcataacttaaagccctagc~tgtacga«gcg~catataacttgaaaccctaacgt YIaC~~ig~C~~~CataacttCaaaCCCaa~CgtttgatCgagttaccatata~CCCattaagagcctatat~tatttt gg~
ctgaataaaatattattgggccagtagattgtggtagtagaaagagaa''agctctaactacaaagactactagtcttt gt tccttctac~'at~attgggacagaaaccaaaccattggttcaacaaatgt~~tcccacgtgt~gatctcatggctcta ~g gaaaccactt~s~at~ac~ctgcaag~~agttgatctttatgct~'c~g«cc~titt~t~ttgtocatataatctta~a gaa a~~tactgtc«ctcat~tactcugttttatag~tttttcctctctttutaaoaaaa~_t~t~talaatt~t_ttacaa atg«
gca~at~';=a~_a~~aaattga~tttgaatcaaacaaatca~agttta~tgaae~a~aa~yctct~_ttcatatca~a aCll~'~~~alC:lt~_Cioata~_aCICa~CalCllgCa~ICICCICIaCaIClC:allolalt~lCCCCt~lttll?
Ctl~Ca catgttt«~aaa~=t~cttaaaaeattgt«caaattcgcact~cactttactg~cacacttataa~gtgatattgtlta IC:ICaCaCIoIC aa~;aa~llaa:laCl:allag:llgllgaCag YaC
l~aaCall:l~alagilg:lClClglaClll:lCilCaC lC
:llCaaalaaa~=l;aCalCaIalaaaga'_:1C8:1CgillgaCa:laagt~'aatc~=aaagagataattgcaata~
_a~'aal:l:1';llC
aralltgll~'lit,';laa~'aaaa~;ai=agalgaaagaaacggaaacataglC!_atc:aallalllalca!?:1~
'~'Cl'=lacal~'~'C
CCC:IaaaCalaaaC(::lCCa;l;l~taCl:lgal''aaaCgatacatgaactt _'_IIC:I''laaCCal:l:lg:li':1'':1'_aL'a(:a:l''qt SEQ ID NO: I 82 Arabidopsi.s rlraliana S16 upstream re~_ulatory sequences cgttttttttttgY~=aaatacgaaatcgaaaggctaaaaacaactcaattcaat~gccaggattaaagecgagtaccc aa atttttaatacca~gtaaaatgatccgcgtgttttctaagtgaaaclgcgaaagcgaacgactaaaaacaa«caattga atggccaagattagaagllatgacccattattttaaaatctgggtcaaaagoatccgc~tgttttgtaagtgaaacccc g aaatcgaacagctaaaaacaattcaattaaatgaccaagattagaag«atgacccattattttaaaatttgggtcaaaa ~gatccgcgcgtttctaaaaaggtttttttcatgacataaaalttttaaccttatggcaaatttattattttagcgaaa t tcatgcatact«attaaacgtaattatcaaataatagagtagattataattgaatagtataaacalattttcattttat ttttgttttlgttlaactaagctcacagcgacacaattccgaalatlagcaattg gagaaagaataaaagacacagagca ccatattattcttttaYCCCaaaaatttccacaastacaaatccYYcaaaaacacatacataeaccaattttt~~itta c tgacgaaattttcagatca~aaaaccggatttgattcaagtg~~tccctaaaccgtcggttttaaacaatattttcaac g atYaYctttttcaaYtcaaactaYcoaaacaYCtatcaoacclasacaaaaca«aeaaacacaatcat:.ratYattal ct tgatcgtcttta~'atg~actattaggcatagatclaagcattcttggt~aatctaagctttttctcaagatctgtclc at CgCagagatC:la:tl~lCIlgICa2atICQalg lYo~_Ctl .1 ~:IlllgaaCflaItaCaClIgIlCatglgICICgCCa g ,._,. g~ ga tagCItCIIC f ICl gIl ~LlaaaBaICIIIfIgIICICaIgaClIf gleClICaCagCIlCa~aaCaaaalCCaEa~=alC
catttccctatgtaac~ctctcttatcatctctgtgtacgattctgtacactcglcg gttaatccgcaacagtcacacgt caccgaatcta~ctcatc ~_ctggatalttgtgtcaccggaaccgtt~a~ataatta«acatccggtgtt~aaattaaca ttgttggalcactaa~=lacactatcagacatggtltcttgatttaaaaccctgltgagaaaat«cgatcaacaatctg a ttcllac«caac:uaaoaaactalaaaeaaaccaagaggagaagttttatcagta'_cagaagatgagaagataacgca a taagcltgtgatatatcttclttgttttatctctg«tctgaaacagagaaatggtttttaaaaag~gattctlaaaaga aallggaatacatt«t«It««aacg«cgltttgccactctclgtc«ctaaattactttcalctcc:gagaaaaat ttcacaaaacaaaacca«~caaaaacagagtattaaaaatgctttttatcntggcttttcttgaaaacaaatagaaaa agttaccptttctatttacttcacggaaaaaacattagaagtlcalaaaaaticactcgaatctclcgttacttragaa g agagaaaaaaaaaa~tattgtt«ggaaatatatgaggaaga~aaaat~a~agaagaagtagtgcag~~c~aagYaYaag c aatgtttggtgttttataagagagacgataglagtaactagtgacglggaatattgat«taattttttagtittat«t atttattgaacaltttalcaaaac~cc~cgtattagaga~'galgc ccta~'ntaat~tctt~~tataaaagcetatattt tecatt«tcatcmactctctYce«a~Ytttcattita~c~~caYac~aYaYaYcaaaa'_caa~Ya"aaaccctaacc _ _ . ._ _ ~ ~_ Arabidnp.si.s rhaliarra S 16 downstream re~ulatnry sequences aacaaagctggg«««ttltttcaatttcgattcatcteaaggt««~gagtttttgaltatg«aett~=t~_agactc ttagaatttctgttttat«atttatt~_~ataactgtgattctccaa~aacttal~tcttalatgautt~tcttcattt c gIlgCIlCltettlelI~CCl~l~al~lall~CaIClgCala~_llg~all~itaa~IlayvllCllog~lllataata a~o CIYIYaCIIYaIYIIaCIIa~CCaI~YCt~atcaaatate_'IC
I;IaCa;la!'l_YYLCt_'LI:ICIc:IYIIaIall'~~IICi'c ~laaalltl~~IlaCB~ItI~ItICC~3~lI~'=IaICIC~taagIC~IaaC~'lllYlgl:lCltallCaCI''aaaa Caa~t tplatlYIIYIIaYICIIIIYIIaCCICICCIB'_laCYallllllllYlfIYIYaaCaa~alCaYaaaCIYYYaCIYI
<_'a attt_taccYataamaaaYaaccaaataacttttYatccaaaaatcYa_~taattaataaccYaaaaYatctttctact aa ~ttttttcctacltc~acetYatttneactle«~caaaatYttatYctrttt~'t~c~ttccaccactmc~taYl'_aa aectc«_Ytalcc~taaaYcaaaaa~'tcaacccltltYa!'tt«t_ovate_'caat«ctYcacatttttratt~ra_ ecc IIYCCdIYYaY:IC:tCCal<_IYIICCCaa~CllCaYC'_~_~Caa_'aC'_tCC
~_l:lY:l:lCClllC'ICa'?ICaCCIICI~_aa'_t'.a auccYYaacaa~r~rYaatacYcacYttcatesataat<_atatc_'aea'_a_'_'cc;laat~'atca_'terma_' mcatacaa~_ caalccYaYaam«m~~ttt~'cveltYta!rtettYtC~IaaYacatat~_calctlcaaaY«l<_t«tttY~~ali'a ~ttY~ta taltctttataccca~=tatcat''attcatataggtaaagaaacaoag«acac:«aaatgcgtatacacaaa~~atat ggl ~tttgt~ctgattugctctuacttgttg«ttcaaacttat~'aaaaga~atalalcara~=a~=I~=tat~aaccalaa cc ~lal::ll~'ICCCaa:lllC:l:laCaa~lCClClalCgaglgt~Itl~:ICS'CCC:IaC'l:tlll~=:Illa:lt ~'aglli'aflaC~C~:l I~=glglcllgflagcaatagcaaatat~=atcggcragccccaaaacc:7lC1'_:hall'=a;l:Il:l~la:IC~;
~=a''CI~IICaC
_coal'r'rCalC~,aatt~_IIICLICa''C''~aIIICCICICCI1CCICCIC''''II1'L:I~:IIC':1':aC:
l~aa''Cl~'Ca:la:l~=:I'~C
(LgCCCICCa:IICCIC4'aaatttglCCa.lglCC~l~aIaICIIIaICa:lg _raai (ItCI~=allC;
g:IIC:ICCIIg~IIaCIC
actccgctctctgtttt~=ggcgatgltggtagactcgaatcc;cga~=cttcaa~'agaatatc''tcacaacattgt tcaacat _Icgactttcgaggagaataaaacactacttgctgaaaaccalcaagtgatrccl~_tgc«gcaaagittatgaaec:a ga agtcagtt~t~acaagaagaaac~ctacatt~acttgagtgtcactttla~aeatlgattclaacaagatcatcatagg g aactcagta_~cactcaaggctctyatagatcttatc~~gea~tl~"ate'acttawa~'clactcatgaa_ctclttg cgc t_ttatatacctctgttgtgatgaaatggagaactegaaaaa~~caatctc;;«gggg«gactcctt«gcagtcagaa atatcaaagcaagaagaaattcgtttgactc~ttagttatcttg~=cattgatltctitacacgaacgc~ttatccagg aa ctggctaatgggttatctatgatct~ttaa~tatctt~altcttgaggaacaaaa~tt~catg~tgacttgtgaeaacg t SEQ ID N0:184 Arabidnpsis thaliana contains 3 contigs, I-4960, 4961-13005, 13006-403-19 tacgctcactatacg~cgaattcttcatctttaattat~cctaat~ctaoatcta~a~taattaacta~atcatgaatc c agacaatagatatgcccgccataggtatctgtaglcgoatctatlattt~at~'a~_ICt~=gl«ata~gt~tgtaata a_a atcggcctatgtaclaagatgaacaatteaacttgttaagaat~c~tat«a~gtt~ata~_«tatcg~c«ctcca~tt atta«agctatagatatagggag«tcgcggcacgcccaccggtta«ccataa«ag:l~tu~aotttaagaatggtt c=ataacaacctagctttcatlagatctactaaccaatagt«cctgaaaataccat~_Itcctagctttttattaaatc k tttacaacccaatcaatct~_ctgttttcttattttgttitctaactattttacteatacr«aata~~cttaatctc~a ca tttaactcactgtttgaacaaaagaactccgcgaattcgatcctaaaatottgcaatt~=atctcttatttgagagagt ag ttctaggtaatttgaaccatatcaaatttggcgccgtlgcc gaagltcttgggaatcaUattgagtttagtttagagat lllglClagIIIIIaCItClalttlgllaClC:ICgCgItlCttgt111C'llClttICC~'l~=Itlc a~glYtalgCCla~~' caaacaaggagcacaaaavagaaggagcttatcaacttglclg«ctgc~amac ~zgaLaacugaacgc:ltcaaccaaa:l aactaaaaaattagcaatcgctgacagagtcatcagagct~at~magaa~~_a~_l«t~=a~_a~at ~a~gat~ggcaagcct acaatga~~cto~ttaga~attggatgatcatg~~actcatcctaccc,=a~=~=ac~ccaat~=aggatca~gc~tgc ctcaat gcgcaattagctactagagacgcagcagggctcggtgtcgagcgataccaact~~=gt~=tcgalcgacaccatcggga gca scacgcagatgcaatc~gtgtcgaacgacacaataga~~t~tc~atc«aca<'c:aacaccacattcctgcagctattc ctc cagctgcggaaccacgtcgtactcttg~saatttcaacaagccag:lcctgltctatgctaatcgatcagcgattgtac cc tcaccattltagaggaacgactatgagctgaagcatggctac«tgctttagltg~'ccagcatcctttccate~~c«at c 
tcatgagcagccaatggatcatatcgagcgatttgaggalctlgttliga~mueaa~_~_c taalggag«tcagagtttg aaagctctagggaagcutcaa~ataQcc«c«gaacaatttctac~_at~_at~mtaa~«t~'a"oa~lt~=agaaacaa y tgtctacc«cacacaa~_gtccagtlgaagctttcaaggctgc«~==gttcg«lcaa~=~_a~=taccaac~_agattg tcca catcateatttcagt~_ag~tacaacttctag~taccttcttca~a~=~a~lt~=att~~_a~ataccagatggctcta ~atgc agcga~caatggtaacttcaatacacettacccagct~_algctaca;=ccttsataga~aacettgcttgcagcaaca gta ccaagaatgctgattttgagaggaagaaeattgctggagctg«tca~gaactct~atagca~aggtgaalgctaagmg gattcagttcataacttgctcacagggaagaagcatgtccattttgca~cl~aagaa~a~
actattgagcctgagcctga otcagtggaaggagtctt«acatagatggtcagggatacagga:tat«g~~ca~'ccmaaggcaacltcagtggcaaca gatttgcage~aaccag~ettcctcalactacactccaaagcct~'ctttcca~aa~am«tcllcaaagcagctttcag aggacclatggaaattcaacttaccaagcccccccccccctcagca~'agacaa~'aat~'e;l;ttcaatgc«~'agt agattc ttgagagccaaacaaagc«cttgtggagtllaatYgCaa~lll~;llL'Cl~'lC
l;le:l~:l~alCllaal''g~aagBICgac aalClgagClc:lCac:ll~'aagilil~Cla~atgta(:a~gla~ClCaaaClYC:aCa:llllafl:aa~_agacaa gaagoftIICI
tCCl~olaalCC'teal'L'Cl:taCCaaagaaagagCtgtaal_c:c:«Ctt~_afC;l~~a_';1;1_x=a~_aiea l~=I~Iggga=oaal tggacactgaagacgagugga~~ctlgtagttgcagagat=glatC:gacca;leaccc:letatgtcgatcgacacca tat gggacgcttttctcagagacgacccaclacgacg gtgtcgatteatarcc ema~lalc ~_atc~acactggatctggac I~CaCCIIoIYavYtvCa2tvta1C1:1[a~8:lllc:llC:llClIlaaClal ~(:llaaf ~'W;tYaWa~a~taattaacta~
:llcal~aafcca~:IC:lala~:Ilel~Clc:~cC:lla~~lalcl~tagllL';=ai11:111;1t1(~'al~=a~=
lcl~_glllalag~l ~l~l:lal88~ilalt ~'uCCIBIIIaCaaa~al~'aaCaall~aaCICltfaa~'aal~y_l;lltla~'~=lI~=alal'IIICIC~=
CIICICC~_Il:tlfall:tgc:lalai'alala~'~_~a~IIII~=C~~CaC~eC C':1C~;1;=tt;ltlc tataatla,3aglll~a~Il taagaatggttcgataacaacctagctttcaltagatctattaaccaatagtttcclga'_aatacctt,~ttccta~c gtt ttattaaatcgtttacaacccaatcaatclgci~ttctcttatttt~_ttttcttacta«teactcatacc«ycta~ct ~
taalCIlgaCat«aaCICallg«lgaalaaaaoaaOlCCglgaattcgatcctaaaatael~taatt'atctcttatt tgagagagtagttctag~~taatttgaaccatatcactcctaatW
alclctaaclr«at:unla~'uaat~=cattggat tet~acacatt«"accat~aaaacacaaacaaaactg«tactgattctaa~'eaattttlt~'lt,'~_ttla~~'ttc tttt ~agaaaatgggtataagtgttttctaaacaclcctaatccaactctaactcttatcalt;l_'tc:aaat~~catt~~a tt~
tgacacatt«gaccatagaaacagta~caaagctatuactgcW
ctaagcaaattttt"tt~'~'t«ta<ecctcttttg ~~lacaaaatgggtataattatt'=gctaaacactcttaatccatclctaactcttalaa«a~_tc aaat~_t:attagatttl sacctattttaaccataaaatcactaacaaaacagtttcctgcttctaagcaat~_tttt''ltg ~Ilnalcctct«
tgl sagaaaatgggtataagcg«g~ctaaatcactcttaatccatctctaactcnataatta~
tcaaatgcattagattgt eacacattttgaccatagaaacactaaaaaatattaactgcttctaagctaatlltlgttggtittaacctctttt~;g ga gaaaat~=ggtataagtg«atctaaacactcctaatccatctctaactcnataalta~=ICaaaa~ctitg_;Ittgtg ac acattttgaccatagaaacactaacaaagctatttagtgcttctaagcaatt««g«gg«tlagcctc«ttgggac aaaataggtataagtattgtctaaacactcctaatccatctttaactcttacaattaueaaatyatlgga«gtgaca cattttgaccataaaaacaclaacaaaactgtatactgcttctaagttatttt«gttg gtt«agcctc«ttgggaga aaataggtataagtgatgtctaaacactcctaatccatcictaactcttataatta~tcaaa~'gct«~~attgtgaca c attttgagcataaaaacactaacaaagctatttactacttctaagttattttttctt~gttttaecctttttt~=~_ga gaa attgggtataagtgttgtccaaacattcctaatctatctttatctcttataattagtcaaatocatt~~attgtgacac a ttttgaccatagaaacactaacaaaact~gttagtgcttttaagcaattttttottg~ttttactct~ttttaggagaa a atg~gtataagtgtttctaaacactcctaatccatctctaactcttataattagttaaa~oct«g~attgl~acacatt ttgaccatagaaacactaacaaagctatttactgcttctaagttattttttgtt~~lltt;,=c~tctlll~~a~;lga aaat ~ggtataagt~_tt~~tccaaacactcctaatcaatctctaaclcttataatlaatcaaat''eallggattgtgacac atll tgaccataaaaacactaacaaagctgtctactgtttcaaaacaattt«gtiggttttagcctct«tca<_a~aaaaaa~

glataagtgttglctaaacacmctaatccatctctaactcuataattagtcaaat~_ca«~~~attu~_acacatt«g accataaaaacactaacaaaactgtttacl~_cttctaa~caattitttttt~at«ta~=cctc««~e~la~_aaaatg ~g tataagtgtt~tctaa~~cactcctaatccatctctaactcttataattagtcgaalLC;alI~~vttCty;ICaCatt ttoa ccataaaacactaaaaaaactgtttactgctcctaagcaatttttgtggtttagclt«ttga=maccaaacagctggc ctacagccaccaagaaggattactaccacgcccccggaattgacccctcclgycaaacc c:eaaataaccaattctaggtt ttttctcaacccgggaagatattcttgaaaaatttcccctctgcggcacagaaa~~aaaaalcactlctccccccctca tt _gtgcccgcac~ccatcgcattgttgcagccttcgtgtg~gtgcaaacaattc««taccy~ttclccctaggagga~' accacaaacgggctatatggtgctttggttattcgacggtcatctttag~cttccatc«tot~=c~'ccacaaaalcga ca at~atcttctt«tc~tcttctttcca~aaaa~catacaattgttcgagcaaacatcaatc"tatgalalgaaagtccta aattgcccatcaacttttcggtttggtaatgtgaactggtagcctggttlccttct~=gtaaaaaatctgtaaacattt cg caaacagaatccaccaact«tcactcatattataatccgtcttgttctgcatea~_tce'agal'_claaegataattg cea ctggccttcacgacaaccatcgtacaatggatlatttgcaccttctaacat~tc~ta~=aa«tl«I~_aat~=~«acgg a ccggttcttccatactctgataactactatcct~atgataattatcctgat~ataattatcamaaccccac~_ttaaaa t aaaatgcgtcgllcaecatatccacatatg_atttlctacaacattacccac«cttcageatta,'aactaaaatacgt c atalccgtgtaactagttcctatatccatattcaattc«ctccat~=cttataccaaacataat;latc;a~gcalaaa tcc tlgactaaacaaatgactactaaclcttctacccgagatgatgtgtttctcattc«gcalac amacaag'_acaatgaa atttacctc~actactct~lactata~~ct~actattascgaat~'tcatgaactcmeeaetcc ~'y_acatattcg~ct sacaaatttcccgttacttcatcaaatctcttgtacatccaatatcgatagaaacy~ccac cacy'ccataaltgaaott lcccsccattataocataaactaaacttatattttttctttcaaaatcggttttttttccutt~'atl~_t~~'««~at t tegtggtgaatgagaatgaagact~cc~lg~attatatagaggatttcttat~=actattalac~amaaacatt~taga c oClCagaaallaggt gIaCC IaCIC;IgCaaCgCgaaaaaC gtl«CCaaCCatattataC _'aC aattat ~C~actatgtl acgacaaaggcaaaatagtcgtattatcgtcgccgaatgtacgactattntctgtagtcg<eatat~ta;=tcgtcaaa cta cgactactuacgactacatattgttctcagtaatgttgtcgtccgata~tc~=taatt~~ac~>actat~caccgacta aa lallglagtCgc;l3;t;lalaC~:ICLalglaaCg1ClaalIgtaClgll~tC~laaa<_IC~~lC2lCCa~_CaaIC
~la:lgll;l Y.
ICgCaaaaa;l~gaCgaCtilC;lltgllaglCgCCItICgla'=lC'~tC~IItICIi'tItIClf,_la'=li'CiC
aaa~C;l''C
algglcl~lalagaCagIgaCCllggalCCaaccaaata~c«ct'~aacttctcaaat~C
_'a;UaCa;ll.':1~C18oaa'_ct ccitllca~tt~ttgcatatcncctt~~aettttatccaac~=tc;_~ct~=~cotaatatlt~acyg;lace ml~acta tittctggcclagaacagrtcclacggcataatct~ac~tatcacacataatctt~!aac~~ata~lctc;tatla~'~
a~ct cgaacaacag~:t~cagataccaaagctttcttgacggtgtgaaa~yattt«a~=gcat~_tecae:ltc~';rutl~a acta;~g tctccttgacaatagtctagttaac~~cct~'gctatctt~_~=a~aa~tccttaataaatcttct~ta,'aatc:ca~
'catgac C;Iag~aa;lCIlIliTal~IClllgacaa«t«~'~=tgi'Cl~~aalac;llCafCaCttC ~altlllC
l'CIIYICa:ICCICII
IaCCCIICICt~'alatCll~llaCCCaaCattal'L'ccatccuCaCaI~aa;ll~'aCal:lllll;l:C;l:lll:
l:li'Fl:lCaa~a1 lad'(CICIIC~CaCCI~_~;ll:aalaC('I;IC'ecaa~aflcaactaacal~a~~a1';Ia:l~a1' _'1";l C:ll;l~':lCCa:laaaalC' tc;catgaagac«cc:ICCaICICCtc~'attaaatctaaaaatataea~gtcatacacctrtgaaal~«
yca~'~';=acatt '_cataa;lccaaata:laa«ctctaataa~_caauaetmcataa~~acatotgaaa~'t~~IUtc tc«";Itcatta~=~~_t ~,aa«~'~'taI«
~~aaa~'aaaccaclalatcc;alcaaaaaagcaata~tat~'a~t~a«a~'etaa~'c~_«cta~cattt~a tcaatgaat~~~taaa~'aaaaatgatctttccta~atgca~calttagctitctataatcaatalacattctat~
cccaat tata~=«cta''ta;~a«amaattcatct«ttcatttttaacaacaytcattccgccclttttae~'~~;m«ctat~m:
lct~_ ga~aa~cctaa~_tactatta~a~ata~_~~gtaaatgacaccagcatcaa~ca;_tttcaa«ta«mlc a etacttat«
caa~tta~~~~=tttaacc;a~~'t~ata~~ylaaat~acacaagcatcaagt~~ttcaatama"aala_'~a«c~tt«
caa g~=tggatcctateau~cataaacta~_t~aaattcccttaatgtcaaataat~'aataac:caatl~'ctctcc:lat acttt etaaottca~ataatagcagzttcacttcatcatcatttaactcaacattaat~atcac~~_~ataaela~a,~ltt~g acc aaggaatot~ttpoagct~gatttccctcaag~~aatccacctgcgaaaatacca~~cca«cccactaa~cccaccaa~
ta gag~acclgc~caacl~_atgtgcccacgtacgcaaatgg~lacttgagcct~ataaycr~~tcacct~=atoaag~~' a~a taagcga~taacaacctaggaactaacgggga~atattc~cggagcggacctoatatctc~gaactgacct~ccaaata t cca~gagaatctcc~cagatcctggacg~agctaccacgtgt~tagacctatataa~~aa~caag_'ac~'ac«g~aag ac acattcaatgca~ctttacattcacacttacacttttacacttgtaatctcattccacattgtaaaa~ W
ttactatcat caatacaaaa~tctcttcettctta~caatataactctcaaagttcatetgaa~atctagccctcc«tctcataatclc agttacaaatctctact~tg~atttcagaacccacatttggcactgtctgtggegacggaaatagalca«acaacacca aaactaacacatcaattgt~accaacgatcat~gatcccaatttatcc~ attcaacaa~atcl ~gc ~tacca~ttcaagt cttaaacgataac~tctcg~acg~~cacctcccaac~~atctctctttc~gaatcca~~tt~at~ctamcat~acatig atgcegaeaacatctcccaaaactc~aattcttct=aatctgaccaaatcctccccccctt~caaacaa~~=pacg~ge aa caaaacgtcacgatacagcctcctaacggagataaccgggctaacccetctaatca~ctccctcc~~a~acct~ccaac ;a act~~c~~'a"ctcy=cgcaateatcctgaacctaatcaacataagcaaaaaccaagaaa«~'c~catc~caacctcgc ao aaat~~taag~aaattcctcccc~c~aaact~ggattcctgccca~atccccatagatct~aca~'atacc~tctcccc oa ca~atcccactccaa~;=actc~tcta~acttcglctc~atapa~~tctcacctcctctcaacc~~eag~'tc~'ccu«
Ia caggcacccgtgcacactctc~=tc~aaaacccaacaacagg~'aat~a~acac~_taeate~=cctaccncccaa~z~
ctacc gctcccacaga~aacctcotac~tgac~ccctcgaat~~ttt~ctaetctc~aa~=~saactctatt~'atacaetact ce~
tactcatcgtaaacaggacgtctgaagcc~acctct~gggtctcacccaaattcaeaac~aatccct~_c~'atcltac atc caaaaattcaaa_caatcaaatcaaagatt~cgaacctaaacga~gaggtagccataacc~'cccltc~_c;lac~~'C
II~I~
~tIIICCICIa~~'f IC! ~I ~aL~'aaClC:lc:Cg«ag~cacccgg(liClllagpal~'CCl l tcacilaag_eICIlCaclll eccaaapcc~aag~~~_agcacaccgttttggcccaaagattcaaa~aatctaaa~CaCa88aCOCtICCCII~caact aa aatacccttcaaaaa'=~aaaaccaaacacaag~~cagcataccctattc~caata~aa~a~~ca~'tc~a~gac~t~a gtc cagaactt~acctc~aaaaatatt~caa~taccacaaaaagagag=atatttcacctaa~aatatc~_recc~=«acaa aa ttaalceca~~cegg~=lcaaaactaaaaaa~gttcaaaccccaaagctaaaacccctcctccc~at~a~ca~gaggag sa ~caaactcccaaamaaaeaaacgggaccgaacactgeaapoaggtgactctc:ctceacct~'ctag~'aaa~aac~ca ta~
accta~ccttt~=cap'aarlc_'aacl~gaaggca~gacaaaaa~gtca~tcactacacctccacccgcc;m gvaaaa~aaa aacac~a~'~'«mca«acs'cctccc~aaattct~=cagaaacgcaacc~a~'cacclctcmc~aaac~'calc~~:at lcat catgggag~atctca~ctcl~'cata~actcaatcaattcgatcaaaactcatca~a~oaa~=~'ccaa«calacacaa aa~_ ecailaa_ccc~at~at!_~=cacct~atcacca~atcacatttt~~~aaagtgaaaccacl~acclc~ataaacctca c~at ~aI~CICIC~IIBICC~;I;IIC~aC~I;IYCCaaCI;lC~aaCICICCC~Ialaitl~BlC~aCaCo°.de gree._aa~ClCa~'lt~aC~l aCICIICIaI~:ICIC:ICCICC:I:I'.:lC~ICa;IC~CCICCICCC,'CCiICCa~~~a;lYlC:l;l;~ttaCC~
~;ICCI~'=al~~TaaCC;l aC IClaalaaaaaala''C:IC C IC:C:IC ~tCailC Y~C:ICCtccaatccacaa~~';IICC ~,'allC
C' ~it:tt;IC:I;ICICBCI.'lC C' CCCdCC''' ~ L;la~'IC:aIC~'aaCaal(:''IIC.(: ~al:Il'WatlCaaC~CalCaaBC:
aala'<I'?ll:l=;l''L:IC: ~a;l'L'CCI:IC;I;I
fICI~~cattaa~clagclcaa~aaal~=gvoalcC~_cgaaatacacgcccatagcgaemca;ICtC;~lC:ilCCa~
CCa~'l IC~aalC~lll~aaClaaCCil~~aICCCCa~a~~agaaaacacctc~~Ca~aC~CCfI~_~Ca~CICII~;CIICIaa aaCC
ICIIItao~a~'ailaal~~~fal;la~l~Il~tctaaatactcctaattcatctctaaclCttataa«a~W
a;la~'=CIII
~~'attot~aCaCallIlaaCCala~aaaeaataacaaagctatttact~IIICIaa~CaaIIfIII~IIC~=Illla_ 'CCI
CIItIc°c°'.-' - -a.°-aaaaL°-°°tataa~l~Il~tclaaacactcctaatccatctttaactcttacaatta_=ICaaal~_Call ~=
~iltlu9~aCaIalIlI~atcata~aaacactaacatatatattQaCl~CltCIal~'Caattlttl~ltl'=~tltla ~CCCC
l[llovvilvaaaall'_otataa~l,_It~=tctaataactcctaatccatctttaactcttataatta~tcaaat~C
afli' all of ~aCaC;lll Il~al'C alaaaaC llaCaaaac:llaC aICICaI~aaIClCitIl;i~'aClC:l~
~:IaI;IIC;Io ~=atCC ~ ~T
aaa~aC~~ll~alalC~'_°C'CIIC~~:I~CIaaCaC~laC~llClll~~;lCCalC;laaata;lCL'~t C'aatcttC'CaccaW
caa~aacayac aalaaclmaC catcatatc~~ac~ataattatgatcaaagtcaa~a~c«cc«clta~tagcta«a cattccaaaacac~at«ctaaaca~ataatcaaaaata~tatta~aaca~utaattta~~taattcaaatlaaactct ca ila~ail'~:laactt:lcW
111~1c:1cll8llll~aaaQlla~Cl.'lCaaalCa;lC~al;~l~~lltl_'Cll:lc'~lallal~
Ill;ll(: a9llC'lCC aaCaIC':Ila ~;IIC YCIC CC
aCC'aa~lC;lC(:aaaaall;lC;l;i;laalla;lllaaaala;lalC:l~a~IC
aaalatataCa:Il l ~;tl:lll ~;Ial;l:ll ~;lC;laa;lC l:lal ~aaCaClaa~'al ~C al Y:ll;Ila:l:l:llil;Il~:l:lC':lC;laCaaill lCalaBaaalClC:ICC~~;ICl:l:l;t:IC:;l~;llllillC~:IIIIC;IICaIC~L'illilalC~'aa;l;l' ?;1:1.''':Il~',~l;l'';I:l,!ala~=l<_a esaatagtg~=gtacca~amg«aalct~=ac~aca~~tgtcaacagaaaa«caa«ttgaa__~_tat~a««
~;ttggcct ~acaggaccc~=agttctcalc:l:ICCaaaccttgttatc:~ltgcccattctcttttattaataaaW
ct«tta~;t,~at ctgacatcitgtatccaacagta,'c«
caatagtcalaccctacggatcaaacatactaatcaaaactctaaaacatgca aaCCtaaaattalc;a«acaaaacaytcaatactlacat~=ttcgtcgac~'agaagCaaaCCCal:laCClII!_aag algga aCCa''Cv~aLiClaa~aal!'ICCtCaCaaCiClC:ICaC°aataCaCCiIICCIi'la:ll:l~_aC~'~
'lllC:laC:l''l'=la:lga~;~_ al~aaaCgallllalCIIIIC~CI~:la:llttClla~'tlalillllgfatlCgfllIali'a~ali'c:laatgata ccgalal atatatataaga~=a"agaaatac:llCglaY:I~'~Cal~'llagal~Ca:lcatttaaaaaa~'ta:uaata~_tl:
:llg:ll~_ll(C
aagtcatttcgtaatcll~_aala«clglaalatallltoaataaclcaattattaataaaattaaaaatttggcgagt c llacallllaga~Taaalaatalaaafatlftlleg«a«altaaltagagltgtlalalaaafacCaaliglc:ltaaC
l =Cgtg gataaaltagtaacaalcacat galcctataatgtttgagacatlt:1.ll:l.ICaaaaIIICCalaa:lalllaalaa ttatggagagaaataagtaataaalattitlcataltttttataatggaagtga«acaaa«aggaaacaatatatatg caaatatatigctaacaaaa~aatctataatltaa«ctattatggcacaacatacagaatggata«ctaaltlagttt tctactctatta«ggggaaattggaacacaaattaagggatgcttgattaagagataagagttgaaataatttattttt icaacttagaag~=ttcctctatgaag~t~attttacgacatatttctttggacaaatcitttaataaactrctagggg tt gggttttggtgttcttttgtttaataaggtta~atatctaactctttatitgcgtgtccgatgggatacaatatttgcc g ccccatacctctttg«taacccc~c~c~attucaccctaattgttagactcacaaaaggcagcagttgactcagaggt ltacaagggcttgatggggtclgacaeaaatgttatgaaagctaatgacgtaagtlcaacacatgctcgaccaactacl t ccaccaacaactcgaagaaecctccaactt''tcccgaaagcicatgctcgaccaaacaactcaaggcaagcactcaga tg tcagatgattggttagaactcaaggagagatctaaatggcaagacaaagccatacaagagttaacacatattgtgaaag a gctaaaggatcaaatcaaggagctcaatggaaaagccaaccaagtaccactcaacatcaaggacgcccctgacgatgga g acctgtcaagaaatcclaatga~lalgat~alcactctactaaagaaggagaggagacctcatttcctctetacaaccc a ccgtaagcctaatga~tat~a~'aagtcaa~cttaaagacatacaacaagctcactt~_~ga~caaatcccat~_tcta tctt tgtacatatci«atttlcc«~'tt~=««tsalgcat«~~ttagt~=ttttcag gagataaglalaaaga_ct~_agtga agtg_attct~~gctctga~a=tacaacaclalactcoaccacaaagcaatgaaggatactc«accat~ttctagctta tc CaCgCCaggagalCa(:l _ -CgaCt~C~~IgC'lgaCCgcagcagaagaatggagalCga~laIaClCaIggCgoLgCtHHCCa ccatagaggcactgag '~IyegIaCCCacatc'Caggagctgagacagaaca~~~gCg~L'CICgtCl:ll~_gCClgggagC:lal cacat~'cagCCallgacgaccaac«
r~'IICgIICIlcgaglgaggtaagcgtctcacttcaccattL'tattatatcatc tcttgtgatltg«c«cat«t~«tclgtga«ggatttglcctgagtactctcttccaa~lttattcacacagtgga ctgigigalltaaglllgo~e~a~~=~clcaggaagtgtgtgttgcactgtgtatattttttagtctgcattcatttaa g~
catagaaaaaccaaaacaatttoaaaaatttea~aaaatgatttcataaaaaa~a~=tgttaatgtagtt~cattacat tt aggatcgagtcgagagtgttgcatttaggattottgcataagcataggueat~atgatgagatagccttgtaatcattt t gg«caccggataaactcaglgccmcgtt~ctagttgtctgtt~cctagtcaatgaatttgaaataaaact~aaccat~
cctagattgctctactctacca~actgmatgatttgataccactccctalcaat«gaacrt~zaatca~tatc«
laatt atcatgtct~_catcaaat«~aaclcal~~ataccctaaaatacttggattttcttactcatttt~atcactc«gttaa tccaagtaoctcactctrcctatta~=agca~«aacccgaacctaaacctaoac«tctmaaaccrtatatcactt~~t sagtgtttgteag~tcttamc~altaa~ctt~=yta~=aaagtgtta~gttcgtaarcaca~a~ata~t~tctcat~ta ~
ttclagltcscatl«tc~oacta<_ata~sacta~~to~~=lecttatacttigggttgggatgtgtttaaaaaaaaaa aa aagglggallC:lflgafaa~=aaaa~~la:laa~'aClClagYlaaagl:lBgCl~:la:l~Ca!'a:l:l:l:l~'l Cl:l~I:laaggllll gggalll glaa:l~:l:laa~l:l:l~:l~tl(:l l'' llagC Iaal gaagaa'=~ gC aaila_=CCCIC
~gllll:la:latllla:laaCa~=
aaaccttagttg«ac:Ig:l:l:mca:laccm~'agaaagcttctcct~gagitaa~a~aaaa~aaaagaal~attaga aa aagggc«aaatgattcateactgcaaagg_tagagttaagttcttaatttgggactagagataggatlacca«agagc ttcattggttatactct~g~ta~alg~ aatcitatctctgtatgcatagcttg g~acttaccttta~~cattctawaat~
cttaattatttttt~_agagattcc~l~ttact~aa~cctattctataagg~=accatctu~tctctt~~accittatc tta ggcaaatgag«ca«gat~atgratt~cttaa«cact«ccagaactaatgaatQ«aaagggattg Tlagatltgaa aacatgt~taggtc~a~calac~actc~_~att~
att~=ataacaa~_ectgactaac~ttttl~agta~aattt~=atcata tcgcagcltagaactacraac«~y_acattga«tcatctgatttatctagtgctttggctctaagtccccgctttcaaa cctcacctcca~ctt~ttc«tatt~ttt~c«~=ag~~caa~caaagactaagttt~~~g~a~ttgataagt~t~cat«
tacacatttlsa~'catcca«tgtcatc:am«a~=catcatatcatcactgttttataccacttcmatcatttotcam allll;calgtttaggata~l111~_CaI~Cal~lt~C~lalllgfoLLQItIICFI~'~'lgalll~=~a~Cl~llgg C~':IgCI
aattg~aa~aaact~~cca~=atcatatcaalacattl~accacacaglC~=a~ta~at~~cttcac~ucatcaacaaa cca cttgaccccctg~=tc~a~ta,'aa~=~~cltcatcacttcacaccaccactt~accccaa~~tc~~'=talcatcalc tccacc:
acct~atcataactc~atcacaccactc~=accccca~atc~aetacctccatcaccttcactccatcawcactc~att a cactiaccgaotcccaccatcatcaracactc~=aca~atcactctaccaccaagcc~agtactarcatctcc:atcac tcg:l ctlcatactroact=yca~=cttca~=a~_tctmclattcc~=caclcaarca~amctc~age~caag;._aa~aaaa«
aa~_a CICC:I~_CIIaIC:ICIC~aCCallCaCtC'~'~c:aaWY°~=lC~a~lalC~TIlCllaalCC'~lC::l Ca:ll:lll,'C'=lC!~llll 2aYl:lllawotllCi'olalalllL'Oi'lal:la°Ia~=c'al~lalIlCaCaCIIICi':l:l~:la:l a~llllaalW C~=CI~:IC:I

ctgtgttctt~=acattttt_taatccaga«tcuuaatrtamagtattca~tattc~c«ttcattitc~_tttact aa~_ttgttcatcctattatcatcugcttttattct~=«gclatcat=tt«rattctatc~=tttat~_ctttatgcaa tg aI~'tctaag«Igtgactag~'tltclga?gal~=g°ll:ly~_lagttta~'aattctea~=tat~_clil ~'~=l~'~'ll~'a~llll:ll t««ag:IlClCllClaCatta'llllgllcllaal~ccl:111~'c:Illll;_alcaaCl~'~=aafllgil~!CCt' :1'_:IC:IIIIW "c aCCCaaaail'_tgttCgillYaaal'_ICl'=aaCCaCl:lallCal:'=:1'=aIIC'_lnL'CICt'_IaOC;ICI
B:IIC;IIaCaC:ll'':IC
alCailLlCltallllaClCCaaaa~aClaaaCaa~CIllIlCtI~CIICICaaa'=IItI~aI~~'l~La~CC:l:la ~lC(:~l aI~=a~!lclllg~'clllgtalcllcalacaa''gaaacac;laCll:lg~'CIIIga:l~'alca~~a:llgtl~=l lcl:Igllclaitla ClCaalCafaCaCalgaCaICIa;lCalalII~aCICC;IIaaCallaaaCaaga(ICalIllaaIIC:.ICaalaC(l lgal ggt~=tagccgaattccatatgattct«ggettt~talctacaaacaaagaaacactactta~'gcttltlagatccgg ta sc~_'tt_='cta~ttcttalactcaatcatacacatecaatcta~tcatattt~a«ccaaaacactaaccaa'~cttc ttctt octtctcgaa~ctttgatggtgtagct~aacttc~tatgagictlloaatttgtatcttctaacaaagaaacactaatt a egcttttaagatcatgttgcggttctagttctlalaclcaatcatacacacgatatrtagtcalaltlaactccaaaac a claaagaggcttcttcttggttctcaaaocttt~ctc~totl~_ccatagttcttatltgtcttttgctttgtatcttc ta acatggaaacactacagaggcttttaaoatccg~tttcagttct~~ttcttatacttaatcatacal:atgacatctag tc atattt~actccaaaacactatcaaccttc«ctt~cttctccca!'ctttsate~tQta~'ccstaatct~tat~aatc tt tegctttgtatcttctaacaaggaaacacta~gcttttaaatcag~ttacagttctaa~tcttatactcaatcatacac a tgacatctagtcatat«gactcgaaaacactaaccaagg«clccctgcttctcaaagcttlgatggtgtagccaaagt ccgtataagtctttlactttgcatcttctaacaaggaaacactacttaggctttlacgattcagttgcggttctagttc t tatactcaatcatacacatgacatttcgtcatata«tgactccaaaatactaaccatgc«cttcttggttclcaaaac ttt~atggtgtaeccgaagtcgtatga~tctll~~c«tctalctictaagaaggaaacacttcttaggcttttaagatc cggttgcggttctaagtcttatacttaatcatacacal~acatcaa~tcatcttt~actccaaaacaaaaataaa~ctt c ltcttgcttctaaaagctttgatggtgtagtaaaas«t~tac_'a~tcttt~tcttcatatcltctaacaa~gaaacac t acttat~cttttaa_'~'tccaett~cc~otctasttcttatacttaatcatac~catmaatcta_'tcatattt~act cca aaagactaatcaagcttcttctt~cttcta~aagtttt~atggtgtasct~aaotccgtatgagtctltaaa«totatc ttttaacaaggaaacactacttaggcttttaagatccatttgcagttctagttcttatactcaatcatagacatgatat a tagtcatatttgattccaaaacartaaaaa~~=c«attc«gg«ctcgaaocntgctcgt~_tt~
ccatagttctlatt tgtctttcgctttgtatcttctaacatggaaacactacalaggct«taagatcc~
atttcagttccaottcttatactc aatcatacacatsccatctaetcatatttsaatccaaaatactatcaa~tttcttctt~cttctccaa~cttt'_ate~
t~
tagccgaa«gcgtatgaatctttggctttctatcttctaacaa~gaaacacta~gctaatatg~tcta~tt~cggttct aettcttalactcaatcatacacataacatctastcatottloactcoaaaacactaaccaa~cttctttctacttctc a aa~CIII~CIE_~t~L:IECCOaa~ICC~IaI'!a~IC;lII~~CIII_l?CatCllCla:lC:l:l~~aaaCal:laC
llilY~CIllla c~'attca_ttac~~_ttctaettcttatactcaatcatacacateacauta~tcatatttoactccaaaacacaaacc aa scttcttttt~cttctaaaaccttt'_ata~tclaclcsaa~lccatacea~tcatt~~'ct«ctatcttctaaaaa~a aa acattacatgtgcttttaagatccg~tt~c~altcta~ttcltatalacaatcatacacat~'acatcta~=tcatatt t~a cttcaacaatctaacaaa~ettcttcttecuctcata~'cttt~at:.'~t'ztacceaa<_tec~tal~a_'tcttt~
sctttc tatcttctaacaa~oaaacaclacttaeectt«aasattc~~ttacoottctaa~tcttatacttaatcatucacat~a catcaagtcatttllgactccaaaccacaaaccaa~cttcllctt~cttctaaaatctttaatgo~tactcaaaVtccg t acaa_tcttt'~~ctttotatcttctaal:aa9~aaacactac;eaa~_'c««aam'I
cat'tt~_c;aatlaa2lc:atatti2 actccaaaacactaacata~cttcttctt~_e«clc~a;lecttl~attet~ta~l:l~'aa'_lca>=tat~a'_tcl tt~aatt t~tatcttctaacaa~taaaaactacttae~_c««aa~atcl a'_«~t~~tlcla~Itcttatactaaatlalaaaeat ~atatcta~tcatattt~acctcaaaacactaaaaae~cttettcu_~«ctc~aa'_c«t~rtc~l'_tl~'ccata~_ tt chat!'c~tccttcaattt~tatcttctaacat~caaacactacata~'_c«ttaa'_atcc~~tttct~ttcta~ttc tt atactcaatcatacacatgtatcta~tcatattt!_actccaaaacaatatcaa~tttcttctt~cltctccaa~cttl ,=a l~oc~Iil~cceaaatccetat~aatatu~~cttt~tatcttctaacaa~saaacama~~c«ttaa~atcat~tta'~a ettctaa~tcttaactcaatcatacacatoacama~lrattt'_l~actaoaaaaaactaaccaa~c«
ettc:ctactt ctcaaagctttggtggtgtagacgaastccgtat~agtatttggctltgcatcttctaacaa~~aaacal acttatgct llI:lCQallC~olloC~~IICIaHIICIIaIaCllaalCalaCaCalpaCallla~lLalalll~a;l(c C:l:la_~I;tCla:l CCaa~cllcllCll~~IfIIcaaa~cllleal_e__'tela~Ic:aaB~tccatat~a~tClll~iCIIICI:IICII( :laaCaa '~_oaaaCallaCll VoYClllla:l!'aICC l9ll_~C
alll'l:l_,'Ilc:IIaI:ICIYata~a_':t_2_~:1_;,';t:l'?C:l:l_~~la:lCaW
la$OOOttt~a~atctcca~~aaaCaill:BaaCIC_!':l~CllY~aICII:ICllaICICL'al_~;Ila:Il:Yla tC~:l~'a:lil~Cili1 caaaewa[~il~_~:llCCl(:lC(:ICai'vYClaa_~a;I:ICICIIIaalYlil:t;tCl_'?:Iallllall:ll C'1a21a~(:_~l~alleCa aCoBlgalgaal~aa~~!'ClaCllala~a~a;llaaa~:1'L':Ita:IC:IaC~:I:II:ICI(''laaaYCl:1;1 1Ci1lall:~aaYaIlCat IallalClaaa:l:lC:lCaa_~aaaaa~UFlt~aaaal~fa:ll:l_>~_':I:ICallCla:l:IC
~:IIIC'll:;ll:ll;tC~l_~:ItILI:CC:IIYali1 C~la~aC_~:I~il:l_~CCCill~aaClCllaa_YaIYC~_Cl_i'Call':lY~l(.'C(:Cll'_Y~_'l~_'1':
l~;l=_~:IIIC~':IC:ICI_~aalC:l:l CCaCI:ICaII~ICiICCiIC~
ila:IC~_~:l_Ya_YaY!'I:ICIICaCL'll'_;I;IC;:lCall',_';IC_YICC'~I;I:IaIY_1'Yva_~_v aaa~lli' aCICI~'IaCCCaIICICatl~:ll~C~'lll:(;:la~':1CIICaa_'l'_',:1W:I;lICIIiC:_".Y;Il~ll _'a_'ll:l:llli'1;7CIC:11_' ' :l~=~=t~=~_~aaac~_Itcac~g~aga~cactacccaeaccaaatctcca~~'attaaaaa~caattccc~ac~ctt atealca~_ cl~_ca«ctt~~tactt~~~=tcetc~~c~_~cltccaaattartel'=I~catccttgt_aact«rt~caactctmaa caaa~
tccaca~=cattaccat~catoc~a~=ttttatca~~~a~,a~,ll~tcaaatccaaa=~t~cac_l~=~'~acaaa:c y'tacac cacacaaaac~=~'a~_agcatcraacactty=~taaaaa~~cat~all~=f~l~caa:lclc~~clleacata~'c«a ac~tcaC
:lyclcttaat~=Italcmcaactaaacatc~_aa~=aagalctccaa~a~'ctcgatt~'ctaacttc~
~«tg;u:~alcy=a~
t,=a~_~'at~ataagca~'tactcatatccaaectagl~eaaa~taacttccacaaa«atc~ccaaaaal~~amc ay'aatc~' a~tatclc~=gtcacacacaata''aa~'ta~~caaaccat~taaac~ataaatttctc~=aaa~aa~=a~=laa~_~' macylaa ca~calct~_tc~ttttcltacaa~gaataaaatgagccat«t~~~'agaa~c~alccaccaccac:aaa4'ala~_a, _lcgm cc~c~ltga~tac~a~~~a~lcccaaaac~aagtccalactaacatco~tccatg~ctQc~l~°~aala~'~
=~avav~:la~' ata~ao~CCC~c'.all~!~a~~?=fIIIfCCIII~~'C:ta~fl~=oca~atcc~,~~aac~ttccacaaaacoIIC~
aCC~t(:at~'~_ C'=Ca~loaa~~CCaaaaalaa~al QII QI QaCYa~L'Cl~CaaI~ICCI ~ICaC°CCCIaC~I
~~CCIICatI:ll ~'1 aaltC
acaaataatct~aa~tc~~a~~ct~caatct~~aacacaaa~cc~caaacctttgaa~a~~aatcc~ttttataol~ta t attcatc~tttaactcattctc~ac~tcpcttaatatcc~a~a~aagaa~~tgtct~taeagtataastccga~aagaa c caagattg~c~ctgtc~ttaacttgt~ctt~ataa~~tgtaaac~cttcatccgcttctttc~tccat~Igaccttgcc al ccttaatgcaactc~t~att~gtgctacaatocagctgaagtgat~aacaaaccg~caata~aaaga~~ctaattcgle ~~
a:l=CI~~C~IaC:CIC=YllaCIQIII~I~!'R:ICCo~CCaa~aCCEaaCI~Clll:aal8lll~IllCalC~:tCI
laf:lalCC(:
ttatccgatatgacata~cccaaaaaaa~aacctgtgagacaccaaactc~lafllaleEC~azCl~caaacaacttcl c citccgtaaattgctagaacagcttgealgtggttclcgtgttcctcatgtgacga~cteaagat~a~aalatramaaa atacacaaccacgaatttcccgat_aaagtactt~attcatgacgc~cateaacgtacttggg~cgtlo~l~a~eccaa a '_~rgcat~acaaeccattcaaaaaetccttctcgtgtttigaazgcggttttccactcatctccc~s~tcoaatcc~;
aattl ~alg~«ccc~ctttttaoatcaatcttt~aaaa~aga~at~eclt~ccsatu~~lccaatae~tcalcaagtc~l~gl at~gegaagcgafa~cecaccgtaatt«~tlgafa~ctcgactatctacacacatac~ccac~ttcc~tc«tct«gg :lath:l~aaL~~L'C:1~'vlaCaaCYC:lai'°aCl~a~~clttctttaatat~aCCltll~'Cla~aa~
tlCllc:laCLlYlcuvC
YCaaCal ~ICaIQIICII~C ~~aCICaI~'CaaIa~L'l~l~~ ~'lC~~lII~~LTIiI~C ~ll''CiICC
~='~ YCaCa:IC'alC I;II YI Y:1 l~'llgaatatcll~a'~l~uloo[B~CIC(:lY~?IaBCICII~L'C!'YaaaCaC~ICl~Ca88tICllY:lalIaC
~ ItatCY:la IatYYtYIYaYctaC_'_eteQatataYlceClelaafl'_ate~taacactcctaatLCaaatccttcac«Icca~_C
lcfl tctcaaacaCaaaaat~.iaacacaacat~'_'=a~tgl~=IaCaIg~CCI~IC~Taa'=«arat"~~~'Cal~a~a~l t~~~CII~L'C
YIaCC o~IooanolvaC ovaIIC~'_a~
aaa~'c~~I'=l'=ICI'l~agat~'~aa~caocacaatcllat~a=l IICC(:a~:lla aattcatatgtattcgca~ccccatcat~caIaBICIIIC~atcatactcccaa~gtc~lcccaa~at~av~'l ~avl~:lC
allcala~~l~la:lcatcgaaataa~t~cgg[CCll~l:l~aall~CCCIaI~ ~t'a~!'8ila~.'aaa(:la~
oaCIC~'aCEt oaaa czc~=aac~aaacgccatcctgcaaceaaglcaugcatat;~t~ttggatgllcticaaat~~laayc~ca~cttlc~e aca~cttcctc~_~caattacattac~.:~xaacttccc~_aafcaafaacaaaagt_ca~acte~_~=cct«~att~' tgcaagl a~aacgaaaaat~tta~tgcgaaeccaat~ttttfcac~al~ct~tg~a~t~a~=acyaaaaatc~t~ctacc:lacat at _~accttcatcacccttggtottatgc~tttcaataaac~cttcctct~Iftg~tttgoacacQct~ttt~olll,'ea ~:l~
YCIQlIIoIC( _ _~tgaccaagttccccacaa~catagcaacEtaa~~Cattc~u~CvuvtCv;tlColC~Ialll=~~~oCl~
l~'t'aICIIC:IICIIII~Ca~l~3C8C~a~Io~'l~llaa~~i'Cl:lClll~IlCl~lalClll:lC~:IIY;ICC
O:Ia:ICCa~CI~, aalll~alfl:lelclCa~atgglcgleaaCgl~~:la:lgala~:l~utg«
lga~agcaaaacl°ll~(:lCa:laa~aaCal '=l:~C'ICIQIYa''cclccgccacagt~=~'aa~gatcgaalaet~alaac~'C'~Illl~~a~ll~a~lIC~Ca_I
aCICCI;II
~aagl~l~a~accaa«gtacctcact~=ICalagat"tcattacfl~taacaa~3aa~_ataaaattcitclc C;_Iaafc;~l clacaYttcta~aCCCIt~ICICa~atlttYCaYYCYattatacalc:YtYcYatcYlaaft'ltYaYYaaYaaaaYlY
tc1 Cl~a~tttcncttcaat«tlcccaaga~C~aata~~t~CIllacICoIaC'~t~CCCtaaac~lCll~aatt~Ct~C'C
a ccaa~aagctgcat~lCC:aCgeaatc'=li'laoCCaCCaaC~aIaCIC~ICI(:ICa~CaU~l;lCaCaCfl~:l:l :IleC:lal~
tftcctctaca~ccacaa~ccaatcaa~ta~IQCaICICCCC~~ataca~CCll~:laaIIC'lvv_aalv(C~aClll c?aaC
Cff~IIICCCaaC~lafptcttgaaacatcl~Ilcclgglgicla~'cc~gca~aaaatc~_l~L'IlY°fl l~~:IIIY!_c:laa ~actaau~atta~geaQctggtcac~:1''YltYOfl~IvCII
lYC~2ICYalCl~aaC::l~C~lClI~IIYIC~I~~a''CC:1 C~~C:~aCI~CI~CI~ga=°C~'=lla~Cl~CaIC:I:1''al''lCl~allaaC''lal~:t<;laa~'CI
C:aC;~aafl~aa'.'.CCf~C
aaC~CaUlClgagao~C~3~a8~:lYfall.'~C~B:lallC:~lIll:Cilall~al~~ll~~~lCC!'YaaaCll~;
lC'=CCaaC~Il llCYll('IIIYYaYYCalI!'tlYaIlCa:latllCaa!'CCaaYa:taaactaeaat(:IYataccaaet~ala~a' ,d~_'_c~~~:1 a_2Ca88_Yla:lc:aa2C:f_2~~T2lll2a~alCICC:l~2aaaCaaCaaaCIC~a_QCIIC~aICfIaCIIatCIC
''_al'_alaal_' :llC~:l~aa2l~Caalaaa~~=Y2l~a~talCl:ICICCICa~'o~Ilaa~aa~CIClItaal~laaalt~_a~Iltl aIfalCla aaYCYl~allacaacgalgalgeal~:l:l'gclacttlla~~'eaataaa~a~
ataacaacgaatacictaaay'taatcat :llC!':lat'alll:l~'lallalCCIaaIaBCCtaa~aaaaa'_~a~aaaaCCtaalt_~aaca«cfaaaC~aIIC' lcalalaC.T
l_YaaYCCCfIt_Yalal YIaYaCYataaYCC(::llY:IaCIClfaa_~'aIYCYCIYCaICalaI;lCICaIICaIaCaCa1 ~_aC:al (:I:IYlC:llalllY:lClCl:a:ICa:llCl:laC:l:l:l~C'ilClllll~ClIC1(:ala'_CIIIY:I:I_Y~
l_Yla_~'CC~aa_'LCC~_lal Y:I~ICII~~~'CIllClalCflelaaCaaYi'aaac aClaclll~'«Clttcell~C~'~~IlClaaG'IC(Iafacllaalc:ll :lCaCalYaCalCaa_'lCallIllYaCICCaaaCCaCaaaCCa:l~_CIICtICtll'.CIICI:Iaaa~Clllaal~
__'l_'l;l~_lC

aaa~accgtacgagtctttgacttt~tatcttotaaeaa~caaacactactaaggcttttaag~tccagu~~caatcla ~_ tcatattt~attataaaacactaaeeaa~cltcttctt~cttctcgaa~ctttgatggtgta~ctgaa~'tca"tat~' a~~t cttt~aa«~~tatcttctaacaa~tcaacactacua~'gcttttaaoatcca~«gcggltctagttcttattctaaat cataaacatgatatctagtcalattttaclccaaaac:lctaaaaa~'~cttcttctt~~_ttctt~aagcttt~_cty 't~=tt gccatagttc«attt~=tcgtte~attt;_tatc«ctaacatggaaacattacalaggtatltaagatcc~_~ttly~~

ctagttcttatacyaattataeacatactatcta~icataut_actccaalacaclaacaag«tcttctl~=c«c'tc caagctttgat~=~_c~_ta~=ccgaaatca~tatgaatat«g~'ctttgtatcttctaacaaggaaacactag"mt«a a~a tcatgtigceattctaagtcttatactcaatcatacacaagacatctt~_Icatatttgactagaaaactaaccaagct tc ttcct~cttctcaaaactttgatggtgta~ccgaagtccgtatgaggcttt~gctttgcatcttctaacaag~
aaacact acttat~cttttacgattcagtt~cggltctagttcttatactcaatcatacacatgacatctagtcatattt~actc~
a aaacactaaccaagcttcttcgt~cttctcaaagctitgatg gtgtagcagaagtccgtalgagtcattg ~ct«~catc ttctaacaagaaaaaactattaggctlttacgattcggttgcgcttctagttcttatactcaatcatacacatgacatt t agtcatatttgactccaaaacactaaccaagcttcttcttggttctcaacgctttgatggtgtagttgaagtccatatg a otc«tggctttctatcttctaagaaggaaacattagtlcggcttttaagatccggttgcgattctagttcttatactca atcatacacatgacatctagtcatattcgattccaacaatataacaaagcatcttcttgcttctcaaagctttgatggt ~
tagcaaaaatctgtatgagtcutgaatttgtalctgctaacaaggatacactacttaggcttataagatcc~g«tce ttatag«c«atactcaattatacacatgccatcatgtcatat«gactccaaaacacaatcaagtttcttctt gcttc:
tccaagc«tgatggtgtatccaaaattcgtatgaatc«tggctttgtatc«ctaacaagsaaacactaggc«ttaa gatccggttgcggttctaagtcttatattcaatcatccacatgacattitgtcatalttgactcgaaaacatttaccaa g tttcttc:ctgcttctcaaagctttgatggtgtagcaggaclccgtatgagtctttggctttgcatcttctaacaa~ga aa cactactta~gcttttacaattcgacct~actttgatcctag~attagaggatccttaggttctgctlccatatcaagc a caacgaaatacgtaggcacttccactccattaatca«actggtagatcttttagcaggccgaaaggttttcttgaaaae ctatcagcaaagatcaaagtaaggtcgcaaggcctgtactgagtgaattccagcttccttgccacaaaaagt ~'~~c;lttaa ectlactgaa"ctcClaagICaC:ICaa:lCaa(:IgClgaaa;CCaaltgCC(:aitlggaaCalgglaga~_l~_a:
laY:llCaC~' alc:llc:CagCtllt(:ll~ilalaatcctcttgaCa~tilgctctagaaogallagCalCaCallCg(C:7talgaa tCCtItC
ca~_aatt~aatnctcacatctttgtgcgg~tcc~_ggattaggtt~a~_aacttctatcagaggcatcaccgc«ccat ct Iatcaaoctscttctctacaa~aeactt~tatctttctatccacect«cctsaatttccatsseasa~caest~:lae~
t aacggtgctggc«gaatcg«laaccatcctctcaatgatggctgctttctcgtccaaggtggacttagtttgaaagcg aoatccagccgaatggtcgagtccaacaactgaaatttcaatctgagtagaaacctccccgtcclgaacatcact~tcc t ~agtgaugagttg'~agacatettgagtaggcagctctceatcatgYCagatatlgatagcgtgag'c~
gt_gcgtattct ttc~'QottctYtat~'aaC«CCCI>;'L'aaatcCt«aacltt~aeaQaaea~otcQac~ca~ut~tccitcca~~ta tet ~acc«agaggtaaotgcctcaag«tgatgttaagatcgttgaaagtgcaatccaccttgttgtgga«tctgcvatcl ttttggcgagatccattactcctgctccttgtccctg~_agaatctgctgtagcatgttcttcaaatctgaatct~rga g«
g«~aaatcg~'ttcttgttgctgctgtccaaacccatgttgtggaagttgctgctgatagttgccctgalactactgcl t tegaacatacccctyaccttggttgtatggaacaaa~gggltgg~ttgattatgctgctgtgaagggta~=ac«~~tca t etgggtttgcaacattgttacttctotaagacagattgggatgg«ctgcttgaagttg«gaaaccl«ttt~ laacct ccttggt«tgcacataactgagttcttctg:lctatagagtc;tccccatcctgaacttegaac~~tctcatcatccec ea~
aaaat~aac~t_'l«ct'~ttecacta=~astaacttetcca~ctt~tcattcata~ctttcatatccct~ta_'tett tct calca~aatca~a~tt~~teca~at~cttctatc~taatcitcattsta~ttaccatcc~act~t~ccaaauctctacc aectc~catccttcttcaac~tcc«
~u~zaeaaa~ttcccatte~aa~ca~tatcaa~_aascatcct~'atctt~'~~:tae _acarctca~=Iaga'_agtgctga'=aagtgaa,'ctttt«eaacccgtoatvaoovCatt[ootClv:p;IvCCIIt f,':la<'C
~CICCCaIgcttcaaa~=aaggtttca~cattctgCtgIgIgaatCCggatatct:la«Ct''ga~lgl~'C'C:I~ll l:f~~aa~r IIgg3~aaaaalltagCCaggaaggCltlCll~Ca~lCglCCCag~llglgaClgalCCCl~~oola~CotCIICICCC
a Cagatgggcttt~tClICaaglEaaaaCggYaaCaggCgaagCllgaaaCCalC«caCIaaCW Ca«aatt«ti'lla ~gCIElitga~~CCIIICaaaCIC~ICCa~glgalC~aglgg~llallCalf~oCaoglCgt~aaattt~llCCCCI~=
aaCi alagCaalaa~aW~ClCtlgalC:lC;aa:lalt~t'IlgttttYaacc'~~'CggagggaraatgCCattaC~~c;t' ;~llal~<_ll ol_2t_vv_vaaotCoCCaaCYCCiIaI_~tl_~ll~~'_2Ca_vllCll~ctcatctacaac:el2a2CCalCQtatt a~_fal~l~ICI
YIICICIIaY:Ira,'Ci'a~C~:ll~'CuY.lCgal~li~ICglig;lad'a_ga~ottCtaatglCCtl'=ggaCC~
l~llt'=cat:l caactcatatca~cac:a~cuaaaeacaccaatca~acact<_~tcca~c~aa~tt~!'teaatat~a~'tacctaa:l acaca caaacacat~a:laaa>?acaca'_a'_ma~taacuelaacaaaceaactt~atcttaac~tttccaaaaatcte:;ta :uu a~caaa~aaacac:ctaatl~~caac_'c'c~ccaaattsataata~ett«tca~atcaattaatttlaaa~rartatc at~r tc~«eta~taet~aa~~t~tcaalccaaateealaloateca~acaaat~a~atateatcasaa~tt~claa~wa;wry .
ca~_'ca~aat«g~lt~Cltttcaaaca~~tcctaatgttaat~=cagaaaaca~taaaaagtcata~aa:u~=act,' atat:I
CIC~aIC;'C:1~?wIC~:ICI'=Cl;:l~;lad'~'~aCCCI'lCatCi'a';tacaaca'=gat~aca~a~'a~t:l :l;l:lal!_laYaaal'''r :ICCaYtIIaC(Cl:llC;IC:ICa~ICIC;ICt~_(taL'la~aCl~~tHl~atC~aCta~a:lt~°ll~a :l:laC:lY;1;1:1:1;1I21C'l:l aaallCaala'~C:1:1~1~:I;Ilk;la,'Ca~~laa'=taaaca'l:lga:tatcaaat~~al~'a~a:l;lllCCl ;l:l,"_;It'r'r;r!_:1:1:11C' ~aatagtetggtctccct~atcaattcagtt~gttalc aaalclcaattaagctatccctagacaacaagttcaacaaca ~attaattcattccc~t~=atagaaatcctcaagcata~'cta"acact~aCCtaattcccaltagcgatctctaactt agc aggcaataagactgacig«ilaagccaataacc~ctctaaac_~'ccaagcccltg~_~_ataga~'acgattatcact ggaat tagtagatcraal:gttccattaaatgccttttgagc~
tt~,~~,~,~atc~_gatlccaottgatcagaaaggaataagcagta a:IlClaCaC«aaCCa:iC''gallCgaaCal,~_~~«Cl:lCiIaCIaCICtalCCCIICCICaaaaalCClaatCaC
laClCl ~~acaacat~=cgtlaagacaataacatagaaacgattgaa~,attccatacitaagatattaaagg~~aatgaaacaa agat,=
aacaagcaaatgaaagcaaaatctaaataac«agaa~aaaaatcgaaatacaaagtt~_cacgaatgaaaa~_c~=gcl ttt gaataata~_tgcaaaaltcgtgtctcaaaagtgcalgtctcacggcgaaaccctlaaacacttatttatactaagata ag aaa~cgggcttcca~cccatttaacgacccttggacgacgtgatcaagagt~tcggctcaaagtgggcttcctgatggg c ttcgaacg~=gctltgaatgigccttctcttctc«tccgtt~=clcagg«giclgg«gaglgcagattgatcagagggl Cgagglggt~olCoala~gtllgltgaCa~=aalgglCl~~gc:lgCIICIIggCCICLgIgaCCgagtatglglglCa gCI
~tca~~at~~actagttcttctcctctacgalcgagtagaactggtg_actggcaaaglaggactgc«ggtcgcclctg v CD Cff cgatcgaglagaactglaaaatggactgcctcctctcclclgcgatcgagtataaglgtctggatggaclactgagclg a clgcttccalclctgctactggtgatcgaglalgttggtggaaatgtgact«cig«ctglttttcctctttctgagca caaatagctccaaagcacclgaaacaactccagaatgcagaatgtatgcaagactagtcclatactcgactaaagacgc a caatgtacaaaaatgatgatgaatgctatgaaagatgggt~aaactatactcaaaatggtgataaaaggatgtgtaaaa c atgcaaattatagacalatcacagagataalatcccatctaccctgagtataaccaatgaagctctaatggtaalccca t clccaatcccaatalaagaacttaactctaccctttgca«catgaatcttttaatccctllltctaatcattctlttct lttctattaactctaggagaagctttctcaacactttgatacatctagcggattt~satttctttaacaactaaggttc c tgtttaaaaatttlaaaarcgagggcll«glc«tcttca«agctaal:aagaactctttcttttcttlacaaalccca aaacclltactagaatttttctgcttcttlagcltacttcacctaoagtctltlaccttttc«atcaatgaatccacct ttttlttctlltagacacattccaacccaaagtataagcgcccacctaglcctatctagtccggalaacgcgaactaga a clacafgagacactalctctgtcattacgaacctaacact«claccaa~cttaatcgyaataagacctcacaaacactc acaagtgatatagggcltgaaagaaagtttaggtltga'=ttcggglllaactgctctaataaggagagtcagctactt gg attaacaagagtggttaaaatgagtaaaaaaatccaagta«ttaggglatccalaagttcaaatlgatag ggaatggta lcagatcatgacagtgtggtagaatagagcaatctaggcalggtgcagtlttacttaaatttcattgactatgcaacag a caactagcaacgagggeaetgagtttalccggtgaacaaaaatgc«acaaggctatctcatcattatcccclatgcata tgcagcaatcctaaafgaaacaclclagaclcgatcclaaa«aaatgcaactatatgaacactctgtttttgtgaaatc attttctcacatttttcaaaaacaatttatttttttatgcctta~=atgaat~cagactcaagagtatatacaatgcaa ca cacac«c«cclgagccctcccccaaacttaaatcacatgtgctgcacatcctctagtaatccaagttg~~ccagagta atcaagcaag gatatcatgtactgcaaaactc ~ucgcct~atatacmctcaaatglcaataggttagtctcaaagaaa acaatgaaaccaaaattgttatcaacagacaalgaaatggactatattggtcgagtacaatctagtactcgagtatact g aacttttCCCga~ttctlctttaacgatttgaatc~a~t~gaaca~aaatC~gBggICgaglalalClgllgagglCga g ttcagccagtgaagtcaagt~=~=tcgaetaaactatca~tc~acg~tcae~=aatttgcagggacggaaccatgaatc gacg ~tcgagtagatggtgcacitggttacggttgt~a~at~~lcga~tcat~tg~itgtaalcgagtagttcttaaggcttt a 
acggtgaacgagatggggaactcgactgcggtgec~casa~~l~~tc~aetacggcga~ttgtgctcaagtgic«lgct CIICgC gglC g:IglgllglggCgagaaCaaaglgaC ggt ~'aCagCtlC ItCll ggt:CttC~=aglgtag ~'~ ICgI glgaaa ~~tagagtgaagtggtcga~talggcgac~aaamga~t~aaat~gl~'tcaaaacgagtaattctttggacgagtgaga g afg~Cgalllgtgglgaaggilg8~aaattg~agl~~(C~=~_l~_'a~a~alg:l:lg:lg:lgill~'ggl(:ga~l agBVlllac:Illl ~aatagagtga~ggCgaa«ca~aag~l~l~ll~~aaaletaltctatLatICtctclal:alCll:lIC(lC:lggaag ililil acgcPt~;tgglccatgccctg«gllttgtctttaagcttag«ccct~_«gactgggaaaacctgaaatcctctgtaag tctctgttagtcgaglatatgg«gaatataaaggatclgaatcatcaaattcactcgactaaagatcctcatcagcacg tgaccaatgagatcgagtgacaaatgaaacctt~tttatactc~at«cacagagaaactcgatctccacagcatglggt cgagtgaaacactgagatt~=gatccattgcacttaaatatt~gcagctamctgaaaacacta~=taaaacactagtaa a ggcatgggacttcctcccaagtaagcltgtlltaagtcattagc«gactccc«attctlltgatcatcaagaag«cc lggagatgaaccgacgtcacctctggaaggattteatctgctaa~ta«tc«~aotc«tgaccatttactgigaaatc tctactcetaccaoctasa~teaccsctccataa~~ac~aac«caleataca~aa~o~acc~~actatctasacttaa ~ctttcctggaaaga~tttcaagcgagagataaaaaaaacaccttatcacc:aacctg~=aaatccgaa~tgatgatct tct I_olCalooaaaa~CIIa2llClCICCIl~la2alllt:tYa>_,c:lilt:ll:l:l_'CllClaaaCai'alc;ICa lCaa_~9IC:iCIi' agttggalCaaCCgCIICICCICagCgi'lllllal~lCaill~ll~ilY~it~llll:IICYC
CC:ec'alagCIlIYIaCICga~
ttcaac:aggta_gtgacalgatt«CC:I«lgaeilagallgilailgg:l.llgtaclaalgv~C~llCllgaa:l'!
Ct~'ICCaaI
aagccgaCa~~l~C~llitlCgagCllllc il~aCIaYlClllCellYla;ItCiIC;I;IC
ilglClllICCaaaall~CltItalC
tccclattggaaatclcaacttgcccgcgtglctglgg:il~=:Il;fil~,u;l~'tyv~y_aCCllglgCllIaClc calgClICII
Ca'.'.aaggtlll(;aaa:laC(:ll~Il!!al!'ilatll~C:lllt~(;;i(;W
IIC:IC'll;IlC:;1(:I;IC'Cll~'=''aa(;ICC'.aaaICIC.''_gga a'~allal~~tCII~ aaCa~CIllil~C _Ca:ICIIII_'Call'_ll_~ _'IaY_~:IC I
_'YC':I:IIa<_CtIC ~aCCC iICIII_~al:ll:l IaalClaCaaC~al (:a~lel:il:lCfl~lla~~~lill~'e:l~at'_L' ~'il;lt_'aal'C
c:;ll:i:laatl~:lIlCC~I :llaCill~tlil:l atclcaactttcaagattYggttctgaYgcatcteatttcttctact~at~ugccc«tctct~acatgaalcacatt ttaaaacaaactcctga~cgtccttaaacattgtta~'ccaccaaaaacct~m«~ea~_aatrlttaacacl~~tctt~
'aal _~tc_cgaagtgaccaccatat~'cg~'a~~ccat~=gt:lat~_ea''caggataccttctac«catcttccga_aca catctcct ata_atcttatctttacagagaytgtaaagataa'rYclcatcueagtattaatggtgtat~_tccctgaaaaacttct icc tttcalaactggtgaag«g~~~_ag~~ctctactct~=cat~maaataatt~'acs~teatca"cataccat~="a~at cttccc tcaaca~~cgtUa~~ttggtg~aactctttccc~'ct~'tagctcccacc~aaaaallclacaaecatta~_tt~_ctc ttct~=~=
(:aflgaglC''IClalCtgaafagglICllCaaIlClCaICCIIi'aCa~al_'YICa~CI~CaCCaIIIICaaIgCI
IICII
YtcIlclgt«ccatgtcaaacttcta caacagaagtatccatctcaata_'tctlg~_Itla«falccltcllagaataca tatgcctccactacaagilaaacaggc~tttaccgactacaacacta~_tc~tttcata~tcgtaaacagaaataatal :gac taatagaCgilCtaittcagagtaEtcgtatatatatagtc~Ctaa~_caacata~Itcot:ltottagtCgl:la;Ig ltaCgaC1 aaClIaCgaClaC glaaCaaICg gCalgaaalaglC gIC fall' SIC ~laaCallt ogC
uaCIaCg:llaC gallaalla8 cctgtttatgggggattattcgtgtagtcgtaaacta~tcgtactmcatt~
acgaclacttlgcgactagtttacgatc aa«gtattaaaclctgggcactattctggttgtcgtatcttaglcgctaaataatatacgactaatgtacgacgccttt ac~actacctaacgactctcttatattgicagacgatagty«aatatl~~t~actat«tacYactatgatacgacta tacgac~actattggacgacaatttgacgacgacattttgt~'tccaaataa~aatttggtcatatcttgatttcttt~
tt ggttggtcatcgattctgagaattcggtitgtttgttgttctaaalgctataaagtataacaaaaacacatttttatig a aaatcctatcaaatacataaaacacaacttacacttaaggttcaaagtactaaacacacacacaaacattatt ~ttcaaa gttcaagtgaaaagggactgaaatctaaacctaltgatcattatgtggagtagctggtagaggcictgtggttglcgaa g altgagaggccatgaagtcaaggaatlgcggatctgt«gtc ggagatacttctccactaac gacaagtcaaggaaltcc aaagallcgggcagtctaagcltcacgcgcaacaatctcagaalcacgcttctcattgtagYclacctgctcctcaatc t tgcgttgcgautcttcagccattcttctaaatctaggaa~~_aa~at~_ate~augtccag_gagatlgcagttgYcac aY
cctaatgt~tctttgagactgccaagiccataagYatt~_uetctg~aatcccttty=gt~_~actgaaaagaaataaa gaa =agattagtgaatgaataaccuaaaactcaaagtcaa~a~=~ttta~'att~=caacattttacct~gagaaagatgYt IYtc aactcclcattt=tcaYcaeaaYcttccYCCtctaYctYtYacaatclatctclcaeattct~ctcataa~ttt~a~ca a tcatctccgccttccgatcaacatacgtaccatccgac«agtalgagtctctacaaacacclcaccaagagtgactgit cttcccaatttctcctcctgcactgaaaccaacacagctcaaatactcaaaaacttacalgtgaggtaactgaaagagg t aacaeaaaaacttactaattcgtcttggatttcctgaaaa~ac«tg~ccca~~aeaagtY;algtgaggaccgagaccg t tacggtcagagagacaagct«ggaata~gtccgactcr~_tt«tgt~ct~~ttc~=~t~tcccaetaagcgcacataac a ectgcaccctaetlctccttgcggtgctaaccatgtctttcaagcgcc~utacaaacctcattgaagtaatact~caca gtctctgttattaagggatcccaactgtgagttttctacaaaaaatt~tagc«~a~tta~t~aaatttaaaagattaag clggaaag«aagaaagtaagagaaatgcalgacagcttaccgcaaaclcaagaaataacctctctcttttattcactgg tacacatgaccagttatagtasg~agcatitgtgtaaaaatcc~'a"tgatct«claaccaact«gatcccct~tcgcg aglaaacctccaaacacatgaaacaatgaaa~ataaaaacatgttacatta~tgaaactactcaacataaagaatagca a gtcaccacaacagaacacaagagagcaagccaccacaacagaacacaa~_tat=ctagacatgaccactatgttcacag aa =atatagcaagcgaccacaacagaacacaaglaagcaagccaccacaacagaacaaaccgccacaacagaacacaagag a Ytaa~ccacracaacagaacacaagatagcaa~~ccaccacaacaeaacacaa~ata~_caa~ccaccacaaca~aaca aac cgccacaacagaacacaagagagtaagccaccacaacagaacacaa~_ctcatcaa~=acatsaccacaatgttcacac aao atagcaagcaaccacaatctaacacaagctcatcaagacagta~tattacaeaa~ctaatcaagacacaatccaatgaa a caatctaatacaa~=cagatcaagacacaatatatgttcacaacaaattaacaaa~c~'Yac:,caatgaYeaalccav c;acaa tctaacacag~ata~caapccaccacaaca~aacacaa~'a~~a~taa"ccac:cacaaca~aacacaagctcatcaa~
acat YilYclttatsttcacacaata~caatc~atcaaaatctaacacaa~ctcatcaa_'acaacaa~~ctaatcaaYacac aatc caccacaatctaacacaaQcaeatcaaeacaacaaYrtaatcaaeacacaacctaaeac;aaectaattatt~ctYcaa cc taagacaagcauctcaagacaacaagctagaggatctcaacetaacacaa~ctaatcatt~ttgcaacctaacacaa~c t tatacagaalctaaaacaagctaataagttgaggaagagcacataccattcagtcttcgecttgs~=aataggasacag ta cg~tgYtccacatttccctcccgggaaccataagtatgtc~'ga~aga~=ccctc~cc~a~'tcttctt~~tagclccg acaat Yattgaggagctccctcttgaaagttgttgccctgagaagatggatggctgtgmcctgaaagttg''galttggagact c C~~aggCICIIgglIlagaggalCllClggCgglgYallaaaCl~'C~'CCI(:llCaVaIC~CaoCg~CgvCaoalCa aCaC
a_Cf'raolnotoYaYCIClg~ilafagllgat~_C~~'C~~Ci"_al;l~ll~'CY°_I;IYI~~~~a~ct aac~=tt~a~~CICCa~°,C°
YCIIga'_~C~IaI','.C~ICagaggagClIgaQ~CgYaI~CYll:a~a==;1~'CIl'';1';~C:1CC~'CCgCC~
~:I~l=:la~'tC~l~
llgctlag~_ca~_C~=acgclgglcllccaacr~~Iglg~~BICCICC~'~I:IIICI~L~'la;lC~ll~~gwiCvl CII(:(:lCl llC~l~a3CCICCaC~aCC~~CgataQoCal~Yllguaa~'augaa~2l:l~:l~~:IIIIY;I:I~aa~a;la~a~aa gill~L'allC~
acgaa~aaaga~aa~atgattcaaagaagaag:lay';liclaat~aa~'a~;lagag~'C~'aa~a:lg:lagll~ca ~aagaagatc 'aa~aa~_aa~'aacttgaa~aagaagatcgaattagatc~;Ia"aa~~tlmnl=aa,'aa~'aagttgaagaa~ata~
atc~aa oaantlaaCCIIIYYaIYaaalYaYaa_YY~_aal('Y;IIC l.'a:lll;l'~_'_cWLIC'lll:ll:ICaC
LIaYICYC:IaI_TIaYIC_Yl alit(a~lCaCfaaIIIICII~~otIlY_onCCoCOaaal;la.'aa:l.'CCC;IaC;l;ll;t;l:l_~;IC'aafl _YICL'CCYaBtaYIC~
tlYaalaolCYIIaCIltllallYYCllli'_'YCIICYaaatlaClaa'~CCCaaC;l:l;Illt;llYYt_'ICIC_ 'c;ItYllilvl' ' cgtaatatcgtCgtlagCttt:lll'_';gIIIgggcttcggaattactaagccc:l:lC:Oa~l:ia:la:1'_CCIg IC~Caaglla~, lCgl:tllala!?lC~_C:ICat(attlt''nllllaggllC'~aaalfaaalagCCCa:ltalCalIa~aC'=ICI'' lC''lCg~ll:l ~ICi'IaCIICa~IC
~=IlaallllaltgggCll~~g~=CllgggalglCgaCaYCCCaaCaaallaallalalll:l~'aagiat tcal~'alaa(tataaaalllctalaalalllallcaaataatlaaacalgatat~attcttaaacaaaaaatnc:iag la :lafa:la8a:t~l(:lafatacatataagc:iCal:altlagaaClcllCaICaICIIC~:IIIICaIC:IIC'_ICI
'_a:l!'aC'':lta aallaC:IlCaa:lallCaICCICIggttc:lgCatclCCtaCalCglaalCga~l:ilg:lll"=~IIIa:l~'CIII
CIaCC~'ga ttgtgta~'lace:igalCllcaall,_lagtcagcaacacagcttcatcattttctg«t,=caaaa~':i~'lC~_ calcatl~~ll llCalallC:lCCagaaat<'tIICCCCg:lggattaacttttaaaacatttagcc:laat'!C:IIIItY~ItICIII
~ talalg !'atacggaatataacaaactlgatctgcgtgcgatgctgaatacacaaattatacattaataaalatacaaaattatt ta ataa«tatagagaacctlatgtattcltaattacatacctagaataaatggctcofalltatt~latttccta~atgat a~aacgtclacaacaccaccatggcttc~ccgagt~=cctctaccaacctttgg~tcataccacttacarttaaaaagt of gat«tcaaattgacgatccccicatactcgagttctatgatatcagttaaggtaccgta"aaatcc~=cttcatca~at g aatccgcgtaa««clcc«taacacataccccatagttacatgtttttcgtccagctccalgatttttcgtatgaaac aaalalcctcgtolgaaatacattggccaagttttaaccttgttcaaaggcccatgcacaatctcttgaacrcaaatag g aaatagagaa~tag~~ttccaaaatgtlacctacataataaataaacaatatcaatatcaacataatttaaatgatcac t tttaaaaattactaaaatgatgtagactlacatattct«aactcattgataatattcttcagctctt«cgcgtaatgt tctttttcggacaaaettggatatttcaaagaaagaaaatcctcgaacatactatatttacaacc:gaagataaataaa tc aaagttatcataatgaaaattttaaattaagaagtgatttaaaatacctctcaagtggttgaaaata~
Icaraatttctl aaaacatatgcatgt~c~=ctctggtagtctttctctgtaagccattctacgtgcalttcaccacttggacggtcgacg tg cataaatatgttagotactccagggacatgatatacaggaacttctccttcatcgaatcttg«aatctgaltcatgtaa ttgatucaattctgaa~tatagattgcgaatcgtaaaataaatagttatatttacctt~attttgttt~tatatgatc~
' ~_c;aaagtagtgagtcaccaaacagct~ct
SEQ ID NO: 185 Arabidopsis thaliana
saaticactca~lata~~cctt~cYaccteacttt~atccttQCtsataa~lc«caaaaaaacccttt~_crl~ctaca agatclaccaotaal~=attaatggagtggaagtacctacagatttcgttgtgctlgaaatggaagcagaacctaagga tc ctctaatccta~~~a~accttttcttagcctccgtgggagcgatgatagat~tcaaagacl~gagaataagtcttaacc t lgggaaecacatgaagctgcagtttgacatcaatgaaacttcgcaaageacagctgtagaagaaaagatcagggclc:a ac ctcaavcttc~sgattcaatcaacagaccaagcacagcctctacacctgacttgcgagatctcaaaaagaaatctgatg ag caagaagaaal:catagagaagclagctcagacagttgaggaactlaagagtaaacto~=atcat~~tgcaagagaaae cma atcaaaalYCa~oatt~:lcactatccc~aeaaaaaa~attacitcaa~at'_etct~a~__=aeataeattatcaacc a,_aa,_ a~aaa'~a~s~eeta«IC~a~eaaa~aaeaattca~tatlct~ctactcatcW
caaea~_a~~at,_ct'_aatat~at~at oa'~atca~a,_ac_actat~ca~~atcttctctatcacccattttcttcttaat~a'_t~toa~oaalcaa~cta~a~
acl«
aaacaa~ctcac«~_'=a,_~aattcccaa~actetttct~laaataaaacttllatlttctlstta«ttt_'ac«~tt t tt~~ttgt~Tl«~toattCtca~gaacagagaaacagcgto~agglagaglaaaaalllaaaaattlIlaCIClaOaga g caacagg~gatr~aetat~tcagaaattcaagagtttgaaaaacttctgtt~cactaaga~~ccat_'aggtc~a~taa ~l t~~tcga~tattagtgatgattttaaaaaacaaaatlttgaaattatacttataclcgaccaaca~aagctacagagac t tecatggagtttaacaagtttacaga~=gatgac:Igaagattctagtcaacag'agaacaglgcttcaggacaaaca~
'arac agc=lggcc~amacctctcacctttgttcccccacgcgtttttagagataacaaaatctc«ccctccttccccaccac tc~atcmac:cctatcctttcccacc~acatcatctctcttcctctccaaac~clcaacctc~aacaclcactctcact c acgtctctcacal gaccaaac«cactcaccatctctcttcctcaccaaamclcgaccacmcatlacactr'_accaca cc~ctcacte_~'ateCCacacctcc_~~'cttclccatctctcac~~cltc~tclcccateccac c~accaWaCaca~eC,_ C1~_CC~_ClI_'C~CC':lCl_'(:C:(lC:iIII;IIICfItC:ll:ll~aCC~~ItIC~CCaII:ICICaICaIIC
It~aCCtIL'lC'?Il' C:ICIC~aCa:lClWi'CCIIICaCtIC;~'l:lg~a8g(:lllaCCnICll:aaC'TICIIC:ICCafClIl:aCIC~
aII~~C:IaC''aC
,_CtCCICICC:IaIC:lal~l:alal:IC1'all:YCC~'_aCIIIC~_CCaICICCCtL'ICaC,'l'l_~I,~IIC
aCIC,_aC('ICaCC~I( CCa~CIICa(:CalIICICaCC_~CIC~I_~tCICCICIICactcaacCLTclacac~aCCaafICICCCIICIC'~CC
aIIC~I
C~tICallI~CC:i~:lecilCICICICaIICtcamaacacIC~aCC~CIC:IICICICCaCC,_<_'a~IIC_'aa!.
TaOICICa;l ICaICIaI~'CttaCIC~aCCICaa,_CC_:IIICaCIC,_,~~ILaa~Clla_'_Cll:_aCCnC(CIfCaaCCC'_C
CaCIaltlCl C;_"_aL'aanIIIaCCI'_aCCIC°CC~'CaC;lal:C,_caaacaclCYacCaCa~ICCIICIal',_aa Caaaal'aCCYaaYC(:aC' aatltCaCICIaCIC,_aW
''_lattaml:eaCI~C,_IaCtt~aCC~<_'llla~t~lfl,_C2tIlIalila_':ICI:IaCal:ilf _ _ eatat«~gcl«oa~llaca«cu«tcaggaaa«aatatgaotaactaca~t~r~cgaatcctccat ~gat~c~~~~at tacaac~tc~at~aa,,ct~aatctt~~tcaact;gaccagagagagagcgacag~~cttat~=a~a~'cucagagct~
_a~ac ccaacgctta~'ta~~clcgaycaat~agaaga~aeclgaga~tgcta~ra~gaaaga«agcaat~'acca~~cagata t~'a~'a tgatc~ac;=aa~a[att~=ac;_tcga~_tat~~agcctga~=tcat~~gcaca~'aoaeacgaa~=ct~'tt~,aar ag~_cctgat~aa gt[aca~=t~r~_a~=~ra~'mcam;y~act««~~a~'ctca:ltgacttct~r~'g~'aac~_;y'gtacccct~'tl atca'arttlg,~c ccagct~~~~l~,ctact~~=ag~_act'tacaacacttattc~a~aaglgtcatct~'_ag:lc~'ct~at~~tctta cecatac~«~=
c«acaagaa;=~aaacaata~_agtttctctctactctgcaagtggagalgtafcaa~'~'acttaca~;cagateagc tggag a~tgaa~e~tt~=e~=;=tlcttgactttttcagtgaacgagcagcgttaccagctatctafcaa~=a~cit~~aaa~_ gatt~«
tggtttccccaat~~aaag~~=aactaaacccaagttcga~a~'ggaagagttgaaaaatltglggttaaccattggga acg atatagc~ctcaactct~c~a~~tctaa~a~caaccagattcgaagccct~tagcgaat~ttct~lactctag~~aatc t acaggcatcgt~tctaacaca~=ac:a[~~=agatgat(gattctgcac[caagggcattctccgcagaacaaagggga agaa gglcctgaaggoc~acctcaateatgcacca~'cgg«atgctlctgttgatccacct~_t~tg~=atacagoaa~tggg c°c acaccaacgagaagaagagggtgcgaggagccctttgtgtaggtggtg«gtgacaccgattctgattgcatglggtgta ccgctcatgtctccaggg«tgatcc gag gatgatggatttagalcatttgcgtc gttgtgagtttctggagcacgacat ggttggcgatttctalcgctacaaattcgagcactccttgacccgaacagccaacattctgc«ccctgcaccgaggcca taaccatacttcagggcgaaaaca«gacttcaatcctgcgcgtgattacctctactttgagagcgctccaccgactgat eacaacg~=ccctacaaaagaaectaca~ag~atgagatt~ctgaQaca~at~aggatagaaag~a~gagtacgalacg ae catgtatcatttca~lgagaatgtacctccagcgcgggagagcaagagctt~a~cgaagctcacagaaacaacagtaag t lgcagaggtggtgcaagaaacaagataggctactcatcaagtgc[tcaaggccatcaagtttctgaca~~acaagctaa gc tgctcctcttctaccacagctattcegca~ogagagcctcctcaggacat~ccctc~aagagatatgac~=cgccagag cc aactcgccact~~cct~a~'ctaa~tcaccacaggcctgagcctagtgaccgagtagtcccaccagtccctgtgtg~ca tt catcattcaagcctcgg~'agcaegggaoaaagaagaaggctgcactcgctcggtctggcagtag~ragtacacgactt ctc CagICCC ola~Cfl;IC gCaaCC aC ~~lgatggccgcilgcataagaaCagBgglC
gagl:IlC:lll::l~':t~CgClg(:I~~CC
CgaCga:lggagC:lgit~~lC~:IgIaCC(:CCilgggggaagClaagaCdCaaCa~~ggagallC[IC"aIggCClg ggagCaal C:BCaagCa'C:Iall~aCuaCCaaC[CCOCICCI[CIICCaCI~ag~taagcacctatCtCCaccattgtaatatacc alc tcctgtttttattitotttlt~_tgal~_to«ttgtcctgagtactctcttccaaatttgstcacacagtggact~t~t sa titaa~ttt~~~=~~a~,'~ctca~~aa~l~~t~tgtlgcaltgtatataatcttgaQtctgcattcatctgaa~cata gaaa aacccaaaaaaattgaaaaatttcaeaaratgatltcacaaaaalagagtgttcatglagltgcattgca«taggalcg agtctagagtgtttc~tttag~alt~tl~catatgcataggggataatgatgagatagtcttgtaagcattttggttca c cagataagctcagtgccctcgttgttagttgtttgatacglagtcaataaaatlgaaglaaaactgcaccatgcctaca t tgctctac[cgaccacacggltaggatctgalacca«ccctatcaa«tgaacltgaatctø'~altta~;aattatcat gt cctg gca«~_aattt caactcatg~_ataccctaaaatacttggattttcttactcattttaaccactcttgltaatccaa gtagctaactctcc«attaga~ca~=tlaacccgaarccaaacctaaartttctttcaagccctatatcacttgtgagt g tttgt~~a~gtcttatttcca«~a~'ctt~~taoaaagtgttagQttc~taacgacagagatagtgtctca[gtagttc ta oltt~CglltIIC~~aCI~'=ala!_~aCfauClgggCgCIIaIaIC:algv~[[irLrvatvlvlll:lil:laL:l:
l:laa~~lgeal lcallgttgataa~~aaag~r~aaa~aatlclaggggaattaagctaaagaagttagaaaaaacaaatctagtaaaggQ
II
llggaalgllaaa~tlaea~raal~:IwlICtty[laa3gacaaaCICIlagaaaaCaaaalatal;7Cailtaa~'al aftaC[
alaaatacatatalall:talttll;it:«aa«taltalcaaacalatatacaa~~acat«at«clacaaaaa~ala'_ l aIICafaa~lacaaala~alltaatt~'alcacaclaalg«caaatatttitatglaa~aaa«tcatcclaatataa«t ctttt~attataatct~ae'_agataaactaaaanatttlcttgctttttcplttttsacttcaaaatatatatalaat ac aacca«tgtaacaaa~_cataaacamaaaaacaaaattaa~_tcaaaatcatasaaatgactalta~caa~aaaatgtt a atCgtt~tatCaga~l~lt~a~>aaa~CItCtccta~'agttaa~agaaaa~aaaa~aalgalat~Jaaaaa~
aglll~a:laY
aIICaI~3~l~caaa~~g[a~~a~Itang(tCtIMallgggaClggagllggg:tllaCC:lll:l~~:I~CI[C:lll ~llala~
tctgsgtagatgggatcttatctctgtatgcataacttgg~t'aCII:IOClllaac;Iac;la:la~c;llaafCalI
CflYl~a~
aICCCCI~I(aCllaa~Cll:lllClEli'avvvaCCglC:IIIBIC(CII~aCCfICaCCIIa~'CCaaalga~_ttc att~al ~aIQCaII~Clf~aflC;lC~[W
a~aaCtaatgaat~_uaaag~~aflg~lagatll~aaa~Catgt~~l:l~glC~aafa IaagagaC~~lalt~all~=alaaCaa~~C:lt~~'ClaaC~tllllga~taaaattcaatcatatc~catctfa~'aa ctaCCa aCllggacatt~allllalllC'CLC:t;IClI~atlllll~~Cl~a~tccCCacctlc:laacctCICCIICaaC[;l lolall l;llalllgCll~awvCa;1_Ca:Ia.;ICla:l~lll~~ggagaatt~_alal~IClaIa~tCtgC:1t~111ICa't '=ICCaIIC
alCalC
~IIII~a:IICta~Iltt;~f:IlL:1110:IlC:ICIalIllatalCalllClCaflgltall~C:7IaCIII=Cal=
at lage'ala~llll~c;al:lC:WUI~C;ItIIei~a~
Il~llllCa~_gtaatttyla~Cl~llt_C~a~~CaallIeL'aa~aaa cgagcctgaaecagaae:matactcgaccaccaggtcgc~=tgcatccagcaccatacltgacmrct~_~'I
a~_tgactt t~~a~ccattc«cccatctactcg;lc~cccaygtc~aolaaccteatctca~,~ccacty'at~_ayrceam~_~=cc ~cc CI~aIC'~a~lallaClIC!_'C'C:IC:ICC;ICCI~BI:C:IC:ICIC~'n[CvllC3CICl:ICC;:ICCII;ICI
C~=aCWCCI~!_ll:''a';
lalC~lCaC(C::IC(:;ICC::1;lC:;lc:C;lll':ICIC:sac;aacacaclC~alC;lCafcllcata~.:lcl actcaaatctclal(C~~al IaCCCC:IICCII:I~~:l;4IC'ft'Iy_IC
J;ltlc,n:1[tIClfI~llIlalC~lIaCC!?ClaaalC~_IICICatf~_ell~IllC
li itcgittltccaatfactcgatcaacttaclcgaccacacccagtgtcttgcaaca'=actgt_la~,tc~,a~~tac~
~ttot ctcata~;acts;ttt~ttacatg:tcca~tatactc~~accacacccagtylcigyaacagact~=t~_cagtc~~ag tatatct glltCaIalCalgtill~llaClcY:ICClgoactcgaccacatcclaCIICIiT~?lalCa~'CCf~ll'=I'~~(C~
tlgl~aCl Il:igtaattctgttatlalttclgtlltcl~~cal~_Illgctta~'~=aftgttagaaaccccaaaact~tlalIL, CIIg''Cl l;~aclf:lft~aCllClgatcaCalCtClt 'ti,lll'=Cafe;ICaCCIatII~"_atI~:ICaCCl:laaalaCfaCaaCi':ICaI
_att~~,tgtttta~~ataatt~'aClaaaaaCClaltafCatuaacat~,~aa~_c;lc'IaCIIIiICItfIC~~oa ICIILII
~ag~~tttta~ttttatattcaatcataC,~aata:lcat~=tt~'tcatatttcacict~aaac~_cta:fecaaaat ttttclt_ cttctttaagtattatagtatatttgctcctaaacactaaacctaaaccctacacctfaaalcccaaaccctaaafcta a ttccttaaccclaaaacctaaacectaaaccclaaaccctaaaccctaac~ctaaaccttaaaccctaacfctaaacca t aaaccgtacacccaaaaccttaaaacataaacactaaacgcaaaalcttaaccttaaaccttaaaccctaaatcctact t taacttcctggttctttttgcgtttcta~=«citagactgaat~ataaacaaaacatcaa~_tcatattt~actacaaa aa cacaaaccaagctlcttcltgattclcaaagcItIQatggtgaagccgaagttcttat'_aottttug~tfttgaatca t alaacaaggaagcactaclttact«tcagtatctcgttga~g«ctagttttata«caa«atacacatgacalclag tcatatttcactccgaaacgctaarcaagattcttcltgctttltaaagtatcafactataltlgatcctailacacta aa cctaaactctacaccctalatccca~aacctaaaatcgaacccctaaaccgtaactcat~aaccctaaatcttaaaccc g aaaccgtaaaccctaaactcttaaccctaaaccclcaaacctaaaccctaaacclfagafcccaaactttaaaatctaa a tcctactttaggc«ccggaatcgagttgcggttctagttcitatgctcaatgatacacaaagcatctagtcatatttga caacaaatccgctaacgaagtttcttcttlattctcaaaggtttgatggtgaagccaaaattcttatgagttttcagtg l tttgaatcgtttaacaaggaagcactactttcactctaaaacgataaccaacattcttcttgcttcttaaa~ttttata g latatttgctcctaaactclaaaacttaaactctacaccataaatcccaaaacctaaaatctaacccctaaaccctcaa c cctaaaccctaaaccttaaataccaaatcctaaaccctaaaccctaaaccataaaccctaaaccctaaactcttaaccc t aaacccttaaacctaaat~ataa~t~t~talt«
~~cats=tltt~a~catccattt~tcatcact«agcatcatatcatc actettttataccatttctcatcattt~tcatcactuacatglttaggatagatct~cat~_cat~tt~,catattigt gt t~atttcaa~tgatttggagct~,tt~ac~agctatct~gaagagcaavctgatcat~tcaaaccacteeaccccgagg tc ~agtagaagacatcaccacttcacctcaccactc~tccacgagglcgagtgtcctcatctcratcacct~laccatcac tc ~atcacatcactcgaccccgaggtc~a=_Igfcttcacctccattatcaaaccaccactcgatctcaccactctgcctt ga agtcgagtatcaccatcaccaccactc~act~c~tactc~atoaaaa~cuca~=a~=cctlcttcaficc~~cactcaa cca gacactc~agcacgaggaa~=aaaagaa~actccagctactcactc~accactt~~~tcgactaca=ttc«aatccglc c caatacttcgtcgttttataagtagcatgtac«cacattttcgaaaacaagtttttatcfa~ttttattcc~cagacct tgt_ttctagacctt«gtaatctggatttrtctttatcfafmagtattcagtattca~=cttttgttctt~atttc~tt tactattgttcattctgttatcatcctgct~ttacact~«~ttatcat~ttttcaacttg«caac~tttatgctttct gttatgatgtctga~tagt~_aatag~ttfct~a~~at~,_~tfa~a~ta~f~Ia ~aattclca~tatgctaggtgattgae tattga«gafagatcccftctagatfaott~ttcttaat~cctatt~cutc;
~atcaactggaattt~_a~rcccagaca Ettctgcgcccaaaa~gtgttc~atgaaalgtctgaaCCactaattcta~a~a«c~t~=accat~_tacraa~gtattg gt IoCagggagC~llllggCllIaaCIf~tf~allC~lael~tTCCl~ltaggllaoClCIC'=ICaat~~l'~alloaol tlg<._o actaggttaactgg:lgaglClCI~tlgc_~'t~_~'CaCItaglllltggllaal~ra:lCtlgIt~Tlcla~lnutt aafflatlga gCalgtcaatcacctctcgggaattctttatct~allyaaflc:lll~_llfallCiaClYlfoIttaCl~CafClI~t ,ttal CIgIC~CIIaaIIIClaca~IlCIIICII~IIaI'IC
~!aC(:aCCLa~tI~ICI~=~=Ca:ICa~:Ill~~l~rCaalCoaolalClp l~llflaftffCl~ICtaffaCIC~aCC I~IC;tCIC ~acctcacrta~l~'elCl~
l:aaaYa"CI~'IYI1~~'lC~aglgll Ilacl~rlllcl~Cllgaalllc~gflllclgc:ll~ltCaclla~!aact~ctagaacararc:ICaaCCl~rIl:ll l~Clt~~
CIIgacllaglgacI~CI~amacatctl~allYltay:llcacaaccat«~.t~allVacaaccuca;f:lClalBaC~
a Calgalaglgcttta~'gtttaattgalllaaaa:ICCtatt:llcaCtaaacaclaaar~tcaaaccala''aCCC(:
a:laCll taaaCCtaaalCClaCttacotfta~l~caccaaac:atcnnnnnnnnnnnnnnnnnnnnacCI~aIftla~~'ttaaf ag~
tatecccaccttaggiatctattactg~aacaatcita~at'~Igtgctagc~'tc~attacltagat,,aacltatca aacc l«allallalgCtl~CfIaoIlCla~~=~alttaC:l:l~H"a~lf~
~YalCaalac:lalfaa'=Cala~ll~a~llll~C~~a acacfcatcggttactttatatta~=I~atll~l~=ttct:l_'aatg~~Itcaataataatcctaatttact~=ctag atctact attct«atatttcctgagaatttccctaaaccceacamtaatc:atctgaatcaaaaac;caactttaa«~ctttctt~

ICfIifaCflttalaalallllll~flla~CIIIatIl~aaaclallaalClafwfulu,tttl~'ay:luCtltolno aat ttgacccfaaagtactacaafcgalcfcttat«~'aya~=a~tg~tctta~=~anaa«t~~acct;u;ucattaagett ta a~ggltta~~gttaaga~fata~r~~Itta,»~tltac~=~'utca~atua~gaml~r~=a«ca_~~~'~«a~~~ltta ~U
~gttgoatt«aggttttgggalua~~~f'=tai'ayttta"attaa~tcuta"aamaa;natactat~~atactttaa~

a~gcaa~aa~aatctt~etta~c~tttc~;=f_=t~;aaatat~actacat;_tcat,~tgta~gattyamafaaaact agaac ctcaacga~atcccaaaaa~taaa~taet~~cne~«~_«atac~_altcaaoaccaaaaaamcalaa~aacatcagctt caccatcaaa~cttl~a~aatcaaoaacaagctt~=~nta_ma~=atltafn~_tcaaatayacta~~atg:uttot~t atce ttgagcataaaaacta~aaccya:ueca~'a«c~=a~=aa~ttaaa~_tai'~~acta~=«=nita~,~atclat~~«t c~'~=ottt auyytlla~oatctat~~~[lla~~~~111:1~2c'Ilfaf~_'_tll~'°o°_illa'raa11I
;1~'"'_Ill;l'_'=l~ll:l;;~_~Ill:l~''r I~

a'~,tag~~atttlagguug''«at[ta~~=,=tgtagagitta;gtttaatgtttag~=atcaaatatacfat,=ata c« Bata IaaglCaaaflaal44laC''accactctctct~,alaacacta~'attttaacCattttaa~'ccett~tttucatYC
attt ag4ttaglltcaagtatctttlcat~caltccta~-tctcttlt44tafctttt4ac:atutgcatt'_c att~,cataccat :l4llgCa«lagalla«Ifacag~Ig:1;114:lagIaCIIIII~gaCICaaClf~~aC~tltCla~'CCC:Ia~'=a~
IggilCl ;t'_Cf4gaglIlCIg tgatcaaagga;li';'aC;il,'_=aa'=lal:l4lC'_g ~~lalllaC IC,'i'CI
~'aa;=alc:;ta~l'~_'a~'4aC~;C
~lgaggaaCaag~agl;llaCIC~aCl:alai'4a4fC~~Cl;1';aa~'1L';a'=l~a~'_,'aaalaaa''~'aga aY:lY;I:l~Cla4l4 gaCC:lagCll;CIag'.'.CClgaa'=l4 gaglgal;I=acaaaga,;ga'?CI.4lCC'aC[1114[aCICYi'44I;IYaC4L:~aglg ~Ig[I~YCg4Cg84lltCllaIfIlafCIIIIICCCaaIfaY~~lalC4fagtatt~=catataaata_,ac4tcttat g«t Illg4CCaaaalalC«t_ClltglCIaCIggaIa(laggg«4llagallIClaIIgCa[[ltC(W'[Ivtlgl~il''a gaalllgaall(gaIIfCIICIgCaaaCIIgtttatcuattaa~_~atclal4alll~ctaa~Ilfala:It~;ltflC
l~_g~
t«tatgattt«ctgagtlttgttclatgat~«ctlcattgttctlgagta~'tc«cttcaaa~ttc«~~aggat~~g~
[faggttta~~gtgga«cctctg«ctllagcta~ctaa~«ta[aaat~catg~«ccc«cl~'t« ta"~a~ttctta algClalfIlCIItctaatcaattgaaaattgattttagactttctccccctaataagtgttt~at~aaatgttigaac c:
aaclaafltcagagatttgtaga«ctagcccaagacctlggatgtttagac[gc«~«~aatltaa[a~~attegtg[I
tcatcaatgaaaattgagatlaggaagtggtaaagttalgagaatcattatcacgaaagt~
ea«~attt~gttctgaat ctattgtctagggataacttgattgagtttg«ag«ag4taalggattgaggagtccactagaaccaa«accccattt ttaagagclctgttlcgttlcattg(gcaaatctcgigtctttaglactctgttcgttttgtttagttctlgtg4aaaa t tggtglttttagtattctactcpacctatttaagfaatatgctc_gccgagtaccttacttagt~_ctto~tttacttt cc tgttcttgcacttgacagttccttgattttaglt[catttatgl««atcacttglgctct«acc«clgcttaglal agtlt4atttccgaattlgattlgcatattaacclattgtita~gattagtaeaaacacaccaaacc;tattcacactt gg c«gacttagtattcictgaccacatcctgattgtlagcaattccacccaa«a~att~a4acctlaaalgctacaatga cafaagatgcalttagggaattgacacacaaaacc«glatcaclctcaaataaaga'=gtca~'ttgca~_cacttagc ~al c~_aattcacaaagttclcggalcacgcaa«Igactaatggctattgaatttagctaa~,taaaata~laat~=aaaat aaau eaaacaaaaagtaagcaatatcaatcaattgtgagttgtgaaaacaagataataaaagcgtcaggtta'ggtannnnnn n nnnnnnnnnnnnn~'atgtttggacactaaea4taaacclaaactctaca4cctaaatcccaaaacctaaaatc4aacc cc taaa4cclaac44cggaa4cctaaaagclaaacccaaaaccgcaaac4ctaaacccttaaacctaaacc4taaaccata g atc4taaactftaaaa4claaalcc[actttaggcttccggaatctggttgcggitctac«4«atg4t4aatgataca caaagcatctagtcatattt4actacaaatcagctaa4caagccta«cttgattctcaaa~cut~~ataalgaa~=ccg a at[ctlatgagtlttittgttttgaalcgtttaacaaggaagcactactttacttttcgggat~fggttaa~~ttcta~
t ttfatattlaatcatacacatgacaacacgtggtcatattt4a4ac4gaaacg4taacaaagattttcttgcttcttaa a ~tatta[aatatatttgctcclaaacactaaacctaaaacctccactctaaatc4tgaaccctaaaatctaatccctaa a tcctaa4gctaaat4ttaaaccctaaaccctaactctaaaccctaaaccgtacaccctaaacc«aaaac4taaacc4ga aacgcaaaalgtlaa44ttaaaccttaaaccttgaatcctalt«aacttcc~~~ttat««gt~tttcta~'«tttat actgaatgalaat4aaaatatcaagtcatatttgactacaaaaacacaaacgaa~cttcttcal~,attctcaaa~ctt tg at~~gtgaagccgaaguc«aa4agtttttggttttgaatcgtataacaaggaa~cactac«tactc«cgc~atctc~~
tt~a~~ttctaettttatattctatcatacacat~acat4utclc:«atttcactct~aaat~
ctaaf4aasafcttt4t t~cttcttaaa~fatcatattQ[at«eatcclaaa4aclcaac4taaact4ta4accclaaalcccaaaa4ctaaaat4 caacccctaaaccctaaccct~~aaccctaaaacctaaacc4aaaacc~~caaacmaaacccttaaa4claaaccclaa ctcaatcaccattgacgagagctaacctaaca~~ca«ac~_aatcaa4aa~«aaa~c:caaaac~zmc44tycaaccaa t accttggtacagggccac~aatctctgeaatta~t~~«caoacatttcatc~aacacct«t~_g~c~,ca~aaatgtct g ggclcaaatt4cagttgatcaaaaagcaataggcatlaagaacaactaatccataa,=ggatctat4aatcaatactca at 4a44ta=catactaa~aattctacactactc[aac4catcctcaaacctatt4actactca<_acatcataaca~aaa~
ca taaacgttgaacaagttgaaaa4atgataa4aacagl~taacagcaggatgataacagaat~aacaaca~taaacgaaa t taagaacaaaaactgaatactgaata«gaatagataaagagaaatccggattacaaaaggtctcgaacacaaggtctgc ~~aataaaacta~aaaaaaaactt~ttt~ceaaaat~t~aaetacatectacttata~ccaaatatccwaaaccctaat aClCaaaaCgaCgaaglatlggga4ggatlaagaactgtactcgac4Caagl~gl4ga~tgg~al,~'tC~';l~I~aa IagCl '_oaCICIICIIIICIICaIf~'I~CICLa'~lL'ICI_'_~lI'_a~,t_~C~YiIBt~'aa~a;l_~FiCl4l_~;l ;l~'4llV4Cal(:u_,~l:ll ~Ia~IC~aalIglggfgalgglgafaCICgaCflC:lag~lag;lglg~t~TagillC~a~lil~l=~141~~4;1:11 Y:la~~l~a a~aCaCICi'aCCaC~'_nntC~a~[~~[lIYatCYa~l°al'~i'ICa~~I!,'al~"_a'_aI'_:1'_~
alaClC=;IO4lc:~l~~IC~' a~_tg,'[gaggtgaaglgglgatttcurtactc~,ac4tgg~'g,;tc~~agt g'=t4tctt~,agatcttyaa~ctc ctagtc~'a cctc~~g~lc~aglgctttgacat~_atcag~[ccgctfrttccagatagct4~tcaaca~ctc4uaatc:lc4tyaaa tca acacaaatat'_caacatsca['~taaatctalcctaaaaat~taaaeteat'_acaaat~at~a~aa;Ue~tataaaa ca~t '_atcatat_at=ctaaast_~at~a4aaat~=at~ciraaaacatecaaaataca4acttatea~'ta:uetm:laat t4t;l t4aa ~coagtaa~~~tctg~[~~'factatcttc~ ~atcc4aaaa~atlagaala~tact4aac aamat~'cwtt"atctg atcaacatfaaccacac~atccc:cattalcccttct'_a<~aaac«aataa'_a«ct'~a~cctn:u~_a'_caa~'e ~c~act4 t'_l~eaataatctt~t'_tta~calcacc« cct«aaccat«~attct~_,ac«
4[~ttt<_'tuaaa'_eauma~a~,~a li ~~cagc=aa~aaattccaatttlt~'c~~~'raacat~ttra~=ccctgaataagaagtca~_ace~'att~'<~acal caatt~a~~c ttgaatattactgaa,'ccacgtctattca,'accts;taca~=gccttcttagcctcttitaaga~_ttuctaa~'~r aaaaca t;aaagaacccaeagctatctccttctgccac~wa~mcaaaata~'aa_agataaaatcccgat~=egtc,':laagaa agaa aa~'laCll~aaaCIIllctIIIII~';laa~a~'~'c~'~':l~;l~=II:lIlga~~aatgaccatacacgel'-C~f~~tC:l'_a~ICLCI
:1~_~:1~'~=~lCaaal'a1;1''Ca'_aC'_CC1'll~'~'c'a:l,~la~Caa''CCa;lC;l:lCCatta:IC:I
aI~HCICI:IIC;Ia';CllCl'IIa _';lall~'gallalClli'CIA'uti'a11;1~~aCC::la~~l~'tacagcacacccctacacg~'tao;llCCaCa a,'L'ICaCI'.,'ICICtC
:Ilacaa~~Cttgcag~'W
ctccaa~_ccu~'la;lggaaatattagacgg'catgagagagtaat~'ctrt,'I~=aca~'aa~_caat tt~yttaaagtctccaacaacaa~ccal~ga~~aa«acaaagag_a;aggaa~rca~acaacc«gtaatatcatcccaa a ~agaccttctctca;~'cacaotaltcctmcataaacaaaaagtaacagcaaac_actaagcaatatca~'gaa«ttaa l cgagcagaccattaactgatca~=tcctgctgaaaaccaaaacagaaartoacggatcccaaactatcca~atccgtcc ca ectctgaacagcaataattagaatcratcctccaacct~~ta~agtagcagccaaaaca~aactagcalt~tcctaa~c ~
acgtgagttaccaaaaaataacctaccag~_agattattc~'aagcaalccagc«cttactatccgttg~'cga~=acg _act a«aaggcctctaacatttcaaaaacagatcgccataa~Ttaacccaaaagatgataatog~aacaactamaatg~aaao agaggactggcg;~~galcclagagtcatcacatctcaactta~acaaaaagaactctctcttgaaatga~cgct~ccg ~
«olgccacatttttcaa«gc«c«cttctcclta«ggacatcgcaccccaacca~~a~_sa~_gag~aaa:le~'aggat caacta~ctgatccctt~t~~accgaatetcuc~acacacagactagttggacctccatc~~a~acctttgaaatagaa ggcgccaccttag~gatctcaeaaagagggctaagaccagcttgaacaacagtaa~a~aa~aaeaaettactgtlatg~
tgccttcacgtctgatttgggaatccatttctgt«aggcaccgctgutaacatcccggataaeagctcttt~=,'atcc at ctgatgacttgataggaccaega~~caa~ct~t«gatctciltt«gaacgggacctccttcttttgaac«~=ata«I
atattgaatcatacatataacatctagtcatatttcactctgaaacgctaaccaaaattctttttgcttcttaaa~tat t atagtata«tgctctaaaacactgaacctaaactctactccataaatcccaaaaactaaaatttaacccctaaacccta aacctaaaccctaaaccclaa;ltccaaaaccttaaactcttaaccctaaacccttaaaactaaacactaaaccmaaac c atagaccccaaactttaaaacctaaatcctact:ttagcttccgaaattcggttgogtttctattttttatactaaalg at atacaaa~=catttaetcatalttgactacaaatccgctaaccaagcttcttoatt~'tcaaaccttttat~~_t~'aa ~'cc~_;1 atttcttatgaetttttt~tttt~aatc~tacaataaggaacctatactttacctttcgg~attt~~tt~=a~~'«cta ~_t «latattcaatcatacacat~acat~ta~tcatatttcactccoaaact~_taatagtgcacaaactactnnnnnnnnn n nnnnnnnnnncttctgtggtatgctgatcacgtcaattacctggttagtgotgaagagcecccaaatc«tyagctatg agaagaagaagttcttcaaggacattaaccatttttactga~acoaaccttalclctacacacW
~Caaaaat:l;lgalc tacaa~agat~tQtctc;lgaagac~aaatc:gaaggCatCCtactoCaCIgCCatL~;~ICI~=CClal~'~~iL~CC
aC:ItIYC
aacgttcaagacagtgtcaaaaatccttcaagctggtttttggtggccgacaatgtttaaggatgcicaa_'aat«atc t cgaaatgtgartcatgtcagaeaa~_agggaacataagcagaaggaatgagatgcctca«aatcc~atccta~'aa~_t tgag atc«tgacgtttgo~gaattgattttatg~~lctatttccttctlcttacgggaacaagtacatactgglcecagtaga ctatetatcaaaatgg~t~eaa~ccataactaeccccaccaatgatgctaga~ltgtettaaa~ct~ucaa~acaatta tcttccctagatttgga~tcccga~a~«gtaatcagt~ac~gaggoaaacattttatcaacaaggtutt~a~~aacc«
ttaaagaagcatggagtaaaggacaaggtagccactccttatcalccacagacga=rgg~'ca_gtggaaalctcagac a~_ ggagatacaagcaattctagagaaaaca;tgggaa«acaaggagagaltgglcttctaaactc~'at~~ac~'cactal gg~
cttaca~aacagcmtcaaaacccctalt~~cac~actcctttcaacctcctctatg~aaaatcct ~=ICa«t~cct~tt ~aactcgagtataaa~_cvat~t~'"~_ca;_ttaaactcct~=aacittgacattaaaaccgtc~'a~Ua~aa~~c,T
~=tl~atcta act~aac~atctcaac~a~attcgcata~aa~ettat~'aga~ttccaaaatctacaa~~ayc~aaccaa~'tctttcc at~
acaagaaeata~tctcaa~=a~attttaa«~~tt~'a't~atcaa_t~tt~~ct~ttcaacletc~cctaa"~'ctt«t ccaay' aagmc;tagtcta~ataetcto;=tcctttctctguact~=ca~tct~=accuat~~_t~'clatcactcta~cl~~;v :taga;l lggagacttcacagtcaat~~ctaaco~ctcaasaaatacat~ata~atca~ttlattcca~_aa_~agmct~'«cc:l t t~~aggagcctataaacgcttaat~aotat~a~~agtcaasttagagacctaaaacaapctcactt~~~'~=a~~~agl clca IoCCtatCIllgtaC:IIaICIIIaaItllCCII~II~tIIIIYaI~CaICII~IIa~t~IfIICa~~a~ataaolal~
aa ~agctaagvvaaalagaltclggClll!_:lg~_gaacaaagalacactcgaccac°gaottatcaa~=~_a rarlc~acattgt tct~~tttccccacccca~gaaatcactc~accacacccaacagg~accgaat~ac~atet~atc~_a~tat~~_e~_a otta t~=caatcaaaacttclccatgleagtaaatcactcgac:ci'c~'glgl:lgi'cCgcapcagaa~aa~=a4'a'_~=
tC~_a~=talcat ca~a~Cf~IQCt~gccgteaCYaa~~a~Ca~a~'~Il~aataccccca~gc~~aa~Ct~a~aCOCaaca~=~Cra~a«c ttc nat~~cct~'_<_a'_caatcaca_'_'camtatl~am'accaaat«~'CIICIICIICCaC:Ii';t!'_'(aa~CaC
CICacICCac catt~taatatarcalctcct~«tuatltl~lllClal_~'atclIlll~lcCl_Ya~l;lclcllllcCaaa«t~at:
l CaCa~l~'~_aCl~t~'I~allfaaolll~ouu~ra~oaClc a~~aa~l~l~l!'ItIC:lll~lalal:lalCll~'a~lCl;=Calt catclgag~cata~aaaaa:lccaaaaaa:ltl,'aaaaalllragaaa;ll~alllcaC;laaaal:l~a~I~llcal ~la~=tl~' CaIl~Callla~~'OIC~a~I~Ia~'a;_falllC~lllaY~~allCIl~CaIaIL'cata~woalaat~al~a,'ala ~CCt!'' (aaoCaIIll~~ltlaCC:l~alaar_CICai'l~'CCCIC~~ifi'll;l~lltlllYalVc'~ta~ICaaIrlaaal l_'aaolaa:l,i CIi'CaCCaI~CCIa~aIl';CICIaCIC'_aWac:l'_«a~~_alCfa:II:ICC;IIICCCI:Ite:I:IIIIYa;I
CIIY;IaICIL' atlta~aallalCatL'tLlt~'_c'all~0atlt~'aaCalal~~alaCCCIaaa;ll:lCll29;1llllCllaC;l CatlllaaC

cactcttgttaatccaa~_ta~~ttgactctccttattagagcagttaacccgaacccaaacctaaactctcittcaag ccc tatatcacttgt~;a~_tgttlgtgaygtctta«mcattgagW
tggtagaaagtgttagg«c'_taacyaga~_agata"
lgIClCaIgIagtlClagllCgCglltllc~'~aca~=~ata~~altgglg~gCECIlalafC:llgggtl!_goal~_ Igltla aaagaaaagggtgaattcatlgttgataa~_gaa:l~g~aaagaattclag~~ggaagtaa~
claaa;=aagtta~taaaaaaaa aatctagtaaa~~~=ttttg~_gaatgttaaa~;aaaagaatgaggttctigttagctaaagaagaagg'gttaaaagc ctt«~~
!«taaatattaaaaaca'_gaaccttagttgtt:laagaaalccaaatccgcttgatgtatcagagt~=«g>a~'aaag cttc tcctagagliaagagaaaa~_aaaceaat~=atat~'aaaaa~~agtttgaaa~aticatga~'t~~caaag~'gtaga gttaa~~tt cttgtattg_~actgga~tl~_ggattaccattagagcttcaltgttatactct~~g~,tagat~'a=a«tatctctgt atgc ataac«gggac«acctttagrattctactaaagctcaatcattctigagagatcccctg«acttaagcctattctgt aagsoaccalctctgtctcltgaccttcacc«agccaaatgatttcattgatgatgca«gc«gattcacgttccaga actaatgaat~ttaaa~;ggattggtagatttgaaagcatgtgtaggtc~agtataa~_a~~acggattga«galaaca ag~
catgactaacgtttttgagtaaaattctatcatatcgcatcttagaactaccaacttggacaltgattttatttgctct a cctgatgctttggttctgagtccccaccttcaaacctctcc«caactatgicttcttatttgcttgagggcaagcaaaa actaagt«ggg ggaattgatatatctataatttgcalgttttcagtatccaticatcatcgtttteagttta~tttc~ t atcattnnnnnnnnnnnnnnnnnnnngatg«tggtcactattgtactggag~~tggacaatgccatgac~=ctggttgt gg tlot~agggaagtcactagcaccaatgttgttaggttgctcttgcocatctacaatgtcagccaltgtgtctgttgctg t ccgttctttcagttggcgagcaatacggttgatgttatcgttgaattaggaagttclaattcccttgtgaccgtgtatg a catgtatcaacctgaclataacagacgaataaagtgtgttgctaagtacctgaaatacaaattctgaaaagacacaatg t tagtatctctlataacaaaaacgaacttgatcttaacaa«ltgaaatclcaaalatagcaaacaaacacccaattggca acggc gccatattgataataaatttttaatcaaltalcctaaaacacaattcatgtcattgtagtattttaggtgtcaat ccaaatggglgloatgcaaacagatgagatgtoatcagaagtcactaaglcaagccaagcaataacagttttggegttt c taacagtcctaagcgaacatgcagaaaacagaagcaataacagaattaetaaaatcactcgaccacaacaggctgatgc c agaagta~'gatgtggtc~~agtaacaggtcgagtaacagaca~gatatgaaaca~_ctatactcoactgcaca~=tct gtt~c ca~acactg~'~~tgtggtcgagtataclggacgagtaacagacaggaaacaagacaatcttactcgactgcacagtct g«
'_ccagacactgggtglggtcgagtaag«~~glcgagtgaltgtaaaaactaagaaacaggcaatgagaacgattaagc ga taacgataaaacagagaaataagatagacgagaaa«cclaaggat_~~ggtaalc~agtagagtggtatcctagcctat t cgaatggltaacaagcgcaalcaagctatcactagacaacaeg«caacaacagatcaattcactcccgtgatagaaatc ctcaagcaaagctaacccagactaattcccattaacggaatctaactatgcaggcaataagaacatactgalaaagtca a taaac=ctctaaacagccaatcccttgggatagagacgcatatctcaggaacta~tggatcaagcattccatcgaacac c ltctogglgcgggaatgcttgggatcgaattcc;agttgatcagaaagaattaaccagtaaacgcaactagtcctaaag gg attcaattaattctaaaac«aaccattctacagacaacgaattecacaactactltatcccalccctaagaattctaag tcactaclcagacaacatgctcaaagalalaaacocagataaacgataatattgcalaaglaaagagataaa~_a~~ta gag aaacaagatgacagatgaaatcaaatgcggaatctgaataaclt~gaaaaatctegaattacaggttgca~_aa~caaa aa cggcgcaacagaglaaaacagaaaataaaaclgaageaaaaattcaatgtc~actg~;atgtccc:ag~_aca~=aacc ctaa~=
acatctat«atacaaaacgaaataaaatgata~ta~'acgg~ccaataacctaatrgctcg~_cccatagagacataat gg ~CC~agatyvaCtlCgIgalCCgafi,~aCOICC:a~=~'C~CtICC:laaCaCIII~?1CIICICCICIICICCICCI
IgIgCIC
aoCItgICtg~llgagtgcgg:ltttgagt:l~'aclglgaa~acgl~alcgaglgtcCggtcgagtgalgctgatggt ggtg agt~algalaCICBaCCa~~~~gglCgagla:lcltgglaga~t~allg~'ICg:I~'l~~IlglC:l~~~'1~~~'lg l~~gCg:lBglgal aClCgaCCag ~o'_~ICi'a~lagI~ICaIC~a'TIC'_CCl'_ai'(:l''ai~Y/laClC~_:iCCI!?~t~=glC~a'_li~~':l t'.',~;~aa~:!
atggctacaaagtcactcgacct=gg~~tccactal~tcggaagaatggctecaaagtcac;tcgacct'_'o'gtcga gta tgtcttctggttct~ytc~=«tcttccaattt~_cttecaaacagctccaaatcacctgaaaacaactca ~aaat~~caat atgtatgcaaa~ctatcctaatcat~caaaotatgcaagaatgat~a~aaatgatataaaaca~=t~=at~'aat~ata ceaa actggactcaaaac~atgatgaat~gacact~aaaacatgcaaattata~acatatcaactctccraaacttagtcttt ~
cttgccctcaagcaaataagaa~acatagtt~aa~ga~ao~agtgaaggcggggactcagaaccaaagcatatgataga ~=
caaataaaattaatgtccaagttggtagttctaagalgc~atat~att~aattctactcaaaaacgttagccctgcctt ~
«atcaalcaalccgtctcttatactcgacctacacatgcttlcaaalctaccaatccctttaacattcattagtlctg~

aacgtgaatcaagcagl=calcalcaalgaaclcalll~~'Claag~lgaag~=ICa.l~agacaaagal~glccclla caca ata~~cltaaglaactlg~g~alaCClC:lag:1;11~2fIl~:IYClll:l~la~':latyCtaaagglaaglCl:l:
a:l~llal~Cala la~_a~_ataa__'aICCCilICIaICC:I~:!_'lala:lC:lal~a:l_~?IIL:I:l:ll1'v[aaICCC:I:I(:
l(:CI~,ICCC:I:ICIIaa(:ICi:I
CCCttI~CaCIlal2aalCIllC:l:laClC111ItC;II:IfOatl~llllc:IlllCICII:I:ICIIIa~YaYaa~
CIIIClC:1 aCoCICICaIaC:lICta~L
2~~~lIlLT2aIIICIII:IaC:I:ICla:7Y=IICCI~IIIIIIIIIaIClll:laa:ICC:IaBY~CI
lltaaCCCIICIICIIta~ClaaCae~':IaCClCaIICIIIICIIIlaCallCC
Caaa:ICCIIIaCIa~~allllllllllC
taacmttagc«acttcccctu~aa«W ttecc«tcmcal~aacaal~~aauc;ectcttttc:««aaacarat cccaacccat~'alataa~=cgcccacctagtectauc agtccyaagaay_c~_:Iactagaactacat'_a~~:ICactatctcl ~tc~ttar_'aacctaacact«ctaccaaecmaalceaaataa~auemac:laacaclcacaa~_t~
atata~e~c«~'a aa~aaa~tteas_'ttt~_~'~tale~~ltaacl~_ct~_taalaa_~~~~_a_'tca~ctam~~atcaac:aa~~a~t ~_~_« aaaat
WO 00/55325 PCT/US00/07392
~agtaa~aaaatccaagtattttagggtatccat~'a~«caaattc~at~ccaa~_acatgataattctaaatcagatt ca a~tlcaaatt~ata~g~aatgatatca~atcctaacagt~rt~r~rtc~ra~tana~caalcla~~cat~gt~_ca~tt ttactt caat«caatgacaacgcatcaaacaactaacaac~=ag~r~c am~_agcttatct~~rt_aactaaaat~_cttacaag~ct:, tctcttcattalcccctatgcatatgcaacaatcctaaac«aaacaetcta~ame,!atcctaaatgcaatgcaactac a t~aacaWCUUttttgt~raaatcaltttctaaaaatttW
atttttute,'tttttctat~CC'.tla''alg:l:il=C:lgaC
lc::la~_all:llal:l(:aatgcaacacacacttect'':1 L'CCCti CCC:IaaC
ttaaatcacac::l~'tCC:lt:l~'I ~'1~'ilCC:la:ll Il~'~:tat:1'':1$IQCICa~gaCaailaC:lCalCaCaaaaaC aaaal aaaaaCaa~=a'al ~
gtalatl:ICaal':;l~gag ( v:l =c'tgcatacctcagtggaagaaggagcgg:I~II~';IC ~_ccaata~rcttCC f ~l'~all~~CICCC:I~'"CC:ILC Yac~lra:ltct ccclgtl~=lgtctcagcttccccttggg~gtactc~accmcttcttcl~mt~zca~eca~caccgc~'~_tc~;agte aalt actgacatgg~=gaaglcgtgattgcatcattc~_ctatactc~=atcacatcttcattcgctccctgtag~=gtgtg~
tcea~
t«atttcctagcgtggagaagccagaacaatotcga~t~~tc«t~_att~ctctgt~_gtcgagtgtatlgtto«ccct ca aagccagaatccatttccctcagctcttcattcttatctcct~aaaactctaaca~~aatgtattaaaaataacaagga aa aataaaagatatttacgaaaatagacatg~=acucctcccaa~t~=a~~cttgtt«aagtctctagc«~=actcctcac a CICalaagaCtlaaggagggtgatagagaa~aatt~aaatc«ctcttcttccttt~Ca~atacaacatC;eacttagct tgctctc«gacatgtctgtaattgagcactcaatp~ccctttctugtaaocttc«aaatttcttgtgg~tagtcttc ttcttcacaccattctgaaacgaacttcttctttgetataacttcaalgccgagttgaggttgaatagcticcctttgc a tctgactcactgtatccctaacticttcaacagtgcaagclagc«ctggatge«tc«ct«gagatcagaattcctt ttaaettctagccaatcaactatagagagcgtacta~acteagtg~ttga~cttgaaggtcgtgttggagctaaactcg g ttcctttactctttcagtttcaaatctttcttctgtaaccactcctt''eaccccaaaggtctgtccctctiuggtlga tc ttctt~gagttttgttgatgtcaaactgtaacclgatatccttcccaagatt~alttcaatc«
Iccalctttgacgtct atcactgctccaactaaggccaagaaacgtcttccta~~ateagageatcctta~=~~ttcttcatccatctcaa~cac aac aaaatctgtag~cacttgtactccattaatctt~ata~~rcalatcttcta~_ta~~ccalaag~ttttcttaagg tcctat cagcaa~gatcaaggataggtcgtatggcttata«toaaaaatccca~cttcctt~ccacagaaa~tg~cattacgct~
actgaa~ctcctagatcacatagacagttgctgaaa~tca~u~~cctaata~a~~cat~r~la~aetacaa~ratcccg gatc ttccaacttttcttgaataatgttcttctagatgatt~cactaaactcgt~aggtaaegccactgttcctlgaacttct t tgatcctctctgtaatcatgtccttcacacgtttgtgtgag«cggtatta~CgCaagacaglctaclagaggcattgtg atttcaaggtccttcaattttttttccaagagagcttcatac«c«tatca~ugcttcttaaatctccctggaaat~~
taalg~a~~ctt~tag~~tgga~~aacaaa~ac«tatcttt~_,_t«
ta~ca~atecag~ctttttaact~aa~aaa~ta ctacttgagc«gggatcgagtactgtgatc~a~=tatgp~ctcctctaat~_ctattttagcttgaatttcattctgaa ca aaatcctccccttcttgaacttcactgtccctaglgatggag,'tagaggcal°lc°agtt~gca gctctc~atcatggca ~ataltgaloECglgagCagIggCglaCICCIIa~~attCl=~aII~CIC;ICCCIggaagClg_~CCI~~=gllgllg llaa caeatg~f~la~C~_aggatgccttccacatatctcalccta~t~CIC;J~t~cItCYaaCIIgCIgllaagllC~aC~
L'Cll tgtcggtc~aclttgttatttaatttacccaacttcttagcgatltccatt~ClCCI~t~"_CIIQICCII~aatgagc ag ctggagcatagtcattatgtctgaattctgaggc«caaclg~ct~ttgtl~'atgtggtgtgaaaccaggtgotggtta ct g«g«gat:ltcctccttgaaactgcatctttg~gac~aalrctt~=~=ct«g~ttofaa~~=aataaaoo~ttt~golt gg ttttgctg«gctgagggtagacttggtcatgaQggtlg~cgato«~gt~cttct~'tat~;aca~atttg~ollgggtc t gtaattgttgtaccccttgttgtagccaccttga«ct~aacatagct~taa~cacact~ratcattctccccctcttgt a c«ggaatggttcatcttgagagatgaggtgaaeatgl«clgclyactt~gag~attttgtctagittetcaugaga actttcatgtctctacogtgcttgtcatcagagtcagaacl~~t~=c~=~'at~ctlctgtcataatcitca«ataa«g cc atcc~=act~a~caaagttttcaactagctcccaccc«rttcaacatccttctttaae:laatmcccttt~=aagc~at _t caa~=ta~catccttatcttag~caagacgcctctata~'a~t~t~'clga~'aa~a~raa~cugcttaaamc~~t~_a t~ag~;~
cattttettt~ataacccttaaagc~«cccaagctmalagaaae«tcacttt~_tttct~a~_t~=aatcc~rgatatc tc attcc~~a~tctt~ca~ttctggagtagga~aa~aau«ycaa~'aaa~ctttttlgca~teatcccag~tcetgatt~
atccctggggta~c~tcttttcccacagat~~~clttelctcc;aa~'lgagaat~~aaacaa~c~aagcttgaaacca tcl tcactaactccattgattttagtaaggctaca~agcct«caaactc~=lcta~=at~atc~aotg~atcctccattggc aa vCCaluaaaCtt~IICCCCIgcaccatt~caat~a~aCI~CIIII~ralClCaaa~_LI~=il~llll~IBCIC~a~gl ggaa CaaI~CCalgacgcIgoIIglggltolga~~gaaatcaCl:I!'C;ICCaaI!!Il~lla~~lI~CICIIgC~'catct acaat~
tla~CCatI~l~lCl~tICIgICCEItCIttCagll~oC~a=Caat:lC'"=tC2al~llalC~ll~aala~~aa~'ll Cla aIICCCII~_t'LraCCYlglalgaCalglalCaaCCt~aCt:lla:IC:I";IC'':Iala;l:lYl~l~ll'~Claa ~~laCClQaaSlaC
a:laflCl''a:l:la~aCaCaaIgllaglalClCII:IIaaC:la:l:l;IC''=:I:ICItYatCll:l:lC:a:l( l(:l'?aaaICICa88lal:l _YCa:IaC:laaCaCCCilatl~~CaaCggCYCCaaalt~ala:lla:lulllll:lalC:lallillCCl:laaaC:I
C:I:lIlC:ll~lC:1 ll~la~litlllla~gl~'lCaatCCeaaIg=clefYat~Ca:l:lCa~':Ilea,'all=alla~:lalrlCalIa:I
gIC:Ia~CCae .C:l:lla:l(::1;IIIt~g~glllClaaCa~lCCIaa~C~:1:11;11~C:IYaa:l:lCa~=a:l~'C:I:II:Ia C:l'aall:lC'.l:l:l:lll(::1C
IC'=:1CC:ICiIaC:t~~Clgal°CC:I~:la~la~~al~'l~''lC ":I~l:la~':1,'_rll' Y:15.'l:l:ICa~:ICa~~al:Il~:l:l:ICa~Cl:ll :1(:IC~aCI';CitCa~lCI~II~CC.aY:ICaCI~'vYtYl=_lCL'aC'I;llaCli'vaC:~':l~la:lC:l~a Ca'r~aaaCaa''aC:IaIC
Il;ICIC_aCf~CaCa~ICI~II~C(::I~aCaCIY~'~l_lS';-lya','.l:l:l'rllL'YICY:Ii'l~:lll~laaal:lClaa''a:l'?lil g~'caat~~agaacgattaagc~~ataacgataaaaca~~il~'aaataagatagac gagaaattcctaa~;gat';~yglaatcgag taga;_'I~~'talcctagceta«cgaatg~ttaacaagcgcaatcaagmatccctagacaacaggucaacaacagatc a a«cactccc~_t~~ata~=aaatcctcaa~~caaagcta~~ccca_actaattcccattaac~=~aatctaactatgta ~gcaat aagaae:a~'actgataaa~~tcaataaacgetetaaacagccaatccctlge~~ata~'a~=ac~'catatctca~';
~'aactagtgg atcaa~'cattccatc~=aacaccttctg~=gt~=r~ggaalgc«~;ggatc~'aatlcca~tt~;alca~aaag:laa taagcagt aaacgvaarta~'tcctaailg ooattcaatcaattctaaaacttaaecattctaca~'actac~~aattccacaactactua tcccatccctaa~aattctaagtcactactca~acaacatgctcaaa~atataaac~'ca~ataaac~=:uaalatl~;
cata agtaaagagataaa;agtagagaaacaagatgacagat~~aaatcaaat~c~yaatct~aataacttggaaaaatctcg aa ttacagg«gcagaa~_caaaaacggcgcagtagagtaaaatagaaaataaaactgaagcaaaaattcaat~tcgact~~
a tgtcccaggacagaaccctaagacatctatttgtacaaaacgaaataaaatgata~tagacgggccaataacctaatcg c tcggcccatagagagataatgggccgagatgggcttcgtgatccgat~=ga~~tcca~_gccgcttccaaacactttgt ctt ctcctcttccctccctigigctcagc«ttctggttgagt~~cggatttgagtagact~t~aagatgtgatcgagt~tcc o gtcga~=tgalgctgat~gt~_gtgagtgatgatactcgaccag~~~glc~_agtaac«_~tagaglgatt~,gtc~a~
tott atcag gtggtgt~gcgaagtgatactcgaccagggggtcgagtagtgtcatc«agtggcctgagctgaggttactcgacc tggggglcgagtatgtgg gaagaatggctccaaagtcaclcgacctggggglcgaglatgtcttctggttctggctcgtt tcttccaalttgcttgcaaacagctccaaatcacctgaaaacaactcagaaatgcaatatgtatgcaaagclatcctaa t catgcaaagtatgcaagcatgatgagaaatgatataaaacagtgatgaatgalacgaaactggactcaaaacgatgatg a atggacactgaaaatatgcaaattatagacatatcaactcccccaaacttagtctttgtttgccctcaagcaaataaga a gacatagttgaaggagaggagtgaaggcggggactcagaaccaaag::atatgataga~caaataaaatcaatgtccaa gt tggtag«ctaagatgcgatatgattgaattctactgaaaaacgttagctatgccltgttatcaatcaatctgtctctta tactcgacclacacatgctttcaaatctaccaatctclttaacattcatta~ttctp~aacgtgaatcaa~cagtgcat c atcaatgaactcatttg~ctaaogtgaaggtcaagagacaaagatg~tcccttacaaaata~=~cttaagtaacagggo al ccctcaagaatgattgagctttagta~~aatgctaaa~;gtaagtcccaagttatgcatacata~ataa~atcccatct acr ca~a~tataacaatgaa~ttctaatggtaatcccaactcctgtcccaacttaactctaccctttgcactcatgaatctt t caaactctttttcatatcattcltttcltttclcttaactctaggagaagctttctcaacgcictgatacatctagcgg g lu~gatttctttaacaactaaggttcctgtttttttttatctttaaaaccaaag~ct«laaccc«cllc«tagcta acaa~aacctcattc«Itctttaacattcccaaaacctttacta~atttttttt«ctaacttctlla~~c«
actlccc ctaeaattcittccc«tcctcatcaacaateaattccctcttttcttttaaacacatcccaacccat~atataa~c~cc cacctagtcctatccagtccgaagaacgcgaaclagaactacatgagacactatctctgtcgttacgaacctaacactt t ctaccaaactcaatcgaaataagacctcacaaactctcacaagtgatata~ggcltgaaa~aaagttcuggtttgggta l ~ggttaactgctctaataagga~a~=tcagctacttggatcaacaagagtggttaaaatgagtaagaaaall:caagta ttt tagggtatccatgag«caaattcgatgccaagacatgataattctaaatca~=attcaagttcaaattgatag~~aatg a tatcagatcclaacagtgtggtcgagtagagcaatctaggcatg~tgcag««ac«caa«tcaatgacaacycatca aacaactaacaacgagggcactgagcttatctggtgaactaaaat~cttacaagoctatctcttcattatcccctatgc a tat~~caacaatcctaaacgaaacactctagactcgatcctaaatgcaatgcaactacatga~cactctattttt~tga aa tcaltttctaaaat«Itcat«ttttttttgittttctatgcc«agatgaatgcagactcaagattalatacaatgca acacacac«cct~agccctcccccaaacttaaatcacacagtccactgtgC~2lCC:aaatltggaagagaglaCl(:8 gga caaaacacatcacaaaaacaaaataaaaacaagagtll~gtatattacaati'~=Iggilgl~a~~lgcalaCCICa_I
g~taal' ;la'_~agCg~agllg'_ICgICaalagCIgCClglgal«llICl;Ca~gl:l:all:''aC:L'aaICIC:CC:IgII
gIgICECagI:(ll CCCllggggglaC'lC:~:1(:CICICIICIICIgCIgCiIgCCilglaCCgC~'IC';agl'_aattailgacatggg '=aa''IC~I'=
atl2_CaICa«Ct'ctatactC~alCaC:11CI1(:aIIC~CICCCI~
laC'_'_l~l_'~°_ICc'ant~atlICClai~C9tL'~aHaa CCagaaCaalglCgaglglClllgallgClCtglgglCg:lglglallgtl_uccctcaaagccagaatccatttccct ca~ctcltcattcttatctcctgaaaacictaatagaat~tattaaaaataacaaggaaaaataaaagatatttacgaa g atagacatgggacttcctcccaagtgagcttgttttaagtctctagcttgaclcctcacactcataagacttaag'~ag gg t~ata~agaaeaatt~aaatcttctcttcttcc«agca~atacaacatc~~
aCIlagClIgCiClCllgaCalglClgl aattgagcactcaat~=gCCCIItCCllgtaa~=cttctttaatttcttgtgggtagtcttcllCllCaCaCCaIIClg aBa cgaacttcttcttt'_gtataacttcaalgccgagttga~~gttgaatagcttccctttgcatctgactcactgtatcc cta ac«cttcaaca~~t~Caag(:lagCIlCl~gal'?gIfICIICIt~gaoalc:aYaallCClIll:lil~llc:lagCC
aalC;laC
laCa~aga~~CglaCIa~aC:lga~I~gtt~=a~!Cttgaa~'glcglvtl~'ga"c't:til;lC'tC~gItCCIIIa CICIIIeeglll C
aaalc:lllCIICt~taaCCaCtCCII~~acccCaaagglc;l~ICCCICI:tlgi'tlgillCIICII~gaPIIIlgl l~aW
tcaaact~_taacctgatatccttcccaa~attgatttcaatctuc:catctuga~
gtctatactgmccaactaag~c Caa~itaac 9IClIOCla~2al'_a~a~'_all;llla~~llClll alCl'alCtCaa!'C:1C''aill'all;laLCI~la~l'l~Cll'~la ctcca«aatcttgala'gcatatcttcla~=taggccataagg«tullaa~~ll:l'I:IIC;I~Cail~,?all:aag galilg', tC,'lal~C'Cllalatlli'aaaaatccca~lIICCtIgCCiIC21!_aaagl~~lallaC~l:(~:1(:I~ailCCt C;ClagalCaCa l;lYilL:lglt~'ct~'aaa~ICagII~cclaala~;l~Cal'~gla';ai'IaCaai'att;CC''~'alcllCCaa c fIIICII~aataa l~llc llel~~:«;=all~_cactaaactcgt~~agglain'gCl:aC'I;~IICC'II'=;I:1C'IfC
l~l~'atl:l:ICIClgtaatCat', tcc«c acac~rt«~tglgagmcggtattagc~~caa=acagtctactagaggca«gtgat«caag~_tcc«caaltl Il«ICCaagagagc«catac«c«catcag«gcttc«aaatciccct~_oaaat~'~taal,~ya~_~=ctt~ta~;~y_ t~_ ~a~gaacaaa~actttatcctt~~tttta~ca~at~!ca~tctttttaacl~~aa~aaa~~tactactt,!a~clt~_~
=~'atrsa ~tacm~atc~ra~'tat~r~_gctcctctaat~ctatttta~_cttgaatttcattctgaacaaaalcctccccttett ~aac ( r /' VV' VV n VV V V (1 VV O V ' VV . V 1 VV V V' V ' t ttcact_ICC;ca;yt',u' _a~_ta:a~~cat_tc~a_tt~=ca_ctcic~atc,u~ ~ca=atatt; at, _c_t_,l_c,y Iggc!!t:lctcctt:y~'attct~,~atl~clttccct~gaai'et~'gccl~=~ll~'llYllaac;l,'al~r~~l ~la~Tc~;lY~al~
CCIICCaC:II;IICICaICCta~l~'CICagloallCgaaCIIgClgllaagllCg:IC~'ClllYlC~=SIC=aCII
IgIIaII
faaltlaCCCa:ICttCtla~y=alllCCatIgCICCI~?t~gCttoICCII''aalgag(:ag(a~'g:l~'Calagt Cattat~_I
Cl~aattctga~rYC
gCa:ICIggCIgll~tt~atotggtgtgaaaccag~=tgglggllgCl~;llgltgaI:IICCICC(l~:l aactgc:llClll~''gaCga;IllCIIggCIfI~~llgtaaggaataaa~~~'gI«gggttgglllIgCIgllgClga gggla gacttggtC:ll_aggg«ggcgatattggtgctlctglalgacagaltigggttgggtclgtaallgltgtaCCCC«gl tgtagccacc:«~attctgaacala~ct~taagcacactgatcattctccccctcttatactt~gaatggttcatcttc a ~a~atgag~tgaavatgtttctgctgcacttggaggattttgtctagtttgtcattgagagc«tcatgtctctacggl~
cttgtcatcagagmagaactggtgcggatgcttctgtcataatcttcattataattgccalccgaclgagcaaagtttt caactagctcmaccc«cttcaacatccttcttlaagaaattccca«tgaagcggtgtcaagtagcatccttatctta ggcaagac~cctct~la~agt~tgcteagaaeagaagc«gcttaaalccgtgatgagggcattttgtttgataaccc«
aaagc~ttcccaa~cttcaca~aaactttcactttgtttctgagtgaatccggatatctcatictggagtcttgca~tt c t~~a~ttg~a~aagaattttgccaagaaagcttttttgcagtcatcacaggtcgtgatt~atccctgggglagcgtcll t tcccaca~atg~
ecttt~tctccaaglgagaatggaaacaa~cgdagcttgaaaccatcttcactaactccattgatttt a~taa~Qctaca~a~cclttcaaactc:gtctagatgatcgagtggatcctccatt~gcaa~ccat~aaactt~ltccc ct ecaccattgcaat~agactgcttttgatctcaaagttgtt~ttttgtactgga~gt~oaacaatgccatgacgctgg«g tuvllvtvavvv;la~rlcactagcaccaatgIIgIldggllgclcllgcgcdlclacdalgtcagcca«gtgtctgtt gc I~ICIgIICIII(:;t~ll~'~~CgagCa:lt:ICggICg:llgllalCgllgaalaggaagllClBaIICCCIIg(L' aCCglolaC
~ac: atglalCaaCC I ~:lCf:llaaCag;iCgaataaagl glgllgCtadglaCCIgaaalaCaaa«c:l~~aaaa~acacaat ~ttaotatctcttataacaaaaac~aactteatcttaacaattcteaaatctcaaatataocaaacaaacacccaattt _c caac~_yc~ccaaatt~=ataataaatttttaatcaattatcctaaaacacaattcat~tcattgtagtattttagstg lca~
atccaaatgg;_t~=lgat~caaacagatgaeatgtgatcagaagtcactaagtcaagccaagcaataaca~rutt;~~
gggu tctaacagtcctaa'=cgaacatgcagaaaacagaagcaataacagaattactaaaatcactcgaccacaacaggctga tg ccagaagta~~at~=t~gtcgagtaacaggtcgagtaacagacaggatatgaaacdgctatactcgact~caca~tctg tt eccagacactgg=tgtgglcgaglatactggacgagtaacagacaggaaacaagacaatcttactcgactgaacaotct g tt~cca~'acact~~_t~t~~'lcoa~taagllgalcgagtgattgtaaaaactae~aaaca~~caataa~aay~atta agc ~ataac~=ataaaaca~_a~aaataagata~acgagaaattcctaaegat~ggoldalcsaeta~'a~I~~tatclcag ccta ttcgaat~~ttaacaagr~caatcaa~ctatccctagacaacaggttcaacaaca~atcaattcactccg~tgatagaa a tcctcaa~caaa~_ctagcccagactaattccCattaacggaatctaactatgcaggcaataagaacagactgataaag tc aataaacgctctaaacg_'ccaatcccugggata_dgacgcalalctcaggaactagtggatcaagcattccatcgaac a CCIIC[vY,_t!'C='W~:taleCtt~~u~'atCgaatlCCagtIgalCagaaaoaadlaagCagIaaaC=CaaCla~t cctaaa'_ ggattraatcaallctaaa:lc«aaccattctacagactacgaaltccataactactcattcccatcc:ctaagaattc ta aotcactactcaeacaacatmteaaa~acataaac~ca~ataaac~ataatattacataa~taaa~aeataaa~a~ta~

agaaacaagat~acagatgaaalcaaat~_rggaatctyaataacttsgaaaaatctcgaattacag~~ttgcagaagc aaa aac_'~caca~cat_a~taaaacaaaaaataaaacteaa~caaaaactcaat~tc~acta~al~tccc:aa~acaea:l cc:cla a=acatctat«atacaaaac~aaataaaatgataglacacg~gccaataacctaatcgctcggcccatagagagataal ovoCCaaoaln~roCIIC~I~aICC:~_alg~a~y~ICCag9CCgClICCaaaCaCllt~lCllc:lCCICIICICIIC
CII~=I
clCagCltgtClg~llgBglgCagatttgagta~aCl~tgdagaCglgalCgagloICC~'~tC~_a~loat~Cl~=al '_~tg 2lgagl~3lgalaClCgaCCag~~~aulC2a!'t:laCttggtil~agl~allgglCgaglaltolla~gl~~I~I~~C
~a3al ~ataC(C
~:ICCa_~g'=°tC~':I'=laglotcatcaagtg~CClgagClgalQIfaCIC~aCCC~~_I~LICQa~t' ~'al~gaa agaalggClc Ca:laglcaclCgaC(l'ru~gglCgaglalglggoaagaat~gClCCaaa~;ICaCIC'=aCC~'_'=r_~~L'IC~;
Yl:«gI(:llc'IggllCl~~CIC~=IllgllCCa:ItItgCIlgCaaaCdgCICCaaaICdCClgaaaaCaaCICa~a aal~=C
aatatel;tl~Ca:lai'CfalCClaalCalecaaaf?tal~caa>_'autoaloaQaaal_~atataaaaca2t~al;
a:ll(_alaC
oaaaCtvLTaCt~;laaaCvateat~a.I(oCaCaCl~aaaacat2caaattata_acatalcaccaacccla2l~'at a~lCl a~totl~ovCy:t~lCvyaota~r~ICItatamattaaacactcl_i'CL_ctt«aa~aaaacaat~tatac<_Ia~Ia lala '?(:IIaIaC:IIIt?C::la;lQ.l'?llaC:ll:aaaaatatatctttatelCl_~3ICCll~ttaaaaatacataa taacaaetaattt taca~aa~caac~caaacatccaa«t«I_~caattccta~tettc~tcttrcc~ctat~ctcatcccactat~attacc tataa~acaa~la~aamat~a~taatcldecatcccitaet_'amtl_~~lctatecac:aaataaatt~ctat_eett taa tcaacttaataa=taaam;l_'malc;t~_a_'atcacaaaacacctaataaac'_lattamatatat,ltataataat acaa ~aatatyca_tcaaama~aca~at~aat_Taatcttg~~ctca~catactycaaccttaccaaaaag~'t~rcamacaa c atca~caaaacctccc~catacl_t~tatacatt_ttciccacac~tat~clacca~acacam_tacat~c;tcaccca caue ZO

acacgtacgtacacaactalgaacaccc~agctatagccacaaag~cttataaccg~aacatcactttattgcatcgtc a tycattacccaacatacactct~_agtcttgtgceatcatgtaagt~accac~_caa~atcaagac;ttactgacacaa tcac aaaaaaaaaatgaaggc;I«ca~cttaclctaa~catgatgetatcgg~_Iaaaccttyaataccacttagacacaata acc tcatgcagtactaaatata:fat:tgtttaccacaccccaaacaagtataacacaatcaaglaccaaatcacg««ttat at atatatatatatacalaatata~caa~=gac«a«cacaacctcatcgcatatctcatataagtataacaatclaacaag ~caagtaa~'ccctc gaatcaaatttcaa«catca:lgtaaacttagttaagtccaaaticavtt:I«ccta~~icatgcacc agtctattcca«ccttaatgatattaactatcaagttcttcaaatacccatca«ttaa~a«eaagttctaactta~=a ctgaattctgatta~~ttaagtctaagtgagtaagcatgcatatgattgataagt=t~~tatlttecat~_t«tgagca tcc atttgtcatcattttagcatcatatcatcactgttttataccatttcacatcatctgtcatcactttgcatgt«aggat agttttgcatgcatgttgcatat«gtgtlgttttcaggtgattttgagctgtcgat,=a~mtaactgaaagaagcggac c tgatcatgacaaaccactcgacctagaggtcaagtaggagcttcaagatctcaagagaccactcgacccccaggtcgag t agaagacaccaccactctatcacatcactc gacccccaagtc gagtttc«catctccatctcttgaccaccactc gate acctcactcgacccccaggtcgagtgac«catctccatcttctgaccatcactrgatcacatcacttgacccccaggtc gagtgacticatcccattgccagacatctacacgacctcgtcactctacctggaactcgagtatcaccatcacaccact c gactgcatactcgatgaaaagc«cagagccttrttcattctgcactcaaccagacactcgagcacaaggaagaaaaaaa gactccagctattcactcgacctcccactcgaccacatgggtcgagtacag«c«aatccgtcccaatacttcgtcgtt ttgagtattagggtttcagaatat«ggctataagtagcatgtac«cacatt«cgcagacaagaag«ttttatcgag ttitatttccgcagaccttglgttctagacttttgtaatccagatttctctttatctattcagtattcagtattcagc«

ttgttcttgatttcgttacactgttgttatcatgtcttcaacctgttcaacg«tatgctttctguatgat~~tctgagt agtgaataggtttctgagoatgggttagagtagtgtaggattctcagtatgctaggtgattgagtattga«gatagatc ccttatagattagttgttittaatgcctattgctttcigatcaacttgaatttgagcccagacat«ccgcgcccaaaag gtgttcgatgaaatgtct~aaccactaattccagagattcgtgaccctgtaccaaggtattggttgcaggga~lcgttt tg gctttaacttgttgattcgtaalgcct~tta~ottagctctcgtcaahgtgattgagtctgggactaggttaacttgag ~~!IIICIgtlgCggtggCaClIagalftggttaalgaaCllgttgtCtaggg«aa«IaElgagCalglCaaIClCllg taaactgagaaaacgaatctactcaattaccccatcctcgggaattctttatctgattgaattc«tgtttactctactg ttgtttactgcatc«gttatctgtcgc«aatttctgaagttctttcttgttaclcgacc«atacmgaccacctagt tgtctggcaacagactgtgaagtcgaglatctgtg«ICa«ttctgtctgttactcgacctgtcattcgacctcaccca atgctctgcaaagagcacg«cacttaggactgctagaacacaccaaaacctgttattgcttg gcttgacttagtgactt ctgatcacatctggattgttagcatcacacccatttggattgacaacctaaaatactacaacgatatgattggtgtttt a ogataaatgatctgaaaaacctattatcaatttggcgtcgttgccaattgggtgtttgtttgttacatttgagatttca g aCtIgCttCgatCaagtICIttIlCa3tttlCafItIagIlaClgaCtglol~IlllICIIgII~ICIICICvaCCoav a tacatctctactatattcaaactcgatcaacaggcaaccagaacctcc«ttcaatgataacatcgaccgcctagttcgt gaacttagag;faaggagaa:lcatagtcaacc«ataccgcagcaacgactcgaaatggctgacgaacaaattcaacag aa tggccctacaaacattg~tgctega~atgcaccacgggaccacc~tcaaa~gaaa~rgaattgcacetcctgctatcca ga acaacaacttcgagattaagagt~glcacatctcgatgattca~g~saataaattctatggtctgccaatggaggatcc a ctcgaccaccttgat~aattcgataggcictacaacctaacaaa~atcaatggtgtcagtgaa~acggaticaagctcc g tttgtttccagtctccttg~gcgacaaa~_cacacatctgg~=agaagaatctgccccatgactcaatcaccacctggg atg att~caagaaa~cttttttatcaaagtttttctccaatgccagaacl~caa~~actc a~aaat«a~a«tcttg tttctca cagaagactggt~aaagcttctgt~a~gcatg~taecgtucaa~ggttacaccaaccaat~mcrtcatcac~=gcttta c taaagcctctctgctcagcactatttacaaaggagtctaccacgcatcagaal~cltm~=yataccgcca~caatg~ea a tttccagaacaaggatgttgaagaaggctgggaattggttgagaacctcgctcaatca~~atggcaattac;aacgagg act gtgataggaccgtcagaggaaca~ctgaetctgatgacaaacacag~aa~~_a~=atcaaa~cgtt~aatgacaagctg gac 
aggattcttcttagccagcagaagcatgtgcac«ccttgttgatsacgagcagg«caagtccaagatggggaggataa ccagttggaagaagtcagctacatcaacaacaatcagggtggttacaaacgatacaacaacttcaaaaccaacaacccc a acctctcctaccgcagcaccaacg«gctaatcct~=aagatcaagtgtatcctccacaacaacaacaaggtcagaacaa a cctttt~ttccctacaatcaa~=~~t«c~ttcccaa~caacaatttcag~~=gaactaccagcaaeaaccacctgggtt tgc acctcagcaaaaccaag~te'=tact~ctcctgatgctgaaatgaaacagatggtacaacaactgctacatggacaagc at ctagctccatggaaatt~ctaa~aa~ttatetgaatt~cac:cacaagct~~cct~maactacaat«accttaat~;ca aaa ol~gag~Cgtl~aattccaaa~tca~atac«agaaggaCaatCC
gC:ilC«CCtrtYCtCCC;laa~ttaci~a~tacttcc aaggaagtcaa«cagaatcccaaggagtatsctaatgttcatgctatcaacata:Iggagt~~yaagtgagttatccac tr ~agagagacccaa~tcagtcact«a~_gacagtgcataccaagatg=gga~,~atttca~'tctcaataaagatcaagc tgae aaacaactc~agca~ccactcgaccaytcactcgagca~ceaetcgacca~'ttactcga~ma,_ccactc~acca~IC
act egagcagccactcgacca;_tcactcgagcagccactcgac cagteactc ~:IgC;I~CCaC'tt,'al Ca'=ttaCtC~a~CagC
CaCIloaCa~laCIC~a~a;=CCaCICiTaCIagtCaClC~aYC;IYeC;tCICY:ICCIYIC:ICIC~':IgCI~~CC
:7ClC~aCCaC
ttcactcgaccgaccttcccagc;latatcaccaam~ctcyaa:watgtt«ctglcaaaa:maaa~_aaaaa~tcttca t ICCICaICCI«ICaagCCC;CCaCIIIC:IIIICCtaglCgaCala;lEaEla~_a'_ll:lY;A:lYaa;l:li'tac a~_agclal~~llEg ccgagaacatcaa~=~_aagtcgaattaaggatacctgtt~=ct~atyctctaac~=c~tatccc~,~attcccaaaag ttcctc aaagacttgamatg~agagaatteaaga~_~«caaaaaacaacag«ctaa~'mat~;a~t~caal~ccatcataca~_~
a gaacgatgtcccagaaaagccaggag:tccctggttcattcactctaccat~ctecma~=o«cattgactttcaataca t ~cctttgt~,accta~,~a~,catcagtaaatctcatgccactcgcagta~ctaa~'a~'~=tt~,~ltttcaacaa~t ataaatac t,caatatctcctt~lalcttg~ctgatag!'tctgtta~~actacctcat~_tgttatta~_aa~'a«t~'cctaaca a'~att~!e aaat~_t~=~=a~~t~c_,taca~acttcgtagtgctaaacat~~~_at~taa~aa~caaaa~'alcrml~'atatta~
~aa~~gcclt ttttg~ctacagct«~,agcaalcatagatgttaagcaa~_gaaa~=att~atcttaam~_~_~=gaa"aattt~~aaa tgaag tttgatatcaatgat~=clatgaagaagccgacaatagaagaacaaac ctlcttaeltaa~~~_aa~=t~~aacaottagct~,~, tgaactgctagt~gaacttgaagaagcaaataattcaaaaactgttttgaccaa~_a~'lg~_taaa~~t~~~~tatct cccta ~tgaaactttgagttctaagaagtcauagactcacacaag_cagcagtl~~gctca~a~=yc«acaa~pgcttgatggg ~
tctggcacaaaggttatgaaagctaatgaagtaagttcaacacatectc~accaactaauccactaacaactcgaagaa gcctccaacttgtcctgagagatcatgcicgaccaaacaactcaaggcaagcacua~at~=tcaaat~attggttagaa c tcaaggaga~atctaaat~gcaagaaaaagccgtacaagagctaacacatattgtgaaa~a~~ctaaa~~atcaaatca a~~
gagctcaatggaaaagcaaaccaagtactactcaacatcaaggacgcccct~_ac~at~~c~ccaclatacttgttagc aa agaaggatgtgaattcacgtcggaatggtctagaggagaagactatcttgataaaaaagaagcttactgtgaggacatg a ccactgagtacccaattgcagacatgtcaagaaaccctaatgagtatgatgatcacgctaccaaagaaggagaggagac c tcatttcctctctacaacccaccctaagcctaatgagtatgagaagtcaagc«agagacltaaaacaagctcacttgg_ ~
agcaaaacccttgtctatctttgtacatatctttatttttccttg«gtttttgatgcattt~'~ttagtgttttcagga g ataagtataaagagctgagtgaagtgga«ctggctctgagagtacaacactatactcgaccacagagcaatgaaggata ctcgaccatgttttagcttatctacggcaggagatcactcgactgt~_gtgctggccgcagca~aacaatggaggtcga gt atactcatggcggtgctggccgccaga~a~_ctgatgaggtc~agtacccacct~_cl~~,a~ct~atacagaacaa~g tggt tcgtctatggccttggagcaatcacaggca~_ccattgacgagcaactac ~_ucaucuc~att~aggtacgcgcctcac ttcaccatt~tattataccatct~ttstoattt~tttcticattttttp«
tct~t~attt~_attt~tcct~a~tactct cttccaagtttattcacaca~tggactgtgtgatttaagttta~rgg~,agggcte:l~g;l:l~I~l~lgltgCallg tgtalil Itlttgagtctgcattcatctaaggcatagaaaaacaaa:waaaattgaaaa:mtea~;l;l;t;ltgatltcacaaaa aaaa oagtgttcatgtagttgcatcacalttaggalcgagtctagagtg«tcat«ac'ac«g«gcatat~_cataggggatg atgatgagatagccttgtaagcattttga«caccg ~ataaactca~~tgccclc~«~'rtag«~«t~tt~lc gtagtca atgaatttgaaataaaactgaaccacgccta~_att~=ctctactc~=accacacl~_tcat~atrt~ataccattccc tatca atttgaacctgaatctaalctttaattatcatgtctgcatcaaatttgaactcatogataccctaaaatacttggat«t cttattcattttgatcactcttgttaatccaagtagctgactctcctta«agagcag«aacccaaacctagactttct ttcaa~ccctatatcactt~t~a~t~tttUt«a~~tcttattccea«aa~ctt_r~ta~.aa~t~ttae~ttc~taac~
a caga~atagtgtctcatgtagitctagttcgcatttttcggactagatag~acta~'~_t~ggc~c«atactltgggct g~
gatgtgt«aaaagaaaaaaagggg«gaticattgat~~a~=aaaag~_taaa~,~actcta~'~=t~=aa~taa~ctaa agaagc agaaaaagtctagtaaag~ttttgg~atit~taaagaaaaeaaa~a~=ttc«~tta~=clatl~_~aaatg~~caaaag ccc ttggttttaaaatgttaaaaaca~gaacctta~=tt~_ttaaagaaatccaaatcc "ma~at<_'taacuaa~tgttgagaaa ~cttctcctagagttaagagaaaagaaaagaateattayaaaaa~=ggcttaaaagattcatga;ltacaaa;~ggtag agtt aagttcttatattgg~attggasatggeattaccattagagcttcatttaacalactctgg~lagat~g~
a(cttatctc t~tat~cala~ctt~~~acttaccuta~rca«c taetcaamtlaa«att««_'a_';matc uect_«am~~aa~cc tattctataagggaccatctttgtctcttgacc«ttaccttagccaaatga~_l«tttt~';u~al~_catt~cuga«c acgttccataactaatgaatgttaaagggaglgglagattl"aaaactt~_t~=ta~_~'lc'~=a~'calal ~_a~'lc~~attgatt r~attgatgaggcatggctaacgtttttaagta~;aa«cgatc«gcagctta"aactt«aacu~~aca«tatttcat ciggtctatctggtgttttggctctaagtccrc~;gtttcaaacclcacctcca«ctt~«cltgatt;_«tycttgagg g caagcaacgactaagt«ggggga~ttgataa~tgt~tattttgcatett«~a~,catccat«~tcatca«ttagcat catatcatcactgttttataccatttcacatcatctgtcatcactttocatotttag~~ata,,tttt~cat~'catgt t~cu tatttgtgttgttttcaggtgatttggaactgtcgacgagctaactggaagaagcg~acctgatcatg:lc:aaaccac tc~_ acccagaggtcga«taggagcttcaagatatcaa~agaccaclcgacccccag~lc~aglagaagacaccaccactcta t cacatcactc~acccccaa~tcoaet«cttcatctccatctctt~accaccactceatcac cteacaceaccccca~~t c~a'_teacttcatctccatcttct~accatcactc~atcacatcamc~accmca~_~me;l_t~:tctmatc:ccatt ~t ca~atatctactc~acctc~tcactctacctteaactc_~a~tateaccatcacaue:tcm_~ac '_catactc~ateaeaa ~CllCa~aCTCCIICIICaIICC'scactcaacca~ill;tll(:~:1~C:ICiIa~~;l;1~;1:1:1aY:l:l~';l etCCa~C:latlCal(lQa CCICCCaCllgaccatataggtcgag«ICaaalClliIillCIgIICCiIalaCtlL'!(C~Illl~a''falfag~~l llCaaa ' alalllggctataa~l3~=Calg(acttcac:tilllC gCa~aC:lag;l:l~ll ~lI ll;Ilcl;I ~~l I I
I;II I tcyCa ~;lcell ol~IlCla°aCtlllglaalCCggalllClCtItalClallCaYIallCa~lallC;!_~Cllll:lllc ll~';IlllC_111:1 Clalt~IlClIlCt~llalC~lCCl2CI9lIaCaCl~tl2ttalC;Il9tCtl(:a;IC:CI'_ltCa:W
1'III;II'~eIIlCt21 Ial9al~ICl~a2lil~l_~aala!'_~ttlCl_';t2!':Il!_'v°-tla,'a~_la_~'121;1.'Y:IIICIc::IL'l;ll'W l:l~_i'l~'alI_~ar_ta Il~att~ata~alCCCltc'lg!_;111:tgltYllCIlFlal,_CCtatl~_Ctlllii'alC'a;ICIY~;t;lltt ~'a_'CCCa~alall tccgc~cccaaat~'~=t~«c~'af~~aaat~'tct~'aaccactaa«cca~=agattc~'t~acc'cl~'tacc;la~
'~'t;llt~'~;lt~, Cagg gagCtlllli'YClllaaCll~It~allC'_l:lal"CII~II:t~~'lla~CICIC~'tcaat~'i'l'':1(l;_:I
YIC'tL'uY;7C
la~~«aactt~=a~,~gW
ct'_fl'_C~'~'l~_~_C:7Cll:7~';illlf~'=llalll~'a:t(:IELI(~lCl:l~'Y,~ltOalll:7ll'_ a'=C
at~tcaatcactt_laaact~a!'aaaat'~aaactactcaattaccccatcctc_'~_~aaumil:ltct_';Itt_' :I:Itml l~'utactctacl~=u~ulacl!_calcll_'u:Itch=lcnttaalucl'_cacltct«clt~llactcyauc:m:Il :1 ctcgaccacctagttg'l:It;~caaca~'act~t_'cad=tcgagtatctgtgtttcattttetglcl,'llacty_;
leel;_tca ttcgagclcacccagt__ctctgcalaga~'ctgl~tt~'~Ttc~'a~tgttttaclgtttct~t«~,:n«itt~_ttt m~=r:l I~IICaCIIa~~aCI~Cla~;laC'aCaCCaaa:lCC1'!llall~~CII~~CII~aCll2l~l~aCIICIL'alC~aC
;IICII'=all ~ttagcatcacacccaltl~~:fllgaraacctaaaatactacaucgatatgatt~'~t~'lllt;l~'~'afaallga ctaaaaa cctattatcaat~atctaaa~acta~cteta~~aaaaaaigcaaacaaacacacaattcgaccacaacar;lla~'all cggf aallCYgaa~~~lllClglglelfgggcaaaatlccfcgalcacgglC;gaCgalCCCII:ICICaI
~''ICa:l~lt''ICI(:~C
CCaCaagCgalgalIClCCagl:lllggaCagagIIICCICK:ICCBC~YYCaa~alllatl!':l;lllli',~YC:a aYIIIICIC
ggllglggaCgagllalltgll~ggC:IgCCIIaCaCICICtCICIIgCCII(:aaa:la:l~';I:lailll~~t~"~
a~;l:l~:7~Ca1 catilalaglccacccaaggtcagClalgCagCaIIgICCCClgCCBCaIIICCgg'aCaaCl~( l~C~~mcccat~_cgac ag~t~tctcgaacaat~,_c~lg,~ceacagtt«gcgacg~et~tcactccacc_~ccclatcagctty_,_clacaa, =actca gccaaclttggclctaaggttg_teacllgagcigfaaggcactgcga~t~ae~t~tccactcat~tc~=~IILC~Coo ~C
eac~~~ocaaacttcactacaacatgcccaactltlcccalcttcgcccattcg,t~cccattcct~cctccattcct~
cc catctttgccct«gatggl«ggc:accttttg~gcacatacggtcaacta~crat~cauc4'~_~caaea««cfca«
cgglcc«Iggccglolccgal~,cccttat~cgg~tcacta~a~gtaatctctt~t«a~tea~~t~~~_«aa~attaag t agatccctttltaallcgttcgccctaalgctlatftcactcagttaagtttggaattc=acrcaa~=,_tatmca~t~
tc CaaaagacettCa:lal~~:Il~C:I:I~aICI:ICS'aattctagagattagtagaclcta~CCfalai'tIl"a~;=
laY:ll~:l(::l aaaacaaa~ralCQacttC~fl:llgctgaaccgac=alga:tggagcatultCl'_i':IL'CI;II;IC.'C:a;lL
:1:1;1t~'IICCIQaaC
lCoIIICaCIII!_CIIIIaItlIaalll~aacllall:7lalgCICCggcccatttataatttat,'~:1;1:7CIlI
t!_'l:Iall IICCCaaICIC_'aaafattcattlt~~eaQatactculaaali;'ca~_LECI°ail'_~aaC_'Cal_'a :7lllilc:eLC!'aatr actcc~clccag~ato~aaatc~ccaacclcaag~gafaaggtttctttataaatl~=~=cacl;_laattllctuatt cct CaCCallClalIIIIIICCIIg~Caoalalggaa«CaaCCllCgglgCICggaalCCatgW :li':Illlc'IC
Ya~Y~IC~' aaatgalggacggtgaaattctgallgtgccgaglaggCggccllclccalCll~'cvvcllc:Ilc:l~'~'lcccaa allcga gCICCgIgIgCICIgaICa:IggaICfllCggalagagaCCaagaaalCllgalg gC~lICalaf;l~'~'l~Cll~'C~_attca acatcagaagcgtac~Clg~agaYpaf_'oaaac;lCoaoaaaClQallCaal~Bal~CaaCaIB''~~=f~aCC'_l~
aflCaag nL'aactttataccclcaactcccaCllfa~flaCCCecalaacCsCalco2aLaC,_~lCaaaL_'a~c'_'~alCaa aatttt tt~cccgaataeaaa~ocaatc~=acc~'c~t~tatccaaaag~lgccaaaacattaaa~~~caa~aac~'g~caa~aa tggg ClC
_~a8tgg~Caaa~~ato~aaaaa~ttonoCofoCC~CggCgaaglllaIaCCCCgIC;,'i'CI~C~,VaaCC~'a(:a lg8gla ~g;ICacatcacacgcagtgcctfgcggClC:IagICaCCaaCCllggBgCCaaaYlCa:ICI~_H~ICII'_':I~CC
~L':la~'lIg 8lgggoCvofvvau[gaCaCCCgICrYCCaCl:lglCaCCaI:gCCgICgICC'EaL'aCBCII~IOY~I;t''lS'~~
'CC CagCagC
IYICIeeaCalll9oCa2CeY~ct'2I2C1_C;ll~e~CeaCCIIL'~°C~_ealfalaaaea'_8_'I:IC
I~IC Oi'C~Ca~IIIC
alltlloaaCCcaa~a~aoa~a~:7Cf~ aaa~_~ClataCYattataaactceaaCCaaactaa2aaaaac:IC_'W
:l'_aatC
'_attaaatctt~cccgtggtcga~~aaacatt~~tctgagactagaaaaaattg'cal=t~_~ttaa~aaaacctcam gtga staa~~~atc~=tc~acy't~calga~aagttgcac«aoacccaaaaaactttcctaacta~t~aatcc~t,>a~_«
~'t~'ct l~aallll~l~lffolll'_Cal~llllCta~a?ClaolCIlIaooflalalECali'Cllal:lC:7Cll;I~;ll'l la:ll:l:l:ll l~aBaflCg~Cll~a~lfaYalalLgaatctcaaagllaala«ggCatllg:ICgaaCllualaalac;t:ll aelaa''"_a at~~aal~=:lll~_l~aat~aClaC~c'I:I:lal_~a:ltat~'t':7Cll:l:lc:la8~'111:1C112:llla:
lllY:1_';Il;ll'_atle<_'a ~~~CllaCll~'Cgttgftagallgllalacll:Il:ll~:lValal~'c;lal~agcllglg:laleaafc:ClI~c:
I;llallal~_1 ' alalalallgalligglgCllgallilel;ll:ICllglllggggCalgglaaaatacttalatata~taetacat"ai '_lt allololCfaaolvofa(Ila'?oolIlaCICoaIaCCalCalgCIlagaglaagClgal ~CCfICCIIIaI~'I~_lCl~la a~fCtt~aICCIgaYIggICICII:SCaI~':7l~~CaCaapaClOgga~(:~Ial~IC'_~~taaayal~:lC'=at, _c:alfa:la ~t~al~tlcc~~ttalaamclltet~~ctata~_rtcooetoucaca~ft~l~lac'_tac'_t'_t'_~l'_t~~'_' _t_'t~cau ~tac~t~t~t(to~ta~calac~t'_t~ea~aac:acetatacac~tat'_c~~~a~t«~t~lt~at~ttet~_'tcca cc«
tttt~taaa,~ll~~caVlat_'«~"_,_cccaa~~ttcattcal~letcta~ttt~~cl'_calattrt'__'la«
a«ataa_' rata(salaatac~tttattac~t'm«t~t'_atettl'_at~~alaa_Tlttacctattaa~«!'attaaacccta_' n'_at~
flat(t~f~cata~accca~ctcaca~a~taatcua~_attaclcal~attclactt~fc«aca~~Jtaacc ata'_1,'~'~
al2accata'1C~2_aaCacea:lcaclaa_~_aaat'CCC2aaata~~at'_ttltaC_ott'_cttclW
aaa;Itl;llYllal laf~laaat«taacaa~_~alct~afalaaa~'alalalifltg~~laaclclft~claa~tataa;~matataelat~
'c':1 laaa~I~f«tcllaaHo~CaoCW'aaf_'lllaal9alataaCaccla,?ICl'~l'_f'_'WtaW
';la_Ta;lla(c;u'~aY
nolfoaCaaCCCfa;7Cl~'_Clc'll~Cta_'_la_'_'_a_'_';l~n~lC'_allalaaltlCL'Ifaa'_'IaW
aaalW ;IltlWla;lCC
lClaClaa~ll'_I'_'_IC''?alCalc'll(':ICI~Iai'CCtaaltC~~tc~TllCal'_~CII~ICC_'_>W
:ll:l_1'L,_'C''~:Illc:_Y_Y_Y
'_~C~IIaCaCCIaCCCC'_alCall'_'_2CC'.a'_llli'°_a;l«lCt'Za:IIIIYCaC'_'l:laa iCi'_';IW li:_'C'1'__'I~a:lr'~:1 1;

~tccaa~_t~!ttctgcac~_«aa~"I;f~'a~utc ct~=attcgaa~~t~cacacafgataaaacttc~_tclcmu~_c«atfa ctt~aalaaatgtc~~ctttgl«cct~'lct«acat_«cacatagccgaacttfac«lc:ctf~ttcgttcaaaca~'I
c aaactctttttct«mat~~acaaaataaaaaaaaaaatauagtagctgaactCgtticcga~_ca~_cctac~'tatcc :al tlC!?~_galc;aa!_ccacacci'taettc alyLacataacaaaaaat'=cglgllaCllaaaCalaaa:lf:ItglIl;lY:lttlC:1 CCaC:IIICC;I:I'.T.i'aCgCI~'gaCl'='~CCIi'ICIICIaa;IIgCII(:IC;~CIaCI~IfIfClll:l'=
'_C~C'=C11:~'1'CLI:ILCI
al'nolll:taCClllIgaCIICCIIf'=Cll~':1:1C~'=l''''f::laIICI:IICICaagCIC:II;CaCi'at l:ll:L'IYC'CL'C;I~CIC'=
ayacctt~laalgaC:llaaalllCIfLCICCICfIlatt~Clllgt:lCCgaaafgCClICtacc~_'~aalgacWll lC1 CCIIa'_i'al'_CI~aCICi'°Cllll~laCtClatalcalactC<_Clla!'CICCaCIi'CCCaIfIli 'CCa:«CLICC_'faCi aaattggaccatgaagaattg Ilcgaa~=l~=~'c l ggllgglwacacttcg alC g ~alCa'=allgg:l:l:ll;ll i' g:ll ~'Ca~C
ILfC~'_YCagagg«aCC:ICCgCaagl~=C'gaglllllcc;alcallgcalacltg_llllg~'aalclglc;tl~_ ~;aCllacl lalgtag«hallggfllllglll:lCC:ICaalCIlgcaggatcaagaCICCgCtaaCIgCalg~llC~aCaCCoaaaC
'=l ac:aaatatagCttttcatCC:lCItccg~lllgaaCaalac:ag~t'lggCElggtcaaettcgctttaagl:Ig lCCaaaI~CC
ICtICaCalllCllglCCCa:Iaggaalc:ClllgIllCCI«CagaaIClgaIafaaagglaaagaCll~=ICIZIIga ICI
t gaaafaa:I:IlggtlCaaagCl gCaatClg:lCCggilaaCC gllglaCIIalIlgaagaaCl«ggagal ~=aCalgIfla agaaagcgttaattlg gttca~=attggctlcga«cctctttttgtcacaatgtaaccaagaaattctcc~»_al~=ggacL
ctgaatgtacatitcgccagg«ca~tttcattt~=atattgattaagaatttcaaaacattcc«caaal~_cct~'atg tg otcttctttcttcaatgacttgaccaacatatt~tctatgtagacttccattgtttttccaagatgctcgtagacatt«

oflgaccaaccgttg~taagttgctcct~ca«clttaaaccaaaaggcattaccttataacaataaatacctclalact t~acaaatgacgttttttcctogtcatcaagaltlatcatlat«aaalafaacaagagaaggcatccatgaaagltaac aaclcgttccctgtagtagattccac~~aggceatctatatgtgggagigagaaatatcctttgggcaagcttta«gag g tctgtgaagtcgalgcaaacl:ctglcl«tc l;attc«ctttttcaccacgacgglattt~;ccactaactctg;_ataaa~=
~aL:CICC(:IlalagaCCllat(:IICaaaaY«lglCIaCCIC'=ICgIIIaCCgCCIICgCICffIICI:ICICC'=
ai'a~ltt tctlcttutigttttaca~=gctigaagcte"~_all;cacgttcaactcgtggcaaa«atgcltg~'atctatxcagg ca Igtct'=cttccgarcac~=cgaac~=at~=lt«cutttcctcaaaagttcaatcaaggcttcttttattttttcac't taaa olagctccaattc~aactgtelgatcf~gtttatcttcltctagtactactttgatltcgacgtcatccactt«tgate tcgg~_ga«attcattttttcgcgggg«nnnnnnnnnnnnnnnnnnnngcllcaatictcattcla~_acagatgatca g ctacaccgttttcaattcccttcuutcaacgatctccatgtcaaactcttggagcaataa~=atccatctcaacatcct a vv[Ilggl:llCClll;ltCgcatalatatgC:CICagagCagCalgalCa~lglagaCIaICaCCIaa~'ai'CCCBC
Caggl:l ecttctaaatttcttgaatgcaaataccaca~~ctagcagttccttctctgtggt~~calaccttac«gtgcatcafcc a tcgltctactcgcgtagtagalcacatgca~=c«tttatcaatccgctggcccagaactgctcclactgcataatc(ga a ~catctcacatgatctcgaat~=gVaagoccagtttggagcltggactattg~=ggcagtgaccaaagc«ccttgatta a cttaaaggctgtcaaacattcalcatcaaat~_rga:lclcggtttccttgcacagca~tctggtga~_cg~lc«~'cc aact f~gagaaatccttgatgaalcuctgtaaaatcctgcatgtccaagaaaactcctgatglccaagaaaactrmgatgtc cllgaaa~talllggl~ecfglaact~calcatcacglcaaccttagccttatcaacctctattccctcctca~aaatt t tglgccccaaaacgalgccticttlaaccataaaat~=gcatttctccaagttcagcaccaggttt~tclc«cacatct t ttgagtaccctgcatag~tlc~acaaacac~_ag~=agaagaaagagccatacacagaaaaatcatccat~
aatacc:tcc:ac catclcctctatcaaatct~agaaaal~_~aa~lcatacaccgctgaaaagtagvaggt~cattac;lea~'acc~aaa ggca tac~cttgtaa~caaag;=tcccataag~acaastaaaagtagtutclc(tggtcalu~g~igaatag!~,'at«~'aa aa aatct~=ctata~ccatccaagaa;_caata~'laa~'galggutectaatctttctagcatctgatraat~_aa~_~=
~caat~'~
gaaatggtcttttctaga~=gl a<'ca«t;l;IC:ItI
l:««aalcaatacacatcctat'_tcia~=ttatagtecta~=t~_g~'la lcag«CaICIllafaatllllHalaaCl''gC:II:ICCICCtItlllagnaCCgca~l~aacfg~a~aIaCC~'aa'!
l~'CIa fCagagalaggglaaglalltCClIIIICaC;IaCCICIIICaa~ItIgg~IttaatcICCltlga~'~fflaat~Cla ~'av laggaltCallllC;lagalggetlalatg=gl:~eataa~_ol~g~lgaaatcccctfaatatcatct:laCg:l~I:
IIC;Cl:l1 agCICICCIaIaClfCllga'L'ClCa~llaICagCa~~=If(aCIIr=alCaglagllagClCglC:IllC:lll;Il ~aCagYal aagtagaglttaglccaaea:laaac~lacctlay=cCCILllggla~cgIIIlcaaalclaclllg=_l~cc It~:I~IIC:I
gaCCaglCalC
_ggala(:Iaa~=Itr;°C(lCa~a~_~'aggootlofaalCgagtagll(:gglCg:lYlal:lCl~aal~
"=LClfll lgglCgagIgalglgglCga~talatt~'ataaglglalallllacafgltttgagcatcca«tgtcatl:aal(tagc at CatatCalCaa1g1I1CaIaCCa«tCICalCallt~=LCaIC:ICIII:ICaIglllaggal:lgillllgCal_'C:1 1~'llgC:l lalttgC',,,IlgalIICagi'lL'aIII'=°a~'ll,'llgac ,'arc I:ILCIY~aa~aagC
gg'=CCI g;llCal''aCaaaCC:ICIC
;1CCCCaan'L'lC~_agta~'ga~_ellCa;tYallIC:laCaE,IC:ICII~aCCCC~(:g'_ICga~fa''ai'~a C:lICaCI~CIICaC
(:aCaICaCICgaI:CCC~a~_'=IC~a'=I~IICIL
aICCCaII:;lI:l:Igacc:atcacllgall:ac;;tICaCIC'CaCCCCL'a~n[C
'_agl'_ICIICaI'CIlCall~'CCaoall:aC'IaCIC,'aCCIlIIC:I(:lClaC(Clggaa~ll:ga:1I;11l :aCCaIISICaCLaO
IC°_a(:I'_~alaCll~al'_alaa'=LIIIa,'a''CILIC'IIC:IIIC:I;aC:ICII:Gall;agaC
aCll:Yaglal:la'_';:la'aaaa'' aa~aclcca~ttattcactc ~'acCICCeaCIC~aCCICC'~'~Cl:tll:l~
~~'lllCge;l:llafll~~~Ctataa~_t:l~_Cat~=t aCIICaCaIIIICW
aCaCC;l2llllCWCI:I~_lltt:ILICC~'CaCaCCII~ICIICIa_~alCllll~taaftl;aCatll CIClalalllallC~olallCa~IaIIC:ICI(lt_'Ilc;ll~aLIICaIIIaCIaII~_ILCaIICIr_IlalCalC
CI_'Cl 2flaCaCIgIlgllalCalgilllCaaCll~=lICaaCglIIal~CIfICIglfal~_alglCl~'agfagl~~a.lt;
t~'glllC
llagval~~sna~agta~'tglagaatlClea~=laloCfagglgallgagiat(gallgalagaICCCIIeIYgalIa gl t~IICllaalgCCltIII~~ClllalgalC:;I;lCIL':l;lathga~CaCagaCaIIICCgC~=(:CCaa;la'~~' I'=flC;;al~~aaat _toga.lccaagglallg~=flgca~~gLa~'C
~'flllggcltlBacllgflgallcglaalRccl~Tlla~=alta_c«tcgt caat~=~=tgattgclgtctg~_~_acta~=~_ttaamya__~=gtctct_~ttacg~_tagcac«a~_,f«
Ig~_Ilaal~_aacttgtt PtCtagg~_ttaatttattga'_CalgICaaICaCCI~'I'=aaClt~llgtf'_ag=:IaaCiaa;IClaClCaalla C CCCaICC
lCgggaatlClllalCtgallgaaltClfl~IIIaCICIaCIgtf~III;ICIgCaICtlglIalClglC«CIIaaIIt Ct '_CaatICIIICIIaIIaCICgaCCICaIaCIC~'aCCaCCCagfI'=ICI=gCa.llagaCIgIgCai'It1';li'I
;IIClgtgll ataCtIICIgICIgIIaCICgaCCIt'ICacIC~=acctcaccca~tgctggcaata~lag(:lot~Illy=tC
~:IyI~ItttaC
t~tttctgtttgattttctgttttctgcat~=atcac«a~~'act~ctagaacacaccaaaalttgtlattgc«agctt g acttagtgacttctgatcacatcttgaftgtta~catcacacccatttgga«gacaacctaaaatactataacgacatg attggtgtttt~_g~ataattgactaaaaacctattalcaatttggcgttgttgccaattgggtgtttgtttgttacat tt ~agatttcagacttgcttagatcaagttcttg«caa«ttct«tcfgttactaactgtglgtftttcttgttgtc«c ttgatccaggtacacctctactgtatgcaaactc~atcaacaggcaatcagaacclccttttcaacgacaacalcgacc g catagctcgcgaactcagagaaaggaaaaacacagtcaaccttgtaccttagcaaccactcgaaatggctgacgaacag a atcaacaaaatggccctaccaacattggt~~ctgeagatgcaccacgtlatcaccgtcagag~_aaag~=aatagcacc (cct octalccagaacaacaarttcgagattaagagtggtctcatctcgatgatttaggggaacaaattccatggactgccaa t gcaggafccaclcgaccaccttgatgaa«
egataggctctacaacctgacaaagatcaatggtgtcagtgaagacggat tcaagctccgtttgfttccattctccttgggcgaaaaagcacacatctgegagaagaatltgccccatgactcaatcac c acctgggafgattgtaagaaggcttttctatcaaag«c(tctccaatgccagaactgcaa~actgagaaatgagatttc tggttttacacagaagactggtgaaagc(tatgtgaa~catgg~agcgtttcaag gg«al:accaaccaatgccctcatc alggc«factaaagcctttctactcagcactc«tacagaggagtccfaccacgcatea.laatgctictgcataccgcc agCaalggg:latttcCagaaCaaggat~«~aaYa;lggctg~'~aauggttgagaacctCgctcaatl:agat~~gca atta caac~'aggactgtgalaggaccgtcagaggaacagctaactctgatgacaaacaca~_gaa~~~=agatcaaa~cgct gaatg acaagctggacaggattcltcttagccagcagaaYCatgtscacttccttgttgatgac~=agcagtatcaaglccaag at ggggagggtaaccagttggaagaagtcaactacatcaacaacaaccagagtggclacaaagggtataacaacttcaaaa c caacaaccccaacctctcctaccgcagcaccaac~t(gctaatcctcag gatctagtctatcctccacagcaacaacaag stcagaacaaaccttttgttccctacaatcaaggttlcgttcctaagcagcaatttctggggaactacca~
ccgccacca ccacctgggtttacacafcagcaaaaccaaggttcl,'ctgclcctgaggctgatatgaaacaaatgctaeaacagctg ct lcacgggcaagcctctagctccatggagallg~caaagaagatatctgagttacataacaagctggactgtagctacaa tg atctcaatgtcaagatggaaacactgaataccaaagtccgatac«agayggacattctgcatcttcttca~
ctccaaaa cagacaagccaactaccatgcaaagcagttcagaatccaaaaaaatatgctcatgctatcacaclgcgtaglggaaaag c acttccaactagg gaggaaccaaagacggtcact~ag~acagtgaagatcaagatgg~~ga~=gattlaaglctcgagaaag atcaa~ct~acaaaccactcsa~caaccac(c~atcteccacoeacaactuaetcaaccaacaac(c~_acc:latct( c ccaoca~catcaacaact~ctccaaaacaa~tt~ct~_'tcaaaaacaaa~'aaaa~~tcitt~tccctcctccctaca aacc acagcttccatttcctggccgtcacaagaaa~cltt~_~ca~ataaatatagagctctgtttgc:caagaalattaagg agg llg;lgllgCggalaCCICIIgllg;lCgClCIagCYCIaaICCCagaCICICaCaag«tctgaaagacflga«gtgga g agaattcaagaagtgcaagggatg~t~=gta«gagtcatgaat,~ca~=tgctatcatacaaaagaagatcaucctaag aa octlagt~atccl~~gttcaucactctaccal~ c(c«laggtccalt~=~cutcaatagatgc ctatgc~'atttaggag catcagteagtetcatgeegetcict~_tcgecaaaagatt~_g~~,ltcactcuatacaa~~lcctgcaatatawc(c afe cta~ct~aca~atca~taa~~atccctcat~e«tec«saaaacctcccaatca~~alc~~t=tt~_'I_"_acalacca ac tgactttgta~tcgtggagatggatgaagagcccaa~gaccc«
Igattclagggagacctttctlagccaclgcaggag ctatgattgatgtcaagaaagggaagattaatctaaatc«g~caaagactttaggataacct(t~=atgtcaaagacgc g atgaagaagcctaccatagaag~~caactcttttg~=atcaaagaaata~~alcag«a~cc~=atgaa«actggaagag ct oca~aagaagatcatcttaacagt~=ctttaaccaaaa~t~gtgaa~_atggt(ttrlgcatlt~_~~aaacattgg~a tact agaagctgtta~actcccataaagcgat~=~'aaaa;ucagaacgc«tgaggaa«~_aatggaccagcaac~'gaggta atg ~taatgagc~aagaaggglcaactcga~tmamct~mactetr~=age=ac~_lamc~arca~=ccactc~accttgtct gc caacgaatrt~=~g~a~cc~atcattcc:aact«;I~at~'acl~~ICt~=aacataa~,ycactgaagg«
~atcmaaaccac «cctaaa~~tctaa~~tavecauc:cit'_,'tccaaaclctactlaccct~toatcattaat~ct~a~ttaaata~l~
at gaagtgaacctgctattatcttagcttagaaagtata_,~a~a~_c~attg~=ttattcaltatrc~acat«lag''~a atttc accta~tttat~cuaccata~~atccatcit~aaaac~aa«
ctattctamatteaaccacaaa<_~'a'~_'ttaaatccca a( 1g88agaaglaglg;laaaaat:l8a11«i'aaacy_CIl'_al'=CI'='_I~ICaICtaCCCtatctCt~'ataYta cal~~
gtttctcca~t~'cattgcgtccctaaaaa«~'~_lg~aat~at~«~=teaoaaat~'aaaaa,7at~aacl~=ate cctactaga ;lcf;1111;1clY~lCal:l~;l:ll~lec;llleallala~~aa_'fl~aal~CtCCatata'_'_aaa'_alCall lICClltalCall c:att~accaaat~ctt~aaColtta~_ctaamatccatactalt_'ctttctf~'aleealaca'_(uvtllClllca aatac raa« cacc:ctaatealcaa~aaaaaaacamncacatetcc It:«=_'aactttt_'cttata;l~'a~uat~cca«t~et ?i WO 00/~~325 PCT/US00/07392 (lal'=Caal'=CICCI!?CaaCaIIICa_'a!_!_l~lal_'aCWllalafll(c:aeacttaatcgil'='=agaIC
T'=l~_L':!'~~llll lal~'~_ate'alIIllC~'~~lClal~_gCCCCICIIICIC~IICaf~ttt«It;aalClt~=~,cap=~'i'tattY
aCla~~'I~'t~:t;l~
a~'avgaatcttgttcteaatt_~g=aaaagtgtcatttcat~=~tyaa_=gaa~~catagtgm~'gatcacaa~_ata tcaga~' aa~='~_tata~=«ggtlgacaaa~~gaaaa~_ttgaagtgatgatgcagttc=cagccaccaaaaacggtgaa";_ac atca~'aa~_ «lCClt'_''lcats'Ct~.'_~.'.gttCtacagaa~'atuatl:lAa°acttclcraa~ata;=CCa~YC
C iTtla:lCla~:lCtall,'I
'caaggaaacagaat«~~aattcgat~'aggaca~'cW
caaamcmuaaaccatcaa~_~al'_CtII~'~=l;ltCl'_CtCC~
CItCIIC~_a~=Coccaaatt~!t'.gaClatc:C:ttlli'ag;llCal~IC'I1'alYCalCa~aItaCi'Caata~
~a;=Cl~tltla';L' cca_=aaaatapacaagaagCllcal=lcalata«acgcca_=cc~=aaCgllg«algacgc(cagg~Taa~lafatgc aacaa ct~_a~~aae~a~cttcta~ct~ttgtatttgcatu~a~=aa~tma~'aa~ctatttggttggatraaa~~taact~tc tal acagaccttt~~a~gcatctgtatgccaa~=aa~'gatacaaaaccaa~act~tt~agat~~~atacttttatt~=caa ~~a~ttt ~acatg_~a~ata~tagataaoaaa~~cattgaaaat~~tgca~ct~=accatctotcaa~~_at~a~aatt;~aaaaa cccct tcccatagaceattcaatgccacaa~~agcagctcategitgta;=agttcttcggaagga~~ctaca~'tgggaaaga gttcc accaactgaatgctgttgaa=gagaatctccatggtatgctgatcatatcaattacttggcgtgcagagta~~agcctc cc aacct~acaagttat~aaaggaagaa=ttlttcagagacatacaccattacaactaggatoa~ccttatctttacactc t ctgtaaagataagatctacaggagatgtgtcttagaagatgaagtggaaggtatcctgctgcattgccatggctcagca t atggtggtcacttcgcgac_ttcaagacagtgtccaagaltcttcaaecaggcttitggtggccaacaatg«taaggac ectcaggagttlgltltaaagtgtgattcatgccagagaaaaggcaacatcagcaoaagaaat~agatc«l~atgtatg ~~augattttatgoglccattctcatctlcatacggtaataaatatatactggtcgccglaaactacgtatcaaaatgg gtcgaagctattgclagtcctaccaacgatgcaaaag«gtgctoaagclencaa~~accataatc«
cccaagattlge aettcccagggla~taatcagt_atggcgggaagtatttcatcaacaag;(ttitgagaacctctt~aa~aa~cat~ga ~1 taaaecacaae~tcsccactccctatcatccaca~acaa~ca~eca~ett~a~atctccaatao~~aeataaaaacaat t ctggaaaagactgttgggattacaaggaaagactg~tctocaaagcta~atgat~~cattat~~'gcttact~_aacag cctt caagacccccattggtacaaclcctttcaatcitctctatggaaaatcat''tcacctacctgtagagctcgagtacaa gg caatgt~~~~'cggtaaaacttctgaacttlgacataaaaact~clgaggagaagtga«aatccaactcagt~'acctc eat ~agatccgtctagaagcttatgagagctctaaaatctacaaggagagaaccaascttttccatgacaagaa~atcatca c taa~gat«cca~~ctg~aeatca~~tgctgctgttcaactctc~ctt~aaactctttcca~
~aaaacicaaamcaaat ~slctggccccttct~tatgactaaggtcc~fctttat~ga~ca~~tcactcta«ctg~taaaagta~a~:tc«cacag ta aatgglcaaagactcaagaaatacttaacagatcaaalcc«ccaga~=~tgacatcg~ttcatctcca~gaac«ctt~_ a tga«aaaggagtaaaggagtcaagcttatgactltaaacaagctcacttggg<tggaagtcccal~actatctcl~taa a t;lull111ttattltc«gttatttttgalllglcllggllglglllglgallclC;Ig~=aac~aagEaacaalgto~
a ata_agtaaaaattcaaaactittactalatagaagacctg°aoatC
~a~t~lataaccc~«~~ai'tacaaaatttt~a aaatcttctattgcacctagfgaccttacactcgaglac~«
g~=tc~a~tat~t~tgatgtttaaaac,=caaaaaac«t taaatcatcattttacttgacccacaaaasctacaga~'ac«gta~'ggatttcctgcagtctaaagtg''gtttlcig caa aattttcccaetcaacagagt~tg~accac~cact~~a~~ataagatggagagagaatctcaanctcaatcmagcca«
c clttttacatttcactcgacc«gctatcttctgttatccaccactc~atcaaaacaaattamc~
acctaataatcctt a~ccaclclcgacclccecc~cttcagca~aaucaaggaamactegaccaatc~alctacttgaccccc~'~~'aatca c Ic~acctcc~'cc~cc~tcatcautcYltctatccetc:ictc~acta~acta~atcaatc~~attcctttte'_clal aetc raccaattcacccgtclctttcaaaagaagctactc~'aCCIC~IIaCCC
a~_ICtCCta;_CC;cactc~'accgar«ccgcc tcagccgattlcaacaatt~=actccaccgccgtgaateactc~aW
Itcca~=ccgagtactcctcctc~ttctc;c~'tcg aatlctca~~gttcaa~atttcactcgacctc~'a~ttttaactc~acctctacaam~zcaactc~'actte«~aaamn a ca~a~crct~_aa~_cactstaataoa~~ttactcearct~aasacttcacci'~C;2l~;t:lClCSacatccavaam caclc~'a cctccaeeaacaectcaasttactcsastacasaatt~tactc~..'atcaattcaatcta«tcauecctatt~al;ta caa ttttgg«tcattgctatctttgagaclaacctattgacatttgagct«gag«ctaaa«tctacc~gagaatcat~_a ~~aattacagtggcagttcttctggttatcctg;lclacaacatggat~agaca~agtcgtcalc«
eaa~=~mca~agaoa ~aac agagagaatatgaaa~cttcagaaggaaa~ct~agata~c:cc~'ag~,~aaga~
a~;c~'atya~ata~'a~=~~tatoa;~cl ti«a~ac~aagatctggaa~ae~agtacatgcctgaacagacaayagaoctaccaaacttct~'cacaa~=rc:aaaco tat l2CClC'CIYaY~aalat~llaY2CllIICtaYCl~aillY:tYllBtCla_'_CaC'_a~~latCCtICCiC~_ac'W
c:aCll2Ca Ca;tclC~~~lllall~~aa~ati'llla~C:lCCIi'lalL.laa~ll_~tCatCtYS'al:aclCtaat~_'Cllat tC~_tal~_la_'C
_l:ll'=aa~atgagacaatacaattcclctccatoCl~'caa'_la~'a'lCICtaccaa'"=lal!_aCCfCI<_aa ~Ta!_'lf~i'all ~l~aa~'Yatlg~~aflCllgCoatlClClga~lal'=~'lCal'_a'nata~~ilatCaatcaa<Wi'CIIC'oaa~' c_all~'ltl CC'CtICCCCagIV ~aal:gggatctaagccCaa~_lal~ala'_a~aa~a~=lt'=aaa~aCll'=l~~alt'ac CaII~WCaYCIC
t~_tacc~ct~aattc

SEQ ID NO:186 Arabidopsis thaliana BAC T6C20

«ast~tttt~y_a~tcaaatatgacta~atgtcat~=tctatgatigactataagaa~ta~aa_cacaacca;atccta aa a~ctaaao«gtgtttccatgctagaagatacatagccagatactcataca_acttttat~~;at_tc~~~attcacaca ct caa~tatttatauaut;_tgg gctaaagclctcaaatacaaaagtacacgacalatatgcataa~=aaaagtataaga~=a etatatagaaatgtatgaagaattgtacatagaaaatggtatagtattccttatetaaagatc«acalaaaaa~=a~_a a~_ aaataatc«ttgtgtatttatgagaaaataaraatgtttaagtttaagattt~tg~tca~aatatagaatctcta~ccc cccttatctig~agtcateacctctcttttatagcctctttccttcttccaatstgccattaattatagaactaattca t aaaataactttcacaatcaattcatacacttgaat~acaactcgccaaaatgtgtatcctaactaat~ctcalgcttca l cca~clcat~tgcagctcctaattatgaaactgaaactcacgatcaagggacg~ccactcct~ctttgaaccaa~acga ~
attacacc~_ag t~_tteatgaggcttcagctcatcttaactcttttatggctattaccataaaatctg gggattgtcccaa tttattttateaatac~actttcccgaaag~~aatttttatttgcccaagggsttttaaaatttgttgg«gt~=tgccc a a~=a~~~etttttaaaacttttttccttttcgttgtccataggggttttggaatgacactt:cttaaagtatttocttc ~a a~tt~'cc ggatttocaaagtccgtactttttca:laaatiteg~_ccat~tgoctaagagaaaut~~gccat~t~y'ctaa~=
agag~aaatag~ltgatcaaacccc:caaactcaatctttggactttctttgatt~c~agaatt~~t~ctcc~_cctaa ~aagl atlaeetaeatc~'aclaaacaccgtaaagatctttgttttgcta~tt~'acccctt~tccaa«a~~attaaatccatu ta «~actcaaactc~tcagucaaccatgga~ttu~~tatca~aattaaactc~a~ttat~cat~~~a~ctaacrtc~a~
«aa~«gttgc''aagtgaactcgactttcgaatc~'aagctataagcttgagtttagaaagtgtattctgcttata~aa t acaaa~gca~_~_ttatt~a~caocaatctagagagaaactgaa~atcttcttccgagctigtactcaaagactly~tc ata aatsclt ott~at~'tgaggtggtatctaaatttgtgaagagcgatctcaacacttagatttcacagagatttat««cc aa~a~ttggaacttgVaaaaacccttcacgagatagttcagtgaagagceaactcaacacttatagtccaccaaaataa t cttatttttccaa~agttgaaacttg~aaaaaccttcttgggat~a~atcta~tttc~t~aa~a~cgaatto~acac«~

aaccccgccgaaattaaccaatattcccaagagt«caactatgatctggtoasacata~ataaa~tcgca~ctlcc:la ~~
uuaa~l~actacatcaalcgcaa~tttctccattataatotaacttaattttg~act~gtt~at~acttgmcal~taa caaalcg~ctttcactctcttct~tcttca~aaattattactccattaac~gcatgattt~aaaal~a~
at~talaa~ta caatttttcatygc«caggcttcagagaattagagcaacacttaagtatgtctttagttgaccgaaagcatcttcara tcutcatccaaclcaactctttgttcccttttagtatttgatagaaaeaaaggcatttatcggct~'actga~_atgtt aa cc~~tttaac~ca~taatcctaccaaagagtcttta~~acatcttt«atgttcc«~~~~ctaacat~ttc~'aaaa=ag cg «aatcl~tcttg~'gt«gc«cggtttctctcttt«ttgctaa~=taaccgaoaaactctccggatg~=gactcagaat~
t -ctac«gecaa~atteaacttcatctgaagcttgttcaaoataltaaaacattctcecaa~tcttaaat~t,=accgctc t C«u~ac~'at«caC~a';cat~ICatctatata'~aCCtccaCC~lllllCaaYal~all:llC''I1~~1:11~aeaC
:ICCI
aCaf~=ttta~uca(:ac~_~'Caratcotaaaata~tatatacvlC~gll~,vl~ClYaal~:lallalllilCllaa lC~_tt a~_~altCatleatatatCCCIafIIIIaCC~llltlaa~talo2lll:l~'~I;lla~~c ttla;a~_IClaalf ~Ilalllal aCa~ItIlttCla~=tCttIltCa~glClIlaCatgtllta~lalgCaItl~L'aapatlat=~'a~cattct~~
a~taaaac:at acaatttaa~ct~_aagaaQcaatccagagacgaaaatgeacgta~gtgtceaac~aca~cavccrt~=gt«tcyt~_c oaca ccaatacctgac<_~caattgaaa~acgcaacttattgttgttttgaaoatt«tt~aattc~~cccaaaactttctcct a t~tttaatgaag~cccaacacgttcttaagacatalttatatagctitgaggttaatgtttagaccctaa~=ctttg~~
a~
_'ctattccc«tttcaattct~tactttgggagcaaaaccctat~~a~actccct~ta~apaa~atccttctaaacctt a ttclcctc:«g««attca«eat«cttattcaatcalgt«tg«tllctat~atcattttt~a~taatcttctt~t ta~~ttta~~gutctcatta~_ggatttagatga«tai'tagattgcccccttoctaagttatct~_laggattctW
am ma«gllc«aat~cta~ma~a~'tacctaacta~'aat«;_atatta~~taata~~tat~cccaccata~gtaclt ~«a« lgacctaW
al;t~atYa~~cWaa;,cat«aa~aaa~aaca~~ta~~ctcata;_aaa!'~Ct'~t~C~LTIII:IYataCl a~~ulv;l;lclB;llCaa:lCll~'al8ataalgaat~~Ctttagtatag~aIIICICooCIIIICCggtaatctta!
_ClCl;l;l''1 ata~;l_a~llIlYyY:laCaWaCC~~IIaCICtaaaatcatao111~aollC~'a~aac~~tlccalaalaalCla~' Cla t«cl;l~alclaClatC<llCfallIlClga~_aataccctaa~cctaac~CCtltatlalClf~lIlICaCaac:ll alaal ~lltitl~CllilClIlltYttlCtlllallllClflaClaltltac'lta~_Cllaalll~atta~CCaIIaUICICC
I~C;'l ~'atc:~ ~'a~aac «t~l~aa«
c~atccctaa~_t~'ctycaact=acctct«atu~=aoa~a~t~~~'tctta~gatlaaa«
'aallalalC:l;l;ltll~l:C~=C'=nll!.l:C~=aa~llClClol~allCaccattagactac'utVtil~~:la l(1;1~';Illaa~llt IlaIf111CI11Ctlff ~'lIICIC;IC;1C'=tlitClIICIICIIIIICCI.Its'aC:l''~So08aCtaCQ;t~:II~YBttC;CCaaC

aaacagctg;~ca«atttcaac~~aca~tgtrgatc~acaccct~ttggt~tc~ttcaacaccatctgtt;;~gtgtcg ttcga caccatctgtt~'~t~'tc~tictacaccatct~ca~aaraaaccagctaccata~~=cgacttcaaca~_a~'ctga~
cttur lallccaatcgatc:lgcgalcc:lgctacctcctattgcaaga~aa~'as«t~~ctattcaccctgcctactattatc ttg t tmacgaggaaagtttc~~a~_~_tcattttgat~=aatcactactagatcatetaga~~~=tcttcgaa~;actta;_t ytcctcaa tcaaa~~ca~'atg~=~_~'tacaa~;ctga«atcatcttt«caaactc«cccacactctcic~~ctg~;ca~~a:_~_ aacatcatg;_ ctgcgacaaW
a~'aacccg~_ttcactaaccaattggaglgacaegaagaacgcattcalgatccacticcttc~atgaltc a~t~=acaaaattactaa~y=~agaa,~atctcttt~ttltctcaa~ctccaacagaaggttt~aa~=gla~~ctt~=~
atcagat tcaaatcctatcagaggea«~=tccacaacatggtttctaggaaagcaaactaatcaacatattcttccg't~;~=aat cgat aagcagtatc~yttatotttt~gtggtgcaa~taatg~caatttcatgacaaagactccaaccgaa~calca~ttctta t caacaac~ctctcacaa~tctctccacaaa~~aaata~actatoatc~aaQaaaat«_~ct~a!~~cttcc~acaatc~
ca ~asas~aaeaag~atctacar~_cattgcitctccacctaggaaaeatc~t~gttgagtttactggcaa~=toga«ctc t clacacagatcttaatgggaagatcgacaatctgagatctcatatatcttagcc«ctcctacattagcatttatcactg cagtcacactacgaagt~eaaa~=caactaaacccaattctttagt~t~aacgttcagctcagacctcatctatcccag tt gcagaaaaagcctcaatgtcgattgatacatca~ggtgtcgatcgacaccaattgatctt~acggctctgtat«ccttt atcgtgtggatacgatactttcacagaggaggaggaacttcttcccgatggtgtcgaacgacacccacctcatgtcgat c gacaccaataccacataggcaacccagtgctgacgtttgacagagcaggatcgatgccacaca~;tagt~_tcgatcga cac tcacctagtg~=c~atcgacacaagacttaaactgccccaatatgccttgcatcmcacctgttactatacaagtctaca c accacaa~«ccttgtctactacctcgacagtccaagaaggagattcctgagatgcattgcat~ggtatcat~gatga~a ttctacttcagttgcctccagttgatatagagaagttgtcaccctctcttcaaagtgatatgatacgagatgatgctcc a oagaagcaagatgacaagacaaagataatggttgtgtcataacaggtactcagtaacagcaagegtccacatgaggcca a tttcagattc~ttgctgaoaatcaatctgta~agat~gtgtt~atc~acaccatcacctgtgtcgatcgacatcccaaa c aggctaaaccagaatt~tataat~aa~ctgcttigcaagttcagatcactgct~aaatttgaggacatg«gccacgtca cccacatcacctctac tccttcatclacctcataaac ~ccaaaootactcatctcctcctcc taaacctccteatttcat CaaCa:tgag«taC:l:ltttCaaaa:lactgttttac~gttaaaCa:l~cCaC~agItICICgI(;g;lgCal;Ili' tICIIICIC
llgatgattgtg~ag~lacaaggagtccaccacccatacctct~tcctgC(:CagClgaCalICClc'a(:IICCItgg Iaga CC gcacacacc ~=aclgttaatacgccaccit ggll gaf gC~gg~ttctactaggcagtatttacIICCICCaIIIoaICC
acct~atccctaaactccaccaaagtcaa~ctattgactctaaacaagc~ctta~tgcgaggcaacccacta~tatggt t ltctttttcttgltcttttc««ac~ttttttttttctttttctcgcttactctgaattcgaatgccattggggacaat ~acgtttaagtgt~_ggggaggcagttactaactacgttlttctttttgagtcttattcttattcttatt«tcit«taa ~tcgtt«auga~tcaaaattttgttaggaattatgatctatattctttagattattgcctggtt~~attaat~ctgcag gt«gactcaatccgacta«ggaaaatctag:uggcaaaccaatagacttgaggatatcrtagtttcc:aacaccaecac caa~~cte~a~ctaatctcactectcttcctt«acctt«caaac~~tta~tattcata~atctt~actcctlttcata cct~atattga~=accaat~=t~_ataaaaagaat~~tgttcctct~cccagaaaaaaaaaatgautgatgaatgagag attga ta~gattttcagata~totatag~~~taeaaat~tttcttat~atcctattattatacattatttggaat~atcaatta c aaaggttggaaaaa~tcaagggtetcoatccagtatgc gaottcccctttgtttactc~claaaaaaataa~_ttteggag a~aaagaacaccaaa~aaaaaatattatataagagaccg~tc~atttatcacggtataecacag~=aatea~lctagaa g~
atcttgaatgttaacugtttgatgagactcagcgatgaaagcctgatagatatagtt~_tggaactta~y_tat~~gaa tct ataat~tt~t~_taaectt~m~~aeaaetacaa~nt~~ua;'zattt~~a«aeacti~cta~ccatat~_~'at«a~tt aoa al~_atcat~~caata~ ~« a_'a~aata«aecacttcrt~t cI«tcaaacctcttccat~ectW
eacttttet~ct«
_cttga~=ggtaagcaaaa~~ccaagtct~y~~~~a~tt~atatatcc:ctattiltacrattltaactatgtttta~' gtata~' ~tttta~a~7tctaatt~«at«cta~~agtrttttttagtctttttca~~tctttacatgtttggyatgcatttcgaga t tatgg~gcattctg~_cccaaaacatacaacttaagctatagaagcaatct~ga~=acgaaaatgcac~'ta~~'tgtc gaay' acaccatccctggtattgaac~=acaccacccct~gtttcgagc~ataccaatacctgacg~caattyaaaaacgtaac «
attgat~ttttgaagatttcc~aattcgocccagaactltctcclaletttaatgaaggcccaacacgttcttaagaca t atttatagttttta~T~ICaat~ttta~acc~taagctttgg~aggclattccctt«tctattctgtacttt~a~;agc aa accctgttgagactcccttta;=agaagatccltctaaacctta«c;tcctcttgtt«attcaatlatitcttattcaa t catgttttgttcltctat,'atcalgttt~a~taatcttcttgttag~IllagvuttlClC~IIa~~~~_anta«a~~I
altl a~la~ala2eCCCCII~ClaC2Ilallf~ta2Yattcttcatc«tall9lltataal~Cla2tfcta~a~Tta'_Ctaa ct agaacugatattag:«:lataggtatgc;c;aecatuagt;lc«~'Itatttgaccaamalayat~~a,_rcta~_~lc atuaa~, a;IFI~;taC::1_.'nt:lYYCll::ll:IL':l:l~?~_;CIC'.II~iIlad;llaClB~~~(~aactaamaaaat t_~ataataatecat~cttta _Vlala~_~BIIICIt_~~CIICiCC~~taatctta~ctctaaetata_~a~a<_'IIIC_'_m'~aacaccacca~ll aCtctaaaa lC:ll;l_~III~:IYIICY:IYa:IC~YItl~:llaataaCCl:Itl:lalllClaQalllaClalCalc;lallIC~
I~;leaatarlC
l:l:l_2CCti':lt__'CCttl;lll;llCll_2llllC:ICa:IC:IC;;ICa:lit'~':lll~i'l:lll_'c Il;l Illtl_~I1(CIIllallllCl ll;lCl~tlll:Illl:lYllt:l:Itlli;lil;t!'CC:III;ICICi:I~I_'c'_~'I_~:lllC_~:1.';laC
lll_'_l'_a:ICI~'~_lClla,'_~alt asffl~acllal:llc allc;llWtlalll! _~ll~l:lacaa_~al~c tacltICIICCaa;lllcllc~ilc;_Tl~_alclllCl C:lllaali~_C:C:~Clll' _':lllalL'aC~tcatc(:altaCtl _'CCl'.
~I:II'!al:llll'LtaaaalCaCIILIIICIICaI Yaoa y titgttaattggataaccatcc~_atgagtcsttaaataacaagattit~aa~_atlctt!_alctccttgtaaaattt tcac accattc~="~'attaaalatttaac~_lattgacaata~au~;at~_~'"alaacctteatcc~!ataaatccat~g~
«ctccaa ctat~tca ~cct~;gctacttlccattctrtccaagcstactcaa~'aata~ta~atclalcaaacttccaacatcaatcac aatcttl~'aaacccc~'acglca~'ccae~_tct:lad'~_I~a~~acaaaatcaac~=tcatca~'«ttct~caagg l~'acgaattt CIICIICCII~'aal~_llglgaltgctcla~c~'a~llcga~Ill~IC~=CII!'I~=clc~cctttaata~=aacat icccctaa aC«Illgaltgla Cl~gtcg;latctCtaCaaaaCttagaCCC'>ItCat~auaC
(lCaatCIICItCllall:lCCaaaa ~~atta~aIiTIIIIa~_~=talltlCtl:lBClaal~al~aaCltII~FICIICC;~~CCIIIICIIC~_I~=~Iiaa allflCaC;Bac:
ICICC~~aCglCgaggllll~;Cg:l~'_tCaCCtCltlglClgIagClCa~Caa'=:lti'CCICIf'~alal(:g'~
Ial:a~lClllf ~tagtataattatttactctat_gatgttgtcgtatgcctt«care"caacc:at~=tt~'tca~;a~;at~_~=ttgt clcgatc ttctcttttccaagagafgglgtaatgccccttacc«~aclatgt~T~_ccggac~=laca~_tctatcga~'lagaat gcgag atcaa~;ccaaaaaaat~=caagatcag«cgaccattta~tgtggcaa~lcat~_~_gtaagaaocattgaattg~atc aagt ccatggolaaaggaatgactcaccatgagacggalcaag«tagctacaaloc~acaaatca«agaag~tcgaggggaa tccatttggaca~tga~atatgtgtatgggtcgtgaalggtcgc~=a~Tatt~alc~~_a~aatocg~~ataagt_acg atagc tggcttactc~~_tlaatlatggcatggaacgtgttatagtcctaaggaaaccttaa«tgcgtcatggaccctcaaga~
a cattggaacggaatgtgccgtattaatsttcttgcggtct~gt~aca~atcota~=~tttccccg~aatgaa~aaatgg ac ggccaggatlgaaccgtgcctcatgtttatccgtaaagsctcaagcttatccatatttt~gtcatccttatctttttag g gtcgtgcttatccattlttggutgatgtttatccgaaaaatggaaggaaccaatcg~~ag~ccgattagtgtcttggag cg gccgttggtcgacatttltcacgttttgaltaatccaggacacclcclgc«ecc«ggcc gacctaatcaccgaccltg gaccatataaataccccctccgcccatttgattccttagacctaaaaatcgcctagaagagctaagttcacgggagaaa a caltcggaagccaaaccgctcagaaatccaagtgagtc«gaettac~'tgaag«lgcacgtgaataggatagtttcltc a ttaatttccttaatatagcttaggaggttaattatgaltac~ta~cttaattattyc:«aalccttgattaggatagtt aatttggttlattaattgttaattaggaata~tgatcattaattacaatta~tatttcctaa~~atoatta~=ggttag tl aaaccatgatta~~~~actaaccgctatta~taataatlaa~t~t~=aua~=tt~l«~atta~tagagtttlgcatata g gatgaatgcatgatagtatgtgagtctg«gtcatgtlag«aotc«taatta~at~acta~catta~cataaccatga ggaICCCg lCCgllllglt gICC gllaIgoCHIa(: gaaggCICg:lyuCl~,BaII~~C IaCC
~aCCgggC~'CIaCBC ~aCga1 gta~tataaaggaatgClalgiCICIlalagaClllglgtClaaga~l:ll~~a~lCal';ICCtIgggaYalCBlgga llC
~tacaaa~ccitaCCta~~It~ataQaa_QCatL'Ctt2catta>~_TtlatCe_tt_~_aatataa_~~aaattC~_a =ICIaCaI~C
allCl~Cal~!glgCatlataltltgCltglgllgtalgtlgCgl~ttaICIIlgfllat~'ICII~CII~IIaggla~
Ial tattgttacgtgttgttgctgcttaggggotaaggaag~~ta«aggataacattgtctagatccggggttcactaagta atactagattacttatgacgtlacttatctttttcaggaaatcttagttgaccaacctlgta~a~ggtgctgatgactt a ggacgaactaatgtag~ttttactaagtatattat~tatgtttuttgtaaattcttactaa~tatattat~tatggtll tcaaaga~gatccggaaactatatacgtttt~gtaaccautctc:gtatgatatatatatatatataaeactcg~attt c taaaatttatal~gttatact~actgatatta~tcttgttcgtactatcacaacmac~"cc~c~atagg~«
gaaaagc cttaggtcacgggcgg~aggctagactcttgcg~ct~aaceactcagec~ aac ~aattcatttt~=~aaacctggattg~_c cgactgcgg gcgtctaacgt~~acactcctatttc gt«atatttatata~_c ~stata~, ~'ggtcttaca~=at ggtgtctcc acattattg«aaatg~glllaatccaagtggtttgatctcttlgtt~ccc«taactlt~y~lttccfo~aagggacagt aaCaallggCat:IglCggCttlCIllIIC«gCgaa~t~3a~C;I:I:IICIICII
~ICIIC'l:lCtlllalCCLIlllgallgC
lCggaataaagCaggtaaactatggaaaaaactcttaaaltccatc~'a:llg:l;lc:lc;a««tceaaga~_ttaga acllgg a~aaalctcttclC~at~~aalctaalttcel~aa~aoCaaaClCaaC;ICII:I~_alll:l':1( Caalt~'aaattcacatctc catoaa~~tQaataacatt'_eaaa~tt~t~stata~catc~aa~tt~mlltuc aaaclt_'aatctc«ctt'_tteaca aat_'tcecatt«~actccn'_aautt~a~l~al:.Tatataaatal~,~actc:ca:lacll<_'Ia_'eaat~attt atcaatca c~ttaaetgataat~ct~tataattctcctccuaaga~~cttaactn~_c«talattat«~=~aagat~_cccccacct ~
ttt~gtcttagacatctatgcttttcgtcataacattttgtgtaccuaagat~~a~'acattcct~ttaag~agtaaac ~
t~tacaccata_tt~aacttatc«t~at~cttcatacatsatcteamcaaatU«tt~aeccaatateatcttotcct gleatg~ctgtaattgtt~aagtcattcgctaatgatttccgaa~aaac,'c cat~aaaacttctclgtatcattgtgtca IggCllgaICgatlgaotlaalctta~la~aC(CeaCCll~CllClaCCla; ac lf~l~_CtCCI:ICIIIg~l:lgglgaC:l:l ovllvllgIIlC~c:CaCC:IC~a~CI~la~lc:ll~Il~ao~lc c a~=a~llClll<_!_fi'aaiTlll~l~ta~_Ctaatggg8a atctcggaacatcalgagacg«gatggcctctttttttttttgttt~'ctutac ~a~tmt~'cltcegagaaaaa'~t YaaIC~CC:IC(:~(:llltlaCl~'~2l:Illlg3~lCCaIYBCga~~ll:laalC~;II:II;II:I:I;ILIC:Wl 1( ~aaalga~cacaltc tlccg~tgetcugacatcaaaa~catcinga~=ggatgaaaamuy'ca:l1«I~'a;y';I~maaccmaacacttagat ttc accaaaat~aatataattttcaa~_aa~t~~aacu~~aaa:tatcmt~ac _~ateaa;llcla_'tcal'~teaa~c~coaac Icaacacttaea«tcacc~aaal~aalala«t«ccaa~a~~maaxlle_'aa;l.ale«t~ae~~al~aaatata~
tctt~t~aa_'a_'c_~aactcaacaclta_~Tatttc~ctaaaaleaatacat«tttc:aa_';1~__~I~_aaactW
'~aaaaatccl tga'1~_'atgaaalcla~'felt~l~aa~a~c~aacicaacactt;l~=:otIC;IW
I;l:l;tt~;l;It:ll:llltlll:C~:l~'C'g~ln _~'aaClt~_Y:laaaClIIIIT_a~_~_~_al~_aaatrll~,lCIfIl~aala_'c:aaaeit;J:lC;lc.'11:1 1';IIIICaCC:laaat~aala laIIIIICCaa'_a 2~l _~'aallf ~aaaaaatcctt~aa~~af ~aaatCla~=iC'If ~'I'_;l:l~;lYC
~a~'Ctaaatoaatata ,y WO 00/5~32> PCT/US00/07392 ttt«ccaaga«gt~aaacttggaaatatcctt~a~~_gatataatctaotc«;_c~~aa~~tal_'aaetcaacactta ~att tcacl~aaatgaatacat«itccaaga~'gt~!~aacttg~aaaaalccu~a,'~_aal~;taatctaatctt~t~aag a~r~, aaclcaaeactt~~autcaccgaaatyattatatttttcaaaga~et~~aamt~'"aaaaatcctl~a~~a~_atgaaa tt tagtcctgtgaaga~'cya~_ctte:tcattta~=a;ttcaccgaaataaatala««
maaa:oag~t~~aactcgaaaaaat cell''a:lggatgaaatettgtctt~~t_aa~_a~=caaactcaacacttaat«ea~aaaaay'aatatatttttcca a''agg tayaacttg:la:l;la:uc~_tt~'a~'g~;atgaaatctagtctt~'tgaaga~y'aaelea~cm«
ayatttcmcaaa;ueaa tatatttttccaaga~';_tgg~=acica_aaaatgtttt~~agggatga:talclt~'lclt~'laaae;lac,'a~' ctcaacactta tctttcaccgtaatgaatatatttttttccga~_aggtggaactt~!gaaaa;mc«~=a~~~'galgaaatctaatc«~
~tg'a agllaa~a~'Ct(:a:lC;lcttagatttCaecaaaat~aataCattttlt:aaaaa ~_ ~=t ~~ ~'aac t t Y=a:laaatactt~ ai'°~L'al ~aaatctactctt=:1~'aagagc«aactcaacactgg=~atticacc~2laal~'c;:lt:ltattt«tcaa~~aggt tgaacttgg aaaaatcc«gaggy~t~,aaat«a~tc«~=t~at~agcaagttcaacactla«Imeac~gaaataaatatatttttc cs~a~=aggtasaactttgaaaaatacttgaggoat~aaatctagtc«gtgaaga_c:aaa«caacatttaaatttcac cg aaatgaatatatttttccaa~aggtggaacttggaaaaatctttgagggal~aaatcta~tctt~~t~aatagcgaact ca acactaacatttcacaaaaatgaatacattttttcaagaggtggaacttg~~aaaaatccttgagggatgaaatctaat ct tgtgaaaagcgaactcaacacttggatttcaccgaaatgattatatitalctaaga' gtggaactaggaaaaatccttga ~gggtgaaatctagtcttgtgataagcgagttcaacacttagatttcaccoaaat~'aatatat«ttccga~a~gtgga a cttggataaatccttgacggatcaaatctagttttgtgaagagctaactcaacatlta~atttcacc~aaat~aatata t t«tccaagaggt~gaacttg~aaaaatctttgaocgatgaaatctagtctt~tgaaaa~=c~a~ctcaacacttaaatt t cgctgaaatgataatattttttcaaga~gt~_aaacttgeaaaaattcttgagVeat~aaatatagtcttgtgaagagc ga actcaacacttaaattttaccgaaatg~atatatttttccgagaggtogaacttggaaaaatcc«gagagatgaaacct agttttgt~aagagcaaactcaacacttagatttcaacaaaatgaatatatttltc:c:aa~a~'~~tegaacttg~aa aaatc cttgagggatgaaatctggtctggtgaagagcgaacacaacgcttaaattmalcaaaat_aatalatt«tccaaga~_~
=
tggaac«ggaaaaatccttgacggatgaaatctaglc«gtgaagcgcaaacmaacartlagatt«tccaaaattaa tatatttutcaa~aaat~~aactt~_~aaaaatctitaa~_'_at~_'caalcta~
lcll_'l<_aaaa~ceaartcaacactta oatttcacc~aaalgaatatatttttccaagag~t~gaacttnaaaaact«tt~a~~~=al~aaatrtt~tcttgtgaa t agcaaactcaacatttagatttaaccaaaaatgaatatatttttccaa~a~gtg~aamtga;la:laalccttgaagga l~_ aaatcta~tcatgtgaaga~cgagctaaatgaatttatttutcgagag~l~ga:tcll~'a;laatatcc«~agg~atg aa atctagttttctgaagagc~aactcaacacttagatttc~ctgaaatgaatatat«t«aa~~=a~~t~'aaacttggaa a aatc«tgagg~atgaaatctagtcttgtgaagagcgaacgcaacacttagat«caccgaaatgaatalattttccaao aggtgaaacttggaaaaattctteag~gatgaaatctagtcttgtgaa~a~c«aactcaacactta~atttcaccgaaa t gatlatattttt«ccgagag_tggaacttggaaaaatcatt~ac~oat~aaatcta~tctt~tgaagtgcgaactcaal acttagatttcaccgaaat«aatatatttttccaagaagggcaac«ggaa:laalccttga=ggatgaaatccagtctt g tgaagagcgaactcaacacttcoalIlCaCCgaaatgaatatattttctcaaactCllY~aCt_C~a~'actaag~aat ~a tatctccg~tttcacacaaacaaataatgagactttctgtgaugcatg~~a~c~cucaa~~~«accaoac~caat~tc ctcatcalg~tttctccaaagcttcacttclcagcactctctacagag~t~tcctlcccaa~
ata=~at~cttcttgat accgc«ctaacgggaacttcctcaacaaagacgtlgaagaa~~at~g_~agctg~la~=a~aactt~~cacagtcggat gg caactacaatgaagattacgata_aagcatcc~ccaca~ctct~attctoat<_a~aa~caccacao~~aaatgaaagc ta taaateacaaact~gacaagctactcctt~t~_caaca«aagcacattcat«tct~=~~t~al~at~'a~'ac~_ttcc aagtc ca~_gat~
gg~_atactctgtaglcagaaga~=gtcaactal''IgCagaaOC:L:I~'~;lirvlla~'aacaaaga«W
aacaac«
eaagca~=aaccatcccaatctgtcttaca~=aa~tacaaat~=ttgcaaaccc ;te:l~'~'ae~aa ~tclac:ccctctca~_ca~'r tgaataaacccaa~cccttt~=ttccatacaaccaaggtca:=~=~~tat~«celaa~'c;a~_c ~~tama~~~~caactatcc~, ccecaacttccarcacctg~gttcacaca~caocaacuaca:tcca~ctlcaacaamcaca«
c;aaacttoaa~aacat sttaca~caaatactcca~eeacaa~ca~caggg~caatggatctcyccaacaaeal~~>cayaaatccataacaag~t cg attgcactttcaacgatctgaacattaa:lcttgaggcactcacctcaaa~gtca~'atacat~=oaagoacaaactgc gtcy acctclgctcccaaagtaagaggacttccaggaaagttcatacagaaccy=aa~=~~auac~_ccacc~ctcacgctat cac catctgtcat~atc:ga~a~ll~cclallcgacat~tctccacatcaalcarc~=a~y_aca~t"alettca~gacggg aa~~
cttctactcaUattgaaatttcaglt~tt~gactc~accatttagct~
~atccc~tt«caaamca~ztccaacctagac ~agaaagca~ccatcattgaVa~'g:ltggtaaaacgaucaagccagcaccattacrmac~_lgctcuccatggaaa«
cag~aaa~'cats'g.uagaaa~_atacaattctc«~ca~a~=aa~_ca~ctt~'ate'a~at~'_aayc~~t~;y'c:
cmaata~_ aaguctcaacct~atcccg~atccteaeaaa~at~'tya~'aaattcaattct=~>aaa~~:ltcaa~=;uuatc aagattca _~~aa~algattgt~a~~ct;latccatrtaygccact~=ttaa=a~aa~tetmaagaaa:tac't~~aa~atc;cl~g aacm c:acaClaCC:llgllca:allg~=Il:lat(g~=IIIICaoCaaflgll:lal~l~alt('=~'''a'_ellC:l~l:
l<'tUCll;t;ll~lCaC
lCtlc;al~~(::la~~aagClgYaBlICallla~tacaa~cc«ycoaCel~_aClll~'alCCllC'CL~al;Wvlll l(:a8~~~
aa:ICC;CIII~=~CCI~CI:IC:la~alCl'?Cl:agtaateattaal'_'=a'il'=~Taa~'IaC'ilaC:l~:ll lli ~ll~l'=Cll~a~;ll _~:laaCCa~il:ll:l:laau~alLCWaatccta~_~aanaCCalIClla~(:C'llC~llY~'Y;t_'Ci':IIY:I
ISIY:Iti'!(:;laa':aCa a~a~_a;Ilaa~ICll'aaCCl(~°naayaCalCaa~LI;=Ca~lll~'aC:IICa:W i';laaefl YC:1;1:1~~'aCB~CI'_la';:la WO 00/5532, PCT/US00/07392 ~aaaaaatca~~_~'mcaocctcaaccttcggattcaatcaccagaccaagcacaacctctacacctgac«
gcgagatct caaaaa~~aaatct~at~'a;_caa;~aa~aaaccata~=a~aa~cta~ctcagaca~u~a~'~_aacuaa~=ac=taa acl~=~atc a~at~caa~'a!_aaa~=elcaamaaaat~y_~~~att~alactatcccoagaaaaaagtttact(caa~'at;~,~lc t~ta~~'a~' ata~attatcracca~=aa~~agaaagag~~cctatttc~=a~aaaa~aaoaattaa~,tattct~ctac~'catctet caa;aya =gatgct~aatat~=at~al~'a~'atta~_a~_a~~~actalgca~~alcclctctalracccamtcllcttaata;l ~=t~t~'a~' _'a~caamta~=a~_ac tta:tacaa;cteactt~c~_any'aattcccaa~act,'«tctataaataaaat(ttt:utttct tgtta«t«gar(t~=ttttt~;~ctgtgut~_l~=attclca~~~aaaatagaaaca~=c~'tg~'a'~ta~'a,'taa aaatttta aaatlttactctaca~_agcaaca~'~"~_atc gagcat~_tca~t~taaagaaattcaagaa«tgaaaaaag«rtg«gca atcagagaccatgagatcga~'ta~ttt~glc~~a~tatta~lt~~at=attttaaaaacecaaaatittgaaatcata tttat actc~accaaca'~aa~ctacaaa~acttaca~a~a~tttatcaa~tttaca~a_'~attaca'~aa~~~«rtaa~aca ttcc tagfcaaca~a~aaca~tgcttcaggacaaaca~aca~agtot~_gcccaccacctclracctttgitcccacatgcgt tt tta~_a~ataac~aaatcccaa_';tctacttctccaccactc~atctcaccctatcctticccacc~_acatcatctc tttlc ctctcca«cactcgacctc~'cagavacattctccgtcacgtctctcactc~accaaacatcactc~=acatctctcic ac cgcctctct«cactcgatc~aaacgcctcfcctctctattctccgcctactctacc~cca~accttcacc~ICtcaac~
_ tcgcctcctcttcacmgacctcgcc'=tcccaactlcaccalccctcaccacttcgtcaactttclcactcgaccaaaa t tcagctttcatcgctcacgccactgccttctccctctrttccactcaacgacgggacceolttcatcalctctcacc~=
ct ctc~cctcctc«cactceacc'_cac~ascacctcaacctctactcceatttctttttctcacctctccatactcaacc ~
ctactcgacctcatctccg«ccctclcttttactcgaccgccggaccggcttcaccatctctcaactatccaccgltca ctcgaccicgccattcattgcgcctccgtttatctcttcactcgaccgctcctcaaaccgccamglcttctctccattc ~cc~ucactcgaccac~catttac:cgtclctcattcgtcttcactctaccgctaaactcgaaaccataatttcactgt a ctcgaccgtaatactcgaccgtgtac«g:iccgg«lagtgt«gcatttatttggactaacatattg'ac~_t«~octtt sa~tlacattctt«tca~~=aaatcaatatgagtaactacagtggcgaatcctccatg'~at~c~~attaeaacatc~a t~
aa~ct~aatctto'_maact<_'_~acca~_as_a~~aea~'c~acaa~cttat~a'_aecttca~a'_c'_oa~accc aac~ctcaeta actt~ac~caat~a~'a~~a~a~ct~a~att,'ctagaggaaagagagcaatgaccagcagatat~_a~'tt~ategac ~atga tall~acglC~a~lal~a~cclga~tcat~~cata~agagaceaaecl~ll~aala~~CCla:IlYa;l~'ICC:Ca~l ~~auv agtaCaIC:IL'aCIIIICYa~CtYaaC~aCIICIQ~geaacoaaotacccctattatca~act«aoccca~mI~~CQC
Ia CCggaggaCgIaCaBCaCCIaIIC ~'a~'a;IglgICalCtg gagacactgatgtc«acccgIaCgIC
i'CIIilCaagaag ga aataaga~a~titctclceactct~_caa~tg~agatgtatcagg~acttaca~ca~at~a~ct~~aoa~tgaa~g~tt «~
ogttcttgactttttCC~I'_a~C'~agCa~tgttaccagctBlClelCa~za~Cll~~aa~~alfotlt~c'CIIaCC
Ca~I
ovaaagi'HaaCIaaaCCCaagIIagagagggaagagttgaaggatttglootta;IClal[~uvaaCvatalopCvCl Caa CICI~C~BC~ICC;la~ayaama~attc~aagccct~l=aICf~L'CIaCtatcaacgcictala~c~aalZItCI~L' IaCI
crag' oaalctacagg«tec ~t~_ICtaacacagacatggagalgattaattctgcactcaag~_~=cattctccgtagaaca aa~~~caaoaa~_'ICCt~aa~~~~c~acctcaat~at~caccacca~ttat~cttct~ttcatccacct'_tet~~at acao saagtg~~gc=cacaccaacgggaagaagagggcgcgaggagccctttotgtaggt~'gtgttgtsacaccaa«ctgat ta cal~Ie~l~~tacclttCaC~ICICC:I~~~~lllgalCCgavv:lltalggalilagalCnc:ll~'colc~'II~I~
a~'lllCl~
~a~cac~acatg~tt~_~=cgatlmatcgctacaaa«tgagcaclcctt~accc~aaca~ccaacatm~=cttccclg caccgaggccacaaccatac«ra~_~~c~~aaaacattgac«caa~cct~c~cgt~attacctctacttt~a~'a~cgc tc cacc~act~'al~acaac~tccctacaaa~aasctaeccaa~cts?a'_att~'ctaacaca'_;1t'_a_'~ala'~~
~'a~~'a~~a~~t ayatac~yal~tatcatttca~'tyaocat~tacctccaac~ag~=~a~~a~caa~=a~'ctt~a~_c~aa~_cmaca~
aaac aaca~taa~«~ma__'a_~~t~et_'caa~aaacaaoata~~ctactcatcaaet'_c«caa~~_ccalcaa'_t«ct~
aca~a caagctaagCl'=CICCII:II(:laW
1C;1'=CgafIlCgCagggBgagCCICCICa~?gacat~CCCIC~a~'Ea~alal'=aCg cgcca~;aeccaactc~CCaCa~~CCI~a~CCBa~IC:I(:CaCaIgCCI~a~CCIa~I~aCC'la~la~lCICaCaa~
ICCCI
2C gtggCatICalCatlC gagCC lC Yg ~a'=l:lC~~gagaaaeaa~ila~gC I~~I:;IC lC ~CIC
~~lC I'L' ~C;1~'C a~~a~laC
aC~aCIICICCa~ICCC~IagCltaCgCgalCgC~glgCl~;°CC~Ca~CB~aa:laa~a~aY;IIC~':IY
I:IIC:IIC:I~ag('g _~IoCI~~CC'=C''?aWaagga''C;t''a~~It~,aolaCCCCCag~oggaagC;lgagaC;ICa;ll:a~:1'':t '';IIICIIC~BI~CCC
Igggagcattcaca~=~C~_~Cl~fl~af~_aCCaalICCgCICCIlCIICCaCI~ag~lilaLCaCCICaC:ICCa(C
allgla atataccatctcrl'_111Ita1IlIVIIICI_~'l~al_~IoIIII~IC;Cf_pa'_laClClC«CCe:lalll~~lC
;iCaCa~I~' ~aCl~l~l~atuaa~t«~~o~~~=~_a~_~~'Clca~oaagtgl~l~IIQCoIl~lalalaalCll~'e~lCl,_C:II
IC:IICIaa ~_'C;lla~Taa;laaaCCaaaaaaa;laal(!_aaaaatlll::l~aaaalgatticacaaaaaml~~;l~'1,'tlC
alata~ll~Call oCallla~oalC~a~ICIa~a'_l~_lllCalIIai'~alCall~_catat~cata_QQo_'aCaal~al~aL'al:l' _cCtIQia:l~
l:ltlll!',!_~IIC:ICCaVataaaW
.1'_I~cCClc~lt~lla~ll~icl~al~cata~lcaaf~:laall_'a:l~laaaaCIaC~
tlCCal~CCla~alt~Cl(aaCIC'_ayaCaC'=aCf'_lla'=~alCl~ataccauccctalC;ntlll~':laWl_a etCl~'a tttaeaattatcat~tctt~~calceaatu_'aactcat~~ataccctaaaatactt~~at«tcatacicallllaaca actctt~tlaatccaa~laect~_ammotattata~ca~(taacccaaacccaaacctaaaetitr(t(caa~_ccct :llalC:IC(l~l~':l~lYlll~'l!_;t~'~_ICII;IIIIC~=atl~a~Clt~~?fa8aaa~l~lla~=~'IlC
ila;lC "JCa;=a~=ataL'I
~ICllal~'la~IlCI;IYIIC~C~IIIIIC
~_T;.IC:le~ata~~acta~~t'_~_~C_'Clla~alC;tl~'_'_IILS','_al'_I~Ilta WO 00/55325 PCT/US00/()7392 aaa~=aaaa~~~Tgt~aa«catt~;u;_Tataa~~aaag~~aaa~aattctag~_~~aa~'tuayctaaaaaa~=ttag aaaaaaaa tcta~_taaa~,Ttttt~~~a;u~ttaaa=a:laa~aat~=ao~ttcu~tta~ttaaa~aagaa~~'~=ttaaaa~_cc tttt~tt taaaaaattaaaaaca~~aaccna~tt~'uaaa~aaatccaaatcc~cta~atgtatcaaa~t~ll~'a~Taaa~c«ct c cta~=a~ttaa;;Ta~Taaaa~_aaaa;=aat~'atatyaaaaa~=a~'ttt;aaagatlcatgagt~'caaa~y=gta ~a~=ttaa~ttct I'_t:Ilt~ooaCta~a~lt'=~'~;u;l;teC:lll;lta;W
llCall'_llal:ICfat~L'~=l:l!_al~'''~':llClfalClClYlal~Ca Ia~3Cll°°°aCllaCClll;!'~Cattctactaaa~'ctcaatcattclt~'a~aoa llCC:CIL'il:lc'llaa'~CCI:IIICt~Ia a~=~;~taccatlatt~tctctt~=accttcacctta~ccaaat~a~ttca«~_at~~;lt~catt~Tc«eattcacgt trta~aa ctaat~aat~«aaa~~«a«get,eatttgaaaecatstatag~Ttcgagtataa~Taa'acaoatt~att~'ataacaa ~~c at~ectaac~itt«ea~taaaat«aatc;ltatc~catctta~aactaccaactt~_~acatt~attttattt~ctcta c ctgat~c«t~~~ttct~a~tceccaccltcaaacctctcc«caactatgtc« chat«~cug:l~'V~caa~caaa_a ctaa~tttt~~~T~a~ttaatat~tctalaal«~catgttttcagt~tccattcatcatcatttc~a~~tcaa~_tttt ota tcattcatcact~ttttatatca«tctcatcattctlocatactttgcat~atta~gataocut~~catacatatt~ca tttct~a~tt_ttttca~~t~at«~~a~ctgttt~cgaecaaatta~aagaaac_agcca~~aaccagaa~Tatataet c~
accaccatgtcgtgt~catccaac~ccatactc~accccctgotcgagtgact«ogagccattcttcccatctactgga ccccca~atcgagtaacctca~ctcag~ccactc~at~acgccactcgaccccct~~lc~aotalcac«cgccaaacca cctgaccacactcggccg«cactctaccace«actcgacca~toggtcgaetatcatcactcaccaccaacaccacta ctcgaccgggcactcgatcacatcttctlagtctactcaaatccocactcaaccaoacaa~_c~~agcacaa~~aagag aa _a~gagaagaceaagt~tttg~aaec''gcctogacctccatcegatcacgaaocccatctc~gcccattatcactcta tg ggtcgggcgateaggnattggcctetctcctatcattttatttcettttgcataaatagatgtcttt~gY«ctetcct gagacatctaglcgaca«eEg«t«gc«cag««attttclgt«tacictggctgcgccgc««Yc«ctacaa CCtttaattc~aYatttttccaa~uattea~attccoCattlQatttcattt~ttatctt~tttctctaCICt«atCt ctttacttatccaatattatcatttatct~_c~=ltlat~tctttga~catattgtct~a~tagtgac«a~,aucttaa ~
gatg~gatagagtagttotg~aatceelagtctatagaat~~ttaagtttta~aatt~att~_aatccctica~~acta ~l lgt~tttactgcttatttctttctoatcaact~=oaattc~alcccaa~cattccc~caacca'=aa~=gt~_tlc~at ~~aat ~cttYatccacta~ticct~agatattcgtctctatcccaa~g~att~gccgtttaga~c~tttattgactttatca~t c I~ItCltaIt~CCI~L'Cala~ffa~BllCeoll;tet~ovaatla~lCl~iT~oClagl:l(lgCIIY:I~~TaIIIC
I:IICaCo~~
agt_aatt~atctatt~tt~=aacat~«otctaggYataec«~att~cecttgttaaccatly~ata~gctaggatac cactttaltcoattaccccatcc«ag~=aat«ctc:~tctat«~atttctct~ttttatcgttatc~=c«aaltgttct catt~~catgtctcttc~ttttt~caattactcyatcaacttactc~accacactca~t~tct~~:caaca~actatgc agt cgagtacagtt~tcatatttcctetct~=ttactc~accatatcttacttctggcatca~cct~tt~togttga~_t~a ttt tagtaattct~ttau~«lctgtt« c;tec:at~ttc~ctta~gactgttagaaaccccaaaact ~ttattgcttv~ctt ~acttagt~acttctgatcacatctcatctgt«gcatcacacctattt~gattgacaacctaaaaulctacaac~acat =att=~tgtttttg~ataattgactaaaaacc ta«atcacttaacactta~atticaccgaaataaaalatit«ccaa ~atgtsaaacll~~aaac««t~a~=;;~ateaantc«gtctt~tsaatagcaaacacaacatttagattlcargaaaat saatatatttttccaagaggt~_taacu~aaaaaatccttgas~~at~aaatcta~ic«~'tgat~a_cgaacicaaca c «agatttcaccaaaat~aatalatttttcraa~a~~t~~gartc~aaaaaattctt~=ac~~atgaaatc«ytc«gt~
aata~caaacacaacattta~at«cac~=aaaat~aatatatttttccaa~a~ototaacttgaaaaaatcc«ga~gga teaaatctagtc«gtgaaga~ceaactc:aacacata~atttcaccaaaat~aatatattt«ccaa~ap~t«~oaclco gaaaaattcttgac~~at~laatcll~ll lli'lYa:lca~C~a~ctcaacacttatatttc;llc''taat~attalattttt ttcga~a~gt_eaacitgc:aaaaatcc«
~at~gatgaaatcta~tctt~t~aa~_a~cgaactcaacactta~attttgc tgaaat~aatata««tcaaaaa~~t~eaac«goaaaaatcclloa~odaloaaatcaa~lcltgt~aa~~a~~c~aacl caacacatggattmaec~aaat~_attatal««lcaa~ag~t~_~aac«~yaaaactculyag~g~l~'aaama~_t cttetgat~atcaa~ttaaacactta~atttcatc~aaataaatatatt«tcc~a~=a~=~t~~aac«t~aaaaatcc«

~agggatgaaatcta~tcit~t~~aa~a~c~=a~ctcaacatttaaat«caccaaaao~aatatattttitl:aa~ao~
tg~
aagtlggaaaaatctltga~=~~at~aaama«tctcatsaata~c~aattraacactta~at«cacy~aaala~atat atttttcc~a~ag~tgoaactt~aaaaaalcctt«a~=gett~aaatctaotctt~_c~aa~'a~_c~aactcaacact ta~at ttcaccaaaatgaaatatattt«aaaa~atlt~~aacttg~aaaagtcctt~aa~gat~aaatctattttt~_toaaca ~
cga~ctcaacacttaoatttcacc~aatt~_aatataattttccaaaatgtg~aactt~~aaaaac«tt~agg~~atoa aa lCla~lCll~l~aa~a~~aa~lClc'aalW
lla~:IIIIC;ICC~a;l;llYaalalalllllltl~~';l~'a~'~I~GTaacll=~aa aa:llCltloa~~'o~lCa;lalc:la~=(Cll~lu;l;tll~c ~aaClc:aal::lCllaalalllCalc ~>:laal,~aal;IlalllllCt v_a~=ao_~lvvaaC ll~~:laaac:lllll ~a~,'~:11~:1;lalC tC' YIlllYl ~a:tla'_TC:la;1(:lC;l:l(:a(:Il;l=;llllC;l(:Caa aal~aalal;llllllc;Caa~a~~_( ~~a:lC ll~ ~;I~aa;llCC It ~:1~
vqa[vaaalCla~lClt'=l~;l;l:lO~c' ~aac;lCaa catttaoatllcacyaaal~Taalatatt111ccaa~aaal~caactl~gaaaaatcctt~a~'~_aat~'aaatcta~
tctt al~BflY:l~(: i'aalICaaC'aC Il__ all IO:IC'l ~:1:1:1lYaalal:lf tIIICCa:laa~
~I'=la:ll tl ~~a:la:lalCCII=aa Yawl!2C;laicla~ltll~l~:lad'any':1~'W~:l:lc;;lt;lla~alllC:lc:c'~;ta;Il~;l;ll;ll:
lIIIILW :I;I~T:IY'~l~';laC
ll~gaaaaatlCll~a,'Y=;II~:l;I:IICI:I.'tillll'~a:IYB~C~:I:lCI1:1:11;:lCll:lC':Ill ll':1C(:aa:l;ll~:l:llalall ~7 DEMANDES OU BREVETS VOLUMINEUX
JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME 1 OF S.
NOTE: For additional volumes, please contact the Canadian Patent Office.

Claims (228)

WHAT IS CLAIMED IS:
1. A recombinant DNA construct comprising a plant centromere.
2. The recombinant DNA construct of claim 1, which additionally comprises a telomere.
3. The recombinant DNA construct of claim 2, wherein the telomere is a plant telomere.
4. The recombinant DNA construct of claim 3, wherein the plant telomere is an Arabidopsis thaliana telomere.
5. The recombinant DNA construct of claim 2, wherein the telomere is a yeast telomere.
6. The recombinant DNA construct of claim 1, which additionally comprises an autonomous replicating sequence (ARS).
7. The recombinant DNA construct of claim 6, wherein said ARS is a plant ARS.
8. The recombinant DNA construct of claim 6, wherein said plant ARS is an Arabidopsis thaliana ARS.
9. The recombinant DNA construct of claim 1, which additionally comprises a structural gene.
10. The recombinant DNA construct of claim 9, wherein the structural gene comprises a selectable or screenable marker gene.
11. The recombinant DNA construct of claim 9, which additionally comprises a second structural gene.
12. The recombinant DNA construct of claim 9, wherein said structural gene is selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene and a seed storage gene.
13. The recombinant DNA construct of claim 12, wherein said construct is capable of expressing said structural gene.
14. The recombinant DNA construct of claim 13, wherein said construct is capable of expressing said structural gene in a prokaryote.
15. The recombinant DNA construct of claim 13, wherein said construct is capable of expressing said structural gene in a eukaryote.
16. The recombinant DNA construct of claim 15 wherein said eukaryote is a higher eukaryote.
17. The recombinant DNA construct of claim 16, wherein said higher eukaryote is a plant.
18. The recombinant DNA construct of claim 9 wherein said structural gene is selected from the group consisting of a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene.
19. The recombinant DNA construct of claim 18, wherein said construct is capable of expressing said structural gene.
20. The recombinant DNA construct of claim 19, wherein said construct is capable of expressing said structural gene in a prokaryote.
21. The recombinant DNA construct of claim 19, wherein said construct is capable of expressing said structural gene in a eukaryote.
22. The recombinant DNA construct of claim 21, wherein said eukaryote is a higher eukaryote.
23. The recombinant DNA construct of claim 22, wherein said higher eukaryote is a plant.
24. The recombinant DNA construct of claim 1, further defined as a plasmid.
25. The recombinant DNA construct of claim 24, wherein the plasmid comprises an origin of replication.
26. The recombinant DNA construct of claim 25, wherein the origin of replication functions in bacteria.
27. The recombinant DNA construct of claim 26, wherein the origin of replication functions in E. coli.
28. The recombinant DNA construct of claim 26, wherein the origin of replication functions in Agrobacterium.
29. The recombinant DNA construct of claim 25, wherein the origin of replication functions in plants.
30. The recombinant DNA construct of claim 25, wherein the origin of replication functions in yeast.
31. The recombinant DNA construct of claim 30, wherein said yeast is S. cerevisiae.
32. The recombinant DNA construct of claim 24, wherein the plasmid comprises a selection marker.
33. The recombinant DNA construct of claim 32, wherein the selection marker functions in bacteria.
34. The recombinant DNA construct of claim 32, wherein the selection marker functions in E. coli.
35. The recombinant DNA construct of claim 32, wherein the selection marker functions in Agrobacterium.
36. The recombinant DNA construct of claim 32, wherein the selection marker functions in plants.
37. The recombinant DNA construct of claim 32, wherein the selection marker functions in yeast.
38. The recombinant DNA construct of claim 37, wherein said yeast is S. cerevisiae.
39. The recombinant DNA construct of claim 1, which is capable of being maintained as a chromosome, wherein said chromosome is transmitted in dividing cells.
40. The recombinant DNA construct of claim 1, wherein said plant centromere is an Arabidopsis thaliana centromere.
41. The recombinant DNA construct of claim 40, wherein said plant centromere is an Arabidopsis thaliana chromosome I centromere.
42. The recombinant DNA construct of claim 41, wherein said centromere is flanked by the genetic markers T22C23-T7 and T3P8-SP6.
43. The recombinant DNA construct of claim 42, wherein the centromere is further defined as flanked by the genetic markers T22C23-T7 and T5D18, T22C23-T7 and T3L4, T5D18 and T3P8-SP6, T5D18 and T3L4, and T3L4 and T3P8-SP6.
44. The recombinant DNA construct of claim 40, wherein said plant centromere comprises an Arabidopsis thaliana chromosome 2 centromere.
45. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 100 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
46. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 500 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
47. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 1,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
48. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 10,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
49. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 20,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
50. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 40,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
51. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 80,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
52. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 150,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
53. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 300,000 to about 611,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:209.
54. The recombinant DNA construct of claim 44, wherein said centromere comprises the nucleic acid sequence of SEQ ID NO:209.
55. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 100 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
56. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 500 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
57. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 1,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
58. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 5,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
59. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 10,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
60. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 20,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
61. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 30,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
62. The recombinant DNA construct of claim 44, wherein said centromere comprises from about 40,000 to about 50,959 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:210.
63. The recombinant DNA construct of claim 44, wherein said centromere comprises the nucleic acid sequence of SEQ ID NO:210.
64. The recombinant DNA construct of claim 40, wherein said plant centromere is an Arabidopsis thaliana chromosome 3 centromere.
65. The recombinant DNA construct of claim 62, wherein the centromere is further defined as flanked by the genetic markers T9G9-SP6 and T5M14-SP6.
66. The recombinant DNA construct of claim 65, wherein the centromere is still further defined as flanked by a pair of genetic markers selected from the group consisting of T9G9-SP6 and T14H20, T9G9-SP6 and T7K14, T9G9-SP6 and T21P20, T14H20 and T7K14, T14H20 and T21P20, T14H20 and T5M14-SP6, T7K14 and T5M14-SP6, T7K14 and T21P20, and T21P20 and T5M14-SP6.
67. The recombinant DNA construct of claim 40, wherein said plant centromere is an Arabidopsis thaliana chromosome 4 centromere.
68. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 100 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
69. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 500 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
70. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 5,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
71. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 5,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
72. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 10,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
73. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 50,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
74. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 100,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
75. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 200,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
76. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 400,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
77. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 800,000 to about 1,082,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:211.
78. The recombinant DNA construct of claim 67, wherein said centromere comprises the nucleic acid sequence of SEQ ID NO:211.
79. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 100 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
80. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 500 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
81. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 1,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
82. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 5,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
83. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 10,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
84. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 30,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
85. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 50,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
86. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 80,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
87. The recombinant DNA construct of claim 67, wherein said centromere comprises from about 120,000 to about 163,317 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:212.
88. The recombinant DNA construct of claim 67, wherein said centromere comprises the nucleic acid sequence of SEQ ID NO:212.
89. The recombinant DNA construct of claim 40, wherein said plant centromere is an Arabidopsis thaliana chromosome 5 centromere.
90. The recombinant DNA construct of claim 89, wherein said centromere is flanked by the genetic markers F13K20-T7 and CUE1.
91. The recombinant DNA construct of claim 90, wherein said centromere is flanked by a pair of genetic markers selected from the group consisting of F13K20-T7 and T18M4, F13K20-T7 and T18F2, F13K20-T7 and T24120, T18M4 and T18F2, T18M4 and T24120, T18M4 and CUE1, T18F2 and T24120, T18F2 and CUE1, and T24120 and CUE1.
92. The recombinant DNA construct of claim 1, comprising n copies of a repeated nucleotide sequence, wherein n is at least 2.
93. The recombinant DNA construct of claim 92, wherein n is from about 5 to about 100,000.
94. The recombinant DNA construct of claim 92, wherein n is from about 10 to about 80,000.
95. The recombinant DNA construct of claim 92, wherein n is from about 25 to about 60,000.
96. The recombinant DNA construct of claim 92, wherein n is from about 100 to about 50,000.
97. The recombinant DNA construct of claim 92, wherein n is from about 200 to about 40,000.
98. The recombinant DNA construct of claim 92, wherein n is from about 400 to about 30,000.
99. The recombinant DNA construct of claim 92, wherein n is from about 1,000 to about 30,000.
100. The recombinant DNA construct of claim 92, wherein n is from about 5,000 to about 20,000.
101. The recombinant DNA construct of claim 92, wherein n is from about 10,000 to about 15,000.
102. The recombinant DNA construct of claim 92, wherein said repeated nucleotide sequence is isolatable from the nucleic acid sequence given by SEQ ID NO:184, SEQ ID NO:185, SEQ ID NO:186, SEQ ID NO:187, SEQ ID NO:188, SEQ ID NO:189, SEQ ID NO:190, SEQ ID NO:191, SEQ ID NO:192, SEQ ID NO:193, SEQ ID NO:194, SEQ ID NO:195, SEQ ID NO:196, SEQ ID NO:197, SEQ ID NO:198, SEQ ID NO:199, SEQ ID NO:200, SEQ ID NO:201, SEQ ID NO:202, SEQ ID NO:203, SEQ ID NO:204, SEQ ID NO:205, SEQ ID NO:206, SEQ ID NO:207, SEQ ID NO:208, SEQ ID NO:209, SEQ ID NO:210, SEQ ID NO:211 or SEQ ID NO:212.
103. A minichromosome vector comprising a plant centromere and a telomere sequence.
104. The minichromosome vector of claim 103, comprising an autonomous replicating sequence.
105. The minichromosome vector of claim 103, comprising a second telomere sequence.
106. The minichromosome vector of claim 103, comprising a structural gene.
107. The minichromosome vector of claim 103, further defined as comprising a second structural gene.
108. The minichromosome vector of claim 103, further defined as comprising a nucleic acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21.
109. A cell transformed with a recombinant DNA construct comprising a plant centromere.
110. The cell of claim 109, wherein said cell is a prokaryotic cell.
111. The cell of claim 109, wherein said cell is a eukaryotic cell.
112. The cell of claim 111, wherein said cell is a yeast cell.
113. The cell of claim 109, wherein said cell is a higher eukaryotic cell.
114. The cell of claim 113, wherein said higher eukaryotic cell is a plant cell.
115. The cell of claim 114, wherein said plant cell is from a dicotyledonous plant.
116. The cell of claim 115, wherein said dicotyledonous plant is selected from the group consisting of tobacco, tomato, potato, sugar beet, pea, carrot, cauliflower, broccoli, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis.
117. The cell of claim 116, wherein said dicotyledonous plant is Arabidopsis thaliana.
118. The cell of claim 114, wherein said plant cell is from a monocotyledonous plant.
119. The cell of claim 118, wherein said monocotyledonous plant is selected from the group consisting of wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane.
120. The cell of claim 109, wherein the plant centromere is an Arabidopsis thaliana centromere.
121. The cell of claim 120, further defined as an Arabidopsis thaliana cell.
122. The cell of claim 109, wherein said recombinant DNA construct comprises a telomere.
123. The cell of claim 109, wherein said recombinant DNA construct comprises an autonomous replicating sequence (ARS).
124. The cell of claim 109, wherein said recombinant DNA construct comprises a structural gene.
125. The cell of claim 124, wherein the structural gene comprises a selectable or screenable marker gene.
126. The cell of claim 124, wherein said recombinant DNA construct comprises a second structural gene.
127. The cell of claim 124, further defined as capable of expressing said structural gene.
128. A plant comprising the cell of claim 109.
129. A method of preparing a transgenic plant cell comprising contacting a starting plant cell with a recombinant DNA construct comprising a plant centromere, whereby said starting plant cell is transformed with said recombinant DNA construct.
130. The method of claim 129, wherein said recombinant DNA construct comprises a structural gene.
131. The method of claim 130, wherein the recombinant DNA construct comprises a second structural gene.
132. The method of claim 129, wherein the plant centromere is an Arabidopsis thaliana centromere.
133. The method of claim 132, wherein said starting plant cell is an Arabidopsis thaliana cell.
134. A transgenic plant comprising a minichromosome vector, wherein said vector comprises a plant centromere and a telomere sequence.
135. The transgenic plant of claim 134, wherein said minichromosome vector comprises an autonomous replicating sequence.
136. The transgenic plant of claim 134, wherein said minichromosome vector comprises a second telomere sequence.
137. The transgenic plant of claim 134, wherein said minichromosome vector comprises a structural gene.
138. The transgenic plant of claim 137, wherein said structural gene is selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene and a seed storage gene.
139. The transgenic plant of claim 137, wherein said first exogenous structural gene is selected from the group consisting of a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene.
140. The transgenic plant of claim 134, wherein said minichromosome vector comprises a second structural gene.
141. The transgenic plant of claim 134, wherein said minichromosome vector comprises a nucleic acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, and SEQ ID NO:21.
142. The transgenic plant of claim 134, further defined as a dicotyledonous plant.
143. The transgenic plant of claim 143, wherein said dicotyledonous plant is selected from the group consisting of tobacco, tomato, potato, sugar beet, pea, carrot, cauliflower, broccoli, soybean, canola, sunflower, alfalfa, cotton and Arabidopsis.
144. The transgenic plant of claim 143, wherein the dicotyledonous plant is Arabidopsis thaliana.
145. The transgenic plant of claim 134, further defined as a monocotyledonous plant.
146. The transgenic plant of claim 145, wherein said monocotyledonous plant is selected from the group consisting of wheat, maize, rye, rice, turfgrass, oat, barley, sorghum, millet, and sugarcane.
147. A method of producing a minichromosome vector comprising:
(a) obtaining a first vector and a second vector, wherein said first vector or said second vector comprises a selectable or screenable marker, an origin of replication, a telomere, and a plant centromere, and wherein said first vector and said second vector comprises a site for site-specific recombination; and
(b) contacting said first vector with said second vector to allow site-specific recombination to occur between said site for site-specific recombination on said first vector and said site for site-specific recombination on said second vector to create a minichromosome vector comprising said selectable or screenable marker, said origin of replication, said telomere and said plant centromere.
148. The method of claim 147, wherein said contacting is done in vitro.
149. The method of claim 148, wherein said contacting is done in vivo.
150. The method of claim 149, wherein said contacting is carried out in a prokaryotic cell.
151. The method of claim 150, wherein said prokaryotic cell is an Agrobacterium cell.
152. The method of claim 150, wherein said prokaryotic cell is an E. coli cell.
153. The method of claim 149, wherein said contacting is carried out in a lower eukaryotic cell.
154. The method of claim 153, wherein said lower eukaryotic cell is a yeast cell.
155. The method of claim 149, wherein said contacting is carried out in a higher eukaryotic cell.
156. The method of claim 155, wherein said higher eukaryotic cell is a plant cell.
157. The method of claim 156, wherein said plant cell is an Arabidopsis thaliana cell.
158. The method of claim 147, wherein said contacting is done in the presence of a recombinase.
159. The method of claim 158, wherein said recombinase is selected from the group consisting of Cre, Flp, Gin, Pin, Sre, pinD, Int-B13, and R.
160. The method of claim 147, wherein said first vector or said second vector comprises border sequences for Agrobacterium-mediated transformation.
161. The method of claim 147, wherein said plant centromere is an Arabidopsis thaliana centromere.
162. The method of claim 147, wherein said telomere is a plant telomere.
163. The method of claim 147, wherein said plant selectable or screenable marker is selected from the group consisting of GFP, GUS, BAR, PAT, HPT or NPTII.
164. A method of screening a candidate centromere sequence for plant centromere activity, said method comprising the steps of:
(a) obtaining an isolated nucleic acid sequence comprising a candidate centromere sequence;
(b) integratively transforming plant cells with said isolated nucleic acid; and
(c) screening for centromere activity of said candidate centromere sequence.
165. The method of claim 164, wherein said screening comprises observing a phenotypic effect present in the integratively transformed plant cells or plants comprising said plant cells, wherein said phenotypic effect is absent in a control plant cell not integratively transformed with said isolated nucleic acid sequence, or a plant comprising said control plant cell.
166. The method of claim 165, wherein said phenotypic effect is selected from the group consisting of: reduced viability, reduced efficiency of said transforming, genetic instability in the integratively transformed nucleic acid, aberrant plant sectors, increased ploidy, aneuploidy, and increased integrative transformation in distal or centromeric chromosome regions.
167. The method of claim 164, wherein said isolated nucleic acid sequence comprises a bacterial artificial chromosome.
168. The method of claim 167, wherein said bacterial artificial chromosome is further defined as a binary bacterial artificial chromosome.
169. The method of claim 164, wherein said integratively transforming comprises use of Agrobacterium-mediated transformation.
170. The method of claim 164, wherein said control plant cell has been integratively transformed with a nucleic acid sequence other than a candidate centromere sequence.
171. A recombinant DNA construct comprising an Arabidopsis polyubiquitin 11 promoter, wherein said promoter comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
172. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 75 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
173. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 125 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
174. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 200 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
175. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 400 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
176. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 800 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
177. The recombinant DNA construct of claim 171, wherein said promoter comprises from about 1,000 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:180.
178. The recombinant DNA construct of claim 171, wherein said promoter comprises the nucleic acid sequence of SEQ ID NO:180.
179. The recombinant DNA construct of claim 171, further comprising an enhancer.
180. The recombinant DNA construct of claim 171, further comprising a telomere sequence.
181. The recombinant DNA construct of claim 171, further comprising a plant centromere sequence.
182. The recombinant DNA construct of claim 171, further comprising an ARS.
183. The recombinant DNA construct of claim 171, wherein said promoter is operably linked to a structural gene.
184. The recombinant DNA construct of claim 183, wherein said structural gene is selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene and a seed storage gene.
185. The recombinant DNA construct of claim 183, wherein said structural gene is selected from the group consisting of a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene.
186. A recombinant DNA construct comprising an Arabidopsis 40S ribosomal protein S16 promoter, wherein said promoter comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
187. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 75 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
188. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 125 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
189. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 200 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
190. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 400 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
191. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 800 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
192. The recombinant DNA construct of claim 186, wherein said promoter comprises from about 1,000 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:182.
193. The recombinant DNA construct of claim 186, wherein said promoter comprises the nucleic acid sequence of SEQ ID NO:182.
194. The recombinant DNA construct of claim 186, further comprising an enhancer.
195. The recombinant DNA construct of claim 186, further comprising a telomere sequence.
196. The recombinant DNA construct of claim 186, further comprising a plant centromere sequence.
197. The recombinant DNA construct of claim 186, further comprising an ARS.
198. The recombinant DNA construct of claim 186, wherein said promoter is operably linked to a structural gene.
199. The recombinant DNA construct of claim 198, wherein said structural gene is selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene and a seed storage gene.
200. The recombinant DNA construct of claim 198, wherein said structural gene is selected from the group consisting of a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene.
201. A recombinant DNA construct comprising an Arabidopsis polyubiquitin 11 3' regulatory sequence, wherein said 3' regulatory sequence comprises from about 25 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
202. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 75 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
203. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 125 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
204. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 200 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
205. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 400 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
206. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 800 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
207. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises from about 1,000 to about 2,001 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:181.
208. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence comprises the nucleic acid sequence of SEQ ID NO:181.
209. The recombinant DNA construct of claim 201, further comprising an enhancer.
210. The recombinant DNA construct of claim 201, further comprising a telomere sequence.
211. The recombinant DNA construct of claim 201, further comprising a plant centromere sequence.
212. The recombinant DNA construct of claim 201, further comprising an ARS.
213. The recombinant DNA construct of claim 201, wherein said 3' regulatory sequence is operably linked to a structural gene.
214. The recombinant DNA construct of claim 213, wherein said structural gene is selected from the group consisting of an antibiotic resistance gene, a herbicide resistance gene, a nitrogen fixation gene, a plant pathogen defense gene, a plant stress-induced gene, a toxin gene, a receptor gene, a ligand gene and a seed storage gene.
215. The recombinant DNA construct of claim 213, wherein said structural gene is selected from the group consisting of a hormone gene, an enzyme gene, an interleukin gene, a clotting factor gene, a cytokine gene, an antibody gene, and a growth factor gene.
216. A recombinant DNA construct comprising an Arabidopsis 40S ribosomal protein S16 3' regulatory sequence, wherein said 3' regulatory sequence comprises from about 25 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
217. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 75 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
218. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 125 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
219. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 200 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
220. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 400 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
221. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 800 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
222. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises from about 1,000 to about 2,000 contiguous nucleotides of the nucleic acid sequence of SEQ ID NO:183.
223. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence comprises the nucleic acid sequence of SEQ ID NO:183.
224. The recombinant DNA construct of claim 216, further comprising an enhancer.
225. The recombinant DNA construct of claim 216, further comprising a telomere sequence.
226. The recombinant DNA construct of claim 216, further comprising a plant centromere sequence.
227. The recombinant DNA construct of claim 216, further comprising an ARS.
228. The recombinant DNA construct of claim 216, wherein said 3' regulatory sequence is operably linked to a structural gene.
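
Many of the claims above recite a sequence comprising "from about X to about Y contiguous nucleotides" of a reference SEQ ID NO. As an illustration only, the following Python sketch shows one way such a lower bound could be checked computationally. The function name, the toy sequences, and the exact-match criterion are assumptions made for this example; they are not part of the claims, the sketch ignores the reverse-complement strand, and it does not model the word "about".

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if `candidate` contains at least `min_len` contiguous
    nucleotides that also occur contiguously in `reference`.

    A shared run of length >= min_len exists exactly when some window of
    length min_len is shared, so only windows of that length are compared.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Index every window of length min_len in the reference sequence.
    windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # Scan the candidate for any window present in the reference.
    return any(
        candidate[j:j + min_len] in windows
        for j in range(len(candidate) - min_len + 1)
    )


# Toy data (hypothetical, not taken from any SEQ ID NO in the document).
toy_reference = "ACGTACGTTAGC"
toy_candidate = "GGGCGTTAGGG"  # embeds the 6-nucleotide run "CGTTAG"

has_six = shares_contiguous_run(toy_candidate, toy_reference, 6)
has_seven = shares_contiguous_run(toy_candidate, toy_reference, 7)
```

For sequences on the scale of SEQ ID NO:211 (about 1,082,000 nucleotides), holding every window in memory would be costly; an alignment tool or a suffix-based index would be the more practical choice, and this sketch is meant only to make the "contiguous nucleotides" criterion concrete.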
CA002362897A 1999-03-18 2000-03-17 Plant centromeres Abandoned CA2362897A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA2904855A CA2904855A1 (en) 1999-03-18 2000-03-17 Plant centromeres

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US12521999P 1999-03-18 1999-03-18
US60/125,219 1999-03-18
US12740999P 1999-04-01 1999-04-01
US60/127,409 1999-04-01
US13477099P 1999-05-18 1999-05-18
US60/134,770 1999-05-18
US15358499P 1999-09-13 1999-09-13
US60/153,584 1999-09-13
US15460399P 1999-09-17 1999-09-17
US60/154,603 1999-09-17
US17249399P 1999-12-16 1999-12-16
US60/172,493 1999-12-16
PCT/US2000/007392 WO2000055325A2 (en) 1999-03-18 2000-03-17 Plant centromeres

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CA2904855A Division CA2904855A1 (en) 1999-03-18 2000-03-17 Plant centromeres

Publications (1)

Publication Number Publication Date
CA2362897A1 (en) 2000-09-21

Family

ID=27558033

Family Applications (2)

Application Number Title Priority Date Filing Date
CA002362897A Abandoned CA2362897A1 (en) 1999-03-18 2000-03-17 Plant centromeres
CA2904855A Abandoned CA2904855A1 (en) 1999-03-18 2000-03-17 Plant centromeres

Family Applications After (1)

Application Number Title Priority Date Filing Date
CA2904855A Abandoned CA2904855A1 (en) 1999-03-18 2000-03-17 Plant centromeres

Country Status (5)

Country Link
EP (1) EP1165792A2 (en)
JP (2) JP2004512806A (en)
BR (1) BR0009119A (en)
CA (2) CA2362897A1 (en)
WO (1) WO2000055325A2 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
BR0209745A (en) * 2001-05-30 2004-08-10 Chromos Molecular Systems Inc Artificial plant chromosomes, uses and same and method for preparing artificial plant chromosomes
MXPA03010626A (en) 2001-05-30 2004-12-06 Chromos Molecular Systems Inc Chromosome-based platforms.
AU2002953516A0 (en) * 2002-12-23 2003-01-16 Murdoch Childrens Research Institute Genetic therapy and genetic modification
EP2295586A3 (en) * 2003-06-27 2011-06-29 Chromatin, Inc. Plant centromere compositions
AU2012202836B2 (en) * 2003-06-27 2015-03-19 Chromatin, Inc. Plant centromere compositions
EP2357240A1 (en) * 2003-06-27 2011-08-17 Chromatin, Inc. Plant centromere compositions
EP1829960B1 (en) * 2004-12-08 2011-09-14 Nippon Paper Industries Co., Ltd. Method for production of plant cell having chromosome loss
US7855164B1 (en) 2005-02-22 2010-12-21 Mendel Biotechnology, Inc. Screening methods employing stress-related promoters
ES2620431T3 (en) * 2008-08-04 2017-06-28 Natera, Inc. Methods for the determination of alleles and ploidy
WO2011011693A1 (en) 2009-07-23 2011-01-27 Chromatin, Inc. Sorghum centromere sequences and minichromosomes
CN112011566B (en) 2009-08-31 2023-12-05 巴斯夫植物科学有限公司 Regulatory nucleic acid molecules for enhancing seed-specific gene expression in plants to promote enhanced polyunsaturated fatty acid synthesis
KR20120092104A (en) 2009-08-31 2012-08-20 바스프 플랜트 사이언스 컴퍼니 게엠베하 Regulatory nucleic acid molecules for enhancing constitutive gene expression in plants
WO2011023539A1 (en) 2009-08-31 2011-03-03 Basf Plant Science Company Gmbh Regulatory nucleic acid molecules for enhancing seed-specific and/or seed-preferential gene expression in plants
BR112014016774A2 (en) 2012-01-06 2020-10-27 Pioneer Hi-Bred International, Inc method to produce large seed population, self-reproducing plant and seed
CN114747478B (en) * 2021-01-15 2023-10-13 冯永德 Composite breeding method for pasture
CN113498738A (en) * 2021-07-16 2021-10-15 云南省烟草农业科学研究院 Method for creating new interspecific allopolyploid germplasm of tobacco by utilizing horizontal genome transfer
CN115623986B (en) * 2022-10-27 2023-12-15 天津农学院 Method for identifying complete embryogenesis of celery callus

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03503599A (en) * 1988-03-24 1991-08-15 ザ・ジェネラル・ホスピタル・コーポレーション artificial chromosome vector
US5270201A (en) * 1988-03-24 1993-12-14 The General Hospital Corporation Artificial chromosome vector
US5773705A (en) * 1992-12-31 1998-06-30 Wisconsin Alumni Research Foundation Ubiquitin fusion protein system for protein production in plants
US6025155A (en) * 1996-04-10 2000-02-15 Chromos Molecular Systems, Inc. Artificial chromosomes, uses thereof and methods for preparing artificial chromosomes
US6156953A (en) * 1997-06-03 2000-12-05 University Of Chicago Plant artificial chromosome compositions and methods
AU8682198A (en) * 1997-07-31 1999-02-22 Sanford Scientific, Inc. Transgenic plants using the tdc gene for crop improvement
EP1033405A3 (en) * 1999-02-25 2001-08-01 Ceres Incorporated Sequence-determined DNA fragments and corresponding polypeptides encoded thereby

Also Published As

Publication number Publication date
EP1165792A2 (en) 2002-01-02
CA2904855A1 (en) 2000-09-21
JP2004512806A (en) 2004-04-30
WO2000055325A9 (en) 2010-01-07
JP2011125354A (en) 2011-06-30
WO2000055325A3 (en) 2001-08-23
BR0009119A (en) 2001-12-26
WO2000055325A2 (en) 2000-09-21

Similar Documents

Publication Publication Date Title
AU746775B2 (en) Plant artificial chromosome (PLAC) compositions and methods
US7193128B2 (en) Methods for generating or increasing revenues from crops
US7226782B2 (en) Plant centromere compositions
US20050266560A1 (en) Plant chromosome compositions and methods
US7235716B2 (en) Plant centromere compositions
US7227057B2 (en) Plant centromere compositions
CA2362897A1 (en) Plant centromeres
US7847151B2 (en) Plant artificial chromosome (PLAC) compositions and methods
US20130007927A1 (en) Novel centromeres and methods of using the same
US20100333235A1 (en) Plant Centromere Compositions
WO2019129145A1 (en) Flowering time-regulating gene cmp1 and related constructs and applications thereof
JP2023531153A (en) Enhancing production capacity in C3 plants
EP1644510A2 (en) Plant centromere compositions
AU2008207566B2 (en) Plant centromeres
WO2024023207A1 (en) Eif(iso)4e protein variants for resistance to maize viral diseases
CA2530178A1 (en) Methods for producing a modified commercial crop plant having a commercially desirable trait

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued

Effective date: 20150918