WO2006138012A1 - P450 substrates and methods related thereto - Google Patents

P450 substrates and methods related thereto

Info

Publication number
WO2006138012A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
activity
sequence
arabidopsis thaliana
protein sequence
Prior art date
Application number
PCT/US2006/019001
Other languages
French (fr)
Inventor
Tanya Kruse
Joon-Hyun Park
Steven Craig Bobzin
Original Assignee
Ceres Inc.
Priority date
Filing date
Publication date
Application filed by Ceres Inc.
Publication of WO2006138012A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8241 Phenotypically and genetically modified plants via recombinant DNA technology
    • C12N15/8242 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits
    • C12N15/8257 Phenotypically and genetically modified plants via recombinant DNA technology with non-agronomic quality (output) traits, e.g. for industrial processing; Value added, non-agronomic traits for the production of primary gene products, e.g. pharmaceutical products, interferon
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/0004 Oxidoreductases (1.)
    • C12N9/0071 Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14)
    • C12N9/0077 Oxidoreductases (1.) acting on paired donors with incorporation of molecular oxygen (1.14) with a reduced iron-sulfur protein as one donor (1.14.15)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/26 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving oxidoreductase
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/5097 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving plant cells

Definitions

  • the present invention relates to nucleic acid constructs that encode polypeptides that function as cytochrome P450 enzymes (P450s), transgenic organisms including the same, and methods for modifying chemical structures of compounds using the same. More particularly, the invention relates to in vitro, in vivo, and whole organism procedures that result in oxidation, reduction, dealkylation, or epoxidation of a candidate compound. The invention also relates to down-stream processing, e.g., glycosylation, methylation or acetylation, of a modified compound by endogenous processes in whole organisms.
  • P450s: cytochrome P450 enzymes
  • This application also includes a compact disc (Disc 1 of 1, submitted in triplicate with identical files) that contains a sequence listing.
  • the compact disc was created June 17, 2005 and contains one file, entitled 50283674.txt, which is 1,993 kilobytes in size.
  • the file can be accessed using Microsoft Word on a computer that uses Windows OS.
  • the entire contents of the sequence listing are herein incorporated by reference in their entirety and are considered to be part of the specification.
  • Cytochrome P450s are the principal catalysts involved in the metabolism of drugs and other xenobiotics in humans. Much attention has focused on the role of human P450 enzymes in the metabolism and toxicity of pharmaceuticals. While the pharmaceutical industry is dedicating considerable resources to predicting how human P450s will interact with drugs in vivo, little effort is expended on using P450s in the identification or production of pharmacologically active compounds.
  • An aspect of the present invention provides methods of screening for a substrate of a P450, which methods comprise contacting a pharmacologically active candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from SEQ ID NOs: 137-533; and determining whether the candidate compound is modified after the contact, whereby detection of a modification indicates that the candidate compound is a substrate.
  • the P450 coding sequence is a heterologous sequence.
  • The plant cells, e.g., root cells, leaf cells or stem cells, can be part of a whole plant or from an explant. In some embodiments, the plant is a seedling.
  • the plant is grown under hydroponic conditions.
  • the candidate compound may be contacted with the plant cells for a period of time sufficient for the cytochrome P450 to detectably modify the candidate compound, e.g., about 1 hour to about 28 days.
  • the pharmacological activity of the candidate compound can be Alzheimer's disease treatment, analgesic activity, anesthetic activity, anti-Addison's disease activity, anti-HIV activity, anti-infective activity, anti-inflammatory activity, antianginal activity, antiangiogenic activity, antianxiety activity, antiarrhythmic activity, antiarthritic activity, antiatherosclerotic activity, antibacterial activity, antibiotic activity, anticancer activity, anticholesterol activity, anticholinergic activity, anticoagulant activity, anticonvulsant activity, antidepressant activity, antidiabetic activity, antidiuretic activity, antiedemic activity, antifungal activity, antigout activity, antiglaucoma activity, antihemorrhagic activity, antihistamine activity, antihyp
  • Another aspect of the invention provides methods of screening a collection of P450s for a P450 capable of modifying a compound having pharmacological activity. Such methods comprise: (a) contacting the compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 of the collection; (b) determining whether the compound is modified after the contact, whereby detection of a modification indicates that the P450 is capable of modifying the compound; and (c) repeating steps (a) and (b) for each P450 of the collection, wherein at least one P450 of the collection is identified as capable of modifying the compound.
  • Yet another aspect of the invention provides methods of modifying a candidate compound, which methods comprise contacting a candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the candidate compound is modified.
  • The unmodified candidate compound can be aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemisinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron, dextromethorphan, digitoxin, digoxin, doxepin hydro
  • The plant cells, e.g., root cells, leaf cells or stem cells, can be part of a whole plant or from an explant.
  • the plant is a seedling.
  • the plant is grown under hydroponic conditions.
  • the P450 coding sequence is a heterologous sequence.
  • the candidate compound may be contacted with the plant cells for a period of time from about 1 hour to about 28 days.
  • The invention also provides methods for making a modified candidate compound. Such methods comprise contacting a candidate compound with transgenic plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the P450 modifies the candidate compound, and recovering the modified candidate compound from the plant cells.
  • the candidate compound can be one of the compounds described herein.
  • the plant cells may be part of a whole plant, e.g., a seedling, or from an explant or may be grown in culture, e.g. in suspension culture or tissue culture.
  • the P450 coding sequence is a heterologous sequence.
  • the candidate compound may be contacted with the plant cells for a period from about 1 hour to about 28 days.
  • the method further comprises the step of characterizing the chemical structure of the modified candidate compound.
  • the invention also features transgenic plant cells comprising a recombinant nucleic acid construct.
  • the construct comprises a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
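The 80% identity threshold recited above can be illustrated with a short sketch. It assumes the two sequences are already aligned (equal length, `-` marking gaps) and counts identity only over columns where both sequences have a residue. This excerpt does not state how percent identity is computed, so that convention, the function names, and the toy peptides are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumed convention: identical residues divided by the number of
    columns in which neither sequence has a gap ('-'), times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns under this convention
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the identity threshold (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides, not sequences from the Sequence Listing.
    reference = "MKTLLVAAGL"
    candidate = "MKTLIVA-GL"
    # 8 of the 9 compared columns match.
    print(f"{percent_identity(reference, candidate):.1f}% identity")
    print("meets 80% threshold:", meets_threshold(reference, candidate))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give different percentages for the same pair, so the denominator should be fixed before comparing against a threshold.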
  • the plant cells may be part of a whole plant, e.g., a seedling, or may be grown in culture, e.g. in suspension culture or tissue culture.
  • the plant can be a dicotyledonous plant such as Arabidopsis thaliana.
  • The plant may be a monocotyledonous plant, such as Lemna minor, or an alga, such as Chlamydomonas reinhardtii.
  • the regulatory region comprises a constitutive promoter, an inducible promoter, or a tissue-specific promoter.
  • the P450 coding sequence is a heterologous sequence. The transgenic plant cells can be effective for modifying a candidate compound when the cells are contacted with the compound.
  • The invention also features a composition of matter comprising 5-O-β-D-glucopyranosyl-8-methoxypsoralen, i.e., a composition of matter having the formula of Compound 3 as set forth in Figure 88.
  • a composition of matter described herein can be combined with a pharmaceutically acceptable carrier to form a pharmaceutical composition.
  • a pharmaceutical composition comprising a composition of matter described herein can be used to treat, ameliorate, or prevent a symptom or disorder associated with a disease (e.g., cancer, psoriasis, vitiligo, or nicotine addiction).
  • Figure 1 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP82C2 (Panels A and B), incubated in the presence of 8-methoxypsoralen, producing a hydroxylated product and a glycosylated product not found in wild-type (WS) seedlings (Panels C and D) incubated with this compound.
  • Panels A and C display single ion chromatograms of m/z 233.5, corresponding to hydroxylated 8-methoxypsoralen.
  • Panels B and D display single ion chromatograms of m/z 395.5, corresponding to a glycosylated form of 8-methoxypsoralen.
  • Figure 2 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP51A2 (Panel A), incubated in the presence of piperine, producing a glycosylated product not found in wild-type (WS) seedlings (Panel B) incubated with this compound.
  • Panels A and B display single ion chromatograms of m/z 464.3, corresponding to glycosylated piperine.
  • Figure 3 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP85A2 (Panel A), incubated in the presence of curcumin, producing new products not found in wild-type (WS) seedlings (Panel B) incubated with this compound.
  • Panels A and B display single ion chromatograms of m/z 401.5, corresponding to dihydroxylated curcumin.
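The m/z values quoted for Figures 1 to 3 can be checked with simple mass arithmetic. The sketch below makes several assumptions that this excerpt does not state: the ions are protonated ([M+H]+), each hydroxylation adds one oxygen, each glycosylation adds one hexose residue (C6H10O5) to a hydroxylated compound, and average atomic masses are adequate at this resolution. The molecular formulas are standard values for the three compounds; the function names are hypothetical.

```python
import re

# Average atomic masses (g/mol) for the elements used here.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
PROTON = 1.007  # approximate mass added by protonation, [M+H]+
OXYGEN = ATOMIC_MASS["O"]
# Hexose residue added on glycosylation: C6H12O6 minus H2O = C6H10O5.
HEXOSE_RESIDUE = (
    6 * ATOMIC_MASS["C"] + 10 * ATOMIC_MASS["H"] + 5 * ATOMIC_MASS["O"]
)


def average_mass(formula: str) -> float:
    """Average molecular mass of a simple formula such as 'C12H8O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def expected_mz(formula: str, hydroxylations: int = 0, hexoses: int = 0) -> float:
    """Expected m/z of the singly protonated, modified compound."""
    return (
        average_mass(formula)
        + hydroxylations * OXYGEN
        + hexoses * HEXOSE_RESIDUE
        + PROTON
    )


if __name__ == "__main__":
    cases = [
        ("hydroxylated 8-methoxypsoralen", "C12H8O4", 1, 0, 233.5),
        ("glycosylated 8-methoxypsoralen", "C12H8O4", 1, 1, 395.5),
        ("glycosylated piperine", "C17H19NO3", 1, 1, 464.3),
        ("dihydroxylated curcumin", "C21H20O6", 2, 0, 401.5),
    ]
    for name, formula, n_oh, n_hex, reported in cases:
        mz = expected_mz(formula, n_oh, n_hex)
        print(f"{name}: calculated {mz:.2f}, reported {reported}")
```

Under these assumptions each calculated value falls within 0.5 m/z units of the value reported in the figure descriptions, which is consistent with the stated assignments but does not by itself confirm the structures.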
  • Figures 4 to 87 provide an amino acid sequence alignment of a given P450 (identified by cDNA ID) with its functional homologs, identified as described in Example 3.
  • a functional homolog is expected to have a similar function and/or activity as the P450.
  • Each figure also provides a consensus sequence. Boxed residues represent identical or conserved amino acids.
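As a rough illustration of what a consensus line and the boxed columns convey, the sketch below derives a majority consensus from a toy alignment and lists the columns where every sequence has the same residue. The majority rule, the `min_fraction` cutoff, the `x` placeholder, and the function names are assumptions; the excerpt does not describe how the consensus in the figures was generated, and this sketch flags only identical columns, not conservative substitutions.

```python
from collections import Counter
from typing import List


def consensus(alignment: List[str], min_fraction: float = 0.5, unknown: str = "x") -> str:
    """Majority consensus of equal-length aligned sequences ('-' = gap).

    A column yields its most common residue if that residue occurs in at
    least `min_fraction` of all sequences; otherwise `unknown` is used.
    Columns consisting only of gaps yield '-'.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment must be non-empty with rows of equal length")
    out = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            out.append("-")
            continue
        residue, count = Counter(residues).most_common(1)[0]
        out.append(residue if count / len(column) >= min_fraction else unknown)
    return "".join(out)


def identical_columns(alignment: List[str]) -> List[int]:
    """Zero-based indices of gap-free columns where all residues are identical."""
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if "-" not in column and len(set(column)) == 1
    ]


if __name__ == "__main__":
    # Toy alignment, not taken from Figures 4 to 87.
    rows = ["MKTLV", "MKSLV", "MRTLV"]
    print(consensus(rows))
    print(identical_columns(rows))
```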
  • FIG. 4 provides an amino acid sequence alignment of SEQ ID NO: 137 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 273-277.
  • FIG. 5 provides an amino acid sequence alignment of SEQ ID NO: 138 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 278-280.
  • FIG. 6 provides an amino acid sequence alignment of SEQ ID NO: 139 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 281-283.
  • FIG. 7 provides an amino acid sequence alignment of SEQ ID NO: 140 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 284-286.
  • FIG. 8 provides an amino acid sequence alignment of SEQ ID NO: 141 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 287-293.
  • FIG. 9 provides an amino acid sequence alignment of SEQ ID NO: 144 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 294-296.
  • FIG. 10 provides an amino acid sequence alignment of SEQ ID NO: 146 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 297-310.
  • FIG. 11 provides an amino acid sequence alignment of SEQ ID NO: 154 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 311.
  • FIG. 12 provides an amino acid sequence alignment of SEQ ID NO: 155 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 312.
  • FIG. 13 provides an amino acid sequence alignment of SEQ ID NO: 156 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 313-316.
  • FIG. 14 provides an amino acid sequence alignment of SEQ ID NO: 157 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 317.
  • FIG. 15 provides an amino acid sequence alignment of SEQ ID NO: 158 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 318-321.
  • FIG. 16 provides an amino acid sequence alignment of SEQ ID NO: 160 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 322-326.
  • FIG. 17 provides an amino acid sequence alignment of SEQ ID NO: 161 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327-331.
  • FIG. 18 provides an amino acid sequence alignment of SEQ ID NO: 163 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327, 329, and 331-334.
  • FIG. 19 provides an amino acid sequence alignment of SEQ ID NO: 164 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327, 328, and 330-334.
  • FIG. 20 provides an amino acid sequence alignment of SEQ ID NO: 165 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 327.
  • FIG. 21 provides an amino acid sequence alignment of SEQ ID NO: 166 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327-331 and 334.
  • FIG. 22 provides an amino acid sequence alignment of SEQ ID NO: 167 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 335-340.
  • FIG. 23 provides an amino acid sequence alignment of SEQ ID NO: 168 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 335-340.
  • FIG. 24 provides an amino acid sequence alignment of SEQ ID NO: 169 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 341 and 342.
  • FIG. 25 provides an amino acid sequence alignment of SEQ ID NO: 170 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 343 and 344.
  • FIG. 26 provides an amino acid sequence alignment of SEQ ID NO: 171 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 344 and 345.
  • FIG. 27 provides an amino acid sequence alignment of SEQ ID NO: 172 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-349.
  • FIG. 28 provides an amino acid sequence alignment of SEQ ID NO: 173 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-352.
  • FIG. 29 provides an amino acid sequence alignment of SEQ ID NO: 174 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-349 and 353-355.
  • FIG. 30 provides an amino acid sequence alignment of SEQ ID NO: 175 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 274-277 and 356-359.
  • FIG. 31 provides an amino acid sequence alignment of SEQ ID NO: 176 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 360-368.
  • FIG. 32 provides an amino acid sequence alignment of SEQ ID NO: 180 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 285, 286 and 369.
  • FIG. 33 provides an amino acid sequence alignment of SEQ ID NO: 181 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 284, 286 and 369.
  • FIG. 34 provides an amino acid sequence alignment of SEQ ID NO: 184 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 370 and 371.
  • FIG. 35 provides an amino acid sequence alignment of SEQ ID NO: 185 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 372.
  • FIG. 36 provides an amino acid sequence alignment of SEQ ID NO: 186 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371 and 373.
  • FIG. 37 provides an amino acid sequence alignment of SEQ ID NO: 187 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371 and 373-375.
  • FIG. 38 provides an amino acid sequence alignment of SEQ ID NO: 188 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371, 373, 376 and 377.
  • FIG. 39 provides an amino acid sequence alignment of SEQ ID NO: 189 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371, 373, 378 and 379.
  • FIG. 40 provides an amino acid sequence alignment of SEQ ID NO: 190 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 380-384.
  • FIG. 41 provides an amino acid sequence alignment of SEQ ID NO: 191 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 382, 383 and 385.
  • FIG. 42 provides an amino acid sequence alignment of SEQ ID NO: 192 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 386 and 387.
  • FIG. 43 provides an amino acid sequence alignment of SEQ ID NO: 193 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 386 and 388.
  • FIG. 44 provides an amino acid sequence alignment of SEQ ID NO: 194 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 389.
  • FIG. 45 provides an amino acid sequence alignment of SEQ ID NO: 195 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 390.
  • FIG. 46 provides an amino acid sequence alignment of SEQ ID NO: 196 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 391.
  • FIG. 47 provides an amino acid sequence alignment of SEQ ID NO: 201 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 392-406.
  • FIG. 48 provides an amino acid sequence alignment of SEQ ID NO: 202 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 407-417.
  • FIG. 49 provides an amino acid sequence alignment of SEQ ID NO: 203 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 418-453.
  • FIG. 50 provides an amino acid sequence alignment of SEQ ID NO: 204 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 456 and 457.
  • FIG. 51 provides an amino acid sequence alignment of SEQ ID NO: 206 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458-468.
  • FIG. 52 provides an amino acid sequence alignment of SEQ ID NO: 207 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458-460 and 467-469.
  • FIG. 53 provides an amino acid sequence alignment of SEQ ID NO: 208 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458, 460-463, 469 and 470.
  • FIG. 54 provides an amino acid sequence alignment of SEQ ID NO: 209 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 455, 458, 459, 461-463 and 469.
  • FIG. 55 provides an amino acid sequence alignment of SEQ ID NO: 210 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 455, 458-460 and 469.
  • FIG. 56 provides an amino acid sequence alignment of SEQ ID NO: 212 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 471 and 472.
  • FIG. 57 provides an amino acid sequence alignment of SEQ ID NO: 213 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 473 and 474.
  • FIG. 58 provides an amino acid sequence alignment of SEQ ID NO: 215 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 475.
  • FIG. 59 provides an amino acid sequence alignment of SEQ ID NO: 216 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 476-478.
  • FIG. 60 provides an amino acid sequence alignment of SEQ ID NO: 217 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 471 and 472.
  • FIG. 61 provides an amino acid sequence alignment of SEQ ID NO: 218 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 479.
  • FIG. 62 provides an amino acid sequence alignment of SEQ ID NO: 220 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 480.
  • FIG. 63 provides an amino acid sequence alignment of SEQ ID NO: 221 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 481.
  • FIG. 64 provides an amino acid sequence alignment of SEQ ID NO: 222 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 482.
  • FIG. 65 provides an amino acid sequence alignment of SEQ ID NO: 223 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 482.
  • FIG. 66 provides an amino acid sequence alignment of SEQ ID NO: 225 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 483.
  • FIG. 67 provides an amino acid sequence alignment of SEQ ID NO: 228 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 484 and 485.
  • FIG. 68 provides an amino acid sequence alignment of SEQ ID NO: 229 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 486.
  • FIG. 69 provides an amino acid sequence alignment of SEQ ID NO: 231 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487-490.
  • FIG. 70 provides an amino acid sequence alignment of SEQ ID NO: 232 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 488-491.
  • FIG. 71 provides an amino acid sequence alignment of SEQ ID NO: 233 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487 and 489-491.
  • FIG. 72 provides an amino acid sequence alignment of SEQ ID NO: 234 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487, 488, 490 and 491.
  • FIG. 73 provides an amino acid sequence alignment of SEQ ID NO: 237 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 492.
  • FIG. 74 provides an amino acid sequence alignment of SEQ ID NO: 238 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 493 and 494.
  • FIG. 75 provides an amino acid sequence alignment of SEQ ID NO: 244 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 495-500.
  • FIG. 76 provides an amino acid sequence alignment of SEQ ID NO: 245 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 496-503.
  • FIG. 77 provides an amino acid sequence alignment of SEQ ID NO: 249 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 504-509.
  • FIG. 78 provides an amino acid sequence alignment of SEQ ID NO: 250 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 510-513.
  • FIG. 79 provides an amino acid sequence alignment of SEQ ID NO: 251 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 512-514.
  • FIG. 80 provides an amino acid sequence alignment of SEQ ID NO: 252 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 513.
  • FIG. 81 provides an amino acid sequence alignment of SEQ ID NO: 253 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 515 and 516.
  • FIG. 82 provides an amino acid sequence alignment of SEQ ID NO: 256 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 517.
  • FIG. 83 provides an amino acid sequence alignment of SEQ ID NO: 258 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 519.
  • FIG. 84 provides an amino acid sequence alignment of SEQ ID NO: 267 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 520 and 521.
  • FIG. 85 provides an amino acid sequence alignment of SEQ ID NO: 268 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NO: 522.
  • FIG. 86 provides an amino acid sequence alignment of SEQ ID NO: 269 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 523-525.
  • FIG. 87 provides an amino acid sequence alignment of SEQ ID NO: 272 with its functional homologs.
  • the functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 526-533.
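The figure descriptions above give homolog identifiers as mixed lists and ranges (for example "327, 329, and 331-334"). When cross-referencing them against the Sequence Listing, it can help to expand each list into explicit numbers. The helper below is a hypothetical convenience, not part of the application; it expects only the identifier list itself, since any other digits in the input (such as a figure number) would be picked up as identifiers.

```python
import re
from typing import List


def expand_seq_ids(id_list: str) -> List[int]:
    """Expand a list such as '327, 329, and 331-334' into sorted integers."""
    ids = set()
    # Each match is a single number or a 'first-last' range.
    for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", id_list):
        first = int(start)
        last = int(end) if end else first
        if last < first:
            raise ValueError(f"descending range: {first}-{last}")
        ids.update(range(first, last + 1))
    return sorted(ids)


if __name__ == "__main__":
    # Identifier lists as quoted for FIG. 18 and FIG. 52.
    print(expand_seq_ids("327, 329, and 331-334"))
    print(expand_seq_ids("454, 458-460 and 467-469"))
```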
  • FIG. 88 provides the chemical formula of 8-methoxypsoralen (compound 1), 5-hydroxy-8-methoxypsoralen (compound 2), and 5-O-β-D-glucopyranosyl-8-methoxypsoralen (compound 3).
  • Polypeptide and nucleic acid sequences useful in the instant invention include those described in the Sequence Listing and in Figures 4 through 87.
  • SEQ ID NO: 1 is the DNA sequence of cDNA ID no. 12324097 from Arabidopsis thaliana.
  • SEQ ID NO: 2 is the DNA sequence of cDNA ID no. 12489505 from Arabidopsis thaliana.
  • SEQ ID NO: 3 is the DNA sequence of cDNA ID no. 12559461 from Arabidopsis thaliana.
  • SEQ ID NO: 4 is the DNA sequence of cDNA ID no. 12672824 from Arabidopsis thaliana.
  • SEQ ID NO: 5 is the DNA sequence of cDNA ID no. 7106710 from Arabidopsis thaliana.
  • SEQ ID NO: 6 is the DNA sequence of cDNA ID no. 12723568 from Arabidopsis thaliana.
  • SEQ ID NO: 7 is the DNA sequence of cDNA ID no. 12591866 from Arabidopsis thaliana.
  • SEQ ID NO: 8 is the DNA sequence of cDNA ID no. 12652339 from Arabidopsis thaliana.
  • SEQ ID NO: 9 is the DNA sequence of cDNA ID no. 12669291 from Arabidopsis thaliana.
  • SEQ ID NO: 10 is the DNA sequence of cDNA ID no. 12605081 from Arabidopsis thaliana.
  • SEQ ID NO: 11 is the DNA sequence of cDNA ID no. 12722269 from Arabidopsis thaliana.
  • SEQ ID NO: 12 is the DNA sequence of cDNA ID no. 4806743 from Arabidopsis thaliana.
  • SEQ ID NO: 13 is the DNA sequence of cDNA ID no. 12716191 from Arabidopsis thaliana.
  • SEQ ID NO: 14 is the DNA sequence of cDNA ID no. 12420597 from Arabidopsis thaliana.
  • SEQ ID NO: 15 is the DNA sequence of cDNA ID no. 12652542 from Arabidopsis thaliana.
  • SEQ ID NO: 16 is the DNA sequence of cDNA ID no. 12330374 from Arabidopsis thaliana.
  • SEQ ID NO: 17 is the DNA sequence of cDNA ID no. 12713945 from Arabidopsis thaliana.
  • SEQ ID NO: 18 is the DNA sequence of cDNA ID no. 12488814 from Arabidopsis thaliana.
  • SEQ ID NO: 19 is the DNA sequence of cDNA ID no. 12667451 from Arabidopsis thaliana.
  • SEQ ID NO: 20 is the DNA sequence of cDNA ID no. 12719182 from Arabidopsis thaliana.
  • SEQ ID NO: 21 is the DNA sequence of cDNA ID no. 12726480 from Arabidopsis thaliana.
  • SEQ ID NO: 22 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 23 is the DNA sequence of cDNA ID no. 12724226 from Arabidopsis thaliana.
  • SEQ ID NO: 24 is the DNA sequence of cDNA ID no. 12348720 from Arabidopsis thaliana.
  • SEQ ID NO: 25 is the DNA sequence of cDNA ID no. 12720530 from Arabidopsis thaliana.
  • SEQ ID NO: 26 is the DNA sequence of cDNA ID no. 12575419 from Arabidopsis thaliana.
  • SEQ ID NO: 27 is the DNA sequence of cDNA ID no. 12720534 from Arabidopsis thaliana.
  • SEQ ID NO: 28 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 29 is the DNA sequence of cDNA ID no. 12671019 from Arabidopsis thaliana.
  • SEQ ID NO: 30 is the DNA sequence of cDNA ID no. 12720518 from Arabidopsis thaliana.
  • SEQ ID NO: 31 is the DNA sequence of cDNA ID no. 12603446 from Arabidopsis thaliana.
  • SEQ ID NO: 32 is the DNA sequence of cDNA ID no. 12660455 from Arabidopsis thaliana.
  • SEQ ID NO: 33 is the DNA sequence of cDNA ID no. 1817341 from Arabidopsis thaliana.
  • SEQ ID NO: 34 is the DNA sequence of cDNA ID no. 12651234 from Arabidopsis thaliana.
  • SEQ ID NO: 35 is the DNA sequence of cDNA ID no. 12666501 from Arabidopsis thaliana.
  • SEQ ID NO: 36 is the DNA sequence of cDNA ID no. 12685377 from Arabidopsis thaliana.
  • SEQ ID NO: 37 is the DNA sequence of cDNA ID no. 12602360 from Arabidopsis thaliana.
  • SEQ ID NO: 38 is the DNA sequence of cDNA ID no. 12559722 from Arabidopsis thaliana.
  • SEQ ID NO: 39 is the DNA sequence of cDNA ID no. 12657646 from Arabidopsis thaliana.
  • SEQ ID NO: 40 is the DNA sequence of cDNA ID no. 12670002 from Arabidopsis thaliana.
  • SEQ ID NO: 41 is the DNA sequence of cDNA ID no. 12321246 from Arabidopsis thaliana.
  • SEQ ID NO: 42 is the DNA sequence of cDNA ID no. 12680308 from Arabidopsis thaliana.
  • SEQ ID NO: 43 is the DNA sequence of cDNA ID no. 12737089 from Arabidopsis thaliana.
  • SEQ ID NO: 44 is the DNA sequence of cDNA ID no. 12657899 from Arabidopsis thaliana.
  • SEQ ID NO: 45 is the DNA sequence of cDNA ID no. 12657903 from Arabidopsis thaliana.
  • SEQ ID NO: 46 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 47 is the DNA sequence of cDNA ID no. 5003066 from Arabidopsis thaliana.
  • SEQ ID NO: 48 is the DNA sequence of cDNA ID no. 12736047 from Arabidopsis thaliana.
  • SEQ ID NO: 49 is the DNA sequence of cDNA ID no. 12736059 from Arabidopsis thaliana.
  • SEQ ID NO: 50 is the DNA sequence of cDNA ID no. 12601981 from Arabidopsis thaliana.
  • SEQ ID NO: 51 is the DNA sequence of cDNA ID no. 12331336 from Arabidopsis thaliana.
  • SEQ ID NO: 52 is the DNA sequence of cDNA ID no. 12650952 from Arabidopsis thaliana.
  • SEQ ID NO: 53 is the DNA sequence of cDNA ID no. 12650956 from Arabidopsis thaliana.
  • SEQ ID NO: 54 is the DNA sequence of cDNA ID no. 12370997 from Arabidopsis thaliana.
  • SEQ ID NO: 55 is the DNA sequence of cDNA ID no. 3036716 from Arabidopsis thaliana.
  • SEQ ID NO: 56 is the DNA sequence of cDNA ID no. 12561167 from Arabidopsis thaliana.
  • SEQ ID NO: 57 is the DNA sequence of cDNA ID no. 12721393 from Arabidopsis thaliana.
  • SEQ ID NO: 58 is the DNA sequence of cDNA ID no. 12680005 from Arabidopsis thaliana.
  • SEQ ID NO: 59 is the DNA sequence of cDNA ID no. 6431419 from Arabidopsis thaliana.
  • SEQ ID NO: 60 is the DNA sequence of cDNA ID no. 6428581 from Arabidopsis thaliana.
  • SEQ ID NO: 61 is the DNA sequence of cDNA ID no. 12662082 from Arabidopsis thaliana.
  • SEQ ID NO: 62 is the DNA sequence of cDNA ID no. 12675449 from Arabidopsis thaliana.
  • SEQ ID NO: 63 is the DNA sequence of cDNA ID no. 12733016 from Arabidopsis thaliana.
  • SEQ ID NO: 64 is the DNA sequence of cDNA ID no. 12575176 from Arabidopsis thaliana.
  • SEQ ID NO: 65 is the DNA sequence of cDNA ID no. 12489109 from Arabidopsis thaliana.
  • SEQ ID NO: 66 is the DNA sequence of cDNA ID no. 12670510 from Arabidopsis thaliana.
  • SEQ ID NO: 67 is the DNA sequence of cDNA ID no. 12558789 from Arabidopsis thaliana.
  • SEQ ID NO: 68 is the DNA sequence of cDNA ID no. 12658399 from Arabidopsis thaliana.
  • SEQ ID NO: 69 is the DNA sequence of cDNA ID no. 12575003 from Arabidopsis thaliana.
  • SEQ ID NO: 70 is the DNA sequence of cDNA ID no. 12488947 from Arabidopsis thaliana.
  • SEQ ID NO: 71 is the DNA sequence of cDNA ID no. 12658410 from Arabidopsis thaliana.
  • SEQ ID NO: 72 is the DNA sequence of cDNA ID no. 12671988 from Arabidopsis thaliana.
  • SEQ ID NO: 73 is the DNA sequence of cDNA ID no. 4929561 from Arabidopsis thaliana.
  • SEQ ID NO: 74 is the DNA sequence of cDNA ID no. 12658403 from Arabidopsis thaliana.
  • SEQ ID NO: 75 is the DNA sequence of cDNA ID no. 12711003 from Arabidopsis thaliana.
  • SEQ ID NO: 76 is the DNA sequence of cDNA ID no. 12727851 from Arabidopsis thaliana.
  • SEQ ID NO: 77 is the DNA sequence of cDNA ID no. 12734891 from Arabidopsis thaliana.
  • SEQ ID NO: 78 is the DNA sequence of cDNA ID no. 12736978 from Arabidopsis thaliana.
  • SEQ ID NO: 79 is the DNA sequence of cDNA ID no. 12722908 from Arabidopsis thaliana.
  • SEQ ID NO: 80 is the DNA sequence of cDNA ID no. 4931385 from Arabidopsis thaliana.
  • SEQ ID NO: 81 is the DNA sequence of cDNA ID no. 12672204 from Arabidopsis thaliana.
  • SEQ ID NO: 82 is the DNA sequence of cDNA ID no. 12604034 from Arabidopsis thaliana.
  • SEQ ID NO: 83 is the DNA sequence of cDNA ID no. 6446020 from Arabidopsis thaliana.
  • SEQ ID NO: 84 is the DNA sequence of cDNA ID no. 12323946 from Arabidopsis thaliana.
  • SEQ ID NO: 85 is the DNA sequence of cDNA ID no. 12578257 from Arabidopsis thaliana.
  • SEQ ID NO: 86 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 87 is the DNA sequence of cDNA ID no. 12716909 from Arabidopsis thaliana.
  • SEQ ID NO: 88 is the DNA sequence of cDNA ID no. 12672200 from Arabidopsis thaliana.
  • SEQ ID NO: 89 is the DNA sequence of cDNA ID no. 12736967 from Arabidopsis thaliana.
  • SEQ ID NO: 90 is the DNA sequence of cDNA ID no. 12672193 from Arabidopsis thaliana.
  • SEQ ID NO: 91 is the DNA sequence of cDNA ID no. 12728556 from Arabidopsis thaliana.
  • SEQ ID NO: 92 is the DNA sequence of cDNA ID no. 12716894 from Arabidopsis thaliana.
  • SEQ ID NO: 93 is the DNA sequence of cDNA ID no. 12736956 from Arabidopsis thaliana.
  • SEQ ID NO: 94 is the DNA sequence of cDNA ID no. 12482287 from Arabidopsis thaliana.
  • SEQ ID NO: 95 is the DNA sequence of cDNA ID no. 12654831 from Arabidopsis thaliana.
  • SEQ ID NO: 96 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 97 is the DNA sequence of cDNA ID no. 12654819 from Arabidopsis thaliana.
  • SEQ ID NO: 98 is the DNA sequence of cDNA ID no. 12654815 from Arabidopsis thaliana.
  • SEQ ID NO: 99 is the DNA sequence of cDNA ID no. 12726436 from Arabidopsis thaliana.
  • SEQ ID NO: 100 is the DNA sequence of cDNA ID no. 12692662 from Arabidopsis thaliana.
  • SEQ ID NO: 101 is the DNA sequence of cDNA ID no. 12672307 from Arabidopsis thaliana.
  • SEQ ID NO: 102 is the DNA sequence of cDNA ID no. 12664747 from Arabidopsis thaliana.
  • SEQ ID NO: 103 is the DNA sequence of cDNA ID no. 12661189 from Arabidopsis thaliana.
  • SEQ ID NO: 104 is the DNA sequence of cDNA ID no. 12718491 from Arabidopsis thaliana.
  • SEQ ID NO: 105 is the DNA sequence of cDNA ID no. 12722285 from Arabidopsis thaliana.
  • SEQ ID NO: 106 is the DNA sequence of cDNA ID no.
  • SEQ ID NO: 107 is the DNA sequence of cDNA ID no. 12653581 from Arabidopsis thaliana.
  • SEQ ID NO: 108 is the DNA sequence of cDNA ID no. 12336276 from Arabidopsis thaliana.
  • SEQ ID NO: 109 is the DNA sequence of cDNA ID no. 12720013 from Arabidopsis thaliana.
  • SEQ ID NO: 110 is the DNA sequence of cDNA ID no. 12700669 from Arabidopsis thaliana.
  • SEQ ID NO: 111 is the DNA sequence of cDNA ID no. 12733198 from Arabidopsis thaliana.
  • SEQ ID NO: 112 is the DNA sequence of cDNA ID no. 12733202 from Arabidopsis thaliana.
  • SEQ ID NO: 113 is the DNA sequence of cDNA ID no. 12670681 from Arabidopsis thaliana.
  • SEQ ID NO: 114 is the DNA sequence of cDNA ID no. 12663534 from Arabidopsis thaliana.
  • SEQ ID NO: 115 is the DNA sequence of cDNA ID no. 12672657 from Arabidopsis thaliana.
  • SEQ ID NO: 116 is the DNA sequence of cDNA ID no. 12601847 from Arabidopsis thaliana.
  • SEQ ID NO: 117 is the DNA sequence of cDNA ID no. 6442206 from Arabidopsis thaliana.
  • SEQ ID NO: 118 is the DNA sequence of cDNA ID no. 12656365 from Arabidopsis thaliana.
  • SEQ ID NO: 119 is the DNA sequence of cDNA ID no. 12672102 from Arabidopsis thaliana.
  • SEQ ID NO: 120 is the DNA sequence of cDNA ID no. 12731901 from Arabidopsis thaliana.
  • SEQ ID NO: 121 is the DNA sequence of cDNA ID no. 12696098 from Arabidopsis thaliana.
  • SEQ ID NO: 122 is the DNA sequence of cDNA ID no. 12653150 from Arabidopsis thaliana.
  • SEQ ID NO: 123 is the DNA sequence of cDNA ID no. 12731797 from Arabidopsis thaliana.
  • SEQ ID NO: 124 is the DNA sequence of cDNA ID no. 12731793 from Arabidopsis thaliana.
  • SEQ ID NO: 125 is the DNA sequence of cDNA ID no. 6445548 from Arabidopsis thaliana.
  • SEQ ID NO: 126 is the DNA sequence of cDNA ID no. 12731781 from Arabidopsis thaliana.
  • SEQ ID NO: 127 is the DNA sequence of cDNA ID no. 12576154 from Arabidopsis thaliana.
  • SEQ ID NO: 128 is the DNA sequence of cDNA ID no. 12729533 from Arabidopsis thaliana.
  • SEQ ID NO: 129 is the DNA sequence of cDNA ID no. 12661185 from Arabidopsis thaliana.
  • SEQ ID NO: 130 is the DNA sequence of cDNA ID no. 12574629 from Arabidopsis thaliana.
  • SEQ ID NO: 131 is the DNA sequence of cDNA ID no. 12575795 from Arabidopsis thaliana.
  • SEQ ID NO: 132 is the DNA sequence of cDNA ID no. 12731921 from Arabidopsis thaliana.
  • SEQ ID NO: 133 is the DNA sequence of cDNA ID no. 12654597 from Arabidopsis thaliana.
  • SEQ ID NO: 134 is the DNA sequence of cDNA ID no. 12680828 from Arabidopsis thaliana.
  • SEQ ID NO: 135 is the DNA sequence of cDNA ID no. 12709025 from Arabidopsis thaliana.
  • SEQ ID NO: 136 is the DNA sequence of cDNA ID no. 12602001 from Arabidopsis thaliana.
  • SEQ ID NO: 137 is the protein sequence of cDNA ID no. 12324097 from Arabidopsis thaliana.
  • SEQ ID NO: 138 is the protein sequence of cDNA ID no. 12489505 from Arabidopsis thaliana.
  • SEQ ID NO: 139 is the protein sequence of cDNA ID no. 12559461 from Arabidopsis thaliana.
  • SEQ ID NO: 140 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 141 is the protein sequence of cDNA ID no. 7106710 from Arabidopsis thaliana.
  • SEQ ID NO: 142 is the protein sequence of cDNA ID no. 12723568 from Arabidopsis thaliana.
  • SEQ ID NO: 143 is the protein sequence of cDNA ID no. 12591866 from Arabidopsis thaliana.
  • SEQ ID NO: 144 is the protein sequence of cDNA ID no. 12652339 from Arabidopsis thaliana.
  • SEQ ID NO: 145 is the protein sequence of cDNA ID no. 12669291 from Arabidopsis thaliana.
  • SEQ ID NO: 146 is the protein sequence of cDNA ID no. 12605081 from Arabidopsis thaliana.
  • SEQ ID NO: 147 is the protein sequence of cDNA ID no. 12722269 from Arabidopsis thaliana.
  • SEQ ID NO: 148 is the protein sequence of cDNA ID no. 4806743 from Arabidopsis thaliana.
  • SEQ ID NO: 149 is the protein sequence of cDNA ID no. 12716191 from Arabidopsis thaliana.
  • SEQ ID NO: 150 is the protein sequence of cDNA ID no. 12420597 from Arabidopsis thaliana.
  • SEQ ID NO: 151 is the protein sequence of cDNA ID no. 12652542 from Arabidopsis thaliana.
  • SEQ ID NO: 152 is the protein sequence of cDNA ID no. 12330374 from Arabidopsis thaliana.
  • SEQ ID NO: 153 is the protein sequence of cDNA ID no. 12713945 from Arabidopsis thaliana.
  • SEQ ID NO: 154 is the protein sequence of cDNA ID no. 12488814 from Arabidopsis thaliana.
  • SEQ ID NO: 155 is the protein sequence of cDNA ID no. 12667451 from Arabidopsis thaliana.
  • SEQ ID NO: 156 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 157 is the protein sequence of cDNA ID no. 12726480 from Arabidopsis thaliana.
  • SEQ ID NO: 158 is the protein sequence of cDNA ID no. 12653029 from Arabidopsis thaliana.
  • SEQ ID NO: 159 is the protein sequence of cDNA ID no. 12724226 from Arabidopsis thaliana.
  • SEQ ID NO: 160 is the protein sequence of cDNA ID no. 12348720 from Arabidopsis thaliana.
  • SEQ ID NO: 161 is the protein sequence of cDNA ID no. 12720530 from Arabidopsis thaliana.
  • SEQ ID NO: 162 is the protein sequence of cDNA ID no. 12575419 from Arabidopsis thaliana.
  • SEQ ID NO: 163 is the protein sequence of cDNA ID no. 12720534 from Arabidopsis thaliana.
  • SEQ ID NO: 164 is the protein sequence of cDNA ID no. 12577086 from Arabidopsis thaliana.
  • SEQ ID NO: 165 is the protein sequence of cDNA ID no. 12671019 from Arabidopsis thaliana.
  • SEQ ID NO: 166 is the protein sequence of cDNA ID no. 12720518 from Arabidopsis thaliana.
  • SEQ ID NO: 167 is the protein sequence of cDNA ID no. 12603446 from Arabidopsis thaliana.
  • SEQ ID NO: 168 is the protein sequence of cDNA ID no. 12660455 from Arabidopsis thaliana.
  • SEQ ID NO: 169 is the protein sequence of cDNA ID no. 1817341 from Arabidopsis thaliana.
  • SEQ ID NO: 170 is the protein sequence of cDNA ID no. 12651234 from Arabidopsis thaliana.
  • SEQ ID NO: 171 is the protein sequence of cDNA ID no. 12666501 from Arabidopsis thaliana.
  • SEQ ID NO: 172 is the protein sequence of cDNA ID no. 12685377 from Arabidopsis thaliana.
  • SEQ ID NO: 173 is the protein sequence of cDNA ID no. 12602360 from Arabidopsis thaliana.
  • SEQ ID NO: 174 is the protein sequence of cDNA ID no. 12559722 from Arabidopsis thaliana.
  • SEQ ID NO: 175 is the protein sequence of cDNA ID no. 12657646 from Arabidopsis thaliana.
  • SEQ ID NO: 176 is the protein sequence of cDNA ID no. 12670002 from Arabidopsis thaliana.
  • SEQ ID NO: 177 is the protein sequence of cDNA ID no. 12321246 from Arabidopsis thaliana.
  • SEQ ID NO: 178 is the protein sequence of cDNA ID no. 12680308 from Arabidopsis thaliana.
  • SEQ ID NO: 179 is the protein sequence of cDNA ID no. 12737089 from Arabidopsis thaliana.
  • SEQ ID NO: 180 is the protein sequence of cDNA ID no. 12657899 from Arabidopsis thaliana.
  • SEQ ID NO: 181 is the protein sequence of cDNA ID no. 12657903 from Arabidopsis thaliana.
  • SEQ ID NO: 182 is the protein sequence of cDNA ID no. 12727559 from Arabidopsis thaliana.
  • SEQ ID NO: 183 is the protein sequence of cDNA ID no. 5003066 from Arabidopsis thaliana.
  • SEQ ID NO: 184 is the protein sequence of cDNA ID no. 12736047 from Arabidopsis thaliana.
  • SEQ ID NO: 185 is the protein sequence of cDNA ID no. 12736059 from Arabidopsis thaliana.
  • SEQ ID NO: 186 is the protein sequence of cDNA ID no. 12601981 from Arabidopsis thaliana.
  • SEQ ID NO: 187 is the protein sequence of cDNA ID no. 12331336 from Arabidopsis thaliana.
  • SEQ ID NO: 188 is the protein sequence of cDNA ID no. 12650952 from Arabidopsis thaliana.
  • SEQ ID NO: 189 is the protein sequence of cDNA ID no. 12650956 from Arabidopsis thaliana.
  • SEQ ID NO: 190 is the protein sequence of cDNA ID no. 12370997 from Arabidopsis thaliana.
  • SEQ ID NO: 191 is the protein sequence of cDNA ID no. 3036716 from Arabidopsis thaliana.
  • SEQ ID NO: 192 is the protein sequence of cDNA ID no. 12561167 from Arabidopsis thaliana.
  • SEQ ID NO: 193 is the protein sequence of cDNA ID no. 12721393 from Arabidopsis thaliana.
  • SEQ ID NO: 194 is the protein sequence of cDNA ID no. 12680005 from Arabidopsis thaliana.
  • SEQ ID NO: 195 is the protein sequence of cDNA ID no. 6431419 from Arabidopsis thaliana.
  • SEQ ID NO: 196 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 197 is the protein sequence of cDNA ID no. 12662082 from Arabidopsis thaliana.
  • SEQ ID NO: 198 is the protein sequence of cDNA ID no. 12675449 from Arabidopsis thaliana.
  • SEQ ID NO: 199 is the protein sequence of cDNA ID no. 12733016 from Arabidopsis thaliana.
  • SEQ ID NO: 200 is the protein sequence of cDNA ID no. 12575176 from Arabidopsis thaliana.
  • SEQ ID NO: 201 is the protein sequence of cDNA ID no. 12489109 from Arabidopsis thaliana.
  • SEQ ID NO: 202 is the protein sequence of cDNA ID no. 12670510 from Arabidopsis thaliana.
  • SEQ ID NO: 203 is the protein sequence of cDNA ID no. 12558789 from Arabidopsis thaliana.
  • SEQ ID NO: 204 is the protein sequence of cDNA ID no. 12658399 from Arabidopsis thaliana.
  • SEQ ID NO: 205 is the protein sequence of cDNA ID no. 12575003 from Arabidopsis thaliana.
  • SEQ ID NO: 206 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 207 is the protein sequence of cDNA ID no. 12658410 from Arabidopsis thaliana.
  • SEQ ID NO: 208 is the protein sequence of cDNA ID no. 12671988 from Arabidopsis thaliana.
  • SEQ ID NO: 209 is the protein sequence of cDNA ID no. 4929561 from Arabidopsis thaliana.
  • SEQ ID NO: 210 is the protein sequence of cDNA ID no. 12658403 from Arabidopsis thaliana.
  • SEQ ID NO: 211 is the protein sequence of cDNA ID no. 12711003 from Arabidopsis thaliana.
  • SEQ ID NO: 212 is the protein sequence of cDNA ID no. 12727851 from Arabidopsis thaliana.
  • SEQ ID NO: 213 is the protein sequence of cDNA ID no. 12734891 from Arabidopsis thaliana.
  • SEQ ID NO: 214 is the protein sequence of cDNA ID no. 12736978 from Arabidopsis thaliana.
  • SEQ ID NO: 215 is the protein sequence of cDNA ID no. 12722908 from Arabidopsis thaliana.
  • SEQ ID NO: 216 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 217 is the protein sequence of cDNA ID no. 12672204 from Arabidopsis thaliana.
  • SEQ ID NO: 218 is the protein sequence of cDNA ID no. 12604034 from Arabidopsis thaliana.
  • SEQ ID NO: 219 is the protein sequence of cDNA ID no. 6446020 from Arabidopsis thaliana.
  • SEQ ID NO: 220 is the protein sequence of cDNA ID no. 12323946 from Arabidopsis thaliana.
  • SEQ ID NO: 221 is the protein sequence of cDNA ID no. 12578257 from Arabidopsis thaliana.
  • SEQ ID NO: 222 is the protein sequence of cDNA ID no. 12348600 from Arabidopsis thaliana.
  • SEQ ID NO: 223 is the protein sequence of cDNA ID no. 12716909 from Arabidopsis thaliana.
  • SEQ ID NO: 224 is the protein sequence of cDNA ID no. 12672200 from Arabidopsis thaliana.
  • SEQ ID NO: 225 is the protein sequence of cDNA ID no. 12736967 from Arabidopsis thaliana.
  • SEQ ID NO: 226 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 227 is the protein sequence of cDNA ID no. 12728556 from Arabidopsis thaliana.
  • SEQ ID NO: 228 is the protein sequence of cDNA ID no. 12716894 from Arabidopsis thaliana.
  • SEQ ID NO: 229 is the protein sequence of cDNA ID no. 12736956 from Arabidopsis thaliana.
  • SEQ ID NO: 230 is the protein sequence of cDNA ID no. 12482287 from Arabidopsis thaliana.
  • SEQ ID NO: 231 is the protein sequence of cDNA ID no. 12654831 from Arabidopsis thaliana.
  • SEQ ID NO: 232 is the protein sequence of cDNA ID no. 12654823 from Arabidopsis thaliana.
  • SEQ ID NO: 233 is the protein sequence of cDNA ID no. 12654819 from Arabidopsis thaliana.
  • SEQ ID NO: 234 is the protein sequence of cDNA ID no. 12654815 from Arabidopsis thaliana.
  • SEQ ID NO: 235 is the protein sequence of cDNA ID no. 12726436 from Arabidopsis thaliana.
  • SEQ ID NO: 236 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 237 is the protein sequence of cDNA ID no. 12672307 from Arabidopsis thaliana.
  • SEQ ID NO: 238 is the protein sequence of cDNA ID no. 12664747 from Arabidopsis thaliana.
  • SEQ ID NO: 239 is the protein sequence of cDNA ID no. 12661189 from Arabidopsis thaliana.
  • SEQ ID NO: 240 is the protein sequence of cDNA ID no. 12718491 from Arabidopsis thaliana.
  • SEQ ID NO: 241 is the protein sequence of cDNA ID no. 12722285 from Arabidopsis thaliana.
  • SEQ ID NO: 242 is the protein sequence of cDNA ID no. 1265 Al '61 from Arabidopsis thaliana.
  • SEQ ID NO: 243 is the protein sequence of cDNA ID no. 12653581 from Arabidopsis thaliana.
  • SEQ ID NO: 244 is the protein sequence of cDNA ID no. 12336276 from Arabidopsis thaliana.
  • SEQ ID NO: 245 is the protein sequence of cDNA ID no. 12720013 from Arabidopsis thaliana.
  • SEQ ID NO: 246 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 247 is the protein sequence of cDNA ID no. 12733198 from Arabidopsis thaliana.
  • SEQ ID NO: 248 is the protein sequence of cDNA ID no. 12733202 from Arabidopsis thaliana.
  • SEQ ID NO: 249 is the protein sequence of cDNA ID no. 12670681 from Arabidopsis thaliana.
  • SEQ ID NO: 250 is the protein sequence of cDNA ID no. 12663534 from Arabidopsis thaliana.
  • SEQ ID NO: 251 is the protein sequence of cDNA ID no. 12672657 from Arabidopsis thaliana.
  • SEQ ID NO: 252 is the protein sequence of cDNA ID no. 12601847 from Arabidopsis thaliana.
  • SEQ ID NO: 253 is the protein sequence of cDNA ID no. 6442206 from Arabidopsis thaliana.
  • SEQ ID NO: 254 is the protein sequence of cDNA ID no. 12656365 from Arabidopsis thaliana.
  • SEQ ID NO: 255 is the protein sequence of cDNA ID no. 12672102 from Arabidopsis thaliana.
  • SEQ ID NO: 256 is the protein sequence of cDNA ID no.
  • SEQ ID NO: 257 is the protein sequence of cDNA ID no. 12696098 from Arabidopsis thaliana.
  • SEQ ID NO: 258 is the protein sequence of cDNA ID no. 12653150 from Arabidopsis thaliana.
  • SEQ ID NO: 259 is the protein sequence of cDNA ID no. 12731797 from Arabidopsis thaliana.
  • SEQ ID NO: 260 is the protein sequence of cDNA ID no. 12731793 from Arabidopsis thaliana.
  • SEQ ID NO: 261 is the protein sequence of cDNA ID no. 6445548 from Arabidopsis thaliana.
  • SEQ ID NO: 262 is the protein sequence of cDNA ID no. 12731781 from Arabidopsis thaliana.
  • SEQ ID NO: 263 is the protein sequence of cDNA ID no. 12576154 from Arabidopsis thaliana.
  • SEQ ID NO: 264 is the protein sequence of cDNA ID no. 12729533 from Arabidopsis thaliana.
  • SEQ ID NO: 265 is the protein sequence of cDNA ID no. 12661185 from Arabidopsis thaliana.
  • SEQ ID NO: 266 is the protein sequence of cDNA ID no. 12574629 from Arabidopsis thaliana.
  • SEQ ID NO: 267 is the protein sequence of cDNA ID no. 12575795 from Arabidopsis thaliana.
  • SEQ ID NO: 268 is the protein sequence of cDNA ID no. 12731921 from Arabidopsis thaliana.
  • SEQ ID NO: 269 is the protein sequence of cDNA ID no. 12654597 from Arabidopsis thaliana.
  • SEQ ID NO: 270 is the protein sequence of cDNA ID no. 12680828 from Arabidopsis thaliana.
  • SEQ ID NO: 271 is the protein sequence of cDNA ID no. 12709025 from Arabidopsis thaliana.
  • SEQ ID NO: 272 is the protein sequence of cDNA ID no. 12602001 from Arabidopsis thaliana.
  • SEQ ID NO: 273 is the protein sequence of NCBI gi no. 14475586 from Arabidopsis thaliana.
  • SEQ ID NO: 274 is the protein sequence of NCBI gi no. 5921933 from Lycopersicon esculentum.
  • SEQ ID NO: 275 is the protein sequence of clone ID no. 756140 from Triticum aestivum.
  • SEQ ID NO: 276 is the protein sequence of clone ID no.
  • SEQ ID NO: 277 is the protein sequence of NCBI gi no. 50838910 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 278 is the protein sequence of clone ID no. 748722 from Triticum aestivum.
  • SEQ ID NO: 279 is the protein sequence of NCBI gi no. 34911504 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 280 is the protein sequence of NCBI gi no. 34911506 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 281 is the protein sequence of NCBI gi no. 14030557 from Zea mays.
  • SEQ ID NO: 282 is the protein sequence of clone ID no. 261301 from Zea mays.
  • SEQ ID NO: 283 is the protein sequence of NCBI gi no. 50919857 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 284 is the protein sequence of NCBI gi no. 7270098 from Arabidopsis thaliana.
  • SEQ ID NO: 285 is the protein sequence of NCBI gi no. 23296518 from Arabidopsis thaliana.
  • SEQ ID NO: 286 is the protein sequence of NCBI gi no. 7430660 from Arabidopsis thaliana.
  • SEQ ID NO: 287 is the protein sequence of NCBI gi no. 47156870 from Solanum chacoense.
  • SEQ ID NO: 288 is the protein sequence of clone ID no. 1119783 from Glycine max.
  • SEQ ID NO: 289 is the protein sequence of NCBI gi no. 18000074 from Nicotiana tabacum.
  • SEQ ID NO: 290 is the protein sequence of NCBI gi no. 27461067 from Nicotiana tabacum.
  • SEQ ID NO: 291 is the protein sequence of NCBI gi no. 27542762 from Sorghum bicolor.
  • SEQ ID NO: 292 is the protein sequence of NCBI gi no. 5921924 from Sorghum bicolor.
  • SEQ ID NO: 293 is the protein sequence of clone ID no. 782009 from Triticum aestivum.
  • SEQ ID NO: 294 is the protein sequence of clone ID no. 527128 from Glycine max.
  • SEQ ID NO: 295 is the protein sequence of NCBI gi no. 6739527 from Manihot esculenta.
  • SEQ ID NO: 296 is the protein sequence of NCBI gi no. 56553510 from n/a.
  • SEQ ID NO: 297 is the protein sequence of NCBI gi no. 17978651 from Pinus taeda.
  • SEQ ID NO: 298 is the protein sequence of NCBI gi no. 2738998 from Glycine max.
  • SEQ ID NO: 299 is the protein sequence of NCBI gi no. 27650337 from Solenostemon scutellarioides.
  • SEQ ID NO: 300 is the protein sequence of NCBI gi no. 17978831 from Sesamum indicum.
  • SEQ ID NO: 301 is the protein sequence of NCBI gi no. 5915857 from Sorghum bicolor.
  • SEQ ID NO: 302 is the protein sequence of NCBI gi no. 40641240 from Triticum aestivum.
  • SEQ ID NO: 303 is the protein sequence of NCBI gi no. 40641238 from Triticum aestivum.
  • SEQ ID NO: 304 is the protein sequence of NCBI gi no. 46798530 from Triticum aestivum.
  • SEQ ID NO: 305 is the protein sequence of clone ID no. 299213 from Zea mays.
  • SEQ ID NO: 306 is the protein sequence of NCBI gi no. 45331333 from Camptotheca acuminata.
  • SEQ ID NO: 307 is the protein sequence of NCBI gi no. 26522472 from Lithospermum erythrorhizon.
  • SEQ ID NO: 308 is the protein sequence of NCBI gi no. 22651521 from Ocimum basilicum.
  • SEQ ID NO: 309 is the protein sequence of NCBI gi no. 22651519 from Ocimum basilicum.
  • SEQ ID NO: 310 is the protein sequence of NCBI gi no. 46947675 from Ammi majus.
  • SEQ ID NO: 311 is the protein sequence of NCBI gi no. 7270932 from Arabidopsis thaliana.
  • SEQ ID NO: 312 is the protein sequence of NCBI gi no. 7594541 from Arabidopsis thaliana.
  • SEQ ID NO: 313 is the protein sequence of clone ID no. 763442 from Triticum aestivum.
  • SEQ ID NO: 314 is the protein sequence of clone ID no. 1557311 from Zea mays.
  • SEQ ID NO: 315 is the protein sequence of NCBI gi no. 34908446 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 316 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 317 is the protein sequence of clone ID no. 1469649 from Zea mays.
  • SEQ ID NO: 318 is the protein sequence of clone ID no. 1578373 from Zea mays.
  • SEQ ID NO: 319 is the protein sequence of clone ID no. 1583137 from Zea mays.
  • SEQ ID NO: 320 is the protein sequence of NCBI gi no. 60677681 from n/a.
  • SEQ ID NO: 321 is the protein sequence of NCBI gi no. 34902330 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 322 is the protein sequence of clone ID no. 718939 from Glycine max.
  • SEQ ID NO: 323 is the protein sequence of NCBI gi no. 45260636 from Nicotiana tabacum.
  • SEQ ID NO: 324 is the protein sequence of NCBI gi no. 60677685 from n/a.
  • SEQ ID NO: 325 is the protein sequence of NCBI gi no. 60677683 from n/a.
  • SEQ ID NO: 326 is the protein sequence of NCBI gi no. 9587211 from Vigna radiata.
  • SEQ ID NO: 327 is the protein sequence of NCBI gi no. 46095226 from Cucumis melo.
  • SEQ ID NO: 328 is the protein sequence of NCBI gi no. 15293197 from Arabidopsis thaliana.
  • SEQ ID NO: 329 is the protein sequence of NCBI gi no. 20198022 from Arabidopsis thaliana.
  • SEQ ID NO: 330 is the protein sequence of NCBI gi no. 15450601 from Arabidopsis thaliana.
  • SEQ ID NO: 331 is the protein sequence of NCBI gi no. 5042429 from Arabidopsis thaliana.
  • SEQ ID NO: 332 is the protein sequence of NCBI gi no. 5042428 from Arabidopsis thaliana.
  • SEQ ID NO: 333 is the protein sequence of NCBI gi no. 34098875 from Arabidopsis thaliana.
  • SEQ ID NO: 334 is the protein sequence of NCBI gi no. 1432145 from Arabidopsis thaliana.
  • SEQ ID NO: 335 is the protein sequence of NCBI gi no. 11934677 from Cucurbita maxima.
  • SEQ ID NO: 336 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 337 is the protein sequence of NCBI gi no. 13022042 from Hordeum vulgare subsp. vulgare.
  • SEQ ID NO: 338 is the protein sequence of NCBI gi no. 47498770 from Ginkgo biloba.
  • SEQ ID NO: 339 is the protein sequence of NCBI gi no. 5915847 from Zea mays.
  • SEQ ID NO: 340 is the protein sequence of NCBI gi no. 55775106 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 341 is the protein sequence of NCBI gi no. 13641298 from Brassica rapa subsp. pekinensis.
  • SEQ ID NO: 342 is the protein sequence of NCBI gi no. 4850398 from Arabidopsis thaliana.
  • SEQ ID NO: 343 is the protein sequence of NCBI gi no. 10177808 from Arabidopsis thaliana.
  • SEQ ID NO: 344 is the protein sequence of NCBI gi no. 31432758 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 345 is the protein sequence of NCBI gi no. 8346562 from Arabidopsis thaliana.
  • SEQ ID NO: 346 is the protein sequence of NCBI gi no. 10442763 from Triticum aestivum.
  • SEQ ID NO: 347 is the protein sequence of clone ID no. 292778 from Zea mays.
  • SEQ ID NO: 348 is the protein sequence of NCBI gi no. 50251848 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 349 is the protein sequence of NCBI gi no. 50927909 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 350 is the protein sequence of clone ID no. 1357674 from Arabidopsis thaliana.
  • SEQ ID NO: 351 is the protein sequence of NCBI gi no. 7267123 from Arabidopsis thaliana.
  • SEQ ID NO: 352 is the protein sequence of NCBI gi no. 22137048 from Arabidopsis thaliana.
  • SEQ ID NO: 353 is the protein sequence of clone ID no. 156577 from Arabidopsis thaliana.
  • SEQ ID NO: 354 is the protein sequence of NCBI gi no. 15223436 from Arabidopsis thaliana.
  • SEQ ID NO: 355 is the protein sequence of NCBI gi no. 26449891 from Arabidopsis thaliana.
  • SEQ ID NO: 356 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 357 is the protein sequence of clone ID no. 11278 from Arabidopsis thaliana.
  • SEQ ID NO: 358 is the protein sequence of NCBI gi no. 27544770 from Arabidopsis thaliana.
  • SEQ ID NO: 359 is the protein sequence of NCBI gi no. 9294419 from Arabidopsis thaliana.
  • SEQ ID NO: 360 is the protein sequence of NCBI gi no. 10197650 from Brassica napus.
  • SEQ ID NO: 361 is the protein sequence of NCBI gi no. 10197652 from Brassica napus.
  • SEQ ID NO: 362 is the protein sequence of NCBI gi no. 47933890 from Camptotheca acuminata.
  • SEQ ID NO: 363 is the protein sequence of NCBI gi no. 5731998 from Liquidambar styraciflua.
  • SEQ ID NO: 364 is the protein sequence of clone ID no. 545898 from Glycine max.
  • SEQ ID NO: 365 is the protein sequence of NCBI gi no. 6688937 from Populus balsamifera subsp. trichocarpa.
  • SEQ ID NO: 366 is the protein sequence of NCBI gi no. 46403211 from Centaurium erythraea.
  • SEQ ID NO: 367 is the protein sequence of NCBI gi no. 57470997 from n/a.
  • SEQ ID NO: 368 is the protein sequence of NCBI gi no. 5002354 from Lycopersicon esculentum x Lycopersicon peruvianum.
  • SEQ ID NO: 369 is the protein sequence of NCBI gi no. 7270101 from Arabidopsis thaliana.
  • SEQ ID NO: 370 is the protein sequence of NCBI gi no. 2642441 from Arabidopsis thaliana.
  • SEQ ID NO: 371 is the protein sequence of NCBI gi no. 33521521 from Medicago truncatula.
  • SEQ ID NO: 372 is the protein sequence of NCBI gi no. 2642444 from Arabidopsis thaliana.
  • SEQ ID NO: 373 is the protein sequence of NCBI gi no. 4894170 from Cicer arietinum.
  • SEQ ID NO: 374 is the protein sequence of NCBI gi no. 4200044 from Glycyrrhiza echinata.
  • SEQ ID NO: 375 is the protein sequence of NCBI gi no. 2443348 from Glycyrrhiza echinata.
  • SEQ ID NO: 376 is the protein sequence of NCBI gi no. 7270718 from Arabidopsis thaliana.
  • SEQ ID NO: 377 is the protein sequence of NCBI gi no. 7415996 from Lotus japonicus.
  • SEQ ID NO: 378 is the protein sequence of NCBI gi no. 4006851 from Arabidopsis thaliana.
  • SEQ ID NO: 379 is the protein sequence of NCBI gi no. 51968888 from Arabidopsis thaliana.
  • SEQ ID NO: 380 is the protein sequence of NCBI gi no. 20197777 from Arabidopsis thaliana.
  • SEQ ID NO: 381 is the protein sequence of NCBI gi no. 2739008 from Glycine max.
  • SEQ ID NO: 382 is the protein sequence of clone ID no. 779234 from Triticum aestivum.
  • SEQ ID NO: 383 is the protein sequence of NCBI gi no. 50948231 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 384 is the protein sequence of NCBI gi no. 5921925 from Pinus radiata.
  • SEQ ID NO: 385 is the protein sequence of clone ID no. 624225 from Glycine max.
  • SEQ ID NO: 386 is the protein sequence of clone ID no. 627596 from Glycine max.
  • SEQ ID NO: 387 is the protein sequence of NCBI gi no. 50916627 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 388 is the protein sequence of NCBI gi no. 50939101 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 389 is the protein sequence of NCBI gi no. 52076870 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 390 is the protein sequence of NCBI gi no. 438241 from Solanum melongena.
  • SEQ ID NO: 391 is the protein sequence of NCBI gi no. 9665096 from Arabidopsis thaliana.
  • SEQ ID NO: 392 is the protein sequence of NCBI gi no. 12231886 from Matthiola incana.
  • SEQ ID NO: 393 is the protein sequence of clone ID no. 595123 from Glycine max.
  • SEQ ID NO: 394 is the protein sequence of NCBI gi no. 28603528 from Glycine max.
  • SEQ ID NO: 395 is the protein sequence of NCBI gi no. 12231914 from Pelargonium x hortorum.
  • SEQ ID NO: 396 is the protein sequence of NCBI gi no. 5921647 from Petunia x hybrida.
  • SEQ ID NO: 397 is the protein sequence of NCBI gi no. 38093218 from Ipomoea purpurea.
  • SEQ ID NO: 398 is the protein sequence of NCBI gi no. 44889632 from Allium cepa.
  • SEQ ID NO: 399 is the protein sequence of NCBI gi no. 12231880 from Callistephus chinensis.
  • SEQ ID NO: 400 is the protein sequence of NCBI gi no. 38093216 from Ipomoea nil.
  • SEQ ID NO: 401 is the protein sequence of NCBI gi no. 31431083 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 402 is the protein sequence of NCBI gi no. 14278925 from Perilla frutescens.
  • SEQ ID NO: 403 is the protein sequence of NCBI gi no. 62086547 from n/a .
  • SEQ ID NO: 404 is the protein sequence of NCBI gi no. 19910935 from Torenia hybrida.
  • SEQ ID NO: 405 is the protein sequence of NCBI gi no. 42821962 from Ipomoea quamoclit.
  • SEQ ID NO: 406 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 407 is the protein sequence of NCBI gi no. 1890152 from Arabidopsis thaliana.
  • SEQ ID NO: 408 is the protein sequence of NCBI gi no. 16973300 from Nicotiana attenuata.
  • SEQ ID NO: 409 is the protein sequence of NCBI gi no. 29373135 from Citrus sinensis.
  • SEQ ID NO: 410 is the protein sequence of NCBI gi no. 20160362 from Solanum tuberosum.
  • SEQ ID NO: 411 is the protein sequence of NCBI gi no. 1352186 from Linum usitatissimum.
  • SEQ ID NO: 412 is the protein sequence of NCBI gi no. 7581989 from Lycopersicon esculentum.
  • SEQ ID NO: 413 is the protein sequence of NCBI gi no. 21616113 from Cucumis melo.
  • SEQ ID NO: 414 is the protein sequence of NCBI gi no. 33504426 from Medicago truncatula.
  • SEQ ID NO: 415 is the protein sequence of NCBI gi no. 50919025 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 416 is the protein sequence of clone ID no.
  • SEQ ID NO: 417 is the protein sequence of clone ID no. 632974 from Triticum aestivum.
  • SEQ ID NO: 418 is the protein sequence of NCBI gi no. 8572559 from Citrus sinensis.
  • SEQ ID NO: 419 is the protein sequence of NCBI gi no. 4566493 from Pinus taeda.
  • SEQ ID NO: 420 is the protein sequence of NCBI gi no. 1574976 from Populus tremuloides.
  • SEQ ID NO: 421 is the protein sequence of NCBI gi no. 12276037 from Populus x generosa.
  • SEQ ID NO: 422 is the protein sequence of NCBI gi no. 3915089 from Populus kitakamiensis.
  • SEQ ID NO: 423 is the protein sequence of NCBI gi no. 4688632 from Cicer arietinum.
  • SEQ ID NO: 424 is the protein sequence of NCBI gi no. 1044868 from Glycine max.
  • SEQ ID NO: 425 is the protein sequence of NCBI gi no. 586081 from Medicago sativa.
  • SEQ ID NO: 426 is the protein sequence of NCBI gi no. 2624383 from Phaseolus vulgaris.
  • SEQ ID NO: 427 is the protein sequence of NCBI gi no. 19864010 from Pisum sativum.
  • SEQ ID NO: 428 is the protein sequence of NCBI gi no. 9957081 from Pisum sativum.
  • SEQ ID NO: 429 is the protein sequence of NCBI gi no. 7430603 from Pisum sativum.
  • SEQ ID NO: 430 is the protein sequence of NCBI gi no. 4753128 from Pisum sativum.
  • SEQ ID NO: 431 is the protein sequence of NCBI gi no. 586082 from Vigna radiata.
  • SEQ ID NO: 432 is the protein sequence of NCBI gi no. 3915088 from Petroselinum crispum.
  • SEQ ID NO: 433 is the protein sequence of NCBI gi no. 473229 from Catharanthus roseus.
  • SEQ ID NO: 434 is the protein sequence of NCBI gi no. 12003968 from Capsicum annuum.
  • SEQ ID NO: 435 is the protein sequence of NCBI gi no. 14423323 from Nicotiana tabacum.
  • SEQ ID NO: 436 is the protein sequence of NCBI gi no. 14423325 from Nicotiana tabacum.
  • SEQ ID NO: 437 is the protein sequence of NCBI gi no. 18859 from Helianthus tuberosus.
  • SEQ ID NO: 438 is the protein sequence of NCBI gi no. 14192803 from Sorghum bicolor.
  • SEQ ID NO: 439 is the protein sequence of clone ID no. 755965 from Triticum aestivum.
  • SEQ ID NO: 440 is the protein sequence of NCBI gi no. 44889626 from Allium cepa.
  • SEQ ID NO: 441 is the protein sequence of NCBI gi no. 47933894 from Camptotheca acuminata.
  • SEQ ID NO: 442 is the protein sequence of NCBI gi no. 9965897 from Gossypium arboreum.
  • SEQ ID NO: 443 is the protein sequence of NCBI gi no. 9965899 from Gossypium arboreum.
  • SEQ ID NO: 444 is the protein sequence of NCBI gi no. 3915112 from Zinnia elegans.
  • SEQ ID NO: 445 is the protein sequence of NCBI gi no. 16555877 from Lithospermum erythrorhizon.
  • SEQ ID NO: 446 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 447 is the protein sequence of NCBI gi no. 13548653 from Ruta graveolens.
  • SEQ ID NO: 448 is the protein sequence of NCBI gi no. 24571503 from Ruta graveolens.
  • SEQ ID NO: 449 is the protein sequence of NCBI gi no. 14210375 from Citrus x paradisi.
  • SEQ ID NO: 450 is the protein sequence of NCBI gi no. 51702531 from Agastache rugosa.
  • SEQ ID NO: 451 is the protein sequence of NCBI gi no. 55168223 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 452 is the protein sequence of NCBI gi no. 3915095 from Glycyrrhiza echinata.
  • SEQ ID NO: 453 is the protein sequence of NCBI gi no. 29123387 from Ammi majus.
  • SEQ ID NO: 454 is the protein sequence of NCBI gi no. 21842133 from Zea mays.
  • SEQ ID NO: 455 is the protein sequence of clone ID no. 1478163 from Zea mays.
  • SEQ ID NO: 456 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 457 is the protein sequence of NCBI gi no. 34912880 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 458 is the protein sequence of NCBI gi no. 34912882 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 459 is the protein sequence of NCBI gi no. 27754241 from Arabidopsis thaliana.
  • SEQ ID NO: 460 is the protein sequence of NCBI gi no. 9294387 from Arabidopsis thaliana.
  • SEQ ID NO: 461 is the protein sequence of NCBI gi no. 15231899 from Arabidopsis thaliana.
  • SEQ ID NO: 462 is the protein sequence of NCBI gi no. 9294386 from Arabidopsis thaliana.
  • SEQ ID NO: 463 is the protein sequence of NCBI gi no. 24111277 from Arabidopsis thaliana.
  • SEQ ID NO: 464 is the protein sequence of NCBI gi no. 13661770 from Lolium rigidum.
  • SEQ ID NO: 465 is the protein sequence of NCBI gi no. 13661768 from Lolium rigidum.
  • SEQ ID NO: 466 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 467 is the protein sequence of NCBI gi no. 13661774 from Lolium rigidum.
  • SEQ ID NO: 468 is the protein sequence of NCBI gi no. 13661772 from Lolium rigidum.
  • SEQ ID NO: 469 is the protein sequence of NCBI gi no. 20465787 from Arabidopsis thaliana.
  • SEQ ID NO: 470 is the protein sequence of clone ID no. 479101 from Glycine max.
  • SEQ ID NO: 471 is the protein sequence of NCBI gi no. 21542404 from Arabidopsis thaliana.
  • SEQ ID NO: 472 is the protein sequence of NCBI gi no. 9294291 from Arabidopsis thaliana.
  • SEQ ID NO: 473 is the protein sequence of NCBI gi no. 30923413 from Arabidopsis thaliana.
  • SEQ ID NO: 474 is the protein sequence of NCBI gi no. 7649376 from Arabidopsis thaliana.
  • SEQ ID NO: 475 is the protein sequence of NCBI gi no. 3164132 from Arabidopsis thaliana.
  • SEQ ID NO: 476 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 477 is the protein sequence of NCBI gi no. 20197089 from Arabidopsis thaliana.
  • SEQ ID NO: 478 is the protein sequence of NCBI gi no. 7430672 from Arabidopsis thaliana.
  • SEQ ID NO: 479 is the protein sequence of NCBI gi no. 9294288 from Arabidopsis thaliana.
  • SEQ ID NO: 480 is the protein sequence of NCBI gi no. 13878369 from Arabidopsis thaliana.
  • SEQ ID NO: 481 is the protein sequence of NCBI gi no. 11994442 from Arabidopsis thaliana.
  • SEQ ID NO: 482 is the protein sequence of NCBI gi no. 4850393 from Arabidopsis thaliana.
  • SEQ ID NO: 483 is the protein sequence of NCBI gi no. 11994438 from Arabidopsis thaliana.
  • SEQ ID NO: 484 is the protein sequence of NCBI gi no. 23506035 from Arabidopsis thaliana.
  • SEQ ID NO: 485 is the protein sequence of NCBI gi no. 16226474 from Arabidopsis thaliana.
  • SEQ ID NO: 486 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 487 is the protein sequence of NCBI gi no. 15238720 from Arabidopsis thaliana.
  • SEQ ID NO: 488 is the protein sequence of NCBI gi no. 13878407 from Arabidopsis thaliana.
  • SEQ ID NO: 489 is the protein sequence of NCBI gi no. 51971443 from Arabidopsis thaliana.
  • SEQ ID NO: 490 is the protein sequence of NCBI gi no. 1345641 from Thlaspi arvense.
  • SEQ ID NO: 491 is the protein sequence of NCBI gi no. 15238726 from Arabidopsis thaliana.
  • SEQ ID NO: 492 is the protein sequence of NCBI gi no. 20465427 from Arabidopsis thaliana.
  • SEQ ID NO: 493 is the protein sequence of NCBI gi no. 42566749 from Arabidopsis thaliana.
  • SEQ ID NO: 494 is the protein sequence of NCBI gi no. 4584536 from Arabidopsis thaliana.
  • SEQ ID NO: 495 is the protein sequence of NCBI gi no. 34365731 from Arabidopsis thaliana.
  • SEQ ID NO: 496 is the protein sequence of clone ID no.
  • SEQ ID NO: 497 is the protein sequence of NCBI gi no. 34903888 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 498 is the protein sequence of NCBI gi no. 34903876 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 499 is the protein sequence of NCBI gi no. 34903874 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 500 is the protein sequence of NCBI gi no. 34903880 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 501 is the protein sequence of NCBI gi no. 20856398 from Arabidopsis thaliana.
  • SEQ ID NO: 502 is the protein sequence of clone ID no. 158108 from Arabidopsis thaliana.
  • SEQ ID NO: 503 is the protein sequence of NCBI gi no. 21553706 from Arabidopsis thaliana.
  • SEQ ID NO: 504 is the protein sequence of NCBI gi no. 20259299 from Arabidopsis thaliana.
  • SEQ ID NO: 505 is the protein sequence of NCBI gi no. 42572955 from Arabidopsis thaliana.
  • SEQ ID NO: 506 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 507 is the protein sequence of clone ID no. 512411 from Glycine max.
  • SEQ ID NO: 508 is the protein sequence of clone ID no. 791996 from Triticum aestivum.
  • SEQ ID NO: 509 is the protein sequence of clone ID no. 579678 from Triticum aestivum.
  • SEQ ID NO: 510 is the protein sequence of NCBI gi no. 7267932 from Arabidopsis thaliana.
  • SEQ ID NO: 511 is the protein sequence of NCBI gi no. 28973099 from Arabidopsis thaliana.
  • SEQ ID NO: 512 is the protein sequence of NCBI gi no. 11345411 from Matthiola incana.
  • SEQ ID NO: 513 is the protein sequence of clone ID no. 525053 from Glycine max.
  • SEQ ID NO: 514 is the protein sequence of NCBI gi no. 51536592 from Arabidopsis thaliana.
  • SEQ ID NO: 515 is the protein sequence of NCBI gi no. 3885331 from Arabidopsis thaliana.
  • SEQ ID NO: 516 is the protein sequence of NCBI gi no. 3885330 from Arabidopsis thaliana.
  • SEQ ID NO: 517 is the protein sequence of NCBI gi no. 9294001 from Arabidopsis thaliana.
  • SEQ ID NO: 518 is the protein sequence of NCBI gi no. 9294002 from Arabidopsis thaliana.
  • SEQ ID NO: 519 is the protein sequence of NCBI gi no. 28973083 from Arabidopsis thaliana.
  • SEQ ID NO: 520 is the protein sequence of NCBI gi no. 30689861 from Arabidopsis thaliana.
  • SEQ ID NO: 521 is the protein sequence of NCBI gi no. 2344895 from Arabidopsis thaliana.
  • SEQ ID NO: 522 is the protein sequence of NCBI gi no. 28460683 from Arabidopsis thaliana.
  • SEQ ID NO: 523 is the protein sequence of NCBI gi no. 18481718 from Sorghum bicolor.
  • SEQ ID NO: 524 is the protein sequence of clone ID no. 244116 from Zea mays.
  • SEQ ID NO: 525 is the protein sequence of NCBI gi no. 50940815 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 526 is the protein sequence of NCBI gi no.
  • SEQ ID NO: 527 is the protein sequence of NCBI gi no. 37954114 from Pisum sativum.
  • SEQ ID NO: 528 is the protein sequence of NCBI gi no. 53792013 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 529 is the protein sequence of NCBI gi no. 50727139 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 530 is the protein sequence of NCBI gi no. 50957226 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 531 is the protein sequence of NCBI gi no. 53792010 from Oryza sativa subsp. japonica.
  • SEQ ID NO: 532 is the protein sequence of NCBI gi no. 34304722 from Stevia rebaudiana.
  • SEQ ID NO: 533 is the protein sequence of NCBI gi no. 46811123 from Fragaria grandiflora.
  • SEQ ID NO: 534 is the DNA sequence of promoter ID no. p13879 from Arabidopsis thaliana.
  • SEQ ID NO: 535 is the DNA sequence of promoter ID no. p32449 from Arabidopsis thaliana.
  • SEQ ID NO: 536 is the DNA sequence of promoter ID no.
  • SEQ ID NO: 537 is the DNA sequence of promoter ID no. YP0050 from Arabidopsis thaliana.
  • SEQ ID NO: 538 is the DNA sequence of promoter ID no. YP0190 from Arabidopsis thaliana.
IV. DETAILED DESCRIPTION
  • The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. These terms refer only to the primary structure of the molecule and thus include double- and single-stranded DNA and RNA.
  • The alphabetical representation of a nucleic acid can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
  • The term "regulatory region" refers to a nucleic acid sequence that modulates, e.g., regulates, facilitates or drives, the expression of a second nucleic acid sequence.
  • A regulatory region may include sequences that stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or non-inducible manner. Regulatory regions may include multiple control elements. Typical control elements include, but are not limited to, transcription promoters, transcription enhancer elements, transcription termination signals, polyadenylation sequences (located 3' to the translation stop codon), sequences for optimization of initiation of translation (located 5' to the coding sequence), translation enhancing sequences, and translation termination sequences.
  • Transcription promoters can include inducible promoters (where expression of a polynucleotide sequence operably linked to the promoter is induced by an analyte, cofactor, regulatory protein, etc.), tissue-specific promoters (where expression of a polynucleotide sequence operably linked to the promoter is induced only in selected tissue), repressible promoters (where expression of a polynucleotide sequence operably linked to the promoter is repressed by an analyte, cofactor, regulatory protein, etc.), and constitutive promoters.
  • "Expression enhancing sequences" typically refer to control elements that improve transcription or translation of a polynucleotide relative to the expression level in the absence of such control elements (for example, promoters, promoter enhancers, enhancer elements, and translational enhancers (e.g., Shine-Dalgarno sequences)).
  • The term "operably linked" refers to covalent linkage of two or more nucleic acid sequences in such a way as to permit modulation of transcription and/or translation of the nucleic acid by the one or more regulatory regions.
  • The control elements of the regulatory region need not be contiguous with the coding sequence, so long as they function to direct the expression thereof.
  • Intervening untranslated yet transcribed sequences can be present between a promoter and the coding sequence and the promoter can still be considered "operably linked" to the coding sequence.
  • The term "exogenous nucleic acid" indicates that the nucleic acid is part of a recombinant nucleic acid construct, or is not in its natural environment.
  • An exogenous nucleic acid can be a sequence from one species introduced into another species, i.e., a heterologous nucleic acid. Typically, such an exogenous nucleic acid is introduced into the other species via a recombinant nucleic acid construct.
  • An exogenous nucleic acid can also be a sequence that is native to an organism and that has been reintroduced into cells of that organism.
  • An exogenous nucleic acid that includes a native sequence can often be distinguished from the naturally occurring sequence by the presence of non-natural sequences linked to the exogenous nucleic acid, e.g., non-native regulatory sequences flanking a native sequence in a recombinant nucleic acid construct.
  • Stably transformed exogenous nucleic acids typically are integrated at positions other than the position where the native sequence is found and are transferable to the progeny of the cell.
  • A P450 polypeptide can be endogenous or exogenous to a particular plant or plant cell.
  • Exogenous P450 polypeptides, therefore, can include polypeptides that are native to a plant or plant cell, but that are expressed in a plant cell via a recombinant nucleic acid construct.
  • A regulatory region can be exogenous or endogenous to a plant or plant cell.
  • An exogenous regulatory region is a regulatory region that is part of a recombinant nucleic acid construct, or is not in its natural environment.
  • An Arabidopsis promoter present on a recombinant nucleic acid construct is an exogenous regulatory region when an Arabidopsis plant cell is transformed with the construct.
  • The term "polypeptide" or "protein sequence" is used in its broadest sense to refer to a compound of two or more subunit amino acids. A peptide of three or more amino acids is commonly called an oligopeptide if the peptide chain is short.
  • If the peptide chain is long, the peptide is typically called a polypeptide or a protein. Full-length proteins, analogs, mutants and fragments thereof are encompassed by the definition.
  • The terms also include post-expression modifications of the polypeptide, for example, glycosylation, acetylation, phosphorylation, methylation and the like.
  • A particular polypeptide may be obtained as an acidic or basic salt, or in neutral form.
  • A polypeptide may be obtained directly from the source organism, or may be recombinantly or synthetically produced (see further below).
  • By "transgenic plant" is meant a plant into which one or more exogenous polynucleotides have been introduced using genetic engineering tools. Examples of means by which this can be accomplished are described below, and include Agrobacterium-mediated transformation, biolistic methods, electroporation, and the like.
  • "Misexpression or aberrant expression", as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength
  • The term "pharmacological activity" refers to a property of a compound that confers a benefit when the compound is used to treat, diagnose or prevent a human or veterinary disease condition.
  • a pharmacological activity of a compound can provide a valuable pharmaceutical activity, e.g., the pharmacological activity of a compound that is the subject of an Investigational New Drug application before the United States Food and Drug Administration (FDA), is the subject of a phase I, II, or III clinical trial, is claimed in a patent listed in the FDA Orange Book or is approved for commercial sale by the FDA.
  • the pharmacological activity of the compound is one of the following: Alzheimer's disease treatment, analgesic activity, anesthetic activity, anti-Addison's disease activity, anti-HIV activity, anti-infective activity, anti-inflammatory activity, antianginal activity, antiangiogenic activity, antianxiety activity, antiarrhythmic activity, antiarthritic activity, antiatherosclerotic activity, antibacterial activity, antibiotic activity, anticancer activity, anticholesterol activity, anticholinergic activity, anticoagulant activity, anticonvulsant activity, antidepressant activity, antidiabetic activity, antidiuretic activity, antiedemic activity, antifungal activity, antigout activity, antiglaucoma activity, antihemorrhagic activity, antihistamine activity, antihypertensive activity, antimalarial activity, antimicrobial activity, antimigraine activity, antimotion sickness activity, antineoplastic activity, antineuralgic activity, antiobesity activity, antioxidant activity, antiparasi
  • the candidate can have several activities.
  • the candidate can have anticancer, antipsoriasis, and antivitiligo activities (e.g., 8-methoxypsoralen).
  • the candidate compound is not an antibiotic or an antifungal agent.
  • Suitable candidate compounds include (i) compounds listed in the PharmaProjects database, a research and development tracking database available from PJB Publications at pjbpubs.com/pharmaprojects/index.htm.
  • the present invention comprises plant cells expressing a P450 polypeptide that can be utilized to screen for P450s that modify candidate compounds, such as pharmacologically active compounds.
  • the present invention also comprises plant cells expressing a P450 polypeptide that can be utilized to screen for substrate(s) for the expressed P450, e.g., pharmacologically active compounds that can serve as substrates for the P450.
  • the invention provides methods of screening for a substrate of a P450.
  • Such methods comprise contacting a pharmacologically active candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533. Following the contact, it is determined whether the candidate compound is modified. Detection of a modification to the candidate indicates that the candidate compound is a substrate.
  • A collection of P450s can be screened by making a library of plant cells, each member of the library containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 in the collection. In this way, the collection of P450s can be screened by repeating the contacting and determining steps with plant cells from each member of the library.
  • Methods of making the modified compound can be used. Such methods comprise contacting a candidate compound with transgenic plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the P450 modifies the candidate compound.
  • the modified candidate compound can then be recovered from the plant cells if desired.
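The "80% or greater identity" criterion used in the methods above can be illustrated with a minimal sketch. It assumes the two protein sequences have already been aligned by some external tool and counts identical residues over the columns where neither sequence has a gap. The function names, the gap character and the toy sequences are illustrative assumptions, not part of this disclosure, and other conventions for the denominator exist; the definition of percent identity given in the specification governs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length,
    '-' marking gaps). This is one common convention, used here as an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptides (hypothetical, not taken from the sequence listing).
    print(percent_identity("MKTAYIAKQR", "MKTAHIAKQR"))
    print(meets_identity_threshold("MKTAYIAKQR", "MKTAHIAKQR"))
```

In practice the alignment itself would come from a program such as BLAST or ClustalW, and the identity value reported by that program would normally be used directly.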
  • a candidate compound is contacted with a P450 for at least one hour, e.g., 2 hours, 4 hours, 8 hours, 12 hours, 1 day (24 hours), 2 days, 4 days, 7 days or 14 days. In certain embodiments, a candidate compound is contacted with a P450 for no more than 28 days, e.g., 14 days, 7 days, 4 days, 2 days or 1 day.
  • the candidate compound can be contacted with the P450 for between about 1 hour and about 14 days, between about 8 hours and about 7 days, between about 8 hours and about 4 days, between about 8 hours and about 48 hours, between about 1 day and about 14 days, between about 2 days and about 14 days, between about 5 days and about 10 days, between about 6 days and about 14 days, between about 7 days and about 18 days, or between about 2 days and about 7 days.
  • candidate compounds are suitable for use in the methods described herein.
  • Suitable candidate compounds having pharmacological activity include, without limitation, aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemisinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron
  • a candidate compound can be 8-methoxypsoralen (8-MOP).
  • 8-MOP has antiproliferative activity, antipsoriasis activity, antivitiligo activity, and anticancer activity. In some cases, 8-MOP can be used for preventing cancer. In other cases, 8-MOP can also be used to facilitate smoking cessation.
  • a compound modified by plant cells expressing a P450 can be further modified, e.g., compounds exposed to a P450 in metabolically rich plant cells can be modified both by action of the P450 itself as well as by subsequent action of endogenous enzymes to further modify the initial compound.
  • Such further modifications of the compound include, but are not limited to, acetylation, carboxylation, glycosylation, demethylation, O-methylation, O-acetylation, decarboxylation, oxime generation, oxidation, phosphorylation, lipidation and acylation.
  • Such subsequent modifications to a compound, mediated by enzymes endogenous to such plant cells, can provide a rich source of novel compounds that, in certain cases, may exhibit pharmacological activity.
  • Candidate compound collections may contain molecules isolated from natural sources, artificially synthesized molecules, or molecules synthesized, isolated, or otherwise prepared in such a manner so as to have one or more moieties variable, e.g., moieties that are independently isolated or randomly synthesized.
  • The collection consists of compounds known to have at least some therapeutic effect in humans or other animals.
  • The collection consists of odorants or flavorants.
  • Compound libraries for use in the invention may be purchased on the commercial market or prepared or obtained by means including, but not limited to, combinatorial chemistry techniques, fermentation methods, plant and cellular extraction procedures and the like (see, e.g., Cwirla et al., (1990) Biochemistry, 87, 6378-6382; Houghten et al., (1991) Nature, 354, 84-86; Lam et al., (1991) Nature, 354, 82-84; Brenner et al., (1992) Proc. Natl. Acad. Sci. USA, 89, 5381-5383; R. A. Houghten, (1993) Trends Genet., 9, 235-239; E. R.
  • Libraries of a variety of types of candidate compounds can be prepared in order to obtain members having one or more preselected attributes that can be prepared by a variety of techniques, including but not limited to parallel array synthesis (Houghten, (2000) Annu Rev Pharmacol Toxicol 40:273-82, Parallel array and mixture-based synthetic combinatorial chemistry); solution-phase combinatorial chemistry (Merritt, (1998) Comb Chem High Throughput Screen 1(2):57-72, Solution phase combinatorial chemistry; Coe et al., (1998-99) Mol Divers 4(1):31-8, Solution-phase combinatorial chemistry; Sun, (1999) Comb Chem High Throughput Screen 2(6):299-318, Recent advances in liquid-phase combinatorial chemistry); synthesis on soluble polymer (Gravert et al., (1997) Curr Opin Chem Biol 1(1):107-13, Synthesis on soluble polymers: new reactions and the construction of small molecules); and the like.
  • a candidate compound need not be purified prior to contacting it with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450.
  • the candidate compound can be a component of an extract, solution, suspension, blend or emulsion.
  • the candidate compound represents at least 10% (w/w) of the solutes, on a dry basis, of an extract, solution, suspension, blend or emulsion.
  • the compound represents at least 25% (w/w), 50% (w/w), 75% (w/w), or 90% (w/w) of the solutes of the extract, solution, suspension, blend or emulsion.
  • the candidate compound may be substantially pure and is contacted with plant cells in an amount of from 1 nM to about 100 mM, e.g., from about 100 nM to about 1 mM, or about 1 µM to about 100 µM.
  • a plurality of candidate compounds can be contacted simultaneously with plant cells.
  • the plurality of candidate compounds can be components of a mixture of known compounds, i.e., a pool of candidate compounds to be screened or modified.
  • the plurality of candidate compounds can be components of an unknown or partially characterized mixture, e.g., an extract of an organism, such as bacteria, yeast or plant or animal tissue.
  • the plurality of candidate compounds also can be components of a crude reaction mixture from a synthetic process or combinatorial chemistry reaction process.
  • Whether or not modification of a compound has occurred can be determined by various methods, including but not limited to gas chromatography-mass spectrometry (GC-MS), liquid chromatography-MS (LC-MS), HPLC, PDA detection, electrospray ionization-MS, Fourier-transform-ion-cyclotron-resonance-MS, or nuclear magnetic resonance (NMR).
  • mass spectrometry is a widely used technique for the characterization and identification of molecules, both in organic and inorganic chemistry. MS provides molecular weight information about a molecule.
  • the molecular weight of a molecule is a useful piece of information in the identification of a particular molecule in a mixture of molecules.
  • the term mass spectrometer refers to an analytical device that uses the difference in mass-to-charge ratio (m/z) of ionized atoms or molecules to separate them from each other. Mass spectrometry is therefore useful for quantitation of atoms or molecules and also for determining chemical and structural information about molecules. Molecules have distinctive fragmentation patterns that provide structural information to identify structural components.
  • the general operation of a mass spectrometer is: (a) create gas-phase ions; (b) separate the ions in space or time based on their mass-to-charge ratio; and (c) measure the quantity of ions of each mass-to-charge ratio. The ion separation power of a mass spectrometer is described by its resolution.
  • mass spectrometers may be coupled to separation means such as gas chromatography (GC) and high performance liquid chromatography (HPLC).
  • in gas chromatography-mass spectrometry (GC/MS), capillary columns from a gas chromatograph are coupled directly to the mass spectrometer, optionally using a jet separator.
  • the gas chromatography (GC) column separates sample components from the sample gas mixture and the separated components are ionized and chemically analyzed in the mass spectrometer.
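  • As a minimal sketch of how molecular weight information can indicate whether a compound has been modified, the following Python snippet compares the mass difference between a substrate and a product against a few monoisotopic mass shifts. The list of modifications, the function name `match_modification`, and the 0.01 Da tolerance are illustrative assumptions and are not taken from this disclosure.

```python
# Monoisotopic mass shifts in daltons for a few example modifications.
# This list is illustrative only and is far from exhaustive.
MASS_SHIFTS = {
    "hydroxylation (+O)": 15.994915,
    "demethylation (-CH2)": -14.015650,
    "glucosylation (+C6H10O5)": 162.052825,
}


def match_modification(substrate_mass, product_mass, tol=0.01):
    """Return the names of modifications whose mass shift matches the
    observed substrate-to-product mass difference within `tol` daltons."""
    delta = product_mass - substrate_mass
    return [
        name
        for name, shift in MASS_SHIFTS.items()
        if abs(delta - shift) <= tol
    ]


if __name__ == "__main__":
    # Hypothetical masses: a product about 15.995 Da heavier than its substrate.
    print(match_modification(216.0423, 232.0372))
```

  • A match of this kind only suggests a candidate modification; fragmentation patterns or NMR data would still be needed to confirm the structure.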
  • P450 polypeptides and nucleic acids encoding them.
  • the P450 superfamily includes a large number of enzymes that catalyze a wide variety of chemical reactions in a broad range of substrates. Some P450 enzymes are very substrate-specific, while others are more catholic in their substrate selection.
  • Known P450 substrates include steroids, eicosanoids, fatty acids, lipid hydroperoxides, retinoids and xenobiotics, such as drugs, alcohols, carcinogens, antioxidants, organic solvents, dyes, pesticides, odorants and flavorants.
  • P450 polypeptides are commonly known to be oxygenating enzymes.
  • P450 polypeptides also catalyze 2-electron reduction of compounds, reductively cleave xenobiotics and lipid hydroperoxides, and mediate various dealkylation, epoxidation and sulfoxidation reactions.
  • Cytochrome P450s are heme-containing enzymes.
  • One step in the oxidation of a substrate by a P450 is the addition of one atom of molecular oxygen, which is activated by a reduced heme iron, to the substrate.
  • the reaction is usually, but not always, a hydroxylation reaction.
  • the second oxygen atom is reduced to water, thereby accepting electrons from NAD(P)H via a flavoprotein or a ferredoxin.
  • the activation of oxygen is common to all P450s. This activation takes place at the iron-protoporphyrin IX (heme).
  • the heme iron is six-fold coordinated. It has a conserved thiolate residue as the fifth ligand and, in the inactive ferric form, a water molecule as its sixth ligand.
  • the catalytic mechanism involves six steps:
  • Typical cytochrome P450s contain characteristic domains as defined by Werck-Reichhart et al., "Cytochromes P450", The Arabidopsis Book, Werck-Reichhart, D., Bak, S., Paquette, S. (2002).
  • Typical domains include an I-helix domain involved in oxygen binding and activation, a K-helix domain containing an ERR triad involved in locking the heme into position, a heme-binding domain, and an N-terminal region consisting of a membrane insertion domain and a hinge region containing a section with basic residues followed by a proline rich region.
  • P450 polypeptides are often classified into clades based on sequence identity.
  • An amino acid sequence identity of about 40% between polypeptides is accepted as indicating membership in a given P450 family.
  • An amino acid sequence identity of about 55% among members of a given P450 family indicates that the members are part of the same subfamily, and an amino acid sequence identity of about 95% indicates that the members of a given P450 subfamily are isoforms.
  • P450 classifications on the website drnelson.utmem.edu/CytochromeP450 can be utilized by those skilled in the art.
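  • The identity thresholds described above can be sketched as a simple classifier. The text gives approximate values ("about" 40%, 55%, and 95%), so treating them as hard cut-offs is an assumption made here for illustration, as are the function name and the returned labels.

```python
def classify_p450_relationship(percent_identity):
    """Map an amino acid percent identity to the closest P450 relationship
    implied by the approximate 40% / 55% / 95% thresholds."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 95.0:
        return "isoforms"
    if percent_identity >= 55.0:
        return "same subfamily"
    if percent_identity >= 40.0:
        return "same family"
    return "different families"


if __name__ == "__main__":
    for value in (35.0, 47.5, 62.0, 97.3):
        print(value, classify_p450_relationship(value))
```

  • In practice, borderline values would be resolved by reference to curated P450 nomenclature rather than by a numeric threshold alone.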
  • functional homologs of a P450 polypeptide are members of the same P450 family, and often are members of the same subfamily or are isoforms.
  • a P450 functional homolog typically has at least about 80% amino acid sequence identity, e.g., about 80% amino acid sequence identity, about 85% amino acid sequence identity, about 90% sequence identity, about 95% amino acid sequence identity, or at least about 98% amino acid sequence identity with a P450 comprising a sequence having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
  • Table 1 provides an exemplary list of P450s useful in the invention and further provides the classification into a P450 family and locus in the Arabidopsis thaliana genome of each identified sequence.
  • Suitable P450 polypeptides can be identified by analysis of polypeptide sequence alignments involving analysis of nonredundant databases using amino acid sequences of P450 polypeptides. Those polypeptides in the database that have greater than 40% sequence identity are candidates for further evaluation for suitability as a P450 polypeptide. If desired, manual inspection of such candidates can be carried out in order to narrow the number of candidates to be further evaluated. Manual inspection can be performed by selecting those candidates that appear to have domains suspected of being present in P450 polypeptides. See, e.g., the Pfam web site at sanger.ac.uk/Pfam and genome.wustl.edu/Pfam.
  • percent sequence identity refers to the degree of identity between any given query sequence and a subject sequence.
  • a percent identity for any query nucleic acid or amino acid sequence, e.g., a P450 polypeptide, relative to another subject nucleic acid or amino acid sequence can be determined as follows.
  • a query nucleic acid or amino acid sequence is aligned to one or more subject nucleic acid or amino acid sequences using the computer program ClustalW (version 1.83, default parameters), which allows alignments of nucleic acid or protein sequences to be carried out across their entire length (global alignment).
  • ClustalW calculates the best match between a query and one or more subject sequences, and aligns them so that identities, similarities and differences can be determined. Gaps of one or more residues can be inserted into a query sequence, a subject sequence, or both, to maximize sequence alignments.
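  • For fast pairwise alignment of nucleic acid sequences, the following parameters are used: word size: 2; window size: 4; scoring method: percentage; number of top diagonals: 4; and gap penalty: 5.
  • For multiple alignment of nucleic acid sequences, the following parameters are used: gap opening penalty: 10.0; gap extension penalty: 5.0; and weight transitions: yes.
  • For fast pairwise alignment of protein sequences, the following parameters are used: word size: 1; window size: 5; scoring method: percentage; number of top diagonals: 5; and gap penalty: 3.
  • For multiple alignment of protein sequences, the following parameters are used: weight matrix: blosum; gap opening penalty: 10.0; gap extension penalty: 0.05; hydrophilic gaps: on; hydrophilic residues: glycine, proline, asparagine, aspartic acid, glutamine, glutamic acid, arginine, and lysine; and residue-specific gap penalties: on.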
  • the output is a sequence alignment that reflects the relationship between sequences.
  • ClustalW can be run, for example, at the Baylor College of Medicine Search Launcher site searchlauncher.bcm.tmc.edu/multi-align/multi-align and at the European Bioinformatics Institute site ebi.ac.uk/clustalw.
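  • For a locally installed copy of ClustalW, the alignment settings listed above might be passed on the command line roughly as sketched below. This sketch assumes that the word-size-1 fast pairwise set and the BLOSUM multiple-alignment set are the protein settings. The executable name `clustalw2`, the file names, and the helper `clustalw_protein_args` are also assumptions. The option names follow the ClustalW command-line help as commonly documented and should be checked against the installed version. The snippet only builds the argument list and does not run the program.

```python
def clustalw_protein_args(infile, outfile, executable="clustalw2"):
    """Build a ClustalW argument list for a protein alignment using the
    fast pairwise and multiple alignment parameters given in the text."""
    return [
        executable,
        f"-INFILE={infile}",
        f"-OUTFILE={outfile}",
        "-ALIGN",                  # perform a full multiple alignment
        "-TYPE=PROTEIN",
        "-QUICKTREE",              # use the fast pairwise algorithm
        "-KTUPLE=1",               # word size
        "-WINDOW=5",               # window size
        "-SCORE=PERCENT",          # scoring method: percentage
        "-TOPDIAGS=5",             # number of top diagonals
        "-PAIRGAP=3",              # fast pairwise gap penalty
        "-MATRIX=BLOSUM",          # weight matrix
        "-GAPOPEN=10.0",           # gap opening penalty
        "-GAPEXT=0.05",            # gap extension penalty
        "-HGAPRESIDUES=GPNDQERK",  # Gly, Pro, Asn, Asp, Gln, Glu, Arg, Lys
        # Hydrophilic gaps and residue-specific gap penalties are on by
        # default; -NOHGAP and -NOPGAP would switch them off.
    ]


if __name__ == "__main__":
    print(" ".join(clustalw_protein_args("p450_candidates.fasta", "p450.aln")))
```

  • The resulting list could be handed to `subprocess.run` once the option names have been confirmed for the installed release.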
  • the number of matching bases or amino acids in the alignment is divided by the total number of matched and mis-matched bases or amino acids excluding gaps, followed by multiplying the result by 100.
  • the output is the percent identity of the subject sequence with respect to the query sequence.
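  • The percent identity calculation described above can be sketched in Python as follows. The function name `percent_identity`, the use of "-" as the gap character, and the example sequences are assumptions for illustration; the two input strings are taken to be rows of an existing alignment of equal length.

```python
def percent_identity(aligned_query, aligned_subject, gap="-"):
    """Percent identity from two aligned sequences: matches divided by the
    number of matched plus mismatched positions (gaps excluded), times 100."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for q, s in zip(aligned_query.upper(), aligned_subject.upper()):
        if q == gap or s == gap:
            continue  # positions with a gap in either sequence are excluded
        compared += 1
        if q == s:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments: 4 matches, 1 mismatch, 1 gapped column.
    print(percent_identity("MKT-LV", "MRTALV"))
```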
  • the amino acid sequence of a suitable P450 polypeptide has greater than 60% sequence identity (e.g., > 60%, > 70%, > 80% or > 90%) to the amino acid sequence of the query P450 polypeptide.
  • Nucleic acids encoding P450 polypeptides for use in the methods and compositions described herein may be derived from sources including, but not limited to, bacteria, yeasts, algae, animals and plants. They can be obtained also from various other sources. The sequences obtained from those sources may be connected to a suitable regulatory region. In vitro mutagenesis, gene shuffling or de novo synthesis can also enhance translation efficiency in the host plants or change the catalytic effect of the encoded enzyme. Such modification includes, but is not limited to, modification of residues involved in catalytic function.
  • the P450 gene can be modified so as to have an optimum codon depending on the codon usage of the host or the organelle to be expressed.
  • such a gene sequence may be connected to a nucleic acid sequence encoding a suitable transit peptide.
  • a DNA sequence encoding a polypeptide can be optimized for expression in a selected cell by using the codons preferred by the selected cell to represent the DNA copy of the desired polypeptide coding sequence.
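  • A minimal sketch of this kind of codon selection is shown below. The `PREFERRED_CODON` table is a hypothetical, deliberately incomplete mapping chosen only to make the example run. It is not measured codon usage for any plant or organelle, and a real table would cover all amino acids and come from usage data for the selected host.

```python
# Hypothetical preferred codons for a handful of residues ("*" is stop).
PREFERRED_CODON = {
    "M": "ATG",
    "A": "GCT",
    "K": "AAG",
    "L": "CTT",
    "*": "TGA",
}


def back_translate(protein, preferred=PREFERRED_CODON):
    """Write a coding sequence for `protein` using one preferred codon
    per amino acid from the supplied table."""
    codons = []
    for residue in protein.upper():
        if residue not in preferred:
            raise ValueError(f"no preferred codon listed for {residue!r}")
        codons.append(preferred[residue])
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MAKL*"))
```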
  • Suitable cytochrome P450 nucleic acids include those that encode a P450 having an amino acid sequence having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
  • a recombinant nucleic acid construct disclosed herein typically includes one or more regulatory regions operably linked to the nucleic acid encoding a P450 polypeptide.
  • a promoter is located 5' to the sequence to be transcribed, and proximal to the transcriptional start site of the sequence. Promoters are upstream of the exon of a coding sequence and upstream of the first of multiple transcription start sites. In some embodiments, a promoter is positioned about 5,000 nucleotides upstream of the ATG of the exon of a coding sequence. In other embodiments, a promoter is positioned about 2,000 nucleotides upstream of the first of multiple transcription start sites.
  • a promoter typically comprises at least a core promoter.
  • a promoter may also include at least one control element such as an upstream element.
  • control elements include upstream activation regions (UARs) and optionally, other DNA sequences that affect transcription of a polynucleotide such as a synthetic upstream element.
  • a 5' untranslated region is transcribed, but is not translated, and lies between the start site of the transcript and the translation initiation codon and may include the +1 nucleotide.
  • a 3' UTR can be positioned between the translation termination codon and the end of the transcript. UTRs can have particular functions such as increasing mRNA message stability or translation attenuation. Examples of 3' UTRs include, but are not limited to polyadenylation signals and transcription termination sequences.
  • constitutive, tissue-nonspecific or developmental stage-nonspecific promoters can be used.
  • An example of a constitutive promoter is the CaMV35S promoter.
  • Inducible promoters such as heat shock promoters, wound responsive promoters such as hydroxyproline-rich protein promoters, chemically inducible promoters such as nitrate reductase promoters and dark inducible promoters such as asparagine synthetase promoters can be useful. See, e.g., U.S. Pat. No. 5,256,558.
  • Tissue-, organ- and cell-specific promoters that confer transcription only or predominantly in a particular tissue, organ, and cell type, respectively, can be used if desired.
  • promoters specific to vegetative tissues such as the stem, parenchyma, ground meristem, vascular bundle, cambium, phloem, cortex, shoot apical meristem, lateral shoot meristem, root apical meristem, lateral root meristem, leaf primordium, leaf mesophyll, or leaf epidermis can be suitable regulatory regions.
  • An example of a vegetative tissue promoter is promoter p32449 (gDNA ID 7418615; SEQ ID NO:45), which has preferential activity in the roots.
  • Other vegetative promoters include: a maize leaf-specific gene described by Busk (1997) Plant J., 11:1285-1295; kn1-related genes from maize and other species; and constitutive Cauliflower mosaic virus 35S.
  • Other suitable promoters include those that have preferential activity in organs like the stem and silique and/or that have preferential activity in specific cell-types, such as vascular bundles.
  • Examples include YP0086 (gDNA ID 7418340), YP0188 (gDNA ID 7418570), YP0263 (gDNA ID 7418658), and others, as set forth in U.S. Patent Application Ser. Nos. 60/518,075; 60/544,771; 60/505,689; 60/583,691; 10/957,569; and 60/558,869.
  • a cell type or tissue-specific promoter can drive expression of operably linked sequences in tissues other than the target tissue.
  • a cell-type or tissue-specific promoter is one that drives expression preferentially in the target tissue, but can also lead to some expression in other cell types or tissues as well.
  • Suitable regulatory regions that can be operably linked to a nucleic acid encoding a P450 polypeptide include, without limitation, those listed in the Regulatory Regions Table below.
  • Suitable P450 polypeptides can be identified by analysis of nucleotide sequence alignments utilizing known sequence alignment methods as described above for amino acid sequences.
  • the nucleotide sequence of a suitable subject nucleic acid has greater than 70% sequence identity (e.g., > 75%, > 80%, > 90%, > 91%, > 92%, > 93%, > 94%, > 95%, > 96%, > 97%, > 98%, or > 99%) to the nucleotide sequence of the query nucleic acid.
  • the degree of sequence similarity between nucleic acid sequences can be determined by hybridization of nucleic acids under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments.
  • Two DNA sequences, or two polypeptide sequences, are "substantially homologous" to each other when the sequences exhibit at least about 43%-60%, preferably 60%-70%, more preferably 70%-85%, more preferably at least about 85%-90%, more preferably at least about 90%-95%, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules, or any percentage between the above-specified ranges, as determined using the methods above.
  • substantially homologous also refers to sequences showing complete identity to the specified DNA or polypeptide sequence.
  • DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al, supra.
  • the degree of sequence identity between two nucleic acid molecules affects the efficiency and strength of hybridization events between such molecules.
  • a partially identical nucleic acid sequence will at least partially inhibit a completely identical sequence from hybridizing to a target molecule. Inhibition of hybridization of the completely identical sequence can be assessed using hybridization assays that are well known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the like, see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Third Edition, (1989) Cold Spring Harbor, N. Y.). Such assays can be conducted using varying degrees of selectivity, for example, using conditions varying from low to high stringency.
  • the absence of non-specific binding can be assessed using a secondary probe that lacks even a partial degree of sequence identity (for example, a probe having less than about 30% sequence identity with the target molecule), such that, in the absence of non-specific binding events, the secondary probe will not hybridize to the target.
  • When utilizing a hybridization-based detection system, a nucleic acid probe is chosen that is complementary to a target nucleic acid sequence, and then by selection of appropriate conditions the probe and the target sequence "selectively hybridize," or bind, to each other to form a hybrid molecule.
  • a nucleic acid molecule that is capable of hybridizing selectively to a target sequence under "moderately stringent" conditions typically hybridizes under conditions that allow detection of a target nucleic acid sequence of at least about 10-14 nucleotides in length having at least approximately 70% sequence identity with the sequence of the selected nucleic acid probe.
  • Stringent hybridization conditions typically allow detection of target nucleic acid sequences of at least about 10-14 nucleotides in length having a sequence identity of greater than about 90-95% with the sequence of the selected nucleic acid probe.
  • Hybridization conditions useful for probe/target hybridization where the probe and target have a specific degree of sequence identity can be determined as is known in the art (see, for example, Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins, (1985) Oxford; Washington, DC; IRL Press).
  • With respect to stringency conditions for hybridization, it is well known in the art that numerous equivalent conditions can be employed to establish a particular stringency by varying, for example, the following factors: the length and nature of probe and target sequences, base composition of the various sequences, concentrations of salts and other hybridization solution components, the presence or absence of blocking agents in the hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol), hybridization reaction temperature and time parameters, as well as varying wash conditions.
  • a particular set of hybridization conditions is selected following standard methods in the art (see, for example, Sambrook, et al., supra).
  • transgenic plant cells comprising a recombinant nucleic acid construct that comprises a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
  • the invention also encompasses transgenic plants comprising the transgenic plant cells described herein.
  • a plant or plant cell used in methods of the invention contains a recombinant nucleic acid construct as described herein.
  • the plant or plant cells can be transformed by having the construct integrated into its genome, i.e., be stably transformed. Stably transformed cells typically retain the introduced nucleic acid sequence with each cell division.
  • the plant or plant cells can also be transformed by having the construct not integrated into its genome. Such transformed cells are called transiently transformed cells. Transiently transformed cells typically lose all or some portion of the introduced nucleic acid construct with each cell division such that the introduced nucleic acid cannot be detected in daughter cells after a sufficient number of cell divisions. Both transiently transformed and stably transformed transgenic plants and plant cells can be useful in the methods described herein.
  • transgenic plant cells used in methods described herein constitute part or all of a whole plant or explant. Such plants can be contacted with a candidate compound, for example, as seedlings in liquid medium, on solid medium, or hydroponically.
  • transgenic plant cells are grown in culture and contacted with a candidate compound.
  • Such plant cell cultures can be undifferentiated cells such as a callus culture, or can be cultures of a differentiated tissue or organ, e.g., an embryogenic cell culture or a root culture from a tissue or organ explant.
  • protoplasts are suitable for contacting with a candidate compound.
  • solid and/or liquid tissue culture techniques can be used. When using solid medium, transgenic plant cells can be placed directly onto the medium or can be placed onto a filter that is then placed in contact with the medium. When using liquid medium, transgenic plant cells can be placed onto a floatation device, e.g., a porous membrane that contacts the liquid medium.
  • Solid medium typically is made from liquid medium by adding agar.
  • a solid medium can be Murashige and Skoog (MS) medium containing agar and a suitable concentration of an auxin, e.g., 2,4-dichlorophenoxyacetic acid (2,4-D), and a suitable concentration of a cytokinin, e.g., kinetin.
  • transgenic plants are grown in a greenhouse or in a field and bred as desired for a particular purpose, e.g., to introduce a recombinant nucleic acid into other lines, to transfer a recombinant nucleic acid to other species or for further selection of other desirable traits.
  • transgenic plants can be propagated vegetatively for those species amenable to such techniques. Progeny includes descendants of a particular plant or plant line.
  • Progeny of an instant plant include seeds formed on F1, F2, F3, F4, F5, F6 and subsequent generation plants, or seeds formed on BC1, BC2, BC3, and subsequent generation plants, or seeds formed on F1BC1, F1BC2, F1BC3, and subsequent generation plants. Seeds produced by a transgenic plant can be grown and then selfed (or outcrossed and selfed) to obtain seeds homozygous for the nucleic acid construct.
  • Techniques for introducing exogenous nucleic acids into monocotyledonous and dicotyledonous plants are known in the art, and include, without limitation, Agrobacterium-mediated transformation, viral vector-mediated transformation, electroporation and particle gun transformation, e.g., U.S. Patents 5,538,880, 5,204,253, 6,329,571, 6,013,863 and Yamamoto et al., In Vitro Cellular and Development Biology - Plant 37(3):349-353 (2001). If a cell or tissue culture is used as the recipient tissue for transformation, plants can be regenerated from transformed cultures if desired, by techniques known to those skilled in the art.
  • a suitable group of plant species includes dicots, such as Arabidopsis, safflower, alfalfa, soybean, coffee, rapeseed, or sunflower. Also suitable are monocots such as Lemna, corn, wheat, rye, barley, oat, rice, millet, amaranth or sorghum. Suitable plants include vegetable crops or root crops such as lettuce, carrot, onion, broccoli, peas, sweet corn, popcorn, tomato, potato, beans (including kidney beans, lima beans, dry beans, green beans) and the like. Also suitable are fruit crops such as grape, strawberry, pineapple, melon (e.g., watermelon, cantaloupe), peach, pear, apple, cherry, orange, lemon, grapefruit, plum, mango, banana, and palm.
  • the methods described herein can be utilized with dicotyledonous plants belonging to the orders Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales,
  • Methods described herein can also be utilized with monocotyledonous plants belonging to the orders Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales, or with plants belonging to Gymnospermae, e.g., Pinales, Ginkgoales, Cycadales and Gnetales.
  • the invention has use over a broad range of plant species, including species from the genera Allium, Alseodaphne, Anacardium, Arachis, Asparagus, Atropa, Avena, Beilschmiedia, Brassica, Citrus, Citrullus, Capsicum, Catharanthus, Carthamus, Cocculus, Cocos, Coffea, Croton, Cucumis, Cucurbita, Daucus, Duguetia, Elaeis, Eschscholzia, Ficus, Fragaria, Glaucium, Glycine, Gossypium, Helianthus, Heterocallis, Hevea, Hordeum, Hyoscyamus, Lactuca, Landolphia, Linum, Litsea, Lolium, Lupinus, Lycopersicon, Malus, Manihot, Majorana, Medicago, Musa, Nicotiana, Olea, Oryza, Panicum, Pennisetum, Papaver, Part
  • Suitable plant species include green algae belonging to the group Viridiplantae, and the orders Volvocales, Coleochaetales, Ulvales, and Bryopsidales.
  • Suitable genera include Brachiomonas, Carteria, Cercidium, Chlainomonas, Chlamydomonas (especially Chlamydomonas reinhardii), Chloroceras, Chlorogonium, Chloromonas, Diplostauron, Gigantochloris, Gloeomonas, Heterochlamydomonas, Hyalobrachion, Hyalogonium, Lobomonas, Oltmannsiella, Parapolytoma, Peterflella, Phyllariochloris, Polytoma, Protococcus, Provasoliella, Pyramichlamys, Sphaerella, Sphaerellopsis, Spirogonium, Tetrablepharis, Tetratoma, Tussetia
  • composition of matter comprising 5-O-β-D-glucopyranosyl-8-methoxypsoralen, i.e., Compound 3 set forth in Figure 88.
  • compositions provided herein can contain therapeutically effective amounts of one or more of the compounds provided herein that are useful in the treatment or amelioration of one or more of the symptoms associated with a disease such as cancer, vitiligo, psoriasis, or nicotine addiction, and a pharmaceutically acceptable carrier.
  • Pharmaceutical carriers suitable for administration of the compounds provided herein include any such carriers known to those skilled in the art to be suitable for the particular mode of administration.
  • the compounds can be formulated as the sole pharmaceutically active ingredient in the composition or may be combined with other active ingredients.
  • compositions can contain one or more compounds provided herein.
  • the compounds can be formulated into suitable pharmaceutical preparations such as solutions, suspensions, tablets, dispersible tablets, pills, capsules, powders, sustained release formulations or elixirs, for oral administration or topical administration or in sterile solutions or suspensions for parenteral administration, as well as transdermal patch preparation and dry powder inhalers.
  • the compounds described above can be formulated into pharmaceutical compositions using techniques and procedures well known in the art (see, e.g., Ansel Introduction to Pharmaceutical Dosage Forms, Fourth Edition 1985, 126).
  • In the compositions, effective concentrations of one or more compounds or pharmaceutically acceptable derivatives thereof can be mixed with a suitable pharmaceutical carrier.
  • the compounds can be derivatized as the corresponding salts, esters, enol ethers or esters, acetals, ketals, orthoesters, hemiacetals, hemiketals, acids, bases, solvates, hydrates or prodrugs prior to formulation, as described above.
  • concentrations of the compounds in the compositions can be effective for delivery of an amount, upon administration, that treats or ameliorates one or more of the symptoms of a disease, e.g., psoriasis.
  • compositions can be formulated for single dosage administration.
  • the weight fraction of compound can be dissolved, suspended, dispersed or otherwise mixed in a selected carrier at an effective concentration such that the treated condition is relieved or one or more symptoms are ameliorated.
  • the active compound can be included in the pharmaceutically acceptable carrier in an amount sufficient to exert a therapeutically useful effect in the absence of undesirable side effects on the patient treated.
  • the therapeutically effective concentration can be determined empirically by testing the compounds in in vitro and in vivo systems, and then extrapolated therefrom for dosages for humans.
  • the concentration of active compound in the pharmaceutical composition can depend on absorption, inactivation and excretion rates of the active compound, the physicochemical characteristics of the compound, the dosage schedule, and amount administered as well as other factors known to those of skill in the art.
  • Pharmaceutical dosage unit forms can be prepared to provide from about 0.01 mg, 0.1 mg or 1 mg to about 500 mg, 1000 mg or 2000 mg of the active ingredient or a combination of essential ingredients per dosage unit form.
  • the active ingredient can be administered at once, or can be divided into a number of smaller doses to be administered at intervals of time. It is understood that the precise dosage and duration of treatment is a function of the disorder being treated and can be determined empirically using known testing protocols or by extrapolation from in vivo or in vitro test data. It is to be noted that concentrations and dosage values can also vary with the severity of the condition to be alleviated. It is to be further understood that for any particular subject, specific dosage regimens should be adjusted over time according to the individual need and the professional judgment of the person administering or supervising the administration of the compositions, and that the concentration ranges set forth herein are exemplary only and are not intended to limit the scope or practice of the claimed compositions.
  • methods for solubilizing compounds can be used. Such methods are known to those of skill in this art, and can include, but are not limited to, using cosolvents, such as dimethylsulfoxide (DMSO), using surfactants, such as TWEEN®, or dissolution in aqueous sodium bicarbonate. Derivatives of the compounds, such as prodrugs of the compounds, can also be used in formulating effective pharmaceutical compositions.
  • the resulting mixture can be a solution, suspension, emulsion or the like.
  • the form of the resulting mixture can depend upon a number of factors, including the intended mode of administration and the solubility of the compound in the selected carrier or vehicle.
  • the effective concentration can be sufficient for ameliorating the symptoms of the disease, disorder or condition treated and can be empirically determined.
  • the pharmaceutical compositions can be provided for administration to humans and animals in unit dosage forms, such as tablets, capsules, pills, powders, granules, sterile parenteral solutions or suspensions, and oral solutions or suspensions, and oil-water emulsions containing suitable quantities of the compounds or pharmaceutically acceptable derivatives thereof.
  • the pharmaceutically active compounds and derivatives thereof can be formulated and administered in unit-dosage forms or multiple-dosage forms.
  • Unit-dose forms as used herein refers to physically discrete units suitable for human or non-human animal subjects and packaged individually as is known in the art. Each unit-dose can contain a predetermined quantity of the therapeutically active compound sufficient to produce the desired therapeutic effect, in association with the required pharmaceutical carrier, vehicle or diluent.
  • unit-dose forms can include, without limitation, ampoules and syringes and individually packaged tablets or capsules.
  • Unit-dose forms can be administered in fractions or multiples thereof.
  • a multiple-dose form is a plurality of identical unit-dosage forms packaged in a single container to be administered in segregated unit-dose form.
  • Examples of multiple-dose forms include vials, bottles of tablets or capsules or bottles of pints or gallons.
  • multiple dose form is a multiple of unit-doses which are not segregated in packaging.
  • Liquid pharmaceutically administrable compositions can, for example, be prepared by dissolving, dispersing, or otherwise mixing an active compound as defined above and optional pharmaceutical adjuvants in a carrier, such as, for example, water, saline, aqueous dextrose, glycerol, glycols, ethanol, and the like, to thereby form a solution or suspension.
  • the pharmaceutical composition to be administered can also contain minor amounts of nontoxic auxiliary substances such as wetting agents, emulsifying agents, solubilizing agents, pH buffering agents and the like, for example, acetate, sodium citrate, cyclodextrine derivatives, sorbitan monolaurate, triethanolamine sodium acetate, triethanolamine oleate, and other such agents.
  • compositions containing active ingredient in the range of 0.001% to 100% with the balance made up from non-toxic carrier may be prepared. Methods for preparation of these compositions are known to those skilled in the art.
  • the contemplated compositions can contain 0.001%-100% active ingredient, e.g., 0.001%-90%, 0.1%-95%, 0.5%-95%, 0.5%-30%, 10%-25%, 10%-99%, 50%-85%, or 80%-100%.
  • the methods include administering one or more of the compounds described herein, or a pharmaceutically acceptable salt or derivative thereof, or a composition comprising the same, to a mammal, e.g., a human, cat, dog, horse, pig, cow, sheep, mouse, rat, or monkey.
  • the compound is 5-O-β-D-glucopyranosyl-8-methoxypsoralen.
  • a method for treating or ameliorating a disease such as cancer, vitiligo, psoriasis, or nicotine addiction can include administering to a mammal a compound having the chemical formula of Compound 3 set forth in Figure 88, or a pharmaceutically acceptable salt or derivative thereof.
  • a compound or a pharmaceutically acceptable salt or derivative thereof can be administered by methods appropriate for treating, ameliorating, or preventing a symptom or a disorder related to a disease (e.g., cancer, psoriasis, vitiligo, or nicotine addiction).
  • Methods of administration can include, without limitation, oral, intravenous, and topical administration.
  • T-DNA binary vector constructs were made using standard molecular biology techniques. A set of constructs were made that contained a P450 coding sequence operably linked to a CaMV 35S promoter. Each of these constructs also contained a marker gene conferring resistance to the herbicide Finale®.
  • Each construct was introduced into Arabidopsis ecotype Wassilewskija (WS) by the floral dip method essentially as described in Bechtold, N. et al., C.R. Acad. Sci. Paris, 316:1194-1199 (1993). The presence of each P450 construct was verified by PCR. At least two independent events from each transformation were selected for further study; these events were referred to as Arabidopsis thaliana ME lines.
  • the P450 polypeptide expressed in a given ME line is shown in Table 2 below. T1 seeds were germinated and allowed to self-pollinate. T2 seeds were collected and a portion were germinated, allowed to self-pollinate, and T3 seeds were collected. T2 and/or T3 seeds were used in the experiments described below.
  • Each seedling set was transferred to a 1.7 ml Eppendorf tube and the MS media was saved.
  • the seedlings were homogenized with a plastic fitted pestle or by bead beating.
  • 1 ml methanol (MeOH) was added and the tube vortexed.
  • the mixture was incubated at 37°C for 1 hr in a shaker at 300 rpm.
  • the solids were pelleted by centrifugation at 14,000 rpm for 3 min.
  • the supernatant was transferred to 13x75 borosilicate glass test tubes.
  • the MS media from the appropriate well was added to the supernatant.
  • 1 ml hexane was added to the glass tubes containing the MeOH/MS mixture.
  • each seed line producing a positive result in the primary screen was subjected to a secondary screen, an "event confirmation screen.”
  • in the event confirmation screen, seeds from all available separate transformation events were separately assayed for the ability to modify the target compound. 25 seeds from a single transformation event were used per microtiter well in a final volume of 1.25 ml of 1/2X MS.
  • the ME lines from the primary screen were analyzed along with a wild type control and a no plant control. Seeds were incubated for one week at 22°C, long day, on a shaker at 140 rpm. 12.5 µl of a 100 mM stock of the target substrate (identified as modified in Table 4, respectively) was added to each well (1 mM final) and the seedlings incubated one additional day.
  • Each seedling was homogenized and extracted as described above.
  • the resulting samples were analyzed by LC/MS using standard procedures. Typically, the samples were analyzed using a 10 min water/acetonitrile gradient on a reverse phase column. Table 4 provides the results of the analysis.
  • the hydroxylated product of 8-methoxypsoralen (8-MOP) was scaled up in yeast.
  • 1 L AHC+Trp/Galactose medium was inoculated with WAT11 bearing CYP82C2 that had been grown to saturation in AHC+Trp/Dextrose.
  • the cultures were incubated at 28°C with shaking for 1 day.
  • 1 ml of 500 mM 8-MOP prepared in DMSO was added to the flask, for a final concentration of 500 µM.
  • the culture was incubated for another 3 days.
  • Yeast culture medium was transferred to a round bottom flask. The medium was removed by rotoevaporation, and the resulting residue was resuspended in 20 ml MeOH.
  • the MeOH suspension was transferred to a 50 ml centrifuge tube and the sample was centrifuged for 1 hr, 1,800 x g. The supernatant was transferred to a round bottom flask and removed by rotoevaporation, yielding 6.5 g of crude dry extract. Solid phase extraction of this material on a C18 Extract-Clean column (Alltech) using a block gradient of H2O/MeOH resulted in 150 mg in the 25% H2O/MeOH fraction.
  • Example 5 Identification of glycosylated 8-methoxypsoralen
  • Seedlings from each 24 well microtiter plate were transferred into a 145 ml mortar, ground thoroughly with a pestle, and transferred into a 50 ml centrifuge tube, rinsing with MeOH for final volume of 35 ml. Tubes containing crushed seedlings were placed in a sonication bath (Branson 2510) for 30 min at room temperature, then chilled in ice water for 5 min; this cycle was repeated six times. Tubes were then shaken at ambient temperature overnight.
  • the extraction mixture was vacuum filtered through a coarse sintered glass funnel and filter paper (Whatman); the filtrate was reduced to ~2 ml by rotoevaporation, resuspended with 15% acetonitrile, frozen, and lyophilized to dryness.
  • the dry extract was dissolved in MeOH for semi-preparative HPLC (Waters 600), using the following chromatographic conditions: a C18 5 µm 150 x 10 mm column (Alltech Alltima), solvent A: H2O + 0.1% formic acid, solvent B: MeOH + 0.1% formic acid; flow rate of 1.5 ml/min; monitoring at 308 nm. Three major peaks eluted using a gradient of 20 - 80% B in 30 min.
  • a subject sequence is considered to be a functional homolog of a query sequence if the subject and query sequences encode proteins having a similar function and/or activity.
  • a process known as Reciprocal BLAST was used to identify homologous and variant sequences by looking for top hits in bidirectional BLAST searches. Before starting a reciprocal BLAST process, a query polypeptide and polypeptides having a sequence percent identity of 80% or greater to the query polypeptide were designated as a cluster.
  • A query polypeptide sequence, "polypeptide A," from species S_A was then BLASTed against all protein sequences from species S_B in public and proprietary databases, and the top hits were determined using an E-value cutoff of 10⁻⁵ and an identity cutoff of 35%. The process was repeated using polypeptide A as a query sequence in BLASTs against all plant species.
  • top hits from all species other than S_A were BLASTed against all protein sequences from species S_A in the same databases. Any top hit from the first round of the BLAST process that returned polypeptide A as its best hit in the second round was identified as a potential functional homolog. Any top hit from the first round that returned a polypeptide from the cluster as its best hit was also considered a functional homolog. Any top hit that did not return polypeptide A or a polypeptide from the cluster as its best hit was not considered a functional homolog. To generate a consensus sequence, the query peptide sequence and the sequences of its functional homologs were aligned using the BLASTP program with option "-m 1," and the conserved amino acids based on the alignment were extracted using a proprietary program.
  • Redundant sequences include transcriptional or RNA processing variants, such as clones having different UTR lengths or having one or more unspliced introns and/or deleted exons.
  • Defective clones include 5', 3' and/or internally truncated, frame-shifted (e.g. by insertion or deletion), or chimeric clones.
  • Alignment Tables (FIG. 4-87) were then prepared with the remaining functional homolog sequences. The query sequence is identified in each Table as a "Lead-cDNA.” Boxed residues represent identical or conserved amino acids.
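The reciprocal BLAST filtering described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the program used in the application: it assumes the BLAST searches have already been run and summarized, and the names `Hit`, `reciprocal_homologs`, `forward_hits`, and `reverse_best_hit` are hypothetical. Whether the 10⁻⁵ and 35% cutoffs are inclusive is not stated in the text, so inclusive cutoffs are assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One first-round top hit for polypeptide A in another species (hypothetical record)."""
    subject_id: str
    evalue: float
    percent_identity: float


def reciprocal_homologs(query_id, cluster_ids, forward_hits, reverse_best_hit,
                        max_evalue=1e-5, min_identity=35.0):
    """Return first-round hits that pass the cutoffs and point back to the query or its cluster.

    reverse_best_hit maps each first-round subject to its best second-round hit
    in the query species (S_A). Inclusive cutoffs are an assumption.
    """
    accepted = set(cluster_ids) | {query_id}
    homologs = []
    for hit in forward_hits:
        # First round: apply the E-value and percent-identity cutoffs.
        if hit.evalue > max_evalue or hit.percent_identity < min_identity:
            continue
        # Second round: keep the hit only if its best hit in S_A is
        # polypeptide A itself or a member of polypeptide A's cluster.
        if reverse_best_hit.get(hit.subject_id) in accepted:
            homologs.append(hit.subject_id)
    return homologs


if __name__ == "__main__":
    # Made-up identifiers and scores, for illustration only.
    hits = [
        Hit("B1", 1e-50, 62.0),  # passes cutoffs, returns A
        Hit("B2", 1e-3, 80.0),   # fails the E-value cutoff
        Hit("B3", 1e-20, 30.0),  # fails the identity cutoff
        Hit("B4", 1e-30, 55.0),  # passes cutoffs, returns a cluster member
        Hit("B5", 1e-40, 70.0),  # passes cutoffs, returns an unrelated protein
    ]
    reverse = {"B1": "A", "B4": "A_like", "B5": "other"}
    print(reciprocal_homologs("A", {"A_like"}, hits, reverse))
```

Keeping the first-round cutoffs and the second-round best-hit check as separate steps mirrors the two rounds of the procedure, so each criterion can be adjusted independently.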

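The substrate additions in the seedling and yeast feeding steps described earlier (12.5 µl of a 100 mM stock into 1.25 ml for 1 mM final; 1 ml of a 500 mM stock into 1 L for 500 µM final) follow the usual C1·V1 = C2·V2 dilution relation. The sketch below checks that arithmetic; the function names are hypothetical, and the volume of stock added is neglected in the final volume, which is an assumption that appears consistent with the quoted figures.

```python
def stock_volume_ul(stock_mM, final_mM, final_volume_ml):
    """Volume of stock (µl) needed to reach final_mM in final_volume_ml.

    Uses C1*V1 = C2*V2 and neglects the added stock volume (assumption).
    """
    return final_mM * final_volume_ml * 1000.0 / stock_mM


def final_concentration_uM(stock_mM, added_ml, culture_ml):
    """Final concentration (µM) after adding added_ml of stock to culture_ml.

    The added volume is again neglected in the total volume (assumption).
    """
    return stock_mM * added_ml / culture_ml * 1000.0


if __name__ == "__main__":
    # Seedling assay: 100 mM stock, 1 mM final, 1.25 ml well volume.
    print(stock_volume_ul(100.0, 1.0, 1.25), "µl of stock per well")
    # Yeast scale-up: 1 ml of 500 mM stock into a 1 L (1000 ml) culture.
    print(final_concentration_uM(500.0, 1.0, 1000.0), "µM final")
```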
Abstract

Methods and materials for modifying candidate compounds utilizing transgenic plant cells capable of expressing a P450 or related polypeptide are disclosed. Materials and methods of screening for a substrate of a P450 and for screening a collection of polypeptides for a P450 capable of modifying a compound are also disclosed.

Description

P450 SUBSTRATES AND METHODS RELATED THERETO
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to U.S. Provisional Application Ser. No. 60/691,956, filed June 17, 2005, incorporated by reference in its entirety herein.
I. INTRODUCTION
A. Field of the Invention
The present invention relates to nucleic acid constructs that encode polypeptides that function as cytochrome P450 enzymes (P450s), transgenic organisms including the same, and methods for modifying chemical structures of compounds using the same. More particularly, the invention relates to in vitro, in vivo, and whole organism procedures that result in oxidation, reduction, dealkylation, or epoxidation of a candidate compound. The invention also relates to down-stream processing, e.g., glycosylation, methylation or acetylation, of a modified compound by endogenous processes in whole organisms.
B. Sequence Listing Submitted on Compact Disc Attached
This application also includes a compact disc (Disc 1 of 1, submitted in triplicate with identical files) that contains a sequence listing. The compact disc was created June 17, 2005 and contains one file, entitled 50283674.txt, which is 1,993 kilobytes in size. The file can be accessed using Microsoft Word on a computer that uses Windows OS. The entire contents of the sequence listing are herein incorporated by reference in their entirety and are considered to be part of the specification.
C. Background of the Invention
Cytochrome P450s are the principal catalysts involved in the metabolism of drugs and other xenobiotics in humans. Much attention has focused on the role of human P450 enzymes in the metabolism and toxicity of pharmaceuticals. While the pharmaceutical industry is dedicating considerable resources to predicting how human P450s will interact with drugs in vivo, little effort is expended on using P450s in the identification or production of pharmacologically active compounds.
The relatively poor stability of cytochrome P450s and the costs of product separation limit the application of P450s to in vitro synthesis of chemicals. Further, the heme iron of an isolated P450 requires a continuous electron supply to complete the reaction. Currently, the use of P450 enzymes in industrial processes is restricted to microbial fermentation or chemical syntheses, including biotransformation of steroid hormones (U.S. Patent No. 4,353,985; Duport et al., Nat Biotechnol 16:186-9 (1998)), formation of dicarboxylic acids from alkanes (Picataggio et al., Biotechnology 10:894-8 (1992)) and hydroxylation of aromatic synthons (U.S. Patent No. 5,928,912; Dingler et al., Pestic Sci 46:33-5 (1996); Bezalel et al., Appl Environ Microbiol 62:2247-53 (1996)).
D. Summary of the Invention
An aspect of the present invention provides methods of screening for a substrate of a P450, which method comprises contacting a pharmacologically active candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from SEQ ID NO: 137-533; and determining whether the candidate compound is modified after the contact, whereby detection of a modification indicates that the candidate compound is a substrate. In certain embodiments, the P450 coding sequence is a heterologous sequence. The plant cells, e.g., root cells, leaf cells or stem cells, can be part of a whole plant or from an explant. In some embodiments, the plant is a seedling. In some embodiments, the plant is grown under hydroponic conditions. The candidate compound may be contacted with the plant cells for a period of time sufficient for the cytochrome P450 to detectably modify the candidate compound, e.g., about 1 hour to about 28 days. 
The pharmacological activity of the candidate compound can be Alzheimer's disease treatment, analgesic activity, anesthetic activity, anti-Addison's disease activity, anti-HIV activity, anti-infective activity, anti-inflammatory activity, antianginal activity, antiangiogenic activity, antianxiety activity, antiarrhythmic activity, antiarthritic activity, antiatherosclerotic activity, antibacterial activity, antibiotic activity, anticancer activity, anticholesterol activity, anticholinergic activity, anticoagulant activity, anticonvulsant activity, antidepressant activity, antidiabetic activity, antidiuretic activity, antiedemic activity, antifungal activity, antigout activity, antiglaucoma activity, antihemorrhagic activity, antihistamine activity, antihypertensive activity, antimalarial activity, antimicrobial activity, antimigraine activity, antimotion sickness activity, antineoplastic activity, antineuralgic activity, antiobesity activity, antioxidant activity, antiparasitic activity, antiparkinsonian activity, antipsoriasis activity, antipsychotic activity, antipyretic activity, antirheumatic activity, antiseizure activity, antithrombotic activity, antitussive activity, antiulcer activity, antiviral activity, antivitiligo activity, anxiolytic activity, appetite suppressant activity, asthma treatment, bronchodilator activity, cardiac depressant activity, cardiotonic activity, cerebral ischemia treatment, CNS depressant or stimulant activity, cognition enhancing activity, contraceptive activity, dermatitis treatment, diuretic activity, emetic activity, dopamine agonist activity, expectorant activity, gastrointestinal treatment, hepatoprotective activity, immunostimulant activity, immunosuppressant activity, antiimpotence activity, irritable bowel syndrome treatment, ischemia treatment, activity in metabolic and enzyme disorders, multiple sclerosis treatment, muscle relaxant activity, neuromuscular blocker activity, neuroprotective activity, opioid activity, osteoporosis treatment, purgative activity, radioprotective activity, respiratory stimulant activity, restenosis treatment, rheumatoid arthritis treatment, schizophrenia treatment, sedative/hypnotic activity, sepsis treatment, smoking deterrent activity, stroke treatment, thrombocytopenia treatment, thrombolytic therapy activity, vaccine activity, or vasodilator activity. Modification of the candidate compound can be detected using a method known in the art. Preferably, the determining step comprises mass spectrometry analysis or NMR analysis.
Another aspect of the invention provides methods of screening a collection of P450s for a P450 capable of modifying a compound having pharmacological activity. Such methods comprise: (a) contacting the compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 of the collection; (b) determining whether the compound is modified after the contact, whereby detection of a modification indicates that the P450 is capable of modifying the compound; and (c) repeating steps (a) and (b) for each P450 of the collection, wherein at least one P450 of the collection is identified as capable of modifying the compound.
Yet another aspect of the invention provides methods of modifying a candidate compound, which methods comprise contacting a candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the candidate compound is modified. The unmodified candidate compound can be aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemisinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron, dextromethorphan, digitoxin, digoxin, doxepin hydrochloride, emetine dihydrochloride hydrate, emodin, enalapril, eserine, esomeprazole potassium, estradiol, fentanyl citrate, formoterol, furosemide, gabapentin, glimepiride, D-(+)-glucosamine hydrochloride, glycyrrhizin, gossypol, griseofulvin, hesperidin, homatropine hydrobromide, (-)-hydrastine, hydrocortisone, ibuprofen, idazoxan hydrochloride, ipratropium bromide, ivermectin, ketoconazole, ketoprofen, lanatoside C, lansoprazole, lapachol, levofloxacin hydrochloride, lidocaine, (-)-lobeline hydrochloride, lomerizine hydrochloride, mefenamic acid, 8-methoxypsoralen, miconazole, mitoxantrone hydrochloride, morphine, mycophenolic acid, nocodazole, nordihydroguaiaretic acid, (S,R)-noscapine, oleanolic acid, omeprazole, pantoprazole, phentolamine mesylate, picrotoxin, pilocarpine hydrochloride, D-pinitol, piperine, piperlongumine, pirenzepine dihydrochloride, podophyllotoxin, prasterone, pravastatin sodium salt, prednisolone, protoveratrine A, pyridostigmine bromide, quercetin dihydrate, quinidine (cinchonidine), quinine, rebamipide, rescinnamine, reserpine, resveratrol, retinoic acid, risperidone, rofecoxib, rotenone, rutin trihydrate, salicin, salicylic acid, santonin, (-)-scopolamine hydrobromide, sertraline hydrochloride, silybin, simvastatin, (-)-sparteine, streptozocin, strophanthidin, tetracycline, (+)-tetrandrine, thebaine, theobromine, theophylline, thymol, tobramycin, triamcinolone acetonide, tubocurarine chloride, ursolic acid, vincamine, warfarin pestanal, or yohimbine hydrochloride. The plant cells, e.g., root cells, leaf cells or stem cells, can be part of a whole plant or from an explant. In some embodiments, the plant is a seedling. In some embodiments, the plant is grown under hydroponic conditions. In some embodiments, the P450 coding sequence is a heterologous sequence. The candidate compound may be contacted with the plant cells for a period of time from about 1 hour to about 28 days.
In one aspect, methods are provided for making a modified candidate compound. Such methods comprise contacting a candidate compound with transgenic plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the P450 modifies the candidate compound, and recovering the modified candidate compound from the plant cells. The candidate compound can be one of the compounds described herein. The plant cells may be part of a whole plant, e.g., a seedling, or from an explant or may be grown in culture, e.g. in suspension culture or tissue culture. In a preferred embodiment, the P450 coding sequence is a heterologous sequence. The candidate compound may be contacted with the plant cells for a period from about 1 hour to about 28 days. In some embodiments, the method further comprises the step of characterizing the chemical structure of the modified candidate compound.
The invention also features transgenic plant cells comprising a recombinant nucleic acid construct. The construct comprises a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533. The plant cells may be part of a whole plant, e.g., a seedling, or may be grown in culture, e.g. in suspension culture or tissue culture. The plant can be a dicotyledonous plant such as Arabidopsis thaliana. Alternatively, the plant may be a monocotyledonous plant, such as Lemna minor, or an alga, such as Chlamydomonas reinhardtii. In some embodiments, the regulatory region comprises a constitutive promoter, an inducible promoter, or a tissue-specific promoter. In some embodiments, the P450 coding sequence is a heterologous sequence. The transgenic plant cells can be effective for modifying a candidate compound when the cells are contacted with the compound.
This document provides a composition of matter comprising 5-O-β-D-glucopyranosyl-8-methoxypsoralen, i.e., a composition of matter having the formula of Compound 3 as set forth in Figure 88. In some embodiments, a composition of matter described herein can be combined with a pharmaceutically acceptable carrier to form a pharmaceutical composition. A pharmaceutical composition comprising a composition of matter described herein can be used to treat, ameliorate, or prevent a symptom or disorder associated with a disease (e.g., cancer, psoriasis, vitiligo, or nicotine addiction).
It will be appreciated by one of skill in the art that the embodiments described herein may be combined to generate other embodiments not expressly recited, and that such other embodiments are considered to be part of the present invention. The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
II. BRIEF DESCRIPTION OF DRAWINGS
Figure 1 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP82C2 (Panels A and B), incubated in the presence of 8-methoxypsoralen, producing a hydroxylated product and a glycosylated product not found in wild-type (WS) seedlings (Panels C and D) incubated with this compound. Panels A and C display single ion chromatograms of m/z 233.5, corresponding to hydroxylated 8-methoxypsoralen. Panels B and D display single ion chromatograms of m/z 395.5, corresponding to a glycosylated form of 8-methoxypsoralen.
Figure 2 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP51A2 (Panel A), incubated in the presence of piperine, producing a glycosylated product not found in wild-type (WS) seedlings (Panel B) incubated with this compound. Panels A and B display single ion chromatograms of m/z 464.3, corresponding to glycosylated piperine.
Figure 3 is a mass spectroscopy analysis of Arabidopsis seedlings overexpressing CYP85A2 (Panel A), incubated in the presence of curcumin, producing new products not found in wild-type (WS) seedlings (Panel B) incubated with this compound. Panels A and B display single ion chromatograms of m/z 401.5, corresponding to dihydroxylated curcumin.
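As a rough plausibility check on the m/z values quoted for Figures 1 to 3, the expected ions can be estimated from the parent compounds' molecular formulas. The sketch below is illustrative only and rests on several assumptions not stated in the text: positive-ion detection as [M+H]+, average atomic masses, hydroxylation as a net gain of one oxygen, glycosylation as addition of a hexose residue (C6H10O5) to a hydroxylated intermediate, and a tolerance of 0.5 m/z units. All names in the code are hypothetical.

```python
# Average atomic masses (assumption: low-resolution MS, average masses suffice).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
PROTON = 1.007  # approximate mass added on protonation, [M+H]+ (assumed ion type)

# Molecular formulas of the parent compounds.
PARENTS = {
    "8-methoxypsoralen": {"C": 12, "H": 8, "O": 4},
    "piperine": {"C": 17, "H": 19, "N": 1, "O": 3},
    "curcumin": {"C": 21, "H": 20, "O": 6},
}
HYDROXYL = {"O": 1}                    # net change for one hydroxylation
HEXOSE = {"C": 6, "H": 10, "O": 5}     # net change for one hexose residue


def mass(composition):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(AVERAGE_MASS[element] * count for element, count in composition.items())


def expected_mz(parent, n_hydroxyl=0, n_hexose=0):
    """Estimated [M+H]+ for a parent carrying the given modifications."""
    neutral = mass(PARENTS[parent]) + n_hydroxyl * mass(HYDROXYL) + n_hexose * mass(HEXOSE)
    return neutral + PROTON


if __name__ == "__main__":
    # (parent, hydroxylations, hexoses, m/z quoted in the figure descriptions)
    cases = [
        ("8-methoxypsoralen", 1, 0, 233.5),
        ("8-methoxypsoralen", 1, 1, 395.5),
        ("piperine", 1, 1, 464.3),
        ("curcumin", 2, 0, 401.5),
    ]
    for parent, n_oh, n_hex, quoted in cases:
        estimate = expected_mz(parent, n_oh, n_hex)
        print(f"{parent}: estimated {estimate:.1f}, quoted {quoted}")
```

Under these assumptions each estimate falls within 0.5 m/z units of the quoted value, and the 162-unit spacing between the two 8-methoxypsoralen ions matches the addition of one hexose.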
Figures 4 to 87 provide an amino acid sequence alignment of a given P450 (identified by cDNA_ID) with its functional homologs, identified as described in Example 3. A functional homolog is expected to have a similar function and/or activity as the P450. Each figure also provides a consensus sequence. Boxed residues represent identical or conserved amino acids.
FIG. 4 provides an amino acid sequence alignment of SEQ ID NO: 137 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 273-277.
FIG. 5 provides an amino acid sequence alignment of SEQ ID NO: 138 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 278-280.
FIG. 6 provides an amino acid sequence alignment of SEQ ID NO: 139 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 281-283.
FIG. 7 provides an amino acid sequence alignment of SEQ ID NO: 140 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 284-286.
FIG. 8 provides an amino acid sequence alignment of SEQ ID NO: 141 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 287-293.
FIG. 9 provides an amino acid sequence alignment of SEQ ID NO: 144 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 294-296.
FIG. 10 provides an amino acid sequence alignment of SEQ ID NO: 146 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 297-310.
FIG. 11 provides an amino acid sequence alignment of SEQ ID NO: 154 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 311.
FIG. 12 provides an amino acid sequence alignment of SEQ ID NO: 155 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 312.
FIG. 13 provides an amino acid sequence alignment of SEQ ID NO: 156 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 313-316.
FIG. 14 provides an amino acid sequence alignment of SEQ ID NO: 157 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 317.
FIG. 15 provides an amino acid sequence alignment of SEQ ID NO: 158 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 318-321.
FIG. 16 provides an amino acid sequence alignment of SEQ ID NO: 160 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 322-326.
FIG. 17 provides an amino acid sequence alignment of SEQ ID NO: 161 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327-331.
FIG. 18 provides an amino acid sequence alignment of SEQ ID NO: 163 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327, 329, and 331-334.
FIG. 19 provides an amino acid sequence alignment of SEQ ID NO: 164 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327, 328, and 330-334.
FIG. 20 provides an amino acid sequence alignment of SEQ ID NO: 165 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 327.
FIG. 21 provides an amino acid sequence alignment of SEQ ID NO: 166 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 327-331 and 334.
FIG. 22 provides an amino acid sequence alignment of SEQ ID NO: 167 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 335-340.
FIG. 23 provides an amino acid sequence alignment of SEQ ID NO: 168 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 335-340.
FIG. 24 provides an amino acid sequence alignment of SEQ ID NO: 169 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 341 and 342.
FIG. 25 provides an amino acid sequence alignment of SEQ ID NO: 170 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 343 and 344.
FIG. 26 provides an amino acid sequence alignment of SEQ ID NO: 171 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 344 and 345.
FIG. 27 provides an amino acid sequence alignment of SEQ ID NO: 172 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-349.
FIG. 28 provides an amino acid sequence alignment of SEQ ID NO: 173 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-352.
FIG. 29 provides an amino acid sequence alignment of SEQ ID NO: 174 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 346-349 and 353-355.
FIG. 30 provides an amino acid sequence alignment of SEQ ID NO: 175 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 274-277 and 356-359.
FIG. 31 provides an amino acid sequence alignment of SEQ ID NO: 176 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 360-368.
FIG. 32 provides an amino acid sequence alignment of SEQ ID NO: 180 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 285, 286 and 369.
FIG. 33 provides an amino acid sequence alignment of SEQ ID NO: 181 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 284, 286 and 369.
FIG. 34 provides an amino acid sequence alignment of SEQ ID NO: 184 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 370 and 371.
FIG. 35 provides an amino acid sequence alignment of SEQ ID NO: 185 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 372.
FIG. 36 provides an amino acid sequence alignment of SEQ ID NO: 186 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371 and 373.
FIG. 37 provides an amino acid sequence alignment of SEQ ID NO: 187 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371 and 373-375.
FIG. 38 provides an amino acid sequence alignment of SEQ ID NO: 188 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371, 373, 376 and 377.
FIG. 39 provides an amino acid sequence alignment of SEQ ID NO: 189 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 371, 373, 378 and 379.
FIG. 40 provides an amino acid sequence alignment of SEQ ID NO: 190 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 380-384.
FIG. 41 provides an amino acid sequence alignment of SEQ ID NO: 191 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 382, 383 and 385.
FIG. 42 provides an amino acid sequence alignment of SEQ ID NO: 192 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 386 and 387.
FIG. 43 provides an amino acid sequence alignment of SEQ ID NO: 193 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 386 and 388.
FIG. 44 provides an amino acid sequence alignment of SEQ ID NO: 194 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 389.
FIG. 45 provides an amino acid sequence alignment of SEQ ID NO: 195 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 390.
FIG. 46 provides an amino acid sequence alignment of SEQ ID NO: 196 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 391.
FIG. 47 provides an amino acid sequence alignment of SEQ ID NO: 201 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 392-406.
FIG. 48 provides an amino acid sequence alignment of SEQ ID NO: 202 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 407-417.
FIG. 49 provides an amino acid sequence alignment of SEQ ID NO: 203 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 418-453.
FIG. 50 provides an amino acid sequence alignment of SEQ ID NO: 204 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 456 and 457.
FIG. 51 provides an amino acid sequence alignment of SEQ ID NO: 206 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458-468.
FIG. 52 provides an amino acid sequence alignment of SEQ ID NO: 207 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458-460 and 467-469.
FIG. 53 provides an amino acid sequence alignment of SEQ ID NO: 208 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 454, 458, 460-463, 469 and 470.
FIG. 54 provides an amino acid sequence alignment of SEQ ID NO: 209 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 455, 458, 459, 461-463 and 469.
FIG. 55 provides an amino acid sequence alignment of SEQ ID NO: 210 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 455, 458-460 and 469.
FIG. 56 provides an amino acid sequence alignment of SEQ ID NO: 212 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 471 and 472.
FIG. 57 provides an amino acid sequence alignment of SEQ ID NO: 213 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 473 and 474.
FIG. 58 provides an amino acid sequence alignment of SEQ ID NO: 215 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 475.
FIG. 59 provides an amino acid sequence alignment of SEQ ID NO: 216 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 476-478.
FIG. 60 provides an amino acid sequence alignment of SEQ ID NO: 217 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 471 and 472.
FIG. 61 provides an amino acid sequence alignment of SEQ ID NO: 218 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 479.
FIG. 62 provides an amino acid sequence alignment of SEQ ID NO: 220 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 480.
FIG. 63 provides an amino acid sequence alignment of SEQ ID NO: 221 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 481.
FIG. 64 provides an amino acid sequence alignment of SEQ ID NO: 222 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 482.
FIG. 65 provides an amino acid sequence alignment of SEQ ID NO: 223 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 482.
FIG. 66 provides an amino acid sequence alignment of SEQ ID NO: 225 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 483.
FIG. 67 provides an amino acid sequence alignment of SEQ ID NO: 228 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 484 and 485.
FIG. 68 provides an amino acid sequence alignment of SEQ ID NO: 229 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 486.
FIG. 69 provides an amino acid sequence alignment of SEQ ID NO: 231 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487-490.
FIG. 70 provides an amino acid sequence alignment of SEQ ID NO: 232 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 488-491.
FIG. 71 provides an amino acid sequence alignment of SEQ ID NO: 233 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487 and 489-491.
FIG. 72 provides an amino acid sequence alignment of SEQ ID NO: 234 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 487, 488, 490 and 491.
FIG. 73 provides an amino acid sequence alignment of SEQ ID NO: 237 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 492.
FIG. 74 provides an amino acid sequence alignment of SEQ ID NO: 238 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 493 and 494.
FIG. 75 provides an amino acid sequence alignment of SEQ ID NO: 244 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 495-500.
FIG. 76 provides an amino acid sequence alignment of SEQ ID NO: 245 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 496-503.
FIG. 77 provides an amino acid sequence alignment of SEQ ID NO: 249 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 504-509.
FIG. 78 provides an amino acid sequence alignment of SEQ ID NO: 250 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 510-513.
FIG. 79 provides an amino acid sequence alignment of SEQ ID NO: 251 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 512-514.
FIG. 80 provides an amino acid sequence alignment of SEQ ID NO: 252 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 513.
FIG. 81 provides an amino acid sequence alignment of SEQ ID NO: 253 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 515 and 516.
FIG. 82 provides an amino acid sequence alignment of SEQ ID NO: 256 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 517.
FIG. 83 provides an amino acid sequence alignment of SEQ ID NO: 258 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 519.
FIG. 84 provides an amino acid sequence alignment of SEQ ID NO: 267 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 520 and 521.
FIG. 85 provides an amino acid sequence alignment of SEQ ID NO: 268 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NO: 522.
FIG. 86 provides an amino acid sequence alignment of SEQ ID NO: 269 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 523-525.
FIG. 87 provides an amino acid sequence alignment of SEQ ID NO: 272 with its functional homologs. The functional homologs include proteins having an amino acid sequence of SEQ ID NOs: 526-533.
FIG. 88 provides the chemical formula of 8-methoxypsoralen (compound 1), 5-hydroxy-8-methoxypsoralen (compound 2), and 5-O-β-D-glucopyranosyl-8-methoxypsoralen (compound 3).
III. BRIEF DESCRIPTION OF SEQUENCE LISTING
Polypeptide and nucleic acid sequences useful in the instant invention include those described in the Sequence Listing and in Figures 4 through 87.
The following is a brief description of each sequence presented in the Sequence Listing.
SEQ ID NO: 1 is the DNA sequence of cDNA ID no. 12324097 from Arabidopsis thaliana. SEQ ID NO: 2 is the DNA sequence of cDNA ID no. 12489505 from Arabidopsis thaliana. SEQ ID NO: 3 is the DNA sequence of cDNA ID no. 12559461 from Arabidopsis thaliana. SEQ ID NO: 4 is the DNA sequence of cDNA ID no. 12672824 from Arabidopsis thaliana. SEQ ID NO: 5 is the DNA sequence of cDNA ID no. 7106710 from Arabidopsis thaliana. SEQ ID NO: 6 is the DNA sequence of cDNA ID no. 12723568 from Arabidopsis thaliana. SEQ ID NO: 7 is the DNA sequence of cDNA ID no. 12591866 from Arabidopsis thaliana. SEQ ID NO: 8 is the DNA sequence of cDNA ID no. 12652339 from Arabidopsis thaliana. SEQ ID NO: 9 is the DNA sequence of cDNA ID no. 12669291 from Arabidopsis thaliana. SEQ ID NO: 10 is the DNA sequence of cDNA ID no. 12605081 from Arabidopsis thaliana.
SEQ ID NO: 11 is the DNA sequence of cDNA ID no. 12722269 from Arabidopsis thaliana. SEQ ID NO: 12 is the DNA sequence of cDNA ID no. 4806743 from Arabidopsis thaliana. SEQ ID NO: 13 is the DNA sequence of cDNA ID no. 12716191 from Arabidopsis thaliana. SEQ ID NO: 14 is the DNA sequence of cDNA ID no. 12420597 from Arabidopsis thaliana. SEQ ID NO: 15 is the DNA sequence of cDNA ID no. 12652542 from Arabidopsis thaliana. SEQ ID NO: 16 is the DNA sequence of cDNA ID no. 12330374 from Arabidopsis thaliana. SEQ ID NO: 17 is the DNA sequence of cDNA ID no. 12713945 from Arabidopsis thaliana. SEQ ID NO: 18 is the DNA sequence of cDNA ID no. 12488814 from Arabidopsis thaliana. SEQ ID NO: 19 is the DNA sequence of cDNA ID no. 12667451 from Arabidopsis thaliana. SEQ ID NO: 20 is the DNA sequence of cDNA ID no. 12719182 from Arabidopsis thaliana. SEQ ID NO: 21 is the DNA sequence of cDNA ID no. 12726480 from Arabidopsis thaliana. SEQ ID NO: 22 is the DNA sequence of cDNA ID no. 12653029 from Arabidopsis thaliana. SEQ ID NO: 23 is the DNA sequence of cDNA ID no. 12724226 from Arabidopsis thaliana. SEQ ID NO: 24 is the DNA sequence of cDNA ID no. 12348720 from Arabidopsis thaliana. SEQ ID NO: 25 is the DNA sequence of cDNA ID no. 12720530 from Arabidopsis thaliana. SEQ ID NO: 26 is the DNA sequence of cDNA ID no. 12575419 from Arabidopsis thaliana. SEQ ID NO: 27 is the DNA sequence of cDNA ID no. 12720534 from Arabidopsis thaliana. SEQ ID NO: 28 is the DNA sequence of cDNA ID no. 12577086 from Arabidopsis thaliana. SEQ ID NO: 29 is the DNA sequence of cDNA ID no. 12671019 from Arabidopsis thaliana. SEQ ID NO: 30 is the DNA sequence of cDNA ID no. 12720518 from Arabidopsis thaliana.
SEQ ID NO: 31 is the DNA sequence of cDNA ID no. 12603446 from Arabidopsis thaliana. SEQ ID NO: 32 is the DNA sequence of cDNA ID no. 12660455 from Arabidopsis thaliana. SEQ ID NO: 33 is the DNA sequence of cDNA ID no. 1817341 from Arabidopsis thaliana. SEQ ID NO: 34 is the DNA sequence of cDNA ID no. 12651234 from Arabidopsis thaliana. SEQ ID NO: 35 is the DNA sequence of cDNA ID no. 12666501 from Arabidopsis thaliana. SEQ ID NO: 36 is the DNA sequence of cDNA ID no. 12685377 from Arabidopsis thaliana. SEQ ID NO: 37 is the DNA sequence of cDNA ID no. 12602360 from Arabidopsis thaliana. SEQ ID NO: 38 is the DNA sequence of cDNA ID no. 12559722 from Arabidopsis thaliana. SEQ ID NO: 39 is the DNA sequence of cDNA ID no. 12657646 from Arabidopsis thaliana. SEQ ID NO: 40 is the DNA sequence of cDNA ID no. 12670002 from Arabidopsis thaliana.
SEQ ID NO: 41 is the DNA sequence of cDNA ID no. 12321246 from Arabidopsis thaliana. SEQ ID NO: 42 is the DNA sequence of cDNA ID no. 12680308 from Arabidopsis thaliana. SEQ ID NO: 43 is the DNA sequence of cDNA ID no. 12737089 from Arabidopsis thaliana. SEQ ID NO: 44 is the DNA sequence of cDNA ID no. 12657899 from Arabidopsis thaliana. SEQ ID NO: 45 is the DNA sequence of cDNA ID no. 12657903 from Arabidopsis thaliana. SEQ ID NO: 46 is the DNA sequence of cDNA ID no. 12727559 from Arabidopsis thaliana. SEQ ID NO: 47 is the DNA sequence of cDNA ID no. 5003066 from Arabidopsis thaliana. SEQ ID NO: 48 is the DNA sequence of cDNA ID no. 12736047 from Arabidopsis thaliana. SEQ ID NO: 49 is the DNA sequence of cDNA ID no. 12736059 from Arabidopsis thaliana. SEQ ID NO: 50 is the DNA sequence of cDNA ID no. 12601981 from Arabidopsis thaliana.
SEQ ID NO: 51 is the DNA sequence of cDNA ID no. 12331336 from Arabidopsis thaliana. SEQ ID NO: 52 is the DNA sequence of cDNA ID no. 12650952 from Arabidopsis thaliana. SEQ ID NO: 53 is the DNA sequence of cDNA ID no. 12650956 from Arabidopsis thaliana. SEQ ID NO: 54 is the DNA sequence of cDNA ID no. 12370997 from Arabidopsis thaliana. SEQ ID NO: 55 is the DNA sequence of cDNA ID no. 3036716 from Arabidopsis thaliana. SEQ ID NO: 56 is the DNA sequence of cDNA ID no. 12561167 from Arabidopsis thaliana. SEQ ID NO: 57 is the DNA sequence of cDNA ID no. 12721393 from Arabidopsis thaliana. SEQ ID NO: 58 is the DNA sequence of cDNA ID no. 12680005 from Arabidopsis thaliana. SEQ ID NO: 59 is the DNA sequence of cDNA ID no. 6431419 from Arabidopsis thaliana. SEQ ID NO: 60 is the DNA sequence of cDNA ID no. 6428581 from Arabidopsis thaliana.
SEQ ID NO: 61 is the DNA sequence of cDNA ID no. 12662082 from Arabidopsis thaliana. SEQ ID NO: 62 is the DNA sequence of cDNA ID no. 12675449 from Arabidopsis thaliana. SEQ ID NO: 63 is the DNA sequence of cDNA ID no. 12733016 from Arabidopsis thaliana. SEQ ID NO: 64 is the DNA sequence of cDNA ID no. 12575176 from Arabidopsis thaliana. SEQ ID NO: 65 is the DNA sequence of cDNA ID no. 12489109 from Arabidopsis thaliana. SEQ ID NO: 66 is the DNA sequence of cDNA ID no. 12670510 from Arabidopsis thaliana. SEQ ID NO: 67 is the DNA sequence of cDNA ID no. 12558789 from Arabidopsis thaliana. SEQ ID NO: 68 is the DNA sequence of cDNA ID no. 12658399 from Arabidopsis thaliana. SEQ ID NO: 69 is the DNA sequence of cDNA ID no. 12575003 from Arabidopsis thaliana. SEQ ID NO: 70 is the DNA sequence of cDNA ID no. 12488947 from Arabidopsis thaliana.
SEQ ID NO: 71 is the DNA sequence of cDNA ID no. 12658410 from Arabidopsis thaliana. SEQ ID NO: 72 is the DNA sequence of cDNA ID no. 12671988 from Arabidopsis thaliana. SEQ ID NO: 73 is the DNA sequence of cDNA ID no. 4929561 from Arabidopsis thaliana. SEQ ID NO: 74 is the DNA sequence of cDNA ID no. 12658403 from Arabidopsis thaliana. SEQ ID NO: 75 is the DNA sequence of cDNA ID no. 12711003 from Arabidopsis thaliana. SEQ ID NO: 76 is the DNA sequence of cDNA ID no. 12727851 from Arabidopsis thaliana. SEQ ID NO: 77 is the DNA sequence of cDNA ID no. 12734891 from Arabidopsis thaliana. SEQ ID NO: 78 is the DNA sequence of cDNA ID no. 12736978 from Arabidopsis thaliana. SEQ ID NO: 79 is the DNA sequence of cDNA ID no. 12722908 from Arabidopsis thaliana. SEQ ID NO: 80 is the DNA sequence of cDNA ID no. 4931385 from Arabidopsis thaliana.
SEQ ID NO: 81 is the DNA sequence of cDNA ID no. 12672204 from Arabidopsis thaliana. SEQ ID NO: 82 is the DNA sequence of cDNA ID no. 12604034 from Arabidopsis thaliana. SEQ ID NO: 83 is the DNA sequence of cDNA ID no. 6446020 from Arabidopsis thaliana. SEQ ID NO: 84 is the DNA sequence of cDNA ID no. 12323946 from Arabidopsis thaliana. SEQ ID NO: 85 is the DNA sequence of cDNA ID no. 12578257 from Arabidopsis thaliana. SEQ ID NO: 86 is the DNA sequence of cDNA ID no. 12348600 from Arabidopsis thaliana. SEQ ID NO: 87 is the DNA sequence of cDNA ID no. 12716909 from Arabidopsis thaliana. SEQ ID NO: 88 is the DNA sequence of cDNA ID no. 12672200 from Arabidopsis thaliana. SEQ ID NO: 89 is the DNA sequence of cDNA ID no. 12736967 from Arabidopsis thaliana. SEQ ID NO: 90 is the DNA sequence of cDNA ID no. 12672193 from Arabidopsis thaliana.
SEQ ID NO: 91 is the DNA sequence of cDNA ID no. 12728556 from Arabidopsis thaliana. SEQ ID NO: 92 is the DNA sequence of cDNA ID no. 12716894 from Arabidopsis thaliana. SEQ ID NO: 93 is the DNA sequence of cDNA ID no. 12736956 from Arabidopsis thaliana. SEQ ID NO: 94 is the DNA sequence of cDNA ID no. 12482287 from Arabidopsis thaliana. SEQ ID NO: 95 is the DNA sequence of cDNA ID no. 12654831 from Arabidopsis thaliana. SEQ ID NO: 96 is the DNA sequence of cDNA ID no. 12654823 from Arabidopsis thaliana. SEQ ID NO: 97 is the DNA sequence of cDNA ID no. 12654819 from Arabidopsis thaliana. SEQ ID NO: 98 is the DNA sequence of cDNA ID no. 12654815 from Arabidopsis thaliana. SEQ ID NO: 99 is the DNA sequence of cDNA ID no. 12726436 from Arabidopsis thaliana. SEQ ID NO: 100 is the DNA sequence of cDNA ID no. 12692662 from Arabidopsis thaliana.
SEQ ID NO: 101 is the DNA sequence of cDNA ID no. 12672307 from Arabidopsis thaliana. SEQ ID NO: 102 is the DNA sequence of cDNA ID no. 12664747 from Arabidopsis thaliana. SEQ ID NO: 103 is the DNA sequence of cDNA ID no. 12661189 from Arabidopsis thaliana. SEQ ID NO: 104 is the DNA sequence of cDNA ID no. 12718491 from Arabidopsis thaliana. SEQ ID NO: 105 is the DNA sequence of cDNA ID no. 12722285 from Arabidopsis thaliana. SEQ ID NO: 106 is the DNA sequence of cDNA ID no. 12654761 from Arabidopsis thaliana. SEQ ID NO: 107 is the DNA sequence of cDNA ID no. 12653581 from Arabidopsis thaliana. SEQ ID NO: 108 is the DNA sequence of cDNA ID no. 12336276 from Arabidopsis thaliana. SEQ ID NO: 109 is the DNA sequence of cDNA ID no. 12720013 from Arabidopsis thaliana. SEQ ID NO: 110 is the DNA sequence of cDNA ID no. 12700669 from Arabidopsis thaliana.
SEQ ID NO: 111 is the DNA sequence of cDNA ID no. 12733198 from Arabidopsis thaliana. SEQ ID NO: 112 is the DNA sequence of cDNA ID no. 12733202 from Arabidopsis thaliana. SEQ ID NO: 113 is the DNA sequence of cDNA ID no. 12670681 from Arabidopsis thaliana. SEQ ID NO: 114 is the DNA sequence of cDNA ID no. 12663534 from Arabidopsis thaliana. SEQ ID NO: 115 is the DNA sequence of cDNA ID no. 12672657 from Arabidopsis thaliana. SEQ ID NO: 116 is the DNA sequence of cDNA ID no. 12601847 from Arabidopsis thaliana. SEQ ID NO: 117 is the DNA sequence of cDNA ID no. 6442206 from Arabidopsis thaliana. SEQ ID NO: 118 is the DNA sequence of cDNA ID no. 12656365 from Arabidopsis thaliana. SEQ ID NO: 119 is the DNA sequence of cDNA ID no. 12672102 from Arabidopsis thaliana. SEQ ID NO: 120 is the DNA sequence of cDNA ID no. 12731901 from Arabidopsis thaliana.
SEQ ID NO: 121 is the DNA sequence of cDNA ID no. 12696098 from Arabidopsis thaliana. SEQ ID NO: 122 is the DNA sequence of cDNA ID no. 12653150 from Arabidopsis thaliana. SEQ ID NO: 123 is the DNA sequence of cDNA ID no. 12731797 from Arabidopsis thaliana. SEQ ID NO: 124 is the DNA sequence of cDNA ID no. 12731793 from Arabidopsis thaliana. SEQ ID NO: 125 is the DNA sequence of cDNA ID no. 6445548 from Arabidopsis thaliana. SEQ ID NO: 126 is the DNA sequence of cDNA ID no. 12731781 from Arabidopsis thaliana. SEQ ID NO: 127 is the DNA sequence of cDNA ID no. 12576154 from Arabidopsis thaliana. SEQ ID NO: 128 is the DNA sequence of cDNA ID no. 12729533 from Arabidopsis thaliana. SEQ ID NO: 129 is the DNA sequence of cDNA ID no. 12661185 from Arabidopsis thaliana. SEQ ID NO: 130 is the DNA sequence of cDNA ID no. 12574629 from Arabidopsis thaliana. SEQ ID NO: 131 is the DNA sequence of cDNA ID no. 12575795 from Arabidopsis thaliana. SEQ ID NO: 132 is the DNA sequence of cDNA ID no. 12731921 from Arabidopsis thaliana. SEQ ID NO: 133 is the DNA sequence of cDNA ID no. 12654597 from Arabidopsis thaliana. SEQ ID NO: 134 is the DNA sequence of cDNA ID no. 12680828 from Arabidopsis thaliana. SEQ ID NO: 135 is the DNA sequence of cDNA ID no. 12709025 from Arabidopsis thaliana. SEQ ID NO: 136 is the DNA sequence of cDNA ID no. 12602001 from Arabidopsis thaliana. SEQ ID NO: 137 is the protein sequence of cDNA ID no. 12324097 from Arabidopsis thaliana. SEQ ID NO: 138 is the protein sequence of cDNA ID no. 12489505 from Arabidopsis thaliana. SEQ ID NO: 139 is the protein sequence of cDNA ID no. 12559461 from Arabidopsis thaliana. SEQ ID NO: 140 is the protein sequence of cDNA ID no. 12672824 from Arabidopsis thaliana.
SEQ ID NO: 141 is the protein sequence of cDNA ID no. 7106710 from Arabidopsis thaliana. SEQ ID NO: 142 is the protein sequence of cDNA ID no. 12723568 from Arabidopsis thaliana. SEQ ID NO: 143 is the protein sequence of cDNA ID no. 12591866 from Arabidopsis thaliana. SEQ ID NO: 144 is the protein sequence of cDNA ID no. 12652339 from Arabidopsis thaliana. SEQ ID NO: 145 is the protein sequence of cDNA ID no. 12669291 from Arabidopsis thaliana. SEQ ID NO: 146 is the protein sequence of cDNA ID no. 12605081 from Arabidopsis thaliana. SEQ ID NO: 147 is the protein sequence of cDNA ID no. 12722269 from Arabidopsis thaliana. SEQ ID NO: 148 is the protein sequence of cDNA ID no. 4806743 from Arabidopsis thaliana. SEQ ID NO: 149 is the protein sequence of cDNA ID no. 12716191 from Arabidopsis thaliana. SEQ ID NO: 150 is the protein sequence of cDNA ID no. 12420597 from Arabidopsis thaliana.
SEQ ID NO: 151 is the protein sequence of cDNA ID no. 12652542 from Arabidopsis thaliana. SEQ ID NO: 152 is the protein sequence of cDNA ID no. 12330374 from Arabidopsis thaliana. SEQ ID NO: 153 is the protein sequence of cDNA ID no. 12713945 from Arabidopsis thaliana. SEQ ID NO: 154 is the protein sequence of cDNA ID no. 12488814 from Arabidopsis thaliana. SEQ ID NO: 155 is the protein sequence of cDNA ID no. 12667451 from Arabidopsis thaliana. SEQ ID NO: 156 is the protein sequence of cDNA ID no. 12719182 from Arabidopsis thaliana. SEQ ID NO: 157 is the protein sequence of cDNA ID no. 12726480 from Arabidopsis thaliana. SEQ ID NO: 158 is the protein sequence of cDNA ID no. 12653029 from Arabidopsis thaliana. SEQ ID NO: 159 is the protein sequence of cDNA ID no. 12724226 from Arabidopsis thaliana. SEQ ID NO: 160 is the protein sequence of cDNA ID no. 12348720 from Arabidopsis thaliana.
SEQ ID NO: 161 is the protein sequence of cDNA ID no. 12720530 from Arabidopsis thaliana. SEQ ID NO: 162 is the protein sequence of cDNA ID no. 12575419 from Arabidopsis thaliana. SEQ ID NO: 163 is the protein sequence of cDNA ID no. 12720534 from Arabidopsis thaliana. SEQ ID NO: 164 is the protein sequence of cDNA ID no. 12577086 from Arabidopsis thaliana. SEQ ID NO: 165 is the protein sequence of cDNA ID no. 12671019 from Arabidopsis thaliana. SEQ ID NO: 166 is the protein sequence of cDNA ID no. 12720518 from Arabidopsis thaliana. SEQ ID NO: 167 is the protein sequence of cDNA ID no. 12603446 from Arabidopsis thaliana. SEQ ID NO: 168 is the protein sequence of cDNA ID no. 12660455 from Arabidopsis thaliana. SEQ ID NO: 169 is the protein sequence of cDNA ID no. 1817341 from Arabidopsis thaliana. SEQ ID NO: 170 is the protein sequence of cDNA ID no. 12651234 from Arabidopsis thaliana.
SEQ ID NO: 171 is the protein sequence of cDNA ID no. 12666501 from Arabidopsis thaliana. SEQ ID NO: 172 is the protein sequence of cDNA ID no. 12685377 from Arabidopsis thaliana. SEQ ID NO: 173 is the protein sequence of cDNA ID no. 12602360 from Arabidopsis thaliana. SEQ ID NO: 174 is the protein sequence of cDNA ID no. 12559722 from Arabidopsis thaliana. SEQ ID NO: 175 is the protein sequence of cDNA ID no. 12657646 from Arabidopsis thaliana. SEQ ID NO: 176 is the protein sequence of cDNA ID no. 12670002 from Arabidopsis thaliana. SEQ ID NO: 177 is the protein sequence of cDNA ID no. 12321246 from Arabidopsis thaliana. SEQ ID NO: 178 is the protein sequence of cDNA ID no. 12680308 from Arabidopsis thaliana. SEQ ID NO: 179 is the protein sequence of cDNA ID no. 12737089 from Arabidopsis thaliana. SEQ ID NO: 180 is the protein sequence of cDNA ID no. 12657899 from Arabidopsis thaliana.
SEQ ID NO: 181 is the protein sequence of cDNA ID no. 12657903 from Arabidopsis thaliana. SEQ ID NO: 182 is the protein sequence of cDNA ID no. 12727559 from Arabidopsis thaliana. SEQ ID NO: 183 is the protein sequence of cDNA ID no. 5003066 from Arabidopsis thaliana. SEQ ID NO: 184 is the protein sequence of cDNA ID no. 12736047 from Arabidopsis thaliana. SEQ ID NO: 185 is the protein sequence of cDNA ID no. 12736059 from Arabidopsis thaliana. SEQ ID NO: 186 is the protein sequence of cDNA ID no. 12601981 from Arabidopsis thaliana. SEQ ID NO: 187 is the protein sequence of cDNA ID no. 12331336 from Arabidopsis thaliana. SEQ ID NO: 188 is the protein sequence of cDNA ID no. 12650952 from Arabidopsis thaliana. SEQ ID NO: 189 is the protein sequence of cDNA ID no. 12650956 from Arabidopsis thaliana. SEQ ID NO: 190 is the protein sequence of cDNA ID no. 12370997 from Arabidopsis thaliana.
SEQ ID NO: 191 is the protein sequence of cDNA ID no. 3036716 from Arabidopsis thaliana. SEQ ID NO: 192 is the protein sequence of cDNA ID no. 12561167 from Arabidopsis thaliana. SEQ ID NO: 193 is the protein sequence of cDNA ID no. 12721393 from Arabidopsis thaliana. SEQ ID NO: 194 is the protein sequence of cDNA ID no. 12680005 from Arabidopsis thaliana. SEQ ID NO: 195 is the protein sequence of cDNA ID no. 6431419 from Arabidopsis thaliana. SEQ ID NO: 196 is the protein sequence of cDNA ID no. 6428581 from Arabidopsis thaliana. SEQ ID NO: 197 is the protein sequence of cDNA ID no. 12662082 from Arabidopsis thaliana. SEQ ID NO: 198 is the protein sequence of cDNA ID no. 12675449 from Arabidopsis thaliana. SEQ ID NO: 199 is the protein sequence of cDNA ID no. 12733016 from Arabidopsis thaliana. SEQ ID NO: 200 is the protein sequence of cDNA ID no. 12575176 from Arabidopsis thaliana.
SEQ ID NO: 201 is the protein sequence of cDNA ID no. 12489109 from Arabidopsis thaliana. SEQ ID NO: 202 is the protein sequence of cDNA ID no. 12670510 from Arabidopsis thaliana. SEQ ID NO: 203 is the protein sequence of cDNA ID no. 12558789 from Arabidopsis thaliana. SEQ ID NO: 204 is the protein sequence of cDNA ID no. 12658399 from Arabidopsis thaliana. SEQ ID NO: 205 is the protein sequence of cDNA ID no. 12575003 from Arabidopsis thaliana. SEQ ID NO: 206 is the protein sequence of cDNA ID no. 12488947 from Arabidopsis thaliana. SEQ ID NO: 207 is the protein sequence of cDNA ID no. 12658410 from Arabidopsis thaliana. SEQ ID NO: 208 is the protein sequence of cDNA ID no. 12671988 from Arabidopsis thaliana. SEQ ID NO: 209 is the protein sequence of cDNA ID no. 4929561 from Arabidopsis thaliana. SEQ ID NO: 210 is the protein sequence of cDNA ID no. 12658403 from Arabidopsis thaliana.
SEQ ID NO: 211 is the protein sequence of cDNA ID no. 12711003 from Arabidopsis thaliana. SEQ ID NO: 212 is the protein sequence of cDNA ID no. 12727851 from Arabidopsis thaliana. SEQ ID NO: 213 is the protein sequence of cDNA ID no. 12734891 from Arabidopsis thaliana. SEQ ID NO: 214 is the protein sequence of cDNA ID no. 12736978 from Arabidopsis thaliana. SEQ ID NO: 215 is the protein sequence of cDNA ID no. 12722908 from Arabidopsis thaliana. SEQ ID NO: 216 is the protein sequence of cDNA ID no. 4931385 from Arabidopsis thaliana. SEQ ID NO: 217 is the protein sequence of cDNA ID no. 12672204 from Arabidopsis thaliana. SEQ ID NO: 218 is the protein sequence of cDNA ID no. 12604034 from Arabidopsis thaliana. SEQ ID NO: 219 is the protein sequence of cDNA ID no. 6446020 from Arabidopsis thaliana. SEQ ID NO: 220 is the protein sequence of cDNA ID no. 12323946 from Arabidopsis thaliana.
SEQ ID NO: 221 is the protein sequence of cDNA ID no. 12578257 from Arabidopsis thaliana. SEQ ID NO: 222 is the protein sequence of cDNA ID no. 12348600 from Arabidopsis thaliana. SEQ ID NO: 223 is the protein sequence of cDNA ID no. 12716909 from Arabidopsis thaliana. SEQ ID NO: 224 is the protein sequence of cDNA ID no. 12672200 from Arabidopsis thaliana. SEQ ID NO: 225 is the protein sequence of cDNA ID no. 12736967 from Arabidopsis thaliana. SEQ ID NO: 226 is the protein sequence of cDNA ID no. 12672193 from Arabidopsis thaliana. SEQ ID NO: 227 is the protein sequence of cDNA ID no. 12728556 from Arabidopsis thaliana. SEQ ID NO: 228 is the protein sequence of cDNA ID no. 12716894 from Arabidopsis thaliana. SEQ ID NO: 229 is the protein sequence of cDNA ID no. 12736956 from Arabidopsis thaliana. SEQ ID NO: 230 is the protein sequence of cDNA ID no. 12482287 from Arabidopsis thaliana.
SEQ ID NO: 231 is the protein sequence of cDNA ID no. 12654831 from Arabidopsis thaliana. SEQ ID NO: 232 is the protein sequence of cDNA ID no. 12654823 from Arabidopsis thaliana. SEQ ID NO: 233 is the protein sequence of cDNA ID no. 12654819 from Arabidopsis thaliana. SEQ ID NO: 234 is the protein sequence of cDNA ID no. 12654815 from Arabidopsis thaliana. SEQ ID NO: 235 is the protein sequence of cDNA ID no. 12726436 from Arabidopsis thaliana. SEQ ID NO: 236 is the protein sequence of cDNA ID no. 12692662 from Arabidopsis thaliana. SEQ ID NO: 237 is the protein sequence of cDNA ID no. 12672307 from Arabidopsis thaliana. SEQ ID NO: 238 is the protein sequence of cDNA ID no. 12664747 from Arabidopsis thaliana. SEQ ID NO: 239 is the protein sequence of cDNA ID no. 12661189 from Arabidopsis thaliana. SEQ ID NO: 240 is the protein sequence of cDNA ID no. 12718491 from Arabidopsis thaliana.
SEQ ID NO: 241 is the protein sequence of cDNA ID no. 12722285 from Arabidopsis thaliana. SEQ ID NO: 242 is the protein sequence of cDNA ID no. 1265 Al '61 from Arabidopsis thaliana. SEQ ID NO: 243 is the protein sequence of cDNA ID no. 12653581 from Arabidopsis thaliana. SEQ ID NO: 244 is the protein sequence of cDNA ID no. 12336276 from Arabidopsis thaliana. SEQ ID NO: 245 is the protein sequence of cDNA ID no. 12720013 from Arabidopsis thaliana. SEQ ID NO: 246 is the protein sequence of cDNA ID no. 12700669 from Arabidopsis thaliana. SEQ ID NO: 247 is the protein sequence of cDNA ID no. 12733198 from Arabidopsis thaliana. SEQ ID NO: 248 is the protein sequence of cDNA ID no. 12733202 from Arabidopsis thaliana. SEQ ID NO: 249 is the protein sequence of cDNA ID no. 12670681 from Arabidopsis thaliana. SEQ ID NO: 250 is the protein sequence of cDNA ID no. 12663534 from Arabidopsis thaliana.
SEQ ID NO: 251 is the protein sequence of cDNA ID no. 12672657 from Arabidopsis thaliana. SEQ ID NO: 252 is the protein sequence of cDNA ID no. 12601847 from Arabidopsis thaliana. SEQ ID NO: 253 is the protein sequence of cDNA ID no. 6442206 from Arabidopsis thaliana. SEQ ID NO: 254 is the protein sequence of cDNA ID no. 12656365 from Arabidopsis thaliana. SEQ ID NO: 255 is the protein sequence of cDNA ID no. 12672102 from Arabidopsis thaliana. SEQ ID NO: 256 is the protein sequence of cDNA ID no. 12731901 from Arabidopsis thaliana. SEQ ID NO: 257 is the protein sequence of cDNA ID no. 12696098 from Arabidopsis thaliana. SEQ ID NO: 258 is the protein sequence of cDNA ID no. 12653150 from Arabidopsis thaliana. SEQ ID NO: 259 is the protein sequence of cDNA ID no. 12731797 from Arabidopsis thaliana. SEQ ID NO: 260 is the protein sequence of cDNA ID no. 12731793 from Arabidopsis thaliana. SEQ ID NO: 261 is the protein sequence of cDNA ID no. 6445548 from Arabidopsis thaliana. SEQ ID NO: 262 is the protein sequence of cDNA ID no. 12731781 from Arabidopsis thaliana. SEQ ID NO: 263 is the protein sequence of cDNA ID no. 12576154 from Arabidopsis thaliana. SEQ ID NO: 264 is the protein sequence of cDNA ID no. 12729533 from Arabidopsis thaliana. SEQ ID NO: 265 is the protein sequence of cDNA ID no. 12661185 from Arabidopsis thaliana. SEQ ID NO: 266 is the protein sequence of cDNA ID no. 12574629 from Arabidopsis thaliana. SEQ ID NO: 267 is the protein sequence of cDNA ID no. 12575795 from Arabidopsis thaliana. SEQ ID NO: 268 is the protein sequence of cDNA ID no. 12731921 from Arabidopsis thaliana. SEQ ID NO: 269 is the protein sequence of cDNA ID no. 12654597 from Arabidopsis thaliana. SEQ ID NO: 270 is the protein sequence of cDNA ID no. 12680828 from Arabidopsis thaliana.
SEQ ID NO: 271 is the protein sequence of cDNA ID no. 12709025 from Arabidopsis thaliana. SEQ ID NO: 272 is the protein sequence of cDNA ID no. 12602001 from Arabidopsis thaliana. SEQ ID NO: 273 is the protein sequence of NCBI gi no. 14475586 from Arabidopsis thaliana. SEQ ID NO: 274 is the protein sequence of NCBI gi no. 5921933 from Lycopersicon esculentum. SEQ ID NO: 275 is the protein sequence of clone ID no. 756140 from Triticum aestivum. SEQ ID NO: 276 is the protein sequence of clone ID no. 570049 from Triticum aestivum. SEQ ID NO: 277 is the protein sequence of NCBI gi no. 50838910 from Oryza sativa subsp. japonica. SEQ ID NO: 278 is the protein sequence of clone ID no. 748722 from Triticum aestivum. SEQ ID NO: 279 is the protein sequence of NCBI gi no. 34911504 from Oryza sativa subsp. japonica. SEQ ID NO: 280 is the protein sequence of NCBI gi no. 34911506 from Oryza sativa subsp. japonica.
SEQ ID NO: 281 is the protein sequence of NCBI gi no. 14030557 from Zea mays. SEQ ID NO: 282 is the protein sequence of clone ID no. 261301 from Zea mays. SEQ ID NO: 283 is the protein sequence of NCBI gi no. 50919857 from Oryza sativa subsp. japonica. SEQ ID NO: 284 is the protein sequence of NCBI gi no. 7270098 from Arabidopsis thaliana. SEQ ID NO: 285 is the protein sequence of NCBI gi no. 23296518 from Arabidopsis thaliana. SEQ ID NO: 286 is the protein sequence of NCBI gi no. 7430660 from Arabidopsis thaliana. SEQ ID NO: 287 is the protein sequence of NCBI gi no. 47156870 from Solanum chacoense. SEQ ID NO: 288 is the protein sequence of clone ID no. 1119783 from Glycine max. SEQ ID NO: 289 is the protein sequence of NCBI gi no. 18000074 from Nicotiana tabacum. SEQ ID NO: 290 is the protein sequence of NCBI gi no. 27461067 from Nicotiana tabacum.
SEQ ID NO: 291 is the protein sequence of NCBI gi no. 27542762 from Sorghum bicolor. SEQ ID NO: 292 is the protein sequence of NCBI gi no. 5921924 from Sorghum bicolor. SEQ ID NO: 293 is the protein sequence of clone ID no. 782009 from Triticum aestivum. SEQ ID NO: 294 is the protein sequence of clone ID no. 527128 from Glycine max. SEQ ID NO: 295 is the protein sequence of NCBI gi no. 6739527 from Manihot esculenta. SEQ ID NO: 296 is the protein sequence of NCBI gi no. 56553510 from n/a . SEQ ID NO: 297 is the protein sequence of NCBI gi no. 17978651 from Pinus taeda. SEQ ID NO: 298 is the protein sequence of NCBI gi no. 2738998 from Glycine max. SEQ ID NO: 299 is the protein sequence of NCBI gi no. 27650337 from Solenostemon scutellarioides. SEQ ID NO: 300 is the protein sequence of NCBI gi no. 17978831 from Sesamum indicum.
SEQ ID NO: 301 is the protein sequence of NCBI gi no. 5915857 from Sorghum bicolor. SEQ ID NO: 302 is the protein sequence of NCBI gi no. 40641240 from Triticum aestivum. SEQ ID NO: 303 is the protein sequence of NCBI gi no. 40641238 from Triticum aestivum. SEQ ID NO: 304 is the protein sequence of NCBI gi no. 46798530 from Triticum aestivum. SEQ ID NO: 305 is the protein sequence of clone ID no. 299213 from Zea mays. SEQ ID NO: 306 is the protein sequence of NCBI gi no. 45331333 from Camptotheca acuminata. SEQ ID NO: 307 is the protein sequence of NCBI gi no. 26522472 from Lithospermum erythrorhizon. SEQ ID NO: 308 is the protein sequence of NCBI gi no. 22651521 from Ocimum basilicum. SEQ ID NO: 309 is the protein sequence of NCBI gi no. 22651519 from Ocimum basilicum. SEQ ID NO: 310 is the protein sequence of NCBI gi no. 46947675 from Ammi majus.
SEQ ID NO: 311 is the protein sequence of NCBI gi no. 7270932 from Arabidopsis thaliana. SEQ ID NO: 312 is the protein sequence of NCBI gi no. 7594541 from Arabidopsis thaliana. SEQ ID NO: 313 is the protein sequence of clone ID no. 763442 from Triticum aestivum. SEQ ID NO: 314 is the protein sequence of clone ID no. 1557311 from Zea mays. SEQ ID NO: 315 is the protein sequence of NCBI gi no. 34908446 from Oryza sativa subsp. japonica. SEQ ID NO: 316 is the protein sequence of NCBI gi no. 52353707 from Oryza sativa subsp. japonica. SEQ ID NO: 317 is the protein sequence of clone ID no. 1469649 from Zea mays. SEQ ID NO: 318 is the protein sequence of clone ID no. 1578373 from Zea mays. SEQ ID NO: 319 is the protein sequence of clone ID no. 1583137 from Zea mays. SEQ ID NO: 320 is the protein sequence of NCBI gi no. 60677681 from n/a .
SEQ ID NO: 321 is the protein sequence of NCBI gi no. 34902330 from Oryza sativa subsp. japonica. SEQ ID NO: 322 is the protein sequence of clone ID no. 718939 from Glycine max. SEQ ID NO: 323 is the protein sequence of NCBI gi no. 45260636 from Nicotiana tabacum. SEQ ID NO: 324 is the protein sequence of NCBI gi no. 60677685 from n/a . SEQ ID NO: 325 is the protein sequence of NCBI gi no. 60677683 from n/a . SEQ ID NO: 326 is the protein sequence of NCBI gi no. 9587211 from Vigna radiata. SEQ ID NO: 327 is the protein sequence of NCBI gi no. 46095226 from Cucumis melo. SEQ ID NO: 328 is the protein sequence of NCBI gi no. 15293197 from Arabidopsis thaliana. SEQ ID NO: 329 is the protein sequence of NCBI gi no. 20198022 from Arabidopsis thaliana. SEQ ID NO: 330 is the protein sequence of NCBI gi no. 15450601 from Arabidopsis thaliana.
SEQ ID NO: 331 is the protein sequence of NCBI gi no. 5042429 from Arabidopsis thaliana. SEQ ID NO: 332 is the protein sequence of NCBI gi no. 5042428 from Arabidopsis thaliana. SEQ ID NO: 333 is the protein sequence of NCBI gi no. 34098875 from Arabidopsis thaliana. SEQ ID NO: 334 is the protein sequence of NCBI gi no. 1432145 from Arabidopsis thaliana. SEQ ID NO: 335 is the protein sequence of NCBI gi no. 11934677 from Cucurbita maxima. SEQ ID NO: 336 is the protein sequence of NCBI gi no. 27764531 from Pisum sativum. SEQ ID NO: 337 is the protein sequence of NCBI gi no. 13022042 from Hordeum vulgare subsp. vulgare. SEQ ID NO: 338 is the protein sequence of NCBI gi no. 47498770 from Ginkgo biloba. SEQ ID NO: 339 is the protein sequence of NCBI gi no. 5915847 from Zea mays. SEQ ID NO: 340 is the protein sequence of NCBI gi no. 55775106 from Oryza sativa subsp. japonica.
SEQ ID NO: 341 is the protein sequence of NCBI gi no. 13641298 from Brassica rapa subsp. pekinensis. SEQ ID NO: 342 is the protein sequence of NCBI gi no. 4850398 from Arabidopsis thaliana. SEQ ID NO: 343 is the protein sequence of NCBI gi no. 10177808 from Arabidopsis thaliana. SEQ ID NO: 344 is the protein sequence of NCBI gi no. 31432758 from Oryza sativa subsp. japonica. SEQ ID NO: 345 is the protein sequence of NCBI gi no. 8346562 from Arabidopsis thaliana. SEQ ID NO: 346 is the protein sequence of NCBI gi no. 10442763 from Triticum aestivum. SEQ ID NO: 347 is the protein sequence of clone ID no. 292778 from Zea mays. SEQ ID NO: 348 is the protein sequence of NCBI gi no. 50251848 from Oryza sativa subsp. japonica. SEQ ID NO: 349 is the protein sequence of NCBI gi no. 50927909 from Oryza sativa subsp. japonica. SEQ ID NO: 350 is the protein sequence of clone ID no. 1357674 from Arabidopsis thaliana.
SEQ ID NO: 351 is the protein sequence of NCBI gi no. 7267123 from Arabidopsis thaliana. SEQ ID NO: 352 is the protein sequence of NCBI gi no. 22137048 from Arabidopsis thaliana. SEQ ID NO: 353 is the protein sequence of clone ID no. 156577 from Arabidopsis thaliana. SEQ ID NO: 354 is the protein sequence of NCBI gi no. 15223436 from Arabidopsis thaliana. SEQ ID NO: 355 is the protein sequence of NCBI gi no. 26449891 from Arabidopsis thaliana. SEQ ID NO: 356 is the protein sequence of NCBI gi no. 21536828 from Arabidopsis thaliana. SEQ ID NO: 357 is the protein sequence of clone ID no. 11278 from Arabidopsis thaliana. SEQ ID NO: 358 is the protein sequence of NCBI gi no. 27544770 from Arabidopsis thaliana. SEQ ID NO: 359 is the protein sequence of NCBI gi no. 9294419 from Arabidopsis thaliana. SEQ ID NO: 360 is the protein sequence of NCBI gi no. 10197650 from Brassica napus.
SEQ ID NO: 361 is the protein sequence of NCBI gi no. 10197652 from Brassica napus. SEQ ID NO: 362 is the protein sequence of NCBI gi no. 47933890 from Camptotheca acuminata. SEQ ID NO: 363 is the protein sequence of NCBI gi no. 5731998 from Liquidambar styraciflua. SEQ ID NO: 364 is the protein sequence of clone ID no. 545898 from Glycine max. SEQ ID NO: 365 is the protein sequence of NCBI gi no. 6688937 from Populus balsamifera subsp. trichocarpa. SEQ ID NO: 366 is the protein sequence of NCBI gi no. 46403211 from Centaurium erythraea. SEQ ID NO: 367 is the protein sequence of NCBI gi no. 57470997 from n/a . SEQ ID NO: 368 is the protein sequence of NCBI gi no. 5002354 from Lycopersicon esculentum x Lycopersicon peruvianum. SEQ ID NO: 369 is the protein sequence of NCBI gi no. 7270101 from Arabidopsis thaliana. SEQ ID NO: 370 is the protein sequence of NCBI gi no. 2642441 from Arabidopsis thaliana.
SEQ ID NO: 371 is the protein sequence of NCBI gi no. 33521521 from Medicago truncatula. SEQ ID NO: 372 is the protein sequence of NCBI gi no. 2642444 from Arabidopsis thaliana. SEQ ID NO: 373 is the protein sequence of NCBI gi no. 4894170 from Cicer arietinum. SEQ ID NO: 374 is the protein sequence of NCBI gi no. 4200044 from Glycyrrhiza echinata. SEQ ID NO: 375 is the protein sequence of NCBI gi no. 2443348 from Glycyrrhiza echinata. SEQ ID NO: 376 is the protein sequence of NCBI gi no. 7270718 from Arabidopsis thaliana. SEQ ID NO: 377 is the protein sequence of NCBI gi no. 7415996 from Lotus japonicus. SEQ ID NO: 378 is the protein sequence of NCBI gi no. 4006851 from Arabidopsis thaliana. SEQ ID NO: 379 is the protein sequence of NCBI gi no. 51968888 from Arabidopsis thaliana. SEQ ID NO: 380 is the protein sequence of NCBI gi no. 20197777 from Arabidopsis thaliana.
SEQ ID NO: 381 is the protein sequence of NCBI gi no. 2739008 from Glycine max. SEQ ID NO: 382 is the protein sequence of clone ID no. 779234 from Triticum aestivum. SEQ ID NO: 383 is the protein sequence of NCBI gi no. 50948231 from Oryza sativa subsp. japonica. SEQ ID NO: 384 is the protein sequence of NCBI gi no. 5921925 from Pinus radiata. SEQ ID NO: 385 is the protein sequence of clone ID no. 624225 from Glycine max. SEQ ID NO: 386 is the protein sequence of clone ID no. 627596 from Glycine max. SEQ ID NO: 387 is the protein sequence of NCBI gi no. 50916627 from Oryza sativa subsp. japonica. SEQ ID NO: 388 is the protein sequence of NCBI gi no. 50939101 from Oryza sativa subsp. japonica. SEQ ID NO: 389 is the protein sequence of NCBI gi no. 52076870 from Oryza sativa subsp. japonica. SEQ ID NO: 390 is the protein sequence of NCBI gi no. 438241 from Solanum melongena.
SEQ ID NO: 391 is the protein sequence of NCBI gi no. 9665096 from Arabidopsis thaliana. SEQ ID NO: 392 is the protein sequence of NCBI gi no. 12231886 from Matthiola incana. SEQ ID NO: 393 is the protein sequence of clone ID no. 595123 from Glycine max. SEQ ID NO: 394 is the protein sequence of NCBI gi no. 28603528 from Glycine max. SEQ ID NO: 395 is the protein sequence of NCBI gi no. 12231914 from Pelargonium x hortorum. SEQ ID NO: 396 is the protein sequence of NCBI gi no. 5921647 from Petunia x hybrida. SEQ ID NO: 397 is the protein sequence of NCBI gi no. 38093218 from Ipomoea purpurea. SEQ ID NO: 398 is the protein sequence of NCBI gi no. 44889632 from Allium cepa. SEQ ID NO: 399 is the protein sequence of NCBI gi no. 12231880 from Callistephus chinensis. SEQ ID NO: 400 is the protein sequence of NCBI gi no. 38093216 from Ipomoea nil.
SEQ ID NO: 401 is the protein sequence of NCBI gi no. 31431083 from Oryza sativa subsp. japonica. SEQ ID NO: 402 is the protein sequence of NCBI gi no. 14278925 from Perilla frutescens. SEQ ID NO: 403 is the protein sequence of NCBI gi no. 62086547 from n/a . SEQ ID NO: 404 is the protein sequence of NCBI gi no. 19910935 from Torenia hybrida. SEQ ID NO: 405 is the protein sequence of NCBI gi no. 42821962 from Ipomoea quamoclit. SEQ ID NO: 406 is the protein sequence of NCBI gi no. 38093221 from Ipomoea tricolor. SEQ ID NO: 407 is the protein sequence of NCBI gi no. 1890152 from Arabidopsis thaliana. SEQ ID NO: 408 is the protein sequence of NCBI gi no. 16973300 from Nicotiana attenuata. SEQ ID NO: 409 is the protein sequence of NCBI gi no. 29373135 from Citrus sinensis. SEQ ID NO: 410 is the protein sequence of NCBI gi no. 20160362 from Solanum tuberosum.
SEQ ID NO: 411 is the protein sequence of NCBI gi no. 1352186 from Linum usitatissimum. SEQ ID NO: 412 is the protein sequence of NCBI gi no. 7581989 from Lycopersicon esculentum. SEQ ID NO: 413 is the protein sequence of NCBI gi no. 21616113 from Cucumis melo. SEQ ID NO: 414 is the protein sequence of NCBI gi no. 33504426 from Medicago truncatula. SEQ ID NO: 415 is the protein sequence of NCBI gi no. 50919025 from Oryza sativa subsp. japonica. SEQ ID NO: 416 is the protein sequence of clone ID no. 315053 from Zea mays. SEQ ID NO: 417 is the protein sequence of clone ID no. 632974 from Triticum aestivum. SEQ ID NO: 418 is the protein sequence of NCBI gi no. 8572559 from Citrus sinensis. SEQ ID NO: 419 is the protein sequence of NCBI gi no. 4566493 from Pinus taeda. SEQ ID NO: 420 is the protein sequence of NCBI gi no. 1574976 from Populus tremuloides.
SEQ ID NO: 421 is the protein sequence of NCBI gi no. 12276037 from Populus x generosa. SEQ ID NO: 422 is the protein sequence of NCBI gi no. 3915089 from Populus kitakamiensis. SEQ ID NO: 423 is the protein sequence of NCBI gi no. 4688632 from Cicer arietinum. SEQ ID NO: 424 is the protein sequence of NCBI gi no. 1044868 from Glycine max. SEQ ID NO: 425 is the protein sequence of NCBI gi no. 586081 from Medicago sativa. SEQ ID NO: 426 is the protein sequence of NCBI gi no. 2624383 from Phaseolus vulgaris. SEQ ID NO: 427 is the protein sequence of NCBI gi no. 19864010 from Pisum sativum. SEQ ID NO: 428 is the protein sequence of NCBI gi no. 9957081 from Pisum sativum. SEQ ID NO: 429 is the protein sequence of NCBI gi no. 7430603 from Pisum sativum. SEQ ID NO: 430 is the protein sequence of NCBI gi no. 4753128 from Pisum sativum.
SEQ ID NO: 431 is the protein sequence of NCBI gi no. 586082 from Vigna radiata. SEQ ID NO: 432 is the protein sequence of NCBI gi no. 3915088 from Petroselinum crispum. SEQ ID NO: 433 is the protein sequence of NCBI gi no. 473229 from Catharanthus roseus. SEQ ID NO: 434 is the protein sequence of NCBI gi no. 12003968 from Capsicum annuum. SEQ ID NO: 435 is the protein sequence of NCBI gi no. 14423323 from Nicotiana tabacum. SEQ ID NO: 436 is the protein sequence of NCBI gi no. 14423325 from Nicotiana tabacum. SEQ ID NO: 437 is the protein sequence of NCBI gi no. 18859 from Helianthus tuberosus. SEQ ID NO: 438 is the protein sequence of NCBI gi no. 14192803 from Sorghum bicolor. SEQ ID NO: 439 is the protein sequence of clone ID no. 755965 from Triticum aestivum. SEQ ID NO: 440 is the protein sequence of NCBI gi no. 44889626 from Allium cepa.
SEQ ID NO: 441 is the protein sequence of NCBI gi no. 47933894 from Camptotheca acuminata. SEQ ID NO: 442 is the protein sequence of NCBI gi no. 9965897 from Gossypium arboreum. SEQ ID NO: 443 is the protein sequence of NCBI gi no. 9965899 from Gossypium arboreum. SEQ ID NO: 444 is the protein sequence of NCBI gi no. 3915112 from Zinnia elegans. SEQ ID NO: 445 is the protein sequence of NCBI gi no. 16555877 from Lithospermum erythrorhizon. SEQ ID NO: 446 is the protein sequence of NCBI gi no. 16555879 from Lithospermum erythrorhizon. SEQ ID NO: 447 is the protein sequence of NCBI gi no. 13548653 from Ruta graveolens. SEQ ID NO: 448 is the protein sequence of NCBI gi no. 24571503 from Ruta graveolens. SEQ ID NO: 449 is the protein sequence of NCBI gi no. 14210375 from Citrus x paradisi. SEQ ID NO: 450 is the protein sequence of NCBI gi no. 51702531 from Agastache rugosa.
SEQ ID NO: 451 is the protein sequence of NCBI gi no. 55168223 from Oryza sativa subsp. japonica. SEQ ID NO: 452 is the protein sequence of NCBI gi no. 3915095 from Glycyrrhiza echinata. SEQ ID NO: 453 is the protein sequence of NCBI gi no. 29123387 from Ammi majus. SEQ ID NO: 454 is the protein sequence of NCBI gi no. 21842133 from Zea mays. SEQ ID NO: 455 is the protein sequence of clone ID no. 1478163 from Zea mays. SEQ ID NO: 456 is the protein sequence of NCBI gi no. 54290354 from Oryza sativa subsp. japonica. SEQ ID NO: 457 is the protein sequence of NCBI gi no. 34912880 from Oryza sativa subsp. japonica. SEQ ID NO: 458 is the protein sequence of NCBI gi no. 34912882 from Oryza sativa subsp. japonica. SEQ ID NO: 459 is the protein sequence of NCBI gi no. 27754241 from Arabidopsis thaliana. SEQ ID NO: 460 is the protein sequence of NCBI gi no. 9294387 from Arabidopsis thaliana.
SEQ ID NO: 461 is the protein sequence of NCBI gi no. 15231899 from Arabidopsis thaliana. SEQ ID NO: 462 is the protein sequence of NCBI gi no. 9294386 from Arabidopsis thaliana. SEQ ID NO: 463 is the protein sequence of NCBI gi no. 24111277 from Arabidopsis thaliana. SEQ ID NO: 464 is the protein sequence of NCBI gi no. 13661770 from Lolium rigidum. SEQ ID NO: 465 is the protein sequence of NCBI gi no. 13661768 from Lolium rigidum. SEQ ID NO: 466 is the protein sequence of NCBI gi no. 13661766 from Lolium rigidum. SEQ ID NO: 467 is the protein sequence of NCBI gi no. 13661774 from Lolium rigidum. SEQ ID NO: 468 is the protein sequence of NCBI gi no. 13661772 from Lolium rigidum. SEQ ID NO: 469 is the protein sequence of NCBI gi no. 20465787 from Arabidopsis thaliana. SEQ ID NO: 470 is the protein sequence of clone ID no. 479101 from Glycine max.
SEQ ID NO: 471 is the protein sequence of NCBI gi no. 21542404 from Arabidopsis thaliana. SEQ ID NO: 472 is the protein sequence of NCBI gi no. 9294291 from Arabidopsis thaliana. SEQ ID NO: 473 is the protein sequence of NCBI gi no. 30923413 from Arabidopsis thaliana. SEQ ID NO: 474 is the protein sequence of NCBI gi no. 7649376 from Arabidopsis thaliana. SEQ ID NO: 475 is the protein sequence of NCBI gi no. 3164132 from Arabidopsis thaliana. SEQ ID NO: 476 is the protein sequence of NCBI gi no. 9294290 from Arabidopsis thaliana. SEQ ID NO: 477 is the protein sequence of NCBI gi no. 20197089 from Arabidopsis thaliana. SEQ ID NO: 478 is the protein sequence of NCBI gi no. 7430672 from Arabidopsis thaliana. SEQ ID NO: 479 is the protein sequence of NCBI gi no. 9294288 from Arabidopsis thaliana. SEQ ID NO: 480 is the protein sequence of NCBI gi no. 13878369 from Arabidopsis thaliana.
SEQ ID NO: 481 is the protein sequence of NCBI gi no. 11994442 from Arabidopsis thaliana. SEQ ID NO: 482 is the protein sequence of NCBI gi no. 4850393 from Arabidopsis thaliana. SEQ ID NO: 483 is the protein sequence of NCBI gi no. 11994438 from Arabidopsis thaliana. SEQ ID NO: 484 is the protein sequence of NCBI gi no. 23506035 from Arabidopsis thaliana. SEQ ID NO: 485 is the protein sequence of NCBI gi no. 16226474 from Arabidopsis thaliana. SEQ ID NO: 486 is the protein sequence of NCBI gi no. 11994434 from Arabidopsis thaliana. SEQ ID NO: 487 is the protein sequence of NCBI gi no. 15238720 from Arabidopsis thaliana. SEQ ID NO: 488 is the protein sequence of NCBI gi no. 13878407 from Arabidopsis thaliana. SEQ ID NO: 489 is the protein sequence of NCBI gi no. 51971443 from Arabidopsis thaliana. SEQ ID NO: 490 is the protein sequence of NCBI gi no. 1345641 from Thlaspi arvense.
SEQ ID NO: 491 is the protein sequence of NCBI gi no. 15238726 from Arabidopsis thaliana. SEQ ID NO: 492 is the protein sequence of NCBI gi no. 20465427 from Arabidopsis thaliana. SEQ ID NO: 493 is the protein sequence of NCBI gi no. 42566749 from Arabidopsis thaliana. SEQ ID NO: 494 is the protein sequence of NCBI gi no. 4584536 from Arabidopsis thaliana. SEQ ID NO: 495 is the protein sequence of NCBI gi no. 34365731 from Arabidopsis thaliana. SEQ ID NO: 496 is the protein sequence of clone ID no. 779326 from Triticum aestivum. SEQ ID NO: 497 is the protein sequence of NCBI gi no. 34903888 from Oryza sativa subsp. japonica. SEQ ID NO: 498 is the protein sequence of NCBI gi no. 34903876 from Oryza sativa subsp. japonica. SEQ ID NO: 499 is the protein sequence of NCBI gi no. 34903874 from Oryza sativa subsp. japonica. SEQ ID NO: 500 is the protein sequence of NCBI gi no. 34903880 from Oryza sativa subsp. japonica.
SEQ ID NO: 501 is the protein sequence of NCBI gi no. 20856398 from Arabidopsis thaliana. SEQ ID NO: 502 is the protein sequence of clone ID no. 158108 from Arabidopsis thaliana. SEQ ID NO: 503 is the protein sequence of NCBI gi no. 21553706 from Arabidopsis thaliana. SEQ ID NO: 504 is the protein sequence of NCBI gi no. 20259299 from Arabidopsis thaliana. SEQ ID NO: 505 is the protein sequence of NCBI gi no. 42572955 from Arabidopsis thaliana. SEQ ID NO: 506 is the protein sequence of NCBI gi no. 7268718 from Arabidopsis thaliana. SEQ ID NO: 507 is the protein sequence of clone ID no. 512411 from Glycine max. SEQ ID NO: 508 is the protein sequence of clone ID no. 791996 from Triticum aestivum. SEQ ID NO: 509 is the protein sequence of clone ID no. 579678 from Triticum aestivum. SEQ ID NO: 510 is the protein sequence of NCBI gi no. 7267932 from Arabidopsis thaliana.
SEQ ID NO: 511 is the protein sequence of NCBI gi no. 28973099 from Arabidopsis thaliana. SEQ ID NO: 512 is the protein sequence of NCBI gi no. 11345411 from Matthiola incana. SEQ ID NO: 513 is the protein sequence of clone ID no. 525053 from Glycine max. SEQ ID NO: 514 is the protein sequence of NCBI gi no. 51536592 from Arabidopsis thaliana. SEQ ID NO: 515 is the protein sequence of NCBI gi no. 3885331 from Arabidopsis thaliana. SEQ ID NO: 516 is the protein sequence of NCBI gi no. 3885330 from Arabidopsis thaliana. SEQ ID NO: 517 is the protein sequence of NCBI gi no. 9294001 from Arabidopsis thaliana. SEQ ID NO: 518 is the protein sequence of NCBI gi no. 9294002 from Arabidopsis thaliana. SEQ ID NO: 519 is the protein sequence of NCBI gi no. 28973083 from Arabidopsis thaliana. SEQ ID NO: 520 is the protein sequence of NCBI gi no. 30689861 from Arabidopsis thaliana.
SEQ ID NO: 521 is the protein sequence of NCBI gi no. 2344895 from Arabidopsis thaliana. SEQ ID NO: 522 is the protein sequence of NCBI gi no. 28460683 from Arabidopsis thaliana. SEQ ID NO: 523 is the protein sequence of NCBI gi no. 18481718 from Sorghum bicolor. SEQ ID NO: 524 is the protein sequence of clone ID no. 244116 from Zea mays. SEQ ID NO: 525 is the protein sequence of NCBI gi no. 50940815 from Oryza sativa subsp. japonica. SEQ ID NO: 526 is the protein sequence of NCBI gi no. 11934675 from Cucurbita maxima. SEQ ID NO: 527 is the protein sequence of NCBI gi no. 37954114 from Pisum sativum. SEQ ID NO: 528 is the protein sequence of NCBI gi no. 53792013 from Oryza sativa subsp. japonica. SEQ ID NO: 529 is the protein sequence of NCBI gi no. 50727139 from Oryza sativa subsp. japonica. SEQ ID NO: 530 is the protein sequence of NCBI gi no. 50957226 from Oryza sativa subsp. japonica.
SEQ ID NO: 531 is the protein sequence of NCBI gi no. 53792010 from Oryza sativa subsp. japonica. SEQ ID NO: 532 is the protein sequence of NCBI gi no. 34304722 from Stevia rebaudiana. SEQ ID NO: 533 is the protein sequence of NCBI gi no. 46811123 from Fragaria grandiflora. SEQ ID NO: 534 is the DNA sequence of promoter ID no. p13879 from Arabidopsis thaliana. SEQ ID NO: 535 is the DNA sequence of promoter ID no. p32449 from Arabidopsis thaliana. SEQ ID NO: 536 is the DNA sequence of promoter ID no. p326 from Arabidopsis thaliana. SEQ ID NO: 537 is the DNA sequence of promoter ID no. YP0050 from Arabidopsis thaliana. SEQ ID NO: 538 is the DNA sequence of promoter ID no. YP0190 from Arabidopsis thaliana.
IV. DETAILED DESCRIPTION
Before describing the present invention in detail, it is to be understood that this invention is not limited to particularly exemplified molecules or process parameters. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only, and is not intended to be limiting. In addition, the practice of the present invention will employ, unless otherwise indicated, conventional methods of plant biology, virology, microbiology, molecular biology, and recombinant DNA techniques, all of which are within the ordinary skill of the art. Such techniques are explained fully in the literature. See, e.g., Evans, et al., Handbook of Plant Cell Culture (1983, Macmillan Publishing Co.); Binding, Regeneration of Plants, Plant Protoplasts (1985, CRC Press); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); DNA Cloning: A Practical Approach, vol. I & II (D. Glover, ed.); Oligonucleotide Synthesis (N. Gait, ed., 1984); A Practical Guide to Molecular Cloning (1984); and Fundamental Virology, 2nd Edition, vol. I & II (B. N. Fields and D. M. Knipe, eds.); Ausubel, F. M., et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley and Sons, Inc., Media Pa.; Plant Molecular Biology: Essential Techniques, P. G. Jones and J. M. Sutton, New York, J. Wiley, 1997; Miglani, Gurbachan, Dictionary of Plant Genetics and Molecular Biology, New York, Food Products Press, 1998; Henry, R. J., Practical Applications of Plant Molecular Biology, New York, Chapman & Hall, 1997. In case of conflict, the present specification will control. All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
It must be noted that, as used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a polypeptide" includes a mixture of two or more polypeptides, and the like.
A. Definitions
The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. This term refers only to the primary structure of the molecule and thus includes double- and single-stranded DNA and RNA. The alphabetical representation of a nucleic acid can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
As used herein, the term "regulatory region" refers to a nucleic acid sequence that modulates, e.g., regulates, facilitates or drives, the expression of a second nucleic acid sequence. A regulatory region may include sequences that stimulate or repress transcription of a gene in a specific or non-specific manner and in an inducible or non-inducible manner. Regulatory regions may include multiple control elements. Typical control elements include, but are not limited to, transcription promoters, transcription enhancer elements, transcription termination signals, polyadenylation sequences (located 3' to the translation stop codon), sequences for optimization of initiation of translation (located 5' to the coding sequence), translation enhancing sequences, and translation termination sequences. Transcription promoters can include inducible promoters (where expression of a polynucleotide sequence operably linked to the promoter is induced by an analyte, cofactor, regulatory protein, etc.), tissue-specific promoters (where expression of a polynucleotide sequence operably linked to the promoter is induced only in selected tissue), repressible promoters (where expression of a polynucleotide sequence operably linked to the promoter is repressed by an analyte, cofactor, regulatory protein, etc.), and constitutive promoters.
"Expression enhancing sequences" typically refer to control elements that improve transcription or translation of a polynucleotide relative to the expression level in the absence of such control elements (for example, promoters, promoter enhancers, enhancer elements, and translational enhancers (e.g., Shine-Dalgarno sequences)).
As used herein, "operably linked" refers to covalent linkage of two or more nucleic acid sequences in such a way as to permit modulation of transcription and/or translation of the nucleic acid by the one or more regulatory regions. The control elements of the regulatory region need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter and the coding sequence and the promoter can still be considered "operably linked" to the coding sequence.
The term "exogenous" with respect to a nucleic acid indicates that the nucleic acid is part of a recombinant nucleic acid construct, or is not in its natural environment. For example, an exogenous nucleic acid can be a sequence from one species introduced into another species, i.e., a heterologous nucleic acid. Typically, such an exogenous nucleic acid is introduced into the other species via a recombinant nucleic acid construct. An exogenous nucleic acid can also be a sequence that is native to an organism and that has been reintroduced into cells of that organism. An exogenous nucleic acid that includes a native sequence can often be distinguished from the naturally occurring sequence by the presence of non-natural sequences linked to the exogenous nucleic acid, e.g., non-native regulatory sequences flanking a native sequence in a recombinant nucleic acid construct. In addition, stably transformed exogenous nucleic acids typically are integrated at positions other than the position where the native sequence is found and are transferable to the progeny of the cell.
Similarly, a P450 polypeptide can be endogenous or exogenous to a particular plant or plant cell. Exogenous P450 polypeptides, therefore, can include polypeptides that are native to a plant or plant cell, but that are expressed in a plant cell via a recombinant nucleic acid construct.
Likewise, a regulatory region can be exogenous or endogenous to a plant or plant cell. An exogenous regulatory region is a regulatory region that is part of a recombinant nucleic acid construct, or is not in its natural environment. For example, an Arabidopsis promoter present on a recombinant nucleic acid construct is an exogenous regulatory region when an Arabidopsis plant cell is transformed with the construct.
The term "polypeptide" or "protein sequence" is used in its broadest sense to refer to a compound of two or more subunit amino acids. A peptide of three or more amino acids is commonly called an oligopeptide if the peptide chain is short. If the peptide chain is long, the peptide is typically called a polypeptide or a protein. Full-length proteins, analogs, mutants and fragments thereof are encompassed by the definition. The terms also include post-expression modifications of the polypeptide, for example, glycosylation, acetylation, phosphorylation, methylation and the like. Furthermore, as ionizable amino and carboxyl groups are present in the molecule, a particular polypeptide may be obtained as an acidic or basic salt, or in neutral form. A polypeptide may be obtained directly from the source organism, or may be recombinantly or synthetically produced (see further below).
By "transgenic plant" is meant a plant into which one or more exogenous polynucleotides have been introduced using genetic engineering tools. Examples of means by which this can be accomplished are described below, and include Agrobacterium-mediated transformation, biolistic methods, electroporation, and the like.
"Misexpression or aberrant expression", as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.
As used herein, the term "pharmacological activity" refers to a property of a compound that confers a benefit when the compound is used to treat, diagnose or prevent a human or veterinary disease condition. In some embodiments, a pharmacological activity of a compound can provide a valuable pharmaceutical activity, e.g., the pharmacological activity of a compound that is the subject of an Investigational New Drug application before the United States Food and Drug Administration (FDA), is the subject of a phase I, II, or III clinical trial, is claimed in a patent listed in the FDA Orange Book or is approved for commercial sale by the FDA. In certain embodiments, the pharmacological activity of the compound is one of the following: Alzheimer's disease treatment, analgesic activity, anesthetic activity, anti-Addison's disease activity, anti-HIV activity, anti-infective activity, anti-inflammatory activity, antianginal activity, antiangiogenic activity, antianxiety activity, antiarrhythmic activity, antiarthritic activity, antiatherosclerotic activity, antibacterial activity, antibiotic activity, anticancer activity, anticholesterol activity, anticholinergic activity, anticoagulant activity, anticonvulsant activity, antidepressant activity, antidiabetic activity, antidiuretic activity, antiedemic activity, antifungal activity, antigout activity, antiglaucoma activity, antihemorrhagic activity, antihistamine activity, antihypertensive activity, antimalarial activity, antimicrobial activity, antimigraine activity, antimotion sickness activity, antineoplastic activity, antineuralgic activity, antiobesity activity, antioxidant activity, antiparasitic activity, antiparkinsonian activity, antipsoriasis activity, antipsychotic activity, antipyretic activity, antirheumatic activity, antiseizure activity, antithrombotic activity, antitussive activity, antiulcer activity, antiviral activity, antivitiligo activity, anxiolytic activity, appetite suppressant activity, asthma treatment, bronchodilator activity, cardiac depressant activity, cardiotonic activity, cerebral ischemia treatment, CNS depressant or stimulant activity, cognition enhancing activity, contraceptive activity, dermatitis treatment, diuretic activity, emetic activity, dopamine agonist activity, expectorant activity, gastrointestinal treatment, hepatoprotective activity, immunostimulant activity, immunosuppressant activity, antiimpotence activity, irritable bowel syndrome treatment, ischemia treatment, activity in metabolic and enzyme disorders, multiple sclerosis treatment, muscle relaxant activity, neuromuscular blocker activity, neuroprotective activity, opioid activity, osteoporosis treatment, purgative activity, radioprotective activity, respiratory stimulant activity, restenosis treatment, rheumatoid arthritis treatment, schizophrenia treatment, sedative/hypnotic activity, sepsis treatment, smoking deterrent activity, stroke treatment, thrombocytopenia treatment, thrombolytic therapy activity, vaccine activity, or vasodilator activity. In some embodiments, the candidate can have several activities. For example, the candidate can have anticancer, antipsoriasis, and antivitiligo activities (e.g., 8-methoxypsoralen). In certain embodiments, the candidate compound is not an antibiotic or an antifungal agent. Suitable candidate compounds include (i) compounds listed in the PharmaProjects database, a research and development tracking database available from PJB Publications at pjbpubs.com/pharmaprojects/index.htm.
B. Modification of Chemical Compounds
The present invention comprises plant cells expressing a P450 polypeptide that can be utilized to screen for P450s that modify candidate compounds, such as pharmacologically active compounds. The present invention also comprises plant cells expressing a P450 polypeptide that can be utilized to screen for substrate(s) for the expressed P450, e.g., pharmacologically active compounds that can serve as substrates for the P450. These discoveries permit the production of modified products of the substrate and/or the characterization of such modified products.
Thus, in one aspect, the invention provides methods of screening for a substrate of a P450. Such methods comprise contacting a pharmacologically active candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533. Following the contact, it is determined whether the candidate compound is modified. Detection of a modification to the candidate indicates that the candidate compound is a substrate. A collection of P450s can be screened by making a library of plant cells, each member of the library containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 in the collection. In this way, the collection of P450s can be screened by repeating the contacting and determining steps with plant cells from each member of the library.
Once a P450 has been identified that modifies a pharmacologically active compound, methods of making the modified compound can be used. Such methods comprise contacting a candidate compound with transgenic plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, whereby the P450 modifies the candidate compound. The modified candidate compound can then be recovered from the plant cells if desired.
In some embodiments of the invention, a candidate compound is contacted with a P450 for at least one hour, e.g., 2 hours, 4 hours, 8 hours, 12 hours, 1 day (24 hours), 2 days, 4 days, 7 days or 14 days. In certain embodiments, a candidate compound is contacted with a P450 for no more than 28 days, e.g., 14 days, 7 days, 4 days, 2 days or 1 day. Thus, the candidate compound can be contacted with the P450 for between about 1 hour and about 14 days, between about 8 hours and about 7 days, between about 8 hours and about 4 days, between about 8 hours and about 48 hours, between about 1 day and about 14 days, between about 2 days and about 14 days, between about 5 days and about 10 days, between about 6 days and about 14 days, between about 7 days and about 18 days, or between about 2 days and about 7 days.
A number of candidate compounds are suitable for use in the methods described herein. Examples of suitable candidate compounds having pharmacological activity include, without limitation, aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemisinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron, dextromethorphan, digitoxin, digoxin, doxepin hydrochloride, emetine dihydrochloride hydrate, emodin, enalapril, eserine, esomeprazole potassium, estradiol, fentanyl citrate, formoterol, furosemide, gabapentin, glimepiride, D-(+)-glucosamine hydrochloride, glycyrrhizin, gossypol, griseofulvin, hesperidin, homatropine hydrobromide, (-)-hydrastine, hydrocortisone, ibuprofen, idazoxan hydrochloride, ipratropium bromide, ivermectin, ketoconazole, ketoprofen, lanatoside C, lansoprazole, lapachol, levofloxacin hydrochloride, lidocaine, (-)-lobeline hydrochloride, lomerizine hydrochloride, mefenamic acid, 8-methoxypsoralen, miconazole, mitoxantrone hydrochloride, morphine, mycophenolic acid, nocodazole, nordihydroguaiaretic acid, (S,R)-noscapine, oleanolic acid, omeprazole, pantoprazole, phentolamine mesylate, picrotoxin, pilocarpine hydrochloride, D-pinitol, piperine, piperlongumine, pirenzepine dihydrochloride, podophyllotoxin, prasterone, pravastatin sodium salt, prednisolone, protoveratrine A, pyridostigmine bromide, quercetin dihydrate, quinidine (cinchonidine), quinine, rebamipide, rescinnamine, reserpine, resveratrol, retinoic acid, risperidone, rofecoxib, rotenone, rutin trihydrate, salicin, salicylic acid, santonin, (-)-scopolamine hydrobromide, sertraline hydrochloride, silybin, simvastatin, (-)-sparteine, streptozocin, strophanthidin, tetracycline, (+)-tetrandrine, thebaine, theobromine, theophylline, thymol, tobramycin, triamcinolone acetonide, tubocurarine chloride, ursolic acid, vincamine, warfarin pestanal, or yohimbine hydrochloride. Typically, a candidate compound is not an herbicide and/or an environmental contaminant.
In some embodiments, a candidate compound can be 8-methoxypsoralen (8-MOP). 8-MOP has antiproliferative activity, antipsoriasis activity, antivitiligo activity, and anticancer activity. In some cases, 8-MOP can be used for preventing cancer. In other cases, 8-MOP can also be used to facilitate smoking cessation.
In some embodiments, a compound modified by plant cells expressing a P450 can be further modified, e.g., compounds exposed to a P450 in metabolically rich plant cells can be modified both by action of the P450 itself as well as by subsequent action of endogenous enzymes to further modify the initial compound. Such further modifications of the compound include, but are not limited to, acetylation, carboxylation, glycosylation, demethylation, O-methylation, O-acetylation, decarboxylation, oxime generation, oxidation, phosphorylation, lipidation and acylation. Such subsequent modifications to a compound, mediated by enzymes endogenous to such plant cells, can provide a rich source of novel compounds that, in certain cases, may exhibit pharmacological activity.
Candidate compound collections may contain molecules isolated from natural sources, artificially synthesized molecules, or molecules synthesized, isolated, or otherwise prepared in such a manner so as to have one or more moieties variable, e.g., moieties that are independently isolated or randomly synthesized. In some embodiments, the collection consists of compounds known to have at least some therapeutic effect in humans or other animals. In some embodiments, the collection consists of odorants or flavorants.
Compound libraries for use in the invention may be purchased on the commercial market or prepared or obtained by means including, but not limited to, combinatorial chemistry techniques, fermentation methods, plant and cellular extraction procedures and the like (see, e.g., Cwirla et al., (1990) Biochemistry, 87, 6378-6382; Houghten et al., (1991) Nature, 354, 84-86; Lam et al., (1991) Nature, 354, 82-84; Brenner et al., (1992) Proc. Natl. Acad. Sci. USA, 89, 5381-5383; R. A. Houghten, (1993) Trends Genet., 9, 235-239; E. R. Felder, (1994) Chimia, 48, 512-541; Gallop et al., (1994) J. Med. Chem., 37, 1233-1251; Gordon et al., (1994) J. Med. Chem., 37, 1385-1401; Carell et al., (1995) Chem. Biol., 3, 171-183; Madden et al., Perspectives in Drug Discovery and Design 2, 269-282; Lebl et al., (1995) Biopolymers, 37, 177-198); small molecules assembled around a shared molecular structure; collections of chemicals that have been assembled by various commercial and noncommercial groups, natural products; extracts of marine organisms, fungi, bacteria, and plants.
Libraries of a variety of types of candidate compounds can be prepared in order to obtain members having one or more preselected attributes that can be prepared by a variety of techniques, including but not limited to parallel array synthesis (Houghton, (2000) Annu Rev Pharmacol Toxicol 40:273-82, Parallel array and mixture-based synthetic combinatorial chemistry); solution-phase combinatorial chemistry (Merritt, (1998) Comb Chem High Throughput Screen 1(2):57-72, Solution phase combinatorial chemistry; Coe et al., (1998-99) Mol Divers 4(1):31-8, Solution-phase combinatorial chemistry; Sun, (1999) Comb Chem High Throughput Screen 2(6):299-318, Recent advances in liquid-phase combinatorial chemistry); synthesis on soluble polymer (Gravert et al., (1997) Curr Opin Chem Biol 1(1):107-13, Synthesis on soluble polymers: new reactions and the construction of small molecules); and the like. See, e.g., Dolle et al., (1999) J Comb Chem 1(4):235-82, Comprehensive survey of combinatorial library synthesis: 1998; Freidinger R M., (1999) Nonpeptidic ligands for peptide and protein receptors, Current Opinion in Chemical Biology; and Kundu et al., Prog Drug Res 53:89-156, Combinatorial chemistry: polymer supported synthesis of peptide and non-peptide libraries.
A candidate compound need not be purified prior to contacting it with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450. The candidate compound can be a component of an extract, solution, suspension, blend or emulsion. Typically, the candidate compound represents at least 10% (w/w) of the solutes, on a dry basis, of an extract, solution, suspension, blend or emulsion. In some embodiments, the compound represents at least 25% (w/w), 50% (w/w), 75% (w/w), or 90% (w/w) of the solutes of the extract, solution, suspension, blend or emulsion. Alternatively, the candidate compound may be substantially pure and is contacted with plant cells in an amount of from 1 nM to about 100 mM, e.g., from about 100 nM to about 1 mM, or about 1 μM to about 100 μM.
In some embodiments, a plurality of candidate compounds can be contacted simultaneously with plant cells. The plurality of candidate compounds can be components of a mixture of known compounds, i.e., a pool of candidate compounds to be screened or modified. Alternatively, the plurality of candidate compounds can be components of an unknown or partially characterized mixture, e.g., an extract of an organism, such as bacteria, yeast or plant or animal tissue. The plurality of candidate compounds also can be components of a crude reaction mixture from a synthetic process or combinatorial chemistry reaction process.
Whether or not modification of a compound has occurred can be determined by various methods, including but not limited to gas chromatography-mass spectrometry (GC-MS), liquid chromatography-MS (LC-MS), HPLC, PDA detection, electrospray ionization-MS, Fourier-transform-ion-cyclotron-resonance-MS, or nuclear magnetic resonance (NMR). Preferably, changes in the chemical structure of the compound are detected by mass spectrometry. Mass spectrometry (MS) is a widely used technique for the characterization and identification of molecules, both in organic and inorganic chemistry. MS provides molecular weight information about a molecule. The molecular weight of a molecule is a useful piece of information in the identification of a particular molecule in a mixture of molecules. The term mass spectrometer refers to an analytical device that uses the difference in mass-to-charge ratio (m/z) of ionized atoms or molecules to separate them from each other. Mass spectrometry is therefore useful for quantitation of atoms or molecules and also for determining chemical and structural information about molecules. Molecules have distinctive fragmentation patterns that provide structural information to identify structural components. The general operation of a mass spectrometer is: (a) create gas-phase ions; (b) separate the ions in space or time based on their mass-to-charge ratio; and (c) measure the quantity of ions of each mass-to-charge ratio. The ion separation power of a mass spectrometer is described by its resolution.
In addition, mass spectrometers may be coupled to separation means such as gas chromatography (GC) and high performance liquid chromatography (HPLC). In gas-chromatography mass-spectrometry (GC/MS), capillary columns from a gas chromatograph are coupled directly to the mass spectrometer, optionally using a jet separator. In such an application, the gas chromatography (GC) column separates sample components from the sample gas mixture and the separated components are ionized and chemically analyzed in the mass spectrometer.
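As a minimal sketch of how a mass-spectrometric readout might be used to flag a modified compound, the following compares an observed monoisotopic mass against the parent mass plus the mass shift expected for a few of the modifications named above. The function name, the 0.005 Da tolerance, and the choice of modifications are illustrative assumptions and are not part of the described method; the shift values are standard monoisotopic mass differences.

```python
# Monoisotopic mass shifts (Da) for a few modifications mentioned in the text.
# The selection is illustrative, not exhaustive.
MASS_SHIFTS = {
    "hydroxylation": 15.9949,           # +O
    "demethylation": -14.0157,          # -CH2
    "O-methylation": 14.0157,           # +CH2
    "acetylation": 42.0106,             # +C2H2O
    "glycosylation (hexose)": 162.0528, # +C6H10O5
}


def match_modifications(parent_mass, observed_mass, tol=0.005):
    """Return names of modifications whose mass shift matches the observed
    difference between product and parent within +/- tol (Da)."""
    delta = observed_mass - parent_mass
    return sorted(
        name for name, shift in MASS_SHIFTS.items()
        if abs(delta - shift) <= tol
    )


# Hypothetical usage: a parent near the monoisotopic mass of
# 8-methoxypsoralen (C12H8O4, about 216.0423 Da) and a product 15.9949 Da heavier.
candidates = match_modifications(216.0423, 232.0372)
```

A match only suggests a candidate modification; structural confirmation would still rely on fragmentation patterns or NMR as described above.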
C. Cytochrome P450 Polypeptides
Methods and compositions described herein utilize P450 polypeptides and nucleic acids encoding them. The P450 superfamily includes a large number of enzymes that catalyze a wide variety of chemical reactions in a broad range of substrates. Some P450 enzymes are very substrate-specific, while others are more catholic in their substrate selection. Known P450 substrates include steroids, eicosanoids, fatty acids, lipid hydroperoxides, retinoids and xenobiotics, such as drugs, alcohols, carcinogens, antioxidants, organic solvents, dyes, pesticides, odorants and flavorants. P450 polypeptides are commonly known to be oxygenating enzymes. P450 polypeptides also catalyze 2-electron reduction of compounds, reductively cleave xenobiotics and lipid hydroperoxides, and mediate various dealkylation, epoxidation and sulfoxidation reactions.
Cytochrome P450s are heme-containing enzymes. One step in the oxidation of a substrate by a P450 is the addition of one atom of molecular oxygen, which is activated by a reduced heme iron, to the substrate. The reaction is usually, but not always, a hydroxylation reaction. The second oxygen atom is reduced to water, thereby accepting electrons from NAD(P)H via a flavoprotein or a ferredoxin. The activation of oxygen is common to all P450s. This activation takes place at the iron-protoporphyrin IX (heme). The heme iron is six-fold coordinated. It has a conserved thiolate residue as the fifth ligand and, in the inactive ferric form, a water molecule as its sixth ligand. The catalytic mechanism involves six steps:
(1) the substrate is bound and the water molecule is displaced;
(2) the ferric enzyme is reduced to a ferrous state by a one-electron transfer;
(3) molecular oxygen is then bound, resulting in a ferrous-dioxy species;
(4) a second reduction, followed by a proton transfer, leads to an iron-hydroperoxo intermediate;
(5) the cleavage of the O-O bond releases water and an activated iron-oxo ferryl species; and
(6) this iron-oxo ferryl oxidizes the substrate and the product is subsequently released.
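Summed over the six steps above, a typical hydroxylation has the overall monooxygenase stoichiometry shown below, in which one oxygen atom is inserted into the substrate and the other is reduced to water. The symbols RH (substrate) and ROH (hydroxylated product) are introduced here for illustration only.

```latex
\[
\mathrm{RH} + \mathrm{O_2} + \mathrm{NAD(P)H} + \mathrm{H^+}
\;\longrightarrow\;
\mathrm{ROH} + \mathrm{H_2O} + \mathrm{NAD(P)^+}
\]
```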
Typical cytochrome P450s contain characteristic domains as defined by Werck-Reichhart, et al. "Cytochromes P450", The Arabidopsis Book, Werck-Reichhart, D., Bak, S., Paquette, S. (2002). Typical domains include an I-helix domain involved in oxygen binding and activation, a K-helix domain containing an ERR triad involved in locking the heme into position, a heme-binding domain, and an N-terminal region consisting of a membrane insertion domain and a hinge region containing a section with basic residues followed by a proline rich region.
P450 polypeptides are often classified into clades based on sequence identity. An amino acid sequence identity of about 40% between polypeptides is accepted as indicating membership in a given P450 family. An amino acid sequence identity of about 55% among members of a given P450 family indicates that the members are part of the same subfamily, and an amino acid sequence identity of about 95% indicates that the members of a given P450 subfamily are isoforms. P450 classifications on the website drnelson.utmem.edu/CytochromeP450 can be utilized by those skilled in the art. Typically, functional homologs of a P450 polypeptide are members of the same P450 family, and often are members of the same subfamily or are isoforms. A P450 functional homolog typically has at least about 80% amino acid sequence identity, e.g., about 80% amino acid sequence identity, about 85% amino acid sequence identity, about 90% amino acid sequence identity, about 95% amino acid sequence identity, or at least about 98% amino acid sequence identity with a P450 comprising a sequence having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533. Table 1 provides an exemplary list of P450s useful in the invention and further provides the classification into a P450 family and locus in the Arabidopsis thaliana genome of each identified sequence.
Table 1: P450 Sequences
Figure imgf000048_0001
Figure imgf000049_0001
Figure imgf000050_0001
Figure imgf000051_0001
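The family, subfamily, and isoform thresholds described above can be sketched as a simple lookup. This is only an illustration: the text gives approximate values ("about 40%", "about 55%", "about 95%"), so treating them as hard cutoffs is an assumption made here, and the function name is hypothetical.

```python
def classify_p450_relationship(percent_identity):
    """Map a pairwise amino acid percent identity to the closest P450
    relationship, using the approximate thresholds as hard cutoffs
    (an assumption made for illustration)."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 95.0:
        return "isoforms"
    if percent_identity >= 55.0:
        return "same subfamily"
    if percent_identity >= 40.0:
        return "same family"
    return "different families"


# Hypothetical usage with an identity value computed elsewhere.
relationship = classify_p450_relationship(62.5)
```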
Suitable P450 polypeptides can be identified by analysis of polypeptide sequence alignments involving analysis of nonredundant databases using amino acid sequences of P450 polypeptides. Those polypeptides in the database that have greater than 40% sequence identity are candidates for further evaluation for suitability as a P450 polypeptide. If desired, manual inspection of such candidates can be carried out in order to narrow the number of candidates to be further evaluated. Manual inspection can be performed by selecting those candidates that appear to have domains suspected of being present in P450 polypeptides. See, e.g., the Pfam web site at sanger.ac.uk/Pfam and genome.wustl.edu/Pfam.
As used herein, the term "percent sequence identity" refers to the degree of identity between any given query sequence and a subject sequence. A percent identity for any query nucleic acid or amino acid sequence, e.g., a P450 polypeptide, relative to another subject nucleic acid or amino acid sequence can be determined as follows.
A query nucleic acid or amino acid sequence is aligned to one or more subject nucleic acid or amino acid sequences using the computer program ClustalW (version 1.83, default parameters), which allows alignments of nucleic acid or protein sequences to be carried out across their entire length (global alignment).
ClustalW calculates the best match between a query and one or more subject sequences, and aligns them so that identities, similarities and differences can be determined. Gaps of one or more residues can be inserted into a query sequence, a subject sequence, or both, to maximize sequence alignments. For fast pairwise alignment of nucleic acid sequences, the following default parameters are used: word size: 2; window size: 4; scoring method: percentage; number of top diagonals: 4; and gap penalty: 5. For multiple alignment of nucleic acid sequences, the following parameters are used: gap opening penalty: 10.0; gap extension penalty: 5.0; and weight transitions: yes. For fast pairwise alignment of protein sequences, the following parameters are used: word size: 1; window size: 5; scoring method: percentage; number of top diagonals: 5; gap penalty: 3. For multiple alignment of protein sequences, the following parameters are used: weight matrix: blosum; gap opening penalty: 10.0; gap extension penalty: 0.05; hydrophilic gaps: on; hydrophilic residues: glycine, proline, asparagine, aspartic acid, glutamine, glutamic acid, arginine, and lysine; residue-specific gap penalties: on. The output is a sequence alignment that reflects the relationship between sequences. ClustalW can be run, for example, at the Baylor College of Medicine Search Launcher site searchlauncher.bcm.tmc.edu/multi-align/multi-align and at the European Bioinformatics Institute site ebi.ac.uk/clustalw.
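For readers running ClustalW locally rather than through the web sites named above, the protein parameters listed above might be passed on the command line roughly as follows. This sketch only assembles the argument list and does not run the program. The executable name, the input file name, and the mapping of each parameter onto a flag are assumptions based on the ClustalW command-line help and should be checked against the installed version. Hydrophilic gaps and residue-specific gap penalties are assumed to be on by default, so no flag is passed for them.

```python
def clustalw_protein_args(infile, executable="clustalw"):
    """Build a ClustalW argument list for the protein settings in the text.
    Flag names are assumed from the ClustalW help output; verify locally."""
    return [
        executable,
        f"-INFILE={infile}",
        "-ALIGN",
        "-TYPE=PROTEIN",
        # Fast pairwise alignment settings
        "-QUICKTREE",
        "-KTUPLE=1",        # word size
        "-WINDOW=5",        # window size
        "-SCORE=PERCENT",   # scoring method
        "-TOPDIAGS=5",      # number of top diagonals
        "-PAIRGAP=3",       # gap penalty
        # Multiple alignment settings
        "-MATRIX=BLOSUM",   # weight matrix
        "-GAPOPEN=10.0",    # gap opening penalty
        "-GAPEXT=0.05",     # gap extension penalty
        # Gly, Pro, Asn, Asp, Gln, Glu, Arg, Lys
        "-HGAPRESIDUES=GPNDQERK",
    ]


# Hypothetical usage; "p450_candidates.fasta" is a placeholder file name.
args = clustalw_protein_args("p450_candidates.fasta")
```

The resulting list could be handed to `subprocess.run(args, check=True)` on a system where ClustalW is installed.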
To determine a percent identity between a query sequence and a subject sequence, the number of matching bases or amino acids in the alignment is divided by the total number of matched and mis-matched bases or amino acids excluding gaps, followed by multiplying the result by 100. The output is the percent identity of the subject sequence with respect to the query sequence.
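The calculation just described can be sketched as follows, assuming the two sequences have already been aligned (for example by ClustalW) and that gaps are written as "-". The function name and the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_query, aligned_subject, gap="-"):
    """Percent identity over aligned positions, excluding any column in
    which either sequence has a gap."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    mismatches = 0
    for q, s in zip(aligned_query, aligned_subject):
        if q == gap or s == gap:
            continue  # gapped columns are excluded from the count
        if q.upper() == s.upper():
            matches += 1
        else:
            mismatches += 1
    compared = matches + mismatches
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy alignment: 4 matches and 1 mismatch over 5 ungapped columns.
identity = percent_identity("MKT-LLA", "MRTAL-A")  # 80.0
```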
In some embodiments, the amino acid sequence of a suitable P450 polypeptide has greater than 60% sequence identity (e.g., > 60%, > 70%, > 80% or > 90%) to the amino acid sequence of the query P450 polypeptide.
D. Nucleic Acids Encoding P450 Polypeptides
Nucleic acids encoding P450 polypeptides for use in the methods and compositions described herein may be derived from, but not limited to, bacteria, yeasts, algae, animals and plants. They can be obtained also from various other sources. The sequences obtained from those sources may be connected to a suitable regulatory region. In vitro mutagenesis, gene shuffling or de novo synthesis can also enhance translation efficiency in the host plants or change the catalytic effect of the encoded enzyme. The modification includes the modification of the residue concerning catalytic functions but is not limited thereto. The P450 gene can be modified so as to have an optimum codon depending on the codon usage of the host or the organelle to be expressed. If necessary, such a gene sequence may be connected to a nucleic acid sequence encoding a suitable transit peptide. A DNA sequence encoding a polypeptide can be optimized for expression in a selected cell by using the codons preferred by the selected cell to represent the DNA copy of the desired polypeptide coding sequence.
Suitable cytochrome P450 nucleic acids include those that encode a P450 having an amino acid sequence having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
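Codon optimization of the kind described above can be illustrated by back-translating a protein sequence with one preferred codon per amino acid. The table below is a hypothetical placeholder, not measured codon usage for any host: each codon does encode the amino acid shown, but which synonymous codon is "preferred" would have to come from a codon usage table for the selected cell. The function and table names are likewise illustrative.

```python
# Hypothetical preferred-codon table (one codon per residue; "*" is stop).
# Replace with host-specific codon usage data before any real use.
PREFERRED_CODONS = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAA", "E": "GAG", "G": "GGA", "H": "CAC", "I": "ATC",
    "L": "CTT", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
    "*": "TGA",
}


def back_translate(protein, codon_table=PREFERRED_CODONS):
    """Return a DNA coding sequence that uses the preferred codon for
    every residue of the one-letter protein sequence."""
    try:
        return "".join(codon_table[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"no codon defined for residue {err.args[0]!r}") from None


# Toy peptide Met-Ala-Lys followed by a stop codon.
coding_sequence = back_translate("MAK*")  # "ATGGCTAAGTGA"
```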
A recombinant nucleic acid construct disclosed herein typically includes one or more regulatory regions operably linked to the nucleic acid encoding a P450 polypeptide. Typically, a promoter is located 5' to the sequence to be transcribed, and proximal to the transcriptional start site of the sequence. Promoters are upstream of the exon of a coding sequence and upstream of the first of multiple transcription start sites. In some embodiments, a promoter is positioned about 5,000 nucleotides upstream of the ATG of the exon of a coding sequence. In other embodiments, a promoter is positioned about 2,000 nucleotides upstream of the first of multiple transcription start sites. A promoter typically comprises at least a core promoter. Additionally, a promoter may also include at least one control element such as an upstream element. Such elements include upstream activation regions (UARs) and optionally, other DNA sequences that affect transcription of a polynucleotide such as a synthetic upstream element.
A 5' untranslated region (UTR) is transcribed, but is not translated, and lies between the start site of the transcript and the translation initiation codon and may include the +1 nucleotide. A 3' UTR can be positioned between the translation termination codon and the end of the transcript. UTRs can have particular functions such as increasing mRNA message stability or translation attenuation. Examples of 3' UTRs include, but are not limited to polyadenylation signals and transcription termination sequences.
The choice of regulatory regions to be included depends upon several factors, including, but not limited to, efficiency, selectability, inducibility, desired expression level, and cell or tissue specificity. For example, constitutive, tissue-nonspecific or developmental stage-nonspecific promoters can be used. An example of a constitutive promoter is the CaMV35S promoter. Inducible promoters such as heat shock promoters, wound responsive promoters such as hydroxyproline-rich protein promoters, chemically inducible promoters such as nitrate reductase promoters and dark inducible promoters such as asparagine synthetase promoters can be useful. See, e.g., U.S. Pat. No. 5,256,558.
Tissue-, organ- and cell-specific promoters that confer transcription only or predominantly in a particular tissue, organ, and cell type, respectively, can be used if desired. For example, promoters specific to vegetative tissues such as the stem, parenchyma, ground meristem, vascular bundle, cambium, phloem, cortex, shoot apical meristem, lateral shoot meristem, root apical meristem, lateral root meristem, leaf primordium, leaf mesophyll, or leaf epidermis can be suitable regulatory regions.
An example of a vegetative tissue promoter is promoter p32449 (gDNA ID 7418615; SEQ ID NO:45), which has preferential activity in the roots. Other vegetative promoters include: a maize leaf-specific gene described by Busk (1997) Plant J., 11:1285-1295; kn1-related genes from maize and other species; and constitutive Cauliflower mosaic virus 35S. Other suitable promoters include those that have preferential activity in organs like the stem and silique and/or that have preferential activity in specific cell-types, such as vascular bundles. Examples include YP0086 (gDNA ID 7418340), YP0188 (gDNA ID 7418570), YP0263 (gDNA ID 7418658), and others, as set forth in U.S. Patent Application Ser. Nos. 60/518,075; 60/544,771; 60/505,689; 60/583,691; 10/957,569; and 60/558,869.
A cell type or tissue-specific promoter can drive expression of operably linked sequences in tissues other than the target tissue. Thus, as used herein a cell-type or tissue-specific promoter is one that drives expression preferentially in the target tissue, but can also lead to some expression in other cell types or tissues as well. Methods for identifying and characterizing promoter regions in plant genomic DNA are known.
Other suitable regulatory regions that can be operably linked to a nucleic acid capable of encoding a P450 polypeptide include, without limitation, those listed in the Regulatory Regions Table below.
Table 2: Regulatory Regions
Figure imgf000054_0001
Figure imgf000055_0001
Suitable P450 polypeptides can be identified by analysis of nucleotide sequence alignments utilizing known sequence alignment methods as described above for amino acid sequences. In some embodiments, the nucleotide sequence of a suitable subject nucleic acid has greater than 70% sequence identity (e.g., > 75%, > 80%, > 90%, > 91%, > 92%, > 93%, > 94%, > 95%, > 96%, > 97%, > 98%, or > 99%) to the nucleotide sequence of the query nucleic acid.
Alternatively, the degree of sequence similarity between nucleic acid sequences can be determined by hybridization of nucleic acids under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments. Two DNA, or two polypeptide sequences are "substantially homologous" to each other when the sequences exhibit at least about 43%-60%, preferably 60-70%, more preferably 70%-85%, more preferably at least about 85%-90%, more preferably at least about 90%-95%, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules, or any percentage between the above-specified ranges, as determined using the methods above. As used herein, substantially homologous also refers to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al, supra.
The degree of sequence identity between two nucleic acid molecules affects the efficiency and strength of hybridization events between such molecules. A partially identical nucleic acid sequence will at least partially inhibit a completely identical sequence from hybridizing to a target molecule. Inhibition of hybridization of the completely identical sequence can be assessed using hybridization assays that are well known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the like, see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Third Edition, (1989) Cold Spring Harbor, N. Y.). Such assays can be conducted using varying degrees of selectivity, for example, using conditions varying from low to high stringency. If conditions of low stringency are employed, the absence of non-specific binding can be assessed using a secondary probe that lacks even a partial degree of sequence identity (for example, a probe having less than about 30% sequence identity with the target molecule), such that, in the absence of non-specific binding events, the secondary probe will not hybridize to the target.
When utilizing a hybridization-based detection system, a nucleic acid probe is chosen that is complementary to a target nucleic acid sequence, and then by selection of appropriate conditions the probe and the target sequence "selectively hybridize," or bind, to each other to form a hybrid molecule. A nucleic acid molecule that is capable of hybridizing selectively to a target sequence under "moderately stringent" conditions typically hybridizes under conditions that allow detection of a target nucleic acid sequence of at least about 10-14 nucleotides in length having at least approximately 70% sequence identity with the sequence of the selected nucleic acid probe. Stringent hybridization conditions typically allow detection of target nucleic acid sequences of at least about 10-14 nucleotides in length having a sequence identity of greater than about 90-95% with the sequence of the selected nucleic acid probe. Hybridization conditions useful for probe/target hybridization where the probe and target have a specific degree of sequence identity, can be determined as is known in the art (see, for example, Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins, (1985) Oxford; Washington, DC; IRL Press).
With respect to stringency conditions for hybridization, it is well known in the art that numerous equivalent conditions can be employed to establish a particular stringency by varying, for example, the following factors: the length and nature of probe and target sequences, base composition of the various sequences, concentrations of salts and other hybridization solution components, the presence or absence of blocking agents in the hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol), hybridization reaction temperature and time parameters, as well as varying wash conditions. A particular set of hybridization conditions is selected following standard methods in the art (see, for example, Sambrook, et al., supra).
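The percent sequence identity thresholds discussed in this section (for example, greater than 70% identity between a subject and a query nucleic acid) can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned by a separate alignment program and are supplied as equal-length strings with "-" marking gaps. The function names and the choice of denominator (all aligned columns, including columns with a gap in one sequence) are assumptions made for illustration and are not taken from this disclosure.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumption: the denominator counts every column except those that are
    gaps in both sequences. Other conventions exist.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for q, s in zip(aligned_query.upper(), aligned_subject.upper()):
        if q == "-" and s == "-":
            continue  # column carries no information
        compared += 1
        if q == s and q != "-":
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


def exceeds_identity_threshold(aligned_query: str, aligned_subject: str,
                               threshold: float = 70.0) -> bool:
    """True when identity is strictly greater than the threshold (in percent)."""
    return percent_identity(aligned_query, aligned_subject) > threshold


if __name__ == "__main__":
    # Hypothetical 10-nucleotide alignment with one mismatch.
    print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))
    print(exceeds_identity_threshold("ATGCATGCAT", "ATGCATGCAA"))
```

Because the result depends on how gaps are counted, any comparison against a stated threshold should use the same convention as the alignment method relied upon.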
E. Transgenic Plants
One aspect of the invention provides transgenic plant cells comprising a recombinant nucleic acid construct that comprises a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533. The invention also encompasses transgenic plants comprising the transgenic plant cells described herein.
A plant or plant cell used in methods of the invention contains a recombinant nucleic acid construct as described herein. The plant or plant cells can be transformed by having the construct integrated into its genome, i.e., be stably transformed. Stably transformed cells typically retain the introduced nucleic acid sequence with each cell division. The plant or plant cells can also be transformed by having the construct not integrated into its genome. Such transformed cells are called transiently transformed cells. Transiently transformed cells typically lose all or some portion of the introduced nucleic acid construct with each cell division such that the introduced nucleic acid cannot be detected in daughter cells after a sufficient number of cell divisions. Both transiently transformed and stably transformed transgenic plants and plant cells can be useful in the methods described herein.
In certain embodiments, transgenic plant cells used in methods described herein constitute part or all of a whole plant or explant. Such plants can be contacted with a candidate compound, for example, as seedlings in liquid medium, on solid medium, or hydroponically.
In other embodiments, transgenic plant cells are grown in culture and contacted with a candidate compound. Such plant cell cultures can be undifferentiated cells such as a callus culture, or can be cultures of a differentiated tissue or organ, e.g., an embryogenic cell culture or a root culture from a tissue or organ explant. In some embodiments, protoplasts are suitable for contacting with a candidate compound. For the purposes of this invention, solid and/or liquid tissue culture techniques can be used. When using solid medium, transgenic plant cells can be placed directly onto the medium or can be placed onto a filter that is then placed in contact with the medium. When using liquid medium, transgenic plant cells can be placed onto a floatation device, e.g., a porous membrane that contacts the liquid medium. Solid medium typically is made from liquid medium by adding agar. For example, a solid medium can be Murashige and Skoog (MS) medium containing agar and a suitable concentration of an auxin, e.g., 2,4-dichlorophenoxyacetic acid (2,4-D), and a suitable concentration of a cytokinin, e.g., kinetin.
In some embodiments, transgenic plants are grown in a greenhouse or in a field and bred as desired for a particular purpose, e.g., to introduce a recombinant nucleic acid into other lines, to transfer a recombinant nucleic acid to other species or for further selection of other desirable traits. Alternatively, transgenic plants can be propagated vegetatively for those species amenable to such techniques. Progeny includes descendants of a particular plant or plant line. Progeny of an instant plant include seeds formed on F1, F2, F3, F4, F5, F6 and subsequent generation plants, or seeds formed on BC1, BC2, BC3, and subsequent generation plants, or seeds formed on F1BC1, F1BC2, F1BC3, and subsequent generation plants. Seeds produced by a transgenic plant can be grown and then selfed (or outcrossed and selfed) to obtain seeds homozygous for the nucleic acid construct.
Techniques for introducing exogenous nucleic acids into monocotyledonous and dicotyledonous plants are known in the art, and include, without limitation, Agrobacterium-mediated transformation, viral vector-mediated transformation, electroporation and particle gun transformation, e.g., U.S. Patents 5,538,880, 5,204,253, 6,329,571, 6,013,863 and Yamamoto et al., In Vitro Cellular and Development Biology - Plant 37(3):349-353 (2001). If a cell or tissue culture is used as the recipient tissue for transformation, plants can be regenerated from transformed cultures if desired, by techniques known to those skilled in the art.
A suitable group of plant species includes dicots, such as Arabidopsis, safflower, alfalfa, soybean, coffee, rapeseed, or sunflower. Also suitable are monocots such as Lemna, corn, wheat, rye, barley, oat, rice, millet, amaranth or sorghum. Suitable plants include vegetable crops or root crops such as lettuce, carrot, onion, broccoli, peas, sweet corn, popcorn, tomato, potato, beans (including kidney beans, lima beans, dry beans, green beans) and the like. Also suitable are fruit crops such as grape, strawberry, pineapple, melon (e.g., watermelon, cantaloupe), peach, pear, apple, cherry, orange, lemon, grapefruit, plum, mango, banana, and palm.
Thus, the methods described herein can be utilized with dicotyledonous plants belonging to the orders Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales. Methods described herein can also be utilized with monocotyledonous plants belonging to the orders Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales, or with plants belonging to Gymnospermae, e.g., Pinales, Ginkgoales, Cycadales and Gnetales.
Thus, the invention has use over a broad range of plant species, including species from the genera Allium, Alseodaphne, Anacardium, Arachis, Asparagus, Atropa, Avena, Beilschmiedia, Brassica, Citrus, Citrullus, Capsicum, Catharanthus, Carthamus, Cocculus, Cocos, Coffea, Croton, Cucumis, Cucurbita, Daucus, Duguetia, Elaeis, Eschscholzia, Ficus, Fragaria, Glaucium, Glycine, Gossypium, Helianthus, Heterocallis, Hevea, Hordeum, Hyoscyamus, Lactuca, Landolphia, Linum, Litsea, Lolium, Lupinus, Lycopersicon, Malus, Manihot, Majorana, Medicago, Musa, Nicotiana, Olea, Oryza, Panicum, Pennisetum, Papaver, Parthenium, Persea, Phaseolus, Pinus, Pistacia, Pisum, Pyrus, Prunus, Raphanus, Rhizocarya, Ricinus, Secale, Senecio, Sinomenium, Sinapis, Solanum, Sorghum, Stephania, Theobroma, Trigonella, Triticum, Vicia, Vinca, Vitis, Vigna and Zea.
Other suitable plant species include green alga belonging to the group Viridaeplantae, and the orders Volvocales, Coleochaetales, Ulvales, and Bryopsidales. Suitable genera include Brachiomonas, Carteria, Cercidium, Chlainomonas, Chlamydomonas (especially Chlamydomonas reinhardii), Chloroceras, Chlorogonium, Chloromonas, Diplostauron, Gigantochloris, Gloeomonas, Heterochlamydomonas, Hyalobrachion, Hyalogonium, Lobomonas, Oltmannsiella, Parapolytoma, Peterflella, Phyllariochloris, Polytoma, Protococcus, Provasoliella, Pyramichlamys, Sphaerella, Sphaerellopsis, Spirogonium, Tetrablepharis, Tetratoma, Tussetia, and Vitreochlamys.
F. Compositions of matter
Also provided herein is a composition of matter comprising 5-O-β-D-glucopyranosyl-8-methoxypsoralen, i.e., Compound 3 set forth in Figure 88.
G. Pharmaceutical compositions
The pharmaceutical compositions provided herein can contain therapeutically effective amounts of one or more of the compounds provided herein that are useful in the treatment or amelioration of one or more of the symptoms associated with a disease such as cancer, vitiligo, psoriasis, or nicotine addiction, and a pharmaceutically acceptable carrier. Pharmaceutical carriers suitable for administration of the compounds provided herein include any such carriers known to those skilled in the art to be suitable for the particular mode of administration. In addition, the compounds can be formulated as the sole pharmaceutically active ingredient in the composition or may be combined with other active ingredients.
The compositions can contain one or more compounds provided herein. The compounds can be formulated into suitable pharmaceutical preparations such as solutions, suspensions, tablets, dispersible tablets, pills, capsules, powders, sustained release formulations or elixirs, for oral administration or topical administration or in sterile solutions or suspensions for parenteral administration, as well as transdermal patch preparation and dry powder inhalers. The compounds described above can be formulated into pharmaceutical compositions using techniques and procedures well known in the art (see, e.g., Ansel Introduction to Pharmaceutical Dosage Forms, Fourth Edition 1985, 126).
In the compositions, effective concentrations of one or more compounds or pharmaceutically acceptable derivatives thereof can be mixed with a suitable pharmaceutical carrier. The compounds can be derivatized as the corresponding salts, esters, enol ethers or esters, acetals, ketals, orthoesters, hemiacetals, hemiketals, acids, bases, solvates, hydrates or prodrugs prior to formulation, as described above. The concentrations of the compounds in the compositions can be effective for delivery of an amount, upon administration, that treats or ameliorates one or more of the symptoms of a disease, e.g., psoriasis.
In some cases, the compositions can be formulated for single dosage administration. To formulate a composition, the weight fraction of compound can be dissolved, suspended, dispersed or otherwise mixed in a selected carrier at an effective concentration such that the treated condition is relieved or one or more symptoms are ameliorated.
The active compound can be included in the pharmaceutically acceptable carrier in an amount sufficient to exert a therapeutically useful effect in the absence of undesirable side effects on the patient treated. The therapeutically effective concentration can be determined empirically by testing the compounds in in vitro and in vivo systems, and then extrapolated therefrom for dosages for humans.
The concentration of active compound in the pharmaceutical composition can depend on absorption, inactivation and excretion rates of the active compound, the physicochemical characteristics of the compound, the dosage schedule, and amount administered as well as other factors known to those of skill in the art.
Pharmaceutical dosage unit forms can be prepared to provide from about 0.01 mg, 0.1 mg or 1 mg to about 500 mg, 1000 mg or 2000 mg of the active ingredient or a combination of essential ingredients per dosage unit form.
The active ingredient can be administered at once, or can be divided into a number of smaller doses to be administered at intervals of time. It is understood that the precise dosage and duration of treatment is a function of the disorder being treated and can be determined empirically using known testing protocols or by extrapolation from in vivo or in vitro test data. It is to be noted that concentrations and dosage values can also vary with the severity of the condition to be alleviated. It is to be further understood that for any particular subject, specific dosage regimens should be adjusted over time according to the individual need and the professional judgment of the person administering or supervising the administration of the compositions, and that the concentration ranges set forth herein are exemplary only and are not intended to limit the scope or practice of the claimed compositions.
In instances in which the compounds exhibit insufficient solubility, methods for solubilizing compounds can be used. Such methods are known to those of skill in this art, and can include, but are not limited to, using cosolvents, such as dimethylsulfoxide (DMSO), using surfactants, such as TWEEN®, or dissolution in aqueous sodium bicarbonate. Derivatives of the compounds, such as prodrugs of the compounds can also be used in formulating effective pharmaceutical compositions.
Upon mixing or addition of the compound(s), the resulting mixture can be a solution, suspension, emulsion or the like. The form of the resulting mixture can depend upon a number of factors, including the intended mode of administration and the solubility of the compound in the selected carrier or vehicle. The effective concentration can be sufficient for ameliorating the symptoms of the disease, disorder or condition treated and can be empirically determined.
The pharmaceutical compositions can be provided for administration to humans and animals in unit dosage forms, such as tablets, capsules, pills, powders, granules, sterile parenteral solutions or suspensions, and oral solutions or suspensions, and oil-water emulsions containing suitable quantities of the compounds or pharmaceutically acceptable derivatives thereof. The pharmaceutically active compounds and derivatives thereof can be formulated and administered in unit-dosage forms or multiple-dosage forms. Unit-dose forms as used herein refer to physically discrete units suitable for human or non-human animal subjects and packaged individually as is known in the art. Each unit-dose can contain a predetermined quantity of the therapeutically active compound sufficient to produce the desired therapeutic effect, in association with the required pharmaceutical carrier, vehicle or diluent. Examples of unit-dose forms can include, without limitation, ampoules and syringes and individually packaged tablets or capsules. Unit-dose forms can be administered in fractions or multiples thereof. A multiple-dose form is a plurality of identical unit-dosage forms packaged in a single container to be administered in segregated unit-dose form. Examples of multiple-dose forms include vials, bottles of tablets or capsules or bottles of pints or gallons. Hence, a multiple-dose form is a multiple of unit-doses which are not segregated in packaging.
Liquid pharmaceutically administrable compositions can, for example, be prepared by dissolving, dispersing, or otherwise mixing an active compound as defined above and optional pharmaceutical adjuvants in a carrier, such as, for example, water, saline, aqueous dextrose, glycerol, glycols, ethanol, and the like, to thereby form a solution or suspension. If desired, the pharmaceutical composition to be administered can also contain minor amounts of nontoxic auxiliary substances such as wetting agents, emulsifying agents, solubilizing agents, pH buffering agents and the like, for example, acetate, sodium citrate, cyclodextrine derivatives, sorbitan monolaurate, triethanolamine sodium acetate, triethanolamine oleate, and other such agents.
Actual methods of preparing such dosage forms are known, or will be apparent, to those skilled in this art; for example, see Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., 15th Edition, 1975.
Dosage forms or compositions containing active ingredient in the range of 0.001% to 100% with the balance made up from non-toxic carrier may be prepared. Methods for preparation of these compositions are known to those skilled in the art. The contemplated compositions can contain 0.001%-100% active ingredient, e.g., 0.001%-90%, 0.1%-95%, 0.5%-95%, 0.5%-30%, 10%-25%, 10%-99%, 50%-85%, or 80%-100%.
H. Methods of use of the compounds and compositions
Provided herein are methods to treat or ameliorate symptoms or disorders associated with a cancer, including hematologic and solid tumor cancers. The methods include administering one or more of the compounds described herein, or a pharmaceutically acceptable salt or derivative thereof, or a composition comprising the same, to a mammal, e.g., a human, cat, dog, horse, pig, cow, sheep, mouse, rat, or monkey. In some cases, the compound is 5-O-β-D-glucopyranosyl-8-methoxypsoralen.
In some cases, a method for treating or ameliorating a disease such as cancer, vitiligo, psoriasis, or nicotine addiction can include administering to a mammal a compound having the chemical formula of Compound 3 set forth in Figure 88, or a pharmaceutically acceptable salt or derivative thereof.
A compound or a pharmaceutically acceptable salt or derivative thereof can be administered by methods appropriate for treating, ameliorating, or preventing a symptom or a disorder related to a disease (e.g., cancer, psoriasis, vitiligo, or nicotine addiction). Methods of administration can include, without limitation, oral, intravenous, and topical administration.
In practicing the methods, effective amounts of the compounds or composition provided herein are administered. Such amounts are sufficient to achieve a therapeutically effective concentration of the compound or active component of the composition in vivo.
V. EXAMPLES
Although the following characterizations were carried out in Arabidopsis, the methods described herein can be applied to other plant species that comprise recombinant nucleic acid constructs described herein.
A. Example 1: Production of Transgenic Arabidopsis Plants Containing Recombinant P450 Polypeptides
T-DNA binary vector constructs were made using standard molecular biology techniques. A set of constructs was made that contained a P450 coding sequence operably linked to a CaMV 35S promoter. Each of these constructs also contained a marker gene conferring resistance to the herbicide Finale®.
Each construct was introduced into Arabidopsis ecotype Wassilewskija (WS) by the floral dip method essentially as described in Bechtold, N. et al., C.R. Acad. Sci. Paris, 316:1194-1199 (1993). The presence of each P450 construct was verified by PCR. At least two independent events from each transformation were selected for further study; these events were referred to as Arabidopsis thaliana ME lines. The P450 polypeptide expressed in a given ME line is shown in Table 2 below. T1 seeds were germinated and allowed to self-pollinate. T2 seeds were collected and a portion were germinated, allowed to self-pollinate, and T3 seeds were collected. T2 and/or T3 seeds were used in the experiments described below.
B. Example 2: Modification of Compounds by P450s
Per experiment, 25 seeds from each ME line were used per microtiter well. These were mixes of all transformation events available for that particular ME line. Sterilized seeds were placed in 1/2x MS media at a final volume of 1.25 ml. Two 24-well microtiter plates were used per experiment, which included a wild type control and a no plant control. Seeds were incubated for one week at 22°C, long day, on a shaker at 140 rpm. 12.5 μl of a 100 mM chemical substrate from Table 3 was added to each well (1 mM final) and the seedlings incubated one additional day.
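The substrate dilution stated above (12.5 μl of a 100 mM stock in a 1.25 ml well, giving 1 mM final) follows from the relation C1·V1 = C2·V2. The following minimal sketch reproduces that arithmetic. The function name and the optional flag for counting the added stock volume are hypothetical and are included only for illustration; the text above treats 1.25 ml as the final volume.

```python
def final_concentration_mM(stock_mM: float, added_ul: float, well_ul: float,
                           count_added_volume: bool = False) -> float:
    """Final substrate concentration (mM) after adding stock to a well.

    stock_mM: concentration of the stock solution in mM
    added_ul: volume of stock added, in microliters
    well_ul:  volume of medium in the well, in microliters
    count_added_volume: if True, the added stock volume is included in the
        total volume (an assumption; the text uses the well volume alone)
    """
    if well_ul <= 0 or added_ul < 0:
        raise ValueError("volumes must be positive")
    total_ul = well_ul + added_ul if count_added_volume else well_ul
    return stock_mM * added_ul / total_ul


if __name__ == "__main__":
    # 12.5 ul of 100 mM stock into 1.25 ml (1250 ul) of 1/2x MS medium.
    print(final_concentration_mM(100.0, 12.5, 1250.0))
    # Same addition with the 12.5 ul included in the total volume.
    print(final_concentration_mM(100.0, 12.5, 1250.0, count_added_volume=True))
```

Counting the added volume changes the result by about 1%, which is why the nominal 1 mM figure is a reasonable approximation.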
TABLE 3: Candidate compounds
Figure imgf000065_0001
Figure imgf000066_0001
Figure imgf000067_0001
Each seedling set was transferred to a 1.7 ml Eppendorf tube and the MS media was saved. The seedlings were homogenized with a plastic fitted pestle or by bead beating. Then 1 ml methanol (MeOH) was added and the tube vortexed. The mixture was incubated at 37°C for 1 hr in a shaker at 300 rpm. The solids were pelleted by centrifugation at 14,000 rpm for 3 min. The supernatant was transferred to 13x75 borosilicate glass test tubes. The MS media from the appropriate well (saved from before) was added to the supernatant. 1 ml hexane was added to the glass tubes containing the MeOH/MS mixture. The tubes were mixed by vortexing and then permitted to equilibrate at room temperature for 15 minutes. The aqueous methanol and hexane phases were separated by centrifugation at 4000 rpm for 15 min. The upper hexane phase was discarded. The remaining aqueous methanol phase was dried by lyophilization overnight with no heat. The samples were resuspended in 150 μl methanol and sonicated in a sonicator bath. Particulates were removed by filtering through a 0.45 μm filter. The resulting samples were analyzed by LC/MS using standard procedures. Typically, the samples were analyzed using a 10 min water/acetonitrile gradient on a reverse phase column. Positive results from this primary (1°) screen are provided in Table 4.
TABLE 4: Modified Compounds
Figure imgf000068_0001
Since each transformation and integration event is unique, each seed line producing a positive result in the primary screen was subjected to a secondary screen, an "event confirmation screen." In the event confirmation screen, seeds from all available separate transformation events were separately assayed for the ability to modify the target compound. 25 seeds from a single transformation event were used per microtiter well in a final volume of 1.25 ml of 1/2x MS. The ME lines from the primary screen were analyzed along with a wild type control and a no plant control. Seeds were incubated for one week at 22°C, long day, on a shaker at 140 rpm. 12.5 μl of a 100 mM stock of the target substrate (identified as modified in Table 4, respectively) was added to each well (1 mM final) and the seedlings incubated one additional day.
Each seedling was homogenized and extracted as described above. The resulting samples were analyzed by LC/MS using standard procedures. Typically, the samples were analyzed using a 10 min water/acetonitrile gradient on a reverse phase column. Table 4 provides the results of the analysis.
C. Example 3: NMR analysis
NMR experiments were performed in MeOD and CDCl3, with the residual solvent signals serving as internal standards (δH 3.31 for MeOD and δH 7.26 for CDCl3). 1H NMR data was obtained at room temperature on a Bruker Avance 600 MHz spectrometer operating at 600.01 MHz.
D. Example 4: Identification of hydroxylated 8-methoxypsoralen
The hydroxylated product of 8-methoxypsoralen (8-MOP) was scaled up in yeast. 1 L AHC+Trp/Galactose medium was inoculated with WAT11 bearing CYP82C2 that had been grown to saturation in AHC+Trp/Dextrose. The cultures were incubated at 28°C with shaking for 1 day. 1 ml of 500 mM 8-MOP prepared in DMSO was added to the flask, for a final concentration of 500 μM. The culture was incubated for another 3 days. Yeast culture medium was transferred to a round bottom flask. The medium was removed by rotoevaporation, and the resulting residue was resuspended in 20 ml MeOH. The MeOH suspension was transferred to a 50 ml centrifuge tube and the sample was centrifuged for 1 hr, 1,800 x g. The supernatant was transferred to a round bottom flask and removed by rotoevaporation, yielding 6.5 g of crude dry extract. Solid phase extraction of this material on a C18 Extract-Clean column (Alltech) using a block gradient of H2O/MeOH resulted in 150 mg in the 25% H2O/MeOH fraction. This fraction was further separated using a C18 5 μm 250 x 10 mm column (Phenomenex) with a gradient solvent system of H2O + 0.1% formic acid (65%-0% in 28 min) in MeOH + 0.1% formic acid (flow rate 2.0 mL/min, detection at 300 nm) to yield three major peaks. LC/MS analysis confirmed that peak 2, eluting at 22.2 min, was the hydroxylated compound and peak 3, eluting at 22.5 min, was 8-MOP. Fractions for the hydroxylated 8-MOP peak were combined and dried as in Example 2. The hydroxylated 8-MOP was obtained as a white amorphous solid and subjected to NMR.
Five resonances were identified in the 1H NMR spectrum of the proposed hydroxylated product (four methine protons at δ 8.32, δ 7.76, δ 7.08, δ 6.20, and one methyl group at δ 4.05), thus verifying the addition of a hydroxyl group to 8-MOP. 2D-NMR revealed two 1H-1H COSY relationships between resonances at δ 7.76 and δ 7.08 as well as at δ 8.32 and δ 6.20, confirming that the hydroxyl group was added at C-5, resulting in 5-hydroxy-8-methoxypsoralen (Figure 88, compound 2).
E. Example 5: Identification of glycosylated 8-methoxypsoralen
Four microtiter plates of seedlings were grown and substrate added as per the standard plant in vivo assay. Seedlings from each 24 well microtiter plate were transferred into a 145 ml mortar, ground thoroughly with a pestle, and transferred into a 50 ml centrifuge tube, rinsing with MeOH for final volume of 35 ml. Tubes containing crushed seedlings were placed in a sonication bath (Branson 2510) for 30 min at room temperature, then chilled in ice water for 5 min; this cycle was repeated six times. Tubes were then shaken at ambient temperature overnight. The extraction mixture was vacuum filtered through a coarse sintered glass funnel and filter paper (Whatman); the filtrate was reduced to < 2 ml by rotoevaporation, resuspended with 15% acetonitrile, frozen, and lyophilized to dryness. The dry extract was dissolved in MeOH for semi-preparative HPLC (Waters 600), using the following chromatographic conditions: a C18 5 μm 150 x 10 mm column (Alltech Alltima), solvent A: H2O + 0.1% formic acid, solvent B: MeOH + 0.1% formic acid; flow rate of 1.5 ml/min; monitoring at 308 nm. Three major peaks eluted using a gradient of 20-80% B in 30 min. The presence of the glycosylated product in the first two peaks was verified by analytical LC/MS. Fractions for peak 1 were pooled and reduced to < 5 ml by rotoevaporation, likewise for peak 2. The concentrated peaks were mixed with 15% acetonitrile, frozen, and lyophilized to dryness. Peak 2 was redissolved in MeOH and further separated using a gradient of 40-50% B in 35 min, yielding the purified glycosylated product eluting at 20.7 min. The glycosylated 8-MOP product was obtained as a white amorphous solid and subjected to NMR.
The 1H NMR spectrum of this product revealed four methine peaks (δ 8.54, δ 7.80, δ 7.30 and δ 6.33) for the psoralen moiety, seven methine protons (δ 4.50, δ 3.85, δ 3.73, δ 3.58, δ 3.46, δ 3.44, δ 3.28) for glucose and one methoxy group at δ 4.19. Compared with the 1H NMR spectrum of 5-hydroxy-8-methoxypsoralen, two methine resonances at δ 7.80 and δ 7.30 in the furan ring and at δ 8.54 and δ 6.33 in the pyran lactone ring remained; therefore, the O-glucose moiety was present at C-5. The large coupling constant for the anomeric proton at δ 4.50 (J = 8.0 Hz, H-1"), together with the resonances at δ 3.85 (J = 12 Hz, 2.1 Hz, H-6"), δ 3.73 (J = 12 Hz, 4.0 Hz, H-6") and δ 3.58 (J = 8.0 Hz, 7.0 Hz, H-2"), confirmed that the glycosylated product was 5-O-β-D-glucopyranosyl-8-methoxypsoralen (Figure 88, compound 3).
F. Example 6: Identification of functional homologs
A subject sequence is considered to be a functional homolog of a query sequence if the subject and query sequences encode proteins having a similar function and/or activity. A process known as Reciprocal BLAST was used to identify homologous and variant sequences by looking for top hits in bidirectional BLAST searches. Before starting a reciprocal BLAST process, a query polypeptide and polypeptides having a sequence percent identity of 80% or greater to the query polypeptide were designated as a cluster.
A query polypeptide sequence, "polypeptide A," from species SA was then BLASTed against all protein sequences from species SB in public and proprietary databases, and the top hits were determined using an E-value cutoff of 10⁻⁵ and an identity cutoff of 35%. The process was repeated using polypeptide A as a query sequence in BLASTs against all plant species.
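The first-round filtering can be illustrated with a minimal sketch. It is not the program used in this example: the `Hit` record, the function name `top_hits_by_species`, the choice of one top hit per species ranked by lowest E-value, and the inclusive treatment of both cutoffs are all assumptions made for illustration.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1e-5    # E-value cutoff stated in the text
IDENTITY_CUTOFF = 35.0   # percent identity cutoff stated in the text


@dataclass(frozen=True)
class Hit:
    """One BLAST hit (hypothetical record layout)."""
    query: str
    subject: str
    species: str
    evalue: float
    identity: float  # percent identity


def top_hits_by_species(hits):
    """Keep hits passing both cutoffs and return the best one per species.

    Assumption: "best" means lowest E-value, ties broken by higher identity.
    """
    best = {}
    for hit in hits:
        if hit.evalue > E_VALUE_CUTOFF or hit.identity < IDENTITY_CUTOFF:
            continue  # fails a cutoff
        current = best.get(hit.species)
        if current is None or (hit.evalue, -hit.identity) < (current.evalue, -current.identity):
            best[hit.species] = hit
    return best


example_hits = [
    Hit("A", "b1", "SB", 1e-30, 60.0),
    Hit("A", "b2", "SB", 1e-50, 55.0),
    Hit("A", "c1", "SC", 1e-3, 90.0),   # rejected: E-value too high
    Hit("A", "d1", "SD", 1e-20, 30.0),  # rejected: identity too low
]
example_top = top_hits_by_species(example_hits)
```

With the made-up hits above, only species SB retains a top hit, and it is `b2` because its E-value is lower than that of `b1`.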
In a second round, the top hits from all species other than SA were BLASTed against all protein sequences from species SA in the same databases. Any top hit from the first round of the BLAST process that returned polypeptide A as its best hit in the second round was identified as a potential functional homolog. Any top hit from the first round that returned a polypeptide from the cluster as its best hit was also considered a functional homolog. Any top hit that did not return polypeptide A or a polypeptide from the cluster as its best hit was not considered a functional homolog. To generate a consensus sequence, the query peptide sequence and the sequences of its functional homologs were aligned using the BLASTP program with option "-m 1," and the conserved amino acids based on the alignment were extracted using a proprietary program.
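The second-round acceptance rule can be sketched as follows. This is an illustrative reading of the rule, not the actual pipeline: the function name, the input shapes (a list of first-round top-hit identifiers and a mapping from each of them to its best hit in species SA), and the example identifiers are hypothetical. The sketch also treats hits that return polypeptide A and hits that return a cluster member identically.

```python
def reciprocal_homologs(first_round_top_hits, second_round_best_hit, query_id, cluster_ids):
    """Return first-round top hits whose best hit back in species SA is the
    query polypeptide or a member of its cluster (>= 80% identity to the query)."""
    accepted = {query_id} | set(cluster_ids)
    return [
        subject
        for subject in first_round_top_hits
        # A subject with no recorded second-round hit is rejected.
        if second_round_best_hit.get(subject) in accepted
    ]


# Made-up identifiers for illustration only.
first_round = ["b2", "c7", "d4", "e9"]
best_back_in_sa = {"b2": "A", "c7": "A_cluster_member", "d4": "unrelated"}
homologs = reciprocal_homologs(first_round, best_back_in_sa, "A", ["A_cluster_member"])
```

Here `b2` and `c7` are kept, `d4` is rejected because its best hit in SA is neither the query nor a cluster member, and `e9` is rejected because it has no second-round hit.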
Manual inspection of potential functional homologs was carried out to remove truncated sequences, redundant sequences, and defective sequences. Redundant sequences include transcriptional or RNA processing variants, such as clones having different UTR lengths or having one or more unspliced introns and/or deleted exons. Defective clones include 5', 3' and/or internally truncated, frame-shifted (e.g. by insertion or deletion), or chimeric clones. Alignment Tables (FIGS. 4-87) were then prepared with the remaining functional homolog sequences. The query sequence is identified in each Table as a "Lead-cDNA." Boxed residues represent identical or conserved amino acids.
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method of screening for a substrate of a P450, which method comprises: a) contacting a candidate compound, wherein the candidate compound has pharmacological activity, with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a sequence encoding a P450 polypeptide having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533; and b) determining whether the candidate compound is modified after the contact, whereby detection of a modification indicates that the candidate compound is a substrate.
2. The method of claim 1, wherein the sequence encoding a P450 polypeptide has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-272.
3. The method of claim 1, wherein the sequence encoding a P450 polypeptide has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-144.
4. The method of claim 1, wherein the P450 coding sequence is a heterologous sequence.
5. The method of claim 1, wherein the cells are part of a whole plant.
6. The method of claim 5, wherein the cells are root cells, leaf cells or stem cells.
7. The method of claim 5, wherein the plant is a dicotyledonous plant.
8. The method of claim 7, wherein the plant is Arabidopsis thaliana.
9. The method of claim 5, wherein the plant is a monocotyledonous plant.
10. The method of claim 9, wherein the plant is Lemna minor.
11. The method of claim 1, wherein the plant cells are algae.
12. The method of claim 1, wherein the plant cells are grown in suspension culture or tissue culture.
13. The method of claim 1, wherein the pharmacological activity of the candidate compound is Alzheimer's disease treatment, analgesic activity, anesthetic activity, anti-Addison's disease activity, anti-HIV activity, anti-infective activity, anti-inflammatory activity, antianginal activity, antiangiogenic activity, antianxiety activity, antiarrhythmic activity, antiarthritic activity, antiatherosclerotic activity, antibacterial activity, antibiotic activity, anticancer activity, anticholesterol activity, anticholinergic activity, anticoagulant activity, anticonvulsant activity, antidepressant activity, antidiabetic activity, antidiuretic activity, antiedemic activity, antifungal activity, antigout activity, antiglaucoma activity, antihemorrhagic activity, antihistamine activity, antihypertensive activity, antimalarial activity, antimicrobial activity, antimigraine activity, antimotion sickness activity, antineoplastic activity, antineuralgic activity, antiobesity activity, antioxidant activity, antiparasitic activity, antiparkinsonian activity, antipsoriasis activity, antipsychotic activity, antipyretic activity, antirheumatic activity, antiseizure activity, antithrombotic activity, antitussive activity, antiulcer activity, antiviral activity, antivitiligo activity, anxiolytic activity, appetite suppressant activity, asthma treatment, bronchodilator activity, cardiac depressant activity, cardiotonic activity, cerebral ischemia treatment, CNS depressant or stimulant activity, cognition enhancing activity, contraceptive activity, dermatitis treatment, diuretic activity, emetic activity, dopamine agonist activity, expectorant activity, gastrointestinal treatment, hepatoprotective activity, immunostimulant activity, immunosuppressant activity, antiimpotence activity, irritable bowel syndrome treatment, ischemia treatment, activity in metabolic and enzyme disorders, multiple sclerosis treatment, muscle relaxant activity, neuromuscular blocker activity, neuroprotective activity, opioid activity, osteoporosis treatment, purgative activity, radioprotective activity, respiratory stimulant activity, restenosis treatment, rheumatoid arthritis treatment, schizophrenia treatment, sedative/hypnotic activity, sepsis treatment, smoking deterrent activity, stroke treatment, thrombocytopenia treatment, thrombolytic therapy activity, vaccine activity, or vasodilator activity.
14. The method of claim 1, wherein the candidate compound is selected from the group consisting of aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemesinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron, dextromethorphan, digitoxin, digoxin, doxepin hydrochloride, emetine dihydrochloride hydrate, emodin, enalapril, eserine, esomeprazole potassium, estradiol, fentanyl citrate, formoterol, furosemide, gabapentin, glimerpiride, D-(+) glucosamine hydrochloride, glycyrrhizin, gossypol, griseofulvin, hesperidin, homatropine hydrobromide, (-)-hydrastine, hydrocortisone, ibuprofen, idazoxan hydrochloride, ipratropium bromide, ivermectin, ketoconazole, ketoprofen, lanatoside C, lansaprazole, lapachol, levofloxacin hydrochloride, lidocaine, (-)-lobeline hydrochloride, lomerizine hydrochloride, mefenamic acid, 8-methoxypsoralen, miconazole, mitoxantrone hydrochloride, morphine, mycophenolic acid, nocodazole, nordihydroguaiaretic acid, (S,R)-noscapine, oleanolic acid, omeprazole, pantaprazole, phentolamine mesylate, picrotoxin, pilocarpine hydrochloride, D-pinitol, piperine, piperlongumine, pirenzepine dihydrochloride, podophyllotoxin, prasterone, pravastatin sodium salt, prednisolone, protoveratrine A, pyridostigmine bromide, quercetin dihydrate, quinidine (cinchonidine), quinine, rebamipide, rescinnamine, reserpine, resveratrol, retinoic acid, risperidone, rofecoxib, rotenone, rutin trihydrate, salicin, salicylic acid, santonin, (-)-scopolamine hydrobromide, sertraline hydrochloride, silybin, simvastatin, (-)-sparteine, streptozocin, strophanthidin, tetracycline, (+)-tetrandrine, thebaine, theobromine, theophylline, thymol, tobramycin, triamcinolone acetonide, tubocurarine chloride, ursolic acid, vincamine, warfarin pestanal, or yohimbine hydrochloride.
15. The method of claim 1, wherein the determining step comprises mass spectrometry analysis.
16. The method of claim 1, wherein the determining step comprises NMR analysis.
17. The method of claim 1, wherein a plurality of candidate compounds are contacted with the plant cells.
18. A method of screening a collection of polypeptides for a P450 capable of modifying a compound having pharmacological activity, wherein the polypeptide has 80% or greater sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533, which method comprises: a) contacting the compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450 of the collection; b) determining whether the compound is modified after the contact, whereby detection of a modification indicates that the P450 is capable of modifying the compound; c) repeating steps a) and b) for each P450 of the collection, whereby at least one P450 of the collection is identified as capable of modifying the compound.
19. The method of claim 18, wherein the polypeptide has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-272.
20. The method of claim 19, wherein the polypeptide has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-144.
21. A method of modifying a candidate compound, which method comprises contacting a candidate compound with plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450, whereby the candidate compound is modified, and wherein the unmodified candidate compound has pharmacological activity.
22. The method of claim 21, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
23. The method of claim 22, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-272.
24. The method of claim 23, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-144.
25. The method of claim 21, wherein the candidate compound is selected from the group consisting of curcumin, griseofulvin, methoxypsoralen, piperine, and warfarin.
26. The method of claim 21, wherein the P450 coding sequence is a heterologous sequence.
27. The method of claim 21, wherein the cells are part of a whole plant.
28. The method of claim 27, wherein the cells are root cells, leaf cells or stem cells.
29. The method of claim 27, wherein the plant is a dicotyledonous plant.
30. The method of claim 29, wherein the plant is Arabidopsis thaliana.
31. The method of claim 27, wherein the plant is a monocotyledonous plant.
32. The method of claim 31, wherein the plant is Lemna minor.
33. The method of claim 21, wherein the plant cells are algae.
34. The method of claim 21, wherein the plant cells are grown in suspension culture or tissue culture.
35. A method of making a modified candidate compound, comprising
(a) contacting a candidate compound with transgenic plant cells containing a recombinant nucleic acid construct comprising a regulatory region operably linked to a coding sequence for a P450, whereby the P450 modifies the candidate compound, and wherein the candidate compound has pharmacological activity; and
(b) recovering the modified candidate compound from the plant cells.
36. The method of claim 35, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
37. The method of claim 36, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-272.
38. The method of claim 37, wherein the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-144.
39. The method of claim 35, wherein the candidate compound is selected from the group consisting of curcumin, griseofulvin, methoxypsoralen, piperine, and warfarin.
40. Transgenic plant cells comprising a recombinant nucleic acid construct that comprises a regulatory region operably linked to a coding sequence for a P450 having 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-533.
41. The transgenic plant cells of claim 40, wherein the coding sequence for the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-272.
42. The transgenic plant cells of claim 40, wherein the coding sequence for the P450 has 80% or greater identity to a sequence selected from the group consisting of SEQ ID NOs: 137-144.
43. The transgenic plant cells of claim 40, wherein the P450 coding sequence is a heterologous sequence.
44. The transgenic plant cells of claim 40, wherein the cells are part of a whole plant.
45. The transgenic plant cells of claim 44, wherein the cells are root cells, leaf cells or stem cells.
46. The transgenic plant cells of claim 44, wherein the plant is a dicotyledonous plant.
47. The transgenic plant cells of claim 46, wherein the plant is Arabidopsis thaliana.
48. The transgenic plant cells of claim 44, wherein the plant is a monocotyledonous plant.
49. The transgenic plant cells of claim 48, wherein the plant is Lemna minor.
50. The transgenic plant cells of claim 40, wherein the plant cells are algae.
51. The transgenic plant cells of claim 40, wherein the plant cells are grown in suspension culture or tissue culture.
52. The transgenic plant cells of claim 40, wherein the regulatory region comprises a constitutive promoter, an inducible promoter, or a tissue-specific promoter.
53. The transgenic plant cells of claim 40, wherein the cells are effective for modifying a candidate compound selected from the group consisting of aesculetin, ajmalicine, ajmaline, aloin, amikacin, amlodipine besylate, amoxicillin, amphotericin B, andrographolide, apomorphine hydrochloride, arecoline hydrobromide, artemesinin, atorvastatin calcium, atropine, azithromycin, berberine chloride, bergenin monohydrate, betamethasone, betulinic acid, bixin, brucine, budesonide, bupropion hydrochloride, butamben, caffeine, camptothecin, capsaicin, celecoxib, ciprofloxacin, clarithromycin, clopidogrel sulfate, codeine, colchicine, convallatoxin, curcumin, cyclobenzaprine hydrochloride, danthron, dextromethorphan, digitoxin, digoxin, doxepin hydrochloride, emetine dihydrochloride hydrate, emodin, enalapril, eserine, esomeprazole potassium, estradiol, fentanyl citrate, formoterol, furosemide, gabapentin, glimerpiride, D-(+) glucosamine hydrochloride, glycyrrhizin, gossypol, griseofulvin, hesperidin, homatropine hydrobromide, (-)-hydrastine, hydrocortisone, ibuprofen, idazoxan hydrochloride, ipratropium bromide, ivermectin, ketoconazole, ketoprofen, lanatoside C, lansaprazole, lapachol, levofloxacin hydrochloride, lidocaine, (-)-lobeline hydrochloride, lomerizine hydrochloride, mefenamic acid, 8-methoxypsoralen, miconazole, mitoxantrone hydrochloride, morphine, mycophenolic acid, nocodazole, nordihydroguaiaretic acid, (S,R)-noscapine, oleanolic acid, omeprazole, pantaprazole, phentolamine mesylate, picrotoxin, pilocarpine hydrochloride, D-pinitol, piperine, piperlongumine, pirenzepine dihydrochloride, podophyllotoxin, prasterone, pravastatin sodium salt, prednisolone, protoveratrine A, pyridostigmine bromide, quercetin dihydrate, quinidine (cinchonidine), quinine, rebamipide, rescinnamine, reserpine, resveratrol, retinoic acid, risperidone, rofecoxib, rotenone, rutin trihydrate, salicin, salicylic acid, santonin, (-)-scopolamine hydrobromide, sertraline hydrochloride, silybin, simvastatin, (-)-sparteine, streptozocin, strophanthidin, tetracycline, (+)-tetrandrine, thebaine, theobromine, theophylline, thymol, tobramycin, triamcinolone acetonide, tubocurarine chloride, ursolic acid, vincamine, warfarin pestanal, or yohimbine hydrochloride, when the cells are contacted with the candidate compound.
54. The transgenic plant cells of claim 53, wherein the cells are effective for modifying a candidate compound selected from the group consisting of curcumin, griseofulvin, methoxypsoralen, piperine, and warfarin.
55. A transgenic plant comprising a cell of any one of claims 40 to 54.
56. A composition of matter comprising 5-O-β-D-glucopyranosyl-8-methoxypsoralen.
57. A pharmaceutical composition comprising the composition of claim 56 in combination with a pharmaceutically acceptable carrier or excipient.
58. A method for treating, ameliorating, or preventing a symptom or disorder associated with psoriasis, vitiligo, cancer, or nicotine addiction, comprising administering the pharmaceutical composition of claim 57 to a mammal.
PCT/US2006/019001 2005-06-17 2006-05-17 P450 substrates and methods related thereto WO2006138012A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US69195605P 2005-06-17 2005-06-17
US60/691,956 2005-06-17

Publications (1)

Publication Number Publication Date
WO2006138012A1 (en) 2006-12-28

Family

ID=37570754

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/019001 WO2006138012A1 (en) 2005-06-17 2006-05-17 P450 substrates and methods related thereto

Country Status (1)

Country Link
WO (1) WO2006138012A1 (en)

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1947925A1 (en) * 2005-10-20 2008-07-30 Commonwealth Scientific And Industrial Research Organisation Cereals with altered dormancy
EP2154946A2 (en) * 2007-04-09 2010-02-24 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
US7910800B2 (en) 2005-08-15 2011-03-22 Evogene Ltd. Methods of increasing abiotic stress tolerance and/or biomass in plants and plants generated thereby
WO2012010872A3 (en) * 2010-07-22 2012-03-29 Glaxosmithkline Australia Pty Limited Plant cytochrome p450
US8847008B2 (en) 2008-05-22 2014-09-30 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant utility
US8921658B2 (en) 2008-10-30 2014-12-30 Evogene Ltd. Isolated polynucleotides encoding a MAP65 polypeptide and methods of using same for increasing plant yield
US8937220B2 (en) 2009-03-02 2015-01-20 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing plant yield, biomass, vigor and/or growth rate of a plant
US8962915B2 (en) 2004-06-14 2015-02-24 Evogene Ltd. Isolated polypeptides, polynucleotides encoding same, transgenic plants expressing same and methods of using same
US9012728B2 (en) 2004-06-14 2015-04-21 Evogene Ltd. Polynucleotides and polypeptides involved in plant fiber development and methods of using same
US9018445B2 (en) 2008-08-18 2015-04-28 Evogene Ltd. Use of CAD genes to increase nitrogen use efficiency and low nitrogen tolerance to a plant
CN104758278A (en) * 2015-03-24 2015-07-08 宁夏医科大学 Uses of aloin as medicines for treating cerebral arterial thrombosis
US9096865B2 (en) 2009-06-10 2015-08-04 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency, yield, growth rate, vigor, biomass, oil content, and/or abiotic stress tolerance
US9303269B2 (en) 2003-05-22 2016-04-05 Evogene Ltd. Methods of increasing abiotic stress tolerance and/or biomass in plants
US9328353B2 (en) 2010-04-28 2016-05-03 Evogene Ltd. Isolated polynucleotides and polypeptides for increasing plant yield and/or agricultural characteristics
US20160130602A1 (en) * 2013-06-03 2016-05-12 Vib Vzw Means and methods for yield performance in plants
US9447444B2 (en) 2012-03-13 2016-09-20 Sun Pharmaceutical Industries (Australia) Pty Ltd Biosynthesis of opiate alkaloids
US9458481B2 (en) 2010-06-22 2016-10-04 Sun Pharmaceutical Industries (Australia) Pty Ltd Methyltransferase nucleic acids and polypeptides
AU2015200581B2 (en) * 2010-07-22 2016-10-20 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome p450
CN106102451A (en) * 2013-11-28 2016-11-09 株式会社Woojungbsc Utilize preparation method and the plant thereof of the conversion plant of 20 hydroxyecdysone content increases of the CYP85 gene being derived from spinach
US9493785B2 (en) 2009-12-28 2016-11-15 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant yield, biomass, growth rate, vigor, oil content, abiotic stress tolerance of plants and nitrogen use efficiency
US9518267B2 (en) 2007-07-24 2016-12-13 Evogene Ltd. Polynucleotides, polypeptides encoded thereby, and methods of using same for increasing abiotic stress tolerance and/or biomass and/or yield in plants expressing same
US9551006B2 (en) 2010-12-22 2017-01-24 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for improving plant properties
US9631000B2 (en) 2006-12-20 2017-04-25 Evogene Ltd. Polynucleotides and polypeptides involved in plant fiber development and methods of using same
AU2016208335B2 (en) * 2010-07-22 2017-05-25 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome p450
US9670501B2 (en) 2007-12-27 2017-06-06 Evogene Ltd. Isolated polypeptides, polynucleotides useful for modifying water user efficiency, fertilizer use efficiency, biotic/abiotic stress tolerance, yield and biomass in plants
US10457954B2 (en) 2010-08-30 2019-10-29 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency, yield, growth rate, vigor, biomass, oil content, and/or abiotic stress tolerance
US10760088B2 (en) 2011-05-03 2020-09-01 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant yield, biomass, growth rate, vigor, oil content, abiotic stress tolerance of plants and nitrogen use efficiency

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020160950A1 (en) * 2000-07-14 2002-10-31 Lal Preeti G. Cytochrome P450 variant

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE PROTEIN [online] 14 February 2004 (2004-02-14), KANEKO T. ET AL.: "Structural analysis of Arabidopsis thaliana chromosome 3. II. Sequence features of the 4,251,695 bp regions covered by 90 P1, TAC and BAC clones", XP003005079, accession no. NCBI Database accession no. (BAB02270) *
DATABASE PROTEIN [online] 16 April 2005 (2005-04-16), BEVAN M., XP003005082, accession no. NCBI Database accession no. (CAA16592) *
DATABASE PROTEIN [online] 25 January 2005 (2005-01-25), XP003005080, accession no. NCBI Database accession no. (NP_565617) *
DATABASE PROTEIN [online] 30 January 2001 (2001-01-30), LIN X. ET AL.: "Arabidopsis thaliana chromosome 1 BAC F10D13 genomic sequence", XP003005081, accession no. NCBI Database accession no. (AAG60111) *
DNA RES., vol. 7, no. 3, 2000, pages 217 - 221 *

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9303269B2 (en) 2003-05-22 2016-04-05 Evogene Ltd. Methods of increasing abiotic stress tolerance and/or biomass in plants
US9012728B2 (en) 2004-06-14 2015-04-21 Evogene Ltd. Polynucleotides and polypeptides involved in plant fiber development and methods of using same
US8962915B2 (en) 2004-06-14 2015-02-24 Evogene Ltd. Isolated polypeptides, polynucleotides encoding same, transgenic plants expressing same and methods of using same
US7910800B2 (en) 2005-08-15 2011-03-22 Evogene Ltd. Methods of increasing abiotic stress tolerance and/or biomass in plants and plants generated thereby
US9487796B2 (en) 2005-08-15 2016-11-08 Evogene Ltd. Methods of increasing abiotic stress tolerance and/or biomass in plants and plants generated thereby
AU2006303820B2 (en) * 2005-10-20 2013-06-20 Commonwealth Scientific And Industrial Research Organisation Cereals with altered dormancy
EP1947925A4 (en) * 2005-10-20 2009-09-02 Commw Scient Ind Res Org Cereals with altered dormancy
US8269082B2 (en) 2005-10-20 2012-09-18 Commonwealth Scientific And Industrial Research Organisation Cereals with altered dormancy
EP1947925A1 (en) * 2005-10-20 2008-07-30 Commonwealth Scientific And Industrial Research Organisation Cereals with altered dormancy
US9631000B2 (en) 2006-12-20 2017-04-25 Evogene Ltd. Polynucleotides and polypeptides involved in plant fiber development and methods of using same
CN101854798B (en) * 2007-04-09 2013-04-03 伊沃基因有限公司 Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
US9487793B2 (en) 2007-04-09 2016-11-08 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
AU2008236316B2 (en) * 2007-04-09 2013-05-02 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
US10036031B2 (en) 2007-04-09 2018-07-31 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
WO2008122980A3 (en) * 2007-04-09 2011-12-29 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
EP2154946A4 (en) * 2007-04-09 2010-09-08 Evogene Ltd Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
EP2154946A2 (en) * 2007-04-09 2010-02-24 Evogene Ltd. Polynucleotides, polypeptides and methods for increasing oil content, growth rate and biomass of plants
US9518267B2 (en) 2007-07-24 2016-12-13 Evogene Ltd. Polynucleotides, polypeptides encoded thereby, and methods of using same for increasing abiotic stress tolerance and/or biomass and/or yield in plants expressing same
US9670501B2 (en) 2007-12-27 2017-06-06 Evogene Ltd. Isolated polypeptides, polynucleotides useful for modifying water user efficiency, fertilizer use efficiency, biotic/abiotic stress tolerance, yield and biomass in plants
US8847008B2 (en) 2008-05-22 2014-09-30 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant utility
US9018445B2 (en) 2008-08-18 2015-04-28 Evogene Ltd. Use of CAD genes to increase nitrogen use efficiency and low nitrogen tolerance to a plant
US8921658B2 (en) 2008-10-30 2014-12-30 Evogene Ltd. Isolated polynucleotides encoding a MAP65 polypeptide and methods of using same for increasing plant yield
US8937220B2 (en) 2009-03-02 2015-01-20 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing plant yield, biomass, vigor and/or growth rate of a plant
US9096865B2 (en) 2009-06-10 2015-08-04 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency, yield, growth rate, vigor, biomass, oil content, and/or abiotic stress tolerance
US9493785B2 (en) 2009-12-28 2016-11-15 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant yield, biomass, growth rate, vigor, oil content, abiotic stress tolerance of plants and nitrogen use efficiency
US9328353B2 (en) 2010-04-28 2016-05-03 Evogene Ltd. Isolated polynucleotides and polypeptides for increasing plant yield and/or agricultural characteristics
US9458481B2 (en) 2010-06-22 2016-10-04 Sun Pharmaceutical Industries (Australia) Pty Ltd Methyltransferase nucleic acids and polypeptides
AU2016208335B2 (en) * 2010-07-22 2017-05-25 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome p450
US9725732B2 (en) 2010-07-22 2017-08-08 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome P450
US11312973B2 (en) 2010-07-22 2022-04-26 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome P450
US10844391B2 (en) 2010-07-22 2020-11-24 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome P450
US9200261B2 (en) 2010-07-22 2015-12-01 Glaxosmithkline Australia Pty Limited Plant cytochrome P450
AU2015200581B2 (en) * 2010-07-22 2016-10-20 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome p450
EP3121193A3 (en) * 2010-07-22 2017-04-12 Sun Pharmaceutical Industries (Australia) Pty Limited Plant cytochrome p450
US10385354B2 (en) 2010-07-22 2019-08-20 Sun Pharmaceutical Industries (Australia) Pty Ltd Papaver somniferum cytochrome P450
AU2011281312B2 (en) * 2010-07-22 2014-12-18 Sun Pharmaceutical Industries (Australia) Pty Ltd Plant cytochrome P450
US20130133105A1 (en) * 2010-07-22 2013-05-23 Glaxosmithkline Australia Pty Limited Plant cytochrome p450
WO2012010872A3 (en) * 2010-07-22 2012-03-29 Glaxosmithkline Australia Pty Limited Plant cytochrome p450
US10457954B2 (en) 2010-08-30 2019-10-29 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for increasing nitrogen use efficiency, yield, growth rate, vigor, biomass, oil content, and/or abiotic stress tolerance
US9551006B2 (en) 2010-12-22 2017-01-24 Evogene Ltd. Isolated polynucleotides and polypeptides, and methods of using same for improving plant properties
US10760088B2 (en) 2011-05-03 2020-09-01 Evogene Ltd. Isolated polynucleotides and polypeptides and methods of using same for increasing plant yield, biomass, growth rate, vigor, oil content, abiotic stress tolerance of plants and nitrogen use efficiency
US9862979B2 (en) 2012-03-13 2018-01-09 Sun Pharmaceutical Industries (Australia) Pty Ltd Biosynthesis of opiate alkaloids
US9447444B2 (en) 2012-03-13 2016-09-20 Sun Pharmaceutical Industries (Australia) Pty Ltd Biosynthesis of opiate alkaloids
US20160130602A1 (en) * 2013-06-03 2016-05-12 Vib Vzw Means and methods for yield performance in plants
US10801032B2 (en) * 2013-06-03 2020-10-13 Vib Vzw Means and methods for yield performance in plants
EP3075238A4 (en) * 2013-11-28 2017-06-14 Woojungbsc Inc Method for preparing transgenic plant with increased 20-hydroxyecdysone content by using Spinacia oleracea-derived CYP85 gene, and plant prepared thereby
CN106102451A (en) * 2013-11-28 2016-11-09 株式会社Woojungbsc Method for preparing transgenic plant with increased 20-hydroxyecdysone content by using spinach-derived CYP85 gene, and plant prepared thereby
CN104758278A (en) * 2015-03-24 2015-07-08 宁夏医科大学 Uses of aloin as medicines for treating cerebral arterial thrombosis

Similar Documents

Publication Publication Date Title
WO2006138012A1 (en) P450 substrates and methods related thereto
US7312376B2 (en) Regulatory regions from Papaveraceae
US7795503B2 (en) Modulating plant alkaloids
US20060236421A1 (en) Secondary metabolite production via manipulation of genome methylation
US7897839B2 (en) Methods for modifying plant characteristics
Zhang et al. Revealing evolution of tropane alkaloid biosynthesis by analyzing two genomes in the Solanaceae family
US7750210B2 (en) Compositions with increased phytosterol levels obtained from plants with decreased triterpene saponin levels
Facchini Regulation of alkaloid biosynthesis in plants
Huang et al. WEAK SEED DORMANCY 1, an aminotransferase protein, regulates seed dormancy in rice through the GA and ABA pathways
Caboche et al. Comparison of the growth promoting activities and toxicities of various auxin analogs on cells derived from wild type and a nonrooting mutant of tobacco
US20090178160A1 (en) Modulation of Triterpenoid Content in Plants
Chen et al. The ERF transcription factor LTF1 activates DIR1 to control stereoselective synthesis of antiviral lignans and stress defense in Isatis indigotica roots
WO2008073617A2 (en) Increasing tolerance of plants to low light conditions
US7601892B2 (en) Nucleotide sequences and polypeptides encoded thereby useful for modifying plant characteristics
EP3615668B1 (en) Alfalfa with reduced lignin composition
US7982096B2 (en) Root specific promoters
Mmereke et al. Preference of Agrobacterium rhizogenes Mediated Transformation of Angiosperms
CN107828803A (en) 3,6-Dichlorosalicylic acid 5-hydroxylase gene dsmABC and its application
Hooper Re-engineering the tropane alkaloid biosynthesis pathway in potato
Hao et al. Functional Characterization of CYP80A and CYP80G Involved in the Biosynthesis of Benzylisoquinoline Alkaloids in the Sacred Lotus (Nelumbo nucifera)
CN114480322A (en) Oat glycosyltransferase AsUGT73E5 and application thereof in synthesis of steroid saponin
CN114480323A (en) Oat glycosyltransferase AsUGT73E1 and application thereof in synthesis of steroid saponin
Κροκιδά Genome organization, functional analysis of biosynthetic genes and metabolic diversity of triterpenes in legumes
Krokida Genome organization, functional analysis of biosynthetic genes and metabolic diversity of triterpenes in legumes
Subramanian short blue root (sbr), an Arabidopsis mutant that ectopically over-expresses an ABA- and auxin-inducible transgene Dc3-GUS and has defects in the cell wall

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 06759975; Country of ref document: EP; Kind code of ref document: A1)