WO2007112377A9 - Three-dimensional structure of hdhd4 complexed with magnesium and a phosphate mimetic - Google Patents

Three-dimensional structure of hdhd4 complexed with magnesium and a phosphate mimetic

Info

Publication number
WO2007112377A9
Authority
WO
WIPO (PCT)
Prior art keywords
hdhd4
polypeptide
seq
crystalline form
phosphate
Prior art date
Application number
PCT/US2007/064983
Other languages
French (fr)
Other versions
WO2007112377A3 (en)
WO2007112377A2 (en)
Inventor
Patricia A Mcdonnell
Keith L Constantine
Herbert E Klei
Stephen R Johnson
Valentina Goldfarb
Kevin Kish
Soong-Hoon Kim
Original Assignee
Bristol Myers Squibb Co
Patricia A Mcdonnell
Keith L Constantine
Herbert E Klei
Stephen R Johnson
Valentina Goldfarb
Kevin Kish
Soong-Hoon Kim
Priority date
Filing date
Publication date
Application filed by Bristol Myers Squibb Co, Patricia A Mcdonnell, Keith L Constantine, Herbert E Klei, Stephen R Johnson, Valentina Goldfarb, Kevin Kish, Soong-Hoon Kim
Publication of WO2007112377A2
Publication of WO2007112377A9
Publication of WO2007112377A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/34 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/573 Immunoassay; Biospecific binding assay; Materials therefor for enzymes or isoenzymes
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2299/00 Coordinates from 3D structures of peptides, e.g. proteins or enzymes

Definitions

  • the present invention relates generally to the three-dimensional structure of haloacid dehalogenase-like hydrolase domain containing protein 4 (HDHD4), and more particularly to HDHD4 in complex with magnesium and a phosphate mimetic, such as vanadate. Additionally, the present invention relates to methods of designing and/or identifying modulators and/or ligands of HDHD4. Methods of modulating HDHD4 activity, methods of designing HDHD4 mutants, mutant HDHD4 polypeptides or portions of mutant HDHD4 polypeptides, and models of HDHD4 also form aspects of the present invention.
  • HDHD4: haloacid dehalogenase-like hydrolase domain containing protein 4
  • the present invention further relates to machine-readable data storage media comprising structural coordinates of HDHD4 in complex with magnesium and a phosphate mimetic, vanadate for example, and optionally in further complex with a ligand, and computer systems capable of producing three-dimensional representations of all or any part of a structure of HDHD4 in complex with magnesium and a phosphate mimetic, such as vanadate.
  • the present invention relates to the three-dimensional structure of HDHD4 in complex with magnesium and a phosphate mimetic, as determined by X-ray crystallography methods.
  • phosphate mimetics include but are not limited to vanadate, phosphate, tungstate, sulfate and aluminum trifluoride (Madhusudan et al. (2002) Nature Struct. Biol. 9:273-277).
  • HDHD4 is an intracellular protein having a molecular weight of about 31,000 Da that is a member of the haloacid dehalogenase (HAD) superfamily of enzymes (Allen and Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503).
  • HAD: haloacid dehalogenase
  • the HAD superfamily is a large family of enzymes that occur in both prokaryotes and eukaryotes. While the HAD superfamily includes dehalogenases, the majority of the superfamily members are involved in phosphoryl group transfer reactions (phosphatase, phosphonatase and phosphomutase activities). Recent examples of biologically important mammalian HAD superfamily members include chronophin (Gohla, Birkenfeld and Bokoch (2005) Nature Cell Biol. 7: 21-29), which is involved in the regulation of cofilin-dependent actin dynamics, and the Drosophila eyes absent homolog 2 (Zhang et al. (2005) Cancer Res. 65: 925-932), which is up-regulated in ovarian cancer and promotes tumor growth. Mammalian HAD superfamily members are potential novel targets for cancers and other diseases.
  • HDHD4 was initially identified as a potential oncology target through studies of its ortholog in Drosophila. Over-expression of the Drosophila gene CG15771 suppresses the small eye defect caused by over-expression of human p21(+) in the eye. Subsequent studies of HDHD4 in cancer cell lines indicate that it acts synergistically with the Ras/P21 pathway.
  • HDHD4 suppression causes transient reduction in p21 protein levels in M109 murine melanoma cells transfected with an siRNA shown to cause the specific degradation of HDHD4 mRNA.
  • HDHD4 over-expression reverses p21-mediated G1 arrest in A549 cells.
  • HDHD4 over-expression reverses p21-mediated S-phase arrest in HEK293 cells.
  • HDHD4 displays weak phosphatase activity against several small-molecule substrates such as 2,3-diphosphoglycerate, which was used as the basis of a high-throughput screen. Based on the observed in vitro activity, and by comparing the HDHD4 active site composition to the known HAD superfamily active site motifs (Allen & Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503), HDHD4 is likely to be a novel human phosphatase. Recently, N-acetylneuraminate 9-phosphate was identified as a biologically relevant substrate for HDHD4 (Maliekal et al. (2006) Glycobiology 16: 165-172).
  • HDHD4 is a member of subfamily I of the HAD superfamily (Allen & Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503).
  • Subfamily I members contain a core domain and a cap domain. It was believed that the natural substrates of subfamily I members are exclusively small molecules, since the core and cap domains adopt a "closed" conformation when substrates/inhibitors bind.
  • A detailed three-dimensional structure of HDHD4 would greatly facilitate not only an understanding of HDHD4 structure and activity, but also the design of modulators that can be employed in the diagnosis, prognosis and treatment of HDHD4-related conditions, such as different forms of cancer.
  • Such information can take the form of, for example, structural coordinates derived from a crystalline form of a HDHD4-ligand complex.
  • HDHD4 can be designed and/or identified, and additional details regarding HDHD4's mechanism of action can be obtained.
  • the present invention provides a crystalline form comprising a complex comprising a HDHD4 polypeptide and a moiety comprising a metal atom.
  • the moiety comprising a metal atom is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both magnesium and a phosphate mimetic, both manganese and a phosphate mimetic, and both calcium and a phosphate mimetic.
  • the phosphate mimetic can be, for example, vanadate, tungstate, sulfate or aluminum trifluoride.
  • a HDHD4 polypeptide comprises the amino acid sequence of SEQ ID NOs:2 or 4.
  • a HDHD4 polypeptide can also comprise a His-tagged form.
  • the crystalline form is a triclinic crystalline form and has a space group of P1 or P2₁2₁2₁.
  • the crystalline form is described by the structure coordinates of Table 1 or Table 2 and the three-dimensional structure of the crystallized complex is determined to a resolution of about 3.0 Å or better.
  • the crystalline form comprises one or more atoms having an atomic weight of 40 g/mol or more.
  • the present invention also provides a method for determining the three-dimensional structure of a crystallized HDHD4 in complex with a moiety comprising a metal atom to a resolution of about 3.0 Å or better.
  • the method comprises: (a) crystallizing a HDHD4 polypeptide in complex with a moiety comprising a metal atom to form a crystallized complex; and (b) analyzing the crystallized complex to determine a three-dimensional structure of the HDHD4 polypeptide in complex with a ligand, whereby the three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand is determined to a resolution of about 3.0 Å or better.
  • the present invention further provides a method of designing a modulator of HDHD4.
  • the method comprises: (a) designing a potential modulator of HDHD4 that will make interactions with amino acids in a ligand binding site of a HDHD4, based upon a crystalline structure comprising a HDHD4 in complex with a ligand; (b) synthesizing the modulator; and (c) determining whether the potential modulator modulates the activity of HDHD4, whereby a modulator of HDHD4 is designed.
  • the present invention provides a method of identifying a HDHD4 modulator.
  • the method comprises: (a) inputting structure coordinates describing a three-dimensional structure of a HDHD4 polypeptide in complex with a moiety comprising a metal atom to modeling software disposed on a computer; and (b) modeling a candidate modulator that forms one or more desired interactions with one or more amino acids of a ligand binding site of the HDHD4 and fits sterically within the HDHD4 binding pocket.
  • the method can further comprise assaying the modulatory properties of the candidate modulator by contacting the candidate modulator with a cell extract or purified HDHD4 polypeptide to determine whether it is a modulator of HDHD4 activity.
  • the present invention also provides a method of increasing the efficiency of a modulator of HDHD4 and, in a representative embodiment, comprises: (a) providing a first ligand having a known effect on the biological activity of HDHD4; (b) modifying the first ligand based on an evaluation of a three-dimensional structure of a HDHD4, optionally in complex with a ligand to form a modified ligand; (c) synthesizing the modified ligand; and (d) determining an effect of the modified ligand on HDHD4, wherein the efficiency of a modulator of HDHD4 is increased if the modified ligand favorably alters a biological activity of a HDHD4 with respect to the biological activity of the first ligand.
  • the present invention provides a method of designing a modulator of HDHD4.
  • the method comprises: (a) modeling all or any part of a HDHD4 ligand binding site; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of the HDHD4 binding site; wherein the HDHD4 binding site is defined by the structure coordinates of Table 1 or Table 2.
  • the candidate modulator can be designed to fit spatially into all or any part of a HDHD4 binding site, and a ligand binding site can be described by the structure coordinates of amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64 and N65, and subcombinations thereof according to Table 1 or Table 2.
  • the candidate modulator can then be synthesized and tested for modulation ability in a suitable assay.
  • the method can further comprise: (c) docking the designed candidate modulator into all or any part of the HDHD4 binding site; and (d) analyzing the structural and/or chemical feature complementarity of the candidate modulator with all or any part of the HDHD4 binding site.
  • the present invention also provides a method of designing a modulator of a target polypeptide that is structurally similar to HDHD4.
  • the method comprises: (a) modeling all or any part of a HDHD4 polypeptide; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of a HDHD4 polypeptide binding site; wherein the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2.
  • the method can further comprise: (c) docking the chemical entity into all or any part of the HDHD4 binding site; and (d) analyzing the structural and chemical feature complementarity of the candidate modulator with all or any part of a HDHD4 polypeptide, such as a binding site.
  • the present invention provides a method of identifying structural features of HDHD4 that can be employed in the design of a modulator that selectively modulates the activity of HDHD4 polypeptide to the exclusion of other structurally similar proteins.
  • the method comprises: (a) providing a three-dimensional structure of a HDHD4 polypeptide, optionally in complex with a moiety comprising a metal atom, and a three-dimensional structure of a structurally similar but non-identical test structure; (b) overlaying the backbone residues of the HDHD4 structure onto the test structure; and (c) identifying structural features of HDHD4 that do not overlap the test structure to a desired degree.
  • the identifying can comprise, for example, a visual inspection of the overlapped structures or a quantitative comparison. Additionally, the identifying can comprise one or more computational evaluations of the overlapped structures, which can be performed by employing commercially available computer software known to those of ordinary skill in the art.
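As a rough illustration of the quantitative comparison in step (c), the sketch below flags residues whose C-alpha positions differ by more than a chosen cutoff between two structures. It assumes the two structures have already been superposed into a common frame and share equivalent residue numbering; the function name, the dictionary layout and the 2.0 Å default cutoff are hypothetical choices for illustration, not values taken from the disclosure.

```python
import math


def divergent_residues(ref_ca, test_ca, cutoff=2.0):
    """Return residue numbers whose C-alpha atoms lie more than `cutoff` apart.

    ref_ca and test_ca map residue number -> (x, y, z) in angstroms.
    Both structures are assumed to be superposed already (hypothetical input
    format); residues present in only one structure are skipped.
    """
    flagged = []
    for resnum in sorted(ref_ca.keys() & test_ca.keys()):
        if math.dist(ref_ca[resnum], test_ca[resnum]) > cutoff:
            flagged.append(resnum)
    return flagged


# Toy coordinates, not taken from Table 1 or Table 2.
reference = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (2.0, 0.0, 0.0)}
other = {1: (0.0, 0.0, 0.5), 2: (1.0, 3.0, 0.0), 4: (9.0, 9.0, 9.0)}
print(divergent_residues(reference, other))
```

Residues returned by such a comparison would only be candidates for closer inspection; the superposition itself and the choice of cutoff are left to the modeling software and the practitioner.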
  • the present invention provides methods useful in the design and identification of ligands and/or modulators of HDHD4.
  • the present invention provides a method of docking a test molecule into all or any part of a binding site on a HDHD4 and a method of identifying structural and chemical features of all or any part of a HDHD4.
  • the present invention provides a method of designing a ligand of HDHD4.
  • the method comprises: (a) modeling all or any part of a HDHD4; and (b) designing a chemical entity that has structural and chemical complementarity with all or any part of a HDHD4 binding site.
  • a method of evaluating the potential of a chemical entity to bind to all or any part of HDHD4, as well as a method for identifying a ligand and/or a modulator of HDHD4 is also disclosed.
  • the present invention provides a method of designing a HDHD4 mutant.
  • the method comprises: (a) evaluating a three-dimensional structure of a HDHD4 polypeptide to identify one or more amino acids as candidates for mutation; and (b) mutating the identified one or more amino acids of HDHD4 by making an amino acid mutation selected from the group consisting of a substitution, a deletion and an insertion.
  • the method can further comprise the step of (c) expressing the mutant so generated.
  • the present invention also encompasses the resultant mutant HDHD4, as well as portions of a mutant HDHD4.
  • the present invention provides a method of forming a homology model based on a HDHD4 structure of the present invention.
  • a method of constructing a homology model consistent with the present invention comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4, wherein the HDHD4 is described by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4; and (d) generating a structure of the target protein based on the analysis.
  • the present invention provides a method for evaluating the potential of a chemical entity to bind to all or any part of a HDHD4 or a structurally similar molecule comprising: (a) docking a candidate modulator into all or any part of a HDHD4 described by the structure coordinates of Table 1 or Table 2; and (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of a HDHD4.
  • the present invention additionally provides a method for identifying a modulator of HDHD4.
  • the method comprises the following steps, which are preferably, but not necessarily, performed in the order recited: (a) docking a candidate modulator into all or any part of a HDHD4 binding site, wherein the HDHD4 binding site is described by the structure coordinates of Table 1 or Table 2; (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 binding site; (c) synthesizing the candidate modulator; and (d) screening the candidate modulator in a biological assay for the ability to modulate HDHD4.
  • the method can further comprise the following step of (e) screening the candidate modulator in an assay that characterizes binding to HDHD4.
  • the present invention also comprises a method of determining the structure of a target protein for which little or no structural information is known.
  • the method comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of a HDHD4 structure, wherein the HDHD4 is described by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4; (d) generating a structure of the target protein based on the analysis; and (e) analyzing the generated structure to determine the structure of a target protein for which a structure is not known.
  • the present invention also comprises a method of designing a mutation in
  • One embodiment of a method of designing a mutation comprises: (a) selecting a property of HDHD4 to be investigated; (b) providing a three-dimensional structure of a HDHD4; and (c) evaluating the structure to identify a residue known or suspected to be related to the selected property.
  • the steps of the method can be repeated a desired number of times.
  • the present invention further provides a method of modulating HDHD4 activity comprising administering a modulator of HDHD4 in an amount sufficient to modulate HDHD4 activity, wherein the modulator of HDHD4 is a ligand known or suspected to bind to HDHD4 or was identified using a structure of the present invention.
  • the method of identifying a modulator comprises: (a) docking a test molecule into all or any part of a HDHD4 binding site; (b) analyzing the structural and chemical feature complementarity between the test molecule and all or any part of a HDHD4; and (c) screening the test molecule in a biological assay of modulation of HDHD4.
  • the method can further comprise one or more of the following steps: (d) screening the test molecule in an assay that characterizes binding to HDHD4; and (e) screening the test molecule in an assay that characterizes binding to HDHD4.
  • the present invention provides a machine-readable data storage medium comprising data storage material encoded with machine-readable data comprising all or any part of the structure coordinates of a HDHD4 polypeptide, optionally in complex with a ligand and/or optionally in complex with a moiety comprising a metal atom.
  • the present invention further provides computer systems comprising the machine-readable data storage media of the present invention, the systems being capable of producing a three-dimensional representation of all or any part of a HDHD4 alone or optionally in complex with a ligand and/or a moiety comprising a metal atom.
  • the core domain can comprise
  • the cap domain can comprise HDHD4 residues A21-H107 and hinge segments can comprise residues I18-T20 and M108-L110.
  • the ligand binding site of the HDHD4 polypeptide can comprise HDHD4 residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64 and N65, and subcombinations thereof.
  • the HDHD4 polypeptide can comprise the amino acid sequence of SEQ ID NOs:2 or 4 and can be encoded by a nucleic acid selected from the group consisting of SEQ ID NOs: 1 and 3, respectively, and sequences deviating from SEQ ID NOs: 1 and 3 due to the degeneracy in the genetic code.
  • a HDHD4 polypeptide can comprise a moiety comprising a metal atom, and the moiety can be selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both magnesium and a phosphate mimetic, both manganese and a phosphate mimetic, both calcium and a phosphate mimetic, and both magnesium and a phosphorylated sugar.
  • the phosphate mimetic can be, for example, vanadate, phosphate, tungstate, sulfate or aluminum trifluoride.
  • HDHD4 polypeptide optionally in complex with a moiety comprising a metal atom and/or optionally comprising a ligand. This object is achieved in whole or in part by the present invention.
  • Figure 1 is a photograph of crystals of HDHD4 complexed with Mg²⁺ and phosphate and/or VO₄³⁻.
  • the diameter of the cluster is approximately 0.65 mm.
  • Figure 2 is a cartoon diagram of HDHD4 with Mg²⁺ (black ball) and VO₄³⁻ (black stick). The unmodelled density connected to Lys141 is also shown. The cap domain is shown in A and the core domain shown in B.
  • Figure 3 is a line depiction of an expanded view of some of the HDHD4 active
  • Mg²⁺ is represented as a black ball and VO₄³⁻ is represented by black sticks.
  • the cap domain is shown in A and the core domain shown in B.
  • the unmodelled density is shown. Lys141 is shown as a gray stick.
  • Figure 4 is a cartoon diagram depicting the HDHD4 X-ray structure with loop residues 60-65 modeled to fit the discontinuous density.
  • Mg²⁺ is represented as a black ball and VO₄³⁻ is represented by a black stick.
  • the cap domain is shown in A and the core domain is shown in B.
  • the region of HDHD4 encompassing residues 60-65 is defined by the black arrows.
  • Figures 5A and 5B are a series of cartoon diagrams depicting the HDHD4 X-ray structure (the left structure in both Figures 5A and 5B) compared with phosphonatase in open conformation in complex with Mg²⁺ (PDB file 1RQN, the center structure in both Figures 5A and 5B) and in closed conformation in complex with tungstate (shown as sticks) and Mg²⁺ (PDB file 1FEZ, the right structure in both Figures 5A and 5B).
  • In Figure 5A, the structures are shown with view to the face of the β-sheet in the core domain of all three structures.
  • In Figure 5B, the same structures are shown rotated approximately 90 degrees to view down the same β-sheet.
  • Figure 6A depicts the DNA and protein sequences of full length wild-type HDHD4 as derived from NCBI RefSeq entries NM_152667 (SEQ ID NO: 1) and NP_689880 (SEQ ID NO:2), respectively.
  • STP refers to a stop codon.
  • Figure 6B depicts the protein sequence of full length wild-type HDHD4, as derived from NCBI RefSeq entry NP_689880 (SEQ ID NO:2).
  • Figure 7 depicts the DNA (SEQ ID NO:3) and translated protein sequences
  • Figure 9 depicts nuclear magnetic resonance data for HDHD4 (VG10: SEQ ID NO: 4) complexes: overlaid region of two-dimensional ¹H-¹⁵N heteronuclear single quantum coherence (HSQC) spectra.
  • the spectrum shown in black was acquired using a sample of HDHD4 (0.15 mM) in complex with magnesium (3.0 mM), and the spectrum shown in gray was acquired using a sample of HDHD4 (0.15 mM) in complex with magnesium (3.0 mM) and aluminum trifluoride (AlF₃) (1.5 mM).
  • AlF₃: aluminum trifluoride
  • Residues in the vicinity of the vanadate/phosphate binding site, including I18, T193, G197, and G198, show chemical shift changes in response to AlF₃ binding, whereas more distant residues, including G202, A205 and G213, are not significantly perturbed by AlF₃ binding.
  • the present invention comprises a three-dimensional structure of HDHD4 (e.g., SEQ ID NOs:2 and 4) in complex with magnesium and phosphate and/or vanadate atoms.
  • the three-dimensional structure of HDHD4 disclosed herein reveals several unique structural features heretofore unidentified in the HDHD4 polypeptide, which can be exploited in a rational drug design process.
  • the present invention encompasses not only the three-dimensional structure of HDHD4 (described by the structure coordinates presented in Table 1 or Table 2), but also various uses of the structure including screening methods and modulator design methods.
  • the terms “a” and “an” mean “one or more” when used in this application, including the claims.
  • the term “about,” when referring to a value or to an amount of mass, weight, time, volume, concentration or percentage is meant to encompass variations of ±20% or less (e.g., ±15%, ±10%, ±7%, ±5%, ±4%, ±3%, ±2%, ±1%, or ±0.1%) from the specified amount, as such variations are appropriate.
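The tolerance in this definition can be expressed as a simple check. The following is a minimal sketch, assuming the widest stated variation (±20%) as the default; the function name and parameter names are hypothetical.

```python
def within_about(value, specified, tolerance=0.20):
    """Return True if `value` lies within ±tolerance of `specified`.

    `tolerance` is a fraction of the specified amount (0.20 means ±20%).
    """
    return abs(value - specified) <= tolerance * abs(specified)


# A resolution of 3.5 falls within "about 3.0" at ±20%, while 3.7 does not.
print(within_about(3.5, 3.0))
print(within_about(3.7, 3.0))
```

A narrower variation, such as ±5%, would be checked by passing `tolerance=0.05`.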
  • As used herein, the terms "amino acid,” “amino acid residue” and “residue” are used interchangeably and mean any of the twenty naturally occurring amino acids.
  • An amino acid is formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages.
  • the amino acid residues described herein are preferably in the "L” isomeric form. However, residues in the "D” isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide.
  • NH₂ refers to the free amino group present at the amino terminus of a polypeptide.
  • COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide.
  • amino acid residue sequences represented herein by formulae have a left-to-right orientation in the conventional direction of amino terminus to carboxy terminus.
  • amino acid residues are broadly defined to include modified and unusual amino acids.
  • a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues, or a covalent bond to an amino-terminal group, such as NH₂, to an acetyl group or to a carboxy-terminal group, such as COOH.
  • the terms "associate” and “bind” and grammatical derivations thereof are used interchangeably and mean a condition of proximity between or amongst molecules, structural elements, chemical compounds or chemical entities.
  • An association can be non-covalent (i.e., reversible), wherein the juxtaposition is energetically favored by hydrogen bonding or van der Waals or electrostatic interactions, or it can be covalent (i.e., irreversible).
  • when a ligand “associates” with or "binds" to a protein, it is meant that the ligand interacts with the protein via covalent or non-covalent interactions.
  • binding site and "ligand binding site” are used interchangeably and mean a region of a molecule or molecular complex that, as a result of its shape, favorably associates with a ligand.
  • a binding site such as a binding site in the light chain of HDHD4, defines a space commonly referred to as a "cavity” or “pocket,” both of which terms are used interchangeably with “binding site” and “ligand binding site” in the present disclosure.
  • a ligand of a binding site situates in the binding site when the ligand associates with the molecule or molecular complex.
  • the extended active site of HDHD4, including both the region shown to bind the phosphate mimetic vanadate and the region with the unmodeled density consistent with a small organic molecule, is lined with the following residues: D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194.
  • the residues C60, F61, H62, P63, Y64, and N65 also likely form part of the extended active site of HDHD4, based on the proximity of their Cα carbons.
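To show how a residue list of this kind might be used against a coordinate file, the sketch below selects ATOM records for named residues from PDB-format text. It assumes the coordinates are available as standard fixed-column PDB ATOM lines (residue name in columns 18-20, residue number in columns 23-26); whether Table 1 and Table 2 follow exactly that layout is not confirmed here, and the function names are hypothetical.

```python
# One-letter to three-letter codes for the twenty standard amino acids.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}


def parse_label(label):
    """Split a residue label such as 'D12' into ('ASP', 12)."""
    return ONE_TO_THREE[label[0]], int(label[1:])


def select_site_atoms(pdb_lines, labels):
    """Return the ATOM lines whose residue name and number match `labels`."""
    wanted = {parse_label(label) for label in labels}
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK and other record types
        res_name = line[17:20].strip()  # PDB columns 18-20
        res_seq = int(line[22:26])      # PDB columns 23-26
        if (res_name, res_seq) in wanted:
            selected.append(line)
    return selected


# Toy records with made-up coordinates, for illustration only.
example = [
    "ATOM      1  CA  ASP A  12      11.000  12.000  13.000",
    "ATOM      2  CA  LEU A  13      14.000  12.000  13.000",
    "ATOM      3  CA  GLY A  16      17.000  12.000  13.000",
]
print(len(select_site_atoms(example, ["D12", "L13", "D14"])))
```

Matching on both residue name and number guards against a numbering offset between the label list and the file, at the cost of silently skipping residues that were mutated or renamed.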
  • biological activity means any observable effect flowing from a HDHD4 polypeptide.
  • examples of biological activity in the context of the present invention include phosphoryl transfer, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate.
  • chimeric protein and "fusion protein" are used interchangeably and mean a fusion of a first amino acid sequence encoding a HDHD4 polypeptide with a second amino acid sequence defining a polypeptide domain foreign to, and not homologous with, a HDHD4 polypeptide.
  • a chimeric protein can present a foreign domain that is found in an organism that also expresses the first protein, or it can be an "interspecies” or “intergenic” fusion of protein structures expressed by different kinds of organisms.
  • a chimeric or fusion protein of the present invention can be represented by the general formula X — HDHD4 — Y, wherein HDHD4 represents a portion of the protein which is derived from a HDHD4 polypeptide (e.g., all or a part of a HDHD4 polypeptide), and X and Y are independently absent or represent amino acid sequences which are not derived from a HDHD4 polypeptide, which includes naturally occurring mutants.
  • the term "chimeric gene” refers to a nucleic acid construct that encodes a "chimeric protein" or "fusion protein” as defined herein.
  • chimeric and fusion proteins are encompassed by the term "mutant,” examples of which are described herein.
  • the term “complementary” means a nucleic acid sequence that is base paired, or is capable of base-pairing, according to the standard Watson-Crick complementarity rules. These rules generally hold that guanine pairs with cytosine (G:C) and adenine pairs with either thymine (A:T) in the case of DNA or uracil (A:U) in the case of RNA.
  • complementarity can also refer to a favorable spatial arrangement between the surface of a ligand and the surface of its binding site.
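The Watson-Crick pairing rules stated above can be written out directly. The following is a minimal sketch that returns the reverse complement of a DNA or RNA sequence; the function and table names are hypothetical, and ambiguity codes are deliberately not handled.

```python
# Pairing tables taken from the rules above: G:C, and A:T (DNA) or A:U (RNA).
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq, rna=False):
    """Return the strand that base-pairs with `seq`, read 5' to 3'.

    Raises KeyError for any character outside the four standard bases.
    """
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


print(reverse_complement("ATGC"))            # GCAT
print(reverse_complement("AUGC", rna=True))  # GCAU
```

The sequence is reversed because the two strands of a duplex run antiparallel, so the complementary strand is conventionally written in the opposite direction.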
  • detecting means confirming the presence of a target entity by observing the occurrence of a detectable signal, such as a radiologic, fluorescent or colorimetric signal, that will appear exclusively in the presence of the target entity.
  • a chemical entity e.g., a ligand or modulator (or a candidate ligand or modulator), such as a small organic molecule
  • “HDHD4 gene” and “recombinant HDHD4 gene” mean a nucleic acid molecule comprising an open reading frame encoding a HDHD4 polypeptide of the present invention, including both exon and (optionally) intron sequences.
  • As used herein, the terms “HDHD4 gene product”, “HDHD4 protein”, “HDHD4 polypeptide”, and “HDHD4 peptide” are used interchangeably and mean a polypeptide having an amino acid sequence that is substantially identical to a native HDHD4 amino acid sequence from an organism of interest and which is biologically active in that it comprises all or a part of the amino acid sequence of a HDHD4 polypeptide, or cross-reacts with antibodies raised against a HDHD4 polypeptide, or retains all or some of the biological activity (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate) of the native amino acid sequence or protein.
  • biological activity can also include immunogenicity.
  • As used herein, the terms “HDHD4 gene product”, “HDHD4 protein”, “HDHD4 polypeptide”, and “HDHD4 peptide” also include analogs of a HDHD4 polypeptide.
  • by “analog” is intended that a DNA or amino acid sequence can contain alterations relative to the sequences disclosed herein, yet still retain all or some of the biological activity of those sequences. Analogs can be derived from cDNA or genomic nucleotide sequences from a human or other organism, or can be created synthetically. Those of ordinary skill in the art will appreciate that other analogs as yet undisclosed or undiscovered can be used to design and/or construct a HDHD4 analog.
  • it is not necessary for a “HDHD4 gene product”, “HDHD4 protein”, “HDHD4 polypeptide”, or “HDHD4 peptide” to comprise all or substantially all of the amino acid sequence of a HDHD4 polypeptide gene product.
  • Shorter or longer sequences are anticipated to be of use in the present invention; shorter sequences are referred to herein as “segments”.
  • the terms “HDHD4 gene product”, “HDHD4 protein”, “HDHD4 polypeptide”, and “HDHD4 peptide” also include fusion, chimeric or recombinant HDHD4 polypeptides and proteins comprising sequences of the present invention. Methods of preparing such proteins are disclosed herein and/or are known in the art.
  • As used herein, the terms “HDHD4 protein”, “HDHD4 polypeptide”, and “HDHD4 peptide” are used interchangeably and mean a polypeptide having an amino acid sequence that is substantially identical to a native HDHD4 amino acid sequence from an organism of interest and which is biologically active in that it comprises all or a part of the amino acid sequence of a HDHD4 polypeptide, or cross-reacts with antibodies raised against a HDHD4 polypeptide, or retains all or some of the biological activity (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate) of the native amino acid sequence or protein.
  • biological activity can include immunogenicity.
  • a HDHD4 protein comprises the amino acid sequences of SEQ ID NOs: 2 or 4 and is encoded by the nucleic acid sequences of SEQ ID NOs: 1 and 3.
  • the terms “HDHD4 protein”, “HDHD4 polypeptide”, and “HDHD4 peptide” encompass mutants, including derivatives and analogs of a HDHD4 polypeptide.
  • by “analog” is meant that a DNA or amino acid sequence can contain alterations relative to a sequence disclosed herein, yet retain all or some of the biological activity of the sequence.
  • An analog can be derived from genomic nucleotide sequences or cDNA, as disclosed herein, or can be created synthetically.
  • the term “gene” refers broadly to any segment of DNA associated with a biological function.
  • a gene can encompass polynucleotide sequences including, but not limited to, a coding sequence, a promoter region, a cis-regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof.
  • a gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information and recombinant derivation of an existing sequence.
  • “isolated” and “purified” are used interchangeably and refer to material (e.g., a nucleic acid or a protein) removed from its original environment (e.g., the natural environment, if it is naturally occurring), and thus is altered “by the hand of man” from its natural state.
  • an isolated polynucleotide could be part of a vector or a composition of matter, or could be contained within a cell, and still be “isolated” because that vector, composition of matter, or particular cell is not the original environment of the polynucleotide.
  • isolated does not refer to genomic or cDNA libraries, whole cell total or mRNA preparations, genomic DNA preparations (including those separated by electrophoresis and transferred onto blots), sheared whole cell genomic DNA preparations or other compositions where the art demonstrates no distinguishing features of the polynucleotide and/or protein sequences of the present invention.
  • the term “isomorphous replacement” means a method of using heavy atom derivative crystals to obtain the phase information necessary to elucidate the three-dimensional structure of a native crystal (see, e.g., Blundell et al., Protein Crystallography, Academic Press, New York, New York, USA (1976); Otwinowski, in Isomorphous Replacement and Anomalous Scattering, (Evans & Leslie, eds.), Daresbury Laboratory, Daresbury, UK (1991) pp. 80-86, both of which are incorporated in their entirety).
  • the phrase “heavy atom derivatization” is synonymous with the term “isomorphous replacement” and these terms are used synonymously herein.
  • ligand means any molecule that is known or suspected to associate with another molecule.
  • ligand encompasses inhibitors, activators, agonists, antagonists, natural substrates and analogs of natural substrates.
  • “modeling”, in all its grammatical forms, refers to the development of a mathematical construct designed to mimic actual molecular geometry and behavior in proteins and small molecules.
  • These mathematical constructs include, but are not limited to: energy calculations for a given geometry of a molecule utilizing forcefields or ab initio methods known in the art; energy minimization using gradients of the energy calculated as atoms are shifted so as to produce a lower energy; conformational searching, i.e., locating local energy minima; molecular dynamics wherein a molecular system (single molecule or ligand/protein complex) is propagated forward through increments of time according to Newtonian mechanics using techniques known to the art; calculations of molecular properties such as electrostatic fields, hydrophobicity and lipophilicity; calculation of solvent-accessible or other molecular surfaces and rendition of the molecular properties on those surfaces; comparison of molecules using either atom-atom correspondences or other criteria such as surfaces and properties; quantitative structure-activity relationships (QSARs)
  • the term “modified” means an alteration from an entity's normally occurring state. An entity can be modified, for example, by removing discrete chemical units or by adding discrete chemical units. The term “modified” encompasses detectable labels as well as those entities added as aids in purification, such as His-tags.
  • the terms “modulate” and grammatical derivations thereof refer to an increase, decrease, or other alteration of any and/or all chemical and biological activities or properties mediated by a given DNA sequence, RNA sequence, polypeptide, peptide or molecule.
  • the definition of “modulate” as used herein encompasses agonists and/or antagonists of a particular activity, DNA, RNA, or protein. The term “modulation” therefore refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e., inhibition or suppression) of a response by any mode of action.
  • the term “molecular replacement” means a method of solving a three-dimensional structure of a compound (e.g., a protein) that involves generating a preliminary model of a wild-type or mutant crystal whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known (e.g., a HDHD4 polypeptide, as disclosed herein) within the unit cell of the unknown crystal so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown.
  • molecular replacement can be used to determine the structure coordinates of a crystalline mutant or homolog of a HDHD4 polypeptide, a structure known or suspected to be similar to the HDHD4 structure of the present invention or of a different crystal form of a HDHD4 polypeptide.
  • mutant encompasses fusion, chimeric and recombinant polypeptides and proteins (e.g., a HDHD4 polypeptide) comprising sequences of the present invention.
  • mutant encompasses a polypeptide otherwise falling within the definition of a polypeptide as set forth herein, but having an amino acid sequence which differs from that of the wild-type polypeptide, whether by way of deletion, substitution, or insertion.
  • a mutant can share many physicochemical and biological activities (e.g., antigenicity or immunogenicity) with the wild-type, and in some embodiments can comprise most or all of a wild-type sequence. Methods of preparing such proteins are disclosed herein and/or are known in the art.
  • As used herein, the terms “nucleotide”, “base” and “nucleic acid” are used interchangeably and are equivalent. Additionally, the terms “nucleotide sequence”, “nucleic acid sequence”, “nucleic acid molecule” and “segment” are used interchangeably and are equivalent.
  • nucleotide means any of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action.
  • a nucleic acid can comprise monomers that are naturally-occurring nucleotides (such as deoxyribonucleotides and ribonucleotides), or analogs of naturally-occurring nucleotides (e.g., α-enantiomeric forms of naturally-occurring nucleotides), or a combination of both.
  • Modified nucleotides can have modifications in sugar moieties and/or in pyrimidine or purine base moieties.
  • Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters.
  • an entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs.
  • modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes.
  • Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like.
  • nucleic acid also includes so-called “peptide nucleic acids,” which comprise naturally-occurring or modified nucleic acid bases attached to a polyamide backbone. Nucleic acids can be either single stranded or double stranded.
  • “oligonucleotide” and “polynucleotide” are used interchangeably and mean a single- or double-stranded DNA or RNA sequence. Typically, an oligonucleotide is a short segment of about 50 or fewer nucleotides. An oligonucleotide or a polynucleotide can be naturally occurring or synthetic, but oligonucleotides are typically prepared by synthetic means.
  • an “oligonucleotide” and/or a “polynucleotide” includes DNA sequences and/or their complements.
  • the sequences can be, for example, between 1 and 250 bases, and, in some embodiments, between 5-10, 5-20, 10-20, 10-50, 20-50, 10-100 bases, or 100 or more bases in length.
  • the terms “oligonucleotide” and “polynucleotide” refer to a molecule comprising two or more nucleotides.
  • an oligonucleotide or polynucleotide can comprise a nucleotide sequence of a full length cDNA sequence, including any 5' and 3' untranslated sequences, the coding region, with or without a signal sequence, a secreted protein coding region, as well as fragments, epitopes, domains, and variants of the nucleic acid sequence.
  • a “polynucleotide” of the present invention also includes those polynucleotides capable of hybridizing, under stringent hybridization conditions (examples of which are provided herein), to sequences described herein, or the complement thereof.
  • an oligonucleotide or a polynucleotide of the present invention can comprise any polyribonucleotide or polydeoxyribonucleotide, and can comprise unmodified RNA or DNA or modified RNA or DNA.
  • a polynucleotide can comprise single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that can be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions.
  • a polynucleotide can comprise triple-stranded regions comprising RNA or DNA or both RNA and DNA.
  • a polynucleotide can also contain one or more modified bases or DNA or RNA backbones modified for stability or for other reasons.
  • Modified bases include, for example, tritylated bases and unusual bases, such as inosine.
  • the terms “oligonucleotide” and “polynucleotide” embrace chemically, enzymatically, or metabolically modified forms.
  • a “polypeptide”, defined further herein, refers to a molecule having the translated amino acid sequence generated directly or indirectly from a polynucleotide.
  • a nucleic acid molecule of the present invention encoding a polypeptide of the present invention can be obtained using standard cloning and screening procedures, such as those for cloning cDNAs using mRNA as starting material.
  • As used herein, the terms “organism”, “subject” and “patient” are used interchangeably and mean any organism referenced herein, including prokaryotes, though the terms preferably refer to eukaryotic organisms, notably mammals (e.g., mice, rats, dogs and pigs), including humans.
  • As used herein, the terms “protein”, “polypeptide” and “peptide” are used interchangeably and mean any polymer comprising any of the 20 protein amino acids, regardless of its size. Although “protein” is often used in reference to relatively large polypeptides, and “peptide” is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. Therefore, the term “polypeptide” as used herein refers to peptides, polypeptides and proteins, unless otherwise noted. The terms “protein”, “polypeptide” and “peptide” are used interchangeably herein.
  • a polypeptide of the present invention can comprise amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and can contain amino acids other than the 20 gene-encoded amino acids.
  • a polypeptide can be modified by either natural processes, such as by posttranslational processing, or by chemical modification techniques which are known in the art. Such modifications will be known to those of ordinary skill in the art. Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. The same type of modification can be present in the same or varying degrees at several sites in a given polypeptide.
  • a given polypeptide can contain many types of modifications.
  • a polypeptide can be branched, for example, as a result of ubiquitination, or a polypeptide can be cyclic, with or without branching. Cyclic, branched, and branched cyclic polypeptides can result from posttranslational natural processes or can be made by synthetic methods.
  • Representative modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination (see, e.g.
  • a “polypeptide having biological activity” refers to a polypeptide exhibiting activity similar, but not necessarily identical to, an activity of a HDHD4 polypeptide of the present invention, including mature forms, as measured in a particular biological assay (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate; see, e.g., Maliekal et al., (2006) Glycobiology 16:165-172), with or without dose dependency.
  • a polypeptide having biological activity can exhibit activity of not more than about 25-fold less and, preferably, not more than about ten-fold less activity, and most preferably, not more than about three-fold less activity relative to a polypeptide of the present invention.
  • root mean square deviation means the square root of the arithmetic mean of the squares of the deviations. It is a way to express the deviation or variation from a trend or object.
  • root mean square deviation describes the variation in the backbone of a mutant or homologous protein from the backbone of HDHD4 or a binding pocket portion thereof, as defined by the structure coordinates of HDHD4 described in Table 1 or Table 2 herein.
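The root mean square deviation defined above can be illustrated with a short Python sketch. It assumes that the two backbones have already been superimposed and that their atoms are listed in matching order; the function name `rmsd` and any coordinates passed to it are illustrative only and are not taken from Table 1 or Table 2.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Square root of the arithmetic mean of the squared atom-to-atom deviations."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In practice a least-squares superposition would normally precede this calculation; that step is omitted here to keep the sketch short.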
  • space group means the arrangement of symmetry elements of a crystal.
  • stringent hybridization conditions refers to an overnight incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 μg/mL denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
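The SSC strengths named in the hybridization conditions scale linearly from the stated 5x composition (750 mM NaCl, 75 mM trisodium citrate). The following Python sketch works through that arithmetic; the helper name `ssc_concentrations` is hypothetical.

```python
# Component concentrations of 5x SSC as given in the text, in mM.
SSC_5X = {"NaCl": 750.0, "trisodium citrate": 75.0}


def ssc_concentrations(fold: float) -> dict:
    """Scale the 5x SSC component concentrations (mM) linearly to fold-x SSC."""
    return {name: conc * fold / 5.0 for name, conc in SSC_5X.items()}
```

On this basis the wash buffer contains about 15 mM NaCl and 1.5 mM trisodium citrate, one fiftieth of the hybridization buffer.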
  • “structure coordinates”, “structural coordinates”, “atomic structural coordinates” and “atomic coordinates” mean mathematical coordinates derived from mathematical equations related to the patterns obtained from the diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of a molecule in crystal form.
  • the diffraction data are used to calculate an electron density map of the repeating unit of the crystal.
  • the electron density maps are then used to establish the positions of the individual atoms within the unit cell of the crystal.
  • any set of structure coordinates determined by X-ray crystallography is not without standard error.
  • RMSD root mean square deviation
  • the term “substantially identical” means at least 75% sequence identity between nucleotide or amino acid sequences. Sequence similarity is calculated based on a reference sequence, which can be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc. In the context of nucleic acids, a reference sequence will usually be at least about 18 nucleotides (nt) long, more usually at least about 30 nt long, and can extend to the complete sequence that is being compared. Algorithms for sequence analysis are known in the art, such as BLAST, described in Altschul et al., (1990) J. Mol. Biol. 215:403-10.
  • Percent identity or percent similarity of a DNA or peptide sequence can be determined, for example, by comparing sequence information using the GAP computer program, available from the University of Wisconsin Genetics Computer Group.
  • the GAP program utilizes the alignment method of Needleman et al., (1970) J. Mol. Biol. 48:443, as revised by Smith et al., (1981) Adv. Appl. Math. 2:482. Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) that are similar, divided by the total number of symbols in the shorter of the two sequences.
  • the preferred parameters for the GAP program are the default parameters, which do not impose a penalty for end gaps.
  • similarity is contrasted with the term “identity”. Similarity is defined as above; “identity”, however, means a nucleic acid or amino acid sequence having the same amino acid at the same relative position in a given family member of a gene family. Homology and similarity are generally viewed as broader terms than the term identity. Biochemically similar amino acids, for example leucine/isoleucine or glutamate/aspartate, can be present at the same position; these are not identical per se, but are biochemically “similar.” As disclosed herein, these are referred to as conservative differences or conservative substitutions. This differs from a conservative mutation at the DNA level, which changes the nucleotide sequence without making a change in the encoded amino acid, e.g., TCC to TCA, both of which encode serine.
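The distinction between identity and similarity can be made concrete with a small Python sketch that scores two already-aligned sequences, dividing by the length of the shorter sequence as in the GAP definition of similarity. This is not the GAP program: the alignment is assumed to be given, and the residue groupings in `SIMILAR_GROUPS` are a hypothetical choice for illustration (only leucine/isoleucine and glutamate/aspartate are named in the text).

```python
# Hypothetical groups of biochemically similar residues (illustrative assumption).
SIMILAR_GROUPS = [set("LIVM"), set("DE"), set("KR"), set("ST"), set("FYW"), set("NQ")]


def _similar(a: str, b: str) -> bool:
    """Identical residues, or residues sharing a group, count as similar."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Denominator: number of symbols in the shorter ungapped sequence.
    shorter = min(sum(c != gap for c in aligned_a), sum(c != gap for c in aligned_b))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns contribute to neither count
        identical += a == b
        similar += _similar(a, b)
    return 100.0 * identical / shorter, 100.0 * similar / shorter
```

For the aligned pair `LEAD` and `IEAE`, two of four positions are identical and all four are similar, so the identity falls below a 75% threshold even though every difference is a conservative substitution.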
  • DNA analog sequences are “substantially identical” to specific DNA sequences disclosed herein if: (a) the DNA analog sequence is derived from coding regions of the nucleic acid sequences shown in SEQ ID NOs: 1 and 3; or (b) the DNA analog sequence is capable of hybridization with DNA sequences of (a) under stringent conditions and encodes a biologically active HDHD4 gene product; or (c) the DNA sequences are degenerate as a result of alternative genetic code to the DNA analog sequences defined in (a) and/or (b).
  • Substantially identical analog proteins and nucleic acids will have between about 70% and 80%, preferably between about 81% and about 90%, or even more preferably between about 91% and 99.9% sequence identity with the corresponding sequence of the native protein or nucleic acid. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
  • unit cell means a basic parallelepiped shaped block. The entire volume of a crystal can be constructed by regular assembly of such blocks. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which adds cumulatively to form a crystal. Thus, the term “unit cell” means a fundamental portion of a crystal that is repeated infinitely by translation in three dimensions. A unit cell is characterized by three vectors a, b, and c, not located in one plane, which form the edges of a parallelepiped.
  • Angles α, β and γ define the angles between the vectors: angle α is the angle between vectors b and c; angle β is the angle between vectors a and c; and angle γ is the angle between vectors a and b.
  • the entire volume of a crystal can be constructed by associating a plurality of unit cells.
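The unit cell description above (three edge vectors a, b and c, with one angle between each pair of vectors) determines the cell volume through the standard parallelepiped formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The Python sketch below applies it; the function name and any cell dimensions passed to it are illustrative assumptions, since this section does not list cell dimensions for the HDHD4 crystals.

```python
import math


def unit_cell_volume(a: float, b: float, c: float,
                     alpha_deg: float, beta_deg: float, gamma_deg: float) -> float:
    """Volume of a parallelepiped unit cell.

    alpha is the angle between b and c, beta between a and c,
    gamma between a and b (all in degrees); edge lengths in any one unit.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    term = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if term <= 0.0:
        raise ValueError("angles do not describe a valid three-dimensional cell")
    return a * b * c * math.sqrt(term)
```

For an orthorhombic cell, where all three angles are 90°, the expression reduces to the product a·b·c; a triclinic cell uses the full formula.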
  • vector means a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.
  • Table 1 is a table showing structure coordinates describing the structure of VGlO HDHD4 (SEQ ID NO:4) in complex with magnesium and phosphate and/or vanadate atoms.
  • Table 2 is a table showing structure coordinates describing the structure of wild-type HDHD4 (SEQ ID NO:2) in complex with magnesium and vanadate atoms.
  • One HDHD4 polypeptide sequence comprises the amino acid sequence of SEQ ID NO:4 which is encoded by SEQ ID NO:3.
  • the asymmetric unit (which also equals the unit cell for space group P1) was determined to contain two independent HDHD4 monomers (51% solvent fraction).
  • HDHD4 polypeptide sequence comprises the amino acid sequence of SEQ ID NO:2 which is encoded by SEQ ID NO:1.
  • the symmetry was consistent with space group P2₁2₁2₁. Based on this unit cell and space group, the asymmetric unit was determined to contain three independent HDHD4 monomers (53% solvent fraction).
  • the crystalline form is a triclinic crystalline form and has a space group of P1 or P2₁2₁2₁.
  • the crystalline form is described by the structure coordinates of Table 1 or Table 2 and the three-dimensional structure of the crystallized complex is determined to a resolution of about 3.0 Å or better. In the crystalline forms, there were either two or three HDHD4 polypeptides in the unit cell.
  • the three-dimensional structure of the HDHD4 polypeptide in complex with magnesium and phosphate/vanadate and the three-dimensional structure of the HDHD4 polypeptide in complex with magnesium and vanadate were determined (shown in Figures 2, 4, and 5).
  • the HDHD4 X-ray crystal structures disclosed herein exhibit an “open” conformation with phosphate and Mg2+ bound or the inhibitor vanadate (IC50 ≈ 3 μM) and Mg2+ bound.
  • the phosphate-based crystallization conditions utilized phosphate at concentrations (0.8 - 1.8 M) much greater than the vanadate concentration (1.5 mM).
  • the active site is more accessible than the open conformation reported for β-phosphoglucomutase (Lahiri et al., (2002) Biochemistry 41:8351-8359; Lahiri et al., (2004) Biochemistry 43:2812-2820), also a member of subfamily I.
  • a HDHD4 polypeptide of the present invention can be prepared using any or a combination of technologies known to those of ordinary skill in the art.
  • a HDHD4 polypeptide is expressed in a recombinant system.
  • a HDHD4 polypeptide is isolated from a biological source.
  • a HDHD4 polypeptide is synthesized de novo. Further discussion of these methods is provided hereinbelow.
  • fragments of a HDHD4 polypeptide can be produced by direct peptide synthesis using solid phase techniques (Roberge et al., (1995) Science 269:202-204; Merrifield, (1963) J. Am. Chem. Soc. 85:2149-2154). Protein synthesis can be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the ABI 431A Peptide Synthesizer (Applied Biosystems, Foster City, California, USA). Various fragments of a HDHD4 polypeptide can be chemically synthesized separately and then combined using chemical methods to produce a full-length molecule.
  • sequences encoding a HDHD4 polypeptide can be synthesized in whole, or in part, using chemical methods known in the art (see, for example, Caruthers et al., (1980) Nucl. Acids Res. Symp. Ser. 215-223 and Horn et al., (1980) Nucl. Acids Res. Symp. Ser. 225-232; Hunkapiller et al., (1984) Nature 310:105-111; Creighton, Proteins, Structures and Molecular Principles, W.H. Freeman & Co., New York, New York, USA (1983), incorporated herein by reference).
  • a HDHD4 protein itself, or a fragment or portion thereof can be produced using chemical methods to synthesize the amino acid sequence of a HDHD4 polypeptide, or a fragment or portion thereof.
  • peptide synthesis can be performed using various solid-phase techniques (Roberge et al., (1995) Science 269:202-204; Merrifield, (1963) J. Am. Chem. Soc. 85:2149-2154) and automated synthesis can be achieved, for example, using the ABI 431A Peptide Synthesizer (Applied Biosystems, Foster City, California).
  • non-naturally occurring amino acids or chemical amino acid analogs can be introduced as a substitution or addition into the polypeptide sequence.
  • Non-naturally occurring amino acids include, but are not limited to, the D isomers of the common amino acids, 2,4-diaminobutyric acid, alpha-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, beta-alanine, alpha-alanine, fluoro-amino acids, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, trans-3-methylproline, 2,
  • the newly synthesized HDHD4 polypeptide or peptide can be substantially purified by preparative high performance liquid chromatography (see, e.g., Creighton, Proteins, Structures and Molecular Principles, W.H. Freeman & Co., New York, New York, USA (1983)), by reverse-phase high performance liquid chromatography (HPLC), or other purification methods as known and practiced in the art.
  • the composition of the synthetic peptides can be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure).
  • the amino acid sequence of a HDHD4 polypeptide, or any portion thereof can be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant polypeptide.
  • E. coli cells are cultured in the absence of a natural amino acid that is to be replaced (e.g., phenylalanine) and in the presence of a desired non-naturally occurring amino acid(s).
  • the non-naturally occurring amino acid is incorporated into the protein in place of its natural counterpart (Koide et al., (1994) Biochem. 33:7470-76).
  • Naturally occurring amino acid residues can be converted to non-naturally occurring species by in vitro chemical modification. Chemical modification can be combined with site-directed mutagenesis (as described herein) to further expand the range of substitutions (Wynn & Richards, (1993) Protein Sci. 2:395-403).
  • a HDHD4 polypeptide can be isolated from any suitable animal source, particularly from a mammal (e.g., from liver, brain, colon, breast or lung tissue). Methods for purifying a HDHD4 protein are known and can be employed to obtain a HDHD4 polypeptide as described herein.
  • a HDHD4 polypeptide can be isolated from a biological sample using standard protein purification methodology known to those of the art (see, e.g., Janson. Protein Purification: Principles, High Resolution Methods, and Applications, (2nd ed.) Wiley, New York, (1997); Rosenberg. Protein Analysis and Purification: Benchtop Techniques. Birkhauser, Boston, (1996); Walker.
  • HDHD4 polypeptide or peptide e.g., SEQ ID NOs: 2 or 4
  • the encoded polypeptide can be expressed.
  • a nucleotide sequence encoding a HDHD4 polypeptide, or a functional equivalent thereof can be inserted into an appropriate expression vector, i.e., a vector, which contains the necessary elements for the transcription and translation of the inserted coding sequence.
  • an expression vector contains an isolated and purified polynucleotide sequence encoding a HDHD4 polypeptide or a sequence as set forth in SEQ ID NOs: 2 and 4, encoding a HDHD4 polypeptide, respectively or a functional fragment thereof, in which the HDHD4 polypeptide comprises the amino acid sequence as set forth in SEQ ID NOs: 2 and 4.
  • an expression vector can contain the complement of a HDHD4 nucleic acid sequence.
  • Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids can be used in the present invention. Methods, which are known to those of ordinary skill in the art, can be used to construct expression vectors containing sequences encoding one or more HDHD4 polypeptides along with appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination.
  • the present invention also relates to expression vectors containing genes encoding analogs, derivatives and mutants of a HDHD4 polypeptide, including a modified HDHD4 proteins of the present invention, that have the same or homologous functional activity as a HDHD4 polypeptide, and homologs thereof.
  • Such cloning vectors can be prepared as described.
  • the production and use of derivatives, analogs and mutants related to HDHD4 are within the scope of the present invention.
  • Recombinant molecules can be introduced into host cells via transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (liposome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., (1992) J. Biol. Chem. 267:963-967; Wu & Wu.
  • the cloned gene can be contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired.
  • a shuttle vector which is a vector that can replicate in more than one type of organism, can be prepared for replication in both E. coli and Saccharomyces cerevisiae by linking sequences from an E. coli plasmid with sequences from a yeast plasmid.
  • the DNA sequence can then be inserted into an appropriate cloning vector and expressed in a host cell.
  • Any suitable vector-host systems known in the art can be employed in the present invention.
  • plasmids or modified viruses can be employed, but the vector system should be compatible with the host cell selected.
  • suitable vectors include, but are not limited to, plasmids, such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc.
  • the insertion into a cloning vector can be accomplished by ligating the DNA fragment into a cloning vector that comprises complementary cohesive termini.
  • any desired site can be produced by ligating nucleotide sequences (linkers) onto the DNA termini.
  • ligated linkers can comprise specific chemically synthesized oligonucleotides comprising a restriction endonuclease recognition sequence, encoding a protease site, a purification aid (such as a His tag, as was done in the present invention) or other desired feature.
  • a variety of host-expression vector systems can be utilized to express a DNA sequence encoding a HDHD4 polypeptide.
  • These include but are not limited to microorganisms such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing a DNA sequence encoding a HDHD4 polypeptide; yeast transformed with recombinant yeast expression vectors containing a DNA sequence encoding a HDHD4 polypeptide; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing a DNA sequence encoding a HDHD4 polypeptide; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, (CaMV); tobacco mosaic virus, (TMV)) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing a DNA sequence encoding a HDHD4 polypeptide; or
  • any of a number of suitable transcription and translation elements can be used in an expression vector.
  • inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like can be used.
  • promoters such as the baculovirus polyhedrin promoter can be used.
  • promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter) can be used.
  • SV40-, BPV- and Epstein-Barr (EBV)-based vectors can be used with an appropriate selectable marker. Representative methods of expressing a DNA sequence encoding a HDHD4 polypeptide are described herein.
  • Cultured mammalian cells are preferred hosts within the present invention.
  • Methods for introducing exogenous DNA into mammalian host cells include calcium phosphate-mediated transfection (Wigler et al., (1978) Cell 14:725; Corsaro & Pearson, (1981) Somat. Cell Genet. 7:603; Graham & Van der Eb, (1973) Virology 52:456), electroporation (Neumann et al., (1982) EMBO J.
  • Examples of suitable cultured mammalian cells include the COS-1 (ATCC No. CRL 1650), COS-7 (ATCC No. CRL 1651), BHK 570 (ATCC No. CRL 10314), 293 (ATCC No. CRL 1573; Graham et al., (1977) J. Gen. Virol. 36:59-72) and Chinese hamster ovary (e.g., CHO-K1; ATCC No. CCL 61 or DG44) cell lines. Additional suitable cell lines are known in the art and available from public depositories such as the American Type Culture Collection (ATCC), Manassas, Virginia.
  • a number of viral-based expression systems can be utilized.
  • sequences encoding a polypeptide of the present invention can be ligated into an adenovirus transcription/translation complex containing the late promoter and tripartite leader sequence. Insertion into a non-essential E1 or E3 region of the viral genome can be used to obtain a viable virus which is capable of expressing a HDHD4 polypeptide in infected host cells (see, e.g., Logan & Shenk, (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659).
  • transcription enhancers such as the Rous sarcoma virus (RSV) enhancer
  • Other expression systems can also be used, such as, but not limited to yeast, plant, and insect vectors.
  • yeast-based systems can be employed to express a recombinant polypeptide of the present invention.
  • Techniques for transforming yeast cells with exogenous DNA to produce recombinant polypeptides therefrom are disclosed by, for example, U.S. Patent Nos. 4,599,311; 4,931,373; 4,870,008; 5,037,743; and 4,845,075, which are incorporated herein by reference. Transformation systems for other yeasts, including Hansenula polymorpha, Schizosaccharomyces pombe, Kluyveromyces lactis, Kluyveromyces fragilis, Ustilago maydis, Pichia pastoris, Pichia guillermondii, and Candida maltosa are known in the art.
  • a preferred system utilizes Pichia methanolica (see, PCT Publication WO 97/17450).
  • Pichia methanolica (see, for example, Gleeson et al., (1986) J. Gen. Microbiol. 132:3459-3465 and U.S. Patent No. 4,882,279).
  • Aspergillus cells can be utilized according to the methods of U.S. Patent No. 4,935,349, which is incorporated herein by reference.
  • Methods for transforming Acremonium chrysogenum are disclosed in U.S. Patent No. 5,162,228, which is incorporated herein by reference.
  • Methods for transforming Neurospora are disclosed in U.S. Patent No. 4,486,533, which is incorporated herein by reference.
  • Bacterial systems can also be employed to express a recombinant polypeptide of the present invention.
  • a number of expression vectors can be selected, depending upon the use intended for the expressed HDHD4 polypeptide product. For example, when large quantities of expressed protein are needed for the generation of antibodies or for crystallization, vectors that direct high level expression of fusion proteins that can be readily purified can be used. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene, La Jolla, California, USA), in which the sequence encoding a polypeptide of interest can be ligated into the vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase, so that a hybrid protein is produced; pIN vectors (see, e.g., Van Heeke & Schuster, (1989) J. Biol. Chem. 264:5503-5509); and the like.
  • pGEX vectors (Promega, Madison, Wisconsin)
  • GST (glutathione S-transferase)
  • fusion proteins are soluble and can be easily purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione.
  • Proteins made in such systems can be designed to include, for example, heparin, thrombin, or Factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.
  • Host cells transformed with a nucleotide sequence encoding a polypeptide of the present invention can be cultured under conditions suitable for the expression and recovery of the protein from cell culture.
  • the protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used.
  • expression vectors containing a polynucleotide which encodes a polypeptide of the present invention can be designed to contain signal sequences which direct secretion of the polypeptide through a prokaryotic or eukaryotic cell membrane.
  • nucleic acid sequences encoding a polypeptide can be joined to a nucleotide sequence encoding a polypeptide domain that facilitates purification of soluble proteins.
  • purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals; protein A domains that allow purification on immobilized immunoglobulin; and the domain utilized in the FLAG® extension/affinity purification system (available from Immunex Corp., Seattle, WA).
  • cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen Corp., San Diego, California, USA) between the purification domain and the polypeptide can be used to facilitate purification.
  • One such expression vector provides for expression of a fusion protein containing a polypeptide of the present invention and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on immobilized metal ion affinity chromatography (IMAC) as described by Porath et al., (1992) Prot. Exp. Purif.
  • the enterokinase cleavage site provides a means for purifying the polypeptide from the fusion protein.
  • suitable vectors for fusion protein production (see Kroll et al., (1993) DNA Cell Biol. 12:441-453).
  • the presence of polynucleotide sequences encoding a polypeptide of the present invention can be detected by DNA-DNA or DNA-RNA hybridization, or by amplification using probes, portions, or fragments of polynucleotides encoding a polypeptide of the present invention.
  • Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the nucleic acid sequences encoding a polypeptide of the present invention to detect transformants containing DNA or RNA encoding the polypeptide.
  • The formation of HDHD4 crystals can depend on a number of different parameters, including pH, temperature, protein concentration, the nature of the solvent and precipitant, as well as the presence of ligands. Prior to the present disclosure, many routine crystallization experiments would be required to screen all these parameters for the few combinations that might generate a HDHD4 crystal suitable for X-ray diffraction analysis.
  • the native, analog, derivative and mutant co-crystals, and fragments thereof, disclosed in the present invention can be obtained by a variety of techniques, including batch, liquid bridge, vapor diffusion (e.g., sitting drop and hanging drop methods (see, e.g., Taylor et al., (1992) J. Mol. Biol.
  • a drop comprising about an amount of HDHD4 polypeptide is mixed with an equal volume of reservoir buffer and grown at about 20 °C until crystals form.
  • Methods for forming crystals are known in the art
  • Crystals can be prepared for diffraction using known methodology (see, e.g., Buhrke et al., A Practical Guide for the Preparation of Specimens for X-ray Fluorescence and X-ray Diffraction Analysis, Wiley-VCH, New York, New York, USA (1998), incorporated herein by reference).
  • crystals can be characterized by using X-rays produced in a conventional source (such as a sealed tube or a rotating anode) or using a synchrotron source. Methods of characterization include, but are not limited to, precision photography, oscillation photography and diffractometer data collection. Heavy atom derivatives such as produced with a mercurial, described herein, can be performed using imaging plates.
  • a HDHD4 polypeptide can be synthesized with selenium-methionine (Se-Met) in place of methionine, and the Se-Met multiwavelength anomalous dispersion data (Hendrickson, (1991) Science 254:51-58) can be collected at multiple X-ray wavelengths, corresponding to two remote points above and below the Se absorption edge (λ1 and λ4) and the absorption edge inflection point (λ2) and peak (λ3).
  • Selenium sites can be located using software adapted for that purpose, such as SHELXS-97 in Patterson search mode (Sheldrick (1990) Acta Cryst. A 46:467).
  • Experimental phases can be estimated via a multiple isomorphous replacement/anomalous scattering strategy using MLPHARE (Otwinowski, Daresbury Study Weekend proceedings, 1991) with, for example, three of the wavelengths treated as derivatives and one (λ2) treated as the parent.
  • data can be processed using HKL, DENZO and SCALEPACK (Otwinowski & Minor, Methods Enzymol. 276(A):307-326, (Carter, Jr. & Sweet, eds.), Academic Press, New York, New York, USA (1997)).
  • X-PLOR (Brunger, (1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale University Press, New Haven, Connecticut; Accelrys, San Diego, California) or HEAVY (Terwilliger, Los Alamos National Laboratory, Los Alamos, New Mexico) can be utilized for bulk solvent correction and B-factor scaling. After density modification and non-crystallographic averaging, the protein is built into an electron density map using the program O (Jones et al., (1991) Acta Cryst. A47:110-119).
  • Model building interspersed with positional and simulated annealing refinement can facilitate an unambiguous trace and sequence assignment of a HDHD4 polypeptide or fragment. Additional data collection methods, as well as general crystallographic methods, will be known to those of ordinary skill in the art upon consideration of the present disclosure (see, e.g., McRee, Practical Protein Crystallography, (2nd ed.) Academic Press, San Diego, California, USA (1999), incorporated herein by reference).
  • the three-dimensional structure of the polypeptide can be determined by analyzing the diffraction data. Such an analysis can be employed whether the polypeptide is a wild-type polypeptide or a fragment thereof, or a mutant, derivative or analog of a HDHD4 polypeptide.
  • X-ray diffraction data can be solved by employing available software packages, such as O (Jones et al., (1991) Acta Cryst. A47:110-119); FRODO (Jones et al., (1978) J. Appl. Crystallogr. 11:268-272) and TURBO FRODO; X-PLOR
  • the present invention therefore provides a method for determining the three-dimensional structure of a crystallized HDHD4 polypeptide, optionally in complex with a ligand, and optionally in complex with, or in further complex with, one or more metal-comprising moieties, such as vanadate, phosphate, tungstate, magnesium or manganese to a resolution of about 3.0 Å or better.
  • the method comprises: (a) crystallizing a HDHD4 polypeptide in complex with a metal-comprising moiety to form a crystallized complex; and (b) analyzing the crystallized complex to determine a three-dimensional structure of the HDHD4 polypeptide in complex with a metal-comprising moiety.
  • the crystallization can be carried out using the present disclosure as a guide.
  • various vapor diffusion techniques can be employed to generate a crystalline form of a HDHD4 polypeptide (including mutants, derivatives, etc.) in complex with a metal-comprising moiety.
  • the analyzing can be carried out as described hereinabove and can include collecting and processing X-ray diffraction data, which can then provide a three-dimensional structure of the crystallized molecule(s). The same method can be employed to determine the three dimensional structure of a crystallized HDHD4 polypeptide.
  • a crystal comprising a HDHD4 polypeptide or fragment can also comprise a ligand.
  • crystals can be formed by co-crystallizing a HDHD4 polypeptide, analog, derivative, mutant or functional equivalent with a ligand known or suspected to bind to the HDHD4 polypeptide.
  • Such a co-crystal can be formed by employing the techniques disclosed herein and known to those of ordinary skill in the art.
Formation of a Derivative Crystal
  • determining a structure of a crystallized polypeptide can be difficult and time consuming.
  • derivative crystals comprising a heavy atom can be generated.
  • the method comprises: (a) providing a crystalline form; and (b) associating a heavy atom with the crystalline form.
  • the association can be carried out by soaking the crystal with a solution containing a heavy atom (e.g., a mercurial).
  • the heavy atoms preferably should not change the structure of the molecule or of the crystal cell, i.e., the crystals should be isomorphous. Isomorphous replacement is usually done by diffusing different heavy-metal complexes into the channels of the preformed protein crystals.
  • the crystalline form can comprise, for example, a HDHD4 polypeptide.
  • a crystal is usually soaked in a solution containing heavy metal atom salts, or organometallic compounds, e.g., lead chloride, gold thiomalate, thiomersal or uranyl acetate, which can diffuse through the crystal and bind to the surface of the protein.
  • the protein molecules expose side chains (such as SH groups) into these solvent channels that are able to bind heavy metals.
  • the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. This map is then interpreted as a polypeptide chain of a particular amino acid sequence. Following this stage of the process, the polypeptide chain is oriented with respect to the observed electron density and an initial model can then be built (see, e.g., Blundell & Johnson, Protein Crystallography, Academic Press, New York, New York, USA (1976); McRee, Practical Protein Crystallography, (2nd ed.) Academic Press, San Diego, California, USA (1999), both of which are incorporated herein by reference).
  • the HDHD4 structural coordinates set forth herein can be used to aid in obtaining structural information about another crystallized molecule or molecular complex that is structurally homologous to a HDHD4 polypeptide.
  • the present invention allows a determination of at least a portion, if not all, of the three-dimensional structure of a molecule or a molecular complex that contains one or more structural features that are similar to structural features of a HDHD4 polypeptide, as revealed by the structure coordinates provided herein. These molecules are referred to herein as "structurally homologous" to HDHD4.
  • the present invention also provides HDHD4 polypeptides that are structurally homologous to the polypeptides of SEQ ID NOs:2 and 4, and/or the polypeptides encoded by SEQ ID NOs: 1 and 3, and orthologs thereof.
  • Compounds that are structurally homologous can be formulated to mimic key portions of a HDHD4 structure. Such compounds are structural homologs.
  • the generation of a structurally homologous protein can be achieved by the techniques of modeling and chemical design known to those of skill in the art and described herein. Modeling and chemical design of HDHD4 structural equivalents can be based on the structure coordinates of a crystalline HDHD4 polypeptide of the present invention. It will be understood that all such structurally homologous constructs fall within the scope of the present invention.
  • Structural homologs can include, for example, regions of amino acid identity, conserved active site or binding site motifs, and similarly arranged secondary structural elements (e.g., α-helices and β-sheets).
  • Structural homology can be determined by aligning the residues of two amino acid sequences to optimize the number of identical amino acids along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of identical amino acids, although the amino acids in each sequence must nonetheless remain in their proper order.
  • two amino acid sequences are compared using the BLASTP program, version 2.0.9, of the BLAST 2 search algorithm (as described by Tatusova et al., (1999) FEMS Microbiol. Lett. 174:247-50. See also Altschul et al., (1986) Bull. Math. Bio. 48:603-616 and Henikoff & Henikoff, (1992) Proc. Natl. Acad.
  • a structurally homologous molecule comprises a protein that has an amino acid sequence sharing at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity with a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO: 1) or a His-tagged HDHD4 amino acid sequence (e.g., SEQ ID NO:4, and/or a polypeptide encoded by SEQ ID NO:3).
  • Percent sequence identity is calculated as: (the total number of identical matches) divided by (the length of the longer sequence plus the number of gaps introduced into the longer sequence in order to align the two sequences) x 100%.
  • Structurally homologous proteins and polypeptides are generally defined as having one or more amino acid substitutions, deletions or additions from a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO: 1).
  • a protein that is structurally homologous to HDHD4 comprises at least one contiguous stretch of at least 50 amino acids that shares at least 80% amino acid sequence identity with the analogous portion of a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO: 1).
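The percent identity calculation described above can be sketched as follows. This is a minimal illustration, assuming the two sequences have already been aligned (for example by BLASTP) and are supplied as equal-length strings with "-" marking gaps; it also assumes the conventional reading in which the number of identical matches is divided by the length of the longer sequence plus the gaps introduced into it. The function name and the gap character are hypothetical and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences (assumed equal length)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Identical matches: same residue in both rows, gaps never count.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )

    # Ungapped lengths decide which sequence is the longer one.
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    longer_row = aligned_a if len_a >= len_b else aligned_b

    # Denominator: length of the longer sequence plus gaps introduced into it.
    denominator = max(len_a, len_b) + longer_row.count(gap)
    if denominator == 0:
        raise ValueError("sequences contain no residues")

    return 100.0 * matches / denominator
```

A result from such a function could then be compared against the identity thresholds recited above (for example, at least 65% or at least 80%).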
  • Methods for generating structural information about the structurally homologous molecule or molecular complex are known and include, for example, molecular replacement techniques, as described herein.
  • the present invention encompasses structural equivalents of HDHD4 polypeptides.
  • Various computational analyses can be used to determine whether a molecule (or a binding pocket portion thereof) is "structurally equivalent," in terms of its three-dimensional structure, to all or part of a HDHD4 polypeptide or its binding pocket(s).
  • Such analyses can be carried out in current software applications, such as the Molecular Similarity application of QUANTA (Molecular Simulations Inc., San Diego, California, USA) version 4.1, and as described in the accompanying User's Guide.
  • the Molecular Similarity application permits comparisons between different structures, different conformations of the same structure, and different parts of the same structure.
  • the procedure used in Molecular Similarity to compare structures is divided into four steps: (1) load the structures to be compared; (2) define the atom equivalences in these structures; (3) perform a fitting operation; and (4) analyze the results.
  • Each structure is identified by a name.
  • One structure is identified as the target
  • because atom equivalency within QUANTA is defined by user input, for the purpose of this invention equivalent atoms are defined as protein backbone atoms (N, Cα, C, and O) for all conserved residues between the two structures being compared.
  • a conserved residue is defined as a residue that is structurally or functionally equivalent. Only rigid fitting operations are considered.
  • the working structure is translated and rotated to obtain an optimum fit with the target structure.
  • the fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the moving structure, such that the root mean square difference of the fit over the specified pairs of equivalent atoms is an absolute minimum. This number, given in angstroms, is reported by QUANTA.
  • Representative structurally equivalent molecules or molecular complexes are those that are defined by the entire set of structure coordinates listed in Table 1 or Table 2, ± a root mean square deviation from the conserved backbone atoms of those amino acids of not more than about 1.5 Å. In another embodiment, the root mean square deviation is about 1.0 Å or less.
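The root mean square deviation test described above can be illustrated with a short sketch. It assumes that the two structures have already been superimposed (the optimum rotation and translation step performed by a program such as QUANTA is not reproduced here) and that the equivalent backbone atoms are supplied as two lists of (x, y, z) tuples in the same order. The function names and the default cutoff argument are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation over paired atoms, in the input units."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_structurally_equivalent(
    coords_a: Sequence[Point],
    coords_b: Sequence[Point],
    cutoff: float = 1.5,  # angstroms; 1.0 corresponds to the stricter embodiment
) -> bool:
    """True when the pre-superimposed backbones differ by no more than cutoff."""
    return rmsd(coords_a, coords_b) <= cutoff
```

Because no fitting is performed, the value returned is an upper bound on the RMSD that an optimal rigid superposition would report.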
  • a functional equivalent means a polypeptide that has an amino acid sequence that is substantially identical to a HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO: 1) and exhibits the same biological activity as these polypeptides (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate), regardless of the polypeptide's sequence length or composition.
  • a “functional equivalent” encompasses any compound capable of mediating an effect substantially identical to that mediated by HDHD4. It is further understood that minor modifications of the primary amino acid sequence of a HDHD4 polypeptide might result in proteins that have substantially equivalent or enhanced function as compared to an unmodified HDHD4 polypeptide. Such a minor modification might affect the overall charge, hydrophobicity, etc. of a modified HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO: 1), while maintaining one or more biological activities of the modified protein compared with the wild-type. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental such as through mutation in hosts. All of these modifications are included as long as the ability to transfer a phosphate group, or bind a given ligand is retained. These types of modifications can be considered to be conservative mutations.
  • a three-dimensional structure of a HDHD4 has been solved and the corresponding structure coordinates form an aspect of the present invention.
  • the structure coordinates can be used in various applications, such as the design and identification of ligands and modulators of HDHD4, as described herein.
  • machine-readable media refers to any media that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media such as floppy discs, hard disc storage medium and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. Further details regarding machine-readable media and systems for displaying data contained on machine-readable media are provided.
  • the present invention provides a machine-readable data storage medium comprising a data storage material encoded with machine-readable data comprising all or any part of structure coordinates of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is a structure defined by structure coordinates that describe conserved residue backbone atoms having a root mean square deviation of not more than about 2.0 Å from the conserved residue backbone atoms described in Table 1 or Table 2.
  • the present invention provides a machine-readable data storage medium comprising a data storage material encoded with machine readable data comprising all or any part of a set of structure coordinates of a HDHD4 polypeptide (Table 1 or Table 2).
  • the machine-readable data storage media of the present invention can be used in a computer.
  • the computer is preferably adapted to produce a three-dimensional representation of a HDHD4 polypeptide, and comprises various components, including the machine-readable storage medium, used to produce the three-dimensional representation.
  • the present invention further provides a computer system capable of producing a three-dimensional representation of all or any part of a HDHD4 polypeptide, wherein said computer system comprises: (a) a machine-readable data storage medium comprising a data storage material encoded with machine readable data comprising all or any part of a set of structure coordinates of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is a structure defined by structure coordinates that describe backbone atoms having a root mean square deviation of not more than about 2.0 Å from the backbone atoms described by the structure coordinates of Table 1 or Table 2; (b) a working memory for storing instructions for processing the machine-readable data; (c) a central-processing unit coupled to the working memory and to the machine-readable data storage medium for processing the machine readable data into the three-dimensional representation; and (d) a display coupled to the central-processing unit for displaying the three-dimensional representation.
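As an illustration of processing machine-readable structure coordinates, the sketch below reads backbone atoms from coordinate records. It assumes, purely for illustration, that the coordinates are stored as ATOM records in the fixed-column Protein Data Bank (PDB) text format; the layout of Table 1 and Table 2 is not reproduced in this passage, so that format is an assumption, and the function name is hypothetical.

```python
from typing import Dict, Tuple

# Backbone atom names used when comparing structures (N, C-alpha, C, O).
BACKBONE_ATOMS = {"N", "CA", "C", "O"}

AtomKey = Tuple[str, int, str]          # (chain id, residue number, atom name)
Point = Tuple[float, float, float]      # Cartesian x, y, z in angstroms


def parse_backbone(pdb_text: str) -> Dict[AtomKey, Point]:
    """Collect backbone atom coordinates from PDB-format ATOM records."""
    coords: Dict[AtomKey, Point] = {}
    for line in pdb_text.splitlines():
        # Skip anything that is not an ATOM record or is too short to hold x, y, z.
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        atom_name = line[12:16].strip()      # columns 13-16
        if atom_name not in BACKBONE_ATOMS:
            continue
        chain_id = line[21]                  # column 22
        residue_number = int(line[22:26])    # columns 23-26
        x = float(line[30:38])               # columns 31-38
        y = float(line[38:46])               # columns 39-46
        z = float(line[46:54])               # columns 47-54
        coords[(chain_id, residue_number, atom_name)] = (x, y, z)
    return coords
```

The resulting dictionary keys make it straightforward to pair equivalent backbone atoms from two structures before computing a root mean square deviation.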
  • the present invention also provides a computer system as described above wherein the machine-readable data comprises
  • the structure coordinates are preferably Cartesian coordinates, polar coordinates, or internal coordinates. Most preferably said structure coordinates are Cartesian coordinates.
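The relationship between Cartesian and internal coordinates can be sketched as below: a bond length, a bond angle and a dihedral (torsion) angle are computed from the Cartesian positions of two, three and four atoms. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and the sign of the dihedral follows from the atan2 formula used here rather than from any convention stated in the disclosure.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def _sub(a: Point, b: Point) -> Point:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Point, b: Point) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Point, b: Point) -> Point:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(a: Point, b: Point) -> float:
    """Bond length between atoms a and b."""
    d = _sub(a, b)
    return math.sqrt(_dot(d, d))


def angle(a: Point, b: Point, c: Point) -> float:
    """Bond angle a-b-c in degrees, with b at the vertex."""
    v1, v2 = _sub(a, b), _sub(c, b)
    cos_theta = _dot(v1, v2) / (distance(a, b) * distance(c, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def dihedral(a: Point, b: Point, c: Point, d: Point) -> float:
    """Torsion angle a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Applied along a polypeptide backbone, these three quantities are the usual building blocks of an internal coordinate description.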
  • the structure coordinates can be those determined for a HDHD4 polypeptide to which a ligand is bound or to which no ligand is bound.
  • the structure coordinates can be those determined for a HDHD4 polypeptide that is in monomer, dimer, or other form.
  • the present invention, which comprises, in part, the structure coordinates of Table 1 or Table 2, has broad-based utility and can be employed in many applications. Representative applications include modulator design, mutant design and screening operations. These and other applications are described herein.
  • the HDHD4 structure coordinates of the present invention facilitate structure-based or rational drug design and virtual screening to design or identify potential ligands and/or modulators of a HDHD4 polypeptide.
  • the structural features of the ligand binding site of a HDHD4 polypeptide, as described by the structure coordinates herein, provide insights into the HDHD4 binding site that, prior to the present invention, were unknown and could not be effectively modeled. An understanding of these features facilitates structure-based modulator design and virtual screening at a level of efficiency unattainable prior to the present invention.
  • a three dimensional model of a HDHD4 polypeptide can be used to identify structural and chemical features that might be involved in binding of ligands to a binding site of a HDHD4 polypeptide. Identified structural or chemical features can then be employed to design ligands or modulators of a HDHD4 polypeptide or identify test molecules as ligands or modulators of a HDHD4 polypeptide.
  • Those of ordinary skill in the art can employ one of several methods to screen chemical entities or fragments for their ability to associate with a HDHD4 polypeptide, or a structurally similar polypeptide, and in embodiments comprising the individual binding site(s) of a HDHD4 polypeptide.
  • This process can begin by visual inspection of, for example, the active site on the computer screen based on the structural coordinates provided herein in Table 1 or Table 2 or the structural coordinates of a model generated using the structural coordinates of Table 1 or Table 2.
  • Selected candidate modulators, which can be fragments or complete chemical entities, can then be positioned in a variety of orientations, or docked, with a HDHD4 polypeptide (for example in a binding site) as described hereinabove.
  • Docking can be accomplished using software such as QUANTA, SYBYL, Flo, DOCK, GOLD or FLEXX, and followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM and AMBER.
  • Once a candidate modulator has been designed or selected, the efficiency with which that candidate modulator associates ("docks") with a HDHD4 polypeptide can be tested and optimized by computational evaluation. For example, a compound that has been designed or selected to function as an inhibitor should spatially fit into a binding site when it is associated with a HDHD4 polypeptide.
  • Docking can be performed manually or using a variety of software, including but not limited to, DOCK (Kuntz et al., (1994) Acc. Chem. Res. 27:117; Gschwend & Kuntz, (1996) J. Comp. Aided Mol. Des. 10:123; Kuntz, (1982) J. Mol. Biol. 161:269-288), GOLD (Cambridge Crystallographic Data Center, Cambridge, UK), Flo (Thistlesoft, Colebrook, Connecticut), QUANTA (Accelrys, San Diego, California), SYBYL (Tripos, St. Louis, Missouri) or FLEXX (Tripos, St. Louis, Missouri).
  • a docking operation can involve analyzing structural and chemical feature complementarity between a structure (e.g., a HDHD4 polypeptide) and a candidate modulator.
  • Such an analysis can include (a) quantifying features of atomic components found within a ligand molecule and protein molecule (e.g., charge, size, shape, polarizability, hydrophobicity, etc.), and (b) quantifying interactions between such features in the ligand molecule, the protein molecule and the protein/ligand complex, as determined using any number of approaches known in the art (e.g., molecular mechanics, force fields and/or quantum mechanics).
  • Analyzing structural and chemical feature complementarity can, for example, be performed visually or by scoring functions based on computed ligand-site interactions as implemented in DOCK, GOLD, Flo, or COMBIFLEXX (Tripos, St. Louis, Missouri).
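  • As a rough illustration of a scoring function based on computed ligand-site interactions, the following Python sketch sums a 12-6 Lennard-Jones term and a Coulomb term over all ligand/site atom pairs. The atom tuples, the parameter values (epsilon, sigma, dielectric) and the function name are assumptions chosen for illustration only; they are not taken from DOCK, GOLD, Flo, COMBIFLEXX or any other program named herein.

```python
import math

COULOMB_K = 332.06  # kcal*Angstrom/(mol*e^2), a commonly used conversion constant


def pair_score(ligand, site, epsilon=0.1, sigma=3.5, dielectric=4.0):
    """Sum Lennard-Jones and Coulomb terms over every ligand/site atom pair.

    Each atom is a tuple (x, y, z, partial_charge), coordinates in Angstroms.
    Lower (more negative) scores indicate more favorable interactions.
    All parameter defaults are hypothetical.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for sx, sy, sz, sq in site:
            r = math.dist((lx, ly, lz), (sx, sy, sz))
            sr6 = (sigma / r) ** 6
            total += 4.0 * epsilon * (sr6 * sr6 - sr6)        # van der Waals term
            total += COULOMB_K * lq * sq / (dielectric * r)   # electrostatic term
    return total
```

  • In such a sketch, a pose in which oppositely charged atoms face each other scores lower than the same pose with like charges, which is the kind of complementarity the analysis is intended to capture.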
  • a three-dimensional structure comprising, all or any part of, a HDHD4 polypeptide, as disclosed herein, is provided.
  • a candidate modulator can be docked into a binding site of a HDHD4 polypeptide (e.g., a ligand binding site comprising D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65, according to the structural coordinates of Table 1 or Table 2), i.e., a docking operation can be performed in silico between a candidate modulator and a HDHD4 polypeptide.
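  • By way of a hedged example, the binding-site residues listed above can be pulled out of a coordinate file before docking. The sketch below assumes the coordinates are available as PDB-format ATOM records (the layout of Table 1 and Table 2 is not reproduced in this section), and the function name is hypothetical; only the residue numbers come from the text.

```python
# Residue numbers of the ligand binding site as listed in the text.
BINDING_SITE = {12, 13, 14, 15, 18, 20, 21, 22, 23, 24, 25, 28, 53, 54, 56,
                57, 58, 59, 72, 100, 104, 108, 131, 132, 133, 134, 137, 138,
                141, 163, 164, 189, 190, 193, 194}
OPTIONAL_SITE = {60, 61, 62, 63, 64, 65}


def binding_site_atoms(pdb_lines, include_optional=False):
    """Return (atom_name, residue_name, residue_number, x, y, z) tuples
    for ATOM records whose residue number belongs to the binding site.

    Assumes fixed-column PDB ATOM records (an assumption about the input).
    """
    wanted = (BINDING_SITE | OPTIONAL_SITE) if include_optional else BINDING_SITE
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_seq = int(line[22:26])          # residue sequence number column
        if res_seq in wanted:
            atoms.append((line[12:16].strip(), line[17:20].strip(), res_seq,
                          float(line[30:38]), float(line[38:46]),
                          float(line[46:54])))
    return atoms
```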
  • a test molecule can be designed based on HDHD4 binding site features disclosed herein. After docking, the test molecule can be analyzed for structural and chemical feature complementarity with all or any part of a HDHD4 polypeptide. Structural and chemical features include, but are not limited to, any one of the following: van der Waals interactions, hydrogen bonding interactions, charge interactions, hydrophobic interactions, and dipole interactions.
  • a docking operation can be performed as part of a modulator design process or it can be performed to learn more about how a given ligand associates or might associate with a given structure.
  • the present invention also provides a method of docking a ligand, modulator or candidate modulator with a structure.
  • the method comprises positioning a candidate modulator into a binding site, or any part of a binding site, of a HDHD4 polypeptide, wherein the binding site is described by the structure coordinates of Table 1 or Table 2.
  • the method can further comprise analyzing structural and chemical feature complementarity of the candidate modulator with all or any part of a binding site of a HDHD4 polypeptide.
  • a three-dimensional structure disclosed herein or a three-dimensional model created using methods known in the art including, but not limited to, using software such as INSIGHT II (Accelrys, Inc., San Diego, CA), SYBYL (Tripos Associates, St. Louis, Missouri), and Flo (Thistlesoft, Colebrook, Connecticut), and the coordinates disclosed herein in Table 1 or Table 2 can be employed in a docking operation as a step of modulator design.
  • Computer software programs can be employed to assist in the process of selecting a candidate modulator.
  • Representative computer software programs include, but are not limited to:
  • suitable chemical entities or fragments can be assembled into a single compound or inhibitor. Assembly can proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of, for example, a HDHD4 polypeptide in accordance with Table 1 or Table 2 or a model built using the disclosed structure coordinates of a HDHD4 polypeptide. This inspection can be followed by manual model building using software suitable for this purpose, such as QUANTA (Tripos, St. Louis, Missouri), SYBYL (Tripos, St. Louis, Missouri), LOOK/GENEMINE (Celera, Rockville, Maryland), HOMOLOGY (Tripos, St. Louis, Missouri), or INSIGHT II (Accelrys, San Diego, California).
  • Useful programs to aid one of skill in the art in connecting the individual chemical entities or fragments include:
  • a modulator can be designed as a whole or de novo using either an empty active site or optionally including some portion(s) of a known modulator(s).
  • Software that can be employed in a de novo design effort includes:
  • An effective modulator preferably exhibits a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding). Therefore, an efficient modulator preferably exhibits a deformation energy of binding of not greater than about 10 kcal/mole, preferably, not greater than about 7 kcal/mole.
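  • The deformation-energy criterion above amounts to simple arithmetic, sketched below in Python. The function names and the labels returned are hypothetical; the 7 and 10 kcal/mole thresholds are the approximate values stated in the text, and the energies themselves would come from whatever energy calculation is employed.

```python
def deformation_energy(bound_energy, free_energy):
    """Energy of the bound conformation minus energy of the free state (kcal/mole)."""
    return bound_energy - free_energy


def rate_deformation(bound_energy, free_energy):
    """Classify a candidate by its deformation energy of binding."""
    delta = deformation_energy(bound_energy, free_energy)
    if delta <= 7.0:
        return "preferred"      # not greater than about 7 kcal/mole
    if delta <= 10.0:
        return "acceptable"     # not greater than about 10 kcal/mole
    return "unfavorable"
```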
  • Computer software is available in the art to evaluate compound deformation energy and electrostatic interaction. Examples of programs designed for such uses include:
  • the above referenced software packages can be employed to perform various energy calculations with respect to a given modulator-polypeptide system.
  • An energy analysis can take into account non-complementary (e.g., electrostatic) interactions including repulsive charge-charge, dipole-dipole and charge-dipole interactions.
  • the present invention provides a method of designing a modulator of a HDHD4 polypeptide comprising: (a) modeling all or any part of a HDHD4 polypeptide binding site; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of the HDHD4 binding site; wherein the HDHD4 polypeptide binding site is defined by the structure coordinates of Table 1 or Table 2.
  • a candidate modulator can then be synthesized and tested for modulation ability in a suitable assay.
  • a candidate modulator can be designed manually without the aid of computer software, either de novo or by employing a portion of a known ligand as a starting point.
  • a candidate modulator can be designed employing computer software either de novo or employing a portion of a known or suspected ligand as a starting point, as described herein.
  • computer software can comprise or access a database from which candidate modulators (or discrete chemical elements) are chosen (e.g., CAVEAT), based on an evaluation of the model.
  • the candidate modulator can be designed to fit spatially into all or any part of a HDHD4 polypeptide binding site, and a ligand binding site can be described generally by the structure coordinates of Table 1 or Table 2 and more specifically by the structure coordinates of amino acids comprising D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65.
  • the method can further comprise: (c) docking the designed candidate modulator into all or any part of the HDHD4 polypeptide binding site; and (d) analyzing the structural and/or chemical feature complementarity of the candidate modulator with all or any part of the HDHD4 polypeptide binding site. Additional description of docking and docking operations is provided herein.
  • the method can also comprise analyzing structural and chemical feature complementarity of a second chemical entity with all or any part of a HDHD4 polypeptide, such as when the modeling operation grows a ligand in place. The analysis can be computational and take into account energy considerations, surface charges, hydrophobicity, etc., or it can be simply a visual inspection.
  • the present invention provides a method of designing a modulator of a HDHD4 polypeptide comprising: (a) designing a potential modulator of a HDHD4 polypeptide that will make interactions with amino acids in a ligand binding site of the HDHD4 polypeptide, based upon a crystalline structure comprising a HDHD4 polypeptide in complex with a ligand; (b) synthesizing the modulator; and (c) determining whether the potential modulator modulates the activity of the HDHD4 polypeptide, whereby a modulator of a HDHD4 polypeptide is designed.
  • the crystalline structure can be analyzed as described herein and the determining can be carried out by employing an assay as described herein.
  • the present invention also provides a method of designing a modulator of a target polypeptide that is structurally similar to a HDHD4 polypeptide comprising: (a) modeling all or any part of a HDHD4 polypeptide; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of a HDHD4 polypeptide binding site; wherein the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2. Due to the structural similarity between the HDHD4 polypeptide and the target polypeptide, a modulator designed to associate with a HDHD4 polypeptide would be expected to associate with the target polypeptide, since both polypeptides are similar in size, composition, shape, etc.
  • the candidate modulator can be designed to fit spatially into all or any part of a HDHD4 binding site.
  • a candidate modulator can be designed manually without the aid of computer software, either de novo or by employing a portion of a known ligand as a starting point.
  • a candidate modulator can be designed employing computer software either de novo or employing a portion of a known ligand as a starting point.
  • computer software can comprise or access a database from which candidate modulators (or discrete chemical elements) are chosen (e.g., CAVEAT), based on an evaluation of the model.
  • the method can further comprise: (c) docking the chemical entity into all or any part of the HDHD4 binding site; and (d) analyzing the structural and chemical feature complementarity of the candidate modulator with all or any part of a HDHD4 polypeptide, such as a binding site. Additional description of docking and docking operations is provided herein.
  • the method can also comprise analyzing structural and chemical feature complementarity of a second chemical entity with all or any part of a HDHD4 polypeptide, such as when the modeling operation grows a ligand in place.
  • the analysis can be computational and take into account energy considerations, surface charges, hydrophobicity, etc., or it can be simply a visual inspection.
  • HDHD4 polypeptide as provided herein might be similar in structure to other proteins. Modulators that lack specificity for a given protein might adversely affect other proteins. Thus, it is desirable to be able to employ a modulator that is specific for a given protein, regardless of structural similarity. Using the structural coordinates of the present invention, such a selective modulator can be designed.
  • the present invention provides a method of designing a modulator that selectively modulates the activity of a HDHD4 polypeptide to the exclusion of other proteins comprising: (a) evaluating a three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand; and (b) synthesizing a potential modulator based on the three-dimensional structure of the crystallized HDHD4 polypeptide in complex with a ligand.
  • Methods of evaluating a three-dimensional structure are provided herein and synthetic pathways for a potential modulator will depend on the composition of the modulator itself.
  • the structure coordinates of the present invention can also be employed in the refinement of an existing HDHD4 polypeptide modulator.
  • desirable properties of the modulator can be enhanced.
  • the present invention also provides a method of increasing the efficiency of a modulator of a HDHD4 polypeptide comprising: (a) providing a first ligand having a known effect on the biological activity of a HDHD4 polypeptide; (b) modifying the first ligand based on an evaluation of a three-dimensional structure of a HDHD4 polypeptide to form a modified ligand; (c) synthesizing the modified ligand; and (d) determining an effect of the modified ligand on a HDHD4 polypeptide, wherein the efficiency of a modulator of a HDHD4 polypeptide is increased if the modified ligand favorably alters a biological activity of a HDHD4 polypeptide with respect to the biological activity of the first ligand.
  • Various structural and/or chemical features of all or any part of a HDHD4 polypeptide can be identified using a three-dimensional representation (e.g., a HDHD4 crystal structure or a generated model) of all or any part of a HDHD4 polypeptide.
  • amino acids that are suspected to be involved in an association with a modulator, or an amino acid sequence (for example, residues comprising a binding site), can be identified.
  • Such an identification can be carried out by techniques known in the art and described herein, such as by employing software suitable for that purpose as disclosed herein (e.g., DOCK, GOLD, Flo or LEAPFROG).
  • an aspect of the present invention is a method of identifying structural and/or chemical features of all or any part of a HDHD4 polypeptide.
  • the HDHD4 polypeptide is described by the structure coordinates according to Table 1 or Table 2.
  • the structural and/or chemical features of a HDHD4 polypeptide binding site (e.g., amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65) are identified.
  • the present invention also provides a method of identifying structural features of a HDHD4 polypeptide that can be employed in the design of a modulator that selectively modulates the activity of a HDHD4 polypeptide to the exclusion of other structurally similar but non-identical proteins.
  • the method comprises providing a three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand and a three-dimensional test structure comprising a structurally similar but non-identical protein.
  • the HDHD4 polypeptide structure can comprise the coordinates of Table 1 or Table 2, for example.
  • a HDHD4 polypeptide structure need not be exactly described by (e.g., identical to) the coordinates of Table 1 or Table 2, since HDHD4 functional equivalents are also encompassed by the present invention.
  • the backbone residues of the HDHD4 structure are overlaid onto the test structure.
  • This operation can be carried out manually, for example by fixing the position of one structure (e.g., the test structure(s)) and visually orienting the other structure (e.g., HDHD4) relative to the fixed structure.
  • computer software, such as INSIGHT II, can be employed to perform the overlap consistent with user-selected criteria.
  • Structural features of the HDHD4 that do not overlap the test structure to a desired degree are then identified. The identifying can comprise, for example, a visual inspection of the overlapped structures or a quantitative comparison.
  • the identifying can comprise one or more computational evaluations of the overlapped structures, which can be performed by employing commercially available computer software known to those of ordinary skill in the art.
  • Such an evaluation can comprise, for example, an energy analysis, surface analysis, or charge analysis of one or both structures.
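  • One simple form of quantitative comparison of overlapped structures is a per-residue distance check, sketched below in Python. The sketch assumes the two structures have already been superposed and that equivalent residues share a label; the function name and the 2.0 Å cutoff are hypothetical and stand in for the "desired degree" of overlap.

```python
import math


def divergent_residues(reference_ca, test_ca, cutoff=2.0):
    """Return {residue_label: distance} for residues whose C-alpha positions
    differ by more than `cutoff` Angstroms between two superposed structures.

    Both inputs map a residue label to an (x, y, z) tuple. Residues missing
    from the test structure are skipped.
    """
    divergent = {}
    for label, xyz in reference_ca.items():
        if label in test_ca:
            distance = math.dist(xyz, test_ca[label])
            if distance > cutoff:
                divergent[label] = distance
    return divergent
```

  • Residues returned by such a check are candidates for the structural features that distinguish HDHD4 from the test structure.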
  • the method can be employed alone or in conjunction with other methods described herein.
  • the method can be employed as a precursor to modulator design.
  • the method can be employed to enhance the specificity of a modulator for HDHD4, or, in other embodiments, even for a protein other than HDHD4.
  • a first stage of a modulator design process can comprise computer-based in silico screening of compound databases (such as the Cambridge Structural Database) in order to identify a compound predicted to interact with a target molecule.
  • Various screening selection criteria can be employed and can account for pharmacokinetic properties such as metabolic stability and toxicity.
  • the structure coordinates provided herein, which include coordinates describing a HDHD4 binding site, allow a set of selection criteria for a potential modulator to be identified.
  • Virtual screening methods, i.e., methods of evaluating the potential of chemical entities to bind to a given protein or portion of a protein, are known in the art. These methods often employ databases as sources of candidate modulators and often are employed in designing modulators. Often these methods begin by visual inspection of a binding site of a target polypeptide on the computer screen. Selected candidate modulators can then be placed, i.e., docked, in one or more positions and orientations within the binding site and chemical and structural feature complementarity can be analyzed.
  • Databases of chemical entities that may be used include, but are not limited to, ACD (Molecular Designs Limited, San Leandro, California), Aldrich (Aldrich Chemical Company), NCI (National Cancer Institute), Maybridge (Maybridge Chemical Company Ltd), CCDC (Cambridge Crystallographic Data Center, Cambridge, UK), CAST (Chemical Abstract Service) and Derwent (Derwent Information Limited).
  • a virtual screening approach can include, but is not limited to, the following steps:
  • a second candidate modulator adapted to join with or replace the docked candidate modulator and fit spatially into all or any part of a HDHD4 binding site comprising amino acid residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65;
  • the present invention provides a method for evaluating the potential of a chemical entity to bind to all or any part of a HDHD4 polypeptide or a structurally similar molecule comprising: (a) docking a candidate modulator into all or any part of a HDHD4 polypeptide described by the structure coordinates of Table 1 or Table 2; and (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 polypeptide.
  • HDHD4 polypeptide binding site comprising amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65, in the docking of step (a).
  • binding residues of a HDHD4 polypeptide can be employed in the method.
  • the candidate modulator can be selected from a database.
  • the method can further comprise a step in which a second candidate modulator is joined to the first candidate modulator that was docked and analyzed, and the resultant candidate modulator is docked and analyzed.
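  • The dock-and-analyze steps described above can be sketched as a ranking loop over a database of candidate modulators. In this Python sketch the database is a plain dictionary and the scoring function is supplied by the caller; both, along with the function name, are assumptions for illustration, and the sketch does not itself perform docking.

```python
def screen_candidates(database, score_fn, keep=10):
    """Score every candidate and return the `keep` best as (name, score) pairs.

    `database` maps a candidate name to whatever representation `score_fn`
    accepts. Lower scores are treated as better, as with an interaction energy.
    """
    scored = [(score_fn(molecule), name) for name, molecule in database.items()]
    scored.sort()                       # ascending: most favorable first
    return [(name, score) for score, name in scored[:keep]]
```

  • A second candidate modulator joined to a top-ranked hit would simply be added to the database under a new name and scored in the same way.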
  • Candidate modulators designed or identified using the methods described herein can then be synthesized and screened in a HDHD4 binding assay, or in an assay designed to test functional activity.
  • assays useful in screening of potential ligands or modulators include, but are not limited to, screening in silico, in vitro assays and high throughput assays.
  • candidate modulators can be screened, using computational means and biological assays, to identify ligands and modulators of a HDHD4 polypeptide.
  • the invention provides a method for identifying a modulator of a HDHD4 polypeptide.
  • the method comprises the following steps, which are preferably, but not necessarily, performed in the order given: (a) docking a candidate modulator into all or any part of a HDHD4 polypeptide binding site, wherein the HDHD4 polypeptide binding site is described by the structure coordinates of Table 1 or Table 2; (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 polypeptide binding site; (c) synthesizing the candidate modulator; and (d) screening the candidate modulator in a biological assay for the ability to modulate a HDHD4 polypeptide.
  • a candidate modulator is identified as a modulator of HDHD4 if the structural and chemical feature complementarity and the modulation exceed a desired level.
  • a compound that stimulates or inhibits a measured activity in a cellular assay by greater than 10% is identified as a preferred modulator.
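  • The 10% criterion above can be expressed as a short calculation, sketched here in Python. The function names and the use of a single control reading as the baseline are assumptions for illustration; a real assay would typically involve replicates and statistics not shown here.

```python
def percent_change(control_activity, treated_activity):
    """Signed percent change of the measured activity relative to control."""
    return 100.0 * (treated_activity - control_activity) / control_activity


def is_preferred_modulator(control_activity, treated_activity, threshold=10.0):
    """True if the compound stimulates or inhibits activity by more than
    `threshold` percent (10% in the text)."""
    return abs(percent_change(control_activity, treated_activity)) > threshold
```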
  • the method can further comprise one or more of the following steps: (e) screening the candidate modulator in an assay that characterizes binding to a HDHD4 polypeptide; and (f) screening the candidate modulator in an assay that characterizes binding to a HDHD4 polypeptide.
  • a modulator of a HDHD4 polypeptide can induce one or more of the following activities of HDHD4 presented in this non-inclusive list: (a) a HDHD4 modulator can transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate.
  • "all or any part of a HDHD4 polypeptide" preferably relates to enough of a HDHD4 polypeptide binding site so as to be useful in docking or modeling a ligand into the binding site, although it is not necessary to employ a complete HDHD4 polypeptide.
  • a HDHD4 polypeptide binding site comprises the following residues: D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2.
  • "all or any part of a HDHD4 polypeptide" can also relate to structural elements not found in a binding site, however.
  • structure coordinates that define two identical or almost identical shapes can vary slightly. If variations are within an acceptable standard error as compared to the original coordinates, the resulting three-dimensional shape is considered to be equivalent.
  • a ligand that is bound to the structure defined by the structure coordinates of the HDHD4 according to Table 1 or Table 2 would also be expected to bind to a site having a shape that fell within the acceptable error.
  • sites with structures falling within an acceptable standard error are also within the scope of this invention.
  • a three dimensional model can be constructed on the basis of the known structure of a homologous protein (see, e.g., Greer, (1991) Methods Enzymol. 202:239-52; Greer, (1990) Proteins 7(4):317-34; Cardozo et al., (1995) Proteins 23(3):403-14; Sali, (1995) Curr. Opin. Biotechnol. 6(4):437-51; Birkholtz et al., (2003) Proteins 50(3):464-73).
  • a homology model can be constructed by first identifying a protein (e.g., a HDHD4 polypeptide) or part of a protein (e.g., a HDHD4 polypeptide binding site) of known structure which is similar to the protein or part of the protein without known structure. Next, an alignment is performed and can be accomplished using such programs as the MODELLER module found in INSIGHT II (Accelrys, Inc., San Diego, California, USA), WHAT IF (Rodriguez et al, (1998) CABIOS 14:523-528), or 3D-JIGSAW (Bates et al, (2001) Proteins Supp. 5:39-46).
  • a method of constructing a homology model consistent with the present invention comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is described in whole or in part by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of a HDHD4 polypeptide; and (d) generating a structure of the target protein based on the analysis.
  • This and related methods and processes are described more fully herein below.
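  • The alignment step (b) above can be illustrated with a minimal global sequence alignment. The Python sketch below uses the Needleman-Wunsch dynamic programming method with a simple match/mismatch/gap scheme; this is a generic textbook technique offered as an assumption, not the procedure used by MODELLER, WHAT IF or 3D-JIGSAW, and the scoring values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

  • Aligned positions without gaps indicate which template residues could supply coordinates for the target in step (d); gapped positions would need to be built by other means.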
  • the structure of a target protein can be determined using the structure coordinates of a HDHD4 polypeptide as a starting point.
  • a method of determining the structure of a target protein for which little or no structural information is known forms an aspect of the present invention.
  • the method can comprise: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is described in whole or in part by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4 polypeptide; (d) generating a structure of the target protein based on the analysis; and (e) analyzing the generated structure to determine the structure of a target protein for which a structure is not known.
  • This and related methods and processes are described more fully herein below.
  • Various computational analyses can be employed to determine whether a molecule or a portion thereof is sufficiently similar to all or a part of a template (e.g., a molecule of known structure, such as a HDHD4 polypeptide binding site, which is described by the structure coordinates of Table 1 or Table 2) to be considered equivalent.
  • Such analyses can be carried out in software applications, such as INSIGHT II (Accelrys Inc., San Diego, California, USA) as described in the User's Guide, or software applications available in the SYBYL software suite (Tripos, St. Louis, Missouri, USA).
  • the fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the template structure, such that the root mean square difference of the fit over the specified pairs of equivalent atoms is an absolute minimum. This number, given in angstroms (Å), is reported by INSIGHT II.
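  • A fitting operation of this kind can be sketched with the Kabsch algorithm, a standard least-squares superposition method. Whether INSIGHT II uses this particular algorithm is not stated herein, so the Python sketch below (which assumes NumPy is available, and whose function name is hypothetical) is offered only as one way to compute the minimized root mean square difference over paired atoms.

```python
import numpy as np


def superposed_rmsd(template, target):
    """RMSD (same units as input) after optimally translating and rotating
    `template` onto `target`. Both are N x 3 arrays of paired atom positions."""
    P = np.asarray(template, dtype=float)
    Q = np.asarray(target, dtype=float)
    # Optimal translation: move both centroids to the origin.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the 3 x 3 covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    fitted = Pc @ R.T
    return float(np.sqrt(((fitted - Qc) ** 2).sum(axis=1).mean()))
```

  • Applied to a structure and a rotated, translated copy of itself, such a routine returns a value near zero, which is why the fitted RMSD is a useful measure of shape equivalence.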
  • Three-dimensional coordinates give the location of the centers of all atoms in a protein molecule and are typically expressed as Cartesian coordinates (e.g., distances in three directions, each perpendicular to the other), or polar coordinates
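  • The relationship between the two coordinate conventions can be shown with a short conversion, sketched in Python below. The sketch uses spherical polar coordinates (radius, polar angle from the z axis, azimuthal angle in the x-y plane, angles in radians); this particular convention and the function names are assumptions, since the text does not specify one.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Return (r, theta, phi) for a point given in Cartesian coordinates."""
    r = math.sqrt(x * x + y * y + z * z)
    # Clamp the ratio to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    phi = math.atan2(y, x)
    return r, theta, phi


def spherical_to_cartesian(r, theta, phi):
    """Inverse of cartesian_to_spherical."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))
```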
  • Variations in coordinates can also be generated due to mathematical manipulations of the structure coordinates.
  • the HDHD4 structure coordinates set forth in Table 1 or Table 2 could be manipulated by fractionalization of the structure coordinates, integer additions or subtractions to sets of the structure coordinates, inversion of the structure coordinates or any combination of the above.
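  • That such manipulations change the numbers but not the described shape can be checked by comparing interatomic distances, as in the Python sketch below. The helper names and the tolerance are hypothetical. Note that inversion preserves all distances but produces the mirror-image arrangement, so a distance comparison alone does not distinguish the two hands.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def translate(coords, shift):
    """Add a constant (e.g., integer) offset to every coordinate triple."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in coords]


def invert(coords):
    """Invert every coordinate triple through the origin."""
    return [(-x, -y, -z) for x, y, z in coords]


def same_distances(a, b, tol=1e-6):
    """True if two coordinate sets have matching interatomic distances."""
    da, db = pairwise_distances(a), pairwise_distances(b)
    return len(da) == len(db) and all(abs(p - q) <= tol for p, q in zip(da, db))
```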
  • the structure coordinates of an actual X-ray structure of a protein would be expected to have some variation from the homology model of that very same protein. For example, the location of sidechains might vary to some extent.
  • Variations in structure coordinates can be due to mutations, additions, substitutions, and/or deletions of amino acids of a protein being studied. Variations in structure coordinates can also be due to variations in proteins whose shape is being described by the structure coordinates given. For example, rigid fitting operations conducted between a HDHD4 polypeptide and a closely-related protein known to have similar structure and function can yield root mean square deviations (RMSD) in a conserved residue backbone atom comparison. These RMSDs could be greater if other variation factors described above were present in the calculations. Proteins from non-human species may also have slight variations in shape from that of the HDHD4 defined by the structure coordinates of Table 1 or Table 2.
  • an analysis can be carried out involving one or more mathematical constructs.
  • Representative mathematical constructs include, but are not limited to: energy calculations for a given geometry of a molecule utilizing forcefields or ab initio methods known in the art; energy minimization using gradients of the energy calculated as atoms are shifted so as to produce a lower energy; conformational searching, i.e., locating local energy minima; molecular dynamics wherein a molecular system (single molecule or ligand/protein complex) is propagated forward through increments of time according to Newtonian mechanics using techniques known to the art; calculations of molecular properties such as electrostatic fields, hydrophobicity and lipophilicity; calculation of solvent-accessible or other molecular surfaces and rendition of the molecular properties on those surfaces; comparison of molecules using either atom-atom correspondences or other criteria such as surfaces and properties; quantitative structure-activity relationships in which molecular features or properties dependent upon them are correlated with activity or bio-assay data.
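  • Of the constructs listed above, energy minimization using gradients is perhaps the simplest to illustrate. The Python sketch below performs steepest descent on a one-dimensional 12-6 Lennard-Jones energy between two atoms; the potential form, the parameter values, the step size and the function names are all assumptions chosen for illustration and do not represent any particular forcefield named herein.

```python
def lj_energy(r, epsilon=0.2, sigma=3.4):
    """12-6 Lennard-Jones energy (kcal/mole) at separation r (Angstroms)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=0.2, sigma=3.4):
    """Derivative of lj_energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_separation(r0, step=0.5, tol=1e-8, max_iter=10000):
    """Shift the separation down the energy gradient until the gradient
    magnitude falls below `tol` or `max_iter` steps have been taken."""
    r = r0
    for _ in range(max_iter):
        g = lj_gradient(r)
        if abs(g) < tol:
            break
        r -= step * g
    return r
```

  • For this potential the minimum lies at a separation of 2^(1/6) times sigma, which gives a convenient analytical check on the numerical result.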
  • the computer system on which a modeling operation is being carried out then generates the structural details of one or more regions in which a potential ligand binds (e.g., a HDHD4 polypeptide binding site) so that complementary structural and chemical features of the potential ligands can be determined.
  • Design in these modeling systems is generally based upon the compound being capable of structurally and chemically associating with the protein, i.e., having structural and chemical feature complementarity.
  • the compound must be able to assume a conformation that allows it to associate with the protein.
  • Some modeling and design systems estimate the potential inhibitory or binding effect of a potential modulator prior to actual synthesis and testing. Using modeling, compounds may be designed de novo using an empty binding site.
  • compounds may be designed including some portion of a known ligand, i.e., grown in place.
  • the known ligand may have been determined through virtual screening.
  • Programs for design include, but are not limited to, LUDI (Bohm (1992) J. Comp. Aid. Mol. Design 6:61-78; Accelrys, San Diego, California, USA), LEAPFROG (Tripos Associates, St. Louis, Missouri, USA) and DOCK (Kuntz et al. (1994) Acc. Chem. Res. 27:117; Gschwend & Kuntz (1996) J. Comp. Aided Mol. Des. 10:123; Kuntz (1982) J. Mol. Biol. 161:269-288).
  • This refinement step can be dependent on the nature and results of any analysis carried out as a component of the alignment process. For example, if energy considerations are not taken into account during the alignment process a generated structure might benefit from further refinement. Conversely, if an alignment process is extensive in its treatment, subsequent refinement of the structure might not be necessary or might be only minimal in scope.
  • the present invention provides for the formation of a homology model comprising all or any part (e.g., a binding site) of a HDHD4 polypeptide.
  • the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2.
  • a model of a HDHD4 polypeptide of the present invention can be any type of art-recognized model, including, but not limited to, three-dimensional models and steric/electrostatic field definition models that can be used to study/compute the putative interactions ligands might undergo.
  • a three-dimensional model can be produced through use of structure coordinates, and can be represented in any of a variety of forms, such as ribbon diagrams or wireframe models.
  • mutant includes one or more amino acid deletions, insertions, inversions, repeats, or substitutions as compared to a native protein (e.g., a HDHD4 polypeptide).
  • a native protein e.g., a HDHD4 polypeptide
  • a mutant can have the same, similar, or altered biological activity as compared to the native protein.
  • a HDHD4 polypeptide mutant can have at least 25% sequence identity, at least about 50% sequence identity, at least about 75% sequence identity, or preferably at least about 95%, 96%, 97%, 98%, or 99% sequence identity to a wild-type HDHD4 polypeptide (e.g., SEQ ID NO:2 encoded by SEQ ID NO:1).
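One simple way to express sequence identity as a percentage is sketched below. It assumes the two sequences have already been aligned (gaps written as "-") and counts identical residues over the aligned columns; other definitions of identity exist, and the short peptides shown are hypothetical rather than taken from SEQ ID NO:2.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, skipping columns that are gaps in both."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical aligned fragments differing at one of eight positions.
wild_type = "MDLDNGSR"
mutant = "MDLANGSR"
print(percent_identity(wild_type, mutant))
```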
  • the structural coordinates of the present invention can be employed in the design of a mutant HDHD4 polypeptide or fragment thereof.
  • the structural coordinates describe, in one aspect, various structural features of a HDHD4 polypeptide. Those of ordinary skill in the art can employ this understanding of the HDHD4 structure to select one or more amino acid residues for mutation.
  • the rationale for selecting a residue can be based on a steric, chemical or other consideration.
  • the present invention provides for the generation of HDHD4 mutants, and the ability to solve the crystal structures of those that crystallize. Further, desirable sites for mutation can be identified, based on analysis of the three-dimensional HDHD4 structural coordinates provided herein.
  • the present invention provides a method of designing a mutant comprising making one or more amino acid mutations in a HDHD4 polypeptide.
  • the mutant so designed can comprise a complete HDHD4 polypeptide or a portion thereof, such as a ligand binding site.
  • a mutant comprises an addition, a deletion or a substitution of one or more of the amino acids of a HDHD4 polypeptide binding site.
  • One embodiment of a method of designing a mutation comprises: (a) selecting a property of a HDHD4 polypeptide to be investigated; (b) providing a three-dimensional structure of a HDHD4 polypeptide; and (c) evaluating the structure to identify a residue known or suspected to relate to the selected property. The steps of the method can be repeated a desired number of times.
  • a property of a HDHD4 polypeptide to be investigated is selected.
  • Example properties include ligand binding, overall or local charge, overall or local hydrophobicity, folding, overall or local secondary or tertiary structure, elimination or formation of an epitope, or catalysis.
  • Other properties can also be investigated and a combination of properties can be investigated with a single mutation.
  • a three-dimensional structure of a HDHD4 is provided.
  • the three-dimensional structure can be described by all or a part of the structure coordinates of Table 1 or Table 2.
  • the HDHD4 can comprise all or a part of the amino acid sequence of SEQ ID NO:2.
  • the structure is then evaluated to identify a residue known or suspected to relate to the selected property.
  • the evaluating can be of any form and can be dependent on the nature of the property being investigated.
  • the evaluating can start with the substitution (or the addition or deletion) of one or more residues for one or more HDHD4 polypeptide residues.
  • substitution(s) is performed (for example by employing software used to display the three-dimensional structure)
  • a visual inspection of the three-dimensional structure as it is displayed on a computer screen can be performed.
  • the effect of a given mutation on the structure and/or property of a HDHD4 polypeptide can be determined by visual inspection.
  • the evaluating can comprise one or more calculations to determine the effect of a given substitution. For instance, an energy minimization operation can be performed to energy minimize a mutant HDHD4 polypeptide structure. Further, calculations can be performed that can quantitatively assess the effect of a given mutation on the charge, hydrophobicity, etc., either locally or globally. The overall energy of the structure can also be calculated. After performing the method steps, the effect of a mutation can be determined.
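The kind of calculation described above can be sketched as follows: a substitution is applied to a sequence, and the net formal charge and mean hydropathy are compared before and after. The hydropathy values are the Kyte-Doolittle scale; the charge model (Asp and Glu as -1, Lys and Arg as +1, His and termini ignored), the helper names, and the peptide are simplifying assumptions for illustration only.

```python
# Kyte-Doolittle hydropathy index per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed formal side-chain charges near neutral pH (His treated as neutral).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}

def net_charge(seq):
    return sum(CHARGE.get(aa, 0) for aa in seq)

def mean_hydropathy(seq):
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def substitute(seq, position, new_aa):
    """Return seq with the residue at 1-based position replaced by new_aa."""
    return seq[:position - 1] + new_aa + seq[position:]

wild_type = "GDLDNKAR"                    # hypothetical peptide
mutant = substitute(wild_type, 2, "A")    # Asp -> Ala at position 2

print(net_charge(wild_type), net_charge(mutant))
print(mean_hydropathy(wild_type), mean_hydropathy(mutant))
```

Removing an aspartate raises both the net charge and the mean hydropathy, which is the sort of local effect that could then be weighed against the property under investigation.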
  • the mutant can be synthesized and subjected to further analysis (e.g., ligand binding assays, activation assays, etc., as described herein). If a mutation does not yield a desired result, the steps of the method can be repeated a desired number of times.
  • a desired result e.g., an effect on a property of HDHD4 that is being investigated
  • a mutation can be in a ligand binding site or in the area of a ligand binding site.
  • a mutation can comprise a residue selected, for example, from the group consisting of D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2 in a HDHD4 polypeptide, or a residue that is spatially near these residues (which can be determined from an inspection of the structure coordinates of Table 1 or Table 2).
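Residues that are spatially near a set of binding-site residues could be found with a simple distance search, sketched below. The function name, the 8 Å cutoff, the use of one representative atom per residue, and the coordinates are all assumptions made for illustration; real coordinates would be read from Table 1 or Table 2.

```python
import math

def residues_near(site_labels, ca_coords, cutoff=8.0):
    """Labels of non-site residues within `cutoff` angstroms of any site residue.

    ca_coords maps a residue label to one representative (x, y, z) position.
    """
    near = set()
    for label, pos in ca_coords.items():
        if label in site_labels:
            continue
        for site in site_labels:
            if site in ca_coords and math.dist(pos, ca_coords[site]) <= cutoff:
                near.add(label)
                break
    return sorted(near)

# Hypothetical positions; "X90" and "X150" are placeholder residue labels.
ca_coords = {
    "D12": (0.0, 0.0, 0.0),
    "D14": (5.0, 0.0, 0.0),
    "X90": (0.0, 6.0, 0.0),
    "X150": (40.0, 0.0, 0.0),
}

print(residues_near({"D12", "D14"}, ca_coords))
```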
  • the method can comprise using all or part of a model of a HDHD4 polypeptide to visualize all or part of a HDHD4 polypeptide in its mutated or native form.
  • the model is a three-dimensional model.
  • a mutation can be introduced into a HDHD4 polypeptide amino acid sequence (e.g., a mutation designed using structure coordinates of the present invention, such as by a method disclosed herein) by any method known to those of skill in the art, including site-directed mutagenesis of DNA encoding a HDHD4 polypeptide.
  • a mutation can be introduced, for example, by employing common DNA amplification methods using primers to introduce and amplify alterations in the DNA template, such as PCR methods that employ primers comprising a desired mutation.
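The effect of a primer-encoded codon change on the encoded protein can be previewed in silico, as in the sketch below. It builds the standard genetic code, replaces one codon in a template, and translates the result; the template sequence and helper names are hypothetical, and the sketch does not model primer design or the PCR itself.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate complete codons of an in-frame coding sequence."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def mutate_codon(dna, codon_number, new_codon):
    """Replace the 1-based codon `codon_number` with `new_codon`."""
    start = 3 * (codon_number - 1)
    if codon_number < 1 or len(new_codon) != 3 or start + 3 > len(dna):
        raise ValueError("codon out of range or not three bases long")
    return dna[:start] + new_codon + dna[start + 3:]

template = "ATGGATCTGGATAAC"                  # hypothetical template
missense = mutate_codon(template, 2, "GCT")   # GAT (Asp) -> GCT (Ala)
silent = mutate_codon(template, 2, "GAC")     # GAT (Asp) -> GAC (Asp)

print(translate(template), translate(missense), translate(silent))
```

The second substitution changes the DNA but not the protein, which is an example of a silent change produced by a synonymous codon.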
  • Non-naturally occurring variants can be produced using known mutagenesis techniques, including, but not limited to, oligonucleotide mediated mutagenesis, alanine scanning, PCR mutagenesis, site directed mutagenesis (see, e.g., Carter et al, (1986) Nucl Acids Res. 13:4331; and Zoller et al, (1982) Nucl Acids Res. 10:6487), cassette mutagenesis (see, e.g., Wells et al, (1985) Gene 34:315), restriction selection mutagenesis (see, e.g., Wells et al. , (1986) Philos. Tr. R. Soc.
  • phage display e.g., Lowman et al. (1991) Biochem. 30: 10832-10837; U.S. Patent No. 5,223,409; PCT Publication WO 92/06204
  • region-directed mutagenesis e.g., region-directed mutagenesis
  • site-directed mutagenesis techniques employ a phage vector that has single- and double-stranded forms, such as M13 phage vectors.
  • Other suitable vectors comprising a single-stranded phage origin of replication can also be employed in a site-directed mutagenesis protocol (see, e.g., Veira et al., (1987) Meth. Enzymol. 15:3).
  • a mutant designed by a method of the present invention that has the same or similar biological activity as the native HDHD4 polypeptide or a native portion thereof can be useful for any purpose for which the native is useful.
  • a mutant designed by a method of the present invention that has altered biological activity from that of the native can be useful in binding assays to test the ability of a potential ligand to bind to or associate with a HDHD4 polypeptide.
  • a mutant designed by a method of the present invention that has an altered biological activity from the native can be useful in further elucidating the biological role and mechanism of action of HDHD4.
  • the present invention provides a mutant HDHD4 polypeptide, or a mutant portion thereof, comprising one or more amino acid mutations, additions or deletions in a wild-type HDHD4 polypeptide.
  • a mutant portion of a HDHD4 polypeptide can comprise a mutant binding site, such as that described herein.
  • a mutation comprises five or fewer substitutions, deletions or insertions, four or fewer substitutions, deletions or insertions, three or fewer substitutions, deletions or insertions, two or fewer substitutions, deletions or insertions, or one substitution, deletion or insertion.
  • a substitution can be a conservative amino acid substitution, a discussion of which is provided herein, although non-conservative substitutions, deletions and additions can also be performed and form aspects of the present invention.
  • HDHD4 polypeptide derivatives, analogs and mutants, as described herein, can be made by altering encoding nucleic acid sequences by substitutions (e.g., replacing a given residue with another residue), additions or deletions that can provide for functionally equivalent or specifically modified HDHD4 polypeptides.
  • nucleotide coding sequences that encode substantially the same amino acid sequence as a nucleic acid encoding a modified HDHD4 polypeptide, or a fragment thereof, of the present invention can be used in the practice of the present invention.
  • DNA sequences that encode substantially the same amino acid sequence as a nucleic acid encoding a modified HDHD4 polypeptide, or a fragment thereof, of the present invention can be used in the practice of the present invention.
  • These include but are not limited to allelic genes, homologous genes from other species, which are altered by the substitution of different codons that encode the same amino acid residue within the sequence, thus producing a silent change.
  • a modified HDHD4 polypeptide derivative of the present invention can include, but is not limited to, derivatives containing, as a primary amino acid sequence, all or part of the amino acid sequence of a HDHD4 polypeptide, including altered sequences in which functionally equivalent amino acid residues are substituted for residues within the sequence resulting in a conservative amino acid substitution.
  • one or more amino acid residues within the sequence can be substituted by another amino acid of a similar polarity, hydrophobicity, charge, etc. which acts as a functional equivalent, resulting in a silent alteration.
  • Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs.
  • the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine.
  • Amino acids containing aromatic ring structures are phenylalanine, tryptophan, and tyrosine.
  • the polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine.
  • the positively charged (basic) amino acids include arginine, lysine and histidine.
  • the negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
  • initial substitutions are conservative, i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original group.
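The amino acid classes listed above can be encoded directly to check whether a proposed substitution stays within a class. This is a minimal sketch using one-letter codes; treating a substitution as conservative when the two residues share at least one listed class is an assumption made here, since some residues (for example tyrosine) appear in more than one class.

```python
# Classes as listed in the text, written with one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "aromatic": set("FWY"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}

def shared_classes(original, replacement):
    """Names of the classes that contain both residues, in alphabetical order."""
    return sorted(
        name for name, members in CLASSES.items()
        if original in members and replacement in members
    )

def is_conservative(original, replacement):
    """True if the two different residues share at least one class (assumed rule)."""
    return original != replacement and bool(shared_classes(original, replacement))

print(is_conservative("D", "E"))   # both acidic
print(is_conservative("D", "K"))   # acidic versus basic
print(shared_classes("F", "W"))    # members of two classes
```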
  • Non-conserved amino acid substitutions can also be introduced to impart a preferred property to a protein.
  • a Cys can be introduced to provide a potential site for disulfide bridges with another Cys.
  • a His can be introduced as a particular "catalytic" site (i.e., His can act as an acid or base and is a common amino acid in biochemical catalysis).
  • Pro can be introduced which, because of its particularly planar structure, induces β-turns in protein structure.
  • mutants include chimeric and fusion proteins. Such chimeras or fusion proteins can include, for example, a secretion signal or an additional heterologous functional region.
  • a region of additional amino acids can be added to the N-terminus of the polypeptide to improve stability and persistence in the host cell, during purification, or during subsequent handling and storage.
  • peptide moieties can be added to the polypeptide to facilitate purification. Such regions may be removed prior to final preparation of the polypeptide.
  • the addition of peptide moieties to polypeptides to engender secretion or excretion, to improve stability and to facilitate purification, among others, are familiar and routine techniques in the art.
  • One common example of a fusion protein comprises a heterologous region from immunoglobulin that is useful to solubilize proteins.
  • Mutagenesis methods as disclosed herein can be combined with high-throughput, automated screening methods to detect activity of cloned, mutagenized polypeptides in host cells.
  • Mutagenized DNA molecules that encode active polypeptides e.g., cell proliferation
  • These methods allow the rapid determination of the importance of individual amino acid residues in a polypeptide of interest, and can be applied to polypeptides of unknown structure.
  • One method that can be employed for the purpose of solving additional HDHD4 crystal structures is molecular replacement (see generally, The Molecular Replacement Method (Rossmann, ed.), Gordon & Breach, New York, New York (1972)).
  • the general approach of molecular replacement is to employ a known structure (e.g., a HDHD4 structure of the present invention) as a template from which an unknown structure can be derived.
  • a known structure e.g., a HDHD4 structure of the present invention
  • structural element common to certain domains which can relate to certain primary structure motifs
  • Phases can then be calculated from this model and combined with the observed X-ray diffraction pattern amplitudes to generate an electron density map of the structure whose coordinates are unknown.
  • This in turn, can be subjected to well-known model building and structure refinement techniques to provide a final, accurate structure of the unknown crystallized molecule or molecular complex.
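The combination of model phases with amplitudes to give a density map can be illustrated with a one-dimensional toy Fourier synthesis. In the sketch below a hypothetical two-atom "cell" supplies both the amplitudes (standing in for observed data) and the phases (standing in for phases calculated from a placed model); in a real molecular replacement the amplitudes would come from the diffraction experiment and the map would be three-dimensional.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) for a 1D cell; atoms is a list of (fractional_x, weight)."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density(amplitudes, phases, n_grid):
    """Fourier synthesis of density on a grid from amplitudes and phases."""
    rho = []
    for k in range(n_grid):
        x = k / n_grid
        total = sum(
            amplitudes[h] * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
            for h in amplitudes
        )
        rho.append(total.real)
    return rho

# Hypothetical model: a heavier atom at x = 0.2 and a lighter one at x = 0.6.
model = [(0.2, 2.0), (0.6, 1.0)]
F = structure_factors(model, h_max=10)
amplitudes = {h: abs(v) for h, v in F.items()}       # stand-in for observed |F|
phases = {h: cmath.phase(v) for h, v in F.items()}   # phases from the model

rho = density(amplitudes, phases, n_grid=100)
peak = max(range(100), key=rho.__getitem__)
print(peak / 100)
```

The highest density falls at the position of the heavier atom, with a smaller peak at the lighter one; the series is truncated at a hypothetical resolution limit (h_max), which is what produces ripples around the peaks.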
  • Software useful for carrying out a molecular replacement solution includes AMoRe (Navaza & Saludjian (1997) Methods Enzymol. 276A: 581-94).
  • the structure coordinates of the present invention can be employed in determining the three-dimensional structure of a protein for which a structure is not known, or in determining the three-dimensional structure of regions of a protein for which only a partial structure is available.
  • Modulators designed using a structure of the present invention can be used to modulate HDHD4 activity.
  • the present invention provides a method of modulating a HDHD4 polypeptide comprising administering a modulator of a HDHD4 polypeptide in an amount sufficient to modulate a HDHD4 polypeptide, wherein the modulator of the HDHD4 polypeptide is a ligand known or suspected to bind to a HDHD4 polypeptide or was identified by a method comprising: (i) docking a test molecule into all or any part of a HDHD4 binding site; (ii) analyzing the structural and chemical feature complementarity between the test molecule and all or any part of the HDHD4; and (iii) screening the test molecule in a biological assay of modulation of the HDHD4.
  • a test molecule is identified as a modulator of a HDHD4 polypeptide if the structural and chemical feature complementarity and the modulation exceed a desired level.
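The selection rule described above, in which a test molecule must exceed a desired level on both complementarity and modulation, can be sketched as a simple filter. The score fields, thresholds, and compound names below are hypothetical; how complementarity is scored and how modulation is measured are left to the docking program and the biological assay.

```python
def select_modulators(candidates, min_complementarity, min_modulation):
    """Names of test molecules whose two scores both exceed the desired levels."""
    return [
        c["name"]
        for c in candidates
        if c["complementarity"] > min_complementarity
        and c["modulation"] > min_modulation
    ]

# Hypothetical test molecules with made-up scores on a 0-1 scale.
candidates = [
    {"name": "cmpd-1", "complementarity": 0.82, "modulation": 0.65},
    {"name": "cmpd-2", "complementarity": 0.91, "modulation": 0.10},
    {"name": "cmpd-3", "complementarity": 0.40, "modulation": 0.90},
]

print(select_modulators(candidates, min_complementarity=0.7, min_modulation=0.5))
```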
  • the method can further comprise the following step of: (b) screening the test molecule in an assay that characterizes binding to a HDHD4 polypeptide.
  • the binding site can be described, for example, by the structure coordinates of amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2, according to Table 1 or Table 2.
  • the methods of the present invention can be practiced in vitro or in vivo.
  • the methods can employ any number of art-recognized in vitro systems.
  • In vivo methods include, but are not limited to, any of the ways described in the section on methods of treatment.
  • An expression vector was obtained containing and expressing the gene for full length HDHD4, with the addition of a Thrombin-cleavable C-terminal hexahistidine tag, and two extra amino acids (G and S) on the N-terminus.
  • NMR structural data was used to design a truncated protein of HDHD4 for crystallization trials.
  • the Multi Site-Directed Mutagenesis kit (Stratagene, La Jolla, CA) was used to perform deletion mutagenesis to remove 21 base pairs (seven amino acids) from the 5' end (N-terminus) and 27 base pairs (nine amino acids) from the 3' end (C-terminus) of this starting construct.
  • the resulting expression vector referred to as "VG-10" thus expresses HDHD4(R7-C242) with an N-terminal Methionine (start codon) and a C-terminal Thrombin cleavable hexahistidine tag.
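The arithmetic behind the truncation can be checked with a short sketch: an in-frame deletion removes one amino acid per three base pairs, and a construct spanning residue 7 to residue 242 contains an inclusive count of residues. The helper names are hypothetical, and the count excludes the added N-terminal methionine and the C-terminal tag.

```python
def codons_removed(base_pairs):
    """Number of amino acids removed by an in-frame deletion of `base_pairs`."""
    if base_pairs % 3 != 0:
        raise ValueError("an in-frame deletion must be a multiple of three base pairs")
    return base_pairs // 3

def residue_count(first, last):
    """Inclusive number of residues from position `first` to position `last`."""
    return last - first + 1

print(codons_removed(21), codons_removed(27))   # N- and C-terminal deletions
print(residue_count(7, 242))                    # residues in the R7-C242 span
```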
  • E. coli BL21(DE3) (Novagen, Madison, WI) were propagated in minimal media overnight at 37 °C.
  • Minimal Media was made by combining 10.5 g K2HPO4 and 0.5 g NaCl. H2O was added and the pH adjusted to 7.2 with H3PO4.
  • the harvested cells were resuspended in 100 mL of 25 mM Tris-HCl, pH 7.5, 50 mM NaCl, 2 mM dithiothreitol (DTT), 1 mM ethylene-bis(oxyethylenenitrilo)tetraacetic acid (EGTA), 0.5 mM NaF, 100 mg/L protamine sulfate and 1 mL of protease inhibitor cocktail (Sigma, St. Louis, MO). After sonication and clarification at 15,000 rpm for 20 min (Sorvall SS34), the supernatant was applied onto 30 mL of nickel-charged affinity column (His-Select, Sigma, St.
  • Peak fractions (3 ml/tube) were passed through 3 mL of SP Sepharose (Pharmacia, Piscataway, NJ) resin (5 ml) and then through 3 mL of Q Sepharose (Pharmacia, Piscataway, NJ) resin (5 ml).
  • the protein was concentrated to 20 mg/mL using a filtering device with a 10,000 Da MWCO membrane (Millipore Corporation, Bedford, MA) and exchanged into the final buffer: 25 mM Tris-HCl, pH 7.5, 50 mM NaCl, 5 mM DTT, 0.5 mM NaF. All concentration steps were performed in a cooled table-top centrifuge. Typical yields were 100 mg/L of growth media.
  • the protein could be used immediately for crystallization trials or stored at -80 °C with 10% v/v glycerol.
  • HDHD4 Protein Manipulation and Co-crystallization. Initial crystallization screens were run on Fluidigm (San Francisco, CA, USA) microfluidic chips with sulfur methionyl (S-Met) protein. The crystallization conditions were successfully translated to drop volumes above 1 µl and then applied to selenomethionyl protein (Se-Met).
  • the selenomethionyl protein stock solution consisted of 7 mg/mL (0.26 mM based on the calculated MW of 27,132 Da) HDHD4 in 50 mM NaCl, 5 mM DTT, and 0.5 mM NaF buffered by 25 mM Tris-HCl, pH 7.5.
  • HDHD4 protein stock solution consisted of 12.0 mg/mL (0.419 mM based on the calculated MW of 28,625 Da) HDHD4, 1 mM TCEP, 32.0 mM NANA, 4.27 mM MgCl2, and 2.14 mM vanadate buffered by 10 mM HEPES, pH 7.5. Crystallization trials were prepared by the hanging drop vapor diffusion method.
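The molar concentrations quoted for the protein stock solutions follow from the mass concentration and the calculated molecular weight. The sketch below reproduces that conversion; the function name is hypothetical and the inputs are the values stated in the text.

```python
def millimolar(mg_per_ml, mw_da):
    """Convert a mass concentration (mg/mL, equal to g/L) to mM for a given MW in Da."""
    return mg_per_ml / mw_da * 1000.0

# Values stated in the text for the two protein stock solutions.
print(round(millimolar(12.0, 28625), 3))   # 12.0 mg/mL at 28,625 Da
print(round(millimolar(7.0, 27132), 2))    # 7 mg/mL at 27,132 Da
```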
  • the reservoir solution consisted of 0.5 M potassium formate, 20% w/v PEG 1500, 0.1 M glycyl-glycine pH 8.5, 0.01% n-dodecyl β-D-maltoside.
  • Example 4: HDHD4 Structure Determination. The structure of HDHD4 was determined from experimental phases derived from incorporated selenomethionine. A three-wavelength MAD experiment (peak, inflection, and high-energy remote) was conducted (Beamline X12C, National Synchrotron Light Source, Brookhaven National Laboratory, Upton, NY, USA). The inverse-beam approach was used to guarantee the measurement of Friedel mates. The diffraction data were processed with the HKL suite (Otwinowski and Minor (1997) C.W. Carter and R.M. Sweet (ed.), Methods Enzymol., Macromolecular Crystallography part A, 276: 307-326, Academic Press, Inc., New York, NY).
  • SHELXD (Usón and Sheldrick (1999) Curr. Opin. Struct. Biol. 9: 643-648; Schneider and Sheldrick (2002) Acta Cryst. D58: 1772-1779) was used to identify the selenium sub-structure from the anomalous signal contained in the structure-factor amplitudes. A total of 12 selenium sites, consistent with two molecules in the asymmetric unit as anticipated, were located. The selenium sites were refined with the program autoSHARP (LaFortelle and Bricogne (1997) C.W. Carter and R.M.
  • the structure factors associated with the density-modified map and the amino-acid sequence were passed to the program ARP/wARP (Lamzin and Wilson (1993) Acta Cryst. D49: 129-147), which built approximately 85% of the residues in the dimer in 15 fragments.
  • the fragmented model was manually organized by protein molecule and the structure was completed by several rounds of refinement with the program CNX (Accelrys, Inc., San Diego, CA, USA) and model building with the program QUANTA (Accelrys, Inc., San Diego, CA, USA).
  • the structure was refined to 2.0 Å resolution.
  • the crystallographic residuals, R and Rfree, are 24.9% and 29.9%, respectively.
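For readers unfamiliar with the crystallographic residual, the sketch below shows the conventional definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|. Rfree uses the same formula over a set of reflections withheld from refinement. The amplitudes are hypothetical, and the scale factor normally applied between observed and calculated amplitudes is omitted for brevity.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual for matched lists of structure-factor amplitudes."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical amplitudes on a common scale.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]

print(f"{100 * r_factor(f_obs, f_calc):.1f}%")
```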
  • the extended active site of HDHD4 including both the region shown to bind the phosphate mimetic vanadate and the region with the unmodeled density consistent with a small organic molecule, is lined with the following residues: D12, L13, D14,
  • P63, Y64, and N65 also likely form part of the extended active site of HDHD4 based on the proximity of their Cα carbons.
  • the coordinates for the structurally conserved regions may be assigned based on the coordinates of the template structure (e.g., HDHD4). Insertions, deletions and mutations may be incorporated into the template structure as desired to build an initial model.
  • the HDHD4 template structure may then be energy minimized to refine the molecular structure so that any steric strain that might have been introduced during the model-building process is eliminated.
  • the model may then be screened for unfavorable steric contacts and, if necessary, such side chains may be remodeled either by using a rotamer library database or by manually rotating the respective side chains to form a final homology model of the target structure.
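A screen for unfavorable steric contacts can be sketched as a pairwise distance check. The 2.2 Å threshold, the atom labels, and the coordinates below are hypothetical, and the sketch assumes that only non-bonded atom pairs are supplied, since covalently bonded atoms are naturally closer than any reasonable clash cutoff.

```python
import math
from itertools import combinations

def find_clashes(atoms, min_distance=2.2):
    """Pairs of atoms closer than `min_distance` angstroms.

    atoms maps an atom label to its (x, y, z) position; returns a list of
    (label_a, label_b, distance) tuples.
    """
    clashes = []
    for (name_a, pos_a), (name_b, pos_b) in combinations(atoms.items(), 2):
        d = math.dist(pos_a, pos_b)
        if d < min_distance:
            clashes.append((name_a, name_b, round(d, 2)))
    return clashes

# Hypothetical side-chain atoms from three different residues.
atoms = {
    "res10:CB": (0.0, 0.0, 0.0),
    "res55:CG": (1.5, 0.0, 0.0),
    "res90:CD": (5.0, 0.0, 0.0),
}

print(find_clashes(atoms))
```

Any residue reported here would be a candidate for remodeling, for example by trying alternative rotamers and re-running the check.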
  • the modeling may be carried out, for example, on a Silicon Graphics OCTANE or FUEL computer (Silicon Graphics Inc., Mountain View, California, USA) using the Homology module in INSIGHT II (Accelrys Inc., San Diego, California, USA).

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Genetics & Genomics (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Hematology (AREA)
  • Urology & Nephrology (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Food Science & Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Biophysics (AREA)
  • Cell Biology (AREA)
  • Peptides Or Proteins (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The present invention relates to a three-dimensional crystalline form of a HDHD4, which was solved to provide a three-dimensional structure of a HDHD4. In one aspect of the present invention, HDHD4 is complexed with magnesium and a phosphate mimetic, such as vanadate. Methods of using the crystalline form, such as screening methods and rational modulator design methods, are also aspects of the present invention and are described. The structures of the present invention also provide insight into observed properties of HDHD4 and other polypeptides.

Description

Description of the Invention
THREE-DIMENSIONAL STRUCTURE OF HDHD4 COMPLEXED WITH MAGNESIUM AND A PHOSPHATE MIMETIC
Cross-Reference to Related Applications
This application claims priority from U.S. Provisional Application No. 60/786,323, filed March 27, 2006, incorporated in its entirety herein by reference.
Field of the Invention
The present invention relates generally to the three-dimensional structure of haloacid dehalogenase-like hydrolase domain containing protein 4 (HDHD4), and more particularly to HDHD4 in complex with magnesium and a phosphate mimetic, such as vanadate. Additionally, the present invention relates to methods of designing and/or identifying modulators and/or ligands of HDHD4. Methods of modulating HDHD4 activity, methods of designing HDHD4 mutants, mutant HDHD4 polypeptides or portions of mutant HDHD4 polypeptides, and models of HDHD4 also form aspects of the present invention. The present invention further relates to machine-readable data storage media comprising structural coordinates of HDHD4 in complex with magnesium and a phosphate mimetic, vanadate for example, and optionally in further complex with a ligand, and computer systems capable of producing three-dimensional representations of all or any part of a structure of HDHD4 in complex with magnesium and a phosphate mimetic, such as vanadate.
Background of the Invention
Broadly, the present invention relates to the three-dimensional structure of HDHD4 in complex with magnesium and a phosphate mimetic, as determined by X-ray crystallography methods. Examples of phosphate mimetics include but are not limited to vanadate, phosphate, tungstate, sulfate and aluminum trifluoride (Madhusudan et al. (2002) Nature Structur. Biol. 9:273-277). HDHD4 is an intracellular protein having a molecular weight of about 31,000 Da that is a member of the haloacid dehalogenase (HAD) superfamily of enzymes (Allen and Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503). The HAD superfamily is a large family of enzymes that occur in both prokaryotes and eukaryotes. While the HAD superfamily includes dehalogenases, the majority of the superfamily members are involved in phosphoryl group transfer reactions (phosphatase, phosphonatase and phosphomutase activities). Recent examples of biologically important mammalian HAD superfamily members include chronophin (Gohla, Birkenfeld and Bokoch (2005) Nature Cell Biol. 7: 21-29), which is involved in the regulation of cofilin-dependent actin dynamics, and the Drosophila eyes absent homolog 2 (Zhang et al. (2005) Cancer Res. 65: 925-932), which is up-regulated in ovarian cancer and promotes tumor growth. Mammalian HAD superfamily members are potential novel targets for cancers and other diseases.
HDHD4 was initially identified as a potential oncology target through studies of its ortholog in Drosophila. Over-expression of the Drosophila gene CG15771 suppresses the small eye defect caused by over-expression of human p21(+) in the eye. Subsequent studies of HDHD4 in cancer cell lines indicate that it acts synergistically with the Ras/P21 pathway. Pronounced phenotypic effects were observed in several cancer cell lines upon HDHD4 over-expression and/or knockdown, as demonstrated by the following examples: (1) HDHD4 suppression causes transient reduction in p21 protein levels in M109 murine melanoma cells transfected with an siRNA shown to cause the specific degradation of HDHD4 mRNA; (2) HDHD4 over-expression reverses p21-mediated G1 arrest in A549 cells; and (3) HDHD4 over-expression reverses p21-mediated S-phase arrest in HEK293 cells.
In vitro, HDHD4 displays weak phosphatase activity against several small-molecule substrates such as 2,3-diphosphoglycerate, which was used as the basis of a high-throughput screen. Based on the observed in vitro activity, and by comparing the HDHD4 active site composition to the known HAD superfamily active site motifs (Allen & Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503), HDHD4 is likely to be a novel human phosphatase. Recently, N-acetylneuraminate 9-phosphate was identified as a biologically relevant substrate for HDHD4 (Maliekal et al. (2006) Glycobiology 16: 165-172).
HDHD4 is a member of subfamily I of the HAD superfamily (Allen & Dunaway-Mariano (2004) Trends Biochem. Sci. 29: 495-503). Subfamily I members contain a core domain and a cap domain. It was believed that the natural substrates of subfamily I members are exclusively small molecules, since the core and cap domains adopt a "closed" conformation when substrates/inhibitors bind.
A detailed three-dimensional structure of HDHD4 would greatly facilitate not only an understanding of HDHD4 structure and activity, but would also facilitate the design of modulators that can be employed in the diagnosis, prognosis and treatment of HDHD4-related conditions, such as different forms of cancer. Thus, what is needed is detailed three-dimensional structural information of HDHD4 in general, and of HDHD4 in complex with one or more ligands in particular. Such information can take the form of, for example, structural coordinates derived from a crystalline form of a HDHD4-ligand complex. Using the present invention as a guide, modulators of HDHD4 can be designed and/or identified, and additional details regarding HDHD4's mechanism of action can be obtained.
Summary of the Invention
In one aspect, the present invention provides a crystalline form comprising a complex comprising a HDHD4 polypeptide and a moiety comprising a metal atom. In one embodiment, the moiety comprising a metal atom is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both magnesium and a phosphate mimetic, both manganese and a phosphate mimetic, and both calcium and a phosphate mimetic. The phosphate mimetic can be, for example, vanadate, tungstate, sulfate or aluminum trifluoride.
In another embodiment, a HDHD4 polypeptide comprises the amino acid sequence of SEQ ID NOs: 2 or 4. Further, a HDHD4 polypeptide can also comprise a His-tagged form. In one embodiment, the crystalline form has lattice constants of a = 46.4 Å, b = 53.5 Å, c = 64.0 Å, α = 65.5°, β = 75.0°, and γ = 85.2°. Diffraction images from these crystals frequently indexed as the [1 0 0 0 -1 0 0 -1 -1] transform (a = 46.4 Å, b = 53.5 Å, c = 63.9 Å, α = 114.5°, β = 100.4°, and γ = 95.0°). In another embodiment, the crystalline form has lattice constants of a = 46.8 Å, b = 102.6 Å, c = 186.7 Å.
In various embodiments of the present invention, the crystalline form is a triclinic crystalline form and has a space group of P1 or P212121. In one example, the crystalline form is described by the structure coordinates of Table 1 or Table 2 and the three-dimensional structure of the crystallized complex is determined to a resolution of about 3.0 Å or better. In other embodiments, there are two HDHD4 polypeptides in the asymmetric unit cell. In another embodiment, there are three HDHD4 polypeptides in the asymmetric unit cell. In still other embodiments, the crystalline form comprises one or more atoms having an atomic weight of 40 g/mol or more.
The present invention also provides a method for determining the three-dimensional structure of a crystallized HDHD4 in complex with a moiety comprising a metal atom to a resolution of about 3.0 Å or better. In one embodiment, the method comprises: (a) crystallizing a HDHD4 polypeptide in complex with a moiety comprising a metal atom to form a crystallized complex; and (b) analyzing the crystallized complex to determine a three-dimensional structure of the HDHD4 polypeptide in complex with a ligand, whereby the three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand is determined to a resolution of about 3.0 Å or better.
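As an illustrative aside, the lattice constants quoted above can be converted into unit cell volumes with the general triclinic formula V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2·cosα·cosβ·cosγ). The sketch below is a minimal example and not part of the disclosed method; the function name `cell_volume` is hypothetical, and the 90° angles used for the third cell are an assumption, since no angles are stated for it. Two settings of the same lattice would be expected to give nearly the same volume.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic angstroms.

    Lengths are in angstroms, angles in degrees (general triclinic formula).
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


# Lattice constants quoted in the text.
cell_1 = (46.4, 53.5, 64.0, 65.5, 75.0, 85.2)
cell_1_alt = (46.4, 53.5, 63.9, 114.5, 100.4, 95.0)  # alternative indexing
# Angles are not stated for this cell; 90 degrees is an assumption.
cell_2 = (46.8, 102.6, 186.7, 90.0, 90.0, 90.0)

for name, cell in (("cell_1", cell_1), ("cell_1_alt", cell_1_alt), ("cell_2", cell_2)):
    print(f"{name}: {cell_volume(*cell):.0f} cubic angstroms")
```

Comparing the first two volumes is a quick consistency check on an alternative indexing of the same crystal.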
The present invention further provides a method of designing a modulator of HDHD4. In one embodiment, the method comprises: (a) designing a potential modulator of HDHD4 that will make interactions with amino acids in a ligand binding site of a HDHD4, based upon a crystalline structure comprising a HDHD4 in complex with a ligand; (b) synthesizing the modulator; and (c) determining whether the potential modulator modulates the activity of HDHD4, whereby a modulator of HDHD4 is designed.
In still another aspect, the present invention provides a method of identifying a HDHD4 modulator. In one embodiment the method comprises: (a) inputting structure coordinates describing a three-dimensional structure of a HDHD4 polypeptide in complex with a moiety comprising a metal atom to modeling software disposed on a computer; and (b) modeling a candidate modulator that forms one or more desired interactions with one or more amino acids of a ligand binding site of the HDHD4 and fits sterically within the HDHD4 binding pocket. The method can further comprise assaying the modulatory properties of the candidate modulator by contacting the candidate modulator with a cell extract or purified HDHD4 polypeptide to determine whether it is a modulator of HDHD4 activity.
In yet another aspect, the present invention also provides a method of increasing the efficiency of a modulator of HDHD4 and, in a representative embodiment, comprises: (a) providing a first ligand having a known effect on the biological activity of HDHD4; (b) modifying the first ligand based on an evaluation of a three-dimensional structure of a HDHD4, optionally in complex with a ligand to form a modified ligand; (c) synthesizing the modified ligand; and (d) determining an effect of the modified ligand on HDHD4, wherein the efficiency of a modulator of HDHD4 is increased if the modified ligand favorably alters a biological activity of a HDHD4 with respect to the biological activity of the first ligand.
In another embodiment, the present invention provides a method of designing a modulator of HDHD4. In one embodiment the method comprises: (a) modeling all or any part of a HDHD4 ligand binding site; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of the HDHD4 binding site; wherein the HDHD4 binding site is defined by the structure coordinates of Table 1 or Table 2.
The candidate modulator can be designed to fit spatially into all or any part of a HDHD4 binding site, and a ligand binding site can be described by the structure coordinates of amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64 and N65 and subcombinations thereof according to Table 1 or Table 2. After a candidate modulator has been designed, the candidate modulator can then be synthesized and tested for modulation ability in a suitable assay. The method can further comprise: (c) docking the designed candidate modulator into all or any part of the HDHD4 binding site; and (d) analyzing the structural and/or chemical feature complementarity of the candidate modulator with all or any part of HDHD4 binding site.
The present invention also provides a method of designing a modulator of a target polypeptide that is structurally similar to HDHD4. In one embodiment, the method comprises: (a) modeling all or any part of a HDHD4 polypeptide; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of a HDHD4 polypeptide binding site; wherein the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2. The method can further comprise: (c) docking the chemical entity into all or any part of the HDHD4 binding site; and (d) analyzing the structural and chemical feature complementarity of the candidate modulator with all or any part of a HDHD4 polypeptide, such as a binding site.
In another aspect, the present invention provides a method of identifying structural features of HDHD4 that can be employed in the design of a modulator that selectively modulates the activity of HDHD4 polypeptide to the exclusion of other structurally similar proteins. In one embodiment, the method comprises: (a) providing a three-dimensional structure of a HDHD4 polypeptide, optionally in complex with a moiety comprising a metal atom, and a three-dimensional structure of a structurally similar but non-identical test structure; (b) overlaying the backbone residues of the HDHD4 structure onto the test structure; and (c) identifying structural features of HDHD4 that do not overlap the test structure to a desired degree. The identifying can comprise, for example, a visual inspection of the overlapped structures or a quantitative comparison can be made. Additionally, the identifying can comprise one or more computational evaluations of the overlapped structures, which can be performed by employing commercially available computer software known to those of ordinary skill in the art.
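For illustration only, the residue list above can be used to pull binding-site atoms out of a coordinate file. The sketch below assumes the structure coordinates are available as PDB-style fixed-column ATOM records; that layout is an assumption, since the format of Table 1 and Table 2 is not shown here. The helper `format_atom` and the demo coordinates are hypothetical.

```python
# Residue numbers of the extended active site, as listed in the text.
BINDING_SITE = {12, 13, 14, 15, 18, 20, 21, 22, 23, 24, 25, 28, 53, 54, 56,
                57, 58, 59, 72, 100, 104, 108, 131, 132, 133, 134, 137, 138,
                141, 163, 164, 189, 190, 193, 194}
# Loop residues 60-65, which the text treats as optional members of the site.
OPTIONAL_LOOP = {60, 61, 62, 63, 64, 65}


def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build one PDB-style ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def select_site_atoms(pdb_lines, include_loop=False):
    """Return the atoms whose residue number belongs to the binding site."""
    wanted = (BINDING_SITE | OPTIONAL_LOOP) if include_loop else BINDING_SITE
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_seq = int(line[22:26])  # PDB columns 23-26
        if res_seq in wanted:
            atoms.append({
                "name": line[12:16].strip(),      # columns 13-16
                "res_name": line[17:20].strip(),  # columns 18-20
                "res_seq": res_seq,
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    return atoms


# Hypothetical records with made-up coordinates, for illustration only.
demo = [
    format_atom(1, "CA", "ASP", "A", 12, 1.0, 2.0, 3.0),
    format_atom(2, "CA", "GLY", "A", 50, 4.0, 5.0, 6.0),
    format_atom(3, "CA", "PHE", "A", 61, 7.0, 8.0, 9.0),
]
print([a["res_seq"] for a in select_site_atoms(demo)])
print([a["res_seq"] for a in select_site_atoms(demo, include_loop=True)])
```

Keeping residues 60-65 in a separate set mirrors the "optionally" wording of the text, so a caller can decide whether to include the loop.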
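A quantitative comparison of overlaid structures can be sketched as follows. This minimal example assumes the two structures have already been superposed and that equivalent residues share the same residue number; both are simplifying assumptions, and in practice the equivalences would come from a structural alignment. The 2.0 Å cutoff, the function names and the coordinates are hypothetical.

```python
import math


def ca_deviations(ref_ca, test_ca):
    """Per-residue distance between matched C-alpha atoms.

    Both arguments map residue number -> (x, y, z) in angstroms and are
    assumed to be in the same (already superposed) coordinate frame.
    """
    common = sorted(set(ref_ca) & set(test_ca))
    return {res: math.dist(ref_ca[res], test_ca[res]) for res in common}


def rmsd(deviations):
    """Root-mean-square deviation over the matched residues."""
    return math.sqrt(sum(d * d for d in deviations.values()) / len(deviations))


def divergent_residues(deviations, cutoff=2.0):
    """Residues that do not overlap to the desired degree (distance > cutoff)."""
    return [res for res, d in deviations.items() if d > cutoff]


# Made-up coordinates for three residues, for illustration only.
reference = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
test_structure = {1: (0.0, 0.0, 0.5), 2: (3.8, 0.0, 0.0), 3: (7.6, 3.0, 0.0)}

devs = ca_deviations(reference, test_structure)
print(divergent_residues(devs), round(rmsd(devs), 3))
```

Residues flagged this way are candidates for the structural features that distinguish one protein from the other.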
In another aspect, the present invention provides methods useful in the design and identification of ligands and/or modulators of HDHD4. In various related aspects, the present invention provides a method of docking a test molecule into all or any part of a binding site on a HDHD4 and a method of identifying structural and chemical features of all or any part of a HDHD4.
In yet another aspect, the present invention provides a method of designing a ligand of HDHD4. In one embodiment, the method comprises: (a) modeling all or any part of a HDHD4; and (b) designing a chemical entity that has structural and chemical complementarity with all or any part of a HDHD4 binding site.
A method of evaluating the potential of a chemical entity to bind to all or any part of HDHD4, as well as a method for identifying a ligand and/or a modulator of HDHD4 is also disclosed.
In yet a further aspect, the present invention provides a method of designing a HDHD4 mutant. In one embodiment the method comprises: (a) evaluating a three-dimensional structure of a HDHD4 polypeptide to identify one or more amino acids as candidates for mutation; and (b) mutating the one or more identified HDHD4 amino acids by making an amino acid mutation selected from the group consisting of a substitution, a deletion and an insertion. The method can further comprise the step of (c) expressing the mutant so generated. The present invention also encompasses the resultant mutant HDHD4, as well as portions of a mutant HDHD4.
In another aspect, the present invention provides a method of forming a homology model based on a HDHD4 structure of the present invention. In one embodiment, a method of constructing a homology model consistent with the present invention comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4, wherein the HDHD4 is described by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4; and (d) generating a structure of the target protein based on the analysis.
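The alignment in steps (b) and (c) can be illustrated with a small global alignment (Needleman-Wunsch) routine. This is a sketch under simplifying assumptions: it scores by identity only, whereas practical homology modeling generally uses substitution matrices and more elaborate gap penalties. The scoring values and the two short sequences are hypothetical.

```python
def global_align(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with simple identity scoring."""
    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all alignment columns."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical template and target fragments, for illustration only.
template, target = "DLDNT", "DLDT"
aligned_template, aligned_target, total = global_align(template, target)
print(aligned_template, aligned_target, total)
```

The aligned columns indicate which template residues could supply coordinates for which target residues when the model is built.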
Further, the present invention provides a method for evaluating the potential of a chemical entity to bind to all or any part of a HDHD4 or a structurally similar molecule comprising: (a) docking a candidate modulator into all or any part of a HDHD4 described by the structure coordinates of Table 1 or Table 2; and (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of a HDHD4.
The present invention additionally provides a method for identifying a modulator of HDHD4. In one embodiment the method comprises the following steps, which are preferably, but not necessarily, performed in the order recited: (a) docking a candidate modulator into all or any part of a HDHD4 binding site, wherein the HDHD4 binding site is described by the structure coordinates of Table 1 or Table 2; (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 binding site; (c) synthesizing the candidate modulator; and (d) screening the candidate modulator in a biological assay for the ability to modulate HDHD4. The method can further comprise the step of (e) screening the candidate modulator in an assay that characterizes binding to HDHD4.
The present invention also comprises a method of determining the structure of a target protein for which little or no structural information is known. In one embodiment the method comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of a HDHD4 structure, wherein the HDHD4 is described by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4; (d) generating a structure of the target protein based on the analysis; and (e) analyzing the generated structure to determine the structure of a target protein for which a structure is not known.
The present invention also comprises a method of designing a mutation in HDHD4. One embodiment of a method of designing a mutation comprises: (a) selecting a property of HDHD4 to be investigated; (b) providing a three-dimensional structure of a HDHD4; and (c) evaluating the structure to identify a residue known or suspected to be related to the selected property. The steps of the method can be repeated a desired number of times.
The present invention further provides a method of modulating HDHD4 activity comprising administering a modulator of HDHD4 in an amount sufficient to modulate HDHD4 activity, wherein the modulator of HDHD4 is a ligand known or suspected to bind to HDHD4 or was identified using a structure of the present invention. In one embodiment, the method of identifying a modulator comprises (a) docking a test molecule into all or any part of a HDHD4 binding site; (b) analyzing the structural and chemical feature complementarity between the test molecule and all or any part of a HDHD4; and (c) screening the test molecule in a biological assay of modulation of HDHD4. The method can further comprise one or more of the following steps: (d) screening the test molecule in an assay that characterizes binding to HDHD4; and (e) screening the test molecule in an assay that characterizes binding to HDHD4.
In a further aspect, the present invention provides a machine-readable data storage medium comprising data storage material encoded with machine-readable data comprising all or any part of the structure coordinates of a HDHD4 polypeptide, optionally in complex with a ligand and/or optionally in complex with a moiety comprising a metal atom. The present invention further provides computer systems comprising the machine-readable data storage media of the present invention, the systems being capable of producing a three-dimensional representation of all or any part of a HDHD4 alone or optionally in complex with a ligand and/or a moiety comprising a metal atom.
In any embodiment of the present invention, the core domain can comprise HDHD4 residues M6-L17 and A111-A246, the cap domain can comprise HDHD4 residues A21-H107 and hinge segments can comprise residues I18-T20 and M108-L110. Further, in any embodiment of the present invention, the ligand binding site of the HDHD4 polypeptide can comprise HDHD4 residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64 and N65, and subcombinations thereof. Further, in all embodiments the HDHD4 polypeptide can comprise the amino acid sequence of SEQ ID NOs: 2 or 4 and can be encoded by a nucleic acid selected from the group consisting of SEQ ID NOs: 1 and 3, respectively, and sequences deviating from SEQ ID NOs: 1 and 3 due to the degeneracy in the genetic code.
Continuing, in any embodiment of the present invention, a HDHD4 polypeptide can comprise a moiety comprising a metal atom, and the moiety can be selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both magnesium and a phosphate mimetic, both manganese and a phosphate mimetic, both calcium and a phosphate mimetic, and both magnesium and a phosphorylated sugar. The phosphate mimetic can be, for example, vanadate, phosphate, tungstate, sulfate or aluminum trifluoride.
Accordingly, it is an object of the present invention to provide a three-dimensional structure of a HDHD4 polypeptide, optionally in complex with a moiety comprising a metal atom and/or optionally comprising a ligand. This object is achieved in whole or in part by the present invention.
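The reference to degeneracy in the genetic code can be made concrete with a short translation routine: different codon choices can encode the same amino acids. The sketch below uses the standard genetic code. The two DNA strings are hypothetical codon choices for the residues D12, L13, D14 and N15 named above and are not taken from SEQ ID NO: 1 or 3.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical nucleic acid sequences that differ at four positions.
variant_1 = "GATCTGGATAAT"
variant_2 = "GACTTAGACAAC"
print(translate(variant_1), translate(variant_2))
```

Both variants translate to the same four-residue peptide even though the nucleotide sequences differ, which is the sense in which sequences can deviate "due to the degeneracy in the genetic code".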
An object of the invention having been stated hereinabove, other objects will be evident as the description proceeds, when taken in connection with the accompanying Drawings and Examples presented and described below.
Brief Description of the Drawings
Figure 1 is a photograph of crystals of HDHD4 complexed with Mg2+ and phosphate and/or VO4 3-. The diameter of the cluster is approximately 0.65 mm.
Figure 2 is a cartoon diagram of HDHD4 with Mg2+ (black ball) and VO4 3- (black stick). The unmodelled density connected to Lys141 is also shown. The cap domain is shown in A and the core domain shown in B.
Figure 3 is a line depiction of an expanded view of some of the HDHD4 active site residues and extended active site. Mg2+ is represented as a black ball and VO4 3- is represented by black sticks. The cap domain is shown in A and the core domain shown in B. The unmodelled density is shown. Lys141 is shown as a gray stick.
Figure 4 is a cartoon diagram depicting the HDHD4 X-ray structure with loop residues 60-65 modeled to fit the discontinuous density. Mg2+ is represented as a black ball and VO4 3- is represented by a black stick. The cap domain is shown in A and the core domain is shown in B. The region of HDHD4 encompassing residues 60-65 is defined by the black arrows.
Figures 5A and 5B are a series of cartoon diagrams depicting the HDHD4 X-ray structure (the left structure in both Figures 5A and 5B) compared with phosphonatase in open conformation in complex with Mg2+ (PDB file 1RQN, the center structure in both Figures 5A and 5B) and in closed conformation in complex with tungstate (shown as sticks) and Mg2+ (PDB file 1FEZ, the right structure in both Figures 5A and 5B). In Figure 5A, the structures are shown with view to the face of the β-sheet in the core domain of all three structures. In Figure 5B, the same structures are shown rotated approximately 90 degrees to view down the same β-sheet.
Figure 6A depicts the DNA and protein sequences of full length wild-type HDHD4, as derived from NCBI RefSeq entries NM_152667 (SEQ ID NO: 1) and NP_689880 (SEQ ID NO: 2), respectively. "STP" refers to a stop codon.
Figure 6B depicts the protein sequence of full length wild-type HDHD4, as derived from NCBI RefSeq entry NP_689880 (SEQ ID NO: 2).
Figure 7 depicts the DNA (SEQ ID NO: 3) and translated protein sequences (SEQ ID NO: 4) of the translated region of the "VG-10" expression vector (HDHD4(R7-C242)-T-His). Non-HDHD4 residues are underlined.
Figure 8 depicts the amino acid sequence of the "VG-10" protein (HDHD4(R7-C242)-T-His) expressed by the "VG-10" expression vector (SEQ ID NO: 4). Non-HDHD4 residues are underlined.
Figure 9 depicts nuclear magnetic resonance data for HDHD4 (VG10: SEQ ID NO: 4) complexes: overlaid region of two-dimensional 1H-15N heteronuclear single quantum coherence (HSQC) spectra. The spectrum shown in black was acquired using a sample of HDHD4 (0.15 mM) in complex with magnesium (3.0 mM), and the spectrum shown in gray was acquired using a sample of HDHD4 (0.15 mM) in complex with magnesium (3.0 mM) and aluminum trifluoride (AlF3) (1.5 mM). The spectral changes observed demonstrate that AlF3 forms a complex with HDHD4 in the presence of magnesium. Residues in the vicinity of the vanadate/phosphate binding site, including I18, T193, G197, and G198, show chemical shift changes in response to AlF3 binding, whereas more distant residues, including G202, A205 and G213, are not significantly perturbed by AlF3 binding.
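Chemical shift changes of the kind described for Figure 9 are often summarized as a single combined 1H/15N value per residue. The sketch below uses one common convention, the square root of ΔδH² + (s·ΔδN)², with a 15N scaling factor s of 0.14. The scaling factor, the 0.05 ppm threshold, the residue labels and all shift values are assumptions or placeholders chosen for illustration; none are measured data from the HSQC spectra described above.

```python
import math


def combined_shift_change(delta_h, delta_n, n_scale=0.14):
    """Combined 1H/15N chemical shift change in ppm.

    n_scale down-weights the 15N dimension; 0.14 is an assumed convention.
    """
    return math.sqrt(delta_h ** 2 + (n_scale * delta_n) ** 2)


def perturbed(shift_changes, threshold=0.05):
    """Labels whose combined shift change exceeds the (assumed) threshold."""
    return sorted(label for label, (d_h, d_n) in shift_changes.items()
                  if combined_shift_change(d_h, d_n) > threshold)


# Placeholder (1H, 15N) shift differences in ppm; not experimental values.
example_changes = {
    "res_A": (0.08, 0.50),
    "res_B": (0.01, 0.05),
    "res_C": (0.00, 0.60),
}
print(perturbed(example_changes))
```

A per-residue summary of this kind makes it easier to see which residues near a binding site respond to ligand addition and which distant residues do not.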
Detailed Description of the Invention
In one aspect, the present invention comprises a three-dimensional structure of HDHD4 (e.g., SEQ ID NOs: 2 and 4) in complex with magnesium and phosphate and/or vanadate atoms. The three-dimensional structure of HDHD4 disclosed herein reveals several unique structural features heretofore unidentified in the HDHD4 polypeptide, which can be exploited in a rational drug design process. Thus, as described herein, the present invention encompasses not only the three-dimensional structure of HDHD4 (described by the structure coordinates presented in Table 1 or Table 2), but also various uses of the structure including screening methods and modulator design methods.
Definitions
Following long-standing patent law convention, the terms "a" and "an" mean "one or more" when used in this application, including the claims.
As used herein, the term "about," when referring to a value or to an amount of mass, weight, time, volume, concentration or percentage is meant to encompass variations of ±20% or less (e.g., ±15%, ±10%, ±7%, ±5%, ±4%, ±3%, ±2%, ±1%, or ±0.1%) from the specified amount, as such variations are appropriate.
As used herein, the terms "amino acid," "amino acid residue" and "residue" are used interchangeably and mean any of the twenty naturally occurring amino acids. An amino acid is formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages. The amino acid residues described herein are preferably in the "L" isomeric form. However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide. NH2 refers to the free amino group present at the amino terminus of a polypeptide. COOH refers to the free carboxy group present at the carboxy terminus of a polypeptide.
It is noted that all amino acid residue sequences represented herein by formulae have a left-to-right orientation in the conventional direction of amino terminus to carboxy terminus. In addition, the phrases "amino acid" and "amino acid residue" are broadly defined to include modified and unusual amino acids.
Furthermore, it is noted that a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues, or a covalent bond to an amino-terminal group, such as NH2, to an acetyl group or to a carboxy-terminal group, such as COOH.
As used herein, the terms "associate" and "bind" and grammatical derivations thereof are used interchangeably and mean a condition of proximity between or amongst molecules, structural elements, chemical compounds or chemical entities. An association can be non-covalent (i.e., reversible), wherein the juxtaposition is energetically favored by hydrogen bonding or van der Waals or electrostatic interactions, or it can be covalent (i.e., irreversible). Thus in the present disclosure, when it is stated that a ligand "associates" with or "binds" to a protein, it is meant that the ligand interacts with the protein via covalent or non-covalent interactions.
As used herein, the terms "binding site" and "ligand binding site" are used interchangeably and mean a region of a molecule or molecular complex that, as a result of its shape, favorably associates with a ligand. A binding site, such as a binding site in the light chain of HDHD4, defines a space commonly referred to as a "cavity" or "pocket," both of which terms are used interchangeably with "binding site" and "ligand binding site" in the present disclosure. A ligand of a binding site situates in the binding site when the ligand associates with the molecule or molecular complex. In one aspect of the present invention, the extended active site of HDHD4, including both the region shown to bind the phosphate mimetic vanadate and the region with the unmodeled density consistent with a small organic molecule, is lined with the following residues: D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194. Although represented by discontinuous electron density, the residues C60, F61, H62, P63, Y64, and N65 also likely form part of the extended active site of HDHD4, based on the proximity of their Cα carbons.
As used herein, the term "biological activity" means any observable effect flowing from a HDHD4 polypeptide. Representative, but non-limiting, examples of biological activity in the context of the present invention include phosphoryl transfer, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate.
As used herein, the terms "chimeric protein" and "fusion protein" are used interchangeably and mean a fusion of a first amino acid sequence encoding a HDHD4 polypeptide with a second amino acid sequence defining a polypeptide domain foreign to, and not homologous with, a HDHD4 polypeptide. A chimeric protein can present a foreign domain that is found in an organism that also expresses the first protein, or it can be an "interspecies" or "intergenic" fusion of protein structures expressed by different kinds of organisms. In general, a chimeric or fusion protein of the present invention can be represented by the general formula X-HDHD4-Y, wherein HDHD4 represents a portion of the protein which is derived from a HDHD4 polypeptide (e.g., all or a part of a HDHD4 polypeptide), and X and Y are independently absent or represent amino acid sequences which are not derived from a HDHD4 polypeptide, which includes naturally occurring mutants. Analogously, the term "chimeric gene" refers to a nucleic acid construct that encodes a "chimeric protein" or "fusion protein" as defined herein. As a chimeric or fusion protein is not normally found in nature, chimeric and fusion proteins are encompassed by the term "mutant," examples of which are described herein.
As used herein the term "complementary" means a nucleic acid sequence that is base paired, or is capable of base-pairing, according to the standard Watson-Crick complementarity rules. These rules generally hold that guanine pairs with cytosine (G:C) and adenine pairs with either thymine (A:T) in the case of DNA or uracil (A:U) in the case of RNA. The term "complementarity" can also refer to a favorable spatial arrangement between the surface of a ligand and the surface of its binding site.
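The Watson-Crick pairing rules in the definition of "complementary" can be expressed in a few lines. The sketch below builds the complementary strand read in the opposite (antiparallel) direction; the function names and example sequences are hypothetical.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement_strand(seq, rna=False):
    """Return the Watson-Crick complementary strand, read antiparallel."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


def are_complementary(seq_1, seq_2, rna=False):
    """True if seq_2 base-pairs with seq_1 along its whole length."""
    return complement_strand(seq_1, rna) == seq_2.upper()


print(complement_strand("GATTC"))             # DNA pairing: G:C and A:T
print(complement_strand("GAUUC", rna=True))   # RNA pairing: G:C and A:U
```

This covers only the nucleic acid sense of the term; the spatial sense of "complementarity" between a ligand and its binding site is a separate notion.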
As used herein, the term "detecting" means confirming the presence of a target entity by observing the occurrence of a detectable signal, such as a radiologic, fluorescent, colorimetric, etc. signal that will appear exclusively in the presence of the target entity.
As used herein, the terms "dock" and "perform a fitting operation," in all their grammatical forms, mean the computational placement of a chemical entity (e.g., a ligand or modulator (or a candidate ligand or modulator), such as a small organic molecule) within a space at least partially enclosed by the protein structure (e.g., a binding site) so that structural and chemical feature complementarity between the chemical entity and binding site components (i.e., binding contacts) can be assessed in terms of interactions typical of protein/ligand complexes. Such placement could be conducted manually or automatically and either approach can employ software designed for such purpose (e.g., INSIGHT II and modules therein, available from Accelrys, San Diego, California, USA).
As used herein, the terms "HDHD4 gene" and "recombinant HDHD4 gene" mean a nucleic acid molecule comprising an open reading frame encoding a HDHD4 polypeptide of the present invention, including both exon and (optionally) intron sequences.
As used herein, the terms "HDHD4 gene product", "HDHD4 protein", "HDHD4 polypeptide", and "HDHD4 peptide" are used interchangeably and mean a polypeptide having an amino acid sequence that is substantially identical to a native HDHD4 amino acid sequence from an organism of interest and which is biologically active in that it comprises all or a part of the amino acid sequence of a HDHD4 polypeptide, or cross-reacts with antibodies raised against a HDHD4 polypeptide, or retains all or some of the biological activity (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate) of the native amino acid sequence or protein. Such biological activity can also include immunogenicity.
As used herein, the terms "HDHD4 gene product", "HDHD4 protein", "HDHD4 polypeptide", and "HDHD4 peptide" also include analogs of a HDHD4 polypeptide. By "analog" is intended that a DNA or amino acid sequence can contain alterations relative to the sequences disclosed herein, yet still retain all or some of the biological activity of those sequences. Analogs can be derived from cDNA or genomic nucleotide sequences from a human or other organism, or can be created synthetically. Those of ordinary skill in the art will appreciate that other analogs as yet undisclosed or undiscovered can be used to design and/or construct a HDHD4 analog. There is no need for a "HDHD4 gene product", "HDHD4 protein", "HDHD4 polypeptide", or "HDHD4 peptide" to comprise all or substantially all of the amino acid sequence of a HDHD4 polypeptide gene product. Shorter or longer sequences are anticipated to be of use in the present invention; shorter sequences are referred to herein as "segments". Thus, the terms "HDHD4 gene product", "HDHD4 protein", "HDHD4 polypeptide", and "HDHD4 peptide" also include fusion, chimeric or recombinant HDHD4 polypeptides and proteins comprising sequences of the present invention. Methods of preparing such proteins are disclosed herein and/or are known in the art.
As used herein, the terms "HDHD4 protein", "HDHD4 polypeptide", and "HDHD4 peptide" are used interchangeably and mean a polypeptide having an amino acid sequence that is substantially identical to a native HDHD4 amino acid sequence from an organism of interest and which is biologically active in that it comprises all or a part of the amino acid sequence of a HDHD4 polypeptide, or cross-reacts with antibodies raised against a HDHD4 polypeptide, or retains all or some of the biological activity (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate) of the native amino acid sequence or protein. Such biological activity can include immunogenicity. In one embodiment, a HDHD4 protein comprises the amino acid sequences of SEQ ID NOs: 2 or 4 and is encoded by the nucleic acid sequences of SEQ ID NOs: 1 and 3. As noted, the terms "HDHD4 protein", "HDHD4 polypeptide", and "HDHD4 peptide" encompass mutants, including derivatives and analogs of a HDHD4 polypeptide. By "analog" is meant that a DNA or amino acid sequence can contain alterations relative to a sequence disclosed herein, yet retain all or some of the biological activity of the sequence. An analog can be derived from genomic nucleotide sequences or cDNA, as disclosed herein, or can be created synthetically. There is no need for a "HDHD4 protein", "HDHD4 polypeptide", or "HDHD4 peptide" to comprise all or substantially all of the amino acid sequence of a HDHD4 polypeptide. Shorter or longer sequences may be of use in the present invention.
As used herein, the term "gene" refers broadly to any segment of DNA associated with a biological function. A gene can encompass polynucleotide sequences including, but not limited to, a coding sequence, a promoter region, a cis-regulatory sequence, a non-expressed DNA segment that is a specific recognition sequence for regulatory proteins, a non-expressed DNA segment that contributes to gene expression, a DNA segment designed to have desired parameters, or combinations thereof. A gene can be obtained by a variety of methods, including cloning from a biological sample, synthesis based on known or predicted sequence information and recombinant derivation of an existing sequence.
As used herein, the terms "isolated" and "purified" are used interchangeably and refer to material (e.g., a nucleic acid or a protein) removed from its original environment (e.g., the natural environment, if it is naturally occurring), and thus is altered "by the hand of man" from its natural state. For example, an isolated polynucleotide could be part of a vector or a composition of matter, or could be contained within a cell, and still be "isolated" because that vector, composition of matter, or particular cell is not the original environment of the polynucleotide. The term "isolated" does not refer to genomic or cDNA libraries, whole cell total or mRNA preparations, genomic DNA preparations (including those separated by electrophoresis and transferred onto blots), sheared whole cell genomic DNA preparations or other compositions where the art demonstrates no distinguishing features of the polynucleotide and/or protein sequences of the present invention.
As used herein, the term "isomorphous replacement" means a method of using heavy atom derivative crystals to obtain the phase information necessary to elucidate the three-dimensional structure of a native crystal (see, e.g., Blundell et al., Protein Crystallography, Academic Press, New York, New York, USA (1976); Otwinowski, in Isomorphous Replacement and Anomalous Scattering, (Evans & Leslie, eds.), Daresbury Laboratory, Daresbury, UK (1991) pp. 80-86, both of which are incorporated in their entirety). The phrase "heavy atom derivatization" is synonymous with the term "isomorphous replacement" and these terms are used synonymously herein.
As used herein, the term "ligand" means any molecule that is known or suspected to associate with another molecule. The term "ligand" encompasses inhibitors, activators, agonists, antagonists, natural substrates and analogs of natural substrates.
As used herein, the term "modeling" in all its grammatical forms, refers to the development of a mathematical construct designed to mimic actual molecular geometry and behavior in proteins and small molecules. These mathematical constructs include, but are not limited to: energy calculations for a given geometry of a molecule utilizing forcefields or ab initio methods known in the art; energy minimization using gradients of the energy calculated as atoms are shifted so as to produce a lower energy; conformational searching, i.e., locating local energy minima; molecular dynamics wherein a molecular system (single molecule or ligand/protein complex) is propagated forward through increments of time according to Newtonian mechanics using techniques known to the art; calculations of molecular properties such as electrostatic fields, hydrophobicity and lipophilicity; calculation of solvent-accessible or other molecular surfaces and rendition of the molecular properties on those surfaces; comparison of molecules using either atom-atom correspondences or other criteria such as surfaces and properties; quantitative structure-activity relationships (SARs) in which molecular features or properties dependent upon them are correlated with activity or bio-assay data.
As used herein, the term "modified" means an alteration from an entity's normally occurring state. An entity can be modified, for example, by removing discrete chemical units or by adding discrete chemical units. The term "modified" encompasses detectable labels as well as those entities added as aids in purification, such as His-tags.
As used herein, the terms "modulate" and grammatical derivations thereof refer to an increase, decrease, or other alteration of any and/or all chemical and biological activities or properties mediated by a given DNA sequence, RNA sequence, polypeptide, peptide or molecule. The definition of "modulate" as used herein encompasses agonists and/or antagonists of a particular activity, DNA, RNA, or protein. The term "modulation" therefore refers to both upregulation (i.e., activation or stimulation) and downregulation (i.e., inhibition or suppression) of a response by any mode of action.
As used herein, the term "molecular replacement" means a method of solving a three-dimensional structure of a compound (e.g., a protein) that involves generating a preliminary model of a wild-type or mutant crystal whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known (e.g., a HDHD4 polypeptide, as disclosed herein) within the unit cell of the unknown crystal so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown crystal (see, e.g., McRee, Practical Protein Crystallography, Academic Press, San Diego (1993); Lattman, (1985) Methods Enzymol. 115:55-77; Rossmann (ed), The Molecular Replacement Method, Gordon & Breach, New York, New York, USA, (1972)). Using the structure coordinates of the HDHD4 provided by the present invention, in conjunction with appropriate commercially available software (e.g., AMoRe; Navaza, (1994) Acta Cryst. A50:157-163), molecular replacement can be used to determine the structure coordinates of a crystalline mutant or homolog of a HDHD4 polypeptide, a structure known or suspected to be similar to the HDHD4 structure of the present invention or of a different crystal form of a HDHD4 polypeptide.
As used herein, the term "mutant" encompasses fusion, chimeric and recombinant polypeptides and proteins (e.g., a HDHD4 polypeptide) comprising sequences of the present invention. In the context of the present invention, the term "mutant" encompasses a polypeptide otherwise falling within the definition of a polypeptide as set forth herein, but having an amino acid sequence which differs from that of the wild-type polypeptide, whether by way of deletion, substitution, or insertion. A mutant can share many physicochemical and biological activities, (e.g., antigenicity or immunogenicity) with the wild-type, and in some embodiments comprise most or all of a wild-type sequence. Methods of preparing such proteins are disclosed herein and/or are known in the art.
As used herein, the terms "nucleotide", "base" and "nucleic acid" are used interchangeably and are equivalent. Additionally, the terms "nucleotide sequence", "nucleic acid sequence", "nucleic acid molecule" and "segment" are used interchangeably and are equivalent. The terms "nucleotide", "base", "nucleic acid", "nucleotide sequence", "nucleic acid sequence", "nucleic acid molecule" and "segment" mean any of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), oligonucleotides, fragments generated by the polymerase chain reaction (PCR), and fragments generated by any of ligation, scission, endonuclease action, and exonuclease action. A nucleic acid can comprise monomers that are naturally-occurring nucleotides (such as deoxyribonucleotides and ribonucleotides), or analogs of naturally-occurring nucleotides (e.g., α-enantiomeric forms of naturally-occurring nucleotides), or a combination of both. Modified nucleotides can have modifications in sugar moieties and/or in pyrimidine or purine base moieties. Sugar modifications include, for example, replacement of one or more hydroxyl groups with halogens, alkyl groups, amines, and azido groups, or sugars can be functionalized as ethers or esters. Moreover, an entire sugar moiety can be replaced with sterically and electronically similar structures, such as aza-sugars and carbocyclic sugar analogs. Examples of modifications in a base moiety include alkylated purines and pyrimidines, acylated purines or pyrimidines, or other well-known heterocyclic substitutes. Nucleic acid monomers can be linked by phosphodiester bonds or analogs of such linkages. Analogs of phosphodiester linkages include phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like. The term "nucleic acid" also includes so-called "peptide nucleic acids," which comprise naturally-occurring or modified nucleic acid bases attached to a polyamide backbone.
Nucleic acids can be either single stranded or double stranded. As used herein, the terms "oligonucleotide" and "polynucleotide" are used interchangeably and mean a single- or double-stranded DNA or RNA sequence. Typically, an oligonucleotide is a short segment of about 50 or fewer nucleotides. An oligonucleotide or a polynucleotide can be naturally occurring or synthetic, but oligonucleotides are typically prepared by synthetic means. In the context of the present invention, an "oligonucleotide" and/or a "polynucleotide" includes DNA sequences and/or their complements. The sequences can be, for example, between 1 and 250 bases, and, in some embodiments, between 5-10, 5-20, 10-20, 10-50, 20-50, 10-100 bases, or 100 or more bases in length. The terms "oligonucleotide" and "polynucleotide" refer to a molecule comprising two or more nucleotides. For example, an oligonucleotide or polynucleotide can comprise a nucleotide sequence of a full length cDNA sequence, including any 5' and 3' untranslated sequences, the coding region, with or without a signal sequence, a secreted protein coding region, as well as fragments, epitopes, domains, and variants of the nucleic acid sequence. A "polynucleotide" of the present invention also includes those polynucleotides capable of hybridizing, under stringent hybridization conditions (examples of which are provided herein), to sequences described herein, or the complement thereof.
Thus, an oligonucleotide or a polynucleotide of the present invention can comprise any polyribonucleotide or polydeoxyribonucleotide, and can comprise unmodified RNA or DNA or modified RNA or DNA. For example, a polynucleotide can comprise single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that can be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, a polynucleotide can comprise triple-stranded regions comprising RNA or DNA or both RNA and DNA. A polynucleotide can also contain one or more modified bases or DNA or RNA backbones modified for stability or for other reasons. "Modified" bases include, for example, tritylated bases and unusual bases, such as inosine. A variety of modifications can be made to DNA and RNA; thus, "oligonucleotide" and "polynucleotide" embrace chemically, enzymatically, or metabolically modified forms. Moreover, as used herein, a "polypeptide", defined further herein, refers to a molecule having the translated amino acid sequence generated directly or indirectly from a polynucleotide.
By employing the disclosure presented herein, a nucleic acid molecule of the present invention encoding a polypeptide of the present invention can be obtained using standard cloning and screening procedures, such as those for cloning cDNAs using mRNA as starting material.
As used herein, the terms "organism", "subject" and "patient" are used interchangeably and mean any organism referenced herein, including prokaryotes, though the terms preferably refer to eukaryotic organisms, notably mammals (e.g., mice, rats, dogs and pigs), including humans.
As used herein, the terms "protein", "polypeptide" and "peptide" are used interchangeably and mean any polymer comprising any of the 20 protein amino acids, regardless of its size. Although "protein" is often used in reference to relatively large polypeptides, and "peptide" is often used in reference to small polypeptides, usage of these terms in the art overlaps and varies. Therefore, the term "polypeptide" as used herein refers to peptides, polypeptides and proteins, unless otherwise noted. The terms "protein", "polypeptide" and "peptide" are used interchangeably herein.
Thus, a polypeptide of the present invention can comprise amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and can contain amino acids other than the 20 gene-encoded amino acids. A polypeptide can be modified by either natural processes, such as by posttranslational processing, or by chemical modification techniques which are known in the art. Such modifications will be known to those of ordinary skill in the art. Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. The same type of modification can be present in the same or varying degrees at several sites in a given polypeptide.
Also, a given polypeptide can contain many types of modifications. A polypeptide can be branched, for example, as a result of ubiquitination, or a polypeptide can be cyclic, with or without branching. Cyclic, branched, and branched cyclic polypeptides can result from posttranslational natural processes or can be made by synthetic methods. Representative modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination (see, e.g., Creighton, Proteins - Structure And Molecular Properties, 2nd ed., W. H. Freeman and Company, New York, New York, USA (1993); Posttranslational Covalent Modification Of Proteins, (Johnson, ed.), Academic Press, New York, New York, USA, pp. 1-12 (1983); Seifter et al., (1990) Methods Enzymol. 182:626-646; Rattan et al., (1992) Ann. N. Y. Acad. Sci. 663:48-62, all of which are incorporated herein by reference in their entireties).
As used herein, a "polypeptide having biological activity" refers to a polypeptide exhibiting activity similar, but not necessarily identical to, an activity of a HDHD4 polypeptide of the present invention, including mature forms, as measured in a particular biological assay (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate; see, e.g., Maliekal et al., (2006) Glycobiology 16:165-172), with or without dose dependency. In a case where dose dependency does exist, it need not be identical to that of the polypeptide, but rather substantially similar to the dose-dependence in a given activity as compared to a polypeptide of the present invention (e.g., a polypeptide having biological activity can exhibit activity of not more than about 25-fold less and, preferably, not more than about ten-fold less activity, and most preferably, not more than about three-fold less activity relative to a polypeptide of the present invention).
As used herein, the term "root mean square deviation" means the square root of the arithmetic mean of the squares of the deviations. It is a way to express the deviation or variation from a trend or object. In the present invention, for example, "root mean square deviation" describes the variation in the backbone of a mutant or homologous protein from the backbone of HDHD4 or a binding pocket portion thereof, as defined by the structure coordinates of HDHD4 described in Table 1 or Table 2 herein.
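The definition above can be illustrated with a minimal sketch. It assumes the two coordinate sets have already been superimposed and that atoms are paired one-to-one; the function name and the coordinates are hypothetical and are not taken from Table 1 or Table 2.

```python
import math

def rmsd(coords_a, coords_b):
    """Square root of the mean of the squared distances between paired (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared deviation of one atom pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical, already-superimposed backbone atom positions (angstroms).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(reference, model))  # every atom is displaced by 1.0, so the RMSD is 1.0
```

In practice an optimal superposition step would normally precede this calculation; that step is omitted here to keep the sketch short.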
As used herein, the term "space group" means the arrangement of symmetry elements of a crystal.
As used herein, the term "stringent hybridization conditions" refers to an overnight incubation at 42°C in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 μg/mL denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
As used herein, the terms "structure coordinates," "structural coordinates," "atomic structural coordinates" and "atomic coordinates" mean mathematical coordinates derived from mathematical equations related to the patterns obtained from the diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of a molecule in crystal form. The diffraction data are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps are then used to establish the positions of the individual atoms within the unit cell of the crystal.
Those of ordinary skill in the art understand that a set of structure coordinates determined by X-ray crystallography is not without standard error. For the purpose of the present invention, any set of structure coordinates for HDHD4, including those describing a HDHD4 mutant, etc., or the core domain or the cap domain or a binding pocket portion thereof, taken in conjunction or independently, that have a root mean square deviation (RMSD) from ideal of preferably no more than about 3.0 Å, more preferably no more than about 2.0 Å, even more preferably less than about 1.0 Å, yet more preferably less than about 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, or 0.1 Å, most preferably 0.0 Å when superimposed on the polypeptide backbone Cα atoms defined by the structural coordinates listed in Table 1 or Table 2 herein are considered identical.
As used herein, the term "substantially identical" means at least 75% sequence identity between nucleotide or amino acid sequences. Sequence similarity is calculated based on a reference sequence, which can be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc. In the context of nucleic acids, a reference sequence will usually be at least about 18 nucleotides (nt) long, more usually at least about 30 nt long, and can extend to the complete sequence that is being compared. Algorithms for sequence analysis are known in the art, such as BLAST, described in Altschul et al., (1990) J. Mol. Biol. 215:403-10. Percent identity or percent similarity of a DNA or peptide sequence can be determined, for example, by comparing sequence information using the GAP computer program, available from the University of Wisconsin Genetics Computer Group. The GAP program utilizes the alignment method of Needleman et al., (1970) J. Mol. Biol. 48:443, as revised by Smith et al., (1981) Adv. Appl. Math. 2:482. Briefly, the GAP program defines similarity as the number of aligned symbols (i.e., nucleotides or amino acids) that are similar, divided by the total number of symbols in the shorter of the two sequences. The preferred parameters for the GAP program are the default parameters, which do not impose a penalty for end gaps. See, e.g., Schwartz et al. (eds.), (1979), Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 357-358, and Gribskov et al., (1986) Nucl. Acids Res. 14:6745.
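The ratio described above (matching aligned symbols divided by the length of the shorter sequence) can be sketched as follows. This is not the GAP program: it assumes the two sequences have already been aligned by some other tool, uses "-" as a hypothetical gap character, and counts only exact matches. The function name and the example sequences are illustrative.

```python
def aligned_match_fraction(aligned_a, aligned_b, gap="-"):
    """Matching aligned symbols divided by the ungapped length of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count columns where both sequences carry the same non-gap symbol.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x != gap and y != gap and x == y
    )
    # Ungapped lengths of the two original sequences.
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one symbol")
    return matches / shorter

# Hypothetical pre-aligned nucleotide sequences.
seq_a = "ACGTACGT"
seq_b = "ACGTTC-T"

fraction = aligned_match_fraction(seq_a, seq_b)
print(fraction >= 0.75)  # 6 matches over 7 symbols in the shorter sequence
```

Replacing the equality test with a table of biochemically similar residues would turn the same ratio into a similarity score rather than an identity score.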
The term "similarity" is contrasted with the term "identity". Similarity is defined as above; "identity", however, means a nucleic acid or amino acid sequence having the same amino acid at the same relative position in a given family member of a gene family. Homology and similarity are generally viewed as broader terms than the term identity. Biochemically similar amino acids, for example leucine/isoleucine or glutamate/aspartate, can be present at the same position; these are not identical per se, but are biochemically "similar." As disclosed herein, these are referred to as conservative differences or conservative substitutions. This differs from a conservative mutation at the DNA level, which changes the nucleotide sequence without making a change in the encoded amino acid, e.g. TCC to TCA, both of which encode serine.
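The distinction drawn above can be made concrete with a small sketch. It uses a deliberately partial codon table from the standard genetic code and only the two conservative amino acid pairs named in the text; the labels "silent", "conservative" and "non-conservative" and the function name are illustrative choices, not terms defined by this document.

```python
# Partial standard genetic code: only the codons used in the examples below.
CODON_TO_AA = {
    "TCC": "Ser", "TCA": "Ser",
    "CTC": "Leu", "ATC": "Ile",
    "GAA": "Glu", "GAT": "Asp",
    "CCC": "Pro",
}

# Biochemically similar pairs named in the text (leucine/isoleucine, glutamate/aspartate).
CONSERVATIVE_PAIRS = {frozenset(("Leu", "Ile")), frozenset(("Glu", "Asp"))}

def classify_codon_change(before, after):
    """Classify a codon change by its effect on the encoded amino acid."""
    aa_before = CODON_TO_AA[before]
    aa_after = CODON_TO_AA[after]
    if aa_before == aa_after:
        return "silent"  # nucleotide change, same amino acid
    if frozenset((aa_before, aa_after)) in CONSERVATIVE_PAIRS:
        return "conservative"  # different but biochemically similar amino acid
    return "non-conservative"

print(classify_codon_change("TCC", "TCA"))  # both encode serine
print(classify_codon_change("CTC", "ATC"))  # leucine to isoleucine
print(classify_codon_change("TCC", "CCC"))  # serine to proline
```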
As used herein, DNA analog sequences are "substantially identical" to specific DNA sequences disclosed herein if: (a) the DNA analog sequence is derived from coding regions of the nucleic acid sequences shown in SEQ ID NOs: 1 and 3, or (b) the DNA analog sequence is capable of hybridization with DNA sequences of (a) under stringent conditions and which encode a biologically active prenyltransferase gene product; or (c) the DNA sequences are degenerate as a result of alternative genetic code to the DNA analog sequences defined in (a) and/or (b). Substantially identical analog proteins and nucleic acids will have between about 70% and 80%, preferably between about 81% and about 90%, or even more preferably between about 91% and 99.9% sequence identity with the corresponding sequence of the native protein or nucleic acid. Sequences having lesser degrees of identity but comparable biological activity are considered to be equivalents.
As used herein, the term "unit cell" means a basic parallelepiped shaped block. The entire volume of a crystal can be constructed by regular assembly of such blocks. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which adds cumulatively to form a crystal. Thus, the term "unit cell" means a fundamental portion of a crystal that is repeated infinitely by translation in three dimensions. A unit cell is characterized by three vectors a, b, and c, not located in one plane, which form the edges of a parallelepiped. Angles α, β and γ define the angles between the vectors: angle α is the angle between vectors b and c; angle β is the angle between vectors a and c; and angle γ is the angle between vectors a and b. The entire volume of a crystal can be constructed by associating a plurality of unit cells.
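As a worked illustration of the parameters just defined, the volume of the parallelepiped follows from the three edge lengths and three angles by the standard triclinic formula. The sketch below assumes lengths in angstroms and angles in degrees; the function name is hypothetical, and the example values are the lattice constants reported later in this document for the two crystalline forms.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a unit cell with edges a, b, c and angles alpha, beta, gamma (degrees)."""
    ca = math.cos(math.radians(alpha))  # alpha lies between b and c
    cb = math.cos(math.radians(beta))   # beta lies between a and c
    cg = math.cos(math.radians(gamma))  # gamma lies between a and b
    factor = 1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    if factor <= 0.0:
        raise ValueError("angles do not describe a valid parallelepiped")
    return a * b * c * math.sqrt(factor)

# Lattice constants quoted in this document (angstroms, degrees).
triclinic = unit_cell_volume(46.4, 53.5, 64.0, 65.5, 75.0, 85.2)
orthorhombic = unit_cell_volume(46.8, 102.6, 186.7)

print(round(triclinic), round(orthorhombic))  # volumes in cubic angstroms
```

When all three angles are 90°, the expression reduces to the simple product of the edge lengths.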
As used herein, the term "vector" means a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.
Description of Tables
Table 1 is a table showing structure coordinates describing the structure of VGlO HDHD4 (SEQ ID NO:4) in complex with magnesium and phosphate and/or vanadate atoms.
Table 2 is a table showing structure coordinates describing the structure of wild-type HDHD4 (SEQ ID NO:2) in complex with magnesium and vanadate atoms.
Properties of Crystals of the Present Invention
One HDHD4 polypeptide sequence comprises the amino acid sequence of SEQ ID NO:4 which is encoded by SEQ ID NO:3. In one embodiment, the crystalline form has lattice constants of a = 46.4 Å, b = 53.5 Å, c = 64.0 Å, α = 65.5°, β = 75.0°, and γ = 85.2°. The symmetry was consistent with space group P1. Diffraction images from these crystals frequently indexed as the [1 0 0 0 -1 0 0 -1 -1] transform (a = 46.4 Å, b = 53.5 Å, c = 63.9 Å, α = 114.5°, β = 100.4°, and γ = 95.0°). Based on this unit cell and space group, the asymmetric unit (which also equals the unit cell for space group P1) was determined to contain two independent HDHD4 monomers (51% solvent fraction).
Another HDHD4 polypeptide sequence comprises the amino acid sequence of SEQ ID NO:2 which is encoded by SEQ ID NO:1. In one embodiment, the crystalline form has lattice constants of a = 46.8 Å, b = 102.6 Å, c = 186.7 Å. The symmetry was consistent with space group P212121. Based on this unit cell and space group, the asymmetric unit was determined to contain three independent HDHD4 monomers (53% solvent fraction).
In representative embodiments, the crystalline form is a triclinic crystalline form having a space group of P1 or an orthorhombic crystalline form having a space group of P212121. In one embodiment, the crystalline form is described by the structure coordinates of Table 1 or Table 2 and the three-dimensional structure of the crystallized complex is determined to a resolution of about 3.0 Å or better. In the crystalline forms, there were either two or three HDHD4 polypeptides in the asymmetric unit.
Structural Features of a Crystalline Form of a HDHD4 Polypeptide Complexed with a Magnesium Atom and Phosphate and a Crystalline Form of a HDHD4 Polypeptide Complexed with a Magnesium Atom and the Phosphate Mimetic Vanadate
After forming the described crystalline forms (shown in Figure 1 and described elsewhere in the text) and acquiring X-ray diffraction data, the three-dimensional structure of the HDHD4 polypeptide in complex with magnesium and phosphate/vanadate and the three-dimensional structure of the HDHD4 polypeptide in complex with magnesium and vanadate were determined (shown in Figures 2, 4, and 5). The HDHD4 X-ray crystal structures disclosed herein exhibit an "open" conformation with phosphate and Mg2+ bound or the inhibitor vanadate (IC50 ≈ 3 μM) and Mg2+ bound. The phosphate-based crystallization conditions utilized phosphate at concentrations (0.8 - 1.8 M) much greater than the vanadate concentration (1.5 mM). Consequently, the anion bound in the active site of crystals grown in the presence of high phosphate levels could predominantly be phosphate rather than vanadate. In an effort to clarify the nature of the anion, crystals were grown under the phosphate conditions but with vanadate replaced by tungstate (over 15-fold weaker binder than vanadate with an IC50 ≈ 50 μM). X-ray analysis of these tungstate crystals at the tungsten LIII edge (λ = 1.2140 Å) showed no significant anomalous signal. Thus, under high-phosphate conditions, for example in the case of tungstate and in the case of vanadate, the bound anion was predominantly phosphate. Therefore, the protein was purified in the absence of added phosphate, non-phosphate crystallization conditions were developed, and the structure with only vanadate present was determined. The presence of vanadate was confirmed by an anomalous difference Fourier map (λ = 1 Å: vanadium f'' = 1.00 eV and f' = 0.37 eV) computed from measured anomalous differences and phases calculated from the model with vanadate omitted. Irrespective of the anion bound in the active site, the two forms of the protein are iso-structural.
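Two of the quantities in the paragraph above lend themselves to a quick arithmetic check: the photon energy corresponding to the quoted X-ray wavelengths, and the molar excess of phosphate over vanadate in the phosphate-based conditions. The sketch below uses the standard relation E = hc/λ with hc rounded to 12398.42 eV·Å; the function names are hypothetical.

```python
HC_EV_ANGSTROM = 12398.42  # Planck constant times speed of light, in eV·Å (rounded)

def wavelength_to_energy_ev(wavelength_angstrom):
    """Photon energy in eV for an X-ray wavelength given in angstroms."""
    return HC_EV_ANGSTROM / wavelength_angstrom

def molar_excess(major_molar, minor_molar):
    """Ratio of two concentrations expressed in the same units."""
    return major_molar / minor_molar

# Wavelength quoted for the tungstate data collection.
edge_energy = wavelength_to_energy_ev(1.2140)

# Phosphate (0.8 to 1.8 M) relative to vanadate (1.5 mM = 0.0015 M).
low_excess = molar_excess(0.8, 0.0015)
high_excess = molar_excess(1.8, 0.0015)

# Ratio of the quoted IC50 values (tungstate about 50 uM, vanadate about 3 uM).
ic50_ratio = 50.0 / 3.0

print(round(edge_energy), round(low_excess), round(high_excess), round(ic50_ratio, 1))
```

On these figures phosphate is present at roughly a 500- to 1200-fold molar excess over vanadate, which is consistent with the text's reasoning that phosphate could dominate the active site under those conditions.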
These structures are the first of a subfamily I member shown to adopt an open conformation with phosphate/a phosphate mimic bound. The active site is more accessible than the open conformation reported for β-phosphoglucomutase (Lahiri et al., (2002) Biochemistry 41:8351-8359; Lahiri et al., (2004) Biochemistry 43:2812-2820), also a member of subfamily I.
Producing a HDHD4 Polypeptide
A HDHD4 polypeptide of the present invention can be prepared using any or a combination of technologies known to those of ordinary skill in the art. In one embodiment of the present invention, a HDHD4 polypeptide is expressed in a recombinant system. In another embodiment, a HDHD4 polypeptide is isolated from a biological source. And in yet another embodiment, a HDHD4 polypeptide is synthesized de novo. Further discussion of these methods is provided hereinbelow.
Other methods of producing a HDHD4 polypeptide will be known to those of ordinary skill in the art, upon consideration of the present disclosure.
De novo Synthesis
In addition to recombinant production, fragments of a HDHD4 polypeptide can be produced by direct peptide synthesis using solid phase techniques (Roberge et al., (1995) Science 269:202-204; Merrifield, (1963) J. Am. Chem. Soc. 85:2149-2154). Protein synthesis can be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the ABI 431A Peptide Synthesizer (Applied Biosystems, Foster City, California, USA). Various fragments of a HDHD4 polypeptide can be chemically synthesized separately and then combined using chemical methods to produce a full-length molecule. In a further embodiment, sequences encoding a HDHD4 polypeptide can be synthesized in whole, or in part, using chemical methods known in the art (see, for example, Caruthers et al., (1980) Nucl. Acids Res. Symp. Ser. 215-223 and Horn et al., (1980) Nucl. Acids Res. Symp. Ser. 225-232; Hunkapiller et al., (1984) Nature 310:105-111; Creighton, Proteins, Structures and Molecular Principles, W.H. Freeman & Co., New York, New York, USA (1983), incorporated herein by reference). Alternatively, a HDHD4 protein itself, or a fragment or portion thereof, can be produced using chemical methods to synthesize the amino acid sequence of a HDHD4 polypeptide, or a fragment or portion thereof. For example, peptide synthesis can be performed using various solid-phase techniques (Roberge et al., (1995) Science 269:202-204; Merrifield, (1963) J. Am. Chem. Soc. 85:2149-2154) and automated synthesis can be achieved, for example, using the ABI 431A Peptide Synthesizer (Applied Biosystems, Foster City, California). Furthermore, if desired, non-naturally occurring amino acids or chemical amino acid analogs can be introduced as a substitution or addition into the polypeptide sequence.
Non-naturally occurring amino acids include, but are not limited to, the D isomers of the common amino acids, 2,4-diaminobutyric acid, alpha-amino isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine, norvaline, hydroxyproline, sarcosine, citrulline, homocitrulline, cysteic acid, t-butylglycine, t-butylalanine, phenylglycine, cyclohexylalanine, beta-alanine, alpha-alanine, fluoro-amino acids, designer amino acids such as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, trans-3-methylproline, 2,4-methanoproline, cis-4-hydroxyproline, trans-4-hydroxyproline, N-methyl-glycine, allo-threonine, methylthreonine, hydroxyethyl-cysteine, hydroxyethylhomocysteine, nitroglutamine, homoglutamine, pipecolic acid, tert-leucine, norvaline, 2-azaphenylalanine, 3-azaphenylalanine, 4-azaphenyl-alanine, 4-fluorophenylalanine, 4-hydroxyproline, 6-N-methyl lysine, 2-aminoisobutyric acid, isovaline and N-methyl serine, and amino acid analogs in general.
Techniques are known in the art for incorporating non-naturally occurring amino acid residues into proteins. For example, an in vitro system can be employed wherein nonsense mutations are suppressed using chemically aminoacylated suppressor tRNAs. Methods for synthesizing amino acids and aminoacylating tRNA are known in the art, some of which are described herein. In one approach, transcription and translation of plasmids containing nonsense mutations can be carried out in a cell free system comprising an E. coli S30 extract and commercially available enzymes and other reagents. The newly synthesized HDHD4 polypeptide or peptide can be substantially purified by preparative high performance liquid chromatography (see, e.g., Creighton, Proteins, Structures and Molecular Principles, W.H. Freeman & Co., New York, New York, USA (1983)), by reverse-phase high performance liquid chromatography (HPLC), or other purification methods as known and practiced in the art. The composition of the synthetic peptides can be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation procedure). In addition, the amino acid sequence of a HDHD4 polypeptide, or any portion thereof, can be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant polypeptide. In yet another approach, E. coli cells are cultured in the absence of a natural amino acid that is to be replaced (e.g., phenylalanine) and in the presence of a desired non-naturally occurring amino acid(s). The non-naturally occurring amino acid is incorporated into the protein in place of its natural counterpart (Koide et al., (1994) Biochem. 33:7470-76). Naturally occurring amino acid residues can be converted to non-naturally occurring species by in vitro chemical modification.
Chemical modification can be combined with site-directed mutagenesis (as described herein) to further expand the range of substitutions (Wynn & Richards, (1993) Protein Sci. 2:395-403).
Isolation from a Biological Source
In yet another embodiment, a HDHD4 polypeptide can be isolated from any suitable animal source, particularly from a mammal (e.g., from liver, brain, colon, breast or lung tissue). Methods for purifying a HDHD4 protein are known and can be employed to obtain a HDHD4 polypeptide as described herein. In another embodiment, a HDHD4 polypeptide can be isolated from a biological sample using standard protein purification methodology known to those of skill in the art (see, e.g., Janson, Protein Purification: Principles, High Resolution Methods, and Applications, (2nd ed.) Wiley, New York, (1997); Rosenberg, Protein Analysis and Purification: Benchtop Techniques, Birkhauser, Boston, (1996); Walker, The Protein Protocols Handbook, Humana Press, Totowa, New Jersey, (1996); Doonan, Protein Purification Protocols, Humana Press, Totowa, New Jersey, (1996); Scopes, Protein Purification: Principles and Practice, Springer-Verlag, New York, (1994); Harris, Protein Purification Methods: A Practical Approach, IRL Press, New York, (1989), all of which are incorporated in their entireties herein by reference).
Recombinant Methods
Preparation of a DNA Sequence Encoding a HDHD4 Polypeptide
In accordance with the present invention, conventional molecular biology, microbiology, recombinant DNA and protein chemistry techniques known to those of ordinary skill of the art can be employed to produce a DNA sequence encoding a HDHD4 polypeptide. Such techniques are explained fully in the relevant literature (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, (3rd ed.) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, USA (2001); Glover, DNA Cloning: A Practical Approach, (2nd ed.) IRL Press, New York, USA (1995); Gait, Oligonucleotide Synthesis: A Practical Approach, IRL Press, New York, USA (1984); Hames & Higgins, Nucleic Acid Hybridisation: A Practical Approach, IRL Press, Washington, D.C., USA (1985); Hames & Higgins, Protein Expression: A Practical Approach, Oxford University Press, New York, USA, (1999); Masters, Animal Cell Culture: A Practical Approach, Oxford University Press, New York, USA (2000); Bickerstaff, Immobilization of Cells And Enzymes, Humana Press, Totowa, New Jersey, USA (1997); Perbal, A Practical Guide To Molecular Cloning (2nd ed.) Wiley, New York, New York, USA (1988); Current Protocols in Molecular Biology, (Ausubel et al., eds.), Greene Publishing Associates and Wiley-Interscience, New York (2002); Ausubel, Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology, (4th ed.) John Wiley & Sons, New York, New York, USA (1999), all of which are incorporated herein). A DNA sequence encoding a HDHD4 polypeptide (including mutants, analogs, derivatives and functional equivalents, as described herein), can be prepared by various molecular biology methods known in the art and disclosed herein.
Expression Vectors
Upon providing a nucleic acid sequence encoding a HDHD4 polypeptide or peptide (e.g., SEQ ID NOs:2 or 4), or a mutant, analog, derivative or functional equivalent thereof, the encoded polypeptide can be expressed. To express a biologically active HDHD4 polypeptide or peptide, a nucleotide sequence encoding a HDHD4 polypeptide, or a functional equivalent thereof, can be inserted into an appropriate expression vector, i.e., a vector that contains the necessary elements for the transcription and translation of the inserted coding sequence.
In one embodiment of the present invention, an expression vector contains an isolated and purified polynucleotide sequence encoding a HDHD4 polypeptide, or a functional fragment thereof, in which the HDHD4 polypeptide comprises the amino acid sequence as set forth in SEQ ID NO:2 or SEQ ID NO:4. Alternatively, an expression vector can contain the complement of a HDHD4 nucleic acid sequence.
Expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids can be used in the present invention. Methods that are known to those of ordinary skill in the art can be used to construct expression vectors containing sequences encoding one or more HDHD4 polypeptides along with appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described, for example, in Current Protocols in Molecular Biology, (Ausubel et al., eds.), Greene Publishing Associates and Wiley-Interscience, New York (2002); Ausubel. Short Protocols in Molecular Biology: A Compendium of Methods from Current Protocols in Molecular Biology. (4th ed.) John Wiley & Sons, New York, New York, USA (1999); and Sambrook et al., Molecular Cloning: A Laboratory Manual. (3rd ed.) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, USA (2001).
The present invention also relates to expression vectors containing genes encoding analogs, derivatives and mutants of a HDHD4 polypeptide, including modified HDHD4 proteins of the present invention, that have the same or homologous functional activity as a HDHD4 polypeptide, and homologs thereof. Such cloning vectors can be prepared as described. Thus, the production and use of derivatives, analogs and mutants related to HDHD4 are within the scope of the present invention.
Introducing a Nucleic Acid into a Host Cell
Recombinant molecules can be introduced into host cells via transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, lipofection (lysosome fusion), use of a gene gun, or a DNA vector transporter (see, e.g., Wu et al., (1992) J. Biol. Chem. 267:963-967; Wu & Wu, (1988) J. Biol. Chem. 263:14621-14624).
In another example, the cloned gene can be contained on a shuttle vector plasmid, which provides for expansion in a cloning cell, e.g., E. coli, and facile purification for subsequent insertion into an appropriate expression cell line, if such is desired. For example, a shuttle vector, which is a vector that can replicate in more than one type of organism, can be prepared for replication in both E. coli and Saccharomyces cerevisiae by linking sequences from an E. coli plasmid with sequences from a yeast plasmid.
Vector-Host Systems
After a DNA sequence encoding a HDHD4 polypeptide has been identified or isolated, the DNA sequence can then be inserted into an appropriate cloning vector and expressed in a host cell. Any suitable vector-host system known in the art can be employed in the present invention. For example, plasmids or modified viruses can be employed, but the vector system should be compatible with the host cell selected. Examples of suitable vectors include, but are not limited to, plasmids, such as pBR322 derivatives or pUC plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, etc. The insertion into a cloning vector can be accomplished by ligating the DNA fragment into a cloning vector that comprises complementary cohesive termini. However, if the complementary restriction sites used to fragment a given DNA sequence are not present in the cloning vector, the ends of the DNA molecules can be enzymatically modified. Alternatively, any desired site can be produced by ligating nucleotide sequences (linkers) onto the DNA termini. Such ligated linkers can comprise specific chemically synthesized oligonucleotides that comprise a restriction endonuclease recognition sequence or that encode a protease site, a purification aid (such as a His tag, as was done in the present invention) or other desired feature.
Thus, a variety of host-expression vector systems can be utilized to express a DNA sequence encoding a HDHD4 polypeptide. These include but are not limited to microorganisms such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing a DNA sequence encoding a HDHD4 polypeptide; yeast transformed with recombinant yeast expression vectors containing a DNA sequence encoding a HDHD4 polypeptide; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing a DNA sequence encoding a HDHD4 polypeptide; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, (CaMV); tobacco mosaic virus, (TMV)) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing a DNA sequence encoding a HDHD4 polypeptide; or animal cell systems. The expression elements of these systems vary in their strength and specificities. In another approach, translation is carried out in Xenopus oocytes by microinjection of mutated mRNA and chemically aminoacylated suppressor tRNAs (Turcatti et al, (1996) J. Biol. Chem. 271 :19991-
Depending on the host-vector system employed, any of a number of suitable transcription and translation elements, including constitutive and inducible promoters, can be used in an expression vector. For example, when cloning in bacterial systems, inducible promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like can be used. When cloning in insect cell systems, promoters such as the baculovirus polyhedrin promoter can be used. When cloning in mammalian cell systems, promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter) can be used. When generating cell lines that contain multiple copies of a nucleic acid sequence encoding a polypeptide of the present invention, SV40-, BPV- and Epstein-Barr (EBV)-based vectors can be used with an appropriate selectable marker. Representative methods of expressing a DNA sequence encoding a HDHD4 polypeptide are described herein.
Cultured mammalian cells are preferred hosts within the present invention. Methods for introducing exogenous DNA into mammalian host cells include calcium phosphate-mediated transfection (Wigler et al., (1978) Cell 14:725; Corsaro & Pearson, (1981) Somat. Cell Genet. 7:603; Graham & Van der Eb, (1973) Virology 52:456), electroporation (Neumann et al., (1982) EMBO J. 1:841-845), DEAE-dextran mediated transfection (Current Protocols in Molecular Biology, (Ausubel et al., eds.), Greene Publishing Associates and Wiley-Interscience, (2002)), and liposome-mediated transfection (Hawley-Nelson et al., (1993) Focus 15:73; Ciccarone et al., (1993) Focus 15:80), all of which are incorporated herein by reference in their entireties. The production of recombinant polypeptides in cultured mammalian cells is disclosed, for example, in U.S. Patent Nos. 4,713,339; 4,784,950; 4,579,821; and 4,656,134, which are incorporated herein by reference. Examples of cultured mammalian cells include the COS-1 (ATCC No. CRL 1650), COS-7 (ATCC No. CRL 1651), BHK 570 (ATCC No. CRL 10314), 293 (ATCC No. CRL 1573; Graham et al., (1977) J. Gen. Virol. 36:59-72) and Chinese hamster ovary (e.g., CHO-K1; ATCC No. CCL 61 or DG44) cell lines. Additional suitable cell lines are known in the art and available from public depositories such as the American Type Culture Collection (ATCC), Manassas, Virginia.
In mammalian host cells, a number of viral-based expression systems can be utilized. In cases where an adenovirus is used as an expression vector, sequences encoding a polypeptide of the present invention can be ligated into an adenovirus transcription/translation complex containing the late promoter and tripartite leader sequence. Insertion into a non-essential E1 or E3 region of the viral genome can be used to obtain a viable virus which is capable of expressing a HDHD4 polypeptide in infected host cells (see, e.g., Logan & Shenk, (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. Other expression systems can also be used, such as, but not limited to, yeast, plant, and insect vectors.
Alternatively, yeast-based systems can be employed to express a recombinant polypeptide of the present invention. Techniques for transforming yeast cells with exogenous DNA to produce recombinant polypeptides therefrom are disclosed by, for example, U.S. Patent Nos. 4,599,311; 4,931,373; 4,870,008; 5,037,743; and 4,845,075, which are incorporated herein by reference. Transformation systems for other yeasts, including Hansenula polymorpha, Schizosaccharomyces pombe, Kluyveromyces lactis, Kluyveromyces fragilis, Ustilago maydis, Pichia pastoris, Pichia guillermondii, and Candida maltosa are known in the art. A preferred system utilizes Pichia methanolica (see, PCT Publication WO 97/17450). For alternative transformation systems, see, for example, Gleeson et al., (1986) J. Gen. Microbiol. 132:3459-3465 and U.S. Patent No. 4,882,279. Aspergillus cells can be utilized according to the methods of U.S. Patent No. 4,935,349, which is incorporated herein by reference. Methods for transforming Acremonium chrysogenum are disclosed in U.S. Patent No. 5,162,228, which is incorporated herein by reference. Methods for transforming Neurospora are disclosed in U.S. Patent No. 4,486,533, which is incorporated herein by reference.
Bacterial systems can also be employed to express a recombinant polypeptide of the present invention. In bacterial systems, a number of expression vectors can be selected, depending upon the use intended for the expressed HDHD4 polypeptide product. For example, when large quantities of expressed protein are needed for the generation of antibodies or for crystallization, vectors that direct high level expression of fusion proteins that can be readily purified can be used. Such vectors include, but are not limited to, the multifunctional E. coli cloning and expression vectors such as BLUESCRIPT (Stratagene, La Jolla, California, USA), in which the sequence encoding a polypeptide of interest can be ligated into the vector in-frame with sequences for the amino-terminal Met and the subsequent 7 residues of β-galactosidase, so that a hybrid protein is produced; pIN vectors (see, e.g., Van Heeke & Schuster, (1989) J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors (Promega, Madison, Wisconsin) can also be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can be easily purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. Proteins made in such systems can be designed to include, for example, heparin, thrombin, or Factor Xa protease cleavage sites so that the cloned polypeptide of interest can be released from the GST moiety at will.
Host cells transformed with a nucleotide sequence encoding a polypeptide of the present invention can be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those having skill in the art, expression vectors containing a polynucleotide which encodes a polypeptide of the present invention can be designed to contain signal sequences which direct secretion of the polypeptide through a prokaryotic or eukaryotic cell membrane. Other constructions can be used to join nucleic acid sequences encoding a polypeptide to a nucleotide sequence encoding a polypeptide domain that can facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals; protein A domains that allow purification on immobilized immunoglobulin; and the domain utilized in the FLAG® extension/affinity purification system (available from Immunex Corp., Seattle, WA). The inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen Corp., San Diego, California, USA) between the purification domain and the polypeptide can be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing a polypeptide of the present invention and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification by immobilized metal ion affinity chromatography (IMAC) as described by Porath et al., (1992) Prot. Exp. Purif. 3:263-281, while the enterokinase cleavage site provides a means for purifying the polypeptide from the fusion protein.
For a discussion of suitable vectors for fusion protein production, see Kroll et al., (1993) DNA Cell Biol. 12:441-453.
The presence of polynucleotide sequences encoding a polypeptide of the present invention can be detected by DNA-DNA or DNA-RNA hybridization, or by amplification using probes, portions, or fragments of polynucleotides encoding a polypeptide of the present invention. Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the nucleic acid sequences encoding a polypeptide of the present invention to detect transformants containing DNA or RNA encoding the polypeptide.
Method of Forming HDHD4 Crystals
The formation of HDHD4 crystals can depend on a number of different parameters, including pH, temperature, protein concentration, the nature of the solvent and precipitant, as well as the presence of ligands. Prior to the present disclosure, many routine crystallization experiments would be required to screen all these parameters for the few combinations that might generate a HDHD4 crystal suitable for X-ray diffraction analysis. The native, analog, derivative and mutant co-crystals, and fragments thereof, disclosed in the present invention can be obtained by a variety of techniques, including batch, liquid bridge, vapor diffusion (e.g., sitting drop and hanging drop methods; see, e.g., Taylor et al., (1992) J. Mol. Biol. 226:1287-1290, incorporated herein by reference) and microdialysis (see, e.g., McPherson, (1982) Preparation and Analysis of Protein Crystals, John Wiley, New York; McPherson, (1990) Eur. J. Biochem. 189:1-23; Weber, (1991) Adv. Protein Chem. 41:1-36, incorporated herein by reference). Seeding of the crystals can be useful in obtaining X-ray quality crystals. Standard micro and/or macroseeding of crystals can therefore be used in the context of the present invention. In one embodiment, hanging or sitting drop methods are used for the crystallization of HDHD4 polypeptides and fragments thereof.
In an example of a hanging drop method, a drop comprising about an amount of HDHD4 polypeptide is mixed with an equal volume of reservoir buffer and grown at about 20°C until crystals form. Methods for forming crystals are known in the art (McPherson. Crystallization of Biological Macromolecules. Cold Spring Harbor Press, Cold Spring Harbor, New York, USA (1999), incorporated herein by reference) and can be employed in the context of the present invention to form crystals comprising HDHD4, and/or fragments thereof.
Generation and Collection of Diffraction Data
Once a crystal comprising a HDHD4 polypeptide of the present invention is grown, X-ray diffraction data can be collected. Crystals can be prepared for diffraction using known methodology (see, e.g., Buhrke et al., A Practical Guide for the Preparation of Specimens for X-ray Fluorescence and X-ray Diffraction Analysis, Wiley-VCH, New York, New York, USA (1998), incorporated herein by reference).
Various methods can be employed to collect diffraction data, such as the use of a MAR imaging plate detector for X-ray diffraction data collection. For example, crystals can be characterized by using X-rays produced in a conventional source (such as a sealed tube or a rotating anode) or using a synchrotron source. Methods of characterization include, but are not limited to, precision photography, oscillation photography and diffractometer data collection. Data collection from heavy atom derivatives, such as those produced with a mercurial as described herein, can be performed using imaging plates. Alternatively, a HDHD4 polypeptide can be synthesized with selenium-methionine (Se-Met) in place of methionine, and the Se-Met multiwavelength anomalous dispersion data (Hendrickson, (1991) Science 254:51-58) can be collected at multiple X-ray wavelengths, corresponding to two remote points above and below the Se absorption edge (λ1 and λ4) and the absorption edge inflection point (λ2) and peak (λ3). Selenium sites can be located using software adapted for that purpose, such as SHELXS-97 in Patterson search mode (Sheldrick, (1990) Acta Cryst. A 46:467). Experimental phases can be estimated via a multiple isomorphous replacement/anomalous scattering strategy using MLPHARE (Otwinowski, Daresbury Study Weekend proceedings, 1991) with, for example, three of the wavelengths treated as derivatives and one (λ2) treated as the parent. In either case, data can be processed using HKL, DENZO and SCALEPACK (Otwinowski & Minor. Methods Enzymol. 276(A):307-326, (Carter, Jr. & Sweet, eds.), Academic Press, New York, New York, USA (1997)).
In addition, X-PLOR (Brunger, (1992) X-PLOR, Version 3.1. A System for X-ray Crystallography and NMR, Yale University Press, New Haven, Connecticut; Accelrys, San Diego, California) or HEAVY (Terwilliger, Los Alamos National Laboratory, Los Alamos, New Mexico) can be utilized for bulk solvent correction and B-factor scaling. After density modification and non-crystallographic averaging, the protein is built into an electron density map using the program O (Jones et al., (1991) Acta Cryst. A47:110-119). Model building interspersed with positional and simulated annealing refinement can facilitate an unambiguous trace and sequence assignment of a HDHD4 polypeptide or fragment thereof. Additional data collection methods, as well as general crystallographic methods, will be known to those of ordinary skill in the art upon consideration of the present disclosure (see, e.g., McRee. Practical Protein Crystallography. (2nd ed.) Academic Press, San Diego, California, USA (1999), incorporated herein by reference).
Solving a Three-dimensional Structure of the Present Invention
After acquiring X-ray diffraction data from a crystal comprising a HDHD4 polypeptide, the three-dimensional structure of the polypeptide can be determined by analyzing the diffraction data. Such an analysis can be employed whether the polypeptide is a wild-type polypeptide or a fragment thereof, or a mutant, derivative or analog of a HDHD4 polypeptide.
X-ray diffraction data can be solved by employing available software packages, such as O (Jones et al., (1991) Acta Cryst. A 47:110-119); FRODO (Jones et al., (1978) J. Appl. Crystallogr. 11:268-272) and TURBO FRODO; X-PLOR (Brunger. (1992) X-PLOR. Version 3.1. A System for X-ray Crystallography and NMR. Yale University Press, New Haven, Connecticut; Accelrys, San Diego, California); HKL; DENZO (Sawyer et al., eds., Proceedings of CCP4 Study Weekend, pp. 56-62, SERC Daresbury Lab., UK (1993)); SCALEPACK; the CCP4 package (SERC Collaborative Computing Project No. 4, Daresbury Laboratory, UK, 1979); MLPHARE (Wolf et al., eds., Isomorphous Replacement and Anomalous Scattering: Proceedings of CCP4 Study Weekend, pp. 80-86, SERC Daresbury Lab., UK (1991)).
The present invention therefore provides a method for determining the three-dimensional structure of a crystallized HDHD4 polypeptide, optionally in complex with a ligand, and optionally in complex with, or in further complex with, one or more metal-comprising moieties, such as vanadate, phosphate, tungstate, magnesium or manganese, to a resolution of about 3.0 Å or better. In one embodiment, the method comprises: (a) crystallizing a HDHD4 polypeptide in complex with a metal-comprising moiety to form a crystallized complex; and (b) analyzing the crystallized complex to determine a three-dimensional structure of the HDHD4 polypeptide in complex with a metal-comprising moiety. The crystallization can be carried out using the present disclosure as a guide. For example, various vapor diffusion techniques can be employed to generate a crystalline form of a HDHD4 polypeptide (including mutants, derivatives, etc.) in complex with a metal-comprising moiety. The analyzing can be carried out as described hereinabove and can include collecting and processing X-ray diffraction data, which can then provide a three-dimensional structure of the crystallized molecule(s). The same method can be employed to determine the three-dimensional structure of a crystallized HDHD4 polypeptide.
Co-crystals Comprising a HDHD4 Polypeptide
In one aspect of the present invention, a crystal comprising a HDHD4 polypeptide or fragment can also comprise a ligand. Those of ordinary skill in the art will recognize that crystals can be formed by co-crystallizing a HDHD4 polypeptide, analog, derivative, mutant or functional equivalent with a ligand known or suspected to bind to the HDHD4 polypeptide. Such a co-crystal can be formed by employing the techniques disclosed herein and known to those of ordinary skill in the art.
Formation of a Derivative Crystal
Solving a structure of a crystallized polypeptide can be difficult and time consuming. In order to facilitate the elucidation of a three-dimensional structure, derivative crystals comprising a heavy atom can be generated. In a representative method of forming a derivative crystal, the method comprises: (a) providing a crystalline form; and (b) associating a heavy atom with the crystalline form. In one embodiment, the association can be carried out by soaking the crystal with a solution containing a heavy atom (e.g., a mercurial). Preferably, when derivative crystals are formed, there should not be too many heavy atoms, since a large number can make the identification of their positions difficult. Additionally, the heavy atoms preferably should not change the structure of the molecule or of the crystal cell, i.e., the crystals should be isomorphous. Isomorphous replacement is usually done by diffusing different heavy-metal complexes into the channels of the preformed protein crystals. The crystalline form can comprise, for example, a HDHD4 polypeptide. In practice, a crystal (e.g., a crystal comprising a HDHD4 polypeptide) is usually soaked in a solution containing heavy metal atom salts or organometallic compounds, e.g., lead chloride, gold thiomalate, thiomersal or uranyl acetate, which can diffuse through the crystal and bind to the surface of the protein. The protein molecules expose side chains (such as SH groups) into these solvent channels that are able to bind heavy metals.
After the diffraction data from crystals with and without heavy metal atoms is analyzed, the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. This map is then interpreted as a polypeptide chain of a particular amino acid sequence. Following this stage of the process, the polypeptide chain is oriented with respect to the observed electron density and an initial model can then be built (see, e.g., Blundell & Johnson, Protein Crystallography, Academic Press, New York, New York, USA (1976); McRee, Practical Protein Crystallography. (2nd ed.) Academic Press, San Diego, California, USA (1999), both of which are incorporated herein by reference).
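For orientation, the electron-density calculation mentioned above is conventionally a Fourier synthesis over the measured reflections. The expression below is the standard textbook relationship rather than anything specific to this disclosure, and the notation is an assumption, since the text defines no symbols: V is the unit-cell volume, |F(hkl)| and α(hkl) are the structure-factor amplitude and phase of reflection hkl, and (x, y, z) are fractional coordinates in the unit cell.

```latex
\rho(x, y, z) \;=\; \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
  \lvert F(hkl) \rvert \, e^{\, i\alpha(hkl)} \, e^{-2\pi i\,(hx + ky + lz)}
```

The amplitudes come directly from the measured intensities, whereas the phases are not measured; comparing data from crystals with and without heavy atoms is one way of estimating them.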
Polypeptides That are Structurally Homologous or Structurally Equivalent to a HDHD4 Polypeptide of the Present Invention
The HDHD4 structural coordinates set forth herein can be used to aid in obtaining structural information about another crystallized molecule or molecular complex that is structurally homologous to a HDHD4 polypeptide. The present invention allows a determination of at least a portion, if not all, of the three-dimensional structure of a molecule or a molecular complex that contains one or more structural features that are similar to structural features of a HDHD4 polypeptide, as revealed by the structure coordinates provided herein. These molecules are referred to herein as "structurally homologous" to HDHD4. Thus, the present invention also provides HDHD4 polypeptides that are structurally homologous to the polypeptides of SEQ ID NOs:2 and 4, and/or the polypeptides encoded by SEQ ID NOs:1 and 3, and orthologs thereof.
Compounds that are structurally homologous can be formulated to mimic key portions of a HDHD4 structure. Such compounds are structural homologs. The generation of a structurally homologous protein can be achieved by the techniques of modeling and chemical design known to those of skill in the art and described herein. Modeling and chemical design of HDHD4 structural equivalents can be based on the structure coordinates of a crystalline HDHD4 polypeptide of the present invention. It will be understood that all such structurally homologous constructs fall within the scope of the present invention. Structural homologs can include, for example, regions of amino acid identity, conserved active site or binding site motifs, and similarly arranged secondary structural elements (e.g., α-helices and β-sheets). Structural homology can be determined by aligning the residues of two amino acid sequences to optimize the number of identical amino acids along the lengths of their sequences; gaps in either or both sequences are permitted in making the alignment in order to optimize the number of identical amino acids, although the amino acids in each sequence must nonetheless remain in their proper order. In one embodiment, two amino acid sequences are compared using the BLASTP program, version 2.0.9, of the BLAST 2 search algorithm (as described by Tatusova et al., (1999) FEMS Microbiol. Lett. 174:247-50; see also Altschul et al., (1986) Bull. Math. Bio. 48:603-616 and Henikoff & Henikoff, (1992) Proc. Natl. Acad. Sci. U.S.A. 89:10915-10919). The default values for all BLAST 2 search parameters can be used and can include matrix=BLOSUM62; open gap penalty=11, extension gap penalty=1, gap x_dropoff=50, expect=10, wordsize=3, and filter on.
In a comparison of two amino acid sequences using the BLAST search algorithm, structural similarity is referred to as "identity." In one embodiment of the present invention, a structurally homologous molecule comprises a protein that has an amino acid sequence sharing at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% identity with a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO:1) or a His-tagged HDHD4 amino acid sequence (e.g., SEQ ID NO:4, and/or a polypeptide encoded by SEQ ID NO:3). Percent sequence identity is calculated as: (the total number of identical matches) divided by (the length of the longer sequence plus the number of gaps introduced into the longer sequence in order to align the two sequences) x 100%.
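A minimal sketch of a percent identity calculation is given here for illustration, taking the number of identical matches as a fraction of the length of the longer sequence plus the gaps introduced into it. The function name, the use of "-" as the gap symbol, and the choice to count each gap character as one gap are assumptions made for this example; the input is a pair of sequences that have already been aligned (e.g., by BLASTP).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Assumption: every gap character counts as one gap introduced.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    # Ungapped lengths decide which input is the longer sequence.
    len_a = len(aligned_a) - aligned_a.count(gap)
    len_b = len(aligned_b) - aligned_b.count(gap)
    longer = aligned_a if len_a >= len_b else aligned_b

    # Length of the longer sequence plus the gaps introduced into it.
    denominator = max(len_a, len_b) + longer.count(gap)
    if denominator == 0:
        raise ValueError("sequences must contain at least one residue")

    # A column is an identical match when both residues agree and are not gaps.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / denominator


if __name__ == "__main__":
    # Toy peptides, not HDHD4 sequences.
    print(percent_identity("MK-TALLV", "MKQTA--V"))
```

A result at or above a chosen threshold (for example 65) would then correspond to one of the identity levels recited above.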
Structurally homologous proteins and polypeptides are generally defined as having one or more amino acid substitutions, deletions or additions from a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO:1). These changes are preferably of a minor nature and preferably comprise conservative amino acid substitutions (a table of which is provided herein) and other substitutions that do not significantly affect the folding or activity of the protein or polypeptide; small deletions, typically of one to about 30 amino acids; and small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue, a small linker peptide of up to about 20-25 residues, or a small extension that facilitates purification (an affinity tag), such as a poly-histidine tract, protein A (Nilsson et al., (1985) EMBO J. 4:1075; Nilsson et al., (1991) Methods Enzymol. 198:3, incorporated herein by reference), glutathione S transferase (Smith & Johnson, (1988) Gene 67:31, incorporated herein by reference), maltose binding protein (Kellerman & Ferenci, (1982) Methods Enzymol. 90:459-463; Guan et al., (1987) Gene 67:21-30, incorporated herein by reference), or other antigenic epitope or binding domain (see, in general, Ford et al., (1991) Protein Express. Purif. 2:95-107, which is incorporated herein by reference). In another embodiment, a protein that is structurally homologous to HDHD4 comprises at least one contiguous stretch of at least 50 amino acids that shares at least 80% amino acid sequence identity with the analogous portion of a native or recombinant HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO:1). Methods for generating structural information about the structurally homologous molecule or molecular complex are known and include, for example, molecular replacement techniques, as described herein.
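The contiguous-stretch criterion above (at least 50 amino acids sharing at least 80% identity with the analogous portion of the reference) can be illustrated with a sliding window over a pre-aligned pair of sequences. This is only a sketch under stated assumptions: the function name is hypothetical, "-" is assumed to be the gap symbol, and a stretch is taken to mean 50 consecutive residues of the candidate sequence, with positions aligned to a gap in the reference counted as non-identical. The text does not specify these details.

```python
def has_homologous_stretch(
    aligned_ref: str,
    aligned_query: str,
    window: int = 50,
    min_identity_percent: int = 80,
    gap: str = "-",
) -> bool:
    """Return True if any run of `window` consecutive query residues shares at
    least `min_identity_percent` identity with the aligned reference residues."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")

    # One flag per query residue: True where it matches the reference residue.
    flags = [r == q for r, q in zip(aligned_ref, aligned_query) if q != gap]
    if len(flags) < window:
        return False

    # Integer comparison avoids floating-point edge cases at the threshold.
    def passes(count: int) -> bool:
        return count * 100 >= min_identity_percent * window

    count = sum(flags[:window])
    if passes(count):
        return True
    for i in range(window, len(flags)):
        # Slide the window by one residue: add the new flag, drop the oldest.
        count += flags[i] - flags[i - window]
        if passes(count):
            return True
    return False
```

With the default arguments, a 50-residue window containing 45 identical positions passes, while one containing 30 does not.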
In another aspect, the present invention encompasses structural equivalents of HDHD4 polypeptides. Various computational analyses can be used to determine whether a molecule (or a binding pocket portion thereof) is "structurally equivalent," in terms of its three-dimensional structure, to all or part of a HDHD4 polypeptide or its binding pocket(s). Such analyses can be carried out in current software applications, such as the Molecular Similarity application of QUANTA (Molecular Simulations Inc., San Diego, California, USA) version 4.1, and as described in the accompanying User's Guide.
The Molecular Similarity application permits comparisons between different structures, different conformations of the same structure, and different parts of the same structure. The procedure used in Molecular Similarity to compare structures is divided into four steps: (1) load the structures to be compared; (2) define the atom equivalences in these structures; (3) perform a fitting operation; and (4) analyze the results. Each structure is identified by a name. One structure is identified as the target (i.e., the fixed structure); all remaining structures are working structures (i.e., moving structures). Since atom equivalency within QUANTA is defined by user input, for the purpose of this invention equivalent atoms are defined as protein backbone atoms (N, Cα, C, and O) for all conserved residues between the two structures being compared. A conserved residue is defined as a residue that is structurally or functionally equivalent. Only rigid fitting operations are considered.
When a rigid fitting method is used, the working structure is translated and rotated to obtain an optimum fit with the target structure. The fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the moving structure, such that the root mean square difference of the fit over the specified pairs of equivalent atoms is an absolute minimum. This number, given in angstroms, is reported by QUANTA. In the context of the present invention, any set of structure coordinates for HDHD4, including those describing a HDHD4 mutant, etc., or the core domain or the cap domain or a binding pocket portion thereof, taken in conjunction or independently, that have a root mean square deviation (RMSD) from ideal of preferably no more than about 3.0 Å, more preferably no more than about 2.0 Å, even more preferably less than about 1.0 Å, yet more preferably less than about 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, or 0.1 Å, most preferably 0.0 Å, when superimposed on the polypeptide backbone Cα atoms defined by the structural coordinates listed in Table 1 or Table 2 herein are considered identical. In the context of the present invention, any molecule or molecular complex or binding site thereof, or any portion thereof, that has a root mean square deviation of conserved residue backbone atoms (N, Cα, C, O) of less than about 2.0 Å, when superimposed on the relevant backbone atoms described by the reference structure coordinates listed in Table 1 or Table 2 herein and depicted in Figures presented herein, is considered to be "structurally equivalent" to the reference molecule.
Representative structurally equivalent molecules or molecular complexes are those that are defined by the entire set of structure coordinates listed in Table 1 or Table 2, ± a root mean square deviation from the conserved backbone atoms of those amino acids of not more than about 1.5 A. In another embodiment, the root mean square deviation is less than about 1.0 A or less.
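By way of illustration only, the root mean square deviation test described above can be sketched as follows. This is a minimal sketch, not the QUANTA implementation: it assumes that the moving structure has already been rigidly superimposed on the fixed structure and that the two coordinate lists hold paired equivalent backbone atoms in the same order. The function names and the coordinates are hypothetical; the 2.0 Å default is the "structurally equivalent" threshold stated above.

```python
import math

def rmsd(fixed, moving):
    """RMSD over paired equivalent atoms, in the units of the input (e.g. angstroms).

    Assumes the moving structure was already superimposed on the fixed one;
    no translation or rotation is computed here.
    """
    if len(fixed) != len(moving) or not fixed:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(fixed, moving):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(fixed))

def is_structurally_equivalent(fixed, moving, cutoff=2.0):
    """True when the backbone RMSD is below the cutoff (2.0 angstroms by default)."""
    return rmsd(fixed, moving) < cutoff

# Made-up coordinates for three paired backbone atoms (illustrative only).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(reference, candidate))
print(is_structurally_equivalent(reference, candidate))
```

The cutoff argument can be tightened (e.g., to 1.5 or 1.0) to apply the narrower embodiments.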
Functional Equivalents
The present invention also encompasses functional equivalents of HDHD4. A functional equivalent, as used herein, means a polypeptide that has an amino acid sequence that is substantially identical to a HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO:1) and exhibits the same biological activity as these polypeptides (e.g., the ability to transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate), regardless of the polypeptide's sequence length or composition. Thus, fragments of a HDHD4 polypeptide that exhibit such activity are encompassed by the term "functional equivalent" and are exemplary thereof. Continuing, a "functional equivalent" encompasses any compound capable of mediating an effect substantially identical to that mediated by HDHD4. It is further understood that minor modifications of the primary amino acid sequence of a HDHD4 polypeptide might result in proteins that have substantially equivalent or enhanced function as compared to an unmodified HDHD4 polypeptide. Such a minor modification might affect the overall charge, hydrophobicity, etc. of a modified HDHD4 amino acid sequence (e.g., SEQ ID NO:2 and/or a polypeptide encoded by SEQ ID NO:1), while maintaining one or more biological activities of the modified protein compared with the wild-type. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutation in hosts. All of these modifications are included as long as the ability to transfer a phosphate group, or to bind a given ligand, is retained. These types of modifications can be considered to be conservative mutations.
Thus, other sterically similar proteins and peptides can be formulated to mimic the key structural regions of a HDHD4. The generation of a functional equivalent can be achieved by the techniques of modeling and chemical design known to those of skill in the art and described herein. It will be understood that all such sterically similar constructs fall within the scope of the present invention.
Machine-Readable Data Storage Media and Computer Systems of the Present Invention
In one aspect of the present invention, a three-dimensional structure of a HDHD4 has been solved and the corresponding structure coordinates form an aspect of the present invention. The structure coordinates can be used in various applications, such as the design and identification of ligands and modulators of HDHD4, as described herein. In order to employ the provided structure coordinates in this and other capacities, it may be desirable to convert them into a graphical three-dimensional representation of the HDHD4 polypeptide they describe. This can be easily done by employing commercially and freely-available software, in conjunction with a computer that is capable of generating a three-dimensional graphical representation of a molecule, or a portion thereof, from a set of structure coordinates provided on a machine-readable data storage medium. As used herein, "machine-readable media" refers to any media that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media such as floppy discs, hard disc storage medium and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. Further details regarding machine-readable media and systems for displaying data contained on machine-readable media are provided herein.
Thus, in one aspect, the present invention provides a machine-readable data storage medium comprising a data storage material encoded with machine-readable data comprising all or any part of structure coordinates of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is a structure defined by structure coordinates that describe conserved residue backbone atoms having a root mean square deviation of not more than about 2.0 Å from the conserved residue backbone atoms described in Table 1 or Table 2. Generally, then, the present invention provides a machine-readable data storage medium comprising a data storage material encoded with machine-readable data comprising all or any part of a set of structure coordinates of a HDHD4 polypeptide (Table 1 or Table 2).
The machine-readable data storage media of the present invention can be used in a computer. The computer is preferably adapted to produce a three-dimensional representation of a HDHD4 polypeptide, and comprises various components, including the machine-readable storage medium, used to produce the three-dimensional representation.
Thus, the present invention further provides a computer system capable of producing a three-dimensional representation of all or any part of a HDHD4 polypeptide, wherein said computer system comprises: (a) a machine-readable data storage medium comprising a data storage material encoded with machine-readable data comprising all or any part of a set of structure coordinates of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is a structure defined by structure coordinates that describe backbone atoms having a root mean square deviation of not more than about 2.0 Å from the backbone atoms described by the structure coordinates of Table 1 or Table 2; (b) a working memory for storing instructions for processing the machine-readable data; (c) a central-processing unit coupled to the working memory and to the machine-readable data storage medium for processing the machine-readable data into the three-dimensional representation; and (d) a display coupled to the central-processing unit for displaying the three-dimensional representation. In another aspect, the present invention also provides a computer system as described above wherein the machine-readable data comprises all or any part of a set of structure coordinates of a HDHD4 polypeptide.
In the present invention, the structure coordinates are preferably Cartesian coordinates, polar coordinates, or internal coordinates. Most preferably said structure coordinates are Cartesian coordinates. In the context of the present invention, the structure coordinates can be those determined for a HDHD4 polypeptide to which a ligand is bound or to which no ligand is bound. The structure coordinates can be those determined for a HDHD4 polypeptide that is in monomer, dimer, or other form.
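For illustration, Cartesian structure coordinates encoded as machine-readable data can be written and read back as sketched below. The sketch assumes that each atom is stored as a fixed-column, PDB-style ATOM record; that storage format is an assumption of this example and is not stated in this section for Table 1 or Table 2. The function names and the coordinate values are hypothetical.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Write one atom as a PDB-style ATOM record (columns 1-54 only)."""
    # Columns: 1-6 record name, 7-11 serial, 13-16 atom name, 18-20 residue name,
    # 22 chain, 23-26 residue number, 31-54 x, y, z (8.3 format each).
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z)

def parse_atom(line):
    """Read the atom name, residue, chain and Cartesian coordinates back."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

# Made-up coordinates for an alpha-carbon of residue 12 (illustrative only).
record = format_atom(1, "CA", "ASP", "A", 12, 11.104, -3.25, 2.1)
atom = parse_atom(record)
print(record)
print(atom["res_seq"], atom["xyz"])
```

Fixed columns are used rather than whitespace splitting because adjacent coordinate fields can run together when values are large or negative.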
A number of programs can be used to process the machine-readable data of this invention. Such programs are discussed in reference to the computational methods of drug discovery as described herein. Specific references to components of the hardware system are included as appropriate throughout the following description of the data storage medium.
Representative Applications of the Present Invention
The present invention, which comprises, in part, the structure coordinates of Table 1 or Table 2, has broad-based utility and can be employed in many applications. Representative applications include modulator design, mutant design and screening operations. These and other applications are described herein.
Modulator Design and Identification
The HDHD4 structure coordinates of the present invention facilitate structure-based or rational drug design and virtual screening to design or identify potential ligands and/or modulators of a HDHD4 polypeptide. The structural features of the ligand binding site of a HDHD4 polypeptide, as described by the structure coordinates herein, provide insights into the HDHD4 binding site that, prior to the present invention, were unknown and could not be effectively modeled. An understanding of these features facilitates structure-based modulator design and virtual screening at a level of efficiency unattainable prior to the present invention.
In a rational modulator design approach, a three dimensional model of a HDHD4 polypeptide can be used to identify structural and chemical features that might be involved in binding of ligands to a binding site of a HDHD4 polypeptide. Identified structural or chemical features can then be employed to design ligands or modulators of a HDHD4 polypeptide or identify test molecules as ligands or modulators of a HDHD4 polypeptide.
Those of ordinary skill in the art can employ one of several methods to screen chemical entities or fragments for their ability to associate with a HDHD4 polypeptide, or a structurally similar polypeptide, and in embodiments comprising the individual binding site(s) of a HDHD4 polypeptide. This process can begin by visual inspection of, for example, the active site on the computer screen based on the structural coordinates provided herein in Table 1 or Table 2 or the structural coordinates of a model generated using the structural coordinates of Table 1 or Table 2. Selected candidate modulators, which can be fragments or complete chemical entities, can then be positioned in a variety of orientations, or docked, with a HDHD4 polypeptide (for example in a binding site) as described hereinabove. Docking can be accomplished using software such as QUANTA, SYBYL, Flo, DOCK, GOLD or FLEXX, and followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields, such as CHARMM and AMBER.
Once a candidate modulator has been designed or selected, the efficiency with which that candidate modulator associates ("docks") with a HDHD4 polypeptide can be tested and optimized by computational evaluation. For example, a compound that has been designed or selected to function as an inhibitor should spatially fit into a binding site when it is associated with a HDHD4 polypeptide.
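One simple computational check of spatial fit is a steric clash count, sketched below under stated assumptions. The function name, the 2.0 Å minimum-distance cutoff and all coordinates are hypothetical choices made for illustration; none of the docking programs named herein is asserted to use this exact test.

```python
import math

def count_clashes(ligand_atoms, protein_atoms, min_distance=2.0):
    """Count ligand atoms lying closer than min_distance to any protein atom.

    Both arguments are lists of (x, y, z) tuples in the same coordinate frame.
    The 2.0 angstrom default is an illustrative assumption.
    """
    clashes = 0
    for lig in ligand_atoms:
        if any(math.dist(lig, prot) < min_distance for prot in protein_atoms):
            clashes += 1
    return clashes

# Made-up coordinates (illustrative only).
site_atoms = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
posed_ligand = [(1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
print(count_clashes(posed_ligand, site_atoms))
```

A pose with a clash count of zero is not thereby shown to bind; the count only screens out orientations that cannot fit spatially.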
Docking can be performed manually or using a variety of software, including but not limited to, DOCK (Kuntz et al. (1994) Acc. Chem. Res. 27:117; Gschwend & Kuntz (1996) J. Comp. Aided Mol. Des. 10:123; Kuntz (1982) J. Mol. Biol. 161:269-288), GOLD (Cambridge Crystallographic Data Center, Cambridge, UK), Flo (Thistlesoft, Colebrook, Connecticut), QUANTA (Accelrys, San Diego, California), SYBYL (Tripos, St. Louis, Missouri) or FLEXX (Tripos, St. Louis, Missouri). A docking operation can involve analyzing structural and chemical feature complementarity between a structure (e.g., a HDHD4 polypeptide) and a candidate modulator. Such an analysis can include (a) quantifying features of atomic components found within a ligand molecule and protein molecule (e.g., charge, size, shape, polarizability, hydrophobicity, etc.), and (b) quantifying interactions between such features in the ligand molecule, the protein molecule and the protein/ligand complex, as determined using any number of approaches known in the art (e.g., molecular mechanics, force fields and/or quantum mechanics). Analyzing structural and chemical feature complementarity can, for example, be performed visually or by scoring functions based on computed ligand-site interactions as implemented in DOCK, GOLD, Flo, or COMBIFLEXX (Tripos, St. Louis, Missouri).
In a docking operation, a three-dimensional structure comprising all or any part of a HDHD4 polypeptide, as disclosed herein, is provided. Following modulator design, as described above, using appropriate software, a candidate modulator (i.e., potential ligand or potential modulator) can then be associated (i.e., docked) with the three-dimensional structure. For example, a candidate modulator can be docked into a binding site of a HDHD4 polypeptide (e.g., a ligand binding site comprising D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65, according to the structural coordinates of Table 1 or Table 2), i.e., a docking operation can be performed in silico between a candidate modulator and a HDHD4 polypeptide. Such a test molecule can be designed based on HDHD4 binding site features disclosed herein. After docking, the test molecule can be analyzed for structural and chemical feature complementarity with all or any part of a HDHD4. Structural and chemical features include, but are not limited to, any one of the following: van der Waals interactions, hydrogen bonding interactions, charge interaction, hydrophobic interactions, and dipole interactions.
A docking operation can be performed as part of a modulator design process or it can be performed to learn more about how a given ligand associates or might associate with a given structure. Thus, in one aspect, the present invention also provides a method of docking a ligand, modulator or candidate modulator with a structure. In one embodiment, the method comprises positioning a candidate modulator into a binding site, or any part of a binding site, of a HDHD4 polypeptide, wherein the binding site is described by the structure coordinates of Table 1 or Table 2. The method can further comprise analyzing structural and chemical feature complementarity of the candidate modulator with all or any part of a binding site of a HDHD4 polypeptide.
As described herein, a three-dimensional structure disclosed herein or a three-dimensional model created using methods known in the art, including, but not limited to, using software such as INSIGHT II (Accelrys, Inc., San Diego, CA), SYBYL (Tripos Associates, St. Louis, Missouri), and Flo (Thistlesoft, Colebrook, Connecticut), and the coordinates disclosed herein in Table 1 or Table 2 can be employed in a docking operation as a step of modulator design.
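The binding site residue list recited above can be held as data and used to select the corresponding atoms from a coordinate set, as in the following minimal sketch. Two of the labels are read here as I18 and W100, which is an assumption of this example about how those two entries are meant to be read. The function names and the atom tuples are hypothetical.

```python
import re

# Residue labels as recited above: one-letter amino acid code plus residue number.
BINDING_SITE = [
    "D12", "L13", "D14", "N15", "I18", "T20", "A21", "G22", "A23", "S24",
    "R25", "M28", "Q53", "V54", "L56", "S57", "K58", "E59", "R72", "W100",
    "R104", "M108", "T131", "N132", "G133", "D134", "T137", "Q138", "K141",
    "E163", "K164", "D189", "T190", "T193", "D194",
]
OPTIONAL_SITE = ["C60", "F61", "H62", "P63", "Y64", "N65"]

def parse_label(label):
    """Split a label such as 'W100' into ('W', 100)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError("unrecognized residue label: %r" % label)
    return match.group(1), int(match.group(2))

def site_residue_numbers(include_optional=False):
    """Return the set of residue numbers that make up the binding site."""
    labels = BINDING_SITE + (OPTIONAL_SITE if include_optional else [])
    return {parse_label(label)[1] for label in labels}

def select_site_atoms(atoms, include_optional=False):
    """Keep only atoms whose residue number belongs to the binding site.

    Each atom is a (residue_number, atom_name, (x, y, z)) tuple.
    """
    wanted = site_residue_numbers(include_optional)
    return [atom for atom in atoms if atom[0] in wanted]

# Made-up atoms (illustrative only): residue 12 is in the site, residue 40 is not.
atoms = [(12, "CA", (1.0, 2.0, 3.0)), (40, "CA", (9.0, 9.0, 9.0)),
         (60, "CA", (4.0, 5.0, 6.0))]
print(select_site_atoms(atoms))
print(select_site_atoms(atoms, include_optional=True))
```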
Computer software programs can be employed to assist in the process of selecting a candidate modulator. Representative computer software programs include, but are not limited to:
1. CAVEAT (Bartlett et al. (1989) in Molecular Recognition in Chemical and Biological Problems, Special Pub. Royal Chem. Soc. 78:182-196; Lauri & Bartlett (1994) J. Comp. Aided Mol. Design 8:51-66);
2. GRID (Goodford (1985) J. Med. Chem. 28:849-857);
3. MCSS (Miranker & Karplus (1991) Proteins 11:29-34);
4. AUTODOCK (Morris et al. (1996) J. Comp. Aided Mol. Design 10:293-304; Goodsell & Olson (1990) Proteins 8:195-202);
5. DOCK (Kuntz et al. (1994) Acc. Chem. Res. 27:117; Gschwend & Kuntz (1996) J. Comp. Aided Mol. Des. 10:123; Kuntz (1982) J. Mol. Biol. 161:269-288); and
6. HOMOLOGY (Accelrys, San Diego, California, USA).
Once suitable chemical entities or fragments have been selected, they can be assembled into a single compound or inhibitor. Assembly can proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of, for example, a HDHD4 polypeptide in accordance with Table 1 or Table 2 or a model built using the disclosed structure coordinates of a HDHD4 polypeptide. This inspection can be followed by manual model building using software suitable for this purpose, such as QUANTA (Accelrys, San Diego, California), SYBYL (Tripos, St. Louis, Missouri), LOOK/GENEMINE (Celera, Rockville, Maryland), HOMOLOGY (Accelrys, San Diego, California), or INSIGHT II (Accelrys, San Diego, California). Useful programs to aid one of skill in the art in connecting the individual chemical entities or fragments include:
1. CAVEAT (Bartlett et al. (1989) in Molecular Recognition in Chemical and Biological Problems, Special Pub. Royal Chem. Soc. 78:182-196; Lauri & Bartlett (1994) J. Comp. Aided Mol. Design 8:51-66);
2. MODELLER 7v0 (Marti-Renom et al. (2000) Ann. Rev. Biophys. Biomol. Struct. 29:291-325);
3. INSIGHT II (Accelrys, San Diego, California); and
4. WHAT IF (Vriend (1990) J. Mol. Graph. 8:52-56).
Instead of proceeding to build a modulator in a step-wise fashion one fragment or chemical entity at a time as described, a modulator can be designed as a whole or de novo using either an empty active site or optionally including some portion(s) of a known modulator(s). Software that can be employed in a de novo design effort includes:
1. LUDI (Bohm (1992) J. Comp. Aid. Mol. Design 6:61-78; Accelrys, San Diego, California);
2. LEGEND (Nishibata & Itai (1991) Tetrahedron 47:8985; Accelrys, San Diego, California); and
3. LEAPFROG (Tripos, St. Louis, Missouri).
The above discussion of useful software is only representative and other molecular modeling techniques can also be employed in accordance with this invention, as will be apparent to those of ordinary skill in the art upon consideration of the present disclosure.
An effective modulator preferably exhibits a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding). Therefore, an efficient modulator preferably exhibits a deformation energy of binding of not greater than about 10 kcal/mole, preferably, not greater than about 7 kcal/mole. Computer software is available in the art to evaluate compound deformation energy and electrostatic interaction. Examples of programs designed for such uses include:
1. GAUSSIAN 98 (Gaussian, Pittsburgh, Pennsylvania, USA);
2. AMBER v7 (available from University of California San Francisco);
3. GAMESS (available from Iowa State University);
4. QUANTA/CHARMM (Accelrys, San Diego, California, USA); and
5. INSIGHT II (Accelrys, San Diego, California, USA).
These software packages can be implemented on a computer system as described herein (e.g., a Silicon Graphics FUEL or OCTANE 2 workstation).
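The deformation energy criterion stated above reduces to a simple comparison once the two energies are available. The following minimal sketch assumes that the energy of the ligand in its bound conformation and the energy of its free state have already been computed, in kcal/mole, by a package such as those listed; the function names and the energy values are hypothetical.

```python
def deformation_energy(e_bound, e_free):
    """Energy difference between the bound and free states of a ligand (kcal/mole)."""
    return e_bound - e_free

def passes_deformation_filter(e_bound, e_free, limit=10.0):
    """True when the deformation energy does not exceed the limit.

    The default of 10 kcal/mole follows the text; 7 kcal/mole is the preferred bound.
    """
    return deformation_energy(e_bound, e_free) <= limit

# Made-up energies (illustrative only).
print(deformation_energy(-42.0, -48.5))
print(passes_deformation_filter(-42.0, -48.5, limit=7.0))
print(passes_deformation_filter(-40.0, -52.0))
```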
The above referenced software packages can be employed to perform various energy calculations with respect to a given modulator-polypeptide system. An energy analysis can take into account non-complementary (e.g., electrostatic) interactions including repulsive charge-charge, dipole-dipole and charge-dipole interactions.
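As a hedged illustration of the charge-charge term of such an energy analysis, the sketch below evaluates a pairwise Coulomb energy. It is not taken from any of the named packages. It assumes point charges in units of the elementary charge, distances in angstroms, a uniform dielectric constant, and a conversion factor of approximately 332.06 kcal·Å/(mole·e²); the function names and the example atoms are hypothetical.

```python
import math

# Approximate conversion factor giving kcal/mole for charges in e and distances in angstroms.
COULOMB_CONSTANT = 332.06

def coulomb_energy(q1, q2, distance, dielectric=1.0):
    """Coulomb energy of two point charges; positive values are repulsive."""
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)

def electrostatic_energy(ligand, protein, dielectric=1.0):
    """Sum the Coulomb energy over all ligand-protein atom pairs.

    Each atom is a (charge, (x, y, z)) tuple.
    """
    total = 0.0
    for q_l, pos_l in ligand:
        for q_p, pos_p in protein:
            total += coulomb_energy(q_l, q_p, math.dist(pos_l, pos_p), dielectric)
    return total

# Made-up charges and positions (illustrative only).
ligand = [(-1.0, (0.0, 0.0, 0.0))]
protein = [(1.0, (4.0, 0.0, 0.0)), (-1.0, (0.0, 8.0, 0.0))]
print(coulomb_energy(1.0, -1.0, 4.0))
print(electrostatic_energy(ligand, protein))
```

In this example the like-charged pair contributes a positive (repulsive) term that partly offsets the attraction of the oppositely charged pair.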
Thus, in one embodiment, the present invention provides a method of designing a modulator of a HDHD4 polypeptide comprising: (a) modeling all or any part of a HDHD4 polypeptide binding site; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of the HDHD4 binding site; wherein the HDHD4 polypeptide binding site is defined by the structure coordinates of Table 1 or Table 2.
After a candidate modulator has been designed, the candidate modulator can then be synthesized and tested for modulation ability in a suitable assay. A candidate modulator can be designed manually without the aid of computer software, either de novo or by employing a portion of a known ligand as a starting point. Alternatively, a candidate modulator can be designed employing computer software either de novo or employing a portion of a known or suspected ligand as a starting point, as described herein. When computer software is employed, such software can comprise or access a database from which candidate modulators (or discrete chemical elements) are chosen (e.g., CAVEAT), based on an evaluation of the model. The candidate modulator can be designed to fit spatially into all or any part of a HDHD4 polypeptide binding site, and a ligand binding site can be described generally by the structure coordinates of Table 1 or Table 2 and more specifically by the structure coordinates of amino acids comprising D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65.
The method can further comprise: (c) docking the designed candidate modulator into all or any part of the HDHD4 polypeptide binding site; and (d) analyzing the structural and/or chemical feature complementarity of the candidate modulator with all or any part of the HDHD4 polypeptide binding site. Additional description of docking and docking operations is provided herein. The method can also comprise analyzing structural and chemical feature complementarity of a second chemical entity with all or any part of a HDHD4 polypeptide, such as when the modeling operation grows a ligand in place. The analysis can be computational and take into account energy considerations, surface charges, hydrophobicity, etc., or it can be simply a visual inspection.
In a related embodiment, the present invention provides a method of designing a modulator of a HDHD4 polypeptide comprising: (a) designing a potential modulator of a HDHD4 polypeptide that will make interactions with amino acids in a ligand binding site of the HDHD4 polypeptide, based upon a crystalline structure comprising a HDHD4 polypeptide in complex with a ligand; (b) synthesizing the modulator; and (c) determining whether the potential modulator modulates the activity of the HDHD4 polypeptide, whereby a modulator of a HDHD4 polypeptide is designed. The crystalline structure can be analyzed as described herein and the determining can be carried out by employing an assay as described herein.
The present invention also provides a method of designing a modulator of a target polypeptide that is structurally similar to a HDHD4 polypeptide, comprising: (a) modeling all or any part of a HDHD4 polypeptide; and (b) based on the modeling, designing a candidate modulator that has structural and chemical feature complementarity with all or any part of a HDHD4 polypeptide binding site; wherein the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2. Due to the structural similarity between the HDHD4 polypeptide and the target polypeptide, a modulator designed to associate with a HDHD4 polypeptide would be expected to associate with the target polypeptide, since both polypeptides are similar in size, composition, shape, etc. The candidate modulator can be designed to fit spatially into all or any part of a HDHD4 binding site. As in the case of designing a modulator of a HDHD4 polypeptide, a candidate modulator can be designed manually without the aid of computer software, either de novo or by employing a portion of a known ligand as a starting point. Alternatively, a candidate modulator can be designed employing computer software either de novo or employing a portion of a known ligand as a starting point. When computer software is employed, such software can comprise or access a database from which candidate modulators (or discrete chemical elements) are chosen (e.g., CAVEAT), based on an evaluation of the model.
The method can further comprise: (c) docking the chemical entity into all or any part of the HDHD4 binding site; and (d) analyzing the structural and chemical feature complementarity of the candidate modulator with all or any part of a HDHD4 polypeptide, such as a binding site. Additional description of docking and docking operations is provided herein. The method can also comprise analyzing structural and chemical feature complementarity of a second chemical entity with all or any part of a HDHD4 polypeptide, such as when the modeling operation grows a ligand in place. The analysis can be computational and take into account energy considerations, surface charges, hydrophobicity, etc., or it can be simply a visual inspection.
The structure of a HDHD4 polypeptide as provided herein might be similar in structure to other proteins. Modulators that lack specificity for a given protein might adversely affect other proteins. Thus, it is desirable to be able to employ a modulator that is specific for a given protein, regardless of structural similarity. Using the structural coordinates of the present invention, such a selective modulator can be designed. Thus, in one aspect, the present invention provides a method of designing a modulator that selectively modulates the activity of a HDHD4 polypeptide to the exclusion of other proteins comprising: (a) evaluating a three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand; and (b) synthesizing a potential modulator based on the three-dimensional structure of the crystallized HDHD4 polypeptide in complex with a ligand. Methods of evaluating a three-dimensional structure are provided herein and synthetic pathways for a potential modulator will depend on the composition of the modulator itself.
The structure coordinates of the present invention can also be employed in the refinement of an existing HDHD4 polypeptide modulator. By refining the structure of an existing modulator, desirable properties of the modulator can be enhanced. In one aspect, therefore, the present invention also provides a method of increasing the efficiency of a modulator of a HDHD4 polypeptide comprising: (a) providing a first ligand having a known effect on the biological activity of a HDHD4 polypeptide; (b) modifying the first ligand based on an evaluation of a three-dimensional structure of a HDHD4 polypeptide to form a modified ligand; (c) synthesizing the modified ligand; and (d) determining an effect of the modified ligand on a HDHD4 polypeptide, wherein the efficiency of a modulator of a HDHD4 polypeptide is increased if the modified ligand favorably alters a biological activity of a HDHD4 polypeptide with respect to the biological activity of the first ligand.
Identification of Structural and/or Chemical Features of a HDHD4 Polypeptide or Structurally Similar Polypeptide
Various structural and/or chemical features of all or any part of a HDHD4 polypeptide can be identified using a three-dimensional representation (e.g., a HDHD4 crystal structure or a generated model) of all or any part of a HDHD4 polypeptide. For example, amino acids that are suspected to be involved in an association with a modulator or an amino acid sequence, for example, residues comprising a binding site, etc. can be identified. Such an identification can be carried out by techniques known in the art and described herein, such as by employing software suitable for that purpose as disclosed herein (e.g., DOCK, GOLD, Flo or LEAPFROG). Thus, an aspect of the present invention is a method of identifying structural and/or chemical features of all or any part of a HDHD4 polypeptide. In an embodiment of the method, the HDHD4 polypeptide is described by the structure coordinates according to Table 1 or Table 2. In other embodiments the structural and/or chemical features of a HDHD4 polypeptide binding site (e.g., amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65) are identified.
In a related aspect, the present invention also provides a method of identifying structural features of a HDHD4 polypeptide that can be employed in the design of a modulator that selectively modulates the activity of a HDHD4 polypeptide to the exclusion of other structurally similar but non-identical proteins.
In one embodiment, the method comprises providing a three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a ligand and a three-dimensional test structure comprising a structurally similar but non-identical protein. The HDHD4 polypeptide structure can comprise the coordinates of Table 1 or Table 2, for example. As noted herein, a HDHD4 polypeptide structure need not be exactly described by (e.g., identical to) the coordinates of Table 1 or Table 2, since HDHD4 functional equivalents are also encompassed by the present invention.
Next, the backbone residues of the HDHD4 structure are overlaid onto the test structure. This operation can be carried out manually, for example by fixing the position of one structure (e.g., the test structure(s)) and visually orienting the other structure (e.g., HDHD4) relative to the fixed structure. Alternatively, computer software, such as INSIGHT II, can be employed to perform the overlap consistent with user-selected criteria. Structural features of the HDHD4 that do not overlap the test structure to a desired degree are then identified. The identifying can comprise, for example, a visual inspection of the overlapped structures or a quantitative comparison. Additionally, the identifying can comprise one or more computational evaluations of the overlapped structures, which can be performed by employing commercially available computer software known to those of ordinary skill in the art. Such an evaluation can comprise, for example, an energy analysis, surface analysis, or charge analysis of one or both structures.
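A quantitative comparison of the overlaid structures can be as simple as a per-residue distance check, sketched below. The sketch assumes that the overlay has already been performed and that residues of the test structure have been mapped onto HDHD4 residue numbering; both are assumptions of this example. The function name, the 2.0 Å cutoff and the coordinates are hypothetical.

```python
import math

def poorly_overlapping_residues(hdhd4_ca, test_ca, cutoff=2.0):
    """Return HDHD4 residue numbers that do not overlap the test structure.

    Both arguments map a residue number to an alpha-carbon (x, y, z) position
    after overlay. A residue is reported when it has no counterpart in the
    test structure or lies farther than the cutoff from its counterpart.
    """
    flagged = []
    for res_num, position in sorted(hdhd4_ca.items()):
        partner = test_ca.get(res_num)
        if partner is None or math.dist(position, partner) > cutoff:
            flagged.append(res_num)
    return flagged

# Made-up alpha-carbon positions (illustrative only).
hdhd4_ca = {12: (0.0, 0.0, 0.0), 13: (3.8, 0.0, 0.0), 14: (7.6, 0.0, 0.0)}
test_ca = {12: (0.5, 0.0, 0.0), 13: (3.8, 3.0, 0.0)}
print(poorly_overlapping_residues(hdhd4_ca, test_ca))
```

Residues reported by such a check are candidates for the structural features that distinguish HDHD4 from the test structure.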
The method can be employed alone or in conjunction with other methods described herein. For example, the method can be employed as a precursor to modulator design. In this role, the method can be employed to enhance the specificity of a modulator for HDHD4, or, in other embodiments, even for a protein other than HDHD4.
In silico Screening Operations
The use of the structure coordinates of the present invention in structure-based drug design described above can begin with an initial identification of possible compounds for interaction with a target molecule, such as a polypeptide (e.g., a HDHD4 polypeptide). Sometimes suitable compounds are known in the art. However, when they are not, or when novel compounds are wanted, a first stage of a modulator design process can comprise computer-based in silico screening of compound databases (such as the Cambridge Structural Database) in order to identify a compound predicted to interact with a target molecule.
Various screening selection criteria can be employed and can account for pharmacokinetic properties such as metabolic stability and toxicity. However, the structure coordinates provided herein, which include coordinates describing a HDHD4 binding site, allow a set of selection criteria for a potential modulator to be identified.
Virtual screening methods, i.e., methods of evaluating the potential of chemical entities to bind to a given protein or portion of a protein, are known in the art. These methods often employ databases as sources of candidate modulators and often are employed in designing modulators. Often these methods begin by visual inspection of a binding site of a target polypeptide on the computer screen. Selected candidate modulators can then be placed, i.e., docked, in one or more positions and orientations within the binding site and chemical and structural feature complementarity can be analyzed.
In virtual screening, molecular docking followed by energy minimization and molecular dynamics with standard molecular mechanics forcefields such as CHARMM and MMFF can be performed as described herein. Examples of computer programs which can assist in the selection of chemical entities useful in the present invention include, but are not limited to, GRID (Goodford, 1985), AUTODOCK (Goodsell, 1990), and DOCK (Kuntz et al. (1994) Ace. Chem. Res. 27: 117; Gschwend & Kuntz. (1996) J Comp. Aided MoL Des. 10: 123; Kuntz. (1982) J. MoL Biol. 161 :269-288). Databases of chemical entities that may be used include, but are not limited to, ACD (Molecular Designs Limited, San Leandro, California), Aldrich (Aldrich Chemical Company), NCI (National Cancer Institute), Maybridge (Maybridge Chemical Company Ltd), CCDC (Cambridge Crystallographic Data Center, Cambridge, UK), CAST (Chemical Abstract Service) and Derwent (Derwent Information Limited). For example, programs such as DOCK (Kuntz et al. , (1994) Ace. Chem. Res.
27: 117; Gschwend & Kuntz. (1996) J Comp. Aided MoI. Des. 10: 123; Kuntz. (1982) J. MoI. Biol. 161:269-288) can be used with the structure coordinates of HDHD4 disclosed herein to identify chemical entities from databases or virtual databases of small molecules. These molecules may therefore be suitable candidates for synthesis and testing. A virtual screening approach can include, but is not limited to, the following steps:
1. Selecting a candidate modulator from a database or elsewhere and positioning the candidate modulator in one or more orientations within all or any part of a binding site of a target molecule, the conserved backbone residues of the binding site having a root mean square deviation of not more than about 3.0 Å from the structure coordinates of the HDHD4 amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 according to Table 1 or Table 2;
2. Characterizing structural and chemical features of the candidate modulator and binding site, such as van der Waals interactions, hydrogen bonding interactions, charge interaction, hydrophobic bonding interaction, and dipole interactions;
3. Optionally, selecting from a database or elsewhere a second candidate modulator adapted to join with or replace the docked candidate modulator and fit spatially into all or any part of a HDHD4 binding site comprising amino acid residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65;
4. Evaluating the docked candidate modulator using one or more scoring schemes which account for van der Waals interactions, hydrogen bonding interactions, charge interaction and hydrophobic interactions, i.e., evaluation of structural and chemical feature complementarity.
Upon selection of one or more preferred chemical entities, their relationship to each other and to a HDHD4 polypeptide can be visualized and then assembled into a single candidate modulator. Programs useful in assembling the individual chemical entities include, but are not limited to, SYBYL (Tripos, St. Louis, Missouri, USA), LEAPFROG (Tripos, St. Louis, Missouri, USA), LUDI (Bohm, (1992) J. Comp. Aid. Mol. Design 6:61-78, Accelrys, San Diego, California) and 3D Database systems (see, e.g., Martin, (1992) J. Med. Chem. 35(12):2145-2154), as discussed herein.
Thus the present invention provides a method for evaluating the potential of a chemical entity to bind to all or any part of a HDHD4 polypeptide or a structurally similar molecule comprising: (a) docking a candidate modulator into all or any part of a HDHD4 polypeptide described by the structure coordinates of Table 1 or Table 2; and (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 polypeptide. It might be desirable to employ a HDHD4 polypeptide binding site comprising amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64 and N65, in the docking of step (a). Analogously, binding residues of a HDHD4 polypeptide can be employed in the method. The candidate modulator can be selected from a database. The method can further comprise a step in which a second candidate modulator is joined to the first candidate modulator that was docked and analyzed, and the resultant candidate modulator is docked and analyzed.
Candidate modulators designed or identified using the methods described herein can then be synthesized and screened in a HDHD4 binding assay, or in an assay designed to test functional activity. Examples of assays useful in screening of potential ligands or modulators include, but are not limited to, screening in silico, in vitro assays and high throughput assays. Similarly and further to the method for evaluating the potential of a chemical entity to associate with a HDHD4 polypeptide, candidate modulators can be screened, using computational means and biological assays, to identify ligands and modulators of a HDHD4 polypeptide.
Thus, the invention provides a method for identifying a modulator of a HDHD4 polypeptide. In one embodiment the method comprises the following steps, which are preferably, but not necessarily, performed in the order given: (a) docking a candidate modulator into all or any part of a HDHD4 polypeptide binding site, wherein the HDHD4 polypeptide binding site is described by the structure coordinates of Table 1 or Table 2; (b) analyzing structural and chemical feature complementarity between the candidate modulator and all or any part of the HDHD4 polypeptide binding site; (c) synthesizing the candidate modulator; and (d) screening the candidate modulator in a biological assay for the ability to modulate a HDHD4 polypeptide. A candidate modulator is identified as a modulator of HDHD4 if the structural and chemical feature complementarity and the modulation exceed a desired level. A compound that stimulates or inhibits a measured activity in a cellular assay by greater than 10% is identified as a preferred modulator. The method can further comprise the following step: (e) screening the candidate modulator in an assay that characterizes binding to a HDHD4 polypeptide.
In the present methods, a modulator of a HDHD4 polypeptide can induce one or more of the following activities of HDHD4 presented in this non-exhaustive list: (a) a HDHD4 modulator can transfer a phosphate group, for example, using substrates such as 2,3-diphosphoglycerate or N-acetylneuraminate-9-phosphate.
The term "all or any part of a HDHD4 polypeptide" preferably relates to enough of a HDHD4 polypeptide binding site so as to be useful in docking or modeling a ligand into the binding site, although it is not necessary to employ a complete HDHD4 polypeptide. Preferably, a HDHD4 polypeptide binding site comprises the following residues: D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2. For purposes of the present disclosure, however, "all or any part of a HDHD4 polypeptide" can also relate to structural elements not found in a binding site.
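As a minimal sketch of how the binding-site residues named above might be pulled out of a coordinate table, the following Python fragment assumes that the coordinates are available as PDB-style fixed-column ATOM records. That file layout, the helper names (parse_label, format_atom, select_site_atoms) and the example coordinates are assumptions made for illustration and are not taken from Table 1 or Table 2. Only a short subset of the listed residues is used.

```python
import re

# Illustrative subset of the binding-site residues named in the text
# (one-letter code followed by the residue number in SEQ ID NO:2).
SITE_LABELS = ["D12", "L13", "D14", "N15"]

# One-letter to three-letter names for the residues in the subset above.
THREE_LETTER = {"D": "ASP", "L": "LEU", "N": "ASN"}


def parse_label(label):
    """Split a label such as 'D12' into ('ASP', 12)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None or match.group(1) not in THREE_LETTER:
        raise ValueError(f"unsupported residue label: {label!r}")
    return THREE_LETTER[match.group(1)], int(match.group(2))


def format_atom(serial, name, res_name, res_num, x, y, z):
    """Hypothetical helper: build one PDB-style ATOM record for the demo."""
    return "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, res_num, x, y, z)


def select_site_atoms(lines, labels):
    """Return (atom, residue name, residue number, x, y, z) tuples for
    ATOM records whose residue matches one of the labels."""
    wanted = {parse_label(label) for label in labels}
    selected = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()
        res_num = int(line[22:26])
        if (res_name, res_num) in wanted:
            xyz = tuple(float(line[i:i + 8]) for i in (30, 38, 46))
            selected.append((line[12:16].strip(), res_name, res_num) + xyz)
    return selected


# Made-up coordinates, for illustration only.
demo = [
    format_atom(1, "CA", "ASP", 12, 1.0, 2.0, 3.0),
    format_atom(2, "CA", "GLY", 16, 4.0, 5.0, 6.0),
]
print(select_site_atoms(demo, SITE_LABELS))
```

Only the first demo record is selected, because residue 16 is not in the label list. A fuller version would extend THREE_LETTER to all twenty residue types and pass the complete residue list.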
Generating a Homology Model
In the context of the present invention, including the generation of a homology model based on the structure coordinates disclosed herein, those of ordinary skill in the art will understand that a set of structure coordinates for a protein (e.g., a HDHD4 polypeptide) or part of a protein (e.g., a HDHD4 polypeptide binding site) is a relative set of points that define a shape in three dimensions. For one or more reasons, including those that follow, structure coordinates that define two identical or almost identical shapes can vary slightly. If variations are within an acceptable standard error as compared to the original coordinates, the resulting three-dimensional shape is considered to be equivalent. Thus, for example, a ligand that is bound to the structure defined by the structure coordinates of the HDHD4 according to Table 1 or Table 2 would also be expected to bind to a site having a shape that fell within the acceptable error. Such sites with structures falling within an acceptable standard error are also within the scope of this invention.
Homology models are useful when there is no experimental information available on the three-dimensional structure of the protein of interest. A three-dimensional model can be constructed on the basis of the known structure of a homologous protein (see, e.g., Greer, (1991) Methods Enzymol. 202:239-52; Greer, (1990) Proteins 7(4):317-34; Cardozo et al., (1995) Proteins 23(3):403-14; Sali, (1995) Curr. Opin. Biotechnol. 6(4):437-51; Birkholtz et al., (2003) Proteins 50(3):464-73). Those of ordinary skill in the art will understand that a homology model can be constructed by first identifying a protein (e.g., a HDHD4 polypeptide) or part of a protein (e.g., a HDHD4 polypeptide binding site) of known structure which is similar to the protein or part of the protein without known structure. Next, an alignment is performed and can be accomplished using such programs as the MODELLER module found in INSIGHT II (Accelrys, Inc., San Diego, California, USA), WHAT IF (Rodriguez et al., (1998) CABIOS 14:523-528), or 3D-JIGSAW (Bates et al., (2001) Proteins Supp. 5:39-46). After generating the alignment, the alignment can be analyzed, secondary structure weighting operations can be performed and gap deletions and additions can be made. Finally, a structure can be calculated and refined as desired. Thus, a method of constructing a homology model consistent with the present invention comprises: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is described in whole or in part by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4 polypeptide; and (d) generating a structure of the target protein based on the analysis. This and related methods and processes are described more fully herein below.
In another embodiment, the structure of a target protein can be determined using the structure coordinates of a HDHD4 polypeptide as a starting point. Thus, a method of determining the structure of a target protein for which little or no structural information is known forms an aspect of the present invention. The method can comprise: (a) providing an amino acid sequence for a target protein for which a structure is not known; (b) aligning the target protein with all or a part of the structure of a HDHD4 polypeptide, wherein the HDHD4 polypeptide is described in whole or in part by the structure coordinates of Table 1 or Table 2; (c) analyzing the alignment of the target protein with all or a part of the HDHD4 polypeptide; (d) generating a structure of the target protein based on the analysis; and (e) analyzing the generated structure to determine the structure of a target protein for which a structure is not known. This and related methods and processes are described more fully herein below.
Various computational analyses can be employed to determine whether a molecule or a portion thereof is sufficiently similar to all or a part of a template (e.g., a molecule of known structure, such as a HDHD4 polypeptide binding site, which is described by the structure coordinates of Table 1 or Table 2) to be considered equivalent. Such analyses can be carried out in software applications, such as INSIGHT II (Accelrys Inc., San Diego, California, USA) as described in the User's Guide, or software applications available in the SYBYL software suite (Tripos, St. Louis, Missouri, USA). In one embodiment, only rigid fitting operations are considered. When a rigid fitting method is used, the template structure is translated and rotated to obtain an optimum fit with the target structure.
The fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the template structure, such that the root mean square difference of the fit over the specified pairs of equivalent atoms is an absolute minimum. This number, given in angstroms (Å), is reported by INSIGHT II.
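The rigid fitting described above can be sketched in Python as follows. This is only one standard closed-form route to the minimum root mean square difference (centroid alignment followed by Horn's quaternion eigenvalue formulation, with a small Jacobi eigenvalue routine); it is an assumption made for illustration and is not stated to be the algorithm used by INSIGHT II. The function names and the example coordinates are hypothetical.

```python
import math


def centered(points):
    """Shift points so that their centroid is at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [tuple(p[k] - c[k] for k in range(3)) for p in points]


def largest_eigenvalue(matrix, sweeps=30):
    """Largest eigenvalue of a small symmetric matrix (cyclic Jacobi)."""
    a = [list(row) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) element.
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * a[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))


def fitted_rmsd(template, target):
    """RMSD over paired atoms after the optimal rigid superposition."""
    if len(template) != len(target) or not template:
        raise ValueError("need two equally sized, non-empty atom lists")
    a, b = centered(template), centered(target)
    n = len(a)
    # Cross-covariance sums s[i][j] = sum over atom pairs of a_i * b_j.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue corresponds
    # to the best proper rotation.
    horn = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = largest_eigenvalue(horn)
    ga = sum(x * x for p in a for x in p)
    gb = sum(x * x for q in b for x in q)
    return math.sqrt(max(0.0, (ga + gb - 2.0 * lam) / n))


# Made-up example: the target is the template rotated 90 degrees about z
# and then translated, so the fitted RMSD should be essentially zero.
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
target = [(5.0, -2.0, 1.0), (5.0, -1.0, 1.0), (3.0, -2.0, 1.0), (5.0, -2.0, 4.0)]
print(fitted_rmsd(template, target))
```

The result is in the same units as the input coordinates. In practice the paired atoms would be the conserved backbone atoms of the residues being compared.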
Three-dimensional coordinates give the location of the centers of all atoms in a protein molecule and are typically expressed as Cartesian coordinates (e.g., distances in three directions, each perpendicular to the other), or polar coordinates (e.g., sets of angle/distance pairs from a universal origin), or internal coordinates (e.g., sets of angle/distance pairs from one atom center to the next). Thus, it is possible that an entirely different set of coordinates could define an identical or similar shape, depending on which coordinate system is used. All such equivalent coordinates describing the HDHD4 coordinates presented in Table 1 or Table 2 are encompassed by the present invention.
Slight variations in the individual coordinates, which can arise from the generation of similar homology models using different alignment templates and/or different methods of generating the homology model, can have minor effects on the overall shape; however, such models can still be encompassed by the present invention.
Variations in coordinates can also be generated due to mathematical manipulations of the structure coordinates. For example, the HDHD4 structure coordinates set forth in Table 1 or Table 2 could be manipulated by fractionalization of the structure coordinates, integer additions or subtractions to sets of the structure coordinates, inversion of the structure coordinates or any combination of the above.
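The point that such manipulations leave the described shape unchanged can be illustrated with a short Python sketch. It compares the full set of interatomic distances before and after an integer offset and an inversion. The function names and coordinates are hypothetical, and the distance-only comparison is a simplifying assumption: it treats a structure and its mirror image as the same shape, so it does not detect the change of handedness that an inversion introduces.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All interatomic distances, which do not depend on the choice of frame."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def add_offset(coords, offset):
    """Add the same (dx, dy, dz) to every atom."""
    return [tuple(c + o for c, o in zip(atom, offset)) for atom in coords]


def invert(coords):
    """Invert every coordinate through the origin."""
    return [tuple(-c for c in atom) for atom in coords]


def same_shape(coords_a, coords_b, tol=1e-9):
    """True if both coordinate sets have matching interatomic distances."""
    da, db = pairwise_distances(coords_a), pairwise_distances(coords_b)
    return len(da) == len(db) and all(abs(x - y) <= tol for x, y in zip(da, db))


# Made-up coordinates, for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.5)]
print(same_shape(atoms, add_offset(atoms, (10, -3, 7))))
print(same_shape(atoms, invert(atoms)))
```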
The structure coordinates of an actual X-ray structure of a protein (e.g., a HDHD4 polypeptide) would be expected to have some variation from the homology model of that very same protein. For example, the location of sidechains might vary to some extent.
Variations in structure coordinates can be due to mutations, additions, substitutions, and/or deletions of amino acids of a protein being studied. Variations in structure coordinates can also be due to variations in proteins whose shape is being described by the structure coordinates given. For example, rigid fitting operations conducted between a HDHD4 polypeptide and a closely-related protein known to have similar structure and function can yield root mean square deviations (RMSD) in a conserved residue backbone atom comparison. These RMSDs could be greater if other variation factors described above were present in the calculations. Proteins from non-human species may also have slight variations in shape from that of the HDHD4 defined by the structure coordinates of Table 1 or Table 2.
Following the generation of an alignment of two or more sequences, an analysis can be carried out involving one or more mathematical constructs. Representative mathematical constructs include, but are not limited to: energy calculations for a given geometry of a molecule utilizing forcefields or ab initio methods known in the art; energy minimization using gradients of the energy calculated as atoms are shifted so as to produce a lower energy; conformational searching, i.e., locating local energy minima; molecular dynamics wherein a molecular system (single molecule or ligand/protein complex) is propagated forward through increments of time according to Newtonian mechanics using techniques known to the art; calculations of molecular properties such as electrostatic fields, hydrophobicity and lipophilicity; calculation of solvent-accessible or other molecular surfaces and rendition of the molecular properties on those surfaces; comparison of molecules using either atom-atom correspondences or other criteria such as surfaces and properties; quantitative structure-activity relationships in which molecular features or properties dependent upon them are correlated with activity or bio-assay data. Following an analysis of one or more of the above concepts, the computer system on which a modeling operation is being carried out then generates the structural details of one or more regions in which a potential ligand binds (e.g., a HDHD4 polypeptide binding site) so that complementary structural and chemical features of the potential ligands can be determined. Design in these modeling systems is generally based upon the compound being capable of structurally and chemically associating with the protein, i.e., having structural and chemical feature complementarity. In addition, the compound must be able to assume a conformation that allows it to associate with the protein. 
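Of the mathematical constructs listed above, energy minimization using gradients is the simplest to sketch. The Python fragment below moves a single interatomic separation downhill on a Lennard-Jones pair potential by steepest descent. The potential form, the reduced units (epsilon = sigma = 1), the step size and the function names are all assumptions chosen for illustration; this is not one of the forcefields named in this document and is far simpler than a minimization of a full protein/ligand system.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy at separation r (reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative of the pair energy with respect to r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def minimize_separation(r, step=0.01, tol=1e-10, max_iter=10000):
    """Steepest descent: shift r against the gradient until it vanishes."""
    for _ in range(max_iter):
        gradient = lj_gradient(r)
        if abs(gradient) < tol:
            break
        r -= step * gradient
    return r


r_min = minimize_separation(1.5)
print(r_min, lj_energy(r_min))
```

For this potential the analytical minimum lies at a separation of 2^(1/6) sigma with an energy of -epsilon, which gives a simple check on the result.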
Some modeling and design systems estimate the potential inhibitory or binding effect of a potential modulator prior to actual synthesis and testing. Using modeling, compounds may be designed de novo using an empty binding site. Alternatively, compounds may be designed including some portion of a known ligand, i.e., grown in place. The known ligand may have been determined through virtual screening. Programs for design include, but are not limited to, LUDI (Bohm, (1992) J. Comp. Aid. Mol. Design 6:61-78, Accelrys, San Diego, California, USA), LEAPFROG (Tripos Associates, St. Louis, Missouri, USA) and DOCK (Kuntz et al., (1994) Acc. Chem. Res. 27:117; Gschwend & Kuntz, (1996) J. Comp. Aided Mol. Des. 10:123; Kuntz, (1982) J. Mol. Biol. 161:269-288).
After a structure is generated, additional operations can be carried out and the generated structure refined. This refinement step can be dependent on the nature and results of any analysis carried out as a component of the alignment process. For example, if energy considerations are not taken into account during the alignment process a generated structure might benefit from further refinement. Conversely, if an alignment process is extensive in its treatment, subsequent refinement of the structure might not be necessary or might be only minimal in scope.
Thus, the present invention provides for the formation of a homology model comprising all or any part (e.g., a binding site) of a HDHD4 polypeptide. In one embodiment, the HDHD4 polypeptide is described by the structure coordinates of Table 1 or Table 2.
A model of a HDHD4 polypeptide of the present invention can be any type of art-recognized model, including, but not limited to, three-dimensional models and steric/electrostatic field definition models that can be used to study/compute the putative interactions ligands might undergo. A three-dimensional model can be produced through use of structure coordinates, and can be represented in any of a variety of forms, such as ribbon diagrams or wireframe models.
Designing a Mutant HDHD4 Polypeptide
As used herein, the term "mutation" includes one or more amino acid deletions, insertions, inversions, repeats, or substitutions as compared to a native protein (e.g., a HDHD4 polypeptide). Various methods of making mutations are known to one of ordinary skill in the art. A mutant can have the same, similar, or altered biological activity as compared to the native protein. A HDHD4 polypeptide mutant can have at least 25% sequence identity, at least about 50% sequence identity, at least about 75% sequence identity, or preferably at least about 95%, 96%, 97%, 98%, or 99% sequence identity to a wild-type HDHD4 polypeptide (e.g., SEQ ID NO:2 encoded by SEQ ID NO:1). The structural coordinates of the present invention can be employed in the design of a mutant HDHD4 polypeptide or fragment thereof. The structural coordinates describe, in one aspect, various structural features of a HDHD4 polypeptide. Those of ordinary skill in the art can employ this understanding of the HDHD4 structure to select one or more amino acid residues for mutation. The rationale for selecting a residue can be based on a steric, chemical or other consideration. Thus, the present invention provides for the generation of HDHD4 mutants, and the ability to solve the crystal structures of those that crystallize. Further, desirable sites for mutation can be identified, based on analysis of the three-dimensional HDHD4 structural coordinates provided herein.
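The sequence identity thresholds mentioned above presuppose some way of counting identical positions. A minimal Python sketch follows, assuming the two sequences have already been aligned and that identity is counted over columns in which neither sequence has a gap. That denominator is one common convention among several and is an assumption here, as are the function name and the short example sequences, which are not taken from SEQ ID NO:2.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Made-up aligned fragments, for illustration only.
print(percent_identity("DLDNIT", "DLDNVT"))
print(percent_identity("DL-N", "DLDN"))
```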
In one aspect, the present invention provides a method of designing a mutant comprising making one or more amino acid mutations in a HDHD4 polypeptide. The mutant so designed can comprise a complete HDHD4 polypeptide or a portion thereof, such as a ligand binding site. In some embodiments, a mutant comprises an addition, a deletion or a substitution of one or more of the amino acids of a HDHD4 polypeptide binding site. One embodiment of a method of designing a mutation comprises: (a) selecting a property of a HDHD4 polypeptide to be investigated; (b) providing a three-dimensional structure of a HDHD4 polypeptide; and (c) evaluating the structure to identify a residue known or suspected to relate to the selected property. The steps of the method can be repeated a desired number of times.
Initially a property of a HDHD4 polypeptide to be investigated is selected. Example properties include ligand binding, overall or local charge, overall or local hydrophobicity, folding, overall or local secondary or tertiary structure, elimination or formation of an epitope, or catalysis. Other properties can also be investigated and a combination of properties can be investigated with a single mutation. Next, a three-dimensional structure of a HDHD4 is provided. The three-dimensional structure can be described by all or a part of the structure coordinates of Table 1 or Table 2. The HDHD4 can comprise all or a part of the amino acid sequence of SEQ ID NO:2. The structure is then evaluated to identify a residue known or suspected to relate to the selected property. The evaluating can be of any form and can be dependent on the nature of the property being investigated. The evaluating can start with the substitution (or the addition or deletion) of one or more residues for one or more HDHD4 polypeptide residues. After the substitution(s) is performed (for example by employing software used to display the three-dimensional structure), a visual inspection of the three-dimensional structure as it is displayed on a computer screen can be performed. In some cases, the effect of a given mutation on the structure and/or property of a HDHD4 polypeptide can be determined by visual inspection. Alternatively, the evaluating can comprise one or more calculations to determine the effect of a given substitution. For instance, an energy minimization operation can be performed to energy minimize a mutant HDHD4 polypeptide structure. Further, calculations can be performed that can quantitatively assess the effect of a given mutation on the charge, hydrophobicity, etc., either locally or globally. The overall energy of the structure can also be calculated. After performing the method steps, the effect of a mutation can be determined.
Following the determination, if the mutation yields a desired result (e.g., an effect on a property of HDHD4 that is being investigated), the mutant can be synthesized and subjected to further analysis (e.g., ligand binding assays, activation assays, etc., as described herein). If a mutation does not yield a desired result, the steps of the method can be repeated a desired number of times.
When a HDHD4 polypeptide binding site is mutated, such a mutation can be in a ligand binding site or in the area of a ligand binding site. Thus, a mutation can comprise a residue selected, for example, from the group consisting of D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2 in a HDHD4 polypeptide, or a residue that is spatially near these residues (which can be determined from an inspection of the structure coordinates of Table 1 or Table 2). The method can comprise using all or part of a model of a HDHD4 polypeptide to visualize all or part of a HDHD4 polypeptide in its mutated or native form. Preferably the model is a three-dimensional model. In the context of the present invention, when it is desired to introduce a mutation into a HDHD4 polypeptide amino acid sequence (e.g., a mutation designed using structure coordinates of the present invention, such as by a method disclosed herein), the mutation can be introduced by any method known to those of skill in the art, including site-directed mutagenesis of DNA encoding a HDHD4 polypeptide. A mutation can be introduced, for example, by employing common DNA amplification methods using primers to introduce and amplify alterations in the DNA template, such as PCR methods that employ primers comprising a desired mutation.
Non-naturally occurring variants (e.g., mutants) can be produced using known mutagenesis techniques, including, but not limited to, oligonucleotide mediated mutagenesis, alanine scanning, PCR mutagenesis, site directed mutagenesis (see, e.g., Carter et al., (1986) Nucl. Acids Res. 13:4331; and Zoller et al., (1982) Nucl. Acids Res. 10:6487), cassette mutagenesis (see, e.g., Wells et al., (1985) Gene 34:315), and restriction selection mutagenesis (see, e.g., Wells et al., (1986) Philos. Tr. R. Soc. A 317:415). In another example, multiple amino acid substitutions can be made and tested using known methods of mutagenesis and screening, such as those disclosed by Reidhaar-Olson and Sauer (Reidhaar-Olson & Sauer, (1988) Science 241:53-57) or Bowie and Sauer (Bowie & Sauer, (1989) Proc. Natl. Acad. Sci. U.S.A. 86:2152-2156). Briefly, these references disclose methods for simultaneously randomizing two or more positions in a polypeptide, selecting for functional polypeptide, and then sequencing the mutagenized polypeptides to determine the spectrum of allowable substitutions at each position. Other methods that can be used include phage display (e.g., Lowman et al., (1991) Biochem. 30:10832-10837; U.S. Patent No. 5,223,409; PCT Publication WO 92/06204) and region-directed mutagenesis (Derbyshire et al., (1986) Gene 46:145; Ner et al., (1988) DNA 7:127).
Often site-directed mutagenesis techniques employ a phage vector that has single- and double-stranded forms, such as M13 phage vectors. Other suitable vectors comprising a single-stranded phage origin of replication can also be employed in a site-directed mutagenesis protocol (see, e.g., Veira et al., (1987) Meth. Enzymol. 15:3).
A mutant designed by a method of the present invention that has the same or similar biological activity as the native HDHD4 polypeptide or a native portion thereof can be useful for any purpose for which the native is useful. A mutant designed by a method of the present invention that has altered biological activity from that of the native can be useful in binding assays to test the ability of a potential ligand to bind to or associate with a HDHD4 polypeptide. A mutant designed by a method of the present invention that has an altered biological activity from the native can be useful in further elucidating the biological role and mechanism of action of HDHD4.
Thus, the present invention provides a mutant HDHD4 polypeptide, or a mutant portion thereof, comprising one or more amino acid mutations, additions or deletions in a wild-type HDHD4 polypeptide. A mutant portion of a HDHD4 polypeptide can comprise a mutant binding site, such as that described herein.
In representative mutants, a mutation comprises five or fewer substitutions, deletions or insertions, four or fewer substitutions, deletions or insertions, three or fewer substitutions, deletions or insertions, two or fewer substitutions, deletions or insertions, or one substitution, deletion or insertion. A substitution can be a conservative amino acid substitution, a discussion of which is provided herein, although non-conservative substitutions, deletions and additions can also be performed and form aspects of the present invention.
HDHD4 polypeptide derivatives, analogs and mutants, as described herein, can be made by altering encoding nucleic acid sequences by substitutions (e.g., replacing a given residue with another residue), additions or deletions; such alterations can provide for functionally equivalent or specifically modified HDHD4 polypeptides.
Due to the degeneracy of nucleotide coding sequences, other DNA sequences that encode substantially the same amino acid sequence as a nucleic acid encoding a modified HDHD4 polypeptide, or a fragment thereof, of the present invention can be used in the practice of the present invention. These include but are not limited to allelic genes, homologous genes from other species, which are altered by the substitution of different codons that encode the same amino acid residue within the sequence, thus producing a silent change. Likewise, a modified HDHD4 polypeptide derivative of the present invention can include, but is not limited to, derivatives containing, as a primary amino acid sequence, all or part of the amino acid sequence of a HDHD4 polypeptide, including altered sequences in which functionally equivalent amino acid residues are substituted for residues within the sequence resulting in a conservative amino acid substitution. For example, one or more amino acid residues within the sequence can be substituted by another amino acid of a similar polarity, hydrophobicity, charge, etc. which acts as a functional equivalent, resulting in a silent alteration. Substitutes for an amino acid within the sequence may be selected from other members of the class to which the amino acid belongs. For example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine. The positively charged (basic) amino acids include arginine, lysine and histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic acid.
Generally, initial substitutions are conservative, i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original group.
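The residue classes listed above lend themselves to a simple lookup. The Python sketch below encodes exactly those five classes with one-letter codes and treats a substitution as conservative when the original and replacement residues share at least one class. That decision rule is a simplifying assumption made for illustration (it ignores size and shape, for example), and the names CLASSES, classes_of and is_conservative are hypothetical.

```python
# Residue classes as listed in the text, written with one-letter codes.
CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "aromatic": set("FWY"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def classes_of(residue):
    """Names of every class that contains the residue."""
    return {name for name, members in CLASSES.items() if residue in members}


def is_conservative(original, replacement):
    """Assumed rule: conservative if the residues share at least one class."""
    return bool(classes_of(original) & classes_of(replacement))


print(is_conservative("D", "E"))  # both acidic
print(is_conservative("K", "D"))  # basic versus acidic
print(classes_of("Y"))            # tyrosine appears in two classes
```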
It is understood that amino acids and structural elements known in the art to alter conformation should be avoided, unless such an alteration is desired. Such substituted chemical compounds can then be analyzed for efficiency of fit to a HDHD4 binding site using one or more of the computer-based approaches described in detail herein.
Non-conserved amino acid substitutions can also be introduced to impart a preferred property to a protein. For example, a Cys can be introduced to provide a potential site for disulfide bridges with another Cys. A His can be introduced as a particular "catalytic" site (i.e., His can act as an acid or base and is a common amino acid in biochemical catalysis). Pro can be introduced which, because of its particularly planar structure, induces β-turns in protein structure.
Included within the scope of the term "mutant" are chimeric and fusion proteins. Such chimeras or fusion proteins can include, for example, a secretion signal or an additional heterologous functional region. For instance, a region of additional amino acids, particularly charged amino acids, can be added to the N-terminus of the polypeptide to improve stability and persistence in the host cell, during purification, or during subsequent handling and storage. Also, peptide moieties can be added to the polypeptide to facilitate purification. Such regions may be removed prior to final preparation of the polypeptide. The addition of peptide moieties to polypeptides to engender secretion or excretion, to improve stability and to facilitate purification, among others, are familiar and routine techniques in the art. One common example of a fusion protein comprises a heterologous region from immunoglobulin that is useful to solubilize proteins.
Mutagenesis methods as disclosed herein can be combined with high-throughput, automated screening methods to detect activity of cloned, mutagenized polypeptides in host cells. Mutagenized DNA molecules that encode active polypeptides (e.g., cell proliferation) can be recovered from the host cells and rapidly sequenced using modern equipment. These methods allow the rapid determination of the importance of individual amino acid residues in a polypeptide of interest, and can be applied to polypeptides of unknown structure.
Solving Three-dimensional Structures of Structurally Similar Proteins
All or a part of the structure coordinates of a HDHD4, as provided in Table 1 or Table 2 of the present invention, can be employed in the solution of other crystal forms of HDHD4 and crystalline forms of other proteins, such as those having some degree of structural similarity with a HDHD4 co-complex (e.g., a complex comprising a ligand or modulator).
One method that can be employed for the purpose of solving additional HDHD4 crystal structures is molecular replacement (see generally, The Molecular Replacement Method (Rossmann, ed.), Gordon & Breach, New York, New York (1972)). The general approach of molecular replacement is to employ a known structure (e.g., a HDHD4 structure of the present invention) as a template from which an unknown structure can be derived. Broadly, in a molecular replacement solution, structural elements common to certain domains (which can relate to certain primary structure motifs) are employed to align the unknown sequence with structural elements of the known structure. Phases can then be calculated from this model and combined with the observed X-ray diffraction pattern amplitudes to generate an electron density map of the structure whose coordinates are unknown. This, in turn, can be subjected to well-known model building and structure refinement techniques to provide a final, accurate structure of the unknown crystallized molecule or molecular complex. Software useful for carrying out a molecular replacement solution includes AMoRe (Navaza & Saludjian (1997) Methods Enzymol. 276A: 581-94). Thus, the structure coordinates of the present invention can be employed in determining the three-dimensional structure of a protein for which a structure is not known, or in determining the three-dimensional structure of regions of a protein for which only a partial structure is available.
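The step of combining model phases with observed amplitudes can be illustrated with a deliberately simplified sketch. The following is a one-dimensional toy, not the AMoRe procedure: the atom positions, scattering weights, grid size and function names are all hypothetical, and the template is assumed to be already correctly placed in the cell (in practice, rotation and translation searches are needed first).

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for a 1-D cell, h = -h_max..h_max."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def density_map(amplitudes, phases, n_grid):
    """rho(x_k) = sum_h |F(h)| * exp(i*phi(h)) * exp(-2*pi*i*h*x_k), x_k = k/n_grid."""
    rho = []
    for k in range(n_grid):
        x = k / n_grid
        total = sum(
            amplitudes[h] * cmath.exp(1j * phases[h]) * cmath.exp(-2j * math.pi * h * x)
            for h in amplitudes
        )
        rho.append(total.real)
    return rho

# Hypothetical toy data: (fractional coordinate, scattering weight).
unknown = [(0.20, 6.0), (0.55, 8.0), (0.80, 7.0)]  # crystal to be solved (not known in practice)
template = [(0.20, 6.0), (0.55, 8.0)]              # known model, assumed already placed

h_max = 15
# Only amplitudes are measured in a diffraction experiment; phases are lost.
observed_amplitudes = {h: abs(F) for h, F in structure_factors(unknown, h_max).items()}
# Phases are borrowed from the template model.
model_phases = {h: cmath.phase(F) for h, F in structure_factors(template, h_max).items()}

rho = density_map(observed_amplitudes, model_phases, n_grid=100)
```

The resulting map is biased toward the template because the phases come from it; this is why the text follows the map calculation with model building and refinement.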
Modulating a HDHD4 Polypeptide
Modulators designed using a structure of the present invention can be used to modulate HDHD4 activity. Thus, the present invention provides a method of modulating a HDHD4 polypeptide comprising administering a modulator of a HDHD4 polypeptide in an amount sufficient to modulate a HDHD4 polypeptide, wherein the modulator of the HDHD4 polypeptide is a ligand known or suspected to bind to a HDHD4 polypeptide or was identified by a method comprising: (i) docking a test molecule into all or any part of a HDHD4 binding site; (ii) analyzing the structural and chemical feature complementarity between the test molecule and all or any part of the HDHD4; and (iii) screening the test molecule in a biological assay of modulation of the HDHD4. A test molecule is identified as a modulator of a HDHD4 polypeptide if the structural and chemical feature complementarity and the modulation exceed a desired level. The method can further comprise the step of screening the test molecule in an assay that characterizes binding to a HDHD4 polypeptide. In embodiments of the method, the binding site can be described, for example, by the structure coordinates of amino acids D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, and optionally C60, F61, H62, P63, Y64, and N65 of SEQ ID NO:2, according to Table 1 or Table 2.
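As an illustration of how a binding site described by residue labels might be extracted from a coordinate file, the sketch below parses one-letter labels such as "D12" and filters PDB-style ATOM records by residue name and number. It assumes the coordinates are available in standard fixed-column PDB format; the helper names, the example coordinates and the subset of residues used are illustrative only.

```python
import re

ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

def parse_residue_labels(text):
    """Turn 'D12, L13, ...' into a set of (three-letter name, residue number)."""
    return {
        (ONE_TO_THREE[aa], int(num))
        for aa, num in re.findall(r"\b([A-Z])(\d+)\b", text)
    }

def select_site_atoms(pdb_lines, site):
    """Keep ATOM records whose residue name and number are in the site."""
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20]       # PDB columns 18-20
        res_seq = int(line[22:26])   # PDB columns 23-26
        if (res_name, res_seq) in site:
            selected.append(line)
    return selected

def make_atom_line(serial, atom, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: format one PDB-style ATOM record for the demo."""
    return (f"ATOM  {serial:5d} {atom:<4s} {res_name:3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# A subset of the residues listed above; coordinates are made up.
site = parse_residue_labels("D12, L13, D14, N15, T20, K58")
demo_lines = [
    make_atom_line(1, " CA", "ASP", "A", 12, 1.0, 2.0, 3.0),
    make_atom_line(2, " CA", "GLY", "A", 16, 4.0, 5.0, 6.0),
    make_atom_line(3, " CA", "LYS", "A", 58, 7.0, 8.0, 9.0),
]
site_atoms = select_site_atoms(demo_lines, site)
```

Matching on both residue name and number guards against selecting the wrong residue when a coordinate file uses a different numbering scheme from SEQ ID NO:2.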
The methods of the present invention can be practiced in vitro or in vivo. When practiced in vitro, the methods can employ any number of art-recognized in vitro systems. In vivo methods include, but are not limited to, any of the ways described in the section on methods of treatment.
Examples
The following Examples have been included to illustrate representative modes of the present invention. These Examples are exemplified through the use of standard laboratory practices of the inventors. The following Examples are intended to be exemplary only and numerous changes, modifications and alterations can be employed without departing from the spirit and scope of the invention.
Example 1
HDHD4 Cloning & Expression
An expression vector was obtained containing and expressing the gene for full length HDHD4, with the addition of a Thrombin-cleavable C-terminal hexahistidine tag, and two extra amino acids (G and S) on the N-terminus.
NMR structural data was used to design a truncated protein of HDHD4 for crystallization trials. The Multi Site-Directed Mutagenesis kit (Stratagene, La Jolla, CA) was used to perform deletion mutagenesis to remove 21 base pairs (seven amino acids) from the 5' end (N-terminus) and 27 base pairs (nine amino acids) from the 3' end (C-terminus) of this starting construct. The resulting expression vector, referred to as "VG-10", thus expresses HDHD4(R7-C242) with an N-terminal Methionine (start codon) and a C-terminal Thrombin cleavable hexahistidine tag. The DNA sequence of the coding (translated) region of the "VG-10" expression construct is shown in Figure 7, with sequences coding the non-HDHD4 residues underlined. The protein sequence that was expressed using the "VG-10" construct is shown in Figure 8, again with non-HDHD4 residues underlined.
Example 2
HDHD4 Expression
Growth of selenomethionine-labeled protein: Transformed E. coli BL21(DE3) (Novagen, Madison, WI) cells were propagated in 200 mL of minimal media supplemented with 100 mg/L L-methionine and 100 mg/L L-cysteine overnight at 37°C. The culture was spun down and the cells were suspended in 0.5 L of minimal media supplemented with 100 mg/L of selenomethionine and 100 mg/L of L-cysteine. The culture was induced at OD600 = 1.0 with 0.3 mM IPTG and 0.5 mM NaF. After induction the temperature of the culture was lowered to 28°C and the cells were harvested after 19 hours.
Growth of unlabeled protein: Transformed E. coli BL21(DE3) (Novagen, Madison, WI) cells were propagated in minimal media overnight at 37°C. The culture was induced at OD600 = 1.0 with 0.3 mM IPTG and 0.5 mM NaF. After induction the temperature of the culture was lowered to 28°C and the cells were harvested after 5 hours.
Minimal media was made by combining 10.5 g K2HPO4 and 0.5 g NaCl. H2O was added and the pH adjusted to 7.2 with H3PO4. 1.0 g (NH4)2SO4, 5.0 g glucose, 2 mL 1 M MgSO4, 10 mL 100X Stock B and 30 mg/L of kanamycin were added. H2O was added to a final volume of 1 liter. The media was filter sterilized and 1 mL of Stock C was then added to the sterilized media.
Stock B: To prepare 100 mL of 100X, the following was weighed out: 20 mg CaCl2, 20 mg ZnSO4.7H2O, 20 mg MnSO4, 500 mg thiamine, 500 mg niacin, 6 mg biotin, 10 mg choline chloride, 10 mg pantothenic acid, 10 mg pyridoxine, 10 mg folic acid, 10 mg p-aminobenzoic acid (PABA), 200 μl of 0.1 mM vitamin B12. The components were mixed together and then filter sterilized. The solution was then stored at 4°C. 10 mL/L of media was used.
Stock C: To prepare 50 mL of 1000X, the following was weighed out: 540 mg FeCl3.6H2O, 35 mg Na2MoO4.2H2O, 40 mg CuSO4.5H2O, 10 mg H3BO3. The components were mixed together and heated to dissolve. The hot solution was filter sterilized through a 0.22 micron filter. Some components will precipitate after cooling; this is normal. The solution was stored at room temperature. The contents of the solution were swirled prior to taking 1 mL for 1 liter of media. 1 mL of 1000X stock per liter was then added.
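The stock volumes quoted above follow from the usual dilution relation C1·V1 = C2·V2. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the 0.5 L case simply applies the same ratio to the culture volume used for the selenomethionine growth.

```python
def stock_volume_ml(stock_fold, final_volume_ml, final_fold=1.0):
    """Volume of concentrated stock needed, from C1*V1 = C2*V2."""
    return final_fold * final_volume_ml / stock_fold

# 100X Stock B and 1000X Stock C diluted to 1X in 1 L of minimal media.
stock_b_per_liter = stock_volume_ml(100, 1000)    # 10 mL, as stated in the text
stock_c_per_liter = stock_volume_ml(1000, 1000)   # 1 mL, as stated in the text

# Same ratios scaled to a 0.5 L culture.
stock_b_half_liter = stock_volume_ml(100, 500)
stock_c_half_liter = stock_volume_ml(1000, 500)

print(stock_b_per_liter, stock_c_per_liter, stock_b_half_liter, stock_c_half_liter)
```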
Protein Purification
The harvested cells were resuspended in 100 mL of 25 mM Tris-HCl, pH 7.5, 50 mM NaCl, 2 mM dithiothreitol (DTT), 1 mM ethylene-bis(oxyethylenenitrilo)tetraacetic acid (EGTA), 0.5 mM NaF, 100 mg/L protamine sulfate and 1 mL of protease inhibitor cocktail (Sigma, St. Louis, MO). After sonication and clarification at 15,000 rpm for 20 min (Sorvall, SS34), the supernatant was applied onto 30 mL of nickel-charged affinity column (His-Select, Sigma, St. Louis, MO) and, after thoroughly washing the resin, the protein was eluted with 0.25 M imidazole. Fractions containing HDHD4 were cleaved with human thrombin (10 U/1 mg) at room temperature for 1 hour, concentrated in a filtering device with a 10,000 dalton (Da) MWCO membrane (Millipore Corporation, Bedford, MA) and applied onto Superdex 75 (26/60) (Pharmacia, Piscataway, NJ) equilibrated in 50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 5 mM DTT, 0.5 mM NaF. Peak fractions (3 mL/tube) were passed through 3 mL of SP Sepharose (Pharmacia, Piscataway, NJ) resin (5 mL) and then through 3 mL of Q Sepharose (Pharmacia, Piscataway, NJ) resin (5 mL). The protein was concentrated to 20 mg/mL using a filtering device with a 10,000 Da MWCO membrane (Millipore Corporation, Bedford, MA) and exchanged into the final buffer: 25 mM Tris-HCl, pH 7.5, 50 mM NaCl, 5 mM DTT, 0.5 mM NaF. All concentration steps were carried out in a cooled table-top centrifuge. Typical yields were 100 mg/L of growth media. The protein could be used immediately for crystallization trials or stored at -80°C with 10% v/v glycerol.
Example 3
HDHD4 Protein Manipulation and Co-crystallization
Initial crystallization screens were run on Fluidigm (San Francisco, CA, USA) microfluidic chips with sulfur methionyl (S-Met) protein. The crystallization conditions were successfully translated to drop volumes above 1 μl and then applied to selenomethionyl protein (Se-Met). The selenomethionyl protein stock solution consisted of 7 mg/mL (0.26 mM based on the calculated MW of 27,132 Da) HDHD4 in 50 mM NaCl, 5 mM DTT, and 0.5 mM NaF buffered by 25 mM Tris-HCl, pH 7.5. Aliquots of the protein solution were incubated with Mg2+ (MgCl2) and VO4 3- (Na3VO4) at approximate molar excesses of 10 (3.0 mM) and 5 (1.5 mM), respectively, for 1 hour at room temperature. Crystallization trials were prepared by the hanging drop vapor diffusion method. The reservoir solution consisted of 0.8-1.8 M Na/K phosphate, pH 5.6. Drops were formed from 1 μl of the protein solution and 1 μl of the reservoir solution (total initial volume of 2 μl), mixed, and left to equilibrate at room temperature. Crystals appeared as clustered/stacked plates from central nucleation sites (Figure 1) within one week. Single crystals were removed from these clusters and prepared for collection at 100 K. The cryo-solution consisted of 25% v/v ethylene glycol added to the reservoir solution. The diffraction patterns from crystals of this complex gave unit-cell parameters of a = 46.4 Å, b = 53.5 Å, c = 64.0 Å, α = 65.5°, β = 75.0°, and γ = 85.2°. The symmetry was consistent with space group P1. Diffraction images from these crystals frequently indexed as the [1 0 0 0 -1 0 0 -1 -1] transform (a = 46.4 Å, b = 53.5 Å, c = 63.9 Å, α = 114.5°, β = 100.4°, and γ = 95.0°). Based on this unit cell and space group, the asymmetric unit (which also equals the unit cell for space group P1) was determined to contain two independent HDHD4 monomers (51% solvent fraction). The structure coordinates for this crystal can be found in Table 1.
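The molar concentration quoted for the selenomethionyl stock can be checked from the mass concentration and the calculated molecular weight. The sketch below is only that arithmetic; the function name is hypothetical. By this calculation the additive-to-protein ratios come out somewhat above the round figures of 10 and 5, which the text describes as approximate.

```python
def millimolar(mg_per_ml, mw_da):
    """mg/mL equals g/L; dividing by g/mol gives mol/L; multiply by 1000 for mM."""
    return mg_per_ml / mw_da * 1000.0

protein_mM = millimolar(7.0, 27132.0)   # selenomethionyl HDHD4 stock
mg_ratio = 3.0 / protein_mM             # Mg2+ at 3.0 mM relative to protein
vanadate_ratio = 1.5 / protein_mM       # vanadate at 1.5 mM relative to protein

print(f"protein: {protein_mM:.3f} mM")
print(f"Mg2+ : protein = {mg_ratio:.1f}, vanadate : protein = {vanadate_ratio:.1f}")
```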
15N-HDHD4 Wild-Type Protein Manipulation and Co-crystallization
Initial crystallization screens were run on an Innovadyne ScreenMaker 96+8 (Santa Rosa, CA, USA) with HDHD4 Wild-Type protein. Trials were plated on Neurprobe hanging drop trays using Hampton Research and Fluidigm crystallization screening solutions. Initial crystals were observed and conditions were successfully optimized for harvesting and data collection. The HDHD4 protein stock solution consisted of 12.0 mg/mL HDHD4 (0.419 mM based on the calculated MW of 28,625 Da), 1 mM TCEP, 32.0 mM NANA, 4.27 mM MgCl2, and 2.14 mM vanadate buffered by 10 mM HEPES, pH 7.5. Crystallization trials were prepared by the hanging drop vapor diffusion method. The reservoir solution consisted of 0.5 M potassium formate, 20% w/v PEG 1500, 0.1 M glycyl-glycine pH 8.5, 0.01% n-dodecyl β-D-maltoside. Drops were formed from 1 μl of the protein solution and 1 μl of the reservoir solution (total initial volume of 2 μl), mixed, and placed at 4°C to equilibrate. Crystals appeared within one week. Single crystals were removed and prepared for collection at 100 K. The cryo-solution consisted of 20% v/v ethylene glycol added to the reservoir solution. The diffraction patterns from crystals of this complex gave unit-cell dimensions of a = 46.8 Å, b = 102.6 Å, c = 186.7 Å. The symmetry was consistent with space group P212121. Based on this unit cell and space group, the asymmetric unit was determined to contain three independent HDHD4 monomers (53% solvent fraction). The structure coordinates for this crystal can be found in Table 2.
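The solvent fractions reported for the two crystal forms can be estimated from the unit cell, the number of molecules per cell and the molecular weight. The sketch below uses the standard triclinic cell-volume formula and the Matthews relation (solvent fraction ≈ 1 − 1.23/Vm, which assumes a typical protein partial specific volume of about 0.74 cm³/g). The source does not state how its percentages were obtained, so this is an assumed route, and it reproduces the reported values only to within a couple of percent.

```python
import math

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume in cubic angstroms for a general (triclinic) cell."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def solvent_fraction(volume, n_sym_ops, n_mol_asu, mw_da):
    """Matthews estimate: Vm = V / (Z * MW); solvent fraction = 1 - 1.23 / Vm."""
    vm = volume / (n_sym_ops * n_mol_asu * mw_da)
    return 1.0 - 1.23 / vm

# Triclinic form: P1 has one symmetry operator, two monomers per asymmetric unit.
v_p1 = cell_volume(46.4, 53.5, 64.0, 65.5, 75.0, 85.2)
solvent_p1 = solvent_fraction(v_p1, 1, 2, 27132.0)

# Orthorhombic form: P212121 has four symmetry operators, three monomers per asymmetric unit.
v_ortho = cell_volume(46.8, 102.6, 186.7)
solvent_ortho = solvent_fraction(v_ortho, 4, 3, 28625.0)

print(f"P1: V = {v_p1:.0f} A^3, solvent = {solvent_p1:.1%}")
print(f"P212121: V = {v_ortho:.0f} A^3, solvent = {solvent_ortho:.1%}")
```

Small differences from the reported 51% and 53% are expected, since the molecular weights used here are the calculated values quoted for the stock solutions and the crystallized species may differ slightly.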
Example 4
HDHD4 Structure Determination
The structure of HDHD4 was determined from experimental phases derived from the incorporation of selenomethionine. A three-wavelength MAD experiment (peak, inflection, and high-energy remote) was conducted (Beamline X12C, National Synchrotron Light Source, Brookhaven National Laboratory, Upton, NY, USA). The inverse-beam approach was used to guarantee the measurement of Friedel mates. The diffraction data were processed with the HKL suite (Otwinowski and Minor (1997) C.W. Carter and R.M. Sweet (ed.), Methods Enzymol., Macromolecular Crystallography part A, 276: 307-326, Academic Press, Inc., New York, NY). The program SHELXD (Usón and Sheldrick (1999) Curr. Opin. Struct. Biol. 9: 643-648; Schneider and Sheldrick (2002) Acta Cryst. D58: 1772-1779) was used to identify the selenium sub-structure from the anomalous signal contained in the structure-factor amplitudes. A total of 12 selenium sites, consistent with two molecules in the asymmetric unit as anticipated, were located. The selenium sites were refined with the program autoSHARP (LaFortelle and Bricogne (1997) C.W. Carter and R.M. Sweet (ed.), Methods Enzymol., Macromolecular Crystallography part A, 276: 472-494, Academic Press, Inc., New York, NY; Vonrhein, Blanc, Roversi, and Bricogne (2005) Automated structure solution with autoSHARP, in "Crystallographic Methods", S. Doublie (ed.), Humana Press, Totowa, NJ, submitted). The program SOLOMON (Abrahams and Leslie (1996) Acta Cryst. D52: 30-42) of the CCP4 suite (Collaborative Computational Project Number 4 (1994) Acta Cryst. D50: 760-763) was used to apply density modification/solvent flipping to the electron density map generated with the phases from the refined selenium coordinates. The structure factors associated with the density-modified map and the amino-acid sequence were passed to the program ARP/wARP (Lamzin and Wilson (1993) Acta Cryst. D49: 129-147), which built approximately 85% of the residues in the dimer in 15 fragments. The fragmented model was manually organized by protein molecule and the structure was completed by several rounds of refinement with the program CNX (Accelrys, Inc., San Diego, CA, USA) and model building with the program QUANTA (Accelrys, Inc., San Diego, CA, USA). The structure was refined to 2.0 Å resolution. The crystallographic residuals, R and Rfree, are 24.9% and 29.9%, respectively.
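For readers unfamiliar with the crystallographic residuals quoted above, the sketch below shows the conventional definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated separately over the reflections used in refinement and over a held-out "free" set. The reflection values are invented for illustration, the amplitudes are assumed to be already on a common scale, and this is not the CNX implementation.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum of |Fobs - Fcalc| over sum of Fobs (amplitudes)."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical reflections: (|Fobs|, |Fcalc|, flagged as free-set reflection).
reflections = [
    (100.0, 90.0, False),
    (80.0, 85.0, False),
    (60.0, 50.0, True),
    (40.0, 45.0, False),
    (20.0, 26.0, True),
]

work = [(fo, fc) for fo, fc, is_free in reflections if not is_free]
free = [(fo, fc) for fo, fc, is_free in reflections if is_free]

r_work = r_factor(*zip(*work))   # reflections used during refinement
r_free = r_factor(*zip(*free))   # reflections excluded from refinement

print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the free-set reflections are never used to adjust the model, Rfree is normally somewhat higher than R, as is the case for the values reported in this Example.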
Example 5
Features of the HDHD4 Active Site
The extended active site of HDHD4, including both the region shown to bind the phosphate mimetic vanadate and the region with the unmodeled density consistent with a small organic molecule, is lined with the following residues: D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194. See Figure 3 for a depiction of the active site.
Although represented by discontinuous electron density, the residues C60, F61, H62, P63, Y64, and N65 also likely form part of the extended active site of HDHD4 based on the proximity of their Cα carbons.
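A proximity criterion of the kind mentioned above can be expressed as a simple distance test on Cα coordinates. The sketch below is illustrative only: the coordinates, the reference point (for example, the position of the bound vanadate) and the 10 Å cutoff are all hypothetical, since the source does not state the threshold that was used.

```python
import math

def residues_near(ca_coords, center, cutoff):
    """Return labels of residues whose C-alpha lies within `cutoff` of `center`."""
    return sorted(
        label for label, xyz in ca_coords.items()
        if math.dist(xyz, center) <= cutoff
    )

# Hypothetical C-alpha coordinates in angstroms; not taken from Table 1 or Table 2.
ca_coords = {
    "D12": (0.0, 0.0, 3.0),
    "F61": (0.0, 9.0, 0.0),
    "K164": (4.0, 0.0, 0.0),
    "far_residue": (20.0, 0.0, 0.0),
}
site_center = (0.0, 0.0, 0.0)   # assumed reference point, e.g. the ligand position

nearby = residues_near(ca_coords, site_center, cutoff=10.0)
print(nearby)
```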
Example 6
Generation of a Homology Model
A multiple sequence alignment of a target protein and a template structure (e.g., HDHD4) may be carried out manually, conserving the overall secondary structure. Once the correspondence between amino acids in the target and template sequences is made, the coordinates for the structurally conserved regions may be assigned based on the coordinates of the template structure (e.g., HDHD4). Insertions, deletions and mutations may be incorporated into the template structure as desired to build an initial model.
The HDHD4 template structure may then be energy minimized to refine the molecular structure so that any steric strain that might have been introduced during the model-building process is eliminated.
The model may then be screened for unfavorable steric contacts and, if necessary, such side chains may be remodeled either by using a rotamer library database or by manually rotating the respective side chains to form a final homology model of the target structure. The modeling may be carried out, for example, on a Silicon Graphics OCTANE or FUEL computer (Silicon Graphics Inc., Mountain View, California, USA) using the Homology module in INSIGHT II (Accelrys Inc., San Diego, California, USA).
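The coordinate-assignment step of the procedure above can be sketched as follows: given a gapped pairwise alignment, each target residue aligned to a template residue inherits that residue's coordinates, while insertions in the target are left for later modeling. The sequences, coordinates and function names are hypothetical, and the sketch does not represent the INSIGHT II Homology module.

```python
def map_alignment(target_aln, template_aln):
    """Return {target_index: template_index} (0-based) for columns with no gap."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    t_idx = p_idx = 0
    for t_char, p_char in zip(target_aln, template_aln):
        if t_char != "-" and p_char != "-":
            mapping[t_idx] = p_idx
        if t_char != "-":
            t_idx += 1
        if p_char != "-":
            p_idx += 1
    return mapping

def assign_conserved_coords(target_aln, template_aln, template_ca):
    """Copy template C-alpha coordinates onto aligned target residues."""
    mapping = map_alignment(target_aln, template_aln)
    return {t: template_ca[p] for t, p in mapping.items()}

# Hypothetical aligned fragments ('-' marks a gap) and template C-alpha coordinates.
target_aln = "MK-LDV"
template_aln = "MKALD-"
template_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
               (11.4, 0.0, 0.0), (15.2, 0.0, 0.0)]

model_ca = assign_conserved_coords(target_aln, template_aln, template_ca)
unmodeled = [i for i in range(5) if i not in model_ca]   # target has 5 residues
print(model_ca, unmodeled)
```

Residues that remain unassigned (here the final target residue, which has no template counterpart) correspond to the insertions that must be built and then relaxed by energy minimization.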
Table 1 Table of VG-10 HDHD4 Structure Coordinates
[Table 1 structure coordinates: reproduced in the source publication as page images imgf000083_0001 through imgf000136_0001; not available as text.]
Table 2 Table of Wild-Type HDHD4 Structure Coordinates
[Table 2 structure coordinates: reproduced in the source publication as page images imgf000136_0002 through imgf000218_0001; not available as text.]
References
The references cited in the specification are incorporated herein by reference to the extent that they supplement, explain, provide a background for or teach methodology, techniques and/or compositions employed herein. All cited patents, including patent applications, and publications referred to in this application are herein expressly incorporated by reference. Also expressly incorporated herein by reference are the contents of all citations of GenBank accession numbers, LocusIDs, and other computer database listings, as well as the contents of the Sequence Listing associated herewith.
It will be understood that various details of the invention may be changed without departing from the scope of the invention. Furthermore, the foregoing description is for the purpose of illustration only.

Claims

What is claimed is:
1. A crystalline form comprising a complex comprising a HDHD4 polypeptide and a moiety comprising a metal atom.
2. The crystalline form of claim 1, wherein the HDHD4 polypeptide comprises the cap domain of HDHD4.
3. The crystalline form of claim 1, wherein the HDHD4 polypeptide comprises the core domain.
4. The crystalline form of claim 1, wherein the HDHD4 polypeptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:2 and SEQ ID NO:4.
5. The crystalline form of claim 1, wherein the HDHD4 polypeptide is encoded by a nucleic acid selected from the group consisting of SEQ ID NO: 1, sequences deviating from SEQ ID NO: 1 due to the degeneracy in the genetic code, SEQ ID NO: 3 and sequences deviating from SEQ ID NO: 3 due to the degeneracy in the genetic code.
6. The crystalline form of claim 1, wherein the moiety is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both manganese and a phosphate mimetic, both magnesium and phosphate mimetic, both calcium and a phosphate mimetic.
7. The crystalline form of claim 6, wherein the phosphate mimetic is selected from the group consisting of vanadate, phosphate, tungstate, sulfate and aluminum trifluoride.
8. The crystalline form of claim 1, wherein the crystalline form has lattice constants a = 46.4 Å, b = 53.5 Å, c = 64.0 Å, α = 65.5°, β = 75.0°, and γ = 85.2°.
9. The crystalline form of claim 1, wherein the crystalline form has lattice constants a = 46.4 Å, b = 53.5 Å, c = 63.9 Å, α = 114.5°, β = 100.4°, and γ = 95.0°.
10. The crystalline form of claim 1, wherein the crystalline form has lattice constants a = 46.8 Å, b = 102.6 Å, c = 186.7 Å.
11. The crystalline form of claim 1, wherein the crystalline form is a triclinic crystalline form.
12. The crystalline form of claim 1, wherein the crystalline form belongs to the space group selected from the group consisting of P1 and P212121.
13. The crystalline form of claim 1, wherein the crystalline form contains two or three HDHD4 polypeptides in the asymmetric unit.
14. The crystalline form of claim 1, wherein the crystalline form is further characterized by the structure coordinates selected from the group consisting of Table 1 and Table 2.
15. The crystalline form of claim 1, wherein the crystalline form is such that the three-dimensional structure of the complex can be determined to a resolution of about 3.0 Å or better.
16. The crystalline form of claim 1, wherein the crystalline form contains one or more atoms having an atomic weight of 40 g/mol or greater.
17. The crystalline form of claim 1, further comprising a ligand.
18. A method of determining the three-dimensional structure of a crystallized HDHD4 polypeptide in complex with a moiety comprising a metal atom comprising:
(a) crystallizing a HDHD4 polypeptide in complex with a moiety comprising a metal atom to form a crystallized complex; and
(b) analyzing the crystallized complex to determine a three-dimensional structure of the HDHD4 polypeptide in complex with a moiety comprising a metal atom to a resolution of about 3.0 Å or better.
19. The method of claim 18, wherein the analyzing is by X-ray diffraction.
20. The method of claim 18, wherein the HDHD4 polypeptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:2 and SEQ ID NO:4.
21. The method of claim 18, wherein the HDHD4 polypeptide comprises the cap domain of HDHD4.
22. The method of claim 18, wherein the HDHD4 polypeptide comprises the core domain of HDHD4.
23. The method of claim 18, wherein the HDHD4 polypeptide is encoded by a nucleic acid selected from the group consisting of SEQ ID NO: 1, sequences deviating from SEQ ID NO: 1 due to the degeneracy in the genetic code, SEQ ID NO:3 and sequences deviating from SEQ ID NO:3 due to the degeneracy in the genetic code.
24. The method of claim 18, wherein the moiety is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both manganese and a phosphate mimetic, both magnesium and phosphate mimetic, both calcium and a phosphate mimetic.
25. The method of claim 24, wherein the phosphate mimetic is vanadate, phosphate, tungstate, sulfate and aluminum trifluoride.
26. A method of designing a modulator of HDHD4 comprising:
(a) designing a candidate modulator of HDHD4 that will make interactions with amino acids in a ligand binding site of a HDHD4, based upon a crystalline structure comprising a HDHD4 polypeptide in complex with a moiety comprising a metal atom;
(b) synthesizing the candidate modulator; and
(c) determining whether the candidate modulator modulates the activity of HDHD4, whereby a modulator of HDHD4 is designed.
27. The method of claim 26, wherein the HDHD4 polypeptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:2 and SEQ ID NO:4.
28. The method of claim 26, wherein the HDHD4 polypeptide comprises the cap domain of HDHD4.
29. The method of claim 26, wherein the HDHD4 polypeptide comprises the core domain of HDHD4.
30. The method of claim 26, wherein the HDHD4 polypeptide is encoded by a nucleic acid selected from the group consisting of SEQ ID NO: 1, sequences deviating from SEQ ID NO: 1 due to the degeneracy in the genetic code, SEQ ID NO: 3 and sequences deviating from SEQ ID NO: 3 due to the degeneracy in the genetic code.
31. The method of claim 26, wherein the moiety is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both manganese and a phosphate mimetic, both magnesium and phosphate mimetic, both calcium and a phosphate mimetic.
32. The method of claim 31, wherein the phosphate mimetic is vanadate, phosphate, tungstate, sulfate and aluminum trifluoride.
33. The method of claim 26, wherein the ligand binding site of the HDHD4 polypeptide comprises HDHD4 residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64, and N65 and subcombinations thereof.
34. The method of claim 26, further comprising assaying the modulatory properties of the candidate modulator by contacting the candidate modulator with a cell extract or purified HDHD4 polypeptide to determine whether it is a modulator of HDHD4 activity.
35. A method of identifying a HDHD4 modulator comprising:
(a) inputting structure coordinates describing a three-dimensional structure of a HDHD4 polypeptide in complex with a moiety comprising a metal atom to modeling software disposed on a computer; and
(b) modeling a candidate modulator that forms one or more desired interactions with one or more amino acids of a ligand binding site of the HDHD4 and fits sterically within the HDHD4 binding pocket.
36. The method of claim 35, wherein the ligand binding site of the HDHD4 polypeptide comprises HDHD4 residues D12, L13, D14, N15, I18, T20, A21, G22, A23, S24, R25, M28, Q53, V54, L56, S57, K58, E59, R72, W100, R104, M108, T131, N132, G133, D134, T137, Q138, K141, E163, K164, D189, T190, T193, and D194, optionally C60, F61, H62, P63, Y64, and N65 and subcombinations thereof.
37. The method of claim 35, wherein the HDHD4 polypeptide comprises the amino acid sequence selected from the group consisting of SEQ ID NO:2 and SEQ ID NO:4.
38. The method of claim 35, wherein the HDHD4 polypeptide comprises the cap domain of HDHD4.
39. The method of claim 35, wherein the HDHD4 polypeptide comprises the core domain.
40. The method of claim 35, wherein the HDHD4 polypeptide is encoded by a nucleic acid selected from the group consisting of SEQ ID NO: 1, sequences deviating from SEQ ID NO: 1 due to the degeneracy in the genetic code, SEQ ID NO:3 and sequences deviating from SEQ ID NO:3 due to the degeneracy in the genetic code.
41. The method of claim 35, wherein the moiety is selected from the group consisting of magnesium, manganese, calcium, a phosphate mimetic, both manganese and a phosphate mimetic, both magnesium and phosphate mimetic, both calcium and a phosphate mimetic.
42. The method of claim 41, wherein the phosphate mimetic is vanadate, phosphate, tungstate, sulfate and aluminum trifluoride.
43. The method of claim 35, further comprising assaying the modulatory properties of the candidate modulator by contacting the candidate modulator with a cell extract or purified HDHD4 polypeptide to determine whether it is a modulator of HDHD4 activity.
PCT/US2007/064983 2006-03-27 2007-03-27 Three-dimensional structure of hdhd4 complexed with magnesium and a phosphate mimetic WO2007112377A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US78632306P 2006-03-27 2006-03-27
US60/786,323 2006-03-27

Publications (3)

Publication Number Publication Date
WO2007112377A2 WO2007112377A2 (en) 2007-10-04
WO2007112377A9 true WO2007112377A9 (en) 2007-11-15
WO2007112377A3 WO2007112377A3 (en) 2008-01-10

Family

ID=38541859

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/064983 WO2007112377A2 (en) 2006-03-27 2007-03-27 Three-dimensional structure of hdhd4 complexed with magnesium and a phosphate mimetic

Country Status (1)

Country Link
WO (1) WO2007112377A2 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1412746A4 (en) * 2001-07-12 2007-08-08 Exelixis Inc HADHs AS MODIFIERS OF THE P21 PATHWAY AND METHODS OF USE

Also Published As

Publication number Publication date
WO2007112377A3 (en) 2008-01-10
WO2007112377A2 (en) 2007-10-04

Similar Documents

Publication Publication Date Title
Seiradake et al. Crystal structures of the human and fungal cytosolic Leucyl-tRNA synthetase editing domains: a structural basis for the rational design of antifungal benzoxaboroles
Yanagisawa et al. Crystallographic studies on multiple conformational states of active-site loops in pyrrolysyl-tRNA synthetase
Kissinger et al. Crystal structure of human ABAD/HSD10 with a bound inhibitor: implications for design of Alzheimer's disease therapeutics
Barinka et al. A high-resolution structure of ligand-free human glutamate carboxypeptidase II
US20090062286A1 (en) Crystal Structure of SMYD3 Protein
Sivaraman et al. Crystal structure of histidinol phosphate aminotransferase (HisC) from Escherichia coli, and its covalent complex with pyridoxal-5′-phosphate and l-histidinol phosphate
Vostrukhina et al. The structure of Aquifex aeolicus FtsH in the ADP-bound state reveals a C2-symmetric hexamer
Mustelin et al. Structure of the hematopoietic tyrosine phosphatase (HePTP) catalytic domain: structure of a KIM phosphatase with phosphate bound at the active site
Wada et al. Crystal structures of Escherichia coli γ-glutamyltranspeptidase in complex with azaserine and acivicin: Novel mechanistic implication for inhibition by glutamine antagonists
US20040171019A1 (en) PIN1 peptidyl-prolyl isomerase polypeptides, their crystal structures, and use thereof for drug design
Lee et al. Dihydroorotase from Escherichia coli: loop movement and cooperativity between subunits
Lountos et al. Structure of human dual-specificity phosphatase 27 at 2.38 Å resolution
WO2001011054A9 (en) CRYSTALLIZATION AND STRUCTURE DETERMINATION OF STAPHYLOCOCCUS AUREUS UDP-N-ACETYLENOLPYRUVYLGLUCOSAMINE REDUCTASE (S. AUREUS MurB)
US20090275047A1 (en) Crystal structure of human soluble adenylate cyclase
WO2007112377A2 (en) Three-dimensional structure of hdhd4 complexed with magnesium and a phosphate mimetic
Lu et al. Structure of nicotinic acid mononucleotide adenylyltransferase from Bacillus anthracis
US20040209344A1 (en) Crystal structure of angiotensin-converting enzyme-related carboxypeptidase
AU781654B2 (en) Crystallization and structure determination of staphylococcus aureus thymidylate kinase
US7563610B1 (en) Crystalline composition of farsenyl pyrophosphate synthase (IspA)
EP1247860A1 (en) Crystal structure of pyruvate dehydrogenase kinase 2 (PDHK-2) and use thereof in methods for identifying and designing new ligands
US20070015270A1 (en) Crystalline PDE4D2 catalytic domain complex, and methods for making and employing same
US7319016B1 (en) Crystallization of cathepsin S
US20050208639A1 (en) Crystal structure of staphylococcus undecaprenyl pyrophosphate synthase and uses thereof
US7507552B1 (en) Crystallization of histone deacetylase 2
US20040191271A1 (en) Crystal structures of streptococcus undecaprenyl pyrophosphate synthase and uses thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07759431

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07759431

Country of ref document: EP

Kind code of ref document: A2