US7344852B1 - Crystallization of dipeptidyl peptidase IV (DPPIV) - Google Patents

Crystallization of dipeptidyl peptidase IV (DPPIV)

Info

Publication number
US7344852B1
Authority
US
United States
Prior art keywords
dppiv
protein
crystal
seq
residues
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US10/659,055
Inventor
Kathleen Aertgeerts
Ciaran N. Cronin
David J. Hosfield
Mark W. Knuth
Duncan E. McRee
Sridhar Prasad
Bi Ching Sang
Robert J. Skene
Robert A. Wijnands
Sheng Ye
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Takeda Pharmaceutical Co Ltd
Takeda California Inc
Original Assignee
Takeda San Diego Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Takeda San Diego Inc
Priority to US10/659,055
Assigned to SYRRX, INC. Assignment of assignors interest (see document for details). Assignors: AERTGAERTS, KATHLEEN, HOSFIELD, DAVID J., CRONIN, CLARAN N., SKENE, ROBERT J., WIJNANDS, ROBERT A., MCRAE, DUNCAN E., PRASAD, SRIDHAR, KNUTH, MARK W., SANG, BI CHING, YE, SHANG
Assigned to TAKEDA PHARMACEUTICAL COMPANY LIMITED. Assignment of assignors interest (see document for details). Assignors: TAKEDA SAN DIEGO, INC.
Application granted
Publication of US7344852B1
Expired - Fee Related
Adjusted expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/34 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
    • C12Q1/37 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase involving peptidase or proteinase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y304/00 Hydrolases acting on peptide bonds, i.e. peptidases (3.4)
    • C12Y304/14 Dipeptidyl-peptidases and tripeptidyl-peptidases (3.4.14)
    • C12Y304/14005 Dipeptidyl-peptidase IV (3.4.14.5)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2299/00 Coordinates from 3D structures of peptides, e.g. proteins or enzymes
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases

Definitions

  • the present invention relates to a member of the S9 family of human proteases known as Dipeptidyl Peptidases (DPP) and more specifically to a particular dipeptidyl peptidase known as dipeptidyl peptidase IV (DPPIV).
  • in particular, the present invention relates to DPPIV in crystalline form, methods of forming crystals comprising DPPIV, methods of using crystals comprising DPPIV, a crystal structure of DPPIV, and methods of using the crystal structure.
  • a general approach to designing inhibitors that are selective for a given protein is to determine how a putative inhibitor interacts with a three dimensional structure of that protein. For this reason it is useful to obtain the protein in crystalline form and perform X-ray diffraction techniques to determine the protein's three-dimensional structure coordinates.
  • Various methods for preparing crystalline proteins are known in the art.
  • crystallographic data can be generated using the crystals to provide useful structural information that assists in the design of small molecules that bind to the active site of the protein and inhibit the protein's activity in vivo.
  • when the protein is crystallized as a complex with a ligand, one can determine both the shape of the protein's binding pocket when bound to the ligand and the amino acid residues that are capable of close contact with the ligand. By knowing the shape of the binding pocket and the amino acid residues it comprises, one may design new ligands that will interact favorably with the protein. With such structural information, available computational methods may be used to predict how strong the ligand binding interaction will be. Such methods aid in the design of inhibitors that bind strongly, as well as selectively, to the protein.
  • the present invention is directed to crystals comprising DPPIV and particularly crystals comprising DPPIV that have sufficient size and quality to obtain useful information about the structural properties of DPPIV and molecules or complexes that may associate with DPPIV.
  • in one embodiment, a composition is provided that comprises a protein in crystalline form wherein the protein has 65%, 70%, 80%, 90%, 95% or greater identity with residues 13-740 of SEQ ID NO:3.
  • the protein has activity characteristic of DPPIV.
  • the protein may optionally be inhibited by inhibitors of wild type DPPIV.
  • the protein may also diffract X-rays for a determination of structure coordinates to a resolution of 4 Å, 3 Å, 2.5 Å, 2 Å or less.
  • the protein crystal has a crystal lattice in a P2₁ space group.
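The identity thresholds above (65%, 70%, 80%, 90%, 95%) can be illustrated with a minimal sketch. The source does not state how identity is computed, so this sketch assumes two sequences that have already been aligned (gaps written as `-`) and counts identical residues over the columns where neither sequence has a gap. The function names and the choice of denominator are assumptions, not the patent's definition.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are pre-aligned and of equal length; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the pair reaches the given identity threshold (65% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the convention should be fixed before comparing against a threshold.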
  • the present invention is also directed to crystallizing DPPIV.
  • the present invention is also directed to the conditions useful for crystallizing DPPIV. It should be recognized that a wide variety of crystallization methods can be used in combination with the crystallization conditions to form crystals comprising DPPIV including, but not limited to, vapor diffusion, batch, and dialysis.
  • a method for forming crystals of a protein comprising: forming a crystallization volume comprising: a protein that has at least 65%, 70%, 80%, 90%, 95% identity with residues 13-740 of SEQ ID NO:3 in a concentration between 1 mg/ml and 50 mg/ml; 5-50% w/v of precipitant wherein the precipitant comprises one or more members of the group consisting of PEG MME having a molecular weight range between 300-10000, and PEG having a molecular weight range between 100-10000; optionally 0.05 to 0.8M additives wherein the additives comprise sarcosine or 0.5 to 25% additives wherein the additives comprise xylitol; and wherein the crystallization volume has a pH between pH 5 and pH 9; and storing the crystallization volume under conditions suitable for crystal formation.
  • the method optionally further comprises using 0.05-0.2M buffers selected from the group consisting of tris-HCl, bicine and combinations thereof.
  • the method also optionally further further
  • the method may optionally further comprise forming a protein crystal that has a crystal lattice in a P2₁ space group.
  • the invention also relates to protein crystals formed by these methods.
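The numeric ranges in the crystallization method above can be captured as a simple range check. This is a sketch only: the class name, field names, and the treatment of each range as inclusive are assumptions for illustration, and the check does not cover the precipitant type or molecular weight.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrystallizationVolume:
    # Hypothetical field names; units follow the ranges quoted in the text.
    protein_mg_per_ml: float
    precipitant_pct_wv: float
    ph: float
    sarcosine_molar: Optional[float] = None   # optional additive
    xylitol_pct: Optional[float] = None       # optional additive
    buffer_molar: Optional[float] = None      # optional tris-HCl and/or bicine


def out_of_range(volume: CrystallizationVolume) -> List[str]:
    """Return a description of each value outside the stated range (inclusive bounds assumed)."""
    problems: List[str] = []

    def check(name: str, value: Optional[float], low: float, high: float) -> None:
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} outside [{low}, {high}]")

    check("protein_mg_per_ml", volume.protein_mg_per_ml, 1, 50)
    check("precipitant_pct_wv", volume.precipitant_pct_wv, 5, 50)
    check("ph", volume.ph, 5, 9)
    check("sarcosine_molar", volume.sarcosine_molar, 0.05, 0.8)
    check("xylitol_pct", volume.xylitol_pct, 0.5, 25)
    check("buffer_molar", volume.buffer_molar, 0.05, 0.2)
    return problems
```

An empty list means every supplied value falls inside the quoted ranges; it says nothing about whether crystals will actually form.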
  • machine readable data storage medium having data storage material encoded with machine readable data, the machine readable data comprising: structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues 13-740 of SEQ ID NO:3.
  • machine readable data storage medium having data storage material encoded with machine readable data, the machine readable data comprising: structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in Tables 1, 2, 3 and/or 4.
  • the present invention is also directed to a three-dimensional structure of all or a portion of DPPIV.
  • This three-dimensional structure may be used to identify binding sites, to provide mutants having desirable binding properties, and ultimately, to design, characterize, or identify ligands capable of interacting with DPPIV.
  • Ligands that interact with DPPIV may be any type of atom, compound, protein or chemical group that binds to or otherwise associates with the protein. Examples of types of ligands include natural substrates for DPPIV, inhibitors of DPPIV, and heavy atoms.
  • a method for displaying a three dimensional representation of a structure of a protein comprising: taking machine readable data comprising structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; computing a three dimensional representation of a structure based on the structure coordinates; and displaying the three dimensional representation.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present.
  • the root mean square deviation of alpha-carbon atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
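The root mean square deviation criterion used throughout can be sketched as follows. The sketch assumes the two coordinate sets have already been superimposed (the superposition itself, for example by a least-squares fit, is not shown) and that each set is a mapping from residue number to the alpha-carbon position; both the data layout and the function name are assumptions. Only residues present in both sets contribute, mirroring the "also present in" restriction in the text.

```python
import math
from typing import Dict, Tuple

Coordinate = Tuple[float, float, float]


def ca_rmsd(model: Dict[int, Coordinate], reference: Dict[int, Coordinate]) -> float:
    """RMSD in the input units over alpha-carbons of residues common to both sets.

    Assumption: 'model' has already been superimposed on 'reference'.
    """
    common = sorted(set(model) & set(reference))
    if not common:
        raise ValueError("no residues in common")
    total = 0.0
    for residue in common:
        total += sum((m - r) ** 2 for m, r in zip(model[residue], reference[residue]))
    return math.sqrt(total / len(common))
```

A result below 3 (with coordinates in Å) would satisfy the broadest threshold quoted above; the same function can be reused with main-chain or all-atom keys if the mapping is keyed per atom instead of per residue.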
  • the present invention is also directed to a method for solving a three-dimensional crystal structure of a target protein using the structure of DPPIV.
  • a computational method comprising taking machine readable data comprising structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; computing phases based on the structural coordinates; computing an electron density map based on the computed phases; and determining a three-dimensional crystal structure based on the computed electron density map.
  • a computational method comprising: taking an X-ray diffraction pattern of a crystal of the target protein; and computing a three-dimensional electron density map from the X-ray diffraction pattern by molecular replacement, wherein structure coordinates used as a molecular replacement model comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3.
  • This method may optionally further comprise determining a three-dimensional crystal structure based upon the computed three-dimensional electron density map.
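The step from computed phases to an electron density map can be written with the standard Fourier synthesis used in crystallography. This relation is textbook background rather than something stated in the source; the symbols (V for unit-cell volume, h, k, l for reflection indices, x, y, z for fractional coordinates) are introduced here for illustration.

```latex
\rho(x, y, z) \;=\; \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
  \left| F(hkl) \right| \, e^{\,i\varphi(hkl)} \, e^{-2\pi i\,(hx + ky + lz)}
```

In a molecular replacement setting, the amplitudes |F(hkl)| come from the measured diffraction pattern of the target crystal, while the initial phases φ(hkl) are calculated from the search model once it has been oriented and positioned in the target unit cell.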
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present.
  • the root mean square deviation of alpha-carbon atoms, main-chain atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • the present invention is also directed to using a crystal structure of DPPIV, in particular the structure coordinates of DPPIV and the surface contour defined by them, in methods for screening, designing, or optimizing molecules or other chemical entities that interact with and preferably inhibit DPPIV.
  • a further aspect of the present invention relates to using a three-dimensional crystal structure of all or a portion of DPPIV and/or its structure coordinates to evaluate the ability of entities to associate with DPPIV.
  • the entities may be any entity that may function as a ligand and thus may be any type of atom, compound, protein (such as antibodies) or chemical group that can bind to or otherwise associate with a protein.
  • a method for evaluating a potential of an entity to associate with a protein comprising: creating a computer model of a protein structure using structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; performing a fitting operation between the entity and the computer model; and analyzing results of the fitting operation to quantify an association between the entity and the model.
  • a method for evaluating a potential of an entity to associate with a protein comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; evaluating a potential of an entity to associate with the surface contour by performing a fitting operation between the entity and the surface contour; and analyzing results of the fitting operation to quantify an association between the entity and the computer model.
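To make "performing a fitting operation" and "quantifying an association" concrete, here is a deliberately simple toy. It is not the fitting procedure of the source, which does not specify one. It assumes rigid point atoms, a score that rewards entity-pocket atom pairs within a contact distance and penalizes pairs closer than a clash distance, and a coarse grid search over translations only. The distances, penalty, and function names are all illustrative assumptions.

```python
import math
from itertools import product
from typing import List, Tuple

Point = Tuple[float, float, float]


def contact_score(entity: List[Point], pocket: List[Point],
                  contact: float = 4.0, clash: float = 2.0) -> int:
    """+1 per atom pair within the contact distance, -10 per pair closer than the clash distance."""
    score = 0
    for e in entity:
        for p in pocket:
            d = math.dist(e, p)
            if d < clash:
                score -= 10
            elif d <= contact:
                score += 1
    return score


def best_translation(entity: List[Point], pocket: List[Point],
                     span: float = 2.0, step: float = 1.0) -> Tuple[int, Point]:
    """Grid-search rigid translations of the entity; return (best score, offset)."""
    n = int(span / step)
    offsets = [i * step for i in range(-n, n + 1)]
    best_score = None
    best_offset = (0.0, 0.0, 0.0)
    for dx, dy, dz in product(offsets, repeat=3):
        moved = [(x + dx, y + dy, z + dz) for x, y, z in entity]
        s = contact_score(moved, pocket)
        if best_score is None or s > best_score:
            best_score, best_offset = s, (dx, dy, dz)
    return best_score, best_offset
```

Practical docking programs also sample rotations and ligand conformations and use physically motivated energy terms; the point here is only the shape of the loop: place, score, keep the best.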
  • a method for identifying entities that can associate with a protein comprising: generating a three-dimensional structure of a protein using structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; employing the three-dimensional structure to design or select an entity that can associate with the protein; and contacting the entity with a protein having at least 65% identity with residues 13-740 of SEQ ID NO:3.
  • a method for identifying entities that can associate with a protein comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; employing the computer model to design or select an entity that can associate with the protein; and contacting the entity with a protein having at least 65%, 70%, 80%, 90% or 95% identity with residues 13-740 of SEQ ID NO:3.
  • a method for evaluating the ability of an entity to associate with a protein comprising: constructing a computer model defined by structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the binding pocket.
  • a method for evaluating the ability of an entity to associate with a protein comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the binding pocket.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present.
  • the root mean square deviation of alpha-carbon atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • the protein may optionally have activity characteristic of DPPIV.
  • the protein may optionally be inhibited by inhibitors of wild type DPPIV.
  • a method for identifying an entity that associates with a protein comprising: taking structure coordinates from diffraction data obtained from a crystal of a protein that has at least 65%, 70%, 80%, 90%, 95% or more identity with the residues 13-740 of SEQ ID NO:3 and performing rational drug design using a three dimensional structure that is based on the obtained structure coordinates.
  • the method may optionally further comprise selecting one or more entities based on the rational drug design and contacting the selected entities with the protein.
  • the method may also optionally further comprise measuring an activity of the protein when contacted with the one or more entities.
  • the method also may optionally further comprise comparing activity of the protein in a presence of and in the absence of the one or more entities; and selecting entities where activity of the protein changes depending whether a particular entity is present.
  • the method also may optionally further comprise contacting cells expressing the protein with the one or more entities and detecting a change in a phenotype of the cells when a particular entity is present.
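The comparison of protein activity in the presence and absence of an entity can be expressed as a percent change relative to the no-entity control. The sketch below is one common way to do this; the 50% selection cutoff, the function names, and the dictionary layout are assumptions for illustration and do not come from the source.

```python
from typing import Dict, List


def percent_inhibition(activity_without: float, activity_with: float) -> float:
    """Percent reduction in activity relative to the no-entity control.

    Negative values indicate that activity increased in the presence of the entity.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (1.0 - activity_with / activity_without)


def select_entities(control_activity: float, measurements: Dict[str, float],
                    min_change_pct: float = 50.0) -> List[str]:
    """Names of entities whose presence changes activity by at least min_change_pct."""
    return [
        name
        for name, activity in measurements.items()
        if abs(percent_inhibition(control_activity, activity)) >= min_change_pct
    ]
```

Using the absolute value keeps entities that raise activity as well as those that lower it, which matches the text's wording that activity "changes" depending on whether the entity is present.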
  • FIG. 1 illustrates SEQ. ID Nos. 1, 2 and 3 referred to in this application.
  • FIG. 2 illustrates a crystal of DPPIV complex.
  • FIG. 3 lists a set of atomic structure coordinates for DPPIV (SEQ ID NO:3) as derived by X-ray crystallography from a crystal that comprises the protein.
  • the following abbreviations are used in FIG. 3: “X, Y, Z” crystallographically define the atomic position of the element measured; “B” is a thermal factor that measures movement of the atom around its atomic center; “Occ” is an occupancy factor that refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates (a value of “1” indicates that each atom has the same conformation, i.e., the same position, in all molecules of the crystal).
  • NAG stands for N-Acetylglucosamine.
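The X, Y, Z, Occ and B fields described above are the same quantities carried by ATOM/HETATM records in the widely used PDB fixed-column format. The exact layout of FIG. 3 is not reproduced in this text, so the sketch below assumes PDB-style columns; the record type and function name are hypothetical.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    atom_name: str
    residue_name: str   # e.g. an amino acid code, or NAG for N-acetylglucosamine
    chain: str
    residue_number: int
    x: float
    y: float
    z: float
    occ: float          # occupancy factor
    b: float            # thermal (B) factor


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one ATOM/HETATM line, assuming standard PDB fixed-column positions."""
    return AtomRecord(
        atom_name=line[12:16].strip(),
        residue_name=line[17:20].strip(),
        chain=line[21],
        residue_number=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occ=float(line[54:60]),
        b=float(line[60:66]),
    )
```

Because the format is column-based rather than whitespace-delimited, slicing by position is more robust than `str.split()`, which fails when adjacent fields run together.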
  • FIG. 4A illustrates a ribbon diagram overview of the structure of DPPIV, highlighting secondary structural elements of the protein.
  • FIG. 4B illustrates another ribbon diagram overview of the structure of DPPIV, highlighting additional secondary structural elements of the protein.
  • FIG. 5 illustrates the binding site of DPPIV based on the determined crystal structure for the molecule in the asymmetric unit corresponding to the coordinates shown in FIG. 3.
  • FIG. 6 illustrates a system that may be used to carry out instructions for displaying a crystal structure of DPPIV encoded on a storage medium.
  • the present invention relates to a member of the S9 family of human proteases known as dipeptidyl peptidase IV (DPPIV) (SEQ ID NO:1). More specifically, the present invention relates to DPPIV in crystalline form, methods of forming crystals comprising DPPIV, methods of using crystals comprising DPPIV, structure coordinates and a crystal structure of DPPIV, and methods of using the structure coordinates and crystal structure.
  • amino acids comprising the protein.
  • Dipeptidyl Peptidase IV (DPPIV) (SEQ ID NO:1) is a serine protease of Clan SC, family S9.
  • DPPIV is a 240 kDa homodimeric, multi-functional type-II membrane bound glycoprotein, widely distributed in all mammalian tissues, but highly expressed in kidney, liver and endothelium.
  • DPPIV is also known as DPP4, CD26, adenosine deaminase complexing protein 2 or adenosine deaminase binding protein (ADAbp).
  • DPPIV consists of a short cytoplasmic domain of six amino acids, followed by a hydrophobic transmembrane domain (amino acids 7-28) and an extracellular sequence of 739 amino acids.
  • DPPIV is a highly specific aminopeptidase and releases dipeptides from the amino terminus of peptides with a Pro or Ala in the penultimate position. N-terminal degradation of the substrate peptides may result in the activation, inactivation or modulation of their activity. Besides its well-known exopeptidase activity, DPPIV also exhibits endopeptidase activity towards denatured collagen. Expression of DPPIV is tightly associated with cell adhesion and is a co-stimulant during T-cell activation and proliferation.
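The specificity rule stated above (release of the N-terminal dipeptide when Pro or Ala is at the penultimate position) can be sketched as a simple string rule. This sketch reads "penultimate position" as the second residue from the amino terminus, uses one-letter amino acid codes, and requires at least one residue to remain after cleavage; those choices, the function name, and the example peptides are illustrative assumptions, and the rule ignores every other determinant of real substrate recognition.

```python
from typing import Optional, Tuple


def dppiv_cleave(peptide: str) -> Optional[Tuple[str, str]]:
    """Apply the simplified rule: return (released dipeptide, remaining peptide),
    or None if the second residue is not Pro (P) or Ala (A)."""
    if len(peptide) >= 3 and peptide[1] in ("P", "A"):
        return peptide[:2], peptide[2:]
    return None


# Illustrative peptides only, not sequences taken from the source.
print(dppiv_cleave("HAEGTF"))   # second residue is Ala, so the rule applies
print(dppiv_cleave("GQSTVL"))   # second residue is Gln, so the rule does not apply
```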
  • GLP-1 glucagon-like peptide 1
  • GIP glucose-dependent insulinotropic polypeptide
  • DPPIV comprises the wild-type form of full length DPPIV, set forth herein as SEQ. ID No. 1 (GenBank Accession Number NM_001935; “Dipeptidyl peptidase IV (CD 26) gene expression in enterocyte-like colon cancer cell lines HT-29 and Caco-2. Cloning of the complete human coding sequence and changes of dipeptidyl peptidase IV mRNA levels during cell differentiation”, Darmoul, D., Lacasa, M., Baricault, L., Marguet, D., Sapin, C., Trotot, P., Barbat, A. and Trugnan, G., J. Biol. Chem. 267 (7), 4824-4833, 1992).
  • DPPIV comprises residues 13-740 of SEQ ID NO:3 which comprises the active site domain of wild-type DPPIV that is represented in the set of structural coordinates shown in FIG. 3.
  • DPPIV comprises a sequence that has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher identity with SEQ. ID No. 1.
  • DPPIV DPPIV-derived neuropeptides
  • SEQ. ID No. 3 which includes a 12 residue N-terminal tag (6 residues of which are histidine) that may be used to facilitate purification of the protein.
  • the DPPIV amino acids shown in Table 1 encompass a 4-Angstrom radius around the DPPIV active site and are thus likely to interact with any active site inhibitor of DPPIV.
  • Applicants have also determined that the amino acids of Table 2 encompass a 7-Angstrom radius around the DPPIV active site.
  • the amino acids of Table 3 encompass a 10-Angstrom radius around the DPPIV active site.
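The 4, 7 and 10 Angstrom shells behind Tables 1, 2 and 3 suggest a distance-based selection. The source does not state exactly how the radius is measured, so the sketch below assumes a residue is selected when any of its atoms lies within the radius of any atom in a reference set marking the active site. The data layout and names are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float, float]


def residues_within(atoms: Iterable[Tuple[str, Point]],
                    site_atoms: Sequence[Point],
                    radius: float) -> List[str]:
    """Residue identifiers with at least one atom within 'radius' of any site atom.

    atoms: (residue_id, (x, y, z)) pairs for the protein.
    site_atoms: coordinates taken to represent the active site (an assumption).
    """
    selected = set()
    for residue_id, xyz in atoms:
        if any(math.dist(xyz, s) <= radius for s in site_atoms):
            selected.add(residue_id)
    return sorted(selected)
```

By construction the selections are nested: every residue returned for a 4 Å radius is also returned for 7 Å and 10 Å, which is consistent with the three tables describing progressively larger shells around the same site.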
  • there are four different DPPIV molecules in the asymmetric unit, referred to as chains A, B, C and D. As a result, four sets of structure coordinates were obtained for each amino acid. There are two dimers formed in the asymmetric unit; one dimer is formed between molecules A and B and the other between molecules C and D.
  • amino acids of Table 4 encompass a 5-Angstrom radius around the DPPIV amino acids that interact at the AB and CD dimerization interfaces.
  • the A, B, C and D sets of structural coordinates appear in FIG. 3. It is noted that the sequence and structure of the residues in the active site and dimerization interface may also be conserved and hence pertinent to other S9 proteases.
  • DPPIV may optionally comprise a sequence that has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher identity with any one of the above sequences (e.g., all of SEQ ID NO:3 or residues 13-740 of SEQ ID NO:3) where at least the residues shown in tables 1, 2, 3 and/or 4 are conserved with the exception of 0, 2, 3 or 4 residues. It should be recognized that one might optionally vary some of the binding site residues in order to determine the effect such changes have on structure or activity.
  • a wide variety of DPPIV variants e.g., insertions, deletions, substitutions, etc.
  • DPPIV variants that fall within the above specified identity ranges may be designed and manufactured utilizing recombinant DNA techniques well known to those skilled in the art, particularly in view of the knowledge of the crystal structure provided herein. These modifications can be used in a number of combinations to produce the variants.
  • the present invention is useful for crystallizing and then solving the structure of the range of variants of DPPIV.
  • Variants of DPPIV may be insertional variants in which one or more amino acid residues are introduced into a predetermined site in the DPPIV sequence.
  • insertional variants can be fusions of heterologous proteins or polypeptides to the amino or carboxyl terminus of the subunits.
  • Variants of DPPIV also may be substitutional variants in which at least one residue has been removed and a different residue inserted in its place.
  • Non-natural amino acids i.e. amino acids not normally found in native proteins
  • isosteric analogs amino acid or otherwise
  • suitable substitutions are well known in the art, such as Glu→Asp, Ser→Cys, Cys→Ser, and His→Ala, for example.
  • variants are deletional variants, which are characterized by the removal of one or more amino acid residues from the DPPIV sequence.
  • Exemplary modifications include the modification of lysinyl and amino terminal residues by reaction with succinic or other carboxylic acid anhydrides. Modification with these agents has the effect of reversing the charge of the lysinyl residues.
  • Other suitable reagents for modifying amino-containing residues include imidoesters such as methyl picolinimidate; pyridoxal phosphate; pyridoxal chloroborohydride; trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and transaminase-catalyzed reaction with glyoxylate, and N-hydroxysuccinimide esters of polyethylene glycol or other bulky substitutions.
  • Carboxyl side groups may be selectively modified by reaction with carbodiimides or they may be converted to asparaginyl and glutaminyl residues by reaction with ammonium ions. Conversely, asparaginyl and glutaminyl residues may be deamidated to the corresponding aspartyl or glutamyl residues, respectively, under mildly acidic conditions. Either form of these residues falls within the scope of this invention.
  • modifications of the nucleic sequence encoding DPPIV may be accomplished by a variety of well-known techniques, such as site-directed mutagenesis (see, Gillman and Smith, Gene 8:81-97 (1979) and Roberts, S. et al., Nature 328:731-734 (1987)). When modifications are made, these modifications may optionally be evaluated for their effect on a variety of different properties including, for example, solubility, crystallizability and a modification to the protein's structure and activity.
  • the variant and/or fragment of wild-type DPPIV is functional in the sense that the resulting protein is capable of associating with at least one same chemical entity that is also capable of selectively associating with a protein comprising the wild-type DPPIV (e.g., residues 39-766 of SEQ. ID No. 1) since this common associative ability evidences that at least a portion of the native structure has been conserved.
  • That chemical entity may optionally be glucagon-like peptide 1 (GLP-1), glucagon-like peptide 2 (GLP-2), glucose-dependent insulinotropic polypeptide (GIP), growth hormone releasing factor, SDF-1α, β-Casomorphin, TNF-α, Peptide YY or Substance P.
  • Amino acid substitutions, deletions and additions that do not significantly interfere with the three-dimensional structure of DPPIV will depend, in part, on the region where the substitution, addition or deletion occurs in the crystal structure. These modifications to the protein can now be made far more intelligently with the crystal structure information provided herein. In highly variable regions of the molecule, non-conservative substitutions as well as conservative substitutions may be tolerated without significantly disrupting the three-dimensional structure of the molecule. In highly conserved regions, or regions containing significant secondary structure, conservative amino acid substitutions are preferred.
  • amino acid substitutions are well known in the art, and include substitutions made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and/or the amphipathic nature of the amino acid residues involved.
  • negatively charged amino acids include aspartic acid and glutamic acid
  • positively charged amino acids include lysine and arginine
  • amino acids with uncharged polar head groups having similar hydrophilicity values include the following: leucine, isoleucine, valine; glycine, alanine; asparagine, glutamine; serine, threonine; phenylalanine, tyrosine.
  • Other conservative amino acid substitutions are well known in the art.
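  • The substitution groups listed above can be captured in a small lookup. The following is a minimal sketch only: the group memberships are copied from the text, while the function name `is_conservative` and the use of one-letter residue codes are illustrative assumptions, and other conservative substitutions known in the art are not covered.

```python
# Groups copied from the text above, written with standard one-letter codes.
CONSERVATIVE_GROUPS = [
    {"D", "E"},       # negatively charged: aspartic acid, glutamic acid
    {"K", "R"},       # positively charged: lysine, arginine
    {"L", "I", "V"},  # leucine, isoleucine, valine
    {"G", "A"},       # glycine, alanine
    {"N", "Q"},       # asparagine, glutamine
    {"S", "T"},       # serine, threonine
    {"F", "Y"},       # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if both residues share a group (hypothetical helper).

    An unchanged residue is treated as trivially conservative.
    """
    a, b = original.upper(), replacement.upper()
    if a == b:
        return True
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # aspartic acid -> glutamic acid
    print(is_conservative("L", "K"))  # leucine -> lysine
```

  • A table of this kind only flags candidate substitutions; whether a given change is tolerated still depends on where the residue sits in the crystal structure.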
  • the protein may be produced in whole or in part by chemical synthesis.
  • the selection of amino acids available for substitution or addition is not limited to the genetically encoded amino acids. Indeed, mutants may optionally contain non-genetically encoded amino acids. Conservative amino acid substitutions for many of the commonly known non-genetically encoded amino acids are well known in the art. Conservative substitutions for other amino acids can be determined based on their physical properties as compared to the properties of the genetically encoded amino acids.
  • the gene encoding DPPIV can be isolated from RNA, cDNA or cDNA libraries.
  • Construction of expression vectors and recombinant proteins from the DNA sequence encoding DPPIV may be performed by various methods well known in the art. For example, these techniques may be performed according to Sambrook et al., Molecular Cloning-A Laboratory Manual, Cold Spring Harbor, N.Y. (1989), and Kriegler, M., Gene Transfer and Expression, A Laboratory Manual, Stockton Press, New York (1990).
  • Example 1 provides one such expression system.
  • DPPIV may optionally be affinity labeled during cloning, preferably with an N-terminal six-histidine tag, in order to facilitate purification.
  • With the use of an affinity label, it is possible to perform a one-step purification process on a purification column that has a unique affinity for the label.
  • the affinity label may be optionally removed after purification.
  • One aspect of the present invention relates to methods for forming crystals comprising DPPIV as well as crystals comprising DPPIV.
  • a method for forming crystals comprising DPPIV comprising forming a crystallization volume comprising DPPIV, one or more precipitants, optionally a buffer, optionally a monovalent and/or divalent salt and optionally an organic solvent; and storing the crystallization volume under conditions suitable for crystal formation.
  • a method for forming crystals comprising DPPIV comprising forming a crystallization volume comprising DPPIV in solution comprising the components shown in Table 5; and storing the crystallization volume under conditions suitable for crystal formation.
  • Table 5:
    • Precipitant: 5-50% w/v of precipitant, wherein the precipitant comprises one or more members of the group consisting of PEG MME having a molecular weight range between 300-10000, and PEG having a molecular weight range between 100-10000
    • pH: pH 5-9. Buffers that may be used include, but are not limited to, tris, bicine, cacodylate, acetate, citrate, MES and combinations thereof.
    • Additives: optionally 0.05 to 0.8 M additives wherein the additives comprise sarcosine, or 0.5 to 25% additives wherein the additives comprise xylitol
    • Protein Concentration: 1 mg/ml-50 mg/ml
    • Temperature: 1° C.-25° C.
  • a method for forming crystals comprising DPPIV comprising forming a crystallization volume comprising DPPIV; introducing crystals comprising DPPIV as nucleation sites, and storing the crystallization volume under conditions suitable for crystal formation.
  • Crystallization experiments may optionally be performed in volumes commonly used in the art, for example typically 15, 10, 5, 2 microliters or less. It is noted that the crystallization volume optionally has a volume of less than 1 microliter, optionally 500, 250, 150, 100, 50 or less nanoliters.
  • crystallization may be performed by any crystallization method including, but not limited to batch, dialysis and vapor diffusion (e.g., sitting drop and hanging drop) methods. Micro and/or macro seeding of crystals may also be performed to facilitate crystallization.
  • the methods for forming crystals comprising DPPIV and the crystals comprising DPPIV according to the invention are not intended to be limited to the wild type, full length DPPIV shown in SEQ. ID No. 1 and to fragments comprising residues 39-766 of SEQ. ID No. 1. Rather, it should be recognized that the invention may be extended to various other fragments and variants of wild-type DPPIV as described above.
  • the methods for forming crystals comprising DPPIV and the crystals comprising DPPIV according to the invention may be such that DPPIV is optionally complexed with one or more ligands and one or more copies of the same ligand.
  • the ligand used to form the complex may be any ligand capable of binding to DPPIV.
  • the ligand is a natural substrate.
  • the ligand is an inhibitor.
  • DPPIV crystals have a crystal lattice in the P2₁ space group.
  • DPPIV crystals also preferably are capable of diffracting X-rays for determination of atomic coordinates to a resolution of 4 Å, 3 Å, 2.5 Å, 2 Å or better.
  • Crystals comprising DPPIV may be formed by a variety of different methods known in the art. For example, crystallizations may be performed by batch, dialysis, and vapor diffusion (sitting drop and hanging drop) methods. A detailed description of basic protein crystallization setups may be found in McRee, D. and David, P., Practical Protein Crystallography, 2nd Ed. (1999), Academic Press Inc. Further descriptions regarding performing crystallization experiments are provided in Stevens, et al. (2000) Curr. Opin. Struct. Biol.: 10(5):558-63, and U.S. Pat. Nos. 6,296,673, 5,419,278, and 5,096,676.
  • crystals comprising DPPIV are formed by mixing substantially pure DPPIV with an aqueous buffer containing a precipitant at a concentration just below a concentration necessary to precipitate the protein.
  • a precipitant for crystallizing DPPIV is polyethylene glycol (PEG), which combines some of the characteristics of the salts and other organic precipitants (see, for example, Ward et al., J. Mol. Biol. 98:161, 1975, and McPherson, J. Biol. Chem. 251:6300, 1976).
  • a protein/precipitant solution is formed and then allowed to equilibrate in a closed container with a larger aqueous reservoir having a precipitant concentration for producing crystals. The protein/precipitant solution continues to equilibrate until crystals grow.
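  • The equilibration described above can be illustrated with an idealized calculation. This is a sketch under stated assumptions, not a description of the conditions actually used: only water is assumed to move between drop and reservoir, the reservoir is assumed large enough that its concentration does not change, and nothing is assumed to precipitate. The function name and the numbers in the example (a 2 microliter drop at 10% w/v precipitant and 5 mg/ml protein over a 20% w/v reservoir) are hypothetical.

```python
def equilibrated_drop(drop_volume_ul, drop_precipitant_pct,
                      protein_mg_ml, reservoir_precipitant_pct):
    """Idealized end state of a vapor-diffusion drop (hypothetical helper).

    Water leaves the drop until its precipitant concentration matches the
    reservoir, so every non-volatile solute is concentrated by the same factor.
    Returns (final drop volume in microliters, final protein conc. in mg/ml).
    """
    if not 0 < drop_precipitant_pct <= reservoir_precipitant_pct:
        raise ValueError("drop must start at or below the reservoir concentration")
    factor = reservoir_precipitant_pct / drop_precipitant_pct
    return drop_volume_ul / factor, protein_mg_ml * factor


if __name__ == "__main__":
    volume, protein = equilibrated_drop(2.0, 10.0, 5.0, 20.0)
    print(f"final drop volume: {volume} ul, protein: {protein} mg/ml")
```

  • In practice crystals may nucleate before this end state is reached, which is why the drop is described as equilibrating until crystals grow.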
  • FIG. 2 illustrates crystals of the DPPIV complex formed using the crystallization conditions provided in Table 5.
  • crystallization conditions provided in Table 5 and Example 2 can be varied and still yield protein crystals comprising DPPIV.
  • variations on the crystallization conditions described herein can be readily determined by taking the conditions provided in Table 5 and performing fine screens around those conditions by varying the type and concentration of the components in order to determine additional suitable conditions for crystallizing DPPIV, variants of DPPIV, and ligand complexes thereof.
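  • A fine screen of the kind described above can be enumerated programmatically. The sketch below assumes a screen that varies only precipitant concentration and pH on a regular grid; the function name, step sizes and starting point are illustrative assumptions, and only the range limits (5-50% w/v, pH 5-9) come from Table 5.

```python
from itertools import product


def fine_screen(center_peg_pct, center_ph, peg_step=1.0, ph_step=0.2, n_steps=2):
    """Enumerate (precipitant % w/v, pH) pairs around a starting condition.

    Hypothetical helper: values outside the Table 5 ranges are dropped.
    """
    offsets = range(-n_steps, n_steps + 1)
    pegs = [round(center_peg_pct + i * peg_step, 2) for i in offsets]
    phs = [round(center_ph + i * ph_step, 2) for i in offsets]
    return [
        (peg, ph)
        for peg, ph in product(pegs, phs)
        if 5.0 <= peg <= 50.0 and 5.0 <= ph <= 9.0
    ]


if __name__ == "__main__":
    for peg, ph in fine_screen(20.0, 7.0):
        print(f"{peg:5.1f} % w/v precipitant, pH {ph:.1f}")
```

  • The same pattern extends to further axes such as additive concentration, protein concentration or temperature by adding more lists to the product.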
  • Crystals comprising DPPIV have a wide range of uses. For example, now that crystals comprising DPPIV have been produced, it is noted that crystallizations may be performed using such crystals as a nucleation site within a concentrated protein solution. According to this variation, a concentrated protein solution is prepared and crystalline material (microcrystals) is used to ‘seed’ the protein solution to assist nucleation for crystal growth. If the concentrations of the protein and any precipitants are optimal for crystal growth, the seed crystal will provide a nucleation site around which a larger crystal forms. Given the ability to form crystals comprising DPPIV according to the present invention, the crystals so formed can be used by this crystallization technique to initiate crystal growth of other DPPIV-comprising crystals, including DPPIV complexed to other ligands.
  • crystals may also be used to perform X-ray or neutron diffraction analysis in order to determine the three-dimensional structure of DPPIV and, in particular, to assist in the identification of its active site.
  • Knowledge of the binding site region allows rational design and construction of ligands including inhibitors. Crystallization and structural determination of DPPIV mutants having altered bioactivity allows the evaluation of whether such changes are caused by general structure deformation or by side chain alterations at the substitution site.
  • Crystals comprising DPPIV may be obtained as described above in Section 3. As described herein, these crystals may then be used to perform x-ray data collection and for structure determination.
  • crystals of a DPPIV complex were obtained where DPPIV has the sequence of residues shown in SEQ. ID No. 3. These particular crystals were used to determine the three dimensional structure of DPPIV. However, it is noted that other crystals comprising DPPIV including different DPPIV variants, fragments, and complexes thereof may also be used.
  • the structure of DPPIV was solved by a combination of heavy-atom derivatives and Seleno-Methionine (Se-Met) phasing in conjunction with non-crystallographic averaging.
  • Heavy atom derivatives were obtained by soaking native DPPIV crystals in heavy atom solutions made using the crystallization solution. The concentration of heavy atom derivative and time of soaking varied between 0.5 mM to 10 mM and 1 to 15 days, respectively. An extensive array of heavy atom derivatives were individually soaked into DPPIV crystals and analyzed.
  • Data were collected and integrated to 2.3 Å with MOSFLM (or HKL2000) and scaled with SCALA (or Scalepack) (CCP4 Study Weekend, Eds. Sawyer, L., Isaacs, N. & Bailey, S. 56-62, SERC Daresbury Laboratory, England, 1993).
  • Positions for the Platinum atoms of the PIP derivative were located using the direct method search program, SHELXD.
  • the heavy atom parameters of the PIP derivative were refined using the program SHARP.
  • the refined parameters were used to compute phases and locate the Mercury atom positions of the EMTS derivative.
  • the heavy atom parameters of both derivatives were refined using SHARP.
  • Initial solvent flattened maps using the phases from both heavy atom derivatives were of reasonable quality and helped identify parts of the secondary structure elements of DPPIV. Due to low incorporation of Selenium (Se) in the baculovirus expressed protein, solving Se positions using MAD data was not successful.
  • each unit cell comprised four DPPIV molecules. Structure coordinates were determined for this complex and the resultant set of structural coordinates from the refinement are presented in FIG. 3 .
  • the sequence of the structure coordinates presented in FIG. 3 differs in some regards from the sequence shown in SEQ. ID No. 1. Structure coordinates are not reported for some residues because the electron density obtained was insufficient to identify the position of these residues. For FIG. 3, structure coordinates for residues 151-153 (chains C and D) and 97-99 (chain D) are not reported.
  • binding pocket refers to a region of the protein that, as a result of its shape, favorably associates with a ligand.
  • modifications in the crystal structure due to mutations, additions, substitutions, and/or deletions of amino acids or other changes in any of the components that make up the crystal could also account for variations in structure coordinates. If such variations are within an acceptable standard error as compared to the original coordinates, the resulting three-dimensional shape should be considered to be the same.
  • a ligand that bound to the active site binding pocket of DPPIV would also be expected to bind to another binding pocket whose structure coordinates defined a shape that fell within the acceptable error.
  • Various computational analyses may be used to determine whether structure coordinates for a protein or a portion thereof are similar to the structure coordinates of DPPIV provided herein, or a portion thereof. Such analyses may be carried out in well known software applications, such as the Molecular Similarity application of QUANTA (Molecular Simulations Inc., San Diego, Calif.) version 4.1, and as described in the accompanying User's Guide. For the purpose of this invention, a rigid fitting method shall be used to compare protein structures.
  • any set of structure coordinates for a protein from any source having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 shall be considered identical. It is noted that the root mean square deviation is intended to be limited to only those alpha-carbon atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3.
  • mutants and variants of DPPIV as well as other S9 proteases are likely to have similar structures despite having different sequences.
  • the binding pockets of these related proteins are likely to have similar contours. Accordingly, it should be recognized that the structure coordinates and binding pocket models provided herein have utility for these other related proteins.
  • the invention relates to data, computer readable media comprising data, and uses of the data where the data comprises all or a portion of the structure coordinates shown in FIG. 3 or structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3.
  • the root mean square deviation is intended to be limited to only those alpha-carbon atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3 .
  • the present invention is also directed to any data, computer readable media comprising data, and uses of the data where the data defines a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates of FIG. 3 that are present in residues shown in SEQ. ID No. 1.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms.
  • the root mean square deviation of alpha-carbon atoms, main-chain atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
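  • The comparison described above, a root mean square deviation restricted to alpha-carbon atoms of residues common to both structures, can be sketched as follows. This is an illustration rather than the QUANTA procedure: the two coordinate sets are assumed to have been superimposed already by a rigid fitting step that is not shown, and the function names, the dictionary layout and the example coordinates are hypothetical.

```python
import math


def ca_rmsd(reference, candidate):
    """RMSD over alpha-carbons present in both structures (hypothetical helper).

    Each argument maps (chain, residue_number) -> (x, y, z) in Angstroms.
    Residues found in only one structure are ignored, as the text requires.
    """
    common = sorted(set(reference) & set(candidate))
    if not common:
        raise ValueError("no residues in common")
    total = sum(math.dist(reference[key], candidate[key]) ** 2 for key in common)
    return math.sqrt(total / len(common))


def within_cutoff(reference, candidate, cutoff=3.0):
    """Apply an RMSD threshold such as the 3 Angstrom value given in the text."""
    return ca_rmsd(reference, candidate) < cutoff


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    ref = {("A", 1): (0.0, 0.0, 0.0), ("A", 2): (3.8, 0.0, 0.0)}
    other = {("A", 1): (0.0, 0.0, 1.0), ("A", 2): (3.8, 0.0, 1.0), ("A", 3): (7.6, 0.0, 1.0)}
    print(ca_rmsd(ref, other), within_cutoff(ref, other))
```

  • Passing main-chain or non-hydrogen atom positions keyed by atom instead of by residue gives the optional variants mentioned above, and the `cutoff` argument accommodates the tighter thresholds.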
  • the present invention is also directed to a three-dimensional crystal structure of DPPIV.
  • This crystal structure may be used to identify binding sites, to provide mutants having desirable binding properties, and ultimately, to design, characterize, or identify ligands that interact with DPPIV as well as other S9 proteases.
  • the three-dimensional crystal structure of DPPIV may be generated, as is known in the art, from the structure coordinates shown in FIG. 3 and similar such coordinates.
  • Chain A includes amino acid residues 40-766 and four amino acid residues have covalently linked sugar molecules ( FIG. 3 ).
  • Chain B includes amino acid residues 39-766 and also includes 4 histidine residues of the N-terminal polyhistidine tag (residues 35-38). Five amino acid residues of chain B have covalently linked sugar molecules ( FIG. 3 ).
  • Chain C includes amino acid residues 40-766 and five of the amino acid residues are covalently linked to sugar molecules.
  • Chain D includes amino acid residues 39-766 with five sugar-linked amino acid residues. In addition, chains C and D have no density for amino acid residues 139, 140, and 141 and hence coordinates for these residues are not included in FIG. 3. Similarly, coordinates for amino acid residues 85, 86, and 87 of chain D are not included in FIG. 3.
  • the coordinate set additionally includes 928 solvent molecules modeled as water.
  • FIG. 4A illustrates a ribbon diagram overview of the structure of DPPIV, highlighting secondary structural elements of the protein.
  • DPPIV is a cylindrical shaped molecule with an approximate height of 70 Å and a diameter of 60 Å (FIG. 4A).
  • the catalytic triad of DPPIV (Ser 630, Asp 708 and His 740) is illustrated in the center of FIG. 4A by a “ball and stick” representation. This triad of amino acids is located in the peptidase domain or catalytic domain of DPPIV.
  • the catalytic domain is covalently linked to the β-propeller domain (FIG. 4A).
  • FIG. 5 illustrates the binding site of DPPIV based on the determined crystal structure corresponding to the coordinates shown in FIG. 3 .
  • binding site or “binding pocket”, as the terms are used herein, refers to a region of a protein that, as a result of its shape, favorably associates with a ligand or substrate.
  • DPPIV-like binding pocket refers to a portion of a molecule or molecular complex whose shape is sufficiently similar to the DPPIV binding pockets as to bind common ligands. This commonality of shape may be quantitatively defined based on a comparison to a reference point, that reference point being the structure coordinates provided herein.
  • the commonality of shape may be quantitatively defined based on a root mean square deviation (rmsd) from the structure coordinates of the backbone atoms of the amino acids that make up the binding pockets in DPPIV (as set forth in FIG. 3 ).
  • active site binding pockets or “active site” of DPPIV refers to the area on the surface of DPPIV where the substrate binds.
  • FIG. 5 illustrates the inhibitor-binding site of DPPIV based on the determined crystal structure (coordinates shown in FIG. 3 ).
  • the active site, containing the catalytic triad (Ser 630, Asp 708 and His 740), is located in a large cavity (FIG. 5) at the interface of the catalytic and the β-propeller domains.
  • Ser 630 is located on a sharp turn that connects an α-helix to a β-strand.
  • the positioning of this active site Serine residue is referred to as a nucleophile elbow and is characteristic of an α/β type hydrolase (D. J. Ollis et al., The α/β hydrolase fold; (1992) Protein Eng. 5, 197-211).
  • the second oxygen atom of the side chain carboxylate of Asp 708 forms two hydrogen bonded interactions with the main-chain amides of residues Asn 710 and Val 711.
  • the hydrogen bonding interactions of the catalytic triad are similar to those observed for prolyl oligopeptidase.
  • the residues that form the DPPIV active site pocket can be predicted with a high degree of probability.
  • the binding pocket appears to be formed by a pocket of hydrophobic residues (Phe 357, Tyr 631, Tyr 662, Tyr 666, Tyr 547 and Val 711).
  • a large number of polar residues are also present in this hydrophobic environment (Arg 125, Glu 205, Glu 206 and Asp 663).
  • DPPIV amino acids shown in Table 1 are encompassed within a 4-Angstrom radius around the DPPIV active site and therefore are likely close enough to interact with an active site inhibitor of DPPIV.
  • Applicants have also determined that the amino acids shown in Table 2 (above) are encompassed within a 7-Angstrom radius around the DPPIV active site.
  • the amino acids shown in Table 3 are encompassed within a 10-Angstrom radius around the DPPIV active site. Due to their proximity to the active site, the amino acids in the 4, 7, and/or 10 Angstroms sets are preferably conserved in variants of DPPIV.
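  • Residue sets of the kind listed in Tables 1, 2 and 3 can be derived from coordinates by a distance cutoff. The sketch below is one plausible way to do this and is not stated to be the procedure Applicants used: a residue is included when any of its atoms lies within the radius of any reference point in the active site (for example, atoms of a bound inhibitor). The function name, the tuple layout and the toy coordinates are assumptions.

```python
import math


def residues_within(atoms, site_points, radius):
    """Residues with at least one atom within `radius` Angstroms of the site.

    atoms: iterable of (chain, residue_number, residue_name, (x, y, z))
    site_points: iterable of (x, y, z) reference positions in the active site
    Returns a sorted list of (chain, residue_number, residue_name).
    """
    site_points = list(site_points)
    hits = set()
    for chain, resseq, resname, xyz in atoms:
        if any(math.dist(xyz, point) <= radius for point in site_points):
            hits.add((chain, resseq, resname))
    return sorted(hits)


if __name__ == "__main__":
    # Toy coordinates on a line, for illustration only (not from FIG. 3).
    toy_atoms = [
        ("X", 1, "ALA", (1.0, 0.0, 0.0)),
        ("X", 2, "ALA", (6.0, 0.0, 0.0)),
        ("X", 3, "ALA", (9.5, 0.0, 0.0)),
        ("X", 4, "ALA", (20.0, 0.0, 0.0)),
    ]
    site = [(0.0, 0.0, 0.0)]
    for radius in (4.0, 7.0, 10.0):
        print(radius, residues_within(toy_atoms, site, radius))
```

  • By construction the 4 Angstrom set is contained in the 7 Angstrom set, which is contained in the 10 Angstrom set.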
  • Applicants are able to know the contour of a DPPIV binding pocket based on the relative positioning of the 4, 7, and/or 10 Angstroms sets of amino acids.
  • Applicants are able to know the contour of a dimerization interface (AB and CD dimers) based on the relative positions of the α-carbon residues in Table 4.
  • any set of structure coordinates for a protein from any source having a root mean square deviation of non-hydrogen atoms of less than 3 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of FIG. 3 for the 4, 7, and/or 10 Angstroms sets of amino acids and/or those amino acids of the dimerization interface shall be considered identical.
  • the root mean square deviation is intended to be limited to only those non-hydrogen atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3 since the sequence of the protein may be varied somewhat.
  • the invention relates to data, computer readable media comprising data, and uses of the data where the data comprises the structure coordinates shown in FIG. 3 or structure coordinates having a root mean square deviation of non-hydrogen atoms of less than 3 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of FIG. 3 for the 4, 7, and/or 10 Angstroms sets of amino acids and/or the residues listed in Table 4.
  • root mean square deviation is intended to be limited to only those non-hydrogen atoms of amino acid residues that are common to both the protein fragment represented in one or more of the tables and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3 .
  • the present invention is also directed to any data, computer readable media comprising data, and uses of the data where the data defines a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on non-hydrogen atoms in the structure coordinates of FIG. 3 that are present in residues shown in Tables 1, 2, 3 and/or 4.
  • the root mean square deviation of non-hydrogen atoms is less than 1.5 Å, 1 Å, 0.5 Å, or less.
  • the present invention is also directed to machine-readable data storage media having data storage material encoded with machine-readable data that comprises structure coordinates for DPPIV.
  • the present invention is also directed to a machine readable data storage media having data storage material encoded with machine readable data, which, when read by an appropriate machine, can display a three dimensional representation of a structure of DPPIV.
  • All or a portion of the DPPIV coordinate data shown in FIG. 3 when used in conjunction with a computer programmed with software to translate those coordinates into the three-dimensional structure of DPPIV may be used for a variety of purposes, especially for purposes relating to drug discovery.
  • Software for generating three-dimensional graphical representations is known and commercially available.
  • the ready use of the coordinate data requires that it be stored in a computer-readable format.
  • data capable of being displayed as the three-dimensional structure of DPPIV and/or portions thereof and/or their structurally similar variants may be stored in a machine-readable storage medium, which is capable of displaying a graphical three-dimensional representation of the structure.
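  • As an illustration of a computer-readable coordinate format, the sketch below reads and writes PDB-style fixed-column ATOM records. That the coordinates of FIG. 3 follow this layout is an assumption made for the example; the function names are hypothetical, and the writer is a simplified demo helper that fills only the fields the reader uses.

```python
def format_atom(serial, name, resname, chain, resseq, x, y, z):
    """Build a simplified PDB-style ATOM record (demo helper, fixed columns)."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atom(line):
    """Parse one ATOM/HETATM record; return None for any other record type."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return {
        "name": line[12:16].strip(),     # atom name, columns 13-16
        "resname": line[17:20].strip(),  # residue name, columns 18-20
        "chain": line[21],               # chain identifier, column 22
        "resseq": int(line[22:26]),      # residue number, columns 23-26
        "xyz": (
            float(line[30:38]),          # x, columns 31-38
            float(line[38:46]),          # y, columns 39-46
            float(line[46:54]),          # z, columns 47-54
        ),
    }


if __name__ == "__main__":
    # Made-up coordinates for a residue named in the text, for illustration only.
    record = format_atom(1, "CA", "SER", "A", 630, 1.5, -2.25, 3.0)
    print(record)
    print(parse_atom(record))
```

  • Records parsed this way can be passed to display software or to further calculations on the structure.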
  • a computer for producing a three-dimensional representation of at least a DPPIV-like binding pocket, the computer comprising: machine readable data storage medium comprising a data storage material encoded with machine-readable data, the machine readable data comprising structure coordinates that have a root mean square deviation of less than 3 Angstroms when compared to structure coordinates appearing in FIG. 3, the comparison being based on alpha-carbon atoms of amino acid residues present in both the set of structure coordinates shown in FIG.
  • a working memory for storing instructions for processing the machine-readable data
  • a central-processing unit coupled to the working memory and to the machine-readable data storage medium, for processing the machine-readable data into the three-dimensional representation
  • an output hardware coupled to the central processing unit, for receiving the three-dimensional representation.
  • Another embodiment of this invention provides a machine-readable data storage medium, comprising a data storage material encoded with machine readable data which, when used by a machine programmed with instructions for using said data, displays a graphical three-dimensional representation comprising DPPIV or a portion or variant thereof.
  • the machine readable data comprises data for representing a protein based on structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 for all of the amino acids in FIG. 3.
  • the machine readable data comprises data for representing a protein based on structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 for the amino acids listed in Tables 1, 2, 3 and/or 4.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms.
  • the root mean square deviation of alpha-carbon atoms, main-chain atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • the machine-readable data storage medium comprises a data storage material encoded with a first set of machine readable data which comprises the Fourier transform of the structure coordinates set forth in FIG. 3 , and which, when using a machine programmed with instructions for using said data, can be combined with a second set of machine readable data comprising the X-ray diffraction pattern of another molecule or molecular complex to determine at least a portion of the structure coordinates corresponding to the second set of machine readable data.
  • the Fourier transform of the structure coordinates set forth in FIG. 3 may be used to determine at least a portion of the structure coordinates of other DPPIV-like enzymes, and isoforms of DPPIV.
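  • The Fourier transform of a set of coordinates referred to above is, in its simplest form, the structure factor sum F(hkl) = Σ f_j exp[2πi(h x_j + k y_j + l z_j)] over atoms with fractional coordinates (x_j, y_j, z_j). The sketch below evaluates that sum only; it assumes a constant scattering factor per atom and omits temperature factors, symmetry expansion and the comparison with observed diffraction data that molecular replacement requires. The function name and example atoms are hypothetical.

```python
import cmath


def structure_factor(hkl, atoms):
    """Evaluate F(hkl) for atoms given in fractional coordinates.

    atoms: iterable of (f, (x, y, z)), where f is a constant scattering
    factor (a simplification) and x, y, z are fractions of the cell edges.
    """
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


if __name__ == "__main__":
    # Two identical toy atoms half a cell apart along the first axis.
    toy = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.0, 0.0))]
    for hkl in [(1, 0, 0), (2, 0, 0)]:
        print(hkl, abs(structure_factor(hkl, toy)))
```

  • In the toy example the two contributions cancel for the (1, 0, 0) reflection and add for (2, 0, 0), which is the kind of interference that makes calculated structure factors sensitive to atomic positions.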
  • a computer system in combination with the machine-readable data storage medium provided herein.
  • the computer system comprises a working memory for storing instructions for processing the machine-readable data; a processing unit coupled to the working memory and to the machine-readable data storage medium, for processing the machine-readable data into the three-dimensional representation; and an output hardware coupled to the processing unit, for receiving the three-dimensional representation.
  • FIG. 6 illustrates an example of a computer system that may be used in combination with storage media according to the present invention.
  • the computer system 10 includes a computer 11 comprising a central processing unit (“CPU”) 20, a working memory 22 which may be, e.g., RAM (random-access memory) or “core” memory, mass storage memory 24 (such as one or more disk drives or CD-ROM drives), one or more cathode-ray tube (“CRT”) display terminals 26, one or more keyboards 28, one or more input lines 30, and one or more output lines 40, all of which are interconnected by a conventional bi-directional system bus 50.
  • Input hardware 36 coupled to computer 11 by input lines 30, may be implemented in a variety of ways.
  • machine-readable data of this invention may be inputted via the use of a modem or modems 32 connected by a telephone line or dedicated data line 34.
  • the input hardware 36 may comprise CD-ROM drives or disk drives 24.
  • keyboard 28 may also be used as an input device.
  • output hardware 46 may include CRT display terminal 26 for displaying a graphical representation of a binding pocket of this invention using a program such as QUANTA as described herein.
  • output hardware might also include a printer 42, so that hard copy output may be produced, or a disk drive 24, to store system output for later use.
  • CPU 20 coordinates the use of the various input and output devices 36, 46, coordinates data accesses from mass storage 24 and accesses to and from working memory 22, and determines the sequence of data processing steps.
  • a number of programs may be used to process the machine-readable data of this invention. Such programs are discussed in reference to using the three dimensional structure of DPPIV described herein.
  • the storage medium encoded with machine-readable data can be any conventional data storage device known in the art.
  • the storage medium can be a conventional floppy diskette or hard disk.
  • the storage medium can also be an optically-readable data storage medium, such as a CD-ROM or a DVD-ROM, or a rewritable medium such as a magneto-optical disk that is optically readable and magneto-optically writable.
  • the three-dimensional crystal structure of the present invention may be used to identify DPPIV binding sites, to serve as a molecular replacement model to solve the structure of unknown crystallized proteins, to design mutants having desirable binding properties, and ultimately, to design, characterize, or identify entities capable of interacting with DPPIV and other S9 proteases, as well as for other uses that would be recognized by one of ordinary skill in the art.
  • entities may be chemical entities or proteins.
  • chemical entity refers to chemical compounds, complexes of at least two chemical compounds, and fragments of such compounds.
  • the DPPIV structure coordinates provided herein are useful for screening and identifying drugs that inhibit DPPIV and other proteases.
  • the structure encoded by the data may be computationally evaluated for its ability to associate with putative substrates or ligands. Such compounds that associate with DPPIV may inhibit DPPIV, and are potential drug candidates.
  • the structure encoded by the data may be displayed in a graphical three-dimensional representation on a computer screen. This allows visual inspection of the structure, as well as visual inspection of the structure's association with the compounds.
  • a method for evaluating the potential of an entity to associate with DPPIV or a fragment or variant thereof by using all or a portion of the structure coordinates provided in FIG. 3 .
  • a method is also provided for evaluating the potential of an entity to associate with DPPIV or a fragment or variant thereof by using structure coordinates similar to all or a portion of the structure coordinates provided in FIG. 3 .
  • the structure coordinates used may have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms.
  • the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • the method may optionally comprise the steps of: creating a computer model of all or a portion of a protein structure (e.g., a binding pocket) using structure coordinates according to the present invention; performing a fitting operation between the entity and the computer model; and analyzing the results of the fitting operation to quantify the association between the entity and the model.
  • the portion of the protein structure used optionally comprises all of the amino acids listed in Tables 1, 2, 3 and/or 4 that are present in the structure coordinates being used.
  • the computer model may not necessarily directly use the structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
  • the structure coordinates provided herein can also be utilized in a method for identifying a ligand (e.g., entities capable of associating with a protein) of a protein comprising a DPPIV-like binding pocket.
  • One embodiment of the method comprises: using all or a portion of the structure coordinates provided herein to generate a three-dimensional structure of a DPPIV-like binding pocket; employing the three-dimensional structure to design or select a potential ligand; synthesizing the potential ligand; and contacting the synthesized potential ligand with a protein comprising a DPPIV-like binding pocket to determine the ability of the potential ligand to interact with the protein.
  • the structure coordinates used may have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 .
  • the portion of the protein structure used optionally comprises all of the amino acids listed in Tables 1, 2, 3 and/or 4 that are present.
  • the three-dimensional structure of a DPPIV-like binding pocket need not be generated directly from structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain or non-hydrogen atoms.
  • the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • a method for evaluating the ability of an entity, such as a compound or a protein to associate with a DPPIV-like binding pocket comprising: constructing a computer model of a binding pocket defined by structure coordinates that have a root mean square deviation of less than 3.0 Å when compared to structure coordinates appearing in FIG.
  • the comparison being based on alpha-carbon atoms present in both sets of structure coordinates, the comparison also being limited to residues of DPPIV appearing in Tables 1, 2, 3 and/or 4 that are present; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the said binding pocket.
  • the computer model of a binding pocket used in this embodiment need not be generated directly from structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
  • the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms.
  • the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
  • the method may further include synthesizing the entity; and contacting a protein having a DPPIV-like binding pocket with the synthesized entity.
  • the present invention for the first time permits the use of molecular design techniques to identify, select or design potential inhibitors of DPPIV, based on the structure of a DPPIV-like binding pocket.
  • Such a predictive model is valuable in light of the high costs associated with the preparation and testing of the many diverse compounds that may possibly bind to the DPPIV protein.
  • a potential DPPIV inhibitor may now be evaluated for its ability to bind a DPPIV-like binding pocket prior to its actual synthesis and testing. If a proposed entity is predicted to have insufficient interaction or association with the binding pocket, preparation and testing of the entity can be obviated. However, if the computer modeling indicates a strong interaction, the entity may then be obtained and tested for its ability to bind.
  • a potential inhibitor of a DPPIV-like binding pocket may be computationally evaluated using a series of steps in which chemical entities or fragments are screened and selected for their ability to associate with the DPPIV-like binding pockets.
  • One skilled in the art may use one of several methods to screen entities (whether chemical or protein) for their ability to associate with a DPPIV-like binding pocket. This process may begin by visual inspection of, for example, a DPPIV-like binding pocket on a computer screen based on the DPPIV structure coordinates in FIG. 3 or other coordinates which define a similar shape generated from the machine-readable storage medium. Selected fragments or chemical entities may then be positioned in a variety of orientations, or docked, within that binding pocket as defined above. Docking may be accomplished using software such as Quanta and Sybyl, followed by energy minimization and molecular dynamics with standard molecular mechanics force fields, such as CHARMM and AMBER.
  • Specialized computer programs may also assist in the process of selecting entities. These include: GRID (P. J. Goodford, “A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules”, J. Med. Chem., 28, pp. 849-857 (1985)). GRID is available from Oxford University, Oxford, UK; MCSS (A. Miranker et al., “Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method.” Proteins: Structure, Function and Genetics, 11, pp. 29-34 (1991)). MCSS is available from Molecular Simulations, San Diego, Calif.; AUTODOCK (D. S.
  • Assembly may be preceded by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of DPPIV. This may then be followed by manual model building using software such as Quanta or Sybyl [Tripos Associates, St. Louis, Mo].
  • CAVEAT (P. A. Bartlett et al, “CAVEAT: A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules”, in “Molecular Recognition in Chemical and Biological Problems”, Special Pub., Royal Chem. Soc., 78, pp. 182-196 (1989); G. Lauri and P. A. Bartlett, “CAVEAT: a Program to Facilitate the Design of Organic Molecules”, J. Comput. Aided Mol. Des., 8, pp. 51-66 (1994)).
  • CAVEAT is available from the University of California, Berkeley, Calif.; 3D Database systems such as ISIS (MDL Information Systems, San Leandro, Calif.). This area is reviewed in Y. C. Martin, “3D Database Searching in Drug Design”, J. Med. Chem., 35, pp. 2145-2154 (1992); HOOK (M. B. Eisen et al, “HOOK: A Program for Finding Novel Molecular Architectures that Satisfy the Chemical and Steric Requirements of a Macromolecule Binding Site”, Proteins: Struct., Funct., Genet., 19, pp. 199-221 (1994)). HOOK is available from Molecular Simulations, San Diego, Calif.
  • inhibitory or other DPPIV binding compounds may be designed as a whole or “de novo” using either an empty binding site or optionally including some portion(s) of a known inhibitor(s).
  • de novo ligand design methods including: LUDI (H.-J. Bohm, “The Computer Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors”, J. Comp. Aid. Molec. Design, 6, pp. 61-78 (1992)).
  • LUDI is available from Molecular Simulations Incorporated, San Diego, Calif.; LEGEND (Y.
  • LEGEND is available from Molecular Simulations Incorporated, San Diego, Calif.; LEAPFROG (available from Tripos Associates, St. Louis, Mo.); & SPROUT (V. Gillet et al, “SPROUT: A Program for Structure Generation”, J. Comput. Aided Mol. Design, 7, pp. 127-153 (1993)). SPROUT is available from the University of Leeds, UK.
  • an effective DPPIV binding pocket inhibitor preferably demonstrates a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding).
  • the most efficient DPPIV binding pocket inhibitors should preferably be designed with deformation energy of binding of not greater than about 10 kcal/mole, more preferably, not greater than 7 kcal/mole.
  • DPPIV binding pocket inhibitors may interact with the binding pocket in more than one of multiple conformations that are similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free entity and the average energy of the conformations observed when the inhibitor binds to the protein.
  • An entity designed or selected as binding to a DPPIV binding pocket may be further computationally optimized so that in its bound state it would preferably lack repulsive electrostatic interaction with the target enzyme and with the surrounding water molecules.
  • Such non-complementary electrostatic interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions.
  • Another approach provided by this invention is the computational screening of small molecule databases for chemical entities or compounds that can bind in whole, or in part, to a DPPIV binding pocket.
  • the quality of fit of such entities to the binding site may be judged either by shape complementarities or by estimated interaction energy [E. C. Meng et al., J. Comp. Chem., 13, 505-524 (1992)].
  • the invention provides compounds that associate with a DPPIV-like binding pocket produced or identified by various methods set forth above.
  • the structure coordinates set forth in FIG. 3 can also be used to aid in obtaining structural information about another crystallized molecule or molecular complex. This may be achieved by any of a number of well-known techniques, including molecular replacement.
  • a method for utilizing molecular replacement to obtain structural information about a protein whose structure is unknown comprising the steps of: generating an X-ray diffraction pattern of a crystal of the protein whose structure is unknown; generating a three-dimensional electron density map of the protein whose structure is unknown from the X-ray diffraction pattern by using at least a portion of the structure coordinates set forth in FIG. 3 as a molecular replacement model.
  • all or part of the structure coordinates of the DPPIV provided by this invention can be used to determine the structure of another crystallized molecule or molecular complex more quickly and efficiently than attempting an ab initio structure determination.
  • One particular use includes use with other S9 proteases.
  • Molecular replacement provides an accurate estimation of the phases for an unknown structure. Phases are a factor in equations used to solve crystal structures that cannot be determined directly. Obtaining accurate values for the phases, by methods other than molecular replacement, is a time-consuming process that involves iterative cycles of approximations and refinements and greatly hinders the solution of crystal structures. However, when the crystal structure of a protein containing at least a homologous portion has been solved, the phases from the known structure provide a satisfactory estimate of the phases for the unknown structure.
  • this method involves generating a preliminary model of a molecule or molecular complex whose structure coordinates are unknown, by orienting and positioning the relevant portion of DPPIV according to FIG. 3 within the unit cell of the crystal of the unknown molecule or molecular complex so as best to account for the observed X-ray diffraction pattern of the crystal of the molecule or molecular complex whose structure is unknown. Phases can then be calculated from this model and combined with the observed X-ray diffraction pattern amplitudes to generate an electron density map of the structure whose coordinates are unknown. This, in turn, can be subjected to any well-known model building and structure refinement techniques to provide a final, accurate structure of the unknown crystallized molecule or molecular complex [E.
  • the method of molecular replacement is utilized to obtain structural information about the present invention and any other DPPIV-like molecule.
  • the structure coordinates of DPPIV, as provided by this invention, are particularly useful in solving the structure of other isoforms of DPPIV or DPPIV complexes.
  • the structure coordinates of DPPIV as provided by this invention are useful in solving the structure of DPPIV variants that have amino acid substitutions, additions and/or deletions (referred to collectively as “DPPIV mutants”, as compared to naturally occurring DPPIV).
  • DPPIV mutants may optionally be crystallized in co-complex with a ligand, such as an inhibitor, substrate analogue or a suicide substrate.
  • the crystal structures of a series of such complexes may then be solved by molecular replacement and compared with that of DPPIV. Potential sites for modification within the various binding sites of the enzyme may thus be identified. This information provides an additional tool for determining the most efficient binding interactions such as, for example, increased hydrophobic interactions, between DPPIV and a ligand.
  • the ligand may be the protein's natural ligand or may be a potential agonist or antagonist of a protein.
  • All of the complexes referred to above may be studied using well-known X-ray diffraction techniques and may be refined versus 1.5-3 Å resolution X-ray data to an R value of about 0.22 or less using computer software, such as X-PLOR [Yale University, ©1992, distributed by Molecular Simulations, Inc.; see, e.g., Blundell & Johnson, supra; Meth. Enzymol., Vol. 114 & 115, H. W. Wyckoff et al., eds., Academic Press (1985)]. This information may thus be used to optimize known DPPIV inhibitors, and more importantly, to design new DPPIV inhibitors.
  • the structure coordinates described above may also be used to derive the dihedral angles, phi and psi, that define the conformation of the amino acids in the protein backbone.
  • phi n angle refers to the rotation around the bond between the alpha-carbon and the nitrogen
  • psi n angle refers to the rotation around the bond between the carbonyl carbon and the alpha-carbon.
  • the subscript “n” identifies the amino acid whose conformation is being described [for a general reference, see Blundell and Johnson, Protein Crystallography, Academic Press, London, 1976].
  • Crystals, crystallization conditions and the diffraction pattern of DPPIV that can be generated from the crystals also have a range of uses.
  • One particular use relates to screening entities that are not known ligands of DPPIV for their ability to bind to DPPIV.
  • Using the crystallization conditions, crystals and diffraction patterns of DPPIV provided according to the present invention, it is possible to take a crystal of DPPIV; expose the crystal to one or more entities that may be a ligand of DPPIV; and determine whether a ligand/DPPIV complex is formed.
  • the crystals of DPPIV may be exposed to potential ligands by various methods, including but not limited to, soaking a crystal in a solution of one or more potential ligands or co-crystallizing DPPIV in the presence of one or more potential ligands.
  • Given the structure coordinates provided herein, once a ligand complex is formed, the structure coordinates can be used as a model in molecular replacement in order to determine the structure of the ligand complex.
  • structural information from the ligand/DPPIV complex(es) may be used to design new ligands that bind tighter, bind more specifically, have better biological activity or have better safety profile than known ligands.
  • a method for identifying a ligand that binds to DPPIV comprising: (a) attempting to crystallize a protein that comprises a sequence with 70, 80, 90, 95% or greater identity with SEQ. ID No. 1 in the presence of one or more entities; (b) if crystals of the protein are obtained in step (a), obtaining an X-ray diffraction pattern of the protein crystal; and (c) determining whether a ligand/protein complex was formed by comparing an X-ray diffraction pattern of a crystal of the protein formed in the absence of the one or more entities to the crystal formed in the presence of the one or more entities.
  • a method for identifying a ligand that binds to DPPIV comprising: soaking a crystal of a protein that comprises a sequence with 70, 80, 90, 95% or greater identity with SEQ. ID No. 1 with one or more entities; determining whether a ligand/protein complex was formed by comparing an X-ray diffraction pattern of a crystal of the protein that has not been soaked with the one or more entities to the crystal that has been soaked with the one or more entities.
  • the method may further comprise converting the diffraction patterns into electron density maps using phases of the protein crystal and comparing the electron density maps.
  • Libraries of “shape-diverse” compounds may optionally be used to allow direct identification of the ligand-receptor complex even when the ligand is exposed as part of a mixture. According to this variation, the need for time-consuming de-convolution of a hit from the mixture is avoided. More specifically, the calculated electron density function reveals the binding event, identifies the bound compound and provides a detailed 3-D structure of the ligand-receptor complex. Once a hit is found, one may optionally also screen a number of analogs or derivatives of the hit for tighter binding or better biological activity by traditional screening methods. The hit and information about the structure of the target may also be used to develop analogs or derivatives with tighter binding or better biological activity.
  • the ligand-DPPIV complex may optionally be exposed to additional iterations of potential ligands so that two or more hits can be linked together to make a more potent ligand. Screening for potential ligands by co-crystallization and/or soaking is further described in U.S. Pat. No. 6,297,021, which is incorporated herein by reference.
  • the portion of the gene encoding residues 39-766 (from SEQ. ID No. 1), which corresponds to the extracellular portion of human DPPIV, was isolated by PCR from spleen cDNA and cloned into the BamH I and Hind III sites of a modified pFastBacHTb vector.
  • This vector encodes a baculovirus glycoprotein gp67 signal peptide sequence followed by a 6x-histidine tag sequence followed by the DPPIV sequence.
  • Expression in this vector allowed for the production of secreted recombinant DPPIV with part of a gp67 signal sequence and a 6x-histidine tag, the sequence of which is shown in FIG. 1 (part of a gp67 signal sequence and 6x-histidine tag sequence underlined) (SEQ. ID No. 3).
  • Recombinant baculovirus genomic DNAs incorporating the DPPIV cDNA sequences were generated by transposition using the Bac-to-Bac system (Gibco-BRL). Infectious extracellular virus particles were obtained by transfection of a 2 ml adherent culture of Spodoptera frugiperda Sf9 insect cells with the recombinant viral genomic DNA. Growth in ESF 921 protein free medium (Expression Systems) was for 3 days at 27° C. The resulting passage 1 viral supernatant was used to obtain passage 2 high titer viral stock (HTS) by infection of a 2 ml adherent culture of Spodoptera frugiperda Sf9 insect cells grown under similar conditions.
  • Passage 2 HTS was used in turn to infect a 100 ml suspension culture of Spodoptera frugiperda Sf9 insect cells in order to generate passage 3 HTS.
  • the production of recombinant DPPIV proteins was carried out by using the passage 3 HTS at a multiplicity of infection (MOI) of approximately 5 to infect 0.5-5 liter cultures of Trichoplusia ni Hi5 insect cells (InVitrogen) at a cell density of (1.5-8) × 10^6 cells/ml (grown in ESF 921 protein free medium).
  • Infected cell cultures were grown in both shake flasks and in Wave Bioreactors (Wave Biotech) for 48 hours at 27° C. prior to harvest.
  • infected cultures of Spodoptera frugiperda Sf9 insect cells were used to produce recombinant DPPIV under similar conditions. Following harvest, the cell cultures were centrifuged to pellet whole cells.
  • the secreted glycosylated recombinant protein was isolated from the cell culture medium by diafiltration using cross-flow ultrafiltration, followed by passage over a nickel chelate resin and optionally polished by size exclusion chromatography.
  • the resin is poured into 1 cm ID glass columns (Omnifit) and washed with 50 column volumes of 50 mM Potassium Phosphate pH 7.9, 0.4 M NaCl, 20 mM imidazole, 0.25 mM TCEP. After a wash with 5 column volumes of 50 mM Tris pH 7.9, 0.4 M NaCl, 0.25 mM TCEP, the product is eluted with 4 column volumes of 50 mM Tris pH 7.9, 0.4 M NaCl, 200 mM imidazole, 0.25 mM TCEP.
  • polyhistidine tags may optionally be removed; however in this instance, the polyhistidine tag was left as a fusion. It is also noted that for the purification of non-secreted proteins, leupeptin is added to all the buffers used during the immobilized metal affinity purification (IMAC) process at 1 mg/L and that for simplicity reasons the same is sometimes done when purifying DPPIV.
  • DPPIV was purified over a BioSep Sec S3000 column (200 mm × 21.2 mm, Phenomenex) at 8 ml/minute to remove oligomeric forms.
  • the column was set up in a Summit HPLC system (Dionex) managed by Chromeleon software (Dionex) and equilibrated with 25 mM Tris pH 7.6, 150 to 250 mM NaCl (optionally with 0.25 mM TCEP and 1 mM EDTA).
  • centrifugal ultrafiltration (10 kDa NMWCO) was used for the buffer exchange to the required formulation buffer.
  • the process was carried out at 2-10° C. and DPPIV was stored at the same temperature. For long-term storage, it was kept at −80° C.
  • the purity of DPPIV was estimated by SDS-PAGE and IEF to be at least 95%. Glycosylation was confirmed by a molecular mass shift, determined by SDS-PAGE, following treatment with endo-beta-N acetyl glucosaminidases F (Endo-F1 enzyme), and by carbohydrate analysis.
  • DPPIV with seleno-L-methionine substitution was prepared as follows: two 5 liter Wave Bioreactor cultures of Trichoplusia ni Hi5 insect cells in ESF 921 Protein Free medium were infected and grown for 16 hours at 27° C. At that time the cells were pelleted by centrifugation at 480 g and 20° C. for 15 minutes. The supernatant was discarded and the cells resuspended in 2 × 5 liters of ESF 921 Protein Free Methionine-Free medium (Expression Systems). The resuspended cells were placed in two new 5 liter Wave Bioreactors and growth continued for 4 h at 27° C.
  • Seleno-L-methionine (prepared as a 25 mg/ml solution in water and sterile-filtered) was then added to each culture to a final concentration of 50 mg/l. Cell growth was continued for a further 48 h prior to harvest. Purification of the protein was as described above and included the size exclusion chromatography step. Mass spectrogram peptide analysis was used to estimate the seleno-L-methionine substitution of methionine residues at approximately 34%.
  • This example describes the crystallization of DPPIV (SEQ ID NO:3). It is noted that the precise crystallization conditions used may be further varied, for example by performing a fine screen based on these crystallization conditions.
  • Crystals typically appeared after 3-5 days and grew to a maximum size within 7-10 days. Single crystals were transferred, briefly, into a cryoprotecting solution containing the reservoir solution supplemented with 25% v/v ethylene glycol. Crystals were then flash frozen by immersion in liquid nitrogen and then stored under liquid nitrogen. A crystal of apo DPPIV (SEQ ID NO:3) produced as described is illustrated in FIG. 2 .
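
The phi and psi backbone dihedral angles mentioned in the list above can be derived from structure coordinates with a few lines of vector arithmetic. The following is a minimal sketch, not part of the original disclosure: the function name `dihedral` and the toy coordinates in the usage note are hypothetical, and it assumes each atom position is given as an (x, y, z) tuple in Å.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for rotation about the p1-p2 bond.

    For residue n (illustrative atom ordering):
      phi_n = dihedral(C of residue n-1, N, CA, C)   # rotation about N-CA
      psi_n = dihedral(N, CA, C, N of residue n+1)   # rotation about CA-C
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal to the plane p0, p1, p2
    n2 = _cross(b1, b2)          # normal to the plane p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))
```

For example, with the central bond along the z axis, `dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))` describes a cis arrangement and evaluates to 0 degrees.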

Abstract

Provided are crystals relating to DPPIV and its various uses.

Description

RELATED APPLICATION
This application claims the benefit of U.S. Provisional Application No. 60/409,206, filed Sep. 9, 2002, which is incorporated herein by reference.
FIELD OF THE INVENTION
The present invention relates to a member of the S9 family of human proteases known as Dipeptidyl Peptidases (DPP) and more specifically to a particular dipeptidyl peptidase known as dipeptidyl peptidase IV (DPPIV). Provided is DPPIV in crystalline form, methods of forming crystals comprising DPPIV, methods of using crystals comprising DPPIV, a crystal structure of DPPIV, and methods of using the crystal structure.
BACKGROUND OF THE INVENTION
A general approach to designing inhibitors that are selective for a given protein is to determine how a putative inhibitor interacts with a three dimensional structure of that protein. For this reason it is useful to obtain the protein in crystalline form and perform X-ray diffraction techniques to determine the protein's three-dimensional structure coordinates. Various methods for preparing crystalline proteins are known in the art.
Once protein crystals are produced, crystallographic data can be generated using the crystals to provide useful structural information that assists in the design of small molecules that bind to the active site of the protein and inhibit the protein's activity in vivo. If the protein is crystallized as a complex with a ligand, one can determine both the shape of the protein's binding pocket when bound to the ligand, as well as the amino acid residues that are capable of close contact with the ligand. By knowing the shape and amino acid residues comprised in the binding pocket, one may design new ligands that will interact favorably with the protein. With such structural information, available computational methods may be used to predict how strong the ligand binding interaction will be. Such methods aid in the design of inhibitors that bind strongly, as well as selectively to the protein.
SUMMARY OF THE INVENTION
The present invention is directed to crystals comprising DPPIV and particularly crystals comprising DPPIV that have sufficient size and quality to obtain useful information about the structural properties of DPPIV and molecules or complexes that may associate with DPPIV.
In one embodiment, a composition is provided that comprises a protein in crystalline form wherein the protein has 65%, 70%, 80%, 90%, 95% or greater identity with residues 13-740 of SEQ ID NO:3.
In one variation, the protein has activity characteristic of DPPIV. For example, the protein may optionally be inhibited by inhibitors of wild type DPPIV.
The protein may also diffract X-rays for a determination of structure coordinates to a resolution of 4 Å, 3 Å, 2.5 Å, 2 Å or less.
In one variation, the protein crystal has a crystal lattice in a P2₁ space group. The protein crystal may also have a crystal lattice having unit cell dimensions, +/−5%, of a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°.
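As an illustration of how the stated unit cell could be compared with an experimentally indexed cell, the sketch below computes the volume of a monoclinic cell (V = a·b·c·sin β) and checks whether the cell edges fall within the ±5% window. The function names are hypothetical, and the sketch assumes the tolerance applies to the edge lengths only; the text does not say how it applies to the β angle.

```python
import math

# Unit cell parameters quoted in the text (lengths in angstroms, angles in degrees).
REFERENCE_CELL = {"a": 121.53, "b": 124.11, "c": 144.42,
                  "alpha": 90.0, "beta": 114.6, "gamma": 90.0}

def monoclinic_volume(a, b, c, beta_deg):
    """Volume of a monoclinic cell (alpha = gamma = 90 degrees): V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def lengths_within_tolerance(cell, reference=REFERENCE_CELL, tol=0.05):
    """True if a, b and c each lie within +/- tol (as a fraction) of the reference.

    Assumption: only the edge lengths are checked; angles are ignored.
    """
    return all(abs(cell[k] - reference[k]) <= tol * reference[k]
               for k in ("a", "b", "c"))
```

A cell indexed with a=125.0 Å and unchanged b and c would pass this check, whereas a=130.0 Å would not.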
The present invention is also directed to crystallizing DPPIV. The present invention is also directed to the conditions useful for crystallizing DPPIV. It should be recognized that a wide variety of crystallization methods can be used in combination with the crystallization conditions to form crystals comprising DPPIV including, but not limited to, vapor diffusion, batch, and dialysis.
In one embodiment, a method is provided for forming crystals of a protein comprising: forming a crystallization volume comprising: a protein that has at least 65%, 70%, 80%, 90%, 95% identity with residues 13-740 of SEQ ID NO:3 in a concentration between 1 mg/ml and 50 mg/ml; 5-50% w/v of precipitant wherein the precipitant comprises one or more members of the group consisting of PEG MME having a molecular weight range between 300-10000, and PEG having a molecular weight range between 100-10000; optionally 0.05 to 0.8M additives wherein the additives comprise sarcosine or 0.5 to 25% additives wherein the additives comprise xylitol; and wherein the crystallization volume has a pH between pH 5 and pH 9; and storing the crystallization volume under conditions suitable for crystal formation. The method optionally further comprises using 0.05-0.2M buffers selected from the group consisting of tris-HCl, bicine and combinations thereof. The method also optionally further includes performing the crystallization at a temperature between 1° C.-25° C.
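When planning a fine screen, it can help to check a candidate condition against the numeric ranges given above. The sketch below is one hypothetical way to do so: the `Condition` class, its field names, and the example values in the usage note are illustrative only, and the ranges are assumed to be inclusive at both ends.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Condition:
    protein_mg_per_ml: float
    precipitant_pct_wv: float
    ph: float
    temperature_c: Optional[float] = None  # optional limitation in the text
    buffer_molar: Optional[float] = None   # optional limitation in the text

# (low, high) ranges copied from the paragraph above; treated as inclusive (assumption).
RANGES = {
    "protein_mg_per_ml": (1.0, 50.0),
    "precipitant_pct_wv": (5.0, 50.0),
    "ph": (5.0, 9.0),
    "temperature_c": (1.0, 25.0),
    "buffer_molar": (0.05, 0.2),
}

def out_of_range(cond: Condition) -> List[str]:
    """Return the names of the parameters that fall outside the stated ranges."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = getattr(cond, name)
        if value is None:  # optional parameter not specified
            continue
        if not (low <= value <= high):
            problems.append(name)
    return problems
```

For instance, `out_of_range(Condition(10.0, 20.0, 8.0))` returns an empty list, while a condition with 0.5 mg/ml protein at 30° C. would be reported as out of range for both parameters.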
The method may optionally further comprise forming a protein crystal that has a crystal lattice in a P2₁ space group. The method also optionally further comprises forming a protein crystal that has a crystal lattice having unit cell dimensions, +/−5%, of a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°. The invention also relates to protein crystals formed by these methods.
The present invention is also directed to structure coordinates for DPPIV as well as structure coordinates that are comparatively similar to these structure coordinates. It is noted that these comparatively similar structure coordinates may encompass proteins with similar sequences and/or structures, such as other members of the S9 protease family. For example, machine-readable data storage media is provided having data storage material encoded with machine-readable data that comprises structure coordinates that are comparatively similar to the structure coordinates of DPPIV. The present invention is also directed to a machine readable data storage medium having data storage material encoded with machine readable data, which, when read by an appropriate machine, can display a three dimensional representation of all or a portion of a structure of DPPIV or a model that is comparatively similar to the structure of all or a portion of DPPIV.
In one embodiment, machine readable data storage medium is provided having data storage material encoded with machine readable data, the machine readable data comprising: structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues 13-740 of SEQ ID NO:3.
In another embodiment, machine readable data storage medium is provided having data storage material encoded with machine readable data, the machine readable data comprising: structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in Tables 1, 2, 3 and/or 4.
It is noted in regard to these embodiments that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present. Also, the root mean square deviation of alpha-carbon atoms, main-chain atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
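By way of illustration only, the root mean square deviation criterion described above may be sketched as follows. The sketch assumes that the two sets of alpha-carbon coordinates have already been superimposed and that each is supplied as a mapping from residue number to an (x, y, z) position in Å. The function name `ca_rmsd` and the coordinate values are hypothetical and are not taken from FIG. 3.

```python
import math

def ca_rmsd(model, reference, first=13, last=740):
    """RMSD over alpha-carbon atoms that are present in both structures.

    model, reference: dict mapping residue number -> (x, y, z) of the
    alpha-carbon atom, in angstroms. Assumption: both structures have
    ALREADY been superimposed; no fitting is performed here.
    Only residues numbered first..last that occur in both are compared.
    """
    shared = [r for r in model if r in reference and first <= r <= last]
    if not shared:
        raise ValueError("no shared alpha-carbon atoms in the residue range")
    total = 0.0
    for r in shared:
        total += sum((a - b) ** 2 for a, b in zip(model[r], reference[r]))
    return math.sqrt(total / len(shared))

# Hypothetical coordinates, for illustration only.
reference = {13: (0.0, 0.0, 0.0), 14: (3.8, 0.0, 0.0), 15: (7.6, 0.0, 0.0)}
model = {13: (0.0, 0.0, 1.0), 14: (3.8, 0.0, 1.0), 15: (7.6, 0.0, 1.0),
         900: (50.0, 50.0, 50.0)}  # residue 900 is ignored: not shared

rmsd = ca_rmsd(model, reference)
meets_3_angstrom_limit = rmsd < 3.0
```

The same function may be applied to main-chain atoms, non-hydrogen atoms or all atoms by changing what the mappings contain, and the 3 Å limit may be replaced by any of the tighter limits listed above.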
The present invention is also directed to a three-dimensional structure of all or a portion of DPPIV. This three-dimensional structure may be used to identify binding sites, to provide mutants having desirable binding properties, and ultimately, to design, characterize, or identify ligands capable of interacting with DPPIV. Ligands that interact with DPPIV may be any type of atom, compound, protein or chemical group that binds to or otherwise associates with the protein. Examples of types of ligands include natural substrates for DPPIV, inhibitors of DPPIV, and heavy atoms.
In one embodiment, a method is provided for displaying a three dimensional representation of a structure of a protein comprising: taking machine readable data comprising structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; computing a three dimensional representation of a structure based on the structure coordinates; and displaying the three dimensional representation.
In another embodiment, a method is provided for displaying a three dimensional representation of a structure of a protein comprising: displaying a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates of FIG. 3 that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3.
It is again noted in regard to these embodiments that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present. Also, the root mean square deviation of alpha-carbon atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
The present invention is also directed to a method for solving a three-dimensional crystal structure of a target protein using the structure of DPPIV.
In one embodiment, a computational method is provided comprising taking machine readable data comprising structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; computing phases based on the structural coordinates; computing an electron density map based on the computed phases; and determining a three-dimensional crystal structure based on the computed electron density map.
In another embodiment, a computational method is provided comprising: taking an X-ray diffraction pattern of a crystal of the target protein; and computing a three-dimensional electron density map from the X-ray diffraction pattern by molecular replacement, wherein structure coordinates used as a molecular replacement model comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3. This method may optionally further comprise determining a three-dimensional crystal structure based upon the computed three-dimensional electron density map.
It is again noted in regard to these embodiments that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present. Also, the root mean square deviation of alpha-carbon atoms, main-chain atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
The present invention is also directed to using a crystal structure of DPPIV, in particular the structure coordinates of DPPIV and the surface contour defined by them, in methods for screening, designing, or optimizing molecules or other chemical entities that interact with and preferably inhibit DPPIV.
One skilled in the art will appreciate the numerous uses of the inventions described herein, particularly in the areas of drug design, screening and optimization of drug candidates, as well as in determining additional unknown crystal structures. For example, a further aspect of the present invention relates to using a three-dimensional crystal structure of all or a portion of DPPIV and/or its structure coordinates to evaluate the ability of entities to associate with DPPIV. The entities may be any entity that may function as a ligand and thus may be any type of atom, compound, protein (such as antibodies) or chemical group that can bind to or otherwise associate with a protein.
In one embodiment, a method is provided for evaluating a potential of an entity to associate with a protein comprising: creating a computer model of a protein structure using structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; performing a fitting operation between the entity and the computer model; and analyzing results of the fitting operation to quantify an association between the entity and the model.
In another embodiment, a method is provided for evaluating a potential of an entity to associate with a protein comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; evaluating a potential of an entity to associate with the surface contour by performing a fitting operation between the entity and the surface contour; and analyzing results of the fitting operation to quantify an association between the entity and the computer model.
In another embodiment, a method is provided for identifying entities that can associate with a protein comprising: generating a three-dimensional structure of a protein using structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; employing the three-dimensional structure to design or select an entity that can associate with the protein; and contacting the entity with a protein having at least 65% identity with residues 13-740 of SEQ ID NO:3.
In another embodiment, a method is provided for identifying entities that can associate with a protein comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; employing the computer model to design or select an entity that can associate with the protein; and contacting the entity with a protein having at least 65%, 70%, 80%, 90% or 95% identity with residues 13-740 of SEQ ID NO:3.
In another embodiment, a method is provided for evaluating the ability of an entity to associate with a protein, the method comprising: constructing a computer model defined by structure coordinates that comprise structure coordinates that have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on alpha-carbon atom positions of corresponding atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on those alpha-carbon atoms of amino acid residues in the structure coordinates that are also present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the binding pocket.
In another embodiment, a method is provided for evaluating the ability of an entity to associate with a protein, the method comprising: computing a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates that are present in residues shown in Tables 1, 2, 3 and/or 4 or residues 13-740 of SEQ ID NO:3; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the binding pocket.
It is again noted in regard to these embodiments that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms, non-hydrogen atoms or a comparison of all atoms where the same type of amino acid residue is present. Also, the root mean square deviation of alpha-carbon atoms, non-hydrogen atoms or all atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
Also in regard to each of these embodiments, the protein may optionally have activity characteristic of DPPIV. For example, the protein may optionally be inhibited by inhibitors of wild type DPPIV.
In another embodiment, a method is provided for identifying an entity that associates with a protein comprising: taking structure coordinates from diffraction data obtained from a crystal of a protein that has at least 65%, 70%, 80%, 90%, 95% or more identity with residues 13-740 of SEQ ID NO:3 and performing rational drug design using a three dimensional structure that is based on the obtained structure coordinates. The protein crystals may optionally have a crystal lattice having unit cell dimensions, +/−5%, of a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°. The method may optionally further comprise selecting one or more entities based on the rational drug design and contacting the selected entities with the protein. The method may also optionally further comprise measuring an activity of the protein when contacted with the one or more entities. The method also may optionally further comprise comparing activity of the protein in the presence of and in the absence of the one or more entities; and selecting entities where activity of the protein changes depending on whether a particular entity is present. The method also may optionally further comprise contacting cells expressing the protein with the one or more entities and detecting a change in a phenotype of the cells when a particular entity is present.
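The unit cell criterion recited above may be illustrated by the following minimal sketch. It assumes that the +/−5% tolerance applies to the cell lengths a, b and c only, which is one reading of the text, and it uses the standard volume formula for a cell with α=γ=90°. The function names and the candidate cell are hypothetical.

```python
import math

# Unit cell reported in the text: lengths in angstroms, angles in degrees.
REFERENCE_CELL = {"a": 121.53, "b": 124.11, "c": 144.42,
                  "alpha": 90.0, "beta": 114.6, "gamma": 90.0}

def lengths_within_tolerance(cell, reference=REFERENCE_CELL, tol=0.05):
    """True if a, b and c each lie within +/- tol (as a fraction) of the reference.

    Assumption: the tolerance is applied to the lengths only, not the angles.
    """
    return all(abs(cell[k] - reference[k]) <= tol * reference[k]
               for k in ("a", "b", "c"))

def monoclinic_volume(cell):
    """Cell volume in cubic angstroms when alpha = gamma = 90 degrees.

    V = a * b * c * sin(beta)
    """
    return (cell["a"] * cell["b"] * cell["c"]
            * math.sin(math.radians(cell["beta"])))

# Hypothetical candidate cell, for illustration only.
candidate = dict(REFERENCE_CELL, a=125.0)
candidate_ok = lengths_within_tolerance(candidate)
reference_volume = monoclinic_volume(REFERENCE_CELL)
```

A stricter comparison could also bound the β angle; the text does not say whether the tolerance extends to the angles, so that choice is left open here.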
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 illustrates SEQ. ID Nos. 1, 2 and 3 referred to in this application.
FIG. 2 illustrates a crystal of DPPIV complex.
FIG. 3 lists a set of atomic structure coordinates for DPPIV (SEQ ID NO:3) as derived by X-ray crystallography from a crystal that comprises the protein. The following abbreviations are used in FIG. 3: “X, Y, Z” crystallographically define the atomic position of the element measured; “B” is a thermal factor that measures movement of the atom around its atomic center; “Occ” is an occupancy factor that refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates (a value of “1” indicates that each atom has the same conformation, i.e., the same position, in all molecules of the crystal). “NAG” stands for N-Acetylglucosamine.
FIG. 4A illustrates a ribbon diagram overview of the structure of DPPIV, highlighting secondary structural elements of the protein.
FIG. 4B illustrates another ribbon diagram overview of the structure of DPPIV, highlighting additional secondary structural elements of the protein.
FIG. 5 illustrates the binding site of DPPIV based on the determined crystal structure for the molecule in the asymmetric unit corresponding to the coordinates shown in FIG. 3.
FIG. 6 illustrates a system that may be used to carry out instructions for displaying a crystal structure of DPPIV encoded on a storage medium.
DETAILED DESCRIPTION OF THE INVENTION
The present invention relates to a member of the S9 family of human proteases known as dipeptidyl peptidase IV (DPPIV) (SEQ ID NO:1). More specifically, the present invention relates to DPPIV in crystalline form, methods of forming crystals comprising DPPIV, methods of using crystals comprising DPPIV, structure coordinates and a crystal structure of DPPIV, and methods of using the structure coordinates and crystal structure.
In describing protein structure and function herein, reference is made to amino acids comprising the protein. The amino acids may also be referred to by their conventional abbreviations: A=Ala=Alanine; T=Thr=Threonine; V=Val=Valine; C=Cys=Cysteine; L=Leu=Leucine; Y=Tyr=Tyrosine; I=Ile=Isoleucine; N=Asn=Asparagine; P=Pro=Proline; Q=Gln=Glutamine; F=Phe=Phenylalanine; D=Asp=Aspartic Acid; W=Trp=Tryptophan; E=Glu=Glutamic Acid; M=Met=Methionine; K=Lys=Lysine; G=Gly=Glycine; R=Arg=Arginine; S=Ser=Serine; and H=His=Histidine.
1. DPPIV
Dipeptidyl Peptidase IV (DPPIV) (SEQ ID NO:1) is a serine protease of Clan SC family S9. DPPIV is a 240 kDa homodimeric, multi-functional type-II membrane bound glycoprotein, widely distributed in all mammalian tissues, but highly expressed in kidney, liver and endothelium. DPPIV is also known as DPP4, CD26, adenosine deaminase complexing protein 2 or adenosine deaminase binding protein (ADAbp). DPPIV consists of a short cytoplasmic domain of six amino acids, followed by a hydrophobic transmembrane domain (amino acids 7-28) and an extracellular sequence of 739 amino acids.
DPPIV is a highly specific aminopeptidase and releases dipeptides from the amino terminus of peptides with a Pro or Ala in the penultimate position. N-terminal degradation of the substrate peptides may result in the activation, inactivation or modulation of their activity. Besides its well-known exopeptidase activity, DPPIV also exhibits endopeptidase activity towards denatured collagen. Expression of DPPIV is tightly associated with cell adhesion and is a co-stimulant during T-cell activation and proliferation.
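The substrate specificity described above may be sketched, purely for illustration, as a rule applied to a one-letter peptide sequence written from the amino terminus. The function name `dppiv_cleave` and the test sequences are hypothetical, and the requirement that at least one residue remain after the dipeptide is released is an assumption of this sketch rather than a statement from the text.

```python
def dppiv_cleave(peptide):
    """Apply the Pro/Ala rule to a one-letter sequence, N-terminus first.

    Returns (dipeptide, remainder) when the second residue from the
    amino terminus is Pro (P) or Ala (A); otherwise returns None.
    Assumption: at least one residue must remain after cleavage.
    """
    if len(peptide) < 3 or peptide[1] not in ("P", "A"):
        return None
    return peptide[:2], peptide[2:]

# Short illustrative sequences, not taken from the specification.
cleaved = dppiv_cleave("HAEGTF")      # Ala at position 2: rule applies
not_cleaved = dppiv_cleave("HGEGTF")  # Gly at position 2: rule does not apply
```

The sketch captures only the sequence rule; it says nothing about reaction rate or about the endopeptidase activity mentioned above.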
The nature of its substrates, together with its regulated expression and non-enzymatic interactions characterize active participation of DPPIV in the immune, nerve and endocrine networks in human physiology. Among the substrates of DPPIV are glucagon-like peptide 1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP), two hormones important for glucose regulation. Degradation and concomitant inactivation of GLP-1 and GIP by DPPIV reduces insulin secretion.
It should be understood that the methods and compositions provided relating to DPPIV are not intended to be limited to the wild type, full length form of DPPIV. Instead, the present invention also relates to fragments and variants of DPPIV as described herein. Further, the present invention has applicability to other S9 proteases whose sequence and/or structure are comparatively similar to DPPIV.
In one embodiment, DPPIV comprises the wild-type form of full length DPPIV, set forth herein as SEQ. ID No. 1 (GenBank Accession Number NM001935; “Dipeptidyl peptidase IV (CD 26) gene expression in enterocyte-like colon cancer cell lines HT-29 and Caco-2. Cloning of the complete human coding sequence and changes of dipeptidyl peptidase IV mRNA levels during cell differentiation”, Darmoul, D., Lacasa, M., Baricault, L., Marguet, D., Sapin, C., Trotot, P., Barbat, A. and Trugnan, G., J. Biol. Chem. 267 (7), 4824-4833, 1992).
In another embodiment, DPPIV comprises residues 13-740 of SEQ ID NO:3 which comprises the active site domain of wild-type DPPIV that is represented in the set of structural coordinates shown in FIG. 3.
It should be recognized that the invention may be readily extended to various variants of wild-type DPPIV and variants of fragments thereof. In another embodiment, DPPIV comprises a sequence that has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher identity with SEQ. ID No. 1.
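One way in which a percent identity threshold of this kind might be checked is sketched below. The sketch assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identity over all alignment columns; other conventions for the denominator exist, and the text does not specify one. The function names and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned strings of equal length, with '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

def meets_threshold(aligned_a, aligned_b, threshold=65.0):
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold

# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
passes_65 = meets_threshold("ACDE-G", "ACDQFG")
```

Producing the alignment itself is outside this sketch; any standard pairwise alignment method could supply the two aligned strings.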
It is also noted that the above sequences of DPPIV are also intended to encompass isoforms, mutants and fusion proteins of these sequences. An example of a fusion protein is provided by SEQ. ID No. 3, which includes a 12 residue N-terminal tag (6 residues of which are histidine) that may be used to facilitate purification of the protein.
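Because SEQ. ID No. 3 carries a 12 residue N-terminal tag and residues 13-740 of SEQ ID NO:3 are described elsewhere herein as corresponding to residues 39-766 of SEQ. ID No. 1, residue numbers in the two sequences may be related by a constant offset. The following sketch makes that conversion explicit. It assumes the numbering is contiguous over the whole range, which is inferred from the two ranges rather than stated in the text, and the function name is hypothetical.

```python
TAG_LENGTH = 12            # N-terminal tag residues in SEQ ID NO:3 (from the text)
FIRST_NATIVE_RESIDUE = 39  # first SEQ ID NO:1 residue in the construct (from the text)

# Assumption: numbering is contiguous, so a single constant offset applies.
OFFSET = FIRST_NATIVE_RESIDUE - (TAG_LENGTH + 1)

def seq3_to_seq1(residue_number):
    """Convert a SEQ ID NO:3 residue number (13-740) to SEQ ID NO:1 numbering."""
    if not 13 <= residue_number <= 740:
        raise ValueError("residue lies in the tag or outside the construct")
    return residue_number + OFFSET

first = seq3_to_seq1(13)
last = seq3_to_seq1(740)
```

Under this assumption, a residue listed in the tables by its SEQ ID NO:3 number, such as SER 604, would correspond to residue 630 in the numbering of SEQ. ID No. 1.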
With the crystal structure provided herein, the positions of the amino acid residues in the structure are now known. As a result, the impact of different substitutions can be more easily predicted and understood.
For example, based on the crystal structure, applicants have determined that the DPPIV amino acids shown in Table 1 encompass a 4-Angstrom radius around the DPPIV active site and are thus likely to interact with any active site inhibitor of DPPIV. Applicants have also determined that the amino acids of Table 2 encompass a 7-Angstrom radius around the DPPIV active site. Further it has been determined that the amino acids of Table 3 encompass a 10-Angstrom radius around the DPPIV active site. It is noted that there are four different DPPIV molecules in the asymmetric unit, referred to as chains A, B, C and D. As a result, four sets of structure coordinates were obtained for each amino acid. There are two dimers formed in the asymmetric unit; one dimer is formed between molecules A and B and the other with molecules C and D. Applicants have also determined that amino acids of Table 4 encompass a 5-Angstrom radius around the DPPIV amino acids that interact at the AB and CD dimerization interfaces. The A, B, C and D sets of structural coordinates appear in FIG. 3. It is noted that the sequence and structure of the residues in the active site and dimerization interface may also be conserved and hence pertinent to other S9 proteases.
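A radius-based selection of the kind described above may be sketched as follows. The sketch assumes the active site is represented by a single reference point and that a residue is selected when any one of its atoms lies within the radius; the text does not state how the radius was measured, so both choices are assumptions. The residue labels and coordinates are hypothetical and are not taken from FIG. 3.

```python
import math

def residues_within(atoms, center, radius):
    """Return sorted residue labels having at least one atom within radius.

    atoms:  iterable of (residue_label, (x, y, z)) pairs, in angstroms
    center: (x, y, z) reference point standing in for the active site
    radius: cutoff distance in angstroms
    """
    near = set()
    for label, xyz in atoms:
        if math.dist(xyz, center) <= radius:
            near.add(label)
    return sorted(near)

# Hypothetical atoms, for illustration only.
atoms = [("R1", (1.0, 0.0, 0.0)),   # 1 angstrom from the center
         ("R2", (0.0, 5.0, 0.0)),   # 5 angstroms
         ("R3", (0.0, 0.0, 9.0)),   # 9 angstroms
         ("R4", (12.0, 0.0, 0.0))]  # 12 angstroms
center = (0.0, 0.0, 0.0)

shell_4 = residues_within(atoms, center, 4.0)
shell_7 = residues_within(atoms, center, 7.0)
shell_10 = residues_within(atoms, center, 10.0)
```

As with Tables 1, 2 and 3, each larger radius yields a set that contains the smaller one.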
One or more of the sets of amino acids set forth in the tables is preferably conserved in a variant of DPPIV. Hence, DPPIV may optionally comprise a sequence that has at least 65% identity, preferably at least 70%, 80%, 90%, 95% or higher identity with any one of the above sequences (e.g., all of SEQ ID NO:3 or residues 13-740 of SEQ ID NO:3) where at least the residues shown in tables 1, 2, 3 and/or 4 are conserved with the exception of 0, 2, 3 or 4 residues. It should be recognized that one might optionally vary some of the binding site residues in order to determine the effect such changes have on structure or activity.
TABLE 1
Amino Acids encompassed by a 4-Angstrom radius around the
DPPIV active site (SEQ ID NO:3).
ARG 99 TYR 521 TYR 640
GLU 179 SER 604 ASN 684
GLU 180 TYR 605 HIS 714
SER 183 VAL 630 ASP 682
PHE 331 TYR 636
TABLE 2
Amino Acids encompassed by a 7-Angstrom radius around the
DPPIV active site (SEQ ID NO:3).
ARG 99 TYR 521 TRP 633
HIS 100 GLY 523 TYR 636
TRP 175 PRO 524 ASP 637
GLU 178 TYR 559 TYR 640
GLU 179 TRP 603 THR 641
GLU 180 SER 604 ARG 643
VAL 181 TYR 605 TYR 644
PHE 182 GLY 606 ASN 684
SER 183 TYR 608 VAL 685
ARG 330 ALA 628 HIS 714
PHE 331 PRO 629 ASP 682
ARG 332 VAL 630
TABLE 3
Amino Acids encompassed by a 10-Angstrom radius around the
DPPIV active site (SEQ ID NO:3).
ARG 99 ILE 379 SER 631
HIS 100 VAL 520 ARG 632
TRP 175 TYR 521 TRP 633
VAL 176 ALA 522 TYR 635
TYR 177 GLY 523 TYR 636
GLU 178 PRO 524 ASP 637
GLU 179 CYS 525 SER 638
GLU 180 SER 526 VAL 639
VAL 181 TYR 559 TYR 640
PHE 182 MET 565 THR 641
SER 183 LEU 572 GLU 642
ALA 184 GLU 576 ARG 643
TYR 230 GLY 602 TYR 644
CYS 275 TRP 603 MET 645
GLN 294 SER 604 HIS 678
TRP 327 TYR 605 ASP 682
VAL 328 GLY 606 ASP 683
GLY 329 GLY 607 ASN 684
ARG 330 TYR 608 VAL 685
PHE 331 VAL 609 HIS 686
ARG 332 VAL 627 GLN 689
PRO 333 ALA 628 HIS 714
SER 334 PRO 629 GLY 715
GLU 335 VAL 630
TABLE 4
Amino Acids encompassed by a 5-Angstrom radius around the
AB and CD dimerization interfaces (SEQ ID NO:3).
Chain A      Chain B      Chain C      Chain D
SER A 213    PRO B 208    PRO C 208    LEU D 209
TYR A 215    ILE B 210    LEU C 209    ILE D 210
SER A 216    GLU B 211    ILE C 210    GLU D 211
ASP A 217    TYR B 212    GLU C 211    TYR D 212
GLU A 218    SER B 213    TYR C 212    SER D 213
LEU A 220    TYR B 215    TYR C 215    SER D 216
GLN A 221    SER B 216    SER C 216    ASP D 217
TYR A 222    ASP B 217    ASP C 217    GLU D 218
PRO A 223    GLU B 218    GLU C 218    SER D 219
LYS A 224    SER B 219    SER C 219    LEU D 220
THR A 225    TYR B 222    LEU C 220    GLN D 221
ARG A 227    THR B 225    GLN C 221    TYR D 222
TYR A 230    ARG B 227    TYR C 222    PRO D 223
LYS A 232    GLN B 688    PRO C 223    THR D 225
ALA A 233    ALA B 691    THR C 225    ARG D 227
SER A 694    GLN B 692    TYR C 230    LYS D 232
LYS A 695    LYS B 695    PRO C 231    ALA D 233
LEU A 697    LEU B 697    LYS C 232    ALA D 235
VAL A 698    VAL B 698    GLU C 634    TYR D 635
ASP A 699    ASP B 699    THR C 661    MET D 663
GLY A 701    GLY B 701    LEU C 676    HIS D 678
VAL A 702    VAL B 702    PHE C 687    GLN D 688
PHE A 704    ASP B 703    GLN C 688    SER D 690
GLN A 705    PHE B 704    SER C 690    GLN D 692
MET A 707    GLN B 705    GLN C 692    SER D 694
TRP A 708    ALA B 706    LEU C 697    ASP D 699
TYR A 709                 VAL C 698    VAL D 702
THR A 710                 ASP C 699    ASP D 703
ASP A 711                 GLY C 701    PHE D 704
                          VAL C 702    GLN D 705
                          ASP C 703    ALA D 706
                          PHE C 704    MET D 707
                          GLN C 705    TRP D 708
                          ALA C 706    TYR D 709
                          MET C 707    THR D 710
                          TRP C 708
                          TYR C 709
With the benefit of the crystal structure and guidance provided by Tables 1, 2, 3 and 4, a wide variety of DPPIV variants (e.g., insertions, deletions, substitutions, etc.) that fall within the above specified identity ranges may be designed and manufactured utilizing recombinant DNA techniques well known to those skilled in the art, particularly in view of the knowledge of the crystal structure provided herein. These modifications can be used in a number of combinations to produce the variants. The present invention is useful for crystallizing and then solving the structure of the range of variants of DPPIV.
Variants of DPPIV may be insertional variants in which one or more amino acid residues are introduced into a predetermined site in the DPPIV sequence. For instance, insertional variants can be fusions of heterologous proteins or polypeptides to the amino or carboxyl terminus of the subunits.
Variants of DPPIV also may be substitutional variants in which at least one residue has been removed and a different residue inserted in its place. Non-natural amino acids (i.e. amino acids not normally found in native proteins), as well as isosteric analogs (amino acid or otherwise) may optionally be employed in substitutional variants. Examples of suitable substitutions are well known in the art, such as Glu→Asp, Ser→Cys, Cys→Ser, and His→Ala, for example.
Another class of variants is deletional variants, which are characterized by the removal of one or more amino acid residues from the DPPIV sequence.
Other variants may be produced by chemically modifying amino acids of the native protein (e.g., diethylpyrocarbonate treatment that modifies histidine residues). Preferred are chemical modifications that are specific for certain amino acid side chains. Specificity may also be achieved by blocking other side chains with antibodies directed to the side chains to be protected. Chemical modification includes such reactions as oxidation, reduction, amidation, deamidation, or substitution of bulky groups such as polysaccharides or polyethylene glycol.
Exemplary modifications include the modification of lysinyl and amino terminal residues by reaction with succinic or other carboxylic acid anhydrides. Modification with these agents has the effect of reversing the charge of the lysinyl residues. Other suitable reagents for modifying amino-containing residues include imidoesters such as methyl picolinimidate; pyridoxal phosphate; pyridoxal; chloroborohydride; trinitrobenzenesulfonic acid; O-methylisourea; 2,4-pentanedione; and transaminase-catalyzed reaction with glyoxylate, and N-hydroxysuccinimide esters of polyethylene glycol or other bulky substitutions.
Arginyl residues may be modified by reaction with a number of reagents, including phenylglyoxal, 2,3-butanedione, 1,2-cyclohexanedione, and ninhydrin. Modification of arginine residues requires that the reaction be performed in alkaline conditions because of the high pKa of the guanidine functional group. Furthermore, these reagents may react with the epsilon-amino group of lysine as well as the guanidino group of arginine.
Tyrosyl residues may also be modified to introduce spectral labels into tyrosyl residues by reaction with aromatic diazonium compounds or tetranitromethane, forming O-acetyl tyrosyl species and 3-nitro derivatives, respectively. Tyrosyl residues may also be iodinated using 125I or 131I, to prepare labeled proteins for use in radioimmunoassays.
Carboxyl side groups (aspartyl or glutamyl) may be selectively modified by reaction with carbodiimides or they may be converted to asparaginyl and glutaminyl residues by reaction with ammonium ions. Conversely, asparaginyl and glutaminyl residues may be deamidated to the corresponding aspartyl or glutamyl residues, respectively, under mildly acidic conditions. Either form of these residues falls within the scope of this invention.
Other modifications that may be formed include the hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl or threonyl residues, methylation of the α-amino groups of lysine, arginine and histidine side chains (T. E. Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San Francisco, pp. 79-86, 1983), acetylation of the N-terminal amine and amidation of any C-terminal carboxyl group.
As can be seen, modifications of the nucleic acid sequence encoding DPPIV may be accomplished by a variety of well-known techniques, such as site-directed mutagenesis (see, Gillam and Smith, Gene 8:81-97 (1979) and Roberts, S. et al., Nature 328:731-734 (1987)). When modifications are made, these modifications may optionally be evaluated for their effect on a variety of different properties including, for example, solubility, crystallizability and a modification to the protein's structure and activity.
In one variation, the variant and/or fragment of wild-type DPPIV is functional in the sense that the resulting protein is capable of associating with at least one same chemical entity that is also capable of selectively associating with a protein comprising the wild-type DPPIV (e.g., residues 39-766 of SEQ. ID No. 1) since this common associative ability evidences that at least a portion of the native structure has been conserved. That chemical entity may optionally be glucagon-like peptide 1 (GLP-1), glucagon-like peptide 2 (GLP-2), glucose-dependent insulinotropic polypeptide (GIP), growth hormone releasing factor, SDF-1α, β-Casomorphin, TNF-α, Peptide YY or Substance P.
It is noted the activity of the native protein need not necessarily be conserved. Rather, amino acid substitutions, additions or deletions that interfere with native activity but which do not significantly alter the three-dimensional structure of the domain are specifically contemplated by the invention. Crystals comprising such variants of DPPIV, and the atomic structure coordinates obtained therefrom, can be used to identify compounds that bind to the native domain. These compounds may affect the activity of the native domain.
Amino acid substitutions, deletions and additions that do not significantly interfere with the three-dimensional structure of DPPIV will depend, in part, on the region where the substitution, addition or deletion occurs in the crystal structure. These modifications to the protein can now be made far more intelligently with the crystal structure information provided herein. In highly variable regions of the molecule, non-conservative substitutions as well as conservative substitutions may be tolerated without significantly disrupting the three-dimensional structure of the molecule. In highly conserved regions, or regions containing significant secondary structure, conservative amino acid substitutions are preferred.
Conservative amino acid substitutions are well known in the art, and include substitutions made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity and/or the amphipathic nature of the amino acid residues involved. For example, negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; amino acids with uncharged polar head groups having similar hydrophilicity values include the following: leucine, isoleucine, valine; glycine, alanine; asparagine, glutamine; serine, threonine; phenylalanine, tyrosine. Other conservative amino acid substitutions are well known in the art.
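The groupings listed above may be expressed, by way of a minimal sketch, as sets of one-letter codes, with a substitution treated as conservative when both residues fall in the same set. The groups are taken from the preceding paragraph; the function name is hypothetical, and other conservative groupings known in the art are not represented.

```python
# Groups from the text: Asp/Glu; Lys/Arg; Leu/Ile/Val; Gly/Ala;
# Asn/Gln; Ser/Thr; Phe/Tyr (one-letter codes).
CONSERVATIVE_GROUPS = [set("DE"), set("KR"), set("LIV"), set("GA"),
                       set("NQ"), set("ST"), set("FY")]

def is_conservative(original, replacement):
    """True if replacing `original` with `replacement` stays within one group.

    Replacing a residue with itself is not counted as a substitution.
    """
    if original == replacement:
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)

asp_to_glu = is_conservative("D", "E")  # same acidic group
asp_to_lys = is_conservative("D", "K")  # acidic to basic: different groups
```

A check of this kind could be used to screen proposed substitutions in highly conserved regions, where conservative substitutions are preferred.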
It should be understood that the protein may be produced in whole or in part by chemical synthesis. As a result, the selection of amino acids available for substitution or addition is not limited to the genetically encoded amino acids. Indeed, mutants may optionally contain non-genetically encoded amino acids. Conservative amino acid substitutions for many of the commonly known non-genetically encoded amino acids are well known in the art. Conservative substitutions for other amino acids can be determined based on their physical properties as compared to the properties of the genetically encoded amino acids.
In some instances, it may be particularly advantageous or convenient to substitute, delete and/or add amino acid residues in order to provide convenient cloning sites in cDNA encoding the polypeptide, to aid in purification of the polypeptide, etc. Such substitutions, deletions and/or additions which do not substantially alter the three dimensional structure of DPPIV will be apparent to those having skill in the art, particularly in view of the three dimensional structure of DPPIV provided herein.
2. Cloning, Expression and Purification of DPPIV
The gene encoding DPPIV can be isolated from RNA, cDNA or cDNA libraries. In this case, the portion of the gene encoding amino acid residues 39-766 (SEQ. ID No. 1), corresponding to the catalytic domain of human DPPIV, was isolated and is shown as SEQ. ID No. 2.
Construction of expression vectors and recombinant proteins from the DNA sequence encoding DPPIV may be performed by various methods well known in the art. For example, these techniques may be performed according to Sambrook et al., Molecular Cloning-A Laboratory Manual, Cold Spring Harbor, N.Y. (1989), and Kriegler, M., Gene Transfer and Expression, A Laboratory Manual, Stockton Press, New York (1990).
A variety of expression systems and hosts may be used for the expression of DPPIV. Example 1 provides one such expression system.
Once expressed, purification steps are employed to produce DPPIV in a relatively homogeneous state. In general, a higher purity solution of a protein increases the likelihood that the protein will crystallize. Typical purification methods include the use of centrifugation, partial fractionation using salt or organic compounds, dialysis, conventional column chromatography (such as ion exchange, molecular sizing chromatography, etc.), high performance liquid chromatography (HPLC), and gel electrophoresis methods (see, e.g., Deutscher, “Guide to Protein Purification” in Methods in Enzymology (1990), Academic Press, Berkeley, Calif.).
DPPIV may optionally be affinity labeled during cloning, preferably with an N-terminal six-histidine tag, in order to facilitate purification. With the use of an affinity label, it is possible to perform a one-step purification process on a purification column that has a unique affinity for the label. The affinity label may be optionally removed after purification. These and other purification methods are known and will be apparent to one of skill in the art.
3. Crystallization and Crystals Comprising DPPIV
One aspect of the present invention relates to methods for forming crystals comprising DPPIV as well as crystals comprising DPPIV.
In one embodiment, a method for forming crystals comprising DPPIV is provided comprising forming a crystallization volume comprising DPPIV, one or more precipitants, optionally a buffer, optionally a monovalent and/or divalent salt and optionally an organic solvent; and storing the crystallization volume under conditions suitable for crystal formation.
In yet another embodiment, a method for forming crystals comprising DPPIV is provided comprising forming a crystallization volume comprising DPPIV in solution comprising the components shown in Table 5; and storing the crystallization volume under conditions suitable for crystal formation.
TABLE 5
Precipitant: 5-50% w/v of precipitant, wherein the precipitant comprises one or more members of the group consisting of PEG MME having a molecular weight range between 300-10000, and PEG having a molecular weight range between 100-10000
pH: pH 5-9. Buffers that may be used include, but are not limited to, tris, bicine, cacodylate, acetate, citrate, MES and combinations thereof.
Additives: optionally 0.05 to 0.8 M additives wherein the additives comprise sarcosine, or 0.5 to 25% additives wherein the additives comprise xylitol
Protein Concentration: 1 mg/ml-50 mg/ml
Temperature: 1° C.-25° C.
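The numeric ranges of Table 5 lend themselves to a simple range check on a proposed condition. The following minimal Python sketch encodes only the precipitant, pH, protein concentration and temperature ranges of Table 5; the Condition class, its field names and the out_of_range helper are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Condition:
    """One proposed crystallization condition (field names are illustrative)."""
    precipitant_pct_wv: float
    ph: float
    protein_mg_per_ml: float
    temperature_c: float


# Inclusive ranges taken from Table 5.
TABLE5_RANGES = {
    "precipitant_pct_wv": (5.0, 50.0),
    "ph": (5.0, 9.0),
    "protein_mg_per_ml": (1.0, 50.0),
    "temperature_c": (1.0, 25.0),
}


def out_of_range(cond: Condition) -> list:
    """Return the names of the fields that fall outside the Table 5 ranges."""
    problems = []
    for field, (low, high) in TABLE5_RANGES.items():
        value = getattr(cond, field)
        if not low <= value <= high:
            problems.append(field)
    return problems
```

An empty list indicates that the condition lies within every range checked; the optional additives of Table 5 are deliberately left out of this sketch.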
In yet another embodiment, a method for forming crystals comprising DPPIV is provided comprising forming a crystallization volume comprising DPPIV; introducing crystals comprising DPPIV as nucleation sites, and storing the crystallization volume under conditions suitable for crystal formation.
Crystallization experiments may optionally be performed in volumes commonly used in the art, for example typically 15, 10, 5, 2 microliters or less. It is noted that the crystallization volume optionally has a volume of less than 1 microliter, optionally 500, 250, 150, 100, 50 or less nanoliters.
It is also noted that crystallization may be performed by any crystallization method including, but not limited to batch, dialysis and vapor diffusion (e.g., sitting drop and hanging drop) methods. Micro and/or macro seeding of crystals may also be performed to facilitate crystallization.
It should be understood that forming crystals comprising DPPIV and crystals comprising DPPIV according to the invention are not intended to be limited to the wild type, full length DPPIV shown in SEQ. ID No. 1 and to fragments comprising residues 39-766 of SEQ. ID No. 1. Rather, it should be recognized that the invention may be extended to various other fragments and variants of wild-type DPPIV as described above.
It should also be understood that forming crystals comprising DPPIV and crystals comprising DPPIV according to the invention may be such that DPPIV is optionally complexed with one or more ligands and one or more copies of the same ligand. The ligand used to form the complex may be any ligand capable of binding to DPPIV. In one variation, the ligand is a natural substrate. In another variation, the ligand is an inhibitor.
In one particular embodiment, DPPIV crystals have a crystal lattice in the P21 space group. DPPIV crystals may also optionally have unit cell dimensions, +/−5%, of a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°. DPPIV crystals also preferably are capable of diffracting X-rays for determination of atomic coordinates to a resolution of 4 Å, 3 Å, 2.5 Å, 2 Å or better.
Crystals comprising DPPIV may be formed by a variety of different methods known in the art. For example, crystallizations may be performed by batch, dialysis, and vapor diffusion (sitting drop and hanging drop) methods. A detailed description of basic protein crystallization setups may be found in McRee, D. and David, P., Practical Protein Crystallography, 2nd Ed. (1999), Academic Press Inc. Further descriptions regarding performing crystallization experiments are provided in Stevens, et al. (2000) Curr. Opin. Struct. Biol.: 10(5):558-63, and U.S. Pat. Nos. 6,296,673, 5,419,278, and 5,096,676.
In one variation, crystals comprising DPPIV are formed by mixing substantially pure DPPIV with an aqueous buffer containing a precipitant at a concentration just below a concentration necessary to precipitate the protein. One suitable precipitant for crystallizing DPPIV is polyethylene glycol (PEG), which combines some of the characteristics of the salts and other organic precipitants (see, for example, Ward et al., J. Mol. Biol. 98:161, 1975, and McPherson, J. Biol. Chem. 251:6300, 1976).
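By way of illustration, the unit cell volume and the +/−5% tolerance can be computed directly from the dimensions given above. The sketch below is in Python and uses the standard monoclinic relation V = a·b·c·sin β; applying the tolerance to the cell edges only (and not to the angles) is an assumption made for this example, and the function names are illustrative.

```python
import math

# Reference cell edges (in Angstroms) and beta angle (degrees) from the text.
REFERENCE_CELL = {"a": 121.53, "b": 124.11, "c": 144.42}
REFERENCE_BETA_DEG = 114.6


def monoclinic_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume of a monoclinic cell (alpha = gamma = 90 degrees), in cubic Angstroms."""
    return a * b * c * math.sin(math.radians(beta_deg))


def edges_within_tolerance(cell: dict, tolerance: float = 0.05) -> bool:
    """True if every edge is within the fractional tolerance of the reference.

    Assumption: the +/-5% figure is applied to a, b and c only.
    """
    return all(
        abs(cell[axis] - ref) <= tolerance * ref
        for axis, ref in REFERENCE_CELL.items()
    )


volume = monoclinic_volume(
    REFERENCE_CELL["a"], REFERENCE_CELL["b"], REFERENCE_CELL["c"], REFERENCE_BETA_DEG
)
```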
During a crystallization experiment, water is removed by diffusion or evaporation to increase the concentration of the precipitant, thus creating precipitating conditions for the protein. In one particular variation, crystals are grown by vapor diffusion in hanging drops or sitting drops. According to these methods, a protein/precipitant solution is formed and then allowed to equilibrate in a closed container with a larger aqueous reservoir having a precipitant concentration for producing crystals. The protein/precipitant solution continues to equilibrate until crystals grow.
By performing submicroliter volume sized crystallization experiments, as detailed in U.S. Pat. No. 6,296,673, effective crystallization conditions for forming crystals of a DPPIV complex were obtained. In order to accomplish this, systematic broad screen crystallization trials were performed on a DPPIV complex using the sitting drop technique. Over 1000 individual trials were performed in which pH, temperature and precipitants were varied. In each experiment, a 100 nL mixture of DPPIV complex and precipitant was placed on a platform positioned over a well containing 100 μL of the precipitating solution. Precipitate and crystal formation was detected in the sitting drops. Fine screening was then carried out for those crystallization conditions that appeared to produce precipitate and/or crystal in the drops.
Based on the crystallization experiments that were performed, a thorough understanding of how different crystallization conditions affect DPPIV crystallization was obtained. Based on this understanding, a series of crystallization conditions were identified that may be used to form crystals comprising DPPIV. These conditions are summarized in Table 5. A particular example of crystallization conditions that may be used to form diffraction quality crystals of the DPPIV complex is detailed in Example 2. FIG. 2 illustrates crystals of the DPPIV complex formed using the crystallization conditions provided in Table 5.
One skilled in the art will recognize that the crystallization conditions provided in Table 5 and Example 2 can be varied and still yield protein crystals comprising DPPIV. For example, it is noted that variations on the crystallization conditions described herein can be readily determined by taking the conditions provided in Table 5 and performing fine screens around those conditions by varying the type and concentration of the components in order to determine additional suitable conditions for crystallizing DPPIV, variants of DPPIV, and ligand complexes thereof.
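A fine screen of the kind described above can be enumerated as a small grid around a starting condition. The following Python sketch varies only precipitant concentration and pH and discards grid points outside the Table 5 ranges; the step sizes, the grid half-width and the function name fine_screen are assumptions chosen for illustration and are not taken from the examples.

```python
from itertools import product


def fine_screen(center_pct, center_ph, pct_step=1.0, ph_step=0.2, half_width=2):
    """Return (precipitant % w/v, pH) pairs on a grid around a starting condition.

    Points outside the Table 5 ranges (5-50% w/v, pH 5-9) are dropped.
    """
    offsets = range(-half_width, half_width + 1)
    pcts = [center_pct + i * pct_step for i in offsets]
    # Rounding keeps the pH values free of floating-point noise.
    phs = [round(center_ph + i * ph_step, 2) for i in offsets]
    return [
        (pct, ph)
        for pct, ph in product(pcts, phs)
        if 5.0 <= pct <= 50.0 and 5.0 <= ph <= 9.0
    ]
```

With the default arguments a starting point well inside the ranges yields a 5 x 5 grid, while a starting point at the edge of a range yields fewer points.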
Crystals comprising DPPIV have a wide range of uses. For example, now that crystals comprising DPPIV have been produced, it is noted that crystallizations may be performed using such crystals as a nucleation site within a concentrated protein solution. According to this variation, a concentrated protein solution is prepared and crystalline material (microcrystals) is used to ‘seed’ the protein solution to assist nucleation for crystal growth. If the concentrations of the protein and any precipitants are optimal for crystal growth, the seed crystal will provide a nucleation site around which a larger crystal forms. Given the ability to form crystals comprising DPPIV according to the present invention, the crystals so formed can be used by this crystallization technique to initiate crystal growth of other DPPIV comprising crystals, including DPPIV complexed to other ligands.
As will be described herein in greater detail, crystals may also be used to perform X-ray or neutron diffraction analysis in order to determine the three-dimensional structure of DPPIV and, in particular, to assist in the identification of its active site. Knowledge of the binding site region allows rational design and construction of ligands including inhibitors. Crystallization and structural determination of DPPIV mutants having altered bioactivity allows the evaluation of whether such changes are caused by general structure deformation or by side chain alterations at the substitution site.
4. X-Ray Data Collection and Structure Determination
Crystals comprising DPPIV may be obtained as described above in Section 3. As described herein, these crystals may then be used to perform x-ray data collection and for structure determination.
In one embodiment, described in Example 2, crystals of a DPPIV complex were obtained where DPPIV has the sequence of residues shown in SEQ. ID No. 3. These particular crystals were used to determine the three dimensional structure of DPPIV. However, it is noted that other crystals comprising DPPIV including different DPPIV variants, fragments, and complexes thereof may also be used.
The structure of DPPIV was solved by a combination of heavy-atom derivatives and Seleno-Methionine (Se-Met) phasing in conjunction with non-crystallographic averaging. Heavy atom derivatives were obtained by soaking native DPPIV crystals in heavy atom solutions made using the crystallization solution. The concentration of heavy atom derivative and the soaking time ranged from 0.5 mM to 10 mM and from 1 to 15 days, respectively. An extensive array of heavy atom derivatives was individually soaked into DPPIV crystals and analyzed. Two heavy atom derivatives were used to determine the phases: di-μ-iodobis(ethylenediamine)-di-platinum (II) nitrate (PIP) and ethyl mercuric thiosalicylic acid sodium salt (EMTS). Data from crystals of apo DPPIV, Se-Met-DPPIV, PIP-DPPIV and EMTS-DPPIV were collected from cryocooled crystals (100K) at the Stanford Synchrotron Radiation Laboratory (SSRL) beam lines 9-1, 9-2 and 11-1 and the Advanced Light Source (ALS) beam lines 5.0.2 and 5.0.3, both using an ADSC Quantum CCD detector. The diffraction pattern of the DPPIV crystals displayed symmetry consistent with space group P21 with unit cell dimensions a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6° (+/−5%). Data were collected and integrated to 2.3 Å with MOSFLM (or HKL2000) and scaled with SCALA (or Scalepack) (CCP4 Study Weekend, Eds. Sawyer, L., Isaacs, N. & Bailey, S. 56-62, SERC Daresbury Laboratory, England, 1993).
All crystallographic calculations were performed using the CCP4 program package (Collaborative Computational Project, N. The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763 (1994)).
Positions for the Platinum atoms of the PIP derivative were located using the direct method search program, SHELXD. The heavy atom parameters of the PIP derivative were refined using the program SHARP. The refined parameters were used to compute phases and locate the Mercury atom positions of the EMTS derivative. The heavy atom parameters of both derivatives were refined using SHARP. Initial solvent flattened maps using the phases from both heavy atom derivatives were of reasonable quality and helped identify parts of the secondary structure elements of DPPIV. Due to low incorporation of Selenium (Se) in the baculovirus expressed protein, solving Se positions using MAD data was not successful. However, using the phases from the two heavy atom derivatives and cross phasing on to the peak data of the Se-Met derivative of DPPIV allowed for locating all 52 Se atoms of the four subunits of the Se-Met-DPPIV crystal. A final refinement of both the heavy atom derivatives, including the Se atoms, was carried out using SHARP. The resulting phases with solvent flattening and non-crystallographic averaging using DM resulted in an interpretable electron density map. The model was built into the electron density map using Xfit. Refinement continued with iterative map/model/phase improvement using ARP_WARP map (Perrakis, A., Morris, R. J. & Lamzin, V. S.). This was followed by alternating cycles of manual rebuilding of the model with Xfit (McRee, D. E. XtalView/Xfit-A versatile program for manipulating atomic coordinates and electron density J. Struct. Biol. 125, 156-65 (1999)), ARP_WARP map improvement (Perrakis, A., Morris, R. J. & Lamzin, V. S. Automated protein model building combined with iterative structure refinement) and geometrically restrained refinement against a maximum likelihood target function as implemented in REFMAC (CCP4) until the refinement reached convergence. All stages of model refinement were carried out with bulk solvent correction and anisotropic scaling.
The data collection and data refinement statistics are given in Table 6.
TABLE 6
Crystal data
Space group: P21
Unit cell dimensions: a = 121.53 Å, b = 124.11 Å, c = 144.42 Å, α = γ = 90°, β = 114.6°
Data collection
X-ray source: ALS BL5.0.2 and BL5.0.3; SSRL BL9-1, BL9-2 and BL11-1
Wavelength [Å]: 0.90 to 1.25
Resolution [Å]: 2.30
Observations (unique): 183144
Redundancy: 2.8
Completeness overall (outer shell): 98 (97.6)%
I/σ (I) overall (outer shell): 8.6 (1.9)
Rsymm overall (outer shell): 8.2 (51.6)%
Refinement
Reflections used: 159715
R-factor: 22.7%
Rfree: 28.3%
r.m.s bonds: 0.009 Å
r.m.s angles: 1.432°
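For readers unfamiliar with the Rsymm statistic reported in Table 6, the following Python sketch shows how such a value is conventionally computed from repeated intensity measurements. It assumes the commonly used definition, the sum over reflections of |I − ⟨I⟩| divided by the sum of I; the specification does not state which definition was applied, and the function name and input layout are illustrative.

```python
def r_symm(measurements: dict) -> float:
    """Conventional R_symm over repeated intensity measurements.

    measurements maps a reflection index (h, k, l) to a list of observed
    intensities for that reflection and its symmetry equivalents.
    Reflections measured only once are skipped (an assumed convention).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements.values():
        if len(intensities) < 2:
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0.0:
        raise ValueError("no multiply-measured reflections with non-zero intensity")
    return numerator / denominator
```

The result is a fraction; multiplying by 100 gives a percentage comparable in form to the Table 6 entry.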
During structure determination, where the unit cell dimensions were a=121.53 Å, b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°, it was realized that each unit cell comprised four DPPIV molecules. Structure coordinates were determined for this complex and the resultant set of structural coordinates from the refinement are presented in FIG. 3.
It is noted that the sequence of the structure coordinates presented in FIG. 3 differs in some regards from the sequence shown in SEQ. ID No. 1. Structure coordinates are not reported for some residues because the electron density obtained was insufficient to identify the position of these residues. For FIG. 3, structure coordinates for residues 151-153 (chains C and D) and 97-99 (chain D) are not reported.
Those of skill in the art understand that a set of structure coordinates (such as those in FIG. 3) for a protein or a protein-complex or a portion thereof, is a relative set of points that define a shape in three dimensions. Thus, it is possible that an entirely different set of structure coordinates could define a similar or identical shape. Moreover, slight variations in the individual coordinates may have little effect on overall shape. In terms of binding pockets, these variations would not be expected to significantly alter the nature of ligands that could associate with those pockets. The term “binding pocket” as used herein refers to a region of the protein that, as a result of its shape, favorably associates with a ligand.
These variations in coordinates may be generated because of mathematical manipulations of the DPPIV structure coordinates. For example, the sets of structure coordinates shown in FIG. 3 could be manipulated by crystallographic permutations of the structure coordinates, fractionalization of the structure coordinates, application of a rotation matrix, integer additions or subtractions to sets of the structure coordinates, inversion of the structure coordinates or any combination of the above.
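The point that such manipulations leave the defined shape unchanged can be checked numerically. The Python sketch below applies a rotation about the z axis and a translation to a few arbitrary points and compares the inter-point distances before and after; the point values and function names are placeholders for illustration and are not coordinates from FIG. 3.

```python
import math


def rotate_z(point, angle_deg):
    """Rotate a point about the z axis by angle_deg degrees."""
    theta = math.radians(angle_deg)
    x, y, z = point
    return (
        x * math.cos(theta) - y * math.sin(theta),
        x * math.sin(theta) + y * math.cos(theta),
        z,
    )


def translate(point, shift):
    """Add a constant shift vector to a point."""
    return tuple(p + s for p, s in zip(point, shift))


def pairwise_distances(points):
    """All inter-point distances, in a fixed order."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]


# Placeholder points (not from FIG. 3).
original = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 1.0), (1.0, 5.0, 2.0)]
moved = [translate(rotate_z(p, 37.0), (10.0, -5.0, 2.5)) for p in original]
```

Every individual coordinate in moved differs from the original, yet the two distance lists agree to within rounding error, which is the sense in which the two coordinate sets define the same shape.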
Alternatively, modifications in the crystal structure due to mutations, additions, substitutions, and/or deletions of amino acids or other changes in any of the components that make up the crystal could also account for variations in structure coordinates. If such variations are within an acceptable standard error as compared to the original coordinates, the resulting three-dimensional shape should be considered to be the same. Thus, for example, a ligand that bound to the active site binding pocket of DPPIV would also be expected to bind to another binding pocket whose structure coordinates defined a shape that fell within the acceptable error.
Various computational analyses may be used to determine whether structure coordinates for a protein or a portion thereof are similar to the structure coordinates of DPPIV provided herein, or a portion thereof. Such analyses may be carried out in well known software applications, such as the Molecular Similarity application of QUANTA (Molecular Simulations Inc., San Diego, Calif.) version 4.1, and as described in the accompanying User's Guide. For the purpose of this invention, a rigid fitting method shall be used to compare protein structures.
For the purpose of this invention, any set of structure coordinates for a protein from any source having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 shall be considered identical. It is noted that the root mean square deviation is intended to be limited to only those alpha-carbon atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3.
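The comparison described above can be sketched as follows in Python. The example assumes that the two coordinate sets have already been superimposed by a rigid fitting method (the fitting step itself is not shown), and it restricts the calculation to alpha-carbon atoms of residues present in both sets; the dictionary layout keyed by (chain, residue number) and the function name are illustrative assumptions.

```python
import math


def ca_rmsd(model_a: dict, model_b: dict):
    """RMSD over alpha-carbon atoms common to two pre-superimposed models.

    Each model maps (chain, residue_number) -> (x, y, z) of the alpha-carbon.
    Returns (rmsd_in_angstroms, number_of_common_residues).
    """
    common = sorted(set(model_a) & set(model_b))
    if not common:
        raise ValueError("the two models share no residues")
    squared = sum(math.dist(model_a[key], model_b[key]) ** 2 for key in common)
    return math.sqrt(squared / len(common)), len(common)


def considered_identical(model_a: dict, model_b: dict, cutoff: float = 3.0) -> bool:
    """Apply the less-than-3-Angstrom criterion stated in the text."""
    rmsd, _ = ca_rmsd(model_a, model_b)
    return rmsd < cutoff
```

Residues present in only one of the two models are ignored, mirroring the limitation to residues common to both structures.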
It is noted that mutants and variants of DPPIV as well as other S9 proteases are likely to have similar structures despite having different sequences. For example, the binding pockets of these related proteins are likely to have similar contours. Accordingly, it should be recognized that the structure coordinates and binding pocket models provided herein have utility for these other related proteins.
Accordingly, in one embodiment, the invention relates to data, computer readable media comprising data, and uses of the data where the data comprises all or a portion of the structure coordinates shown in FIG. 3 or structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3. Again, it is noted that the root mean square deviation is intended to be limited to only those alpha-carbon atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3.
As noted, there are many different ways to express the surface contours of the DPPIV structure other than by using the structure coordinates provided in FIG. 3. Accordingly, it is noted that the present invention is also directed to any data, computer readable media comprising data, and uses of the data where the data defines a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on alpha-carbon atoms in the structure coordinates of FIG. 3 that are present in residues shown in SEQ. ID No. 1.
In regard to these embodiments, it is noted that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms. Also, the root mean square deviation of alpha-carbon atoms, main-chain atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
5. DPPIV Structure
The present invention is also directed to a three-dimensional crystal structure of DPPIV. This crystal structure may be used to identify binding sites, to provide mutants having desirable binding properties, and ultimately, to design, characterize, or identify ligands that interact with DPPIV as well as other S9 proteases.
The three-dimensional crystal structure of DPPIV may be generated, as is known in the art, from the structure coordinates shown in FIG. 3 and similar such coordinates.
During the course of structure solution it became evident that the wild type apo crystals of DPPIV of the present invention contained four nearly identical copies in the asymmetric unit. The final coordinates for each one of these molecules, referred to as chains A, B, C and D, are given in FIG. 3. The variations between the chains are described below.
Chain A includes amino acid residues 40-766 and four amino acid residues have covalently linked sugar molecules (FIG. 3). Chain B includes amino acid residues 39-766 and also includes 4 histidine residues of the N-terminal polyhistidine tag (residues 35-38). Five amino acid residues of chain B have covalently linked sugar molecules (FIG. 3). Chain C includes amino acid residues 40-766 and five of the amino acid residues are covalently linked to sugar molecules. Chain D includes amino acid residues 39-766 with five sugar-linked amino acid residues. In addition, chains C and D have no density for amino acid residues 139, 140, and 141 and hence coordinates for these residues are not included in FIG. 3. Similarly, coordinates for amino acid residues 85, 86, and 87 of chain D are not included in FIG. 3. The coordinate set additionally includes 928 solvent molecules modeled as water.
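Gaps of the kind described for chains C and D can be located by comparing the residue numbers present in a chain against its expected range. The following Python sketch uses the chain D residue numbers as stated in the preceding paragraph; the function names are illustrative and the example does not read the FIG. 3 coordinates themselves.

```python
def missing_residues(present, first, last):
    """Residue numbers in [first, last] that are absent from `present`."""
    have = set(present)
    return [r for r in range(first, last + 1) if r not in have]


def as_ranges(numbers):
    """Collapse integers into inclusive (start, end) runs."""
    ranges = []
    for n in sorted(numbers):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)  # extend the current run
        else:
            ranges.append((n, n))            # start a new run
    return ranges


# Chain D as described above: residues 39-766 without 85-87 and 139-141.
absent = {85, 86, 87, 139, 140, 141}
chain_d_present = [r for r in range(39, 767) if r not in absent]
chain_d_gaps = as_ranges(missing_residues(chain_d_present, 39, 766))
```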
FIG. 4A illustrates a ribbon diagram overview of the structure of DPPIV, highlighting secondary structural elements of the protein. DPPIV is a cylindrical shaped molecule with an approximate height of 70 Å and a diameter of 60 Å (FIG. 4A). The catalytic triad of DPPIV (Ser 630, Asp 708 and His 740) is illustrated in the center of FIG. 4A by a “ball and stick” representation. This triad of amino acids is located in the peptidase domain or catalytic domain of DPPIV. The catalytic domain is covalently linked to the β-propeller domain (FIG. 4A).
The catalytic domain of DPPIV includes residues 39-55 and 499-766. Since the structure of the present invention does not contain the first 46 residues (Chain B of FIG. 3) it is presumed that the N-terminal residues of the catalytic domain adopt a random structure with a short double turn α-helix formed by residues 44 to 51. The catalytic domain of DPPIV adopts a characteristic α/β hydrolase fold. The core of this domain contains an 8-stranded β-sheet with all strands being parallel except one (FIG. 4A). The β-sheet is significantly twisted and is flanked by three α-helices on one side and five α-helices on the other. The topology of the β-strands is 1, 2, −1x, 2x and (1x) (J. S. Richardson: The anatomy and taxonomy of protein structure; (1981) Adv. Protein Chem. 269, 15076-15084.).
FIG. 4B illustrates the remaining residues 56-498 that form the non-catalytic domain of DPPIV. This domain is also known as the β-propeller domain (FIG. 4B). The β-propeller domain is a 7-fold repeat of four-stranded antiparallel β-sheets (FIG. 4B). The sheets are twisted and arranged around a central tunnel as seen in the case of prolyl oligopeptidase. Further, the β-sheets pack face-to-face and are stabilized predominantly by hydrophobic interactions. The β-propeller is linked to the catalytic domain by two polypeptide chains, one involving the N-terminal residues and the other consisting of the C-terminal residues 499-508 which also form an α-helix.
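The domain boundaries given above (catalytic domain: residues 39-55 and 499-766; β-propeller domain: residues 56-498) can be captured in a short lookup. This is a minimal Python sketch; the function name domain_of and the returned labels are illustrative.

```python
def domain_of(residue_number: int):
    """Assign a residue to a DPPIV domain using the boundaries stated in the text.

    Returns "catalytic", "beta-propeller", or None for residues outside 39-766.
    """
    if 39 <= residue_number <= 55 or 499 <= residue_number <= 766:
        return "catalytic"
    if 56 <= residue_number <= 498:
        return "beta-propeller"
    return None


# The catalytic triad residues named in the text.
triad_domains = {r: domain_of(r) for r in (630, 708, 740)}
```

All three triad residues (Ser 630, Asp 708 and His 740) fall within the catalytic domain under these boundaries, consistent with the description of FIG. 4A.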
FIG. 5 illustrates the binding site of DPPIV based on the determined crystal structure corresponding to the coordinates shown in FIG. 3.
6. DPPIV Active Site and Ligand Interaction
The term “binding site” or “binding pocket”, as the terms are used herein, refers to a region of a protein that, as a result of its shape, favorably associates with a ligand or substrate. The term “DPPIV-like binding pocket” refers to a portion of a molecule or molecular complex whose shape is sufficiently similar to the DPPIV binding pockets as to bind common ligands. This commonality of shape may be quantitatively defined based on a comparison to a reference point, that reference point being the structure coordinates provided herein. For example the commonality of shape may be quantitatively defined based on a root mean square deviation (rmsd) from the structure coordinates of the backbone atoms of the amino acids that make up the binding pockets in DPPIV (as set forth in FIG. 3).
The “active site binding pockets” or “active site” of DPPIV refers to the area on the surface of DPPIV where the substrate binds.
FIG. 5 illustrates the inhibitor-binding site of DPPIV based on the determined crystal structure (coordinates shown in FIG. 3). The active site containing the catalytic triad (Ser 630, Asp 708 and His 740) is located in a large cavity (FIG. 5) at the interface of the catalytic and the β-propeller domains. Ser 630 is located on a sharp turn that connects an α-helix to a β-strand. The positioning of this active site Serine residue is referred to as a nucleophile elbow and is characteristic of an α/β type hydrolase (D. J. Ollis et al., The α/β hydrolase fold; (1992) Protein Eng. 5, 197-211). In DPPIV, the active-site serine is surrounded by hydrophobic residues, which include the large aromatic residues Trp 629 and Tyr 631. The hydroxyl group of the active site serine is exposed and involved in hydrogen bonding with the imidazole group of the active site His 740 (OH...NH distance 2.7 Å). His 740 is located in the middle of a loop that connects a β-strand to an α-helix. The other nitrogen atom of the imidazole ring of His 740 forms a hydrogen bond with the side chain oxygen of the third active site residue (Asp 708) of the catalytic triad. Asp 708 is also located on a loop connecting a β-strand and an α-helix. The second oxygen atom of the side chain carboxylate of Asp 708 forms two hydrogen bonded interactions with the main-chain amides of residues Asn 710 and Val 711. The hydrogen bonding interactions of the catalytic triad are similar to those observed for prolyl oligopeptidase.
Based on sequence alignments and structural comparisons with prolyl oligopeptidase, the residues that form the DPPIV active site pocket can be predicted with a high degree of probability. The binding pocket appears to be formed by a pocket of hydrophobic residues (Phe 357, Tyr 631, Tyr 662, Tyr 666, Tyr 547 and Val 711). In addition to the catalytic triad a large number of polar residues are also present in this hydrophobic environment (Arg 125, Glu 205, Glu 206 and Asp 663).
In resolving the crystal structure of DPPIV, applicants determined that DPPIV amino acids shown in Table 1 (above) are encompassed within a 4-Angstrom radius around the DPPIV active site and therefore are likely close enough to interact with an active site inhibitor of DPPIV. Applicants have also determined that the amino acids shown in Table 2 (above) are encompassed within a 7-Angstrom radius around the DPPIV active site. Further, the amino acids shown in Table 3 (above) are encompassed within a 10-Angstrom radius around the DPPIV active site. Due to their proximity to the active site, the amino acids in the 4, 7, and/or 10 Angstroms sets are preferably conserved in variants of DPPIV. While it is desirable to largely conserve these residues, it should be recognized however that variants may also involve varying 1, 2, 3, 4 or more of the residues set forth in Tables 1, 2, and 3 in order to evaluate the roles these amino acids play in the binding pocket. Applicants have also determined that amino acids shown in Table 4 (above) are encompassed within a 5-Angstrom radius around the DPPIV dimerization interface (AB and CD dimers).
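A distance-based selection of the kind used to define the 4, 7 and 10 Angstrom sets can be sketched as follows in Python. The example assumes that the active site is represented by a single reference point and that a residue is selected when any of its atoms lies within the radius; the specification does not state how the reference was defined, and the atom records shown are placeholders rather than coordinates from FIG. 3.

```python
import math


def residues_within(atoms, center, radius):
    """Residues having at least one atom within `radius` Angstroms of `center`.

    atoms is an iterable of (chain, residue_number, residue_name, (x, y, z)).
    """
    hits = set()
    for chain, number, name, xyz in atoms:
        if math.dist(xyz, center) <= radius:
            hits.add((chain, number, name))
    return sorted(hits)


# Placeholder atoms along the x axis (hypothetical, for illustration only).
toy_atoms = [
    ("A", 1, "SER", (1.0, 0.0, 0.0)),
    ("A", 1, "SER", (5.0, 0.0, 0.0)),
    ("A", 2, "TYR", (6.0, 0.0, 0.0)),
    ("A", 3, "GLU", (9.5, 0.0, 0.0)),
    ("A", 4, "ARG", (12.0, 0.0, 0.0)),
]
reference_point = (0.0, 0.0, 0.0)
shells = {r: residues_within(toy_atoms, reference_point, r) for r in (4.0, 7.0, 10.0)}
```

Because each larger radius encloses the smaller ones, the 4 Angstrom set is contained in the 7 Angstrom set, which in turn is contained in the 10 Angstrom set.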
With the knowledge of the DPPIV crystal structure provided herein, Applicants are able to know the contour of a DPPIV binding pocket based on the relative positioning of the 4, 7, and/or 10 Angstroms sets of amino acids. In addition, Applicants are able to know the contour of a dimerization interface (AB and CD dimers) based on the relative positions of the α-carbon atoms of the residues in Table 4. Again, it is noted that it may be desirable to form variants where 1, 2, 3, 4 or more of the residues set forth in Tables 1, 2, and 3 are varied in order to evaluate the roles these amino acids play in the binding pocket. Accordingly, any set of structure coordinates for a protein from any source having a root mean square deviation of non-hydrogen atoms of less than 3 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of FIG. 3 for the 4, 7, and/or 10 Angstroms sets of amino acids and/or those amino acids of the dimerization interface shall be considered identical. As noted previously, the root mean square deviation is intended to be limited to only those non-hydrogen atoms of amino acid residues that are common to both the protein fragment represented in FIG. 3 and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3, since the sequence of the protein may be varied somewhat.
Accordingly, in one embodiment, the invention relates to data, computer readable media comprising data, and uses of the data where the data comprises the structure coordinates shown in FIG. 3 or structure coordinates having a root mean square deviation of non-hydrogen atoms of less than 3 Å when superimposed on the non-hydrogen atom positions of the corresponding atomic coordinates of FIG. 3 for the 4, 7, and/or 10 Angstroms sets of amino acids and/or the residues listed in Table 4.
Again, it is noted that the root mean square deviation is intended to be limited to only those non-hydrogen atoms of amino acid residues that are common to both the protein fragment represented in one or more of the tables and the protein whose structure coordinates are being compared to the coordinates shown in FIG. 3.
As noted above, there are many different ways to express the surface contours of the DPPIV structure other than by using the structure coordinates provided in FIG. 3. Accordingly, it is noted that the present invention is also directed to any data, computer readable media comprising data, and uses of the data where the data defines a computer model for a protein binding pocket, at least a portion of the computer model having a surface contour that has a root mean square deviation of less than 3 Å when superimposed on a surface contour defined by atomic coordinates of FIG. 3, the root mean square deviation being calculated based only on non-hydrogen atoms in the structure coordinates of FIG. 3 that are present in residues shown in Tables 1, 2, 3 and/or 4.
Optionally, the root mean square deviation of non-hydrogen atoms is less than 1.5 Å, 1 Å, 0.5 Å, or less.
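By way of illustration only, the restriction of the root mean square deviation to non-hydrogen atoms common to both structures may be sketched as follows. This Python sketch makes several assumptions that are not part of the disclosure: each structure is a dictionary keyed by a hypothetical (residue id, atom name) pair, each value carries an element symbol and Cartesian coordinates, and the second structure has already been superimposed on the first. The superposition step itself is not shown. All coordinates are made up.

```python
import math

def rmsd_common_heavy_atoms(ref, other, residues=None):
    """RMSD over non-hydrogen atoms present in both structures.

    ref, other : dict mapping (residue_id, atom_name) -> (element, x, y, z)
    residues   : optional set of residue ids restricting the comparison
                 (e.g. the residues of one of the tables)
    Assumes `other` has already been superimposed on `ref`.
    """
    sq_sum = 0.0
    n = 0
    for key, (element, x, y, z) in ref.items():
        if element == "H" or key not in other:
            continue  # skip hydrogens and atoms missing from `other`
        if residues is not None and key[0] not in residues:
            continue  # skip residues outside the selected set
        _, ox, oy, oz = other[key]
        sq_sum += (x - ox) ** 2 + (y - oy) ** 2 + (z - oz) ** 2
        n += 1
    if n == 0:
        raise ValueError("no common non-hydrogen atoms to compare")
    return math.sqrt(sq_sum / n)

# Made-up toy structures: `other` lacks residue 2 and is shifted 1 Angstrom in z.
ref = {
    (1, "CA"): ("C", 0.0, 0.0, 0.0),
    (1, "N"): ("N", 1.0, 0.0, 0.0),
    (1, "H"): ("H", 1.5, 0.5, 0.0),
    (2, "CA"): ("C", 3.0, 0.0, 0.0),
}
other = {
    (1, "CA"): ("C", 0.0, 0.0, 1.0),
    (1, "N"): ("N", 1.0, 0.0, 1.0),
    (1, "H"): ("H", 9.0, 9.0, 9.0),
}
value = rmsd_common_heavy_atoms(ref, other)
within_3_angstroms = value < 3.0
```

The hydrogen atom and the residue absent from the second structure do not contribute, so only the two shared heavy atoms determine the result.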
It will be readily apparent to those of skill in the art that the numbering of amino acids in other isoforms of DPPIV may be different than that set forth for DPPIV. Corresponding amino acids in other isoforms of DPPIV are easily identified by visual inspection of the amino acid sequences or by using commercially available homology software programs, as further described below.
7. System For Displaying the Three Dimensional Structure of DPPIV
The present invention is also directed to machine-readable data storage media having data storage material encoded with machine-readable data that comprises structure coordinates for DPPIV. The present invention is also directed to a machine-readable data storage medium having data storage material encoded with machine-readable data, which, when read by an appropriate machine, can display a three-dimensional representation of a structure of DPPIV.
All or a portion of the DPPIV coordinate data shown in FIG. 3, when used in conjunction with a computer programmed with software to translate those coordinates into the three-dimensional structure of DPPIV, may be used for a variety of purposes, especially for purposes relating to drug discovery. Software for generating three-dimensional graphical representations is known and commercially available. The ready use of the coordinate data requires that it be stored in a computer-readable format. Thus, in accordance with the present invention, data capable of being displayed as the three-dimensional structure of DPPIV and/or portions thereof and/or their structurally similar variants may be stored in a machine-readable storage medium, which is capable of displaying a graphical three-dimensional representation of the structure.
For example, in one embodiment, a computer is provided for producing a three-dimensional representation of at least a DPPIV-like binding pocket, the computer comprising: a machine-readable data storage medium comprising a data storage material encoded with machine-readable data, the machine-readable data comprising structure coordinates that have a root mean square deviation of less than 3 Angstroms when compared to structure coordinates appearing in FIG. 3, the comparison being based on alpha-carbon atoms of amino acid residues present in both the set of structure coordinates shown in FIG. 3 and the structure coordinates being compared, the comparison being further limited to residues of DPPIV appearing in Tables 1, 2, 3 and/or 4; a working memory for storing instructions for processing the machine-readable data; a central processing unit coupled to the working memory and to the machine-readable data storage medium, for processing the machine-readable data into the three-dimensional representation; and an output hardware coupled to the central processing unit, for receiving the three-dimensional representation.
Another embodiment of this invention provides a machine-readable data storage medium, comprising a data storage material encoded with machine readable data which, when used by a machine programmed with instructions for using said data, displays a graphical three-dimensional representation comprising DPPIV or a portion or variant thereof.
In one variation, the machine readable data comprises data for representing a protein based on structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 for all of the amino acids in FIG. 3.
In another variation, the machine readable data comprises data for representing a protein based on structure coordinates having a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3 for the amino acids listed in Tables 1, 2, 3 and/or 4.
It is again noted that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms. Also, the root mean square deviation of alpha-carbon atoms, main-chain atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
According to another embodiment, the machine-readable data storage medium comprises a data storage material encoded with a first set of machine readable data which comprises the Fourier transform of the structure coordinates set forth in FIG. 3, and which, when using a machine programmed with instructions for using said data, can be combined with a second set of machine readable data comprising the X-ray diffraction pattern of another molecule or molecular complex to determine at least a portion of the structure coordinates corresponding to the second set of machine readable data. For example, the Fourier transform of the structure coordinates set forth in FIG. 3 may be used to determine at least a portion of the structure coordinates of other DPPIV-like enzymes, and isoforms of DPPIV.
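The Fourier transform of a set of structure coordinates referred to above can be sketched as a structure factor calculation. The Python below is a simplified illustration under stated assumptions: atoms are given in fractional unit-cell coordinates, and each atom carries a single constant scattering weight (real atomic scattering factors depend on resolution, and temperature factors are ignored here). The two-atom cell is made up and does not represent DPPIV.

```python
import cmath
import math

def structure_factor(atoms, hkl):
    """Sum f * exp(2*pi*i*(h*x + k*y + l*z)) over all atoms.

    atoms : list of (f, x, y, z); f is a constant scattering weight
            (an assumption) and x, y, z are fractional coordinates
    hkl   : (h, k, l) reflection indices
    """
    h, k, l = hkl
    total = 0j
    for f, x, y, z in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Made-up two-atom cell, for illustration only.
toy_cell = [(6.0, 0.0, 0.0, 0.0), (8.0, 0.5, 0.0, 0.0)]

f_000 = structure_factor(toy_cell, (0, 0, 0))
f_100 = structure_factor(toy_cell, (1, 0, 0))

# A diffraction experiment measures only the amplitude; the phase is what
# a known model can supply for a related, unknown structure.
amplitude_100 = abs(f_100)
phase_100 = cmath.phase(f_100)
```

The amplitude corresponds to what is observed in a diffraction pattern, while the phase is the quantity that the stored transform contributes when combined with a second data set.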
Optionally, a computer system is provided in combination with the machine-readable data storage medium provided herein. In one embodiment, the computer system comprises a working memory for storing instructions for processing the machine-readable data; a processing unit coupled to the working memory and to the machine-readable data storage medium, for processing the machine-readable data into the three-dimensional representation; and an output hardware coupled to the processing unit, for receiving the three-dimensional representation.
FIG. 6 illustrates an example of a computer system that may be used in combination with storage media according to the present invention. As illustrated, the computer system 10 includes a computer 11 comprising a central processing unit (“CPU”) 20, a working memory 22 which may be, e.g., RAM (random-access memory) or “core” memory, mass storage memory 24 (such as one or more disk drives or CD-ROM drives), one or more cathode-ray tube (“CRT”) display terminals 26, one or more keyboards 28, one or more input lines 30, and one or more output lines 40, all of which are interconnected by a conventional bi-directional system bus 50.
Input hardware 36, coupled to computer 11 by input lines 30, may be implemented in a variety of ways. For example, machine-readable data of this invention may be inputted via the use of a modem or modems 32 connected by a telephone line or dedicated data line 34. Alternatively or additionally, the input hardware 36 may comprise CD-ROM drives or disk drives 24. In conjunction with display terminal 26, keyboard 28 may also be used as an input device.
Output hardware 46, coupled to computer 11 by output lines 40, may similarly be implemented by conventional devices. By way of example, output hardware 46 may include CRT display terminal 26 for displaying a graphical representation of a binding pocket of this invention using a program such as QUANTA as described herein. Output hardware might also include a printer 42, so that hard copy output may be produced, or a disk drive 24, to store system output for later use.
In operation, CPU 20 coordinates the use of the various input and output devices 36, 46, coordinates data accesses from mass storage 24 and accesses to and from working memory 22, and determines the sequence of data processing steps. A number of programs may be used to process the machine-readable data of this invention. Such programs are discussed in reference to using the three-dimensional structure of DPPIV described herein.
The storage medium encoded with machine-readable data according to the present invention can be any conventional data storage device known in the art. For example, the storage medium can be a conventional floppy diskette or hard disk. The storage medium can also be an optically-readable data storage medium, such as a CD-ROM or a DVD-ROM, or a rewritable medium such as a magneto-optical disk that is optically readable and magneto-optically writable.
8. Uses of the Three Dimensional Structure of DPPIV
The three-dimensional crystal structure of the present invention may be used to identify DPPIV binding sites, to serve as a molecular replacement model to solve the structure of unknown crystallized proteins, to design mutants having desirable binding properties, and ultimately to design, characterize, and identify entities capable of interacting with DPPIV and other S9 proteases, as well as for other uses that would be recognized by one of ordinary skill in the art. Such entities may be chemical entities or proteins. The term “chemical entity”, as used herein, refers to chemical compounds, complexes of at least two chemical compounds, and fragments of such compounds.
The DPPIV structure coordinates provided herein are useful for screening and identifying drugs that inhibit DPPIV and other proteases. For example, the structure encoded by the data may be computationally evaluated for its ability to associate with putative substrates or ligands. Such compounds that associate with DPPIV may inhibit DPPIV, and are potential drug candidates. Additionally or alternatively, the structure encoded by the data may be displayed in a graphical three-dimensional representation on a computer screen. This allows visual inspection of the structure, as well as visual inspection of the structure's association with the compounds.
Thus, according to another embodiment of the present invention, a method is provided for evaluating the potential of an entity to associate with DPPIV or a fragment or variant thereof by using all or a portion of the structure coordinates provided in FIG. 3. A method is also provided for evaluating the potential of an entity to associate with DPPIV or a fragment or variant thereof by using structure coordinates similar to all or a portion of the structure coordinates provided in FIG. 3. For example, the structure coordinates used may have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3.
It is again noted in regard to these embodiments that the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms. Also, the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
The method may optionally comprise the steps of: creating a computer model of all or a portion of a protein structure (e.g., a binding pocket) using structure coordinates according to the present invention; performing a fitting operation between the entity and the computer model; and analyzing the results of the fitting operation to quantify the association between the entity and the model. The portion of the protein structure used optionally comprises all of the amino acids listed in Tables 1, 2, 3 and/or 4 that are present in the structure coordinates being used.
It is noted that the computer model may not necessarily directly use the structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
The structure coordinates provided herein can also be utilized in a method for identifying a ligand (e.g., entities capable of associating with a protein) of a protein comprising a DPPIV-like binding pocket. One embodiment of the method comprises: using all or a portion of the structure coordinates provided herein to generate a three-dimensional structure of a DPPIV-like binding pocket; employing the three-dimensional structure to design or select a potential ligand; synthesizing the potential ligand; and contacting the synthesized potential ligand with a protein comprising a DPPIV-like binding pocket to determine the ability of the potential ligand to interact with the protein. According to this method, the structure coordinates used may have a root mean square deviation of alpha-carbon atoms of less than 3 Å when superimposed on the alpha-carbon atom positions of the corresponding atomic coordinates of FIG. 3. The portion of the protein structure used optionally comprises all of the amino acids listed in Tables 1, 2, 3 and/or 4 that are present.
As noted previously, the three-dimensional structure of a DPPIV-like binding pocket need not be generated directly from structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
It is again noted that the root mean square deviation calculation may optionally be based on a comparison of main-chain or non-hydrogen atoms. Also, the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
A method is also provided for evaluating the ability of an entity, such as a compound or a protein to associate with a DPPIV-like binding pocket, the method comprising: constructing a computer model of a binding pocket defined by structure coordinates that have a root mean square deviation of less than 3.0 Å when compared to structure coordinates appearing in FIG. 3, the comparison being based on alpha-carbon atoms present in both sets of structure coordinates, the comparison also being limited to residues of DPPIV appearing in Tables 1, 2, 3 and/or 4 that are present; selecting an entity to be evaluated by a method selected from the group consisting of (i) assembling molecular fragments into the entity, (ii) selecting an entity from a small molecule database, (iii) de novo ligand design of the entity, and (iv) modifying a known ligand for DPPIV, or a portion thereof; performing a fitting program operation between computer models of the entity to be evaluated and the binding pocket in order to provide an energy-minimized configuration of the entity in the binding pocket; and evaluating the results of the fitting operation to quantify the association between the entity and the binding pocket model in order to evaluate the ability of the entity to associate with the said binding pocket.
The computer model of a binding pocket used in this embodiment need not be generated directly from structure coordinates. Rather, a computer model can be formed that defines a surface contour that is the same or similar to the surface contour defined by the structure coordinates.
According to the method, the root mean square deviation calculation may optionally be based on a comparison of main-chain atoms or non-hydrogen atoms. Also, the root mean square deviation of alpha-carbon atoms or non-hydrogen atoms may optionally be less than 2.7 Å, 2.5 Å, 2.0 Å, 1.5 Å, 1 Å, 0.5 Å, or less.
Also according to the method, the method may further include synthesizing the entity; and contacting a protein having a DPPIV-like binding pocket with the synthesized entity.
With the structure provided herein, the present invention for the first time permits the use of molecular design techniques to identify, select or design potential inhibitors of DPPIV, based on the structure of a DPPIV-like binding pocket. Such a predictive model is valuable in light of the high costs associated with the preparation and testing of the many diverse compounds that may possibly bind to the DPPIV protein.
According to this invention, a potential DPPIV inhibitor may now be evaluated for its ability to bind a DPPIV-like binding pocket prior to its actual synthesis and testing. If a proposed entity is predicted to have insufficient interaction or association with the binding pocket, preparation and testing of the entity can be obviated. However, if the computer modeling indicates a strong interaction, the entity may then be obtained and tested for its ability to bind.
A potential inhibitor of a DPPIV-like binding pocket may be computationally evaluated using a series of steps in which chemical entities or fragments are screened and selected for their ability to associate with the DPPIV-like binding pockets.
One skilled in the art may use one of several methods to screen entities (whether chemical or protein) for their ability to associate with a DPPIV-like binding pocket. This process may begin by visual inspection of, for example, a DPPIV-like binding pocket on a computer screen based on the DPPIV structure coordinates in FIG. 3 or other coordinates which define a similar shape generated from the machine-readable storage medium. Selected fragments or chemical entities may then be positioned in a variety of orientations, or docked, within that binding pocket as defined above. Docking may be accomplished using software such as Quanta and Sybyl, followed by energy minimization and molecular dynamics with standard molecular mechanics force fields, such as CHARMM and AMBER.
Specialized computer programs may also assist in the process of selecting entities. These include: GRID (P. J. Goodford, “A Computational Procedure for Determining Energetically Favorable Binding Sites on Biologically Important Macromolecules”, J. Med. Chem., 28, pp. 849-857 (1985)). GRID is available from Oxford University, Oxford, UK; MCSS (A. Miranker et al., “Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method.” Proteins: Structure, Function and Genetics, 11, pp. 29-34 (1991)). MCSS is available from Molecular Simulations, San Diego, Calif.; AUTODOCK (D. S. Goodsell et al., “Automated Docking of Substrates to Proteins by Simulated Annealing”, Proteins: Structure, Function, and Genetics, 8, pp. 195-202 (1990)). AUTODOCK is available from Scripps Research Institute, La Jolla, Calif.; and DOCK (I. D. Kuntz et al., “A Geometric Approach to Macromolecule-Ligand Interactions”, J. Mol. Biol., 161, pp. 269-288 (1982)). DOCK is available from University of California, San Francisco, Calif.
Once suitable entities have been selected, they can be designed or assembled. Assembly may be preceded by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of DPPIV. This may then be followed by manual model building using software such as Quanta or Sybyl [Tripos Associates, St. Louis, Mo].
Useful programs to aid one of skill in the art in connecting the individual chemical entities or fragments include: CAVEAT (P. A. Bartlett et al, “CAVEAT: A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules”, in “Molecular Recognition in Chemical and Biological Problems”, Special Pub., Royal Chem. Soc., 78, pp. 182-196 (1989); G. Lauri and P. A. Bartlett, “CAVEAT: a Program to Facilitate the Design of Organic Molecules”, J. Comput. Aided Mol. Des., 8, pp. 51-66 (1994)). CAVEAT is available from the University of California, Berkeley, Calif.; 3D Database systems such as ISIS (MDL Information Systems, San Leandro, Calif.). This area is reviewed in Y. C. Martin, “3D Database Searching in Drug Design”, J. Med. Chem., 35, pp. 2145-2154 (1992); and HOOK (M. B. Eisen et al, “HOOK: A Program for Finding Novel Molecular Architectures that Satisfy the Chemical and Steric Requirements of a Macromolecule Binding Site”, Proteins: Struct., Funct., Genet., 19, pp. 199-221 (1994)). HOOK is available from Molecular Simulations, San Diego, Calif.
Instead of proceeding to build an inhibitor of a DPPIV-like binding pocket in a step-wise fashion one fragment or entity at a time as described above, inhibitory or other DPPIV binding compounds may be designed as a whole or “de novo” using either an empty binding site or optionally including some portion(s) of a known inhibitor(s). There are many de novo ligand design methods including: LUDI (H.-J. Bohm, “The Computer Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors”, J. Comp. Aid. Molec. Design, 6, pp. 61-78 (1992)). LUDI is available from Molecular Simulations Incorporated, San Diego, Calif.; LEGEND (Y. Nishibata et al., Tetrahedron, 47, p. 8985 (1991)). LEGEND is available from Molecular Simulations Incorporated, San Diego, Calif.; LEAPFROG (available from Tripos Associates, St. Louis, Mo.); and SPROUT (V. Gillet et al, “SPROUT: A Program for Structure Generation”, J. Comput. Aided Mol. Design, 7, pp. 127-153 (1993)). SPROUT is available from the University of Leeds, UK.
Other molecular modeling techniques may also be employed in accordance with this invention (see, e.g., Cohen et al., “Molecular Modeling Software and Methods for Medicinal Chemistry”, J. Med. Chem., 33, pp. 883-894 (1990); see also, M. A. Navia and M. A. Murcko, “The Use of Structural Information in Drug Design”, Current Opinion in Structural Biology, 2, pp. 202-210 (1992); L. M. Balbes et al., “A Perspective of Modern Methods in Computer-Aided Drug Design”, in Reviews in Computational Chemistry, Vol. 5, K. B. Lipkowitz and D. B. Boyd, Eds., VCH, New York, pp. 337-380 (1994); see also, W. C. Guida, “Software For Structure-Based Drug Design”, Curr. Opin. Struct. Biology, 4, pp. 777-781 (1994)).
Once an entity has been designed or selected, for example, by the above methods, the efficiency with which that entity may bind to a DPPIV binding pocket may be tested and optimized by computational evaluation. For example, an effective DPPIV binding pocket inhibitor preferably demonstrates a relatively small difference in energy between its bound and free states (i.e., a small deformation energy of binding). Thus, the most efficient DPPIV binding pocket inhibitors should preferably be designed with deformation energy of binding of not greater than about 10 kcal/mole, more preferably, not greater than 7 kcal/mole. DPPIV binding pocket inhibitors may interact with the binding pocket in more than one of multiple conformations that are similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free entity and the average energy of the conformations observed when the inhibitor binds to the protein.
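The deformation energy criterion described above can be expressed as a short calculation. The following Python sketch is illustrative only. It assumes energies in kcal/mole supplied by some external modeling program, and it adopts the sign convention that the deformation energy is the average energy of the bound conformations minus the energy of the free entity. The function names and the example energies are hypothetical.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the free-state energy.

    free_energy    : energy of the free entity, kcal/mole
    bound_energies : energies of the conformations observed on binding
    Sign convention (an assumption): positive means the bound state is strained.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    return sum(bound_energies) / len(bound_energies) - free_energy

def classify(delta):
    """Compare a deformation energy with the 10 and 7 kcal/mole guidelines."""
    if delta <= 7.0:
        return "more preferred"
    if delta <= 10.0:
        return "preferred"
    return "outside preferred range"

# Made-up energies for an entity that binds in three similar conformations.
free = -42.0
bound = [-36.0, -35.0, -37.0]
delta = deformation_energy(free, bound)
label = classify(delta)
```

Averaging over the bound conformations follows the treatment described for inhibitors that bind in more than one conformation of similar overall binding energy.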
An entity designed or selected as binding to a DPPIV binding pocket may be further computationally optimized so that in its bound state it would preferably lack repulsive electrostatic interaction with the target enzyme and with the surrounding water molecules. Such non-complementary electrostatic interactions include repulsive charge-charge, dipole-dipole and charge-dipole interactions.
Specific computer software is available in the art to evaluate compound deformation energy and electrostatic interactions. Examples of programs designed for such uses include: Gaussian 94, revision C (M. J. Frisch, Gaussian, Inc., Pittsburgh, Pa. ©1995); AMBER, version 4.1 (P. A. Kollman, University of California at San Francisco, ©1995); QUANTA/CHARMM (Molecular Simulations, Inc., San Diego, Calif. ©1995); Insight II/Discover (Molecular Simulations, Inc., San Diego, Calif. ©1995); DelPhi (Molecular Simulations, Inc., San Diego, Calif. ©1995); and AMSOL (Quantum Chemistry Program Exchange, Indiana University). These programs may be implemented, for instance, using a Silicon Graphics workstation such as an Indigo² with “IMPACT” graphics. Other hardware systems and software packages will be known to those skilled in the art.
Another approach provided by this invention, is the computational screening of small molecule databases for chemical entities or compounds that can bind in whole, or in part, to a DPPIV binding pocket. In this screening, the quality of fit of such entities to the binding site may be judged either by shape complementarities or by estimated interaction energy [E. C. Meng et al., J. Comp. Chem., 13, 505-524 (1992)].
According to another embodiment, the invention provides compounds that associate with a DPPIV-like binding pocket produced or identified by various methods set forth above.
The structure coordinates set forth in FIG. 3 can also be used to aid in obtaining structural information about another crystallized molecule or molecular complex. This may be achieved by any of a number of well-known techniques, including molecular replacement.
For example, a method is also provided for utilizing molecular replacement to obtain structural information about a protein whose structure is unknown comprising the steps of: generating an X-ray diffraction pattern of a crystal of the protein whose structure is unknown; and generating a three-dimensional electron density map of the protein whose structure is unknown from the X-ray diffraction pattern by using at least a portion of the structure coordinates set forth in FIG. 3 as a molecular replacement model.
By using molecular replacement, all or part of the structure coordinates of the DPPIV provided by this invention (and set forth in FIG. 3) can be used to determine the structure of another crystallized molecule or molecular complex more quickly and efficiently than attempting an ab initio structure determination. One particular use includes use with other S9 proteases. Molecular replacement provides an accurate estimation of the phases for an unknown structure. Phases are a factor in the equations used to solve crystal structures, and they cannot be determined directly. Obtaining accurate values for the phases, by methods other than molecular replacement, is a time-consuming process that involves iterative cycles of approximations and refinements and greatly hinders the solution of crystal structures. However, when the crystal structure of a protein containing at least a homologous portion has been solved, the phases from the known structure provide a satisfactory estimate of the phases for the unknown structure.
Thus, this method involves generating a preliminary model of a molecule or molecular complex whose structure coordinates are unknown, by orienting and positioning the relevant portion of DPPIV according to FIG. 3 within the unit cell of the crystal of the unknown molecule or molecular complex so as best to account for the observed X-ray diffraction pattern of the crystal of the molecule or molecular complex whose structure is unknown. Phases can then be calculated from this model and combined with the observed X-ray diffraction pattern amplitudes to generate an electron density map of the structure whose coordinates are unknown. This, in turn, can be subjected to any well-known model building and structure refinement techniques to provide a final, accurate structure of the unknown crystallized molecule or molecular complex [E. Lattman, “Use of the Rotation and Translation Functions”, in Meth. Enzymol., 115, pp. 55-77 (1985); M. G. Rossmann, ed., “The Molecular Replacement Method”, Int. Sci. Rev. Ser., No. 13, Gordon & Breach, New York (1972)].
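The combination of model-derived phases with observed amplitudes may be sketched in one dimension. The Python below is a toy illustration under explicit assumptions: the "crystal" is one-dimensional, the constant scale factor (the reciprocal of the unit-cell volume) is omitted, the "observed" amplitudes are simulated rather than measured, and the phases come from a hypothetical single-atom model placed at fractional position 0.25. It is not a substitute for the rotation and translation searches of an actual molecular replacement program.

```python
import cmath
import math

def density_1d(amplitudes, phases, x):
    """Fourier synthesis at fractional position x.

    amplitudes, phases : dicts keyed by integer index h (negative and
                         positive), so that the result is real
    Computes the real part of sum_h |F(h)| * exp(i * (phase(h) - 2*pi*h*x)).
    """
    total = 0j
    for h, amplitude in amplitudes.items():
        total += amplitude * cmath.exp(1j * (phases[h] - 2.0 * math.pi * h * x))
    return total.real

# Phases calculated from a hypothetical one-atom model at x = 0.25.
x_model = 0.25
indices = range(-3, 4)
model_phases = {h: 2.0 * math.pi * h * x_model for h in indices}

# Simulated "observed" amplitudes (all equal, for illustration only).
observed_amplitudes = {h: 1.0 for h in indices}

# Evaluate the map on a coarse grid and locate its highest peak.
grid = [i / 20 for i in range(20)]
density_map = [density_1d(observed_amplitudes, model_phases, x) for x in grid]
peak_position = grid[density_map.index(max(density_map))]
```

In this toy case the map peaks at the position of the model atom, which illustrates why a sufficiently similar model supplies usable starting phases for an unknown structure.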
The structure of any portion of any crystallized molecule or molecular complex that is sufficiently homologous to any portion of DPPIV can be resolved by this method.
In one embodiment, the method of molecular replacement is utilized to obtain structural information about the present invention and any other DPPIV-like molecule. The structure coordinates of DPPIV, as provided by this invention, are particularly useful in solving the structure of other isoforms of DPPIV or DPPIV complexes.
The structure coordinates of DPPIV as provided by this invention are useful in solving the structure of DPPIV variants that have amino acid substitutions, additions and/or deletions (referred to collectively as “DPPIV mutants”, as compared to naturally occurring DPPIV). These DPPIV mutants may optionally be crystallized in co-complex with a ligand, such as an inhibitor, substrate analogue or a suicide substrate. The crystal structures of a series of such complexes may then be solved by molecular replacement and compared with that of DPPIV. Potential sites for modification within the various binding sites of the enzyme may thus be identified. This information provides an additional tool for determining the most efficient binding interactions such as, for example, increased hydrophobic interactions, between DPPIV and a ligand. It is noted that the ligand may be the protein's natural ligand or may be a potential agonist or antagonist of a protein.
All of the complexes referred to above may be studied using well-known X-ray diffraction techniques and may be refined versus 1.5-3 Å resolution X-ray data to an R value of about 0.22 or less using computer software, such as X-PLOR [Yale University, ©1992, distributed by Molecular Simulations, Inc.; see, e.g., Blundell & Johnson, supra; Meth. Enzymol., Vol. 114 & 115, H. W. Wyckoff et al., eds., Academic Press (1985)]. This information may thus be used to optimize known DPPIV inhibitors, and more importantly, to design new DPPIV inhibitors.
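The R value mentioned above is conventionally computed as the sum of the absolute differences between observed and calculated structure factor amplitudes, divided by the sum of the observed amplitudes. The Python sketch below illustrates that conventional definition only. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the amplitude values are made up.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R value.

    f_obs, f_calc : equal-length sequences of observed and calculated
                    structure factor amplitudes for the same reflections
    Assumes the two sets are already on a common scale.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and equal in length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Made-up amplitudes for four reflections.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 60.0, 20.0, 30.0]
r_value = r_factor(f_obs, f_calc)
meets_target = r_value <= 0.22
```

A lower R value indicates closer agreement between the refined model and the diffraction data.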
The structure coordinates described above may also be used to derive the dihedral angles, phi and psi, that define the conformation of the amino acids in the protein backbone. As will be understood by those skilled in the art, the phiₙ angle refers to the rotation around the bond between the alpha-carbon and the nitrogen, and the psiₙ angle refers to the rotation around the bond between the carbonyl carbon and the alpha-carbon. The subscript “n” identifies the amino acid whose conformation is being described [for a general reference, see Blundell and Johnson, Protein Crystallography, Academic Press, London, 1976].
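The derivation of phi and psi from structure coordinates can be sketched with a general four-point dihedral calculation. The Python below is a minimal illustration. The helper names and the dictionary layout for backbone atoms are hypothetical, and the sign of the angle follows the usual convention in which a trans arrangement gives 180 degrees and a cis arrangement gives 0 degrees.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(prev_res, res, next_res):
    """phi and psi of `res`; each residue is a dict with "N", "CA", "C"
    coordinates (a hypothetical layout).

    phi rotates about the N-CA bond:  C(n-1), N(n), CA(n), C(n)
    psi rotates about the CA-C bond:  N(n), CA(n), C(n), N(n+1)
    """
    phi = dihedral(prev_res["C"], res["N"], res["CA"], res["C"])
    psi = dihedral(res["N"], res["CA"], res["C"], next_res["N"])
    return phi, psi

# Geometric checks on made-up points.
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
```

Applied along a chain, `phi_psi` yields one pair of angles for each residue that has both a preceding and a following neighbor.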
9. Uses of the Crystal and Diffraction Pattern of DPPIV
Crystals, crystallization conditions and the diffraction pattern of DPPIV that can be generated from the crystals also have a range of uses. One particular use relates to screening entities that are not known ligands of DPPIV for their ability to bind to DPPIV. For example, with the availability of crystallization conditions, crystals and diffraction patterns of DPPIV provided according to the present invention, it is possible to take a crystal of DPPIV; expose the crystal to one or more entities that may be a ligand of DPPIV; and determine whether a ligand/DPPIV complex is formed. The crystals of DPPIV may be exposed to potential ligands by various methods, including but not limited to, soaking a crystal in a solution of one or more potential ligands or co-crystallizing DPPIV in the presence of one or more potential ligands. Given the structure coordinates provided herein, once a ligand complex is formed, the structure coordinates can be used as a model in molecular replacement in order to determine the structure of the ligand complex.
Once one or more ligands are identified, structural information from the ligand/DPPIV complex(es) may be used to design new ligands that bind tighter, bind more specifically, have better biological activity or have better safety profile than known ligands.
In one embodiment, a method is provided for identifying a ligand that binds to DPPIV comprising: (a) attempting to crystallize a protein that comprises a sequence with 70, 80, 90, 95% or greater identity with SEQ. ID No. 1 in the presence of one or more entities; (b) if crystals of the protein are obtained in step (a), obtaining an X-ray diffraction pattern of the protein crystal; and (c) determining whether a ligand/protein complex was formed by comparing an X-ray diffraction pattern of a crystal of the protein formed in the absence of the one or more entities to the crystal formed in the presence of the one or more entities.
In another embodiment, a method is provided for identifying a ligand that binds to DPPIV comprising: soaking a crystal of a protein that comprises a sequence with 70, 80, 90, 95% or greater identity with SEQ. ID No. 1 with one or more entities; determining whether a ligand/protein complex was formed by comparing an X-ray diffraction pattern of a crystal of the protein that has not been soaked with the one or more entities to the crystal that has been soaked with the one or more entities.
Optionally, the method may further comprise converting the diffraction patterns into electron density maps using phases of the protein crystal and comparing the electron density maps.
Libraries of “shape-diverse” compounds may optionally be used to allow direct identification of the ligand-receptor complex even when the ligand is exposed as part of a mixture. According to this variation, the need for time-consuming de-convolution of a hit from the mixture is avoided. More specifically, the calculated electron density function reveals the binding event, identifies the bound compound and provides a detailed 3-D structure of the ligand-receptor complex. Once a hit is found, one may optionally also screen a number of analogs or derivatives of the hit for tighter binding or better biological activity by traditional screening methods. The hit and information about the structure of the target may also be used to develop analogs or derivatives with tighter binding or better biological activity. It is noted that the ligand-DPPIV complex may optionally be exposed to additional iterations of potential ligands so that two or more hits can be linked together to make a more potent ligand. Screening for potential ligands by co-crystallization and/or soaking is further described in U.S. Pat. No. 6,297,021, which is incorporated herein by reference.
EXAMPLES
Example 1 Expression and Purification of DPPIV
This example describes the expression of DPPIV. It should be noted that a variety of other expression systems and hosts are also suitable for the expression of DPPIV, as would be readily appreciated by one of skill in the art.
The portion of the gene encoding residues 39-766 (from SEQ. ID No. 1), which corresponds to the extracellular portion of human DPPIV, was isolated by PCR from spleen cDNA and cloned into the BamH I and Hind III sites of a modified pFastBacHTb vector. This vector encodes a baculovirus glycoprotein gp67 signal peptide sequence followed by a 6x-histidine tag sequence followed by the DPPIV sequence. Expression in this vector allowed for the production of secreted recombinant DPPIV with part of a gp67 signal sequence and a 6x-histidine tag, the sequence of which is shown in FIG. 1 (part of a gp67 signal sequence and 6x-histidine tag sequence underlined) (SEQ. ID No. 3).
Recombinant baculovirus genomic DNAs incorporating the DPPIV cDNA sequences were generated by transposition using the Bac-to-Bac system (Gibco-BRL). Infectious extracellular virus particles were obtained by transfection of a 2 ml adherent culture of Spodoptera frugiperda Sf9 insect cells with the recombinant viral genomic DNA. Growth in ESF 921 protein free medium (Expression Systems) was for 3 days at 27° C. The resulting passage 1 viral supernatant was used to obtain passage 2 high titer viral stock (HTS) by infection of a 2 ml adherent culture of Spodoptera frugiperda Sf9 insect cells grown under similar conditions. Passage 2 HTS was used in turn to infect a 100 ml suspension culture of Spodoptera frugiperda Sf9 insect cells in order to generate passage 3 HTS. The production of recombinant DPPIV proteins was carried out by using the passage 3 HTS at a multiplicity of infection (MOI) of approximately 5 to infect 0.5-5 liter cultures of Trichoplusia ni Hi5 insect cells (Invitrogen) at a cell density of (1.5-8)×10^6 cells/ml (grown in ESF 921 protein free medium). Infected cell cultures were grown in both shake flasks and in Wave Bioreactors (Wave Biotech) for 48 hours at 27° C. prior to harvest. In some instances infected cultures of Spodoptera frugiperda Sf9 insect cells were used to produce recombinant DPPIV under similar conditions. Following harvest, the cell cultures were centrifuged to pellet whole cells.
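The infection step above fixes the MOI and the cell density but not the viral titer, so the volume of passage 3 HTS required depends on a titer that must be measured separately. A minimal Python sketch of the arithmetic follows. The function name is hypothetical, and the titer of 1×10^8 pfu/ml is an assumed illustrative value that does not appear in this example.

```python
def viral_stock_volume_ml(moi, cell_density_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Volume of viral stock (ml) needed to infect a culture at a given MOI."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    total_cells = cell_density_per_ml * culture_volume_ml
    # MOI is infectious particles per cell, so particles needed = MOI * cells.
    return moi * total_cells / titer_pfu_per_ml


# MOI of 5 and the lower cell density from the text; a 1 liter culture.
# The titer (1e8 pfu/ml) is an assumption for illustration only.
volume = viral_stock_volume_ml(moi=5, cell_density_per_ml=1.5e6,
                               culture_volume_ml=1000, titer_pfu_per_ml=1e8)
```

Under these assumed inputs the sketch gives 75 ml of stock per liter of culture; a higher measured titer would reduce the volume proportionally.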
The secreted glycosylated recombinant protein was isolated from the cell culture medium by diafiltration using cross-flow ultrafiltration, followed by passage over a nickel chelate resin and optionally polished by size exclusion chromatography.
In a typical batch prep, 5 L of cell culture supernatant was concentrated to 0.1 L on a 10 kDa NMWCO Omega Ultrasette (Pall Life Sciences) using a Masterflex L/S pump fitted with PharMed #15 tubing at a cross flow of approximately 1 L/minute and an inlet feed pressure of 1.5 to 2.0 bar, generating an initial permeate flow of up to 70 ml/minute. The retentate was diluted two to three fold by adding 25 mM Tris/HCl pH 7.9, 0.4 M NaCl and reconcentrated to 0.1 L. This process was repeated at least twice, after which the concentrate was quantitatively removed from the system, centrifuged when necessary (15 minutes at 4000 rpm in an Allegra (Beckman) centrifuge) and added to approximately 8 ml of a preconditioned 50% slurry of Probond (Invitrogen) divided over three or four 50 ml conical tubes. The tubes were rotated for at least 1 hour, after which the resin was washed with 10 resin volumes of 50 mM Potassium Phosphate pH 7.9, 0.4 M NaCl, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP). The resin was poured into 1 cm ID glass columns (Omnifit) and washed with 50 column volumes of 50 mM Potassium Phosphate pH 7.9, 0.4 M NaCl, 20 mM imidazole, 0.25 mM TCEP. After a wash with 5 column volumes of 50 mM Tris pH 7.9, 0.4 M NaCl, 0.25 mM TCEP, the product was eluted with 4 column volumes of 50 mM Tris pH 7.9, 0.4 M NaCl, 200 mM imidazole, 0.25 mM TCEP.
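The extent of buffer exchange achieved by the dilute-and-reconcentrate cycles described above can be estimated with simple arithmetic. The Python sketch below assumes that small medium components pass freely through the 10 kDa membrane while the protein is fully retained; that idealization, and the function names, are assumptions made for illustration.

```python
def concentration_factor(start_volume_l, end_volume_l):
    """Fold concentration of the retained protein."""
    return start_volume_l / end_volume_l


def residual_fraction(dilution_fold, cycles):
    """Fraction of freely permeating original solutes left after the given
    number of dilute-and-reconcentrate cycles (idealized membrane)."""
    if dilution_fold <= 0 or cycles < 0:
        raise ValueError("dilution_fold must be positive and cycles non-negative")
    return (1.0 / dilution_fold) ** cycles


fold = concentration_factor(5.0, 0.1)    # 5 L supernatant concentrated to 0.1 L
two_fold_twice = residual_fraction(2, 2)    # two cycles at two-fold dilution
three_fold_twice = residual_fraction(3, 2)  # two cycles at three-fold dilution
```

On these assumptions, two cycles leave roughly 11% to 25% of the original small-molecule components in the retentate, and each further cycle lowers that by another factor of two to three.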
It is noted that the polyhistidine tag may optionally be removed; however, in this instance, the polyhistidine tag was left as a fusion. It is also noted that for the purification of non-secreted proteins, leupeptin is added at 1 mg/L to all the buffers used during the immobilized metal affinity chromatography (IMAC) process and that, for simplicity, the same is sometimes done when purifying DPPIV.
After concentrating to 7.5 mg/ml or higher by centrifugal ultrafiltration (10 kDa NMWCO, VivaScience), DPPIV was purified over a BioSep Sec S3000 column (200 mm×21.2 mm, Phenomenex) at 8 ml/minute to remove oligomeric forms. The column was set up in a Summit HPLC system (Dionex) managed by Chromeleon software (Dionex) and equilibrated with 25 mM Tris pH 7.6, 150 to 250 mM NaCl (optionally with 0.25 mM TCEP and 1 mM EDTA). In cases when the size exclusion step was omitted, centrifugal ultrafiltration (10 kDa NMWCO) was used for the buffer exchange to the required formulation buffer. The process was carried out at 2-10° C. and DPPIV was stored at the same temperature. For long-term storage, it was kept at −80° C. The purity of DPPIV was estimated by SDS-PAGE and IEF to be at least 95%. Glycosylation was confirmed by a molecular mass shift, determined by SDS-PAGE, following treatment with endo-beta-N-acetylglucosaminidase F (Endo-F1 enzyme), and by carbohydrate analysis.
DPPIV with seleno-L-methionine substitution was prepared as follows: two 5 liter Wave Bioreactor cultures of Trichoplusia ni Hi5 insect cells in ESF 921 Protein Free medium were infected and grown for 16 hours at 27° C. At that time the cells were pelleted by centrifugation at 480 g and 20° C. for 15 minutes. The supernatant was discarded and the cells resuspended in 2×5 liters of ESF 921 Protein Free Methionine-Free medium (Expression Systems). The resuspended cells were placed in two new 5 liter Wave Bioreactors and growth continued for 4 h at 27° C. Seleno-L-methionine (prepared as a 25 mg/ml solution in water and sterile-filtered) was then added to each culture to a final concentration of 50 mg/l. Cell growth was continued for a further 48 h prior to harvest. Purification of the protein was as described above and included the size exclusion chromatography step. Mass spectrometric peptide analysis was used to estimate the seleno-L-methionine substitution of methionine residues at approximately 34%.
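The seleno-L-methionine addition described above reduces to a simple dosing calculation. A minimal Python sketch follows, using the stock and final concentrations stated in the text. The function name is hypothetical, and the small increase in culture volume caused by adding the stock is ignored.

```python
def stock_volume_ml(final_conc_mg_per_l, culture_volume_l, stock_conc_mg_per_ml):
    """Volume of stock solution (ml) needed to reach a final concentration,
    neglecting the volume contributed by the stock itself."""
    if stock_conc_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    mass_needed_mg = final_conc_mg_per_l * culture_volume_l
    return mass_needed_mg / stock_conc_mg_per_ml


# 50 mg/l final concentration in one 5 liter culture, from a 25 mg/ml stock.
per_culture_ml = stock_volume_ml(50, 5, 25)
both_cultures_ml = 2 * per_culture_ml
```

That is, about 10 ml of the 25 mg/ml stock (250 mg of seleno-L-methionine) per 5 liter culture.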
Example 2 Crystallization of DPPIV
This example describes the crystallization of DPPIV (SEQ ID NO:3). It is noted that the precise crystallization conditions used may be further varied, for example by performing a fine screen based on these crystallization conditions.
Crystals were obtained after an extensive and broad screen of conditions, followed by optimization. Diffraction quality crystals were grown in 100 nL sitting droplets using the vapor diffusion method. 50 nL comprising the apo DPPIV complex (SEQ ID NO:3) (between 8 and 30 mg/ml) was mixed with 50 nL from a reservoir solution (100 μL) comprising: 0.1M Tris-HCl, pH=7.5; 27% MPEG 2000; and 0.35M sarcosine/10% xylitol. The resulting solution was incubated over a period of two weeks at 4° C.
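Because the protein and reservoir solutions are mixed in equal volumes, each component starts at half its stock concentration in the drop before vapor diffusion begins. The Python sketch below illustrates this. The function name is hypothetical, and any buffer components carried in with the protein solution are ignored.

```python
def initial_drop_concentration(stock_conc, stock_volume_nl, other_volume_nl):
    """Concentration of a component immediately after mixing two solutions,
    assuming it is present in only one of them."""
    total = stock_volume_nl + other_volume_nl
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    return stock_conc * stock_volume_nl / total


# 50 nL protein solution + 50 nL reservoir solution, as in the text.
protein_low = initial_drop_concentration(8.0, 50, 50)    # mg/ml
protein_high = initial_drop_concentration(30.0, 50, 50)  # mg/ml
mpeg_percent = initial_drop_concentration(27.0, 50, 50)  # % MPEG 2000
tris_molar = initial_drop_concentration(0.1, 50, 50)     # M Tris-HCl
```

During equilibration against the 100 μL reservoir, the drop loses water and these concentrations rise toward the reservoir values, which is what drives the protein toward supersaturation.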
Crystals typically appeared after 3-5 days and grew to a maximum size within 7-10 days. Single crystals were transferred, briefly, into a cryoprotecting solution containing the reservoir solution supplemented with 25% v/v ethylene glycol. Crystals were then flash frozen by immersion in liquid nitrogen and then stored under liquid nitrogen. A crystal of apo DPPIV (SEQ ID NO:3) produced as described is illustrated in FIG. 2.
While the present invention is disclosed with reference to certain embodiments and examples detailed above, it is to be understood that these embodiments and examples are intended to be illustrative rather than limiting, as it is contemplated that modifications will readily occur to those skilled in the art, which modifications are intended to be within the scope of the invention and the appended claims. All patents, papers, and books cited in this application are incorporated herein in their entirety.

Claims (12)

1. A composition comprising a protein in crystalline form, wherein the protein consists of SEQ ID NO:3, and wherein the protein crystal has a crystal lattice in a P21 space group and unit cell dimensions, +/−5%, of a=121.53 Å b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°.
2. A composition according to claim 1 wherein the protein crystal diffracts X-rays for a determination of structure coordinates to a resolution less than 3.0 Angstroms.
3. A method for forming a crystal of a protein comprising:
forming a crystallization volume comprising a precipitant solution and a protein that consists of SEQ ID NO:3, and wherein the protein crystal has a crystal lattice in a P21 space group and unit cell dimensions, +/−5%, of a=121.53 Å b=124.11 Å and c=144.42 Å, α=γ=90°, β=114.6°; and
storing the crystallization volume under conditions suitable for crystal formation of the protein.
4. A method according to claim 3 wherein is expressed from a nucleic acid molecule that comprises SEQ ID NO:2.
5. A method according to claim 3 wherein the protein crystal diffracts X-rays for a determination of structure coordinates to a resolution less than 3.0 Angstroms.
6. A non-crystalline protein consisting of SEQ ID NO:3.
7. A protein according to claim 6 where the protein is expressed from a nucleic acid molecule that comprises SEQ ID NO:2.
8. A non-crystalline protein consisting of residues 39-766 of SEQ ID NO. 1.
9. The protein according to claim 8 wherein the protein is expressed from a nucleic acid molecule that consists of SEQ ID NO:2.
10. An isolated non-crystalline protein consisting of residues 39-766 of SEQ ID NO:1.
11. An isolated non-crystalline protein consisting of SEQ ID NO:3.
12. A method of obtaining the three-dimensional structure of the protein of SEQ ID NO: 3:
(a) Crystallize a protein consisting of SEQ ID NO: 3 to obtain a protein crystal having a crystal lattice in a P21 space group and unit cell dimensions, +/−5%, of a=121.53 Å, b=124.11 Å, and c=144.42 Å, α=γ=90°, and β=114.6;
(b) Use the crystal of (a) to obtain an X-ray diffraction pattern; and
(c) Solve the three-dimensional structure of the protein from the diffraction pattern, and thereby obtain the three-dimensional structure.
US10/659,055 2002-09-09 2003-09-09 Crystallization of dipeptidyl peptidase IV (DPPIV) Expired - Fee Related US7344852B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/659,055 US7344852B1 (en) 2002-09-09 2003-09-09 Crystallization of dipeptidyl peptidase IV (DPPIV)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US40920602P 2002-09-09 2002-09-09
US10/659,055 US7344852B1 (en) 2002-09-09 2003-09-09 Crystallization of dipeptidyl peptidase IV (DPPIV)

Publications (1)

Publication Number Publication Date
US7344852B1 true US7344852B1 (en) 2008-03-18

Family

ID=39182200

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/659,055 Expired - Fee Related US7344852B1 (en) 2002-09-09 2003-09-09 Crystallization of dipeptidyl peptidase IV (DPPIV)

Country Status (1)

Country Link
US (1) US7344852B1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050260732A1 (en) * 2002-07-29 2005-11-24 Hajime Hiramatsu Three-dimensional structure of dipeptidyl peptidase IV

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Gilliland et al. Crystallization of biological molecules for X-ray diffraction studies. Current Opinion in Structure Biology 1996, 6, 595-603. *
Ke et al. Crystallization of RNA and RNA-protein complexes. Methods 34, 2004, 408-414. *
Wiencek et al. New strategies for protein crystal growth. Ann. Rev. Biomed. Eng. 1999, 1, 505-534. *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2366394A1 (en) * 2010-03-17 2011-09-21 IMTM GmbH Characterization and validation of inhibitors and ligands of dipeptidyl aminopeptidase IV (DP IV)
WO2011113895A3 (en) * 2010-03-17 2011-12-08 Imtm Gmbh Characterization and validation of inhibitors and ligands of dipeptidyl aminopeptidase iv (dp iv)
US8282412B1 (en) * 2011-04-28 2012-10-09 Hitachi Cable, Ltd. Flat cable and connection structure between flat cable and printed wiring board
US20120276772A1 (en) * 2011-04-28 2012-11-01 Hitachi Cable, Ltd. Flat cable and connection structure between flat cable and printed wiring board
WO2014045254A2 (en) 2012-09-23 2014-03-27 Erasmus University Medical Center Rotterdam Human betacoronavirus lineage c and identification of n-terminal dipeptidyl peptidase as its virus receptor
EP3741846A2 (en) 2012-09-23 2020-11-25 Erasmus University Medical Center Rotterdam Human betacoronavirus lineage c and identification of n-terminal dipeptidyl peptidase as its virus receptor

Legal Events

Date Code Title Description
AS Assignment

Owner name: SYRRX, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AERTGAERTS, KATHLEEN;CRONIN, CLARAN N.;HOSFIELD, DAVID J.;AND OTHERS;REEL/FRAME:015184/0341;SIGNING DATES FROM 20040322 TO 20040405

AS Assignment

Owner name: TAKEDA PHARMACEUTICAL COMPANY LIMITED, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TAKEDA SAN DIEGO, INC.;REEL/FRAME:017183/0920

Effective date: 20060111

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20120318