PDE2 CRYSTAL STRUCTURES FOR STRUCTURE BASED DRUG DESIGN
Field of Invention
The present invention relates to crystalline compositions of mammalian 3',5'-Cyclic Nucleotide Phosphodiesterase Type 2 (PDE2); to amino acid sequences utilized to form said crystalline compositions; and to methods of determining the 3-D X-ray atomic coordinates of the formed crystals. The invention is further directed to methods of identifying ligands of PDE2 using structure based drug design and to the use of inhibitors so identified for treatment of disease states or disorders mediated by PDE2.
Background
Cyclic nucleotide second messengers (cAMP and cGMP) play a central role in signal transduction and regulation of physiologic responses. Their intracellular levels are controlled by the family of cyclic nucleotide phosphodiesterase (PDE) enzymes. The PDE family comprises metallophosphohydrolases (utilizing, e.g., Mg2+ and Zn2+) that specifically cleave the 3',5'-cyclic phosphate moiety of cAMP and/or cGMP to produce the corresponding 5'-nucleotide. The sensitivity of physiological processes to cAMP/cGMP signals requires that their levels be maintained within a narrow range in order to provide for optimal responsiveness in a cell. Cyclic nucleotide PDEs provide the major pathway for channeling the cyclic nucleotide signals for the cell.
Members of the PDE family differ in their tissue distributions, physicochemical properties, substrate and inhibitor specificities and regulatory mechanisms. Based on differences in primary structure of known PDEs, they have been subdivided into two major classes, class I and class II. A class III of PDEs has recently been disclosed. Richter, W., Proteins: Structure, Function and Genetics, 46:278-286 (2002). Class I contains the largest number of PDEs and includes all known mammalian PDEs. Known class I PDEs are contained within cells and vary in subcellular distribution, with some being primarily associated with the particulate fraction or the cytoplasmic fraction of the cell, and others being evenly distributed in both compartments.
PDEs from mammalian tissues have been divided into 11 families that are derived from separate gene families. The families are named PDE1, PDE2, PDE3, ... to PDE11. Within each family, there may be isoenzymes such as PDE1A, PDE1B and PDE1C, and PDE10A1 and PDE10A2. PDEs within a given family may differ, but the members of each family are functionally related to each other through some similarities in amino acid sequences, specificities and affinities for cGMP and cAMP or accommodation of both, inhibitor specificities, and regulatory mechanisms. Conti et al., Prog. Nuc. Acid Res. Mol. Biol., 63:1-52 (2000).
Comparison of the amino acid sequences of PDEs suggests that all PDEs may be chimeric multidomain proteins possessing distinct domains that provide for catalysis and a number of regulatory functions. The amino acid sequences of mammalian PDEs identified to date include a conserved region located in the carboxy terminal portion of the proteins. Charbonneau et al., PNAS, 83:9308-9312 (1986). The conserved domain includes the catalytic site for cAMP and/or cGMP hydrolysis and two putative metal binding sites as well as family specific determinants. Beavo, J.A., Physiol. Rev., 75:725-748 (1995); and Francis et al., J. Biol. Chem., 269:22477-22480 (1994). The amino terminal regions of the various PDEs are highly variable and include other family specific determinants such as: (i) calmodulin binding sites; (ii) non-catalytic cGMP binding sites; (iii) membrane targeting sites; (iv) hydrophobic membrane association sites; and (v) phosphorylation sites for either the calmodulin-dependent kinase (II), the cAMP-dependent kinase, or the cGMP-dependent kinase. Beavo, J.A., Physiol. Rev., 75:725-748 (1995); Manganiello et al., Arch. Biochem. Acta., 322:1-13 (1995); and Conti et al., Physiol. Rev., 75:723-748 (1995).
It has been demonstrated that human PDE2 mRNA is expressed in a variety of tissue types. The phosphodiesterase enzyme family controls intracellular levels of the second messengers cAMP and cGMP through regulation of their hydrolysis. PDE2 possesses a low affinity catalytic domain and an allosteric domain specific for cGMP. The low affinity catalytic site can hydrolyze both cAMP and cGMP, with a lower apparent Km for cGMP than for cAMP. However, when cGMP binds to the allosteric site, the catalytic site undergoes a conformational change showing high affinity for cAMP. PDE2 exists as a homodimer that binds two molecules of cGMP per homodimer at the allosteric site. PDE2 shows its highest expression in the brain but is also found in many other tissues; inhibitors of PDE2 therefore have a broad array of functions and potential therapeutic utility at these sites. Beavo, J.A. et al., Rev. Physio. Biochem. Pharm., 135:67 (1999); and Soderling and Beavo, Curr. Opin. Cell Biol., 12:74 (2000).
The design and synthesis of various PDE2 inhibitors have been reported in the patent and periodical literature. Selective PDE2 inhibitors are particularly suitable for improving perception, concentration, learning or memory after cognitive disturbances as occur in particular in situations/diseases/syndromes such as mild cognitive impairment, age-associated learning and memory disturbances, age-associated memory losses, vascular dementia, craniocerebral trauma, stroke, dementia occurring after strokes (post stroke dementia), post-traumatic craniocerebral trauma, general disturbances of concentration, disturbances of concentration in children with learning and memory problems, Alzheimer's disease, dementia with Lewy bodies, dementia with degeneration of the frontal lobes including Pick's syndrome, Parkinson's disease, progressive nuclear palsy, dementia with corticobasal degeneration, amyotrophic lateral sclerosis (ALS), Huntington's disease, multiple sclerosis, thalamic degeneration, Creutzfeldt-Jakob dementia, HIV dementia, schizophrenia with dementia or Korsakoff psychosis.
PDE2 inhibitors have been noted to have therapeutic potential in neuronal development, learning and memory (van Staveren et al., Brain Res., 888:275 (2001) and O'Donnell et al., J. Pharm. Exp. Ther., 302:249 (2002)); prolactin and aldosterone secretion (Velardez et al., Eur. J.
Endo., 143:279 (2000) and Gallo-Payet et al., Endo., 140:3594 (1999)); immunological response (Houslay et al., Cell. Signal., 8:97 (1996)); vascular angiogenesis (Keravis et al., J. Vasc. Res., 37:235 (2000)); inflammatory cell transit (Wolda et al., J. Histochem. Cytochem., 47:895 (1999)); cardiac contraction (Fischmeister et al., J. Clin. Invest., 99:2710 (1997); Donzeau-Gouge et al., J. Physiol., 533:329 (2001) and Paterson et al., Card. Res., 52:446 (2001)); platelet aggregation (Haslam et al., Biochem. J., 323:371 (1997)); hypoxic pulmonary vasoconstriction (Haynes et al., J. Pharm. Exp. Ther., 276:752 (1996)) and olfactory signal transduction (Ma et al., J. Neurosci., 23:317 (2003)).
Amino acid and nucleotide sequences for PDE2 have been provided by Charbonneau et al., PNAS, 83:9308 (1986); Trong et al., Biochemistry, 29:10280 (1990); Tanaka et al., Second Messengers and Phosphoproteins, 13:87-98 (1991); Sonnenberg et al., J. Biol. Chem., 266:17655 (1991); Yang et al., Biochem. Biophys. Res. Comm., 205:1850-1858 (1994); and Rosman et al., Gene, 191:89-95 (1997). See also GenBank® Accession Nos: U67733; U21101; L49503; M73512; and BC006845.
An assay to determine the catalytic activity of PDE2 was disclosed by Beavo, J.A., J. Biol. Chem., 245:5649 (1990). Assays to screen compounds for therapeutic use as inhibitors of PDE2 were, prior to the present invention, limited to the reported full length or functional fragments of purified or recombinant polypeptides of PDE2. Beavo and Reifsnyder, TIPS, 11:150 (1990).
More recently, crystal studies of mammalian PDEs have been reported. See Martinez et al., PNAS, 99:13260-13265 (2002); Zhu et al., Protein Sci., 10:1481-1489 (2001); Chang et al., Biochem. Biophys. Res. Comm., 307:1045-1050 (2003); Xu et al., Science, 288:1822-1825 (2000); Lee et al., FEBS, 530:53-58 (2002); Shailaja et al., FEBS, 539:161-166 (2003); and Sung et al., Nature, 425:98-102 (2003).
Summary of the Invention
The present invention relates generally to crystalline compositions of PDE2, and specifically of the catalytic region of PDE2; and to amino acid sequences utilized for preparing said crystalline compositions. The invention further relates to methods of determining the 3-D X-ray atomic coordinates of the formed crystals and methods of using said atomic coordinates in conjunction with computational methods in structure based drug design. The crystalline compositions of the present invention are utilized for screening and identifying inhibitors of PDE2 and, to some extent, of PDE4. The inhibitors or chemical compounds so identified are optimally utilized as pharmaceutical compositions for treatment of diseases or disorders mediated by PDE2, including therapeutic interventions for Female Sexual Arousal Disorder (FSAD), bone recovery and osteoporosis, and CNS indications. In a preferred embodiment the invention is directed to crystalline compositions of the catalytic region of human PDE2A.
In certain embodiments, the method further comprises refining and evaluating full or partial 3-D coordinates. This method may thus be used to generate 3-dimensional structures for proteins for which 3-dimensional atomic coordinates have not heretofore been determined. Depending on the extent of sequence homology, the newly generated structure may help to elucidate enzymatic mechanisms, or be used in conjunction with other molecular modeling techniques in structure based drug design.
In another aspect, the present invention provides a method for identifying inhibitors, ligands, and the like of PDE2 by providing the coordinates of a molecule of PDE2 to a computerized modeling system; identifying chemical entities that are likely to bind to or interfere with the molecule, e.g., by screening a small molecule library; and, optionally, procuring or synthesizing and assaying the compounds or analogues derived therefrom for bioactivity. In certain embodiments, the information obtained by this method is used to iteratively refine or modify the structure of the original ligand. Thus, once a ligand is found to modulate the activity of said enzyme, the structural aspects of the ligand may be modified to generate a structural analog of the ligand. This analog can then be used in the above method to identify additional ligands. One of ordinary skill in the art will know the various ways a structure may be modified. In certain embodiments, the ligand is a selective inhibitor of PDE2.
Thus, in a first aspect, the present invention relates to Phosphodiesterase Type 2 (PDE2) crystalline compositions, particularly human PDE2A.
In a second aspect, the present invention relates to crystals of a PDE2-ligand complex.
In a third aspect, the present invention relates to polypeptides comprising the amino acid sequence set forth in SEQ ID NO:4, particularly Ser578 to Glu919, or a homologue, functional fragment, variant or analogue thereof, wherein the molecules are arranged in a crystalline manner belonging to space group C222₁ with unit cell dimensions a=90.033 Å, b=102.549 Å, c=81.598 Å, α=β=γ=90°, and which effectively diffracts X-rays for determination of the atomic coordinates of PDE2 polypeptide to a resolution of about 1.7 Å. The N-term well defined form has a unit cell of a=89.898 Å, b=102.093 Å, c=80.733 Å, α=β=γ=90° in space group C222₁, and effectively diffracts X-rays for determination of the atomic coordinates of PDE2 polypeptide to a resolution of about 2.6 Å.
In a fourth aspect, the present invention relates to polypeptides consisting essentially of the catalytic domain of PDE2 as provided in SEQ ID NO:2 or SEQ ID NO:4 and to nucleic acid sequences encoding said polypeptides.
In a fifth aspect, the present invention relates to computers for producing a three-dimensional representation of a polypeptide with an amino acid sequence spanning amino acids Ser578 to Gly934 of SEQ ID NO:2; or amino acids Ser578 to Glu919 of SEQ ID NO:4; or a homologue, functional fragment, variant, or analogue thereof. The numbering of the amino acids described herein utilizes the numbering convention provided by Rosman et al., Gene, 191:89-95 (1997) and in GenBank Accession No: U67733, except for a difference at amino acid 715, where Rosman shows a Val and Applicants show an Ala. The three dimensional representation is optimally utilized in structural analysis with a computer-readable data storage medium comprising a data storage material encoded with computer-readable data, where said data comprises the structure coordinates of FIG. 5, or portions thereof, a working memory for storing instructions for processing said computer-readable data, a central-processing unit coupled to said working memory and to said computer-readable data storage medium for processing said computer-readable data into said three-dimensional representation, and a display coupled to said central-processing unit for displaying said representation.
In a sixth aspect, the present invention relates to computers for producing a three-dimensional representation of a molecule or molecular complex comprising the atomic coordinates in FIG. 5, wherein said computer comprises a computer-readable data storage medium comprising a data storage material encoded with computer-readable data, wherein said data comprises the structure coordinates of FIG. 5, or portions thereof, a working memory for storing instructions for processing said computer-readable data, a central-processing unit coupled to said working memory and to said computer-readable data storage medium for processing said computer-readable data into said three-dimensional representation, and a display coupled to said central-processing unit for displaying said representation.
In a seventh aspect, the present invention relates to computers for producing a three-dimensional representation of a molecule or molecular complex comprising the atomic coordinates having a root mean square deviation of less than 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or 0.2 Å from the atomic coordinates for the carbon backbone atoms listed in FIG. 5, wherein said computer comprises a computer-readable data storage medium comprising a data storage material encoded with computer-readable data, wherein said data comprises the structure coordinates of FIG. 5, or portions thereof, a working memory for storing instructions for processing said computer-readable data, a central-processing unit coupled to said working memory and to said computer-readable data storage medium for processing said computer-readable data into said three-dimensional representation, and a display coupled to said central-processing unit for displaying said representation.
In an eighth aspect, the present invention relates to computers for producing a three-dimensional representation of a molecule or molecular complex comprising a binding site defined by the structure coordinates in FIG. 5, or the structural coordinates of a portion of the residues in FIG. 5 comprising at least one residue, preferably at least five residues and more preferably at least fifteen residues, or the structural coordinates of one or more PDE2 amino acids in SEQ ID NO:2 or SEQ ID NO:4 selected from Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866, wherein said computer comprises a computer-readable data storage medium comprising a data storage material encoded with computer-readable data, wherein said data comprises the structure coordinates of FIG. 5, or portions thereof, a working memory for storing instructions for processing said computer-readable data, a central-processing unit coupled to said working memory and to said computer-readable data storage medium for processing said computer-readable data into said three-dimensional representation, and a display coupled to said central-processing unit for displaying said representation.
In a ninth aspect, the present invention relates to methods for generating the 3-D atomic coordinates of protein homologues of PDE2 using the X-ray coordinates of PDE2 shown in FIG. 5, said methods comprising identifying the sequences of one or more proteins which are homologues of PDE2, aligning the homologue sequences with the sequence of PDE2 (SEQ ID NO:2 or SEQ ID NO:4), identifying structurally conserved and structurally variable regions between the homologue sequences and PDE2 (SEQ ID NO:2 or SEQ ID NO:4), generating 3-D coordinates for structurally conserved residues, variable regions and side-chains of the homologue sequences from those of PDE2, and combining the 3-D coordinates of the conserved residues, variable regions and side-chain conformations to generate full or partial 3-D coordinates for said homologue sequences.
In a tenth aspect, the present invention relates to methods for identifying potential ligands for PDE2 and, in a still further aspect, for PDE4, or homologues, analogues or variants thereof, comprising the steps of displaying the three dimensional structure of PDE2 enzyme, or portions thereof, as defined by atomic coordinates in FIG. 5, on a computer display screen, optionally replacing one or more PDE2 enzyme amino acid residues listed in SEQ ID NO:2, or SEQ ID NO:4, or one or more of the amino acids listed in Tables 1-3, or one or more amino acid residues selected from Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866, in said three-dimensional structure with a different naturally occurring amino acid or an unnatural amino acid, employing said three-dimensional structure to design or select said ligand, contacting said ligand with PDE2, or variant thereof, in the presence of one or more substrates, and measuring the ability of said ligand to modulate the activity of PDE2.
In an eleventh aspect, the present invention relates to methods for treating psychological disorders comprising administering to a patient in need of treatment the pharmaceutical compositions of ligands identified by structure-based drug design using atomic coordinates substantially similar to, or portions of, the coordinates listed in FIG. 5.
In a twelfth aspect, the present invention relates to nucleic acid sequences and expression vectors useful in methods for preparing a purified catalytic domain of PDE2 comprising a polypeptide with an amino acid sequence spanning amino acids Ser578 to Gly934 listed in SEQ ID NO:2 or Ser578 to Glu919 listed in SEQ ID NO:4; and/or homologues, functional fragments, variants, analogues or derivatives thereof.
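As a worked illustration of the unit cell dimensions quoted in the third aspect: because all three cell angles are 90°, the cell volume is simply the product of the three edges. The short Python sketch below is not part of the disclosed method; the function name is hypothetical and it only restates the quoted edge lengths to compare the two crystal forms.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Volume in cubic angstroms of a unit cell whose angles are all 90 degrees."""
    return a * b * c


# Cell edges in angstroms, as quoted in the third aspect.
high_res_form = (90.033, 102.549, 81.598)  # form diffracting to about 1.7 angstroms
n_term_form = (89.898, 102.093, 80.733)    # N-term well defined form, about 2.6 angstroms

v_high = orthorhombic_cell_volume(*high_res_form)
v_nterm = orthorhombic_cell_volume(*n_term_form)

# Ratio of the two volumes; a value close to 1 means the two cells are nearly the same size.
ratio = v_nterm / v_high
print(f"{v_high:.0f} A^3, {v_nterm:.0f} A^3, ratio {ratio:.3f}")
```

The two forms share a space group and differ in cell volume by less than two percent, which is consistent with their being closely related crystal forms.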
Brief Description of the Drawings
FIG. 1 is an orthogonal view of the structure of PDE2 in ribbon representation. Bound Zn2+ ions are shown as balls, and the phosphate ion is shown as sticks. N- and C-termini of the polypeptide are labeled. The structure is composed of a single domain of fifteen helices and six 3₁₀ helices arranged in a compact fold. The numbering of the helices is shown in the text. We have followed the helix numbering convention established by Huai et al., Structure, 11:865-873 (2003), and the start and end points of the helices are determined according to Kabsch and Sander, Biopolymers, 22(12):2577-637 (1983).
FIG. 2 is another orthogonal view of the structure of PDE2.
FIG. 3 is a structure based sequence alignment of PDE2 with PDE4D. The PDE4D structure is in complex with Rolipram (Huai et al., Structure, 11:865-873 (2003)). A total of about 234 equivalent Cα atoms are superimposed between PDE2 and PDE4 structures with an RMS distance of 1.4 Å. Residues in PDE2 and PDE4 that are within 7 Å of Rolipram are indicated; these residues define the binding pocket of PDE2 and play important roles in defining specificity and selectivity.
FIG. 4 is a structure alignment of PDE2 with PDE4D. The PDE4D structure is in complex with Rolipram (Huai et al., Structure, 11:865-873 (2003)). A total of about 234 equivalent Cα atoms are superimposed between PDE2 and PDE4 structures with an RMS distance of 1.4 Å. The PDE2 structure is shown in the same orientation as in FIG. 1. Rolipram bound to PDE4D is drawn as a space filled model. The secondary structures of PDE2 and PDE4D are well conserved, as observed with other members of the PDE family. The binding mode of Rolipram in PDE4D also defines the ligand/inhibitor binding pocket of PDE2. This kind of 3D structure alignment and comparison will provide precise information for the design of PDE2 specific or PDE4 specific inhibitors.
FIG. 5 is a list of X-ray coordinates of the PDE2 catalytic domain crystalline composition of S5N.
FIG. 6 is an image of PDE2 crystals of S5N.
FIG. 7 is a diffraction pattern from S5N PDE2 crystals. Data were collected at beamline X12C of the NSLS at Brookhaven National Laboratory. The detector was a Brandeis CCD with an exposure time of 2 minutes per frame.
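The binding pocket in FIG. 3 is described as the set of residues lying within 7 Å of Rolipram. A minimal Python sketch of that kind of distance-based selection is given below. It is illustrative only: the function name, the input layout (a list of residue label and atom position pairs) and the toy coordinates are assumptions, and the coordinates are invented rather than taken from FIG. 5.

```python
import math


def residues_near_ligand(protein_atoms, ligand_xyz, cutoff=7.0):
    """Return residue labels having at least one atom within `cutoff` angstroms
    of any ligand atom, in order of first appearance.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs, one per atom.
    ligand_xyz: iterable of (x, y, z) ligand atom positions.
    """
    ligand_xyz = list(ligand_xyz)
    near = []
    for label, xyz in protein_atoms:
        if label in near:
            continue  # this residue has already been selected
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_xyz):
            near.append(label)
    return near


# Toy data: residue labels borrowed from the text, coordinates made up.
toy_protein = [
    ("Tyr655", (0.0, 0.0, 0.0)),
    ("Tyr655", (3.0, 0.0, 0.0)),
    ("His656", (6.5, 0.0, 0.0)),
    ("Gly934", (20.0, 0.0, 0.0)),
]
toy_ligand = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]

pocket = residues_near_ligand(toy_protein, toy_ligand)
print(pocket)
```

With real data the atom positions would come from the coordinate listing of the protein and from the superimposed ligand; the cutoff is a parameter so that tighter or looser pocket definitions can be compared.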
Detailed Description of the Invention
The present invention relates to crystalline compositions of PDE2, 3-D X-ray atomic coordinates of said crystalline compositions, and methods of using said atomic coordinates in conjunction with computational methods to identify binding site(s), which are in turn used to identify ligands that interact with said binding site(s) to agonize or antagonize PDE2.
For convenience, certain terms employed in the specification, examples, and appended claims are collected here. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The term "affinity" as used herein refers to the tendency of a molecule to associate with another. The affinity of a drug is its ability to bind to its biological target (receptor, enzyme, transport system, etc.). For pharmacological receptors, affinity can be thought of as the frequency with which the drug, when brought into the proximity of a receptor by diffusion, will reside at a position of minimum free energy within the force field of that receptor.
The term "agonist" as used herein refers to an endogenous substance or a drug that can interact with a receptor and initiate a physiological or a pharmacological response characteristic of that receptor (contraction, relaxation, secretion, enzyme activation, etc.).
The term "analog" or "analogue" as used herein refers to a drug or chemical compound whose structure is related in some way to that of another drug or chemical compound, but whose chemical and biological properties may be quite different.
The term "antagonist" as used herein refers to a drug or a compound that opposes the physiological effects of another. At the receptor level, it is a chemical entity that opposes the receptor-associated responses normally induced by another bioactive agent.
As used herein the term "binding site" refers to a specific region or atom(s) in a molecular entity that is capable of entering into a stabilizing interaction with another molecular entity. In certain embodiments the term also refers to the reactive parts of a macromolecule that directly participate in its specific combination with another molecule. In certain other embodiments, a binding site may be comprised of or defined by the three dimensional arrangement of one or more amino acid residues within a folded polypeptide. In certain embodiments, the binding site further comprises prosthetic groups, water molecules or metal ions which may interact with one or more amino acid residues. Prosthetic groups, water molecules, or metal ions may be apparent from X-ray crystallographic data, or may be added to an apoprotein or enzyme using in silico methods.
The term "catalytic domain" as used herein refers to the catalytic domain of the PDE2 class of enzymes, which feature a conserved segment of amino acids in the carboxy-terminal portion of the proteins, wherein this segment has been demonstrated to include the catalytic site of these enzymes. This conserved catalytic domain extends approximately from residue Ser578 to residue Gly934 of the full-length enzyme.
"To clone" as used herein, as will be apparent to the skilled artisan, is meant to obtain exact copies of a given polynucleotide molecule using recombinant DNA technology. Furthermore, "to clone into" is meant to insert a given first polynucleotide sequence into a second polynucleotide sequence, preferably such that a functional unit combining the functions of the first and the second polynucleotides results, for example, without limitation, a polynucleotide from which a fusion protein may be translationally provided, which fusion protein comprises amino acid sequences encoded by the first and the second polynucleotide sequences. Details of molecular cloning can be found in a number of commonly used laboratory protocol books such as Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press (1989)).
The term "co-crystallization" as used herein is taken to mean crystallization of a preformed protein/ligand complex.
The terms "complex" and "co-complex" are used interchangeably and refer to a PDE2 molecule, or a variant or homologue of PDE2, in covalent or non-covalent association with a substrate or ligand.
The term "contacting" as used herein applies to in silico, in vitro, or in vivo experiments.
As used herein, the terms "gene", "recombinant gene" and "gene construct" refer to a nucleic acid comprising an open reading frame encoding a polypeptide, including both exon and (optionally) intron sequences. The term "intron" refers to a DNA sequence present in a given gene which is not translated into protein and is generally found between exons.
The term "high affinity" as used herein means strong binding affinity between molecules with a dissociation constant KD of no greater than 1 μM. In a preferred case, the KD is less than 100 nM, 10 nM, 1 nM, 100 pM, or even 10 pM or less. In a most preferred embodiment, the two molecules can be covalently linked (KD is essentially 0).
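To make the KD thresholds in the "high affinity" definition concrete, the Python sketch below checks a dissociation constant against the 1 μM limit and converts it to a standard binding free energy using the textbook relation ΔG° = RT ln(KD / 1 M). That relation, the temperature of 298.15 K and the function names are assumptions added for illustration; they are not stated in the specification.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def is_high_affinity(kd_molar: float, threshold_molar: float = 1e-6) -> bool:
    """True if KD is no greater than the threshold (1 micromolar by default)."""
    return kd_molar <= threshold_molar


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol, dG = RT * ln(KD / 1 M).

    More negative values correspond to tighter binding.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive; a covalent link has no finite KD")
    return R_KCAL * temperature_k * math.log(kd_molar)


# The preferred KD levels listed in the definition, expressed in mol/L.
for kd in (1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11):
    print(f"KD = {kd:.0e} M  high affinity: {is_high_affinity(kd)}  "
          f"dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Under these assumptions each tenfold decrease in KD corresponds to roughly 1.4 kcal/mol of additional binding free energy, which is one way to read the successive preferred thresholds.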
The term "homologue" or "homolog" as used herein refers to polypeptides having at least 50%, 45% or even 42% amino acid sequence identity with PDE2 enzyme as described in SEQ ID NO:2 or SEQ ID NO:4 or any catalytic domain described herein. SEQ ID NO:2 is a partial amino acid sequence of the wild-type Human PDE2. SEQ ID NO:4 is the amino acid sequence of the wild-type C-terminal catalytic domain of Human PDE2A that was crystallized in the Examples.
Those of skill in the art understand that a set of structure coordinates determined by X-ray crystallography is not without standard error. As used herein, and for the purpose of this invention, the term "substantially similar atomic coordinates" or atomic coordinates that are "substantially similar" refers to any set of structure coordinates of PDE2, PDE2 homologues, PDE2 variants or polypeptide fragments, described by atomic coordinates that have a root mean square deviation for the atomic coordinates of protein backbone atoms (N, Cα, C, and O) of less than about 2.5, 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å when superimposed, using backbone atoms, on the structure coordinates listed in FIG. 5. For the purpose of this invention, structures that have coordinates substantially similar to those listed in FIG. 5 shall be considered identical to the coordinates listed in FIG. 5.
The term "substantially similar" also applies to an assembly of amino acid residues that may or may not form a contiguous polypeptide chain, but whose three dimensional arrangement of atomic coordinates has a root mean square deviation for the atomic coordinates of protein backbone atoms (N, Cα, C, and O), or the side chain atoms, of less than about 2.5, 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å when superimposed, using backbone atoms or the side chain atoms, on the atomic coordinates of similar or the same amino acids from the coordinates listed in FIG. 5. To clarify further, but not intending to be limiting, an example of an assembly of amino acids may be the amino acid residues that form a binding site in an enzyme.
These residues may have one or more intervening residues which are distant from the binding site, and therefore may minimally interact with a ligand in the binding site. In such occurrences, the binding site may be defined for the purpose of structure based drug design as comprising at least specific amino acid residues. For example, in the case of PDE2, amino acid residues Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866 of SEQ ID NO:2 or SEQ ID NO:4 are known to be near or at the binding site. Thus any molecular assembly that has a root mean square deviation from the atomic coordinates of the protein backbone atoms (N, Cα, C, and O), or the side chain atoms, of one or more of Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866 of SEQ ID NO:2, or SEQ ID NO:4, or any conservative substitutions thereof, of less than about 2.5, 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å when superimposed will be considered substantially similar to the coordinates listed in FIG. 5. "Substantially similar" atomic coordinates, for the purposes of this invention, are considered identical to the coordinates, or portions thereof, listed in FIG. 5.
Those skilled in the art further understand that the coordinates listed in FIG. 5 or portions thereof may be transformed into a different set of coordinates using various mathematical algorithms without departing from the present invention. For example, the coordinates listed in FIG. 5, or portions thereof, may be transformed by algorithms which translate or rotate the atomic coordinates. Alternatively, molecular mechanics, molecular dynamics or ab initio algorithms may modify the atomic coordinates. Atomic coordinates generated from the coordinates listed in FIG. 5, or portions thereof, using any of the aforementioned algorithms shall be considered identical to the coordinates listed in FIG. 5.
The term "in silico" as used herein refers to experiments carried out using computer simulations. In certain embodiments, the in silico methods are molecular modeling methods wherein 3-dimensional models of macromolecules or ligands are generated. In other embodiments, the in silico methods comprise computationally assessing ligand binding interactions.
The term "modulate" as used herein refers to both upregulation (i.e., activation or stimulation, e.g., by agonizing or potentiating) and down-regulation (i.e., inhibition or suppression, e.g., by antagonizing, decreasing or inhibiting) of an activity.
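The "substantially similar" criterion defined above rests on a root mean square deviation computed after superposition, and the text notes that rotating or translating a coordinate set does not make it a different structure. The Python sketch below shows one common way to compute such an RMSD, the Kabsch least-squares superposition, using NumPy. The choice of algorithm, the function names and the toy coordinates are assumptions for illustration; the specification does not name a particular superposition method, and the inputs are assumed to be matched atom for atom (for example N, Cα, C and O of equivalent residues).

```python
import numpy as np


def superposed_rmsd(mobile, reference):
    """RMSD in angstroms after optimal rigid-body superposition (Kabsch algorithm).

    Both inputs are (n_atoms, 3) arrays listing equivalent atoms in the same order.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if mobile.shape != reference.shape or mobile.ndim != 2 or mobile.shape[1] != 3:
        raise ValueError("inputs must be matching (n_atoms, 3) arrays")
    # Remove the translational component by centering both coordinate sets.
    mob = mobile - mobile.mean(axis=0)
    ref = reference - reference.mean(axis=0)
    # The SVD of the 3x3 covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(mob.T @ ref)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    fitted = mob @ rotation.T
    return float(np.sqrt(np.mean(np.sum((fitted - ref) ** 2, axis=1))))


def is_substantially_similar(rmsd, cutoff=2.5):
    """Apply one of the listed cutoffs (2.5 angstroms is the loosest of them)."""
    return rmsd < cutoff


# Toy check with invented coordinates: a rotated and translated copy should
# superimpose back onto the original with an RMSD of essentially zero.
rng = np.random.default_rng(0)
reference = rng.normal(scale=5.0, size=(20, 3))
theta = np.deg2rad(30.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = reference @ rot_z.T + np.array([10.0, -4.0, 2.5])
rmsd_moved = superposed_rmsd(moved, reference)
print(rmsd_moved, is_substantially_similar(rmsd_moved, cutoff=0.2))
```

The same function can be applied to a subset of atoms, such as the backbone atoms of the thirteen binding-site residues listed above, provided the two atom lists are in the same order.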
The term "pharmacophore" as used herein refers to the ensemble of steric and electronic features of a particular structure that is necessary to ensure the optimal supramolecular interactions with a specific biological target structure and to trigger (or to block) its biological response. A pharmacophore may or may not represent a real molecule or a real association of functional groups. In certain embodiments, a pharmacophore is an abstract concept that accounts for the common molecular interaction capacities of a group of compounds towards their target structure. In certain other embodiments, the term can be considered as the largest common denominator shared by a set of active molecules. Pharmacophoric descriptors are used to define a pharmacophore, including H-bonding, hydrophobic and electrostatic interaction sites, defined by atoms, ring centers and virtual points. Accordingly, in the context of enzyme agonists, antagonists or ligands, a pharmacophore may represent an ensemble of steric and electronic factors which are necessary to ensure supramolecular interactions with a specific biological target structure. As such, a pharmacophore may represent a template of chemical properties for an active site of a protein/enzyme, representing the spatial relationship of these properties to one another, that theoretically defines a ligand that would bind to that site.
The term "precipitant" as used herein includes any substance that, when added to a solution, causes a precipitate to form or crystals to grow. Examples of precipitants within the scope of this invention include, but are not limited to, alkali metal (e.g., Li+, Na+, or K+) salts, alkaline earth metal (e.g., Mg2+ or Ca2+) salts, and transition metal (e.g., Mn2+ or Zn2+) salts. Common counterions to the metal ions include, but are not limited to, halides, phosphates, citrates and sulfates.
The term "prodrug" as used herein refers to drugs that, once administered, are chemically modified by metabolic processes to become pharmaceutically active. In certain embodiments the term also refers to any compound that undergoes biotransformation before exhibiting its pharmacological effects. Prodrugs can thus be viewed as drugs containing specialized non-toxic protective groups used in a transient manner to alter or to eliminate undesirable properties in the parent molecule.
The term "receptor" as used herein refers to a protein or a protein complex in or on a cell that specifically recognizes and binds to a compound acting as a molecular messenger (neurotransmitter, hormone, lymphokine, lectin, drug, etc.). In a broader sense, the term receptor is used interchangeably with any specific (as
opposed to non-specific, such as binding to plasma proteins) drug binding site, also including nucleic acids such as DNA.
The term "recombinant protein" refers to a polypeptide which is produced by recombinant DNA techniques, wherein generally, DNA encoding a polypeptide is inserted into a suitable expression vector which is in turn used to transform a host cell to produce the polypeptide encoded by said DNA. This polypeptide may be one that is naturally expressed by the host cell, or it may be heterologous to the host cell, or the host cell may have been engineered to have lost the capability to express the polypeptide which is otherwise expressed in wild type forms of the host cell. The polypeptide may also be a fusion polypeptide. Moreover, the phrase "derived from", with respect to a recombinant gene, is meant to include within the meaning of "recombinant protein" those proteins having an amino acid sequence of a native polypeptide, or an amino acid sequence similar thereto which is generated by mutations, including substitutions, deletions and truncation, of a naturally occurring form of the polypeptide. The L5N1 and S5N polypeptides are preferably constructed by recombinant techniques.
As used herein, the term "selective PDE2 inhibitor" refers to a substance, for example an organic molecule, that effectively inhibits an enzyme from the PDE2 family to a greater extent than any other PDE enzyme. In one embodiment, a selective PDE2 inhibitor is a substance, for example, a small organic molecule, having a Ki for inhibition of PDE2 that is less than about one-half, one-fifth, or one-tenth the Ki that the substance has for inhibition of any other PDE enzyme. In other words, the substance inhibits PDE2 activity to the same degree at a concentration of about one-half, one-fifth, one-tenth or less than the concentration required for any other PDE enzyme.
In general, a substance is considered to effectively inhibit PDE2 if it has an IC50 or Ki of less than or about 10 μM, 1 μM, 500 nM, 100 nM, 50 nM or even 10 nM. As used herein, the term "small molecules" refers to preferred drugs, as they are orally available (unlike proteins, which must be administered by injection or topically). The size of small molecules is generally under 1000 Daltons, but many estimates range from 300 to 700 Daltons.
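The potency and selectivity criteria stated above for a selective PDE2 inhibitor reduce to simple comparisons of inhibition constants. The following is a minimal sketch of those comparisons; the function names and the example Ki values are hypothetical and are not measurements from this disclosure.

```python
def is_effective(ki_nm, threshold_nm=10_000.0):
    """True if a Ki (or IC50), in nM, is less than or about the threshold.

    The default of 10,000 nM corresponds to the 10 uM level named in the text.
    """
    return ki_nm <= threshold_nm


def selectivity_ratio(ki_pde2_nm, ki_other_nm):
    """Smallest Ki among the other PDE enzymes divided by the Ki for PDE2."""
    return min(ki_other_nm.values()) / ki_pde2_nm


def is_selective(ki_pde2_nm, ki_other_nm, fold=2.0):
    """True if the PDE2 Ki is less than 1/fold of the Ki for every other PDE.

    fold=2, 5 or 10 corresponds to the one-half, one-fifth and one-tenth
    criteria in the definition.
    """
    return selectivity_ratio(ki_pde2_nm, ki_other_nm) > fold


# Hypothetical values in nM, for illustration only.
example_ki_pde2 = 20.0
example_ki_others = {"PDE1": 400.0, "PDE4": 250.0, "PDE5": 1000.0}

print(is_effective(example_ki_pde2, threshold_nm=100.0))
print(selectivity_ratio(example_ki_pde2, example_ki_others))
print(is_selective(example_ki_pde2, example_ki_others, fold=10.0))
```

Taking the minimum over the other enzymes reflects the wording "any other PDE enzyme": the least favorable comparison decides whether the substance is considered selective.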
As used herein, the term "transfection" means the introduction of a nucleic acid, e.g., via an expression vector, into a recipient cell by nucleic acid-mediated gene transfer. "Transformation", as used herein, refers to a process in which a cell's genotype is changed as a result of the cellular uptake of exogenous DNA or RNA, and, for example, the transformed cell expresses a recombinant form of a polypeptide or, in the case of anti-sense expression from the transferred gene, the expression of a naturally-occurring form of the polypeptide is disrupted.
The term "variant" in relation to the polypeptide sequence in SEQ ID NO:2 or SEQ ID NO:4 includes any substitution of, variation of, modification of, replacement of, deletion of, or addition of one or more amino acids from or to the sequence providing a resultant polypeptide sequence for an enzyme having PDE2 activity. Preferably a variant, homologue, functional fragment or portion, of SEQ ID NO:2 or SEQ ID NO:4, comprises a polypeptide sequence of at least 5 contiguous amino acids, preferably at least 10 contiguous amino acids, preferably at least 15 contiguous amino acids, preferably at least 20 contiguous amino acids, preferably at least 25 contiguous amino acids, or preferably at least 30 contiguous amino acids. Variants of SEQ ID NO:2 and SEQ ID NO:4 include for example one or more amino acid substitutions from the non-human PDE2 noted in the GenBank Accession numbers provided above.
The terms "mutant", "variant", "homologue", "analog", "derivative" or "functional fragment" are used in relation to the amino acid sequence of the PDE2 protein or polypeptide sequence which is used to produce the crystal of the present invention. The terms include any substitution of, variation of, modification of, replacement of, deletion of, or addition of one or more amino acids from or to the sequence, provided that the resultant PDE2 is capable of being crystallized. Typically, for the "mutant", "variant", "homologue", "analog", "derivative" or "functional fragment" in relation to the amino acid sequence of the protein or polypeptide of the PDE2 of the crystal of the present invention, the types of amino acid substitutions that could be made should maintain the hydrophobicity/hydrophilicity of the amino acid sequence. Amino acid substitutions may be made provided that the modified PDE2 retains the ability to be crystallised in accordance with the present invention. Amino acid substitutions may include the use of non-naturally occurring analogues.
The term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked. Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors". In general, expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer generally to circular double stranded DNA loops which, in their vector form, are not bound to the chromosome. In the present specification, "plasmid" and "vector" are used interchangeably as the plasmid is the most commonly used form of vector. However, the invention is intended to include such other forms of expression vectors which serve equivalent functions and which become known in the art subsequently hereto.
A. Clones The nucleotide sequence coding for a PDE2 polypeptide, or functional fragment, including the C-terminal peptide fragment of the catalytic domain of PDE2 protein, derivatives or analogs thereof, including a chimeric protein thereof, can be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. The elements mentioned above are termed herein a "promoter." Thus, the nucleic acid encoding a PDE2 polypeptide of the invention or a functional fragment comprising the C-terminal peptide fragment of the catalytic domain of PDE2 protein, derivatives or analogs thereof, is operationally associated with a promoter in an expression vector of the invention.
In preferred embodiments, the expression vector contains the nucleotide sequence coding for the polypeptide comprising the amino acid sequence spanning amino acids Ser578 to Gly934 listed in SEQ ID NO:2 or amino acids Ser578 to Glu919 listed in SEQ ID NO:4. Both cDNA and genomic sequences can be cloned and expressed under the control of such regulatory sequences. An expression vector also preferably includes a
replication origin. The necessary transcriptional and translational signals can be provided on a recombinant expression vector. As detailed below, all genetic manipulations described for the PDE2 gene in this section, may also be employed for genes encoding a functional fragment, including the C-terminal peptide fragment of the catalytic domain of the PDE2 protein, derivatives or analogs thereof, including a chimeric protein thereof. Potential host-vector systems include but are not limited to mammalian cell systems infected with virus, e.g., vaccinia virus, adenovirus, etc.; insect cell systems infected with virus, e.g., baculovirus; microorganisms such as yeast containing yeast vectors; or bacteria transformed with bacteriophage, DNA, plasmid DNA, or cosmid
DNA. The expression elements of vectors vary in their strengths and specificities.
Depending on the host-vector system utilized, any one of a number of suitable transcription and translation elements may be used. A recombinant PDE2 protein of the invention may be expressed chromosomally, after integration of the coding sequence by recombination. In this regard, any of a number of amplification systems may be used to achieve high levels of stable gene expression. See Sambrook et al., 1989, infra. A suitable cell for purposes of this invention is one into which the recombinant vector comprising the nucleic acid encoding PDE2 protein has been introduced and which is cultured in an appropriate cell culture medium under conditions that provide for expression of PDE2 protein by the cell. Any of the methods previously described for the insertion of DNA fragments into a cloning vector may be used to construct expression vectors containing a gene consisting of appropriate transcriptional/translational control signals and the protein coding sequences. These methods may include in vitro recombinant DNA and synthetic techniques, and in vivo recombination (genetic recombination). Expression of PDE2 protein may be controlled by any promoter/enhancer element known in the art, but these regulatory elements must be functional in the host selected for expression. Expression vectors containing a nucleic acid encoding a PDE2 protein of the invention can be identified by four general approaches: (1) amplification (i.e., by
PCR) of the desired plasmid DNA or specific mRNA, (2) nucleic acid hybridization, (3) presence or absence of selection marker gene functions, and (4) expression of inserted sequences. In the first approach, the nucleic acids can be amplified to provide for detection of the amplified product. In the second approach, the presence of a foreign gene inserted in an expression vector can be detected by nucleic acid hybridization using probes comprising sequences that are homologous to an inserted marker gene. In the third approach, the recombinant vector/host system can be identified and selected based upon the presence or absence of certain "selection marker" gene functions, e.g., beta-galactosidase activity, thymidine kinase activity, resistance to antibiotics, transformation phenotype, occlusion body formation in baculovirus, etc., caused by the insertion of foreign genes in the vector. In another example, if the nucleic acid encoding PDE2 protein is inserted within the "selection marker" gene sequence of the vector, recombinant vectors containing the PDE2 protein insert can be identified by the absence of the selection marker gene function. In the fourth approach, recombinant expression vectors can be identified by assaying for the activity, biochemical, or immunological characteristics of the gene product expressed by the recombinant vector, provided that the expressed protein assumes a functionally active conformation. A wide variety of host/expression vector combinations may be employed in expressing the nucleic acid sequences of this invention as known by those of skill in the art. Once a particular recombinant DNA molecule is identified and isolated, several methods known in the art may be used to propagate it. Once a suitable host system and growth conditions are established, recombinant expression vectors can be propagated and prepared in quantity.
As previously explained, the expression vectors which can be used include, but are not limited to, the following vectors or their derivatives: human or animal viruses such as vaccinia virus or adenovirus; insect viruses such as baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda); and plasmid and cosmid DNA vectors. Vectors can be introduced into the desired host cells by methods known in the art, e.g., transfection, electroporation, microinjection, transduction, cell fusion,
DEAE dextran, calcium phosphate precipitation, lipofection (liposome fusion), use of a gene gun, or a DNA vector transporter. Wu et al., J. Biol. Chem. 267:963-967 (1992); Wu and Wu, J. Biol. Chem. 263:14621-14624 (1988); and Hartmut et al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990.
B. Crystal and Space Groups X-ray structure coordinates define a unique configuration of points in space. Those skilled in the art understand that a set of structure coordinates for a protein or a protein/ligand complex, or a portion thereof, defines a relative set of points that, in turn, define a configuration in three dimensions. A similar or identical configuration can be defined by an entirely different set of coordinates, provided the distances and angles between atomic coordinates remain essentially the same. In addition, a scalable configuration of points can be defined by increasing or decreasing the distances between coordinates by a scalar factor while keeping the angles essentially the same. One of ordinary skill in the art would recognize that solving atomic coordinates of crystal structures of proteins such as PDE2 requires a stable, long-lasting source of high-quality protein. One aspect of the present invention relates to a human crystalline composition comprising a polypeptide with an amino acid sequence spanning amino acids Ser578 to Gly934 as listed in SEQ ID NO:2 and preferably amino acids Ser578 to Glu919 as listed in SEQ ID NO:4. In one embodiment, the present invention discloses a crystalline PDE2 molecule comprising a polypeptide with an amino acid sequence spanning amino acids Ser578 to Gly934 as listed in SEQ ID NO:2 or amino acids Ser578 to Glu919 as listed in SEQ ID NO:4 complexed with one or more ligands. In another embodiment, the crystallized complex is characterized by the structural coordinates listed in FIG. 5, or portions thereof.
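The statement above that the same configuration can be described by entirely different coordinate sets, and that a scaled configuration preserves angles, can be checked numerically. The following is a minimal sketch; the four points are arbitrary illustrative values, not coordinates from FIG. 5, and the helper names are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """All inter-point distances, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def rotate_z_and_translate(points, angle_degrees, shift):
    """Rotate about the z axis, then translate: a rigid-body move."""
    a = math.radians(angle_degrees)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]


def scale(points, factor):
    """Multiply every coordinate by a scalar factor."""
    return [(factor * x, factor * y, factor * z) for x, y, z in points]


def angle_degrees_at(a, b, c):
    """Angle a-b-c in degrees, measured at point b."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_t = dot / (math.dist(a, b) * math.dist(c, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


# Arbitrary illustrative points (angstroms).
points = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.4), (0.2, 2.0, 1.0)]

moved = rotate_z_and_translate(points, 40.0, (5.0, -3.0, 2.0))
same_distances = all(
    math.isclose(d1, d2, abs_tol=1e-9)
    for d1, d2 in zip(pairwise_distances(points), pairwise_distances(moved))
)
print("distances preserved by rigid move:", same_distances)

doubled = scale(points, 2.0)
print("angle before scaling:", angle_degrees_at(*points[:3]))
print("angle after scaling: ", angle_degrees_at(*doubled[:3]))
```

The rigid move changes every coordinate value yet leaves all inter-point distances unchanged, while scaling changes the distances by the scalar factor and leaves the angles unchanged.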
In certain embodiments, the atoms of ligands are optimally within about 5, 7, or 10 angstroms of one or more PDE2 amino acids in SEQ ID NO:2 preferably selected from Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866. One embodiment of the crystallized complex is characterized as belonging to the C222₁ space group and has unit cell dimensions of about a=90.03 Å, b=102.55 Å, c=81.60 Å, α=β=γ=90.0°. This embodiment is encompassed by the structural coordinates of FIG. 5. The ligand may be a small molecule which binds to a PDE2 catalytic domain defined by SEQ ID NO:2, or portions thereof, with a Ki of less than about 10 μM, 1 μM, 500 nM, 100 nM, 50 nM, or even 10 nM. In certain embodiments, the ligand is a substrate or substrate analog of PDE2. In certain embodiments, the ligand(s) may be a competitive or uncompetitive inhibitor of PDE2. In a preferred embodiment the ligand is an inhibitor of human PDE2A. Various computational methods can be used to determine whether a molecule or a binding pocket portion thereof is "structurally equivalent," defined in terms of its three-dimensional structure, to all or part of PDE2A or its binding pocket(s). Such methods may be carried out in current software applications, such as the molecular similarity application of QUANTA (Accelrys Inc., San Diego, Calif.). The molecular similarity application permits comparisons between different structures, different conformations of the same structure, and different parts of the same structure. The procedure used in molecular similarity to compare structures is divided into four steps: (1) load the structures to be compared into a computer; (2) optionally define the atom equivalences in these structures; (3) perform a fitting operation; and (4) analyze the results. Each structure is identified by a name. One structure is identified as the target (i.e., the fixed structure); all remaining structures are working structures (i.e., moving structures). Since atom equivalency within molecular similarity applications is defined by user input, for the purpose of this invention equivalent atoms are defined as protein backbone atoms (N, Cα, C, and O) for all conserved residues between the two structures being compared. A conserved residue is defined as a residue that is structurally or functionally equivalent (See Table 4 infra).
In certain embodiments rigid fitting operations are considered. In other embodiments, flexible fitting operations may be considered. When a rigid fitting method is used, the working structure is translated and rotated to obtain an optimum fit with the target structure. The fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the moving structure, such that the root mean square difference of the fit over the
specified pairs of equivalent atoms is an absolute minimum. This number, given in angstroms, is reported by the molecular similarity application. For the purpose of this invention, any molecule or molecular complex or binding pocket thereof, or any portion thereof, that has a root mean square deviation of conserved residue backbone atoms (N, Cα, C, and O) of less than about 2.5 Å, 2.0 Å, 1.7 Å, 1.5 Å, 1.25 Å, 1.0 Å, 0.7 Å, 0.5 Å, 0.25 Å, or even 0.2 Å, when superimposed on the relevant backbone atoms described by the reference structure coordinates listed in FIG. 5, is considered "structurally equivalent" to the reference molecule. That is to say, the crystal structures of those portions of the two molecules are substantially identical, within acceptable error. Particularly preferred structurally equivalent molecules or molecular complexes are those that are defined by the entire set of structural coordinates listed in FIG. 5, plus or minus a root mean square deviation from the conserved backbone atoms of those amino acids of not more than about 2.0 Å. More preferably, the root mean square deviation is less than about 1.0 Å.
The term "root mean square deviation" means the square root of the arithmetic mean of the squares of the deviations. It is a way to express the deviation or variation from a trend or object. For purposes of this invention, the "root mean square deviation" defines the variation in the backbone of a protein from the backbone of PDE2 or a binding pocket portion thereof, as defined by the structural coordinates of PDE2 described herein.
The refined x-ray coordinates of the catalytic domain of PDE2 (amino acids Ser578 to Glu919 as listed in SEQ ID NO:4) are as listed in FIG. 5. Schematic views of the molecule are shown in FIG. 1 and FIG. 2. The structure is composed of a single domain of sixteen α helices and four 3₁₀ helices arranged in a compact fold (FIG. 1). The numbering of the helices is as shown below. We have followed the helix numbering convention provided by Huai et al., Structure, 11:865-873 (2003), and the start and end points of the helices are determined according to Kabsch and Sander, Biopolymers, 22:2577-2637 (1983).
α helices          3₁₀ helices
H2    580-585      A1   593-595
H3/4  613-632      A2   607-609
H5    635-649      A3   677-679
H6    658-675      A4   721-723
H7    683-695      A5   810-813
H8    705-710      A6   846-848
H9    714-719
H10   725-738
H11   751-766
H12   770-786
H13   793-808
H14   816-839
H15   854-878
H16   881-897
Within the overall fold, three sub-domains can also be defined. Residues 580-703 (H2-H7) form the first sub-domain, residues 704-770 (H8-H11) form the second sub-domain, and residues 770-899 (H12-H16) form the third sub-domain. No electron density is observed for residues 890-900, which probably comprise a flexible loop between helix H16 and the C-terminal residues (901-915). No electron density is observed for residues after 915. Two metal ions are seen in the catalytic site. The first is determined to be Zn2+, by analogy with PDE4 and from an analysis of its coordination geometry. The metal is coordinated by His660 (Nε2-Zn 2.1 Å), His696 (Nε2-Zn 2.1 Å), Asp697 (Oδ2-Zn 2.1 Å), Asp808 (Oδ1-Zn 2.3 Å) and a phosphate ion (O2-Zn 2.3 Å). These residues are completely conserved across the PDE gene family. The second metal ion is coordinated to Asp697 (Oδ2-Mg 2.3 Å). The phosphate ion also interacts with waters that stabilize the surrounding environment. The active site lies mainly within the third sub-domain and is bounded on one side by helices H15 and H14, the C-terminus of H13 and the 3₁₀ helix A4, and on the other side by the C-terminus of H5, the N-terminus of H6 and the loop region between H5 and H6.
Accordingly, the present invention provides a molecule or molecular complex that includes at least a portion of a PDE2 and/or a substrate binding pocket. In one embodiment, the PDE2 binding pocket includes the amino acids listed in Table 1, preferably the amino acids listed in Table 2, and more preferably the amino acids listed in Table 3, the binding pocket being defined by a set of points having a root mean square deviation of less than about 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å, from points representing the backbone atoms of the amino acids in Tables 1-3. In another embodiment, the PDE2 substrate binding pocket includes one or more amino acids selected from Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866 or consecutive amino acids including the above noted amino acids from SEQ ID NO:2 or SEQ ID NO:4.
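The root mean square deviation criterion used above can be illustrated with a short calculation. The following is a minimal sketch that assumes the two sets of equivalent backbone atoms have already been superimposed, so the fitting step that finds the optimum translation and rotation is omitted; the coordinates are arbitrary illustrative values rather than entries from FIG. 5, and the function names are hypothetical.

```python
import math

# Threshold levels, in angstroms, taken from the binding pocket definition.
THRESHOLDS = (2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, 0.2)


def rmsd(target, working):
    """Square root of the mean squared distance between paired atoms.

    Both arguments are equal-length lists of (x, y, z) tuples in angstroms,
    ordered so that atom i of one structure is equivalent to atom i of the
    other, and already superimposed.
    """
    if len(target) != len(working) or not target:
        raise ValueError("structures must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(target, working))
    return math.sqrt(squared / len(target))


def tightest_threshold(value):
    """Smallest listed threshold that the deviation is still below, or None."""
    passed = [t for t in THRESHOLDS if value < t]
    return min(passed) if passed else None


# Illustrative atoms only: the working set is the target displaced by 0.3 A.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.4), (0.2, 2.0, 1.0)]
working = [(x + 0.3, y, z) for x, y, z in target]

deviation = rmsd(target, working)
print(round(deviation, 3), tightest_threshold(deviation))
```

In practice the superposition would be computed first, for example by the rigid fitting operation of a molecular similarity application, and only then would the deviation be compared against the chosen threshold.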
Table 1: Residues within 10 Å of the binding pocket in PDE2 catalytic domain.
Table 2: Residues within 7 Å of the binding pocket in PDE2 catalytic domain.
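Residue lists of the kind captioned in Tables 1 and 2 can be derived from atomic coordinates by a distance cutoff. The following is a minimal sketch of such a selection; the residue labels and coordinates are placeholders invented for illustration and do not correspond to FIG. 5, and the function name is hypothetical.

```python
import math


def residues_within(protein_atoms, ligand_atoms, cutoff):
    """Labels of residues having at least one atom within cutoff of a ligand atom.

    protein_atoms: list of (residue_label, (x, y, z)) tuples, one per atom.
    ligand_atoms:  list of (x, y, z) tuples.
    cutoff:        distance in angstroms (e.g., 5, 7 or 10).
    """
    hits = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            hits.add(label)
    return sorted(hits)


# Placeholder data: one ligand atom at the origin and four protein atoms
# placed at increasing distances along the x axis.
ligand = [(0.0, 0.0, 0.0)]
protein = [
    ("ResA", (3.0, 0.0, 0.0)),
    ("ResB", (6.0, 0.0, 0.0)),
    ("ResC", (9.0, 0.0, 0.0)),
    ("ResD", (12.0, 0.0, 0.0)),
]

for cutoff in (5.0, 7.0, 10.0):
    print(cutoff, residues_within(protein, ligand, cutoff))
```

Each larger cutoff returns a superset of the residues found at the smaller cutoff, which is consistent with the nesting of the 10 Å and 7 Å residue sets.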
C. Isolated Polypeptide and Variants One embodiment of the invention describes an isolated polypeptide consisting of a portion of PDE2 which functions as a binding site when folded in the proper 3-D orientation as listed for example in FIG. 5. Another embodiment is an isolated polypeptide comprising a portion of
PDE2, wherein the portion starts at about amino acid residue Ser578, and ends at about amino acid residue Gly934 as described in SEQ ID NO:2, or in the preferred embodiment amino acid residues Ser578 to Glu919 as described in SEQ ID NO:4; or a sequence that is at least 95%, or 98% homologous or an equivalent to a polypeptide with an amino acid sequence spanning amino acids Ser578 to Gly934 listed in SEQ ID NO:2 or preferably amino acids Ser578 to Glu919 as listed in SEQ ID NO:4. One embodiment of the invention comprises crystalline compositions comprising variants of PDE2. Variants of the present invention may have an amino acid sequence that is different by one or more amino acid substitutions to the sequence disclosed in SEQ ID NO:2 or SEQ ID NO:4. Embodiments which comprise amino acid deletions and/or additions are also contemplated. The variant may have conservative changes (amino acid similarity), wherein a substituted amino acid has similar structural or chemical properties, for example, the replacement of leucine with isoleucine. Guidance in determining which and how many amino acid residues may be substituted, inserted, or deleted without adversely affecting biological or proposed pharmacological activity may be reasonably inferred in view of this disclosure, and may further be found using computer programs well known in the art, for example, DNAStar® software. Amino acid substitutions may be made, for instance, on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues as long as a biological and/or pharmacological activity of the native molecule is retained. Negatively charged amino acids include aspartic acid and glutamic acid; positively charged amino acids include lysine and arginine; amino acids with aliphatic side chains include glycine, alanine, leucine, isoleucine, and valine; amino acids with uncharged polar side chains include asparagine, glutamine, serine, and threonine; and amino acids with aromatic side chains include phenylalanine and tyrosine. Examples of conservative substitutions are set forth in Table 4 as follows: Table 4
"Homology" is a measure of the identity of nucleotide sequences or amino acid sequences. In order to characterize the homology, subject sequences are aligned so that the highest percentage homology (match) is obtained, after
introducing gaps, if necessary, to achieve maximum percent homology. N- or C-terminal extensions shall not be construed as affecting homology. "Identity" has an art-recognized meaning and can be calculated using published techniques. Computer program methods to determine identity between two sequences include, for example, DNAStar® software (DNAStar Inc., Madison, WI); the GCG® program package (Devereux, J., et al., NAR, 12:387 (1984)); and BLASTP, BLASTN, and FASTA (Altschul, S.F. et al., J. Mol. Biol., 215:403 (1990)). Homology as defined herein is determined conventionally using the well-known computer program, BESTFIT® (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, WI 53711). When using BESTFIT® or any other sequence alignment program (such as the Clustal algorithm from MegAlign software (DNAStar®)) to determine whether a particular sequence is, for example, about 90% homologous to a reference sequence, according to the present invention, the parameters are set such that the percentage of identity is calculated over the full length of the reference nucleotide sequence or amino acid sequence and that gaps in homology of up to about 10% of the total number of nucleotides in the reference sequence are allowed. Ninety percent homology is therefore determined, for example, using the BESTFIT® program with parameters set such that the percentage of identity is calculated over the full length of the reference sequence, e.g., GenBank Accession No: U67733, and wherein up to 10% of the amino acids in the reference sequence may be substituted with another amino acid. Percent homologies are likewise determined, for example, to identify preferred species, within the scope of the claims appended hereto, which reside within the range of about 90% to 100% homology to SEQ ID NO:2 as well as the binding site thereof. As noted above, N- or C-terminal extensions shall not be construed as affecting homology. Thus, when comparing two sequences, the reference sequence is generally the shorter of the two sequences. This means that, for example, if a sequence of 50 nucleotides in length with precise identity to a 50 nucleotide region within a 100 nucleotide polynucleotide is compared, there is 100% homology as opposed to only 50% homology.
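The convention described above, in which identity is calculated over the full length of the shorter reference sequence and terminal extensions are ignored, can be illustrated as follows. This is a minimal sketch that slides the reference along the longer sequence without gaps; it is a simplification assumed here for illustration and does not reproduce the gapped local alignment performed by BESTFIT® or the other programs named above. The sequences are artificial and the function name is hypothetical.

```python
def percent_identity(reference, subject):
    """Best ungapped percent identity over the full length of the reference.

    The shorter of the two sequences is treated as the reference, so N- or
    C-terminal extensions on the longer sequence do not lower the score.
    """
    if len(reference) > len(subject):
        reference, subject = subject, reference
    n = len(reference)
    if n == 0:
        raise ValueError("sequences must be non-empty")
    best = 0
    for start in range(len(subject) - n + 1):
        window = subject[start:start + n]
        matches = sum(a == b for a, b in zip(reference, window))
        best = max(best, matches)
    return 100.0 * best / n


# Artificial example: a 50-nucleotide sequence identical to a region
# inside a 100-nucleotide sequence.
long_sequence = "ACGT" * 25          # 100 nucleotides
short_sequence = long_sequence[24:74]  # 50 nucleotides

print(len(long_sequence), len(short_sequence))
print(percent_identity(short_sequence, long_sequence))
```

Dividing by the reference length rather than the length of the longer sequence is what yields 100% rather than 50% in the 50-within-100 example given in the text.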
Although the natural polypeptide of SEQ ID NO:2 and a variant polypeptide may only possess a certain percentage identity, e.g., 90%, they are actually likely to possess a higher degree of similarity, depending on the number of dissimilar residues that are conservative changes. Conservative amino acid substitutions can frequently be made in a protein without altering either the conformation or function of the protein. Similarity between two sequences includes direct matches as well as conserved amino acid substitutes which possess similar structural or chemical properties, e.g., similar charge as described in Table 4. Percentage similarity (conservative substitutions) between two polypeptides may also be scored by comparing the amino acid sequences of the two polypeptides by using programs well known in the art, including the BESTFIT program, by employing default settings for determining similarity. A further embodiment of the invention is a crystal comprising the coordinates of FIG. 5, wherein the amino acid sequence is represented by SEQ ID NO:2 or SEQ ID NO:4. A further embodiment of the invention is a crystal comprising the coordinates of FIG. 5, wherein the amino acid sequence is at least 95%, or 98% homologous to the amino acid sequence represented by SEQ ID NO:4. Various methods for obtaining atomic coordinates of structurally homologous molecules and molecular complexes using homology modeling are disclosed in U.S. Patent No. 6,356,845.
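The distinction drawn above between percentage identity and percentage similarity can be shown with a short calculation on two pre-aligned sequences of equal length. This is a minimal sketch; the grouping of conservative substitutions below is an assumption made for illustration and is not a reproduction of Table 4, and the peptide sequences are artificial.

```python
# Illustrative conservative-substitution groups (one-letter codes).
# This grouping is an assumption for the sketch, not the content of Table 4.
CONSERVATIVE_GROUPS = [
    set("DE"),   # negatively charged
    set("KR"),   # positively charged
    set("LIV"),
    set("GA"),
    set("NQ"),
    set("ST"),
    set("FY"),
]


def is_conservative(a, b):
    """True if residues a and b fall within the same group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def identity_and_similarity(seq1, seq2):
    """Percent identity and percent similarity for two aligned sequences.

    Similarity counts direct matches plus conservative substitutions.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and equal in length")
    n = len(seq1)
    identical = sum(a == b for a, b in zip(seq1, seq2))
    similar = sum(a == b or is_conservative(a, b) for a, b in zip(seq1, seq2))
    return 100.0 * identical / n, 100.0 * similar / n


# Artificial ten-residue peptides differing at the last position only.
native = "ACDEFGHIKL"
conservative_variant = "ACDEFGHIKV"      # L replaced by V
nonconservative_variant = "ACDEFGHIKW"   # L replaced by W

print(identity_and_similarity(native, conservative_variant))
print(identity_and_similarity(native, nonconservative_variant))
```

Both variants share the same identity with the native peptide, but only the conservative variant scores a higher similarity, which is the situation the paragraph above describes.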
D. Structure Based Drug Design Once the three-dimensional structure of a crystal comprising a PDE2 protein, a functional domain thereof, homologue, fragment, variant, analogue or derivative thereof, is determined, a potential ligand (antagonist or agonist) may be examined through the use of computer modeling using a docking program such as GRAM, DOCK, or AUTODOCK. See for example, Morris et al., J. Computational Chemistry, 19:1639-1662 (1998). This procedure can include in silico fitting of potential ligands to the PDE2 crystal structure to ascertain how well the shape and the chemical structure of the potential ligand will complement or interfere with the catalytic domain of PDE2 (Bugg et al., Sci. Am., 92-98 (1993); and West et al., TIPS, 16:67-74 (1995)). Computer programs can also be employed to estimate the
attraction, repulsion, and steric hindrance of the ligand to the binding site. Generally the tighter the fit (e.g., the lower the steric hindrance, and/or the greater the attractive force) the more potent the potential drug will be since these properties are consistent with a tighter binding constant. Furthermore, the more specificity in the design of a potential drug the more likely that the drug will not interfere with the properties of other proteins. This will minimize potential side-effects due to unwanted interactions with other proteins. One embodiment of the present invention relates to a method of identifying an agent or chemical compound that binds to a binding site on the PDE2A catalytic domain, wherein the binding site comprises one or more amino acid residues including Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866 of SEQ ID NO:2 or SEQ ID NO:4, the method comprising contacting PDE2 with a test ligand under conditions suitable for binding of the test ligand to the binding site, and determining whether the test ligand binds in the binding site, wherein if binding occurs, the test ligand is an agent that binds in the binding site. In certain embodiments, the testing may be carried out in silico using a variety of molecular modeling software algorithms including, but not limited to, DOCK, ALADDIN, CHARMM simulations, AFFINITY, C2-LIGAND FIT, Catalyst, LUDI, CAVEAT, and CONCORD. Brooks et al., J. Comp. Chem., 4:187-217 (1983); and Meng et al., J. Comp. Chem., 13:505-524 (1992). In another embodiment, a potential ligand may be obtained by screening a random peptide library produced by a recombinant bacteriophage (Scott and Smith, Science, 249:386-390 (1990); Cwirla et al., PNAS, 87:6378-6382 (1990); Devlin et al., Science, 249:404-06 (1990)), or a chemical library, or the like.
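The estimation of attraction, repulsion, and steric hindrance mentioned above is commonly approximated by summing pairwise non-bonded terms between ligand atoms and binding site atoms. The following is a minimal sketch using a generic 12-6 Lennard-Jones term and a Coulomb term; it is an assumed, simplified energy function and is not the scoring function of DOCK, AUTODOCK, or any other program named herein. All parameter values, charges, and coordinates are hypothetical.

```python
import math

# Coulomb conversion factor giving kcal/mol for charges in units of e
# and distances in angstroms.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, sigma):
    """12-6 term: positive (steric clash) below sigma, minimum of -epsilon."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 * ratio6 - ratio6)


def coulomb(r, q1, q2, dielectric=4.0):
    """Electrostatic term: negative (attractive) for opposite charges."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def interaction_energy(ligand, site, epsilon=0.15, sigma=3.5):
    """Sum of pairwise terms between ligand atoms and binding site atoms.

    Each atom is ((x, y, z), partial_charge). A single epsilon and sigma are
    used for every pair, a simplification assumed for this sketch.
    """
    total = 0.0
    for l_xyz, l_q in ligand:
        for s_xyz, s_q in site:
            r = math.dist(l_xyz, s_xyz)
            total += lennard_jones(r, epsilon, sigma) + coulomb(r, l_q, s_q)
    return total


# Hypothetical two-atom ligand and two-atom site.
site = [((0.0, 0.0, 0.0), -0.5), ((4.0, 0.0, 0.0), 0.3)]
good_pose = [((0.0, 3.9, 0.0), 0.4), ((4.0, 3.9, 0.0), -0.2)]
clashing_pose = [((0.0, 2.0, 0.0), 0.4), ((4.0, 2.0, 0.0), -0.2)]

print(interaction_energy(good_pose, site))
print(interaction_energy(clashing_pose, site))
```

A pose with a lower (more negative) energy under such a function would be ranked as the tighter fit, while a pose with overlapping atoms is penalized by the steeply rising repulsive term.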
A ligand selected in this manner can then be systematically modified by computer modeling programs until one or more promising potential ligands are identified. Such analysis has been shown to be effective in the development of HIV protease inhibitors (Lam et al., Science 263:380-84 (1994); Wlodawer et al., Ann. Rev. Biochem. 62:543-585 (1993); Appelt, Perspectives in Drug Discovery and Design, 1:23-48 (1993); and Erickson, Perspectives in Drug Discovery and Design, 1:109-128 (1993)).
Such computer modeling allows the selection of a finite number of rational chemical modifications, as opposed to the countless number of essentially random chemical modifications that could be made, and of which any one might lead to a useful drug. Each chemical modification requires additional chemical steps, which, while reasonable for the synthesis of a finite number of compounds, quickly become overwhelming if all possible modifications need to be synthesized. Thus, through the use of the three-dimensional structure disclosed herein and computer modeling, a large number of these compounds can be rapidly screened on a computer, and a few likely candidates can be determined without the laborious synthesis of large numbers of compounds. Once a potential ligand (i.e., agonist or antagonist) is identified, it can be either selected from a library of chemicals, or alternatively, the potential ligand may be synthesized de novo. As mentioned above, the de novo synthesis of one or even a relatively small group of specific compounds is reasonable in the art of drug design. The prospective drug can be placed into any standard binding assay described herein to test its effect on PDE2 interaction. When a suitable drug is identified, a supplemental crystal can be grown which comprises a protein-ligand complex formed between a PDE2 protein and the drug. Preferably the crystal effectively diffracts X-rays allowing the determination of the atomic coordinates of the protein-ligand complex to a resolution of less than 5.0 Å, more preferably less than 3.0 Å, and even more preferably less than 2.0 Å. The three-dimensional structure of the supplemental crystal can be determined by Molecular Replacement Analysis. Molecular replacement involves using a known three-dimensional structure as a search model to determine the structure of a closely related molecule or protein-ligand complex in a new crystal form.
The measured X-ray diffraction properties of the new crystal are compared with the search model structure to compute the position and orientation of the protein in the new crystal. Computer programs that can be used include X-PLOR and AMoRe (Navaza, Acta Crystallographica A50:157-163 (1994)). Once the position and orientation are known, an electron density map can be calculated using the search model to provide X-ray phases. Thereafter, the electron density is inspected for structural differences,
and the search model is modified to conform to the new structure. Using this approach, it is possible to use the claimed structure of PDE2 to solve the three-dimensional structures of any such PDE2 complexed with a new ligand. Other computer programs that can be used to solve the structures of such crystals include QUANTA; CHARMM; INSIGHT; SYBYL; MACROMODEL; and ICM. Various in silico methods for screening, designing or selecting ligands are disclosed, for example, in U.S. Patent No. 6,356,845.
E. Ligands
In one aspect, the present invention provides the means to identify ligands, including for example agonists, antagonists, chemical compounds, etc., which interact with a binding site of the PDE2 catalytic domain defined by a set of points having a root mean square deviation of less than about 2.0 Å from points representing the backbone atoms of the amino acids represented by the structure coordinates listed in FIG. 5. A further embodiment of the present invention comprises agents which interact with a binding site of PDE2 defined by a set of points having a root mean square deviation of less than about 2.0, 1.7, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å from points representing the backbone atoms of the amino acids represented by the structure coordinates listed in FIG. 5. Such embodiments represent variants of the PDE2 crystal. In another aspect, the present invention describes ligands which bind to correctly folded polypeptides comprising an amino acid sequence spanning amino acids Ser578 to Glu919 listed in SEQ ID NOs: 2 or 4, or a homologue or a variant thereof. In certain embodiments, the ligand is a competitive or uncompetitive inhibitor of PDE2. In certain embodiments the ligand inhibits PDE2 with an IC50 or Ki of less than about 10 μM, 1 μM, 500 nM, 100 nM, 50 nM or 10 nM.
In certain embodiments, the ligand inhibits PDE2 with a Ki that is less than about one-half, one-fifth, or one-tenth the Ki that the substance has for inhibition of any other PDE enzyme. In other words, the substance inhibits PDE2 activity to the same degree at a concentration of about one-half, one-fifth, or one-tenth or less of the concentration required for any other PDE enzyme.
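The selectivity criterion above reduces to a ratio of inhibition constants. The following is a minimal Python sketch of that check; the function names and the Ki values are hypothetical and chosen only for illustration, not measured data.

```python
def fold_selectivity(ki_pde2, ki_other):
    """Return Ki(other PDE) / Ki(PDE2); larger values mean greater PDE2 selectivity."""
    if ki_pde2 <= 0 or ki_other <= 0:
        raise ValueError("Ki values must be positive")
    return ki_other / ki_pde2


def meets_selectivity(ki_pde2, other_kis, min_fold):
    """True if the PDE2 Ki is less than 1/min_fold of the Ki for every other PDE."""
    return all(fold_selectivity(ki_pde2, ki) > min_fold for ki in other_kis.values())


# Hypothetical Ki values in nM (illustrative assumptions, not data from this document).
example_kis = {"PDE4": 600.0, "PDE5": 1200.0}
for fold in (2, 5, 10):
    print(fold, meets_selectivity(50.0, example_kis, fold))
```

All Ki values passed to one call must share the same unit, since only their ratio is used.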
One embodiment of the present invention relates to agents or ligands, such as proteins, peptides, peptidomimetics, small organic molecules, etc., designed or developed with reference to the crystal structure of PDE2 as represented by the coordinates presented herein in FIG. 5, and portions thereof. Such agents interact with the binding site of the PDE2 represented by one or more amino acid residues selected from Tyr655, His656, Leu809, Asp811, Gln812, Ala823, Ile826, Phe830, Met847, Leu858, Gln859, Phe862 and Ile866. F. Machine Readable Storage Media Transformation of the structure coordinates for all or a portion of PDE2, or the PDE2/ligand complex or one of its binding pockets, for structurally homologous molecules as defined below, or for the structural equivalents of any of these molecules or molecular complexes as defined above, into three-dimensional (3-D) graphical representations of the molecule or complex can be conveniently achieved through the use of commercially-available software. The invention thus further provides a machine-readable storage medium comprising a data storage material encoded with machine readable data which, when using a machine programmed with instructions for using said data, is capable of displaying a graphical three-dimensional representation of any of the molecule or molecular complexes of this invention that have been described above. In a preferred embodiment, the machine-readable data storage medium comprises a data storage material encoded with machine readable data which, when using a machine programmed with instructions for using said data, is capable of displaying a graphical three-dimensional representation of a molecule or molecular complex comprising all or any parts of a PDE2 C-terminal catalytic domain or binding pocket, as defined above. 
In another preferred embodiment, the machine-readable data storage medium is capable of displaying a graphical three-dimensional representation of a molecule or molecular complex defined by the structure coordinates of the amino acids listed in FIG. 5, plus or minus a root mean square deviation from the backbone atoms of said amino acids of not more than 2.0 Å.
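The root mean square deviation criterion used above can be illustrated with a short Python sketch. It assumes that the two sets of backbone atom coordinates have already been paired atom by atom and superimposed; the optimal superposition step itself is not shown, and the function name and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. angstroms)
    between two equally sized lists of (x, y, z) tuples, paired by index.
    Assumes the two coordinate sets are already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical backbone coordinates in angstroms, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
value = rmsd(reference, candidate)
print(value, value <= 2.0)  # compare against the 2.0 angstrom threshold
```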
In an alternative embodiment, the machine-readable data storage medium comprises a data storage material encoded with a first set of machine readable data which comprises the Fourier transform of the structural coordinates set forth in FIG. 5, and which, when using a machine programmed with instructions for using said data, can be combined with a second set of machine readable data comprising the X-ray diffraction pattern of a molecule or molecular complex to determine at least a portion of the structural coordinates corresponding to the second set of machine readable data. For example, a system for reading a data storage medium may include a computer comprising a central processing unit ("CPU"), a working memory which may be, e.g., RAM (random access memory) or "core" memory, mass storage memory (such as one or more disk drives or CD-ROM drives), one or more display devices (e.g., cathode-ray tube ("CRT") displays, light emitting diode ("LED") displays, liquid crystal displays ("LCDs"), electroluminescent displays, vacuum fluorescent displays, field emission displays ("FEDs"), plasma displays, projection panels, etc.), one or more user input devices (e.g., keyboards, microphones, mice, touch screens, etc.), one or more input lines, and one or more output lines, all of which are interconnected by a conventional bidirectional system bus. The system may be a stand-alone computer, or may be networked (e.g., through local area networks, wide area networks, intranets, extranets, or the internet) to other systems (e.g., computers, hosts, servers, etc.). The system may also include additional computer controlled devices such as consumer electronics and appliances. Input hardware may be coupled to the computer by input lines and may be implemented in a variety of ways. Machine-readable data of this invention may be inputted via the use of a modem or modems connected by a telephone line or dedicated data line. 
Alternatively or additionally, the input hardware may comprise CD-ROM drives or disk drives. In conjunction with a display terminal, a keyboard may also be used as an input device. Output hardware may be coupled to the computer by output lines and may similarly be implemented by conventional devices. By way of example, the output hardware may include a display device for displaying a graphical representation of a
binding pocket of this invention using a program such as QUANTA as described herein. Output hardware might also include a printer, so that hard copy output may be produced, or a disk drive, to store system output for later use. In operation, a CPU coordinates the use of the various input and output devices, coordinates data accesses from mass storage devices, accesses to and from working memory, and determines the sequence of data processing steps. A number of programs may be used to process the machine-readable data of this invention. Such programs are discussed in reference to the computational methods of drug discovery as described herein. References to components of the hardware system are included as appropriate throughout the following description of the data storage medium. Machine-readable storage devices useful in the present invention include, but are not limited to, magnetic devices, electrical devices, optical devices, and combinations thereof. Examples of such data storage devices include, but are not limited to, hard disk devices, CD devices, digital video disk devices, floppy disk devices, removable hard disk devices, magneto-optic disk devices, magnetic tape devices, flash memory devices, bubble memory devices, holographic storage devices, and any other mass storage peripheral device. It should be understood that these storage devices include necessary hardware (e.g., drives, controllers, power supplies, etc.) as well as any necessary media (e.g., disks, flash cards, etc.) to enable the storage of data. The present invention is further illustrated by the following examples which should not be construed as limiting in any way. The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, microbiology and recombinant DNA, X-ray crystallography, and molecular modeling which are within the skill of the art. 
Such techniques are explained fully in the literature and known to those of skill in the pertinent arts. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press (1989)); DNA Cloning, Volumes I and II (D. N. Glover ed., (1985));
Oligonucleotide Synthesis (M. J. Gait ed., (1984)); U.S. Patent No: 4,683,195;
Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. (1984)); Transcription And Translation (B. D. Hames & S. J. Higgins eds. (1984)); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); and Crystallography Made Crystal Clear: A Guide for Users of Macromolecular Models (Gale Rhodes, 2nd Ed., San Diego: Academic Press (2000)).
Examples
Example 1: Construction and expression of PDE2 wild type catalytic domain
Various constructs of human PDE2 were created by PCR and subcloned into pFastBac-1 in order to generate recombinant baculovirus using the Bac-to-Bac system (Invitrogen). Two such constructs encode proteins that encompass a catalytic region of PDE2: in one embodiment, a catalytic region starting at Ser578 and extending to Gly934, designated L5N1 (SEQ ID NO:2); and in a preferred embodiment, a catalytic region starting at Ser578 and extending to Glu919, designated S5N (SEQ ID NO:4). The protein was expressed in SF21 insect cells infected with the recombinant baculovirus at an MOI of 0.1 and harvested 72 hrs post infection. Pellets of infected cells were frozen at -80°C for transfer to purification.
Example 2: Purification of PDE2 catalytic domain
Insect cell paste was resuspended in 50ml/L of ice-cold lysis buffer (50mM HEPES pH 8.0, 50mM NaCl, 5mM Tris(2-carboxyethyl)phosphine hydrochloride [TCEP] (Sigma), 0.5ml/L protease inhibitor cocktail (Sigma cat. # P8340), 1μg/mL leupeptin (Sigma), 1mM phenylmethanesulfonyl fluoride [PMSF] (Sigma) and 10μM trans-epoxysuccinyl-L-leucylamido(4-guanidino)-butane [E64] (Sigma)) and lysed by a single passage through a pre-chilled microfluidizer (M110L, Microfluidics International Corp., Needham, MA) at 18 kPSI chamber pressure. The lysate is then centrifuged (for 30 minutes to overnight) at 41,000g at 4°C. From this point on, every purification step is conducted at 4°C. Next, the supernatant is decanted, filtered (0.45 micron Nalgene filter), and passed through a combined Q sepharose fast flow (Amersham Biosciences) and SP sepharose fast flow (Amersham Biosciences) matrix manually packed in an XK50/20 column (Amersham Biosciences). It was discovered that neither Q nor SP sepharose bound the PDE2 catalytic domain, but both served as a good scrubbing matrix to remove contaminant host protein. In addition, combining both matrices in a single column improved (a) the efficiency of contaminant protein removal and (b) the time scale of the purification process.
After loading the supernatant, the column is washed with 10 column volumes [c.v.] of lysis buffer. The flow through is collected and buffer exchanged into Buffer A using the QuixStand Benchtop Cross Flow System (10K MWCO hollow fiber filter, A/G Technology Corporation, Needham, MA). Buffer A consists of 50mM MES pH 6.5, 50mM NaCl, 5mM TCEP, 0.5ml/L protease inhibitor cocktail, 1μg/mL leupeptin, 1mM PMSF, and 10μM E64. The purpose of this exchange is to lower the pH of the buffer, thereby facilitating the binding of PDE2 to the blue sepharose column. After buffer exchange, PDE2 is loaded onto a blue sepharose 6 fast flow column. After loading, the column is washed to baseline with Buffer A followed by 500mM NaCl (in Buffer A). This was found to help in the removal of contaminant proteins that bind non-specifically to the column. PDE2 is then eluted in the presence of 20mM cyclic guanosine monophosphate [cGMP] and 500mM NaCl (in Buffer A). This concentration of cGMP was found to be optimal for eluting the bound PDE2 while remaining within the solubility limits of cGMP at 4°C. The eluted PDE2 peak is pooled and concentrated using an Amicon Bioseparations stirred cell (Millipore) with a 10K cutoff Biomax PBGC membrane (Millipore). Upon concentration, the protein solution is immediately desalted into Buffer B using a HiPrep 26/10 desalting column. Buffer B consists of 50mM HEPES pH 7.0, 5mM TCEP, 0.5ml/L protease inhibitor cocktail, 1μg/mL leupeptin, 1mM PMSF, and 10μM E64. It was discovered that the protein has to be desalted as soon as it comes off the blue sepharose column because PDE2 is not very stable in the presence of high salt. The loss of protein due to aggregation increases with exposure to high salt. This observation is also consistent with results obtained in biophysical experiments prior to crystallization (see Example 3). After desalting, the protein is loaded onto a heparin sepharose 6 fast flow column. The protein is then eluted over a 20 c.v. gradient from 0-500mM NaCl in Buffer B. In this elution, PDE2 is separated into 2 distinct peaks. PDE2 from the first peak is less active than that of the second peak. Finally, the first peak off the heparin column is pooled and passed over a gel filtration column. The buffer for this column, which is also the final protein buffer, is 25mM HEPES pH 7.5, 25mM NaCl, and 5mM TCEP.
Example 3: Crystallization
Purified PDE2 from the first peak was set to crystal screening using Crystal Screen I and II (Hampton Research) and Wizard I & II (Emerald Biosciences) at 4°C and 15°C. Two hits were identified from conditions 37 (100mM HEPES pH 7.5, 10% PEG 8000, 8% ethylene glycol) and 38 (100mM HEPES pH 7.5, 20% PEG 10000) of Crystal Screen II at 4°C. Upon identification of these hits, multiple approaches to optimization were taken to improve crystal size and quality. First, extensive crystallization experiments were conducted to determine the effect of temperature. It was ultimately discovered that 4°C was optimal for crystal growth and that at 15°C crystals were small and dissolved rapidly. Second, since both hits excluded the presence of salt, dynamic light scattering experiments were carried out to determine the effects of salt on purified PDE2. It was discovered that the presence of salt enhanced polydispersity (i.e., instability) of PDE2. This led to immediate modifications of all PDE2 purification protocols and crystallization trials to exclude salt. Third, the presence of 50mM NaCl in the protein buffer was observed to reverse the vapor diffusion process (no NaCl in the crystallization buffer), resulting in increasing crystallization drop size. To counter this effect, 100mM NaCl was added to the reservoir but not to the crystallization drop. Fourth, although PDE2 could be crystallized using PEG of different molecular weights, it was discovered later that the use of PEG 5K MME was optimal for producing well-diffracting crystals. From the sum of all these data, it was finally concluded that the optimal crystallization condition was 100mM HEPES pH 7.5-8.0, 12-20% PEG 5K MME at 4°C with 100mM NaCl added only to the reservoir (not to the crystallization drop). L5N1 crystals grown under these conditions diffracted to 2.6 Å resolution at the Advanced Photon Source (Chicago). The crystallization conditions for the S5N construct produced crystals that diffracted to 1.7 Å resolution. Unless otherwise stated, all crystal optimization trials were carried out at 4°C, using 10mg/ml protein concentration and a total drop size of 2.4 μL (1.2 μL protein + 1.2 μL mother liquor).
Example 4: X-ray data collection, structure determination and refinement of PDE2
Crystals were transferred to a cryoprotectant solution, made up of the reservoir solution with 15% glycerol, and then flash-frozen in a stream of cold nitrogen gas at 100 K. A full data set was collected from one crystal frozen in this manner on a Rigaku R-AXIS IIc detector, mounted on a Rigaku RU-200 generator with Osmic optics. Data were processed using the HKL suite of software (Otwinowski & Minor, Methods Enzymol., 276:307-26 (1997)). Data collection statistics are summarized in Table 5a.

Table 5a - Data statistics
Resolution range           500.0-1.7 Å
Number of observations     Unique: 39346
Redundancy                 4
Completeness (%)           97.1 (92.1)¹
I/σ(I)                     17.5 (2.2)
Rsym                       0.072 (0.48)¹,²
¹ Numbers in parentheses refer to the highest resolution range (2.10-2.17 Å).
² Rsym = Σ|I - <I>| / Σ<I>

The crystals belong to space group C222₁ with unit cell dimensions a=87.47 Å, b=87.47 Å, c=135.03 Å. They contain one molecule of the polypeptide and one molecule of the inhibitor per asymmetric unit. The structure was solved by the method of molecular replacement, using the program EPMR (Kissinger et al., Acta Crystallographica, D55:484-491 (1999)). The search model consisted of only the backbone atoms of PDE4B taken from PDB entry 1F0J (Xu et al., supra), residues 152 to 461. A clear solution to the rotation and translation function searches was found using diffraction data limited to 5 Å resolution. A homology model of PDE2 was then positioned according to the top rotation/translation search and subjected to refinement and a combination of automatic and manual refitting. Automatic refitting was carried out using the program ARP/wARP in combination with REFMAC (Murshudov et al., Acta Cryst. D53:240-55 (1997)), and manual fitting used the program O. Refinement in REFMAC was carried out using all data in the resolution range 20.0-1.7 Å. Partial structure factors from a bulk-solvent model and anisotropic B-factor correction were supplied throughout the refinement. The R-factor for the current model is 0.21 (free R-factor, 7% of the data, 0.23). The refinement statistics are summarized in Table 5b.

Table 5b - Refinement statistics
Nr. of reflections used (%)          37389 (89.4%)
Nr. of reflections used for Rfree    3413 (8.2%)
Rwork/Rfree                          0.215/0.248
Number of atoms                      2,932
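The Rsym statistic reported in Table 5a can be illustrated with a minimal Python sketch. It assumes the usual convention that the deviation of each measurement from the mean of its symmetry-equivalent group is taken as an absolute value, and that reflections measured only once are skipped; the function name and the intensities are hypothetical and are not data from this example.

```python
def r_sym(observations):
    """Compute Rsym = sum(|I - <I>|) / sum(<I>) over groups of
    symmetry-equivalent intensity measurements.

    observations maps a reflection index (h, k, l) to a list of measured
    intensities; <I> is the mean of each list. Singly measured reflections
    are skipped (an assumption of this sketch)."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += mean * len(intensities)  # <I> summed once per observation
    if denominator == 0:
        raise ValueError("no multiply measured reflections with non-zero intensity")
    return numerator / denominator


# Hypothetical intensities, for illustration only.
data = {
    (1, 0, 0): [90.0, 110.0],
    (0, 1, 0): [100.0, 100.0],
    (0, 0, 2): [55.0],  # measured once, ignored
}
print(r_sym(data))
```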
Example 5: Structural Alignment of PDE2 and PDE4
FIG. 4 provides a structure alignment of PDE2 with PDE4. The PDE4 structure is in complex with Rolipram (Huai et al., Structure, 11:865-873 (2003)). A total of about 234 equivalent Cα atoms are superimposed between the PDE2 and PDE4 structures with an RMS distance of 1.4 Å. The PDE2 structure is shown in the same orientation as in FIG. 1. Rolipram bound to PDE4D is drawn as a space-filling model. The secondary structures of PDE2 and PDE4D are well conserved, as observed with other members of the PDE family. The binding mode of Rolipram in PDE4 also defines the ligand/inhibitor binding pocket of PDE2. This kind of 3D structure alignment and comparison will provide precise information for the structure based design of specific and selective PDE2 ligands and specific and selective PDE4 ligands.
Equivalents
While specific embodiments of the subject invention have been discussed, the above specification is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The full scope of the invention should be determined by reference to the claims, along with their full scope of equivalents, and the specification, along with such variations. All publications and patents mentioned herein are hereby incorporated by reference in their entireties. In case of conflict, any definitions herein will control. The numbering of the amino acid residues herein corresponds to the numbering provided by Rosman et al., Gene, 191:89-95 (1991), with the exception of an Ala715 in Applicants' SEQ ID NO:2 and SEQ ID NO:4.
List of Sequences
SEQ ID NO: 1
L5N1 DNA: atgcatGCCtccgacgatgagtataccaaacttctccatgatgggatccagcctgtggctgccattgactccaattttgc aagtttcacctatacccctcgttccctgcccgaggatgacacgtccatggccatcctgagcatgctgcaggacatgaatt tcatcaacaactacaaaattgactgcccgaccctggcccggttctgtttgatggtgaagaagggctaccgggatcccccc taccacaactggatgcacgccttttctgtctcccacttctgctacctgctctacaagaacctggagctcaccaactacct cgaggacatcgagatctttgccttgtttatttcctgcatgtgtcatgacctggaccacagaggcacaaacaactctttcc aggtggcctcgaaatctgcgctggctgcgctctacagctctgagggctccgtcatggagaggcaccactttgctcaggcc attgccatcctcaacacccacggctgcaacatctttgatcatttctcccggaaggactatcagcgcatgctggatctgat gcgggacatcatcttggccacagacctggcccaccatctccgcatcttcaaggacctccagaagatggctgaggtgggct acgaccgaaacaacaagcagcaccacagacttctcctctgcctcctcatgacctcctgtgacctctctgaccagaccaag ggctggaagactacgagaaagatcgcggagctgatctacaaagaattcttctcccagggagacctggagaaggccatggg caacaggccgatggagatgatggaccgggagaaggcctatatccctgagctgcaaatcagcttcatggagcacattgcaa tgcccatctacaagctgttgcaggacctgttccccaaagcggcagagctgtatgagcgcgtggcctccaaccgtgagcac tggaccaaggtgtcccacaagttcaccatccgcggcctcccaagtaacaactcgctggacttcctggatgaggagtacga ggtgcctgatctggatggcactagggcccccatcaatggctaa
SEQ ID NO:2 L5N1 Protein:
MHASDDEYTKLLHDGIQPVAAIDSNFASFTYTPRSLPEDDTSMAILSMLQDM NFINNYKIDCPTLARFCLMVKKGYRDPPYHNWMHAFSVSHFCYLLYKNLEL TNYLEDIEIFALFISCMCHDLDHRGTNNSFQVASKSALAALYSSEGSVMERH HFAQAIAILNTHGCNIFDHFSRKDYQRMLDLMRDIILATDLAHHLRIFKDLQ KMAEVGYDRNNKQHHRLLLCLLMTSCDLSDQTKGWKTTRKIAELIYKEFFS QGDLEKAMGNRPMEMMDREKAYIPELQISFMEHIAMPIYKLLQDLFPKAAE LYERVASNREHWTKVSHKFTIRGLPSNNSLDFLDEEYEVPDLDGTRAPING
SEQ ID NO:3
S5N DNA: atgcatGCCtccgacgatgagtataccaaacttctccatgatgggatccagcctgtggctgccattgactccaattttgc aagtttcacctatacccctcgttccctgcccgaggatgacacgtccatggccatcctgagcatgctgcaggacatgaatt tcatcaacaactacaaaattgactgcccgaccctggcccggttctgtttgatggtgaagaagggctaccgggatcccccc taccacaactggatgcacgccttttctgtctcccacttctgctacctgctctacaagaacctggagctcaccaactacct cgaggacatcgagatctttgccttgtttatttcctgcatgtgtcatgacctggaccacagaggcacaaacaactctttcc aggtggcctcgaaatctgcgctggctgcgctctacagctctgagggctccgtcatggagaggcaccactttgctcaggcc attgccatcctcaacacccacggctgcaacatctttgatcatttctcccggaaggactatcagcgcatgctggatctgat gcgggacatcatcttggccacagacctggcccaccatctccgcatcttcaaggacctccagaagatggctgaggtgggct acgaccgaaacaacaagcagcaccacagacttctcctctgcctcctcatgacctcctgtgacctctctgaccagaccaag ggctggaagactacgagaaagatcgcggagctgatctacaaagaattcttctcccagggagacctggagaaggccatggg
caacaggccgatggagatgatggaccgggagaaggcctatatccctgagctgcaaatcagcttcatggagcacattgcaa tgcccatctacaagctgttgcaggacctgttccccaaagcggcagagctgtatgagcgcgtggcctccaaccgtgagcac tggaccaaggtgtcccacaagttcaccatccgcggcctcccaagtaacaactcgctggacttcctggatgaggagtaa
SEQ ID NO:4
S5N Protein:
MHASDDEYTKLLHDGIQPVAAIDSNFASFTYTPRSLPEDDTSMAILSMLQDM NFINNYKIDCPTLARFCLMVKKGYRDPPYHNWMHAFSVSHFCYLLYKNLEL TNYLEDIEIFALFISCMCHDLDHRGTNNSFQVASKSALAALYSSEGSVMERH HFAQAIAILNTHGCNIFDHFSRKDYQRMLDLMRDIILATDLAHHLRIFKDLQ KMAEVGYDRNNKQHHRLLLCLLMTSCDLSDQTKGWKTTRKIAELIYKEFFS QGDLEKAMGNRPMEMMDREKAYIPELQISFMEHIAMPIYKLLQDLFPKAAE LYERVASNREHWTKVSHKFTIRGLPSNNSLDFLDEE
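The correspondence between the DNA and protein sequences listed above can be checked by translating the coding sequence with the standard genetic code. The following Python sketch is one way to do so; the function name is hypothetical, and the two fragments shown are short excerpts of the L5N1 DNA sequence rather than the full sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence (5' to 3', reading frame 1) into a
    one-letter protein sequence, stopping at the first stop codon.
    Whitespace is ignored and case is normalized."""
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Short fragments copied from the L5N1 DNA sequence.
print(translate("atgcatGCCtccgacgatgag"))
print(translate("tacaaaattgactgcccg"))
```

Passing a complete DNA sequence from the list, with or without the spaces between blocks, gives the full translated protein for comparison with the corresponding protein sequence.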