WO2002040631A2 - Dipeptide seryl-histidine and related oligopeptides cleave dna, protein, and a carboxyl ester - Google Patents

Publication number
WO2002040631A2
Authority
WO
WIPO (PCT)
Prior art keywords
ser
dna
cleavage
amino acid
nucleic acid
Prior art date
Application number
PCT/US2001/043079
Other languages
French (fr)
Other versions
WO2002040631A3 (en)
Inventor
Xiaozhou Chen
Yunsheng Li
Scott Hatfield
Original Assignee
Ohio University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ohio University
Publication of WO2002040631A2
Publication of WO2002040631A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K7/00 Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
    • C07K7/04 Linear peptides containing only normal peptide links
    • C07K7/06 Linear peptides containing only normal peptide links having 5 to 11 amino acids
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K19/00 Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K5/00 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
    • C07K5/04 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
    • C07K5/08 Tripeptides
    • C07K5/0802 Tripeptides with the first amino acid being neutral
    • C07K5/0804 Tripeptides with the first amino acid being neutral and aliphatic
    • C07K5/081 Tripeptides with the first amino acid being neutral and aliphatic the side chain containing O or S as heteroatoms, e.g. Cys, Ser
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K5/00 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
    • C07K5/04 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
    • C07K5/10 Tetrapeptides
    • C07K5/1002 Tetrapeptides with the first amino acid being neutral
    • C07K5/1005 Tetrapeptides with the first amino acid being neutral and aliphatic
    • C07K5/1013 Tetrapeptides with the first amino acid being neutral and aliphatic the side chain containing O or S as heteroatoms, e.g. Cys, Ser
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K7/00 Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
    • C07K7/04 Linear peptides containing only normal peptide links
    • C07K7/08 Linear peptides containing only normal peptide links having 12 to 20 amino acids

Definitions

  • Dipeptide seryl-histidine and related oligopeptides cleave DNA, protein, and a carboxyl ester
  • the invention relates to novel compounds with nuclease and protease activity, and to the use of these and related compounds as nucleases and proteases. Use of the compounds as nicking agents in nick translation is of particular interest.
  • Enzymes Generally. Biological enzymes are polypeptides or polyribonucleotides that catalyze biochemical reactions in organisms. Although polyribonucleotide enzymes
  • Polypeptide enzymes contain active sites where chemical substrates are catalytically converted into products by the enzymes.
  • the active sites consist of multiple amino acid residues, which are usually not directly linked but are held in a precise three-dimensional conformation by various biochemical and biophysical forces. It is this precise three-dimensional conformation that creates a unique cleft and biochemical micro-environment that allows only one or a few chemical substrates to gain entry and to be reacted upon.
  • An active site is a small part of an enzyme, but the rest of the enzyme is important for the maintenance of the precise conformation of the active site. It is the side chains of the amino acids that participate in the enzymatic reaction.
  • amino acids that are involved in the catalytic reactions in the active sites are most often those with reactive side chains: positive or negative charges (Lys, Arg, or Asp, Glu), polar groups such as -OH (Ser, Thr, Tyr), -SH (Cys, Met), or an imidazole group (His). Also, it is almost always the coordinated participation of two or more side chain groups that allows the specific and efficient catalytic reaction to occur.
  • the active site may, but need not, be identical to its binding site.
  • the binding site residues are those directly involved in its binding to the substrate.
  • the active site residues are those directly involved in the enzymatic modification of the bound substrate.
  • One family of biological enzymes is proteases. These are enzymes that use proteins as their substrate and catalyze protein cleavage (proteolytic) reactions. There are several subgroups of proteases classified based on the key catalytic amino acid residue in the active site of the enzyme. In serine proteases such as chymotrypsin and trypsin, amino acid residues Ser, His, and Asp form a catalytic triad, and cleave protein substrate with a hydrolysis mechanism.
  • Ser and His have been determined to be the most important residues, since the substitution of either of the two residues with unrelated amino acids would essentially completely abolish the proteolytic activity of the enzyme, whereas the substitution of Asp would allow the enzyme to maintain a significantly reduced but measurable proteolytic activity.
  • In thiol proteases, the Ser in the active site is replaced with Cys. Since the -SH group in Cys is chemically very similar to the -OH group in Ser, it is found that thiol proteases catalyze the proteolytic reaction with a hydrolysis mechanism that is very similar to the one used by serine proteases.
  • the nucleic acid-binding enzymes include methylases, ligases, polymerases, replicases, and nucleases.
  • Nucleases are enzymes that cleave nucleic acids (DNA and RNA).
  • DNA-cleaving nucleases include restriction endonucleases, homing endonucleases, topoisomerases, and nucleases involved in genetic recombination and DNA repair.
  • DNA nucleases cleave DNA by catalytic hydrolysis of the phosphodiester bonds, which are very resistant to non-catalytic hydrolysis. Most nuclease-catalyzed phosphodiester bond cleavage proceeds by P-O bond cleavage.
  • the hydrolytic (P-O) cleavage of the phosphodiester bond is through an SN2(P) mechanism that involves the generation of an electron-rich pentacoordinate phosphorane as a reactive intermediate.
  • Nucleases include both exonucleases, which degrade the ends of a nucleic acid, and endonucleases, which can attack an internal site in the nucleic acid.
  • a nuclease may have both exo- and endonuclease activity.
  • Nucleases also differ in their degree of processivity, which is their ability to repeatedly attack the same substrate before releasing it. A nuclease with high processivity will cause (given appropriate conditions) more degradation than one with low processivity.
  • a nuclease may, but need not, be specific for single or double-stranded nucleic acid, for DNA or for RNA, and for particular nucleic acid sequences. If it is sequence-specific, it may recognize one sequence but cleave the nucleic acid somewhere other than the recognition site. The degree of specificity may vary; it is not an all-or-nothing proposition.
  • Amino acids with charged or polar side chains are involved in the active sites of nucleases. Asp and Glu, two negatively charged amino acids, and Tyr, an amino acid with an -OH group on its side chain, are located in the active site and participate in the DNA cleavage of the exonuclease activity of DNA polymerase I.
  • homing endonucleases are a group of enzymes whose catalytic activity results in self-propagation.
  • the sequences that code for these endonucleases usually interrupt genes by localizing as open reading frames in introns or as in-frame spacers in protein-coding sequences.
  • the target of a homing endonuclease is its cognate intronless or spacerless allele.
  • the endonuclease initiates a DNA mobility or "homing" event by making a double-strand cut in its target.
  • the homing endonuclease resulting from an in-frame polypeptide spacer can first function as a self-splicing protein cleavage enzyme (specifically termed an intein). After the intein cleaves itself out of the "host" protein, the intein functions as a homing endonuclease by cutting a target DNA at specific sites.
  • the in-frame spacer, or intein, can function first as a protease and then as an endonuclease (DNase), cleaving both protein and DNA.
  • the N-terminus and the C-terminus of inteins, which participate in the self-splicing, always contain a Ser (or Cys) residue and a His residue, respectively.
  • terminal Ser and His may be involved. This is the only known case in nature in which a natural polypeptide can function as a protease and as an endonuclease, and at the same time invariably contains Ser and His residues at its ends.
  • Nucleases contain both substrate binding and catalysis sites. These two sites can be next to each other or overlapping.
  • EcoRV is one of the restriction endonucleases that have been studied in detail. EcoRV recognizes a palindromic double-stranded sequence GATATC and cleaves at the phosphodiester bond between the first T and A, generating a blunt end. This restriction specificity is achieved with retention of catalytic prowess. A change of a single base pair in the recognition sequence lowers the cleavage rate more than a millionfold.
  • EcoRV is a dimer of identical subunits, and binds DNA so that the twofold axis of the target site coincides with the twofold axis of the enzyme.
  • the symmetry of the endonuclease matches the symmetry of its targets.
  • the EcoRV endonuclease searches DNA for its GATATC target sequence by diffusing along its major groove. Specifically, a surface loop from a β turn of each subunit makes contact with the major groove.
  • a large structural rearrangement occurs in both the enzyme and its DNA target.
  • DNA becomes kinked by 50 degrees at the center of the hexanucleotide recognition site.
  • Each recognition loop forms six hydrogen bonds, all in the major groove, with the outer two base pairs of a GAT half site.
  • Mg++, which is essential for hydrolysis, enters the catalytic site and becomes coordinated only when the target sequence is encountered.
  • Restriction enzymes are found in many microorganisms (bacteria) and have protective functions for the host. These enzymes recognize specific target sequences, and cleave either within or outside of this sequence. While restriction enzymes are very useful in biological research as a means of nucleic acid manipulation, they are limited in the number of target sequences they can recognize. A site-specific nucleic acid cleavage molecule that can recognize and cleave any specific sequence would be highly desirable.
  • Nonspecific nucleic acid cleavage agents include transition metal ions, particularly Fe++ and Cu++, and reducing agents such as ascorbate. However, these agents cleave nucleic acids by mechanisms other than hydrolysis, such as oxidation. Thus, the cleavage products usually lose one or more bases at the cleavage sites.
  • a nick is the cleavage of just one strand in a double-stranded nucleic acid. (If the substrate is nicked a second time at the same site, the result is the complete cleavage of the double-stranded nucleic acid.)
  • nicking agents necessarily have some potential to cleave nucleic acids as well as nick them, as a result of successive nicks.
  • One of the many utilities of certain DNA-binding enzymes has been in nick translation.
  • Nick Translation is a procedure commonly used in molecular biology laboratories for labeling DNA probes by radioactive or nonradioactive means.
  • DNase I is an enzyme isolated from bovine pancreas and is an endonuclease that hydrolyzes double-stranded and single-stranded DNA to a complex mixture of mono- to oligonucleotides with 5'-phosphate and 3'-hydroxyl termini.
  • DNase I attacks each strand of DNA independently (nicks) and the sites of cleavage are distributed in a statistically random fashion.
  • DNase I cleaves both strands of DNA (double-stranded breaks) at approximately the same site to yield fragments of DNA that are blunt-ended or have protruding termini only one or two nucleotides in length.
  • Escherichia coli DNA polymerase I adds nucleotide residues to the 3'-hydroxyl terminus that is created when one strand of the double-stranded DNA molecule is nicked.
  • the enzyme by virtue of its 5' to 3' exonucleolytic activity, can remove nucleotides from the 5' side of the nick.
  • the elimination of nucleotides from the 5' side and the sequential addition of nucleotides to the 3' side results in movement of the nick (nick translation) along the DNA (Kelly et al. 1970).
  • By replacing the preexisting nucleotides with highly radioactive or labeled non-radioactive nucleotides, it is possible to prepare 32P- and other-labeled DNA (Maniatis et al. 1975, and DIG labeled probe).
  • as nicks in the substrate DNA are translated in the 5' to 3' direction, labeled nucleotides are incorporated into the DNA, generating a randomly labeled DNA probe.
  • the specific activity of the nick-translated DNA probe depends not only on the specific activity of the dNTPs, but also on the extent of nucleotide replacement of the template. This can be controlled by varying the amount of DNase I in the reaction. The aim is to establish conditions that will result in incorporation of about 30% of the [α-32P]dNTPs into DNA.
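The 30% incorporation target can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that incorporation is estimated as the ratio of label recovered in DNA to total label in the reaction (both expressed as counts), which is not spelled out in the text, and the function names and the 5-point tolerance are hypothetical.

```python
def percent_incorporation(incorporated_counts: float, total_counts: float) -> float:
    """Percentage of the labeled dNTP that ended up in DNA.

    Both arguments are assumed to be in the same units (e.g. counts per
    minute measured on equal aliquots); this measurement scheme is an
    assumption, not something stated in the source text.
    """
    if total_counts <= 0:
        raise ValueError("total_counts must be positive")
    return 100.0 * incorporated_counts / total_counts


def near_target(percent: float, target: float = 30.0, tolerance: float = 5.0) -> bool:
    """True when incorporation is within a (hypothetical) tolerance of the target."""
    return abs(percent - target) <= tolerance


if __name__ == "__main__":
    pct = percent_incorporation(incorporated_counts=3.0e5, total_counts=1.0e6)
    print(f"{pct:.1f}% incorporated; near 30% target: {near_target(pct)}")
```

A result well above the target would suggest reducing the amount of nicking agent, and a result well below would suggest increasing it, in line with the control described above.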
  • the size of DNA after nick translation also depends on the amount of DNase I added to the reaction and the amount of DNase contaminating the preparation of DNA polymerase.
  • the standard Nick Translation Kit supplies an Enzyme Mixture containing DNA Polymerase I and an undisclosed amount of DNase I.
  • the nicking activity of DNase I, therefore, cannot be modulated independently from the polymerase activity of DNA Polymerase I.
  • Excessive nicking of the DNA substrate by DNase I in the Nick Translation reaction results in some degree of fragmentation of the DNA and in suboptimal labeling of the probes.
  • the Nick Translation reaction cannot be shortened to avoid excessive nicking without a concomitant loss in polymerization and labeling.
  • Serine and histidine are found in the binding sites of both proteases and nucleases.
  • the amino acids serine (Ser) and histidine (His) also function together in the active sites of various natural proteases, esterases and lipases, but not in any natural nucleases, as direct participants in enzymatic reactions [1-7].
  • Ser and His are directly involved in the peptide bond and ester bond cleavage reactions of the serine proteases chymotrypsin, trypsin, and elastase [1, 8-10].
  • the active sites of trypsin, chymotrypsin, and elastase all use Ser and His with Asp to form a catalytic triad.
  • Asp forms a catalytic triad with Ser and His in the active sites of serine proteases [1], lipases [3-5], and esterases [6, 7].
  • the Ser/His dyad has been shown to be sufficient for the cleavage reactions [8-11].
  • With its hydroxyl functional group, serine usually plays one of two roles in enzymatic reactions, serving either as a hydrogen donor or as a nucleophile. It is worth noting that phosphorylation of Ser in polypeptides is a key signal transduction step in all organisms, suggesting an early coupling of Ser and phosphate.
  • Histidine is likewise important in many enzymatic reactions, serving as a general acid or base in catalysis. Histidine, for example, is known to be an essential amino acid residue in
  • the exonuclease active site of DNA polymerase I does not contain either Ser or His, see Beese and Steitz, EMBO J. 10:25-33 (1991). That is likewise true of the active site of staphylococcal nuclease, see Loll and Lattman, Proteins Struct. Funct. Genet., 5:183-201 (1989), and of EcoRV, see Winkler, et al., EMBO J., 12:1781-1795 (1993).
  • the active sites of Ribonuclease A and RNase T1 contain His but not Ser.
  • Ser-His per se is a nonspecific nuclease. Such nucleases have utility, for example as nicking agents, in place of (or in addition to) DNase I, in the process of nick translation.
  • the dipeptide seryl-histidine (Ser-His) can also nick DNA.
  • the nicking activity of the dipeptide can be modulated with changes in concentration and incubation temperature.
  • the nicking activity of Ser-His is lower than that of DNase I and it is much easier to control, which makes it more suitable as the nicking agent in Nick Translation.
  • because Ser-His can act as a nonspecific proteinase, it may be used unchanged as such.
  • the present invention also relates to derivatives and analogues of Ser-His which have been engineered to have specificity for a particular substrate. These may be used as specific proteinases, esterases, nucleases, etc.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Figure 3. Two proposed mechanisms for DNA cleavage by Ser-His: A. Single phosphate mechanism. B. Dual phosphate mechanism.
  • Figure 4. Computer-generated models for interaction of Ser-His with single-stranded DNA substrate.
  • a 60mer oligonucleotide, whose sequence is derived from the luciferase gene, is end-labeled at its 5' end and used as the substrate.
  • the PNA moiety of the cleaving molecule contains 15 bases that are complementary to a region of the substrate sequence. If site-specific cleavage is achieved, the cleaved product, an end-labeled 22mer, will be detected on a PAGE autoradiograph as a band shorter than the original 60mer substrate.
  • Figure 7. Flow chart of Nick Translation with Ser-His.
  • Figure 8. Diagram showing site of hydrolysis of a phosphodiester bond by Ser-His.
  • Figure 9. Diagram showing nicking mechanism.
  • Figure 10. Diagram showing labeling mechanism.
  • Figure 11. Bar charts showing that DNA probes produced by Ser-His treatment of (A) lambda DNA and (B) pBR322 exhibited higher specific activity than probes produced by the nick translation kit.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE INVENTION
  • nuclease includes enzymes which cleave DNA, RNA or both, in single and/or double-stranded form, and in linear and/or circular form. It includes both exo- and endonucleases, and both sequence-specific and non-sequence-specific enzymes.
  • protease includes enzymes which cleave peptides and enzymes which cleave proteins. It includes both sequence-specific and non-sequence-specific enzymes.
  • the target molecule may be any protein (including peptides) or nucleic acid (including RNAs and DNAs).
  • the target nucleic acid molecules of particular interest are probes suitable for nick translation, which results in the incorporation of a labeled nucleotide into the target molecule.
  • the enzyme of the present invention is a non-sequence-specific enzyme, such as Ser-His and certain related molecules. These may be referred to collectively as Ser-His-Like Enzymes, or "SHLE" compounds, for short.
  • SHLE compounds include peptides, and peptoids, peptidomimetics and analogues thereof.
  • These non-sequence-specific enzymes are of value in utilities where specificity is not important or is even detrimental. One such utility is nicking nucleic acids preparatory to their labeling by nick translation.
  • the non-sequence-specific enzyme has the following structure:
  • This formula includes all of the molecules with cleavage activities of ++ or better in Table 2, and likewise excludes the Table 2 molecules with lesser activity.
  • Table 2 shows that the Ser can be replaced with Cys, but not with the related amino acids Thr or Asp. Whether it can be replaced with Gly or Ala is not known at this time. The His cannot be replaced with Arg or Lys, although they are positively charged, too.
  • the Ser (or Cys) is preferably at the amino terminal of the enzyme. However, the Ser (or Cys) and the His need not be adjacent to each other; at least a three amino acid separation is tolerated, and there is no reason to believe that a greater separation could not be accepted. Moreover, it is clear that at least one amino acid may be placed after the His. Again, there is no reason to believe that this is an upper limit on the C-terminal post-His moiety.
  • a series of combinatorial libraries may be used to systematically test all oligopeptides with this formula for nucleic acid or peptide cleavage activity.
  • the value of m and n are fixed, so that the library is of fixed length.
  • the library members are expressed in cells, and the peptides are displayed on the surface of the cells or of phage produced by the cells. This has the advantage of amplifiability.
  • the Xaa may each be encoded by a degenerate codon which allows all 20 amino acids; such codons include the NNN and NNK codons.
  • one or more Xaa positions may be restricted, e.g., to a mixture of Ser, His, Asp, Gly (the four alternatives explored in Table 1), or possibly to one further including Thr and/or Ala, which are fairly frequently exchanged with Ser and/or Cys in families of homologous proteins.
  • the maximum number of different sequences for the longest possible peptide is 4^12, which is about 1.7 x 10^7, and well within the typical diversity range of contemporaneous peptide libraries.
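The diversity figure above is a direct power calculation, sketched here for checking. The function name is hypothetical; the alphabet size of 4 and the 12 variable positions come from the text, while the comparison against an unrestricted 20-letter alphabet is added for illustration only.

```python
def library_diversity(alphabet_size: int, variable_positions: int) -> int:
    """Number of distinct sequences when each variable position is drawn
    independently from an alphabet of the given size."""
    return alphabet_size ** variable_positions


if __name__ == "__main__":
    restricted = library_diversity(4, 12)     # Ser, His, Asp, Gly at 12 positions
    unrestricted = library_diversity(20, 12)  # all 20 amino acids, for comparison
    print(f"restricted:   {restricted:,} ({restricted:.2e})")
    print(f"unrestricted: {unrestricted:,} ({unrestricted:.2e})")
```

The restricted library evaluates to 16,777,216 sequences, whereas allowing all 20 amino acids at the same 12 positions would give about 4.1 x 10^15, which illustrates why restricting the Xaa positions keeps the library tractable.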
  • Ser is encoded by TCN, AGT and AGC. Cys is encoded by TGT and TGC. Hence, it is preferable to prepare separate Ser-XXX and Cys-XXX libraries, where "XXX" denotes the remainder of the peptide.
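The codon statements above (NNN and NNK allow all 20 amino acids; Ser is TCN, AGT, AGC; Cys is TGT, TGC) can be verified against the standard genetic code. This is a minimal sketch: the helper names are hypothetical, the table is the standard code written in the conventional TCAG order, and N and K follow the usual IUPAC meanings (any base; G or T).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (first base varies slowest); '*' marks a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC symbols needed here: N = any base, K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(degenerate: str) -> list[str]:
    """All concrete codons matched by a degenerate codon such as 'NNK'."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in degenerate))]


def summarize(degenerate: str) -> tuple[int, int, list[str]]:
    """(number of codons, number of distinct amino acids, sorted stop codons)."""
    codons = expand(degenerate)
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), len(amino_acids), stops


def codons_for(one_letter: str) -> list[str]:
    """Sorted list of codons encoding the given amino acid (one-letter code)."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == one_letter)


if __name__ == "__main__":
    for scheme in ("NNN", "NNK"):
        print(scheme, summarize(scheme))
    print("Ser:", codons_for("S"))
    print("Cys:", codons_for("C"))
```

Under these definitions NNK covers all 20 amino acids with 32 codons and a single stop codon (TAG), compared with 64 codons and three stops for NNN. Ser has six codons split across two unrelated codon families while Cys has two, which is consistent with preparing the Ser-XXX and Cys-XXX libraries separately.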
  • the libraries may be screened separately, or pooled together. Most of the entire set of formula peptides may be prepared in one step if, in synthesizing the encoding DNA, the DNA is synthesized in steps in which one adds one or more codons in each step, rather than just one base. In that case, in step 2, one adds a mixture of Xaa, Xaa-Xaa, Xaa-Xaa-Xaa, etc. His is added in step 3 and another variable length mixture in step 4. This approach does not, however, produce those subsets in which m, or n, or both, are zero. Those would need to be handled separately.
  • One would then set aside an aliquot of the DNA (for which n = 0) and, in step 3, add a mixture of Xaa, Xaa-Xaa, Xaa-Xaa-Xaa, etc.
  • This variation produces the entire set of formula peptides in one operation.
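The variable-length library described above can be enumerated in software for small m and n. This sketch rests on a stated assumption: the structural formula itself is not reproduced in this text, so the code assumes the general form (Ser or Cys)-(Xaa)m-His-(Xaa)n suggested by the surrounding discussion, with Xaa restricted to the four residues Ser, His, Asp and Gly. The function names are hypothetical.

```python
from itertools import product
from typing import Iterator, Sequence

# Restricted Xaa alphabet named in the text (the four alternatives of Table 1).
RESTRICTED = ("Ser", "His", "Asp", "Gly")


def library(m: int, n: int,
            first: Sequence[str] = ("Ser", "Cys"),
            alphabet: Sequence[str] = RESTRICTED) -> Iterator[str]:
    """Yield every peptide of the ASSUMED form first-(Xaa)m-His-(Xaa)n."""
    for head in first:
        for middle in product(alphabet, repeat=m):
            for tail in product(alphabet, repeat=n):
                yield "-".join((head, *middle, "His", *tail))


def library_size(max_m: int, max_n: int) -> int:
    """Total peptides over all 0 <= m <= max_m and 0 <= n <= max_n,
    including the m = 0 and n = 0 subsets that need separate handling
    in the stepwise DNA synthesis."""
    return sum(1
               for m in range(max_m + 1)
               for n in range(max_n + 1)
               for _ in library(m, n))


if __name__ == "__main__":
    print(list(library(0, 0)))         # the two parent dipeptides
    print(len(list(library(1, 1))))    # 2 * 4 * 4 tetrapeptides
    print(library_size(1, 1))          # all lengths up to m = n = 1
```

Enumerating explicitly is only practical for short peptides; for the longest members it is more sensible to count (as in the diversity calculation) than to list.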
  • the library may be prepared by chemical means, e.g., synthesized on beads or at particular grid positions of a support. This has the advantage that members with high nuclease activity are not lost by inhibiting growth of or even killing a host cell.
  • the library is one of soluble peptides rather than one of peptides immobilized on a support (which could be a nonliving support such as a pin or bead, or a biological support such as a cell or phage), as the support or the linker to the support could affect the activity.
  • some form of deconvolution method, such as that described by Blake, USP 5,565,325, may be used to identify the active peptides.
  • if the library is one in which the peptides are immobilized, it is desirable to use one of the flexible linkers commonly used in the combinatorial peptide art.
  • a small peptide such as seryl-histidine is easily amenable to modification via two synthetic processes.
  • the modification necessary can be imparted through either 1) Chemical modification of each amino acid residue followed by amide bond formation to produce the dipeptide, or 2) Modification of the dipeptide as an intact molecular unit.
  • Modification of the serine hydroxyl group can be accomplished in a number of ways, with a large number of attendant groups possible. The easiest modification conducted will be acylation of the hydroxyl functionality.
  • the conditions needed are an acylating agent such as acetic anhydride and acetyl chloride or other derivatives.
  • a base is required, and possibly a catalyst such as dimethylaminopyridine.
  • Ethers will also be useful derivatives to prepare from the serine hydroxyl group.
  • Methyl ethers can be prepared from serine, methyl iodide and a base, methyl Meerwein reagent, methyl sulfate, or other similar alkylating agents.
  • Benzyl and substituted benzyl ethers could be placed to control the electronics of the ether group.
  • Silyl ethers are important modifications due to their perceived lack of hydrogen bonding ability. These are easily prepared from a substituted silyl chloride/triflate, base and catalyst. Subsequently steric bulk and/or electronic differences can be applied to the hydroxyl moiety with the goal of increasing or decreasing affinity.
  • the imidazole functionality of the histidine residue can be easily functionalized in three different areas: 1) the N-H group, 2) the other ring N group, or 3) C-2 of the imidazole ring.
  • sulfonyl derivatives of the N-H group (Me2NSO2Cl, Et3N) can be easily prepared.
  • Alkyl derivatives, carbamates, and phenacyl groups have all been placed on the N-H group.
  • Substitution at C-2 of the imidazole ring has been accomplished, placing F, CF3, and substituted alkyl derivatives.
  • Conditions include an alkyl or fluoro acid, AgNO3, sulfuric acid, and ammonium persulfate to conduct a radical oxidative decarboxylation.
  • structural descriptors calculating an overall structural similarity between the compounds.
  • the structural descriptors which may be used include, but are not limited to, those listed in Patterson, et al. (1996), Klebe and Abraham (1993), Cummins, et al. (1996), and Matter (1997). Conventional mathematical methods may be used to select or weight the descriptors.
  • although Ser-His has been shown capable of nucleic acid cleavage, this nucleolytic activity is extremely low compared to restriction enzymes. This is not surprising since restriction enzymes typically consist of hundreds of amino acid residues folded in highly complicated three-dimensional configurations to produce active sites capable of catalyzing reactions with extreme efficiency and specificity, whereas Ser-His is a simple dipeptide lacking in the structural complexity and chemical diversity of natural polypeptide enzymes. Modern DNases, RNases, and restriction endonucleases use structural motifs such as zinc fingers to recognize and bind specific regions on the substrate, contributing a thermodynamic advantage by properly positioning and confining the pertinent active site functional groups relative to the substrate and providing a kinetic advantage by promoting rapid association of enzyme and substrate in solution.
  • The relatively low nucleolytic activity of the free Ser-His dipeptide is likely due to its low affinity for its target (presumably the phosphodiester bonds) in the nucleic acid substrates. Ser-His appears to cleave DNA through a hydrolysis mechanism which yields new 3' and 5' termini.
  • The two proposed mechanisms for the interaction of the dipeptide with its substrate (Fig. 3A & 3B) illustrate that certain conditions must be met for an encounter between Ser-His and DNA to result in cleavage of the substrate. Cleavage is believed to be the consequence of an SN2 nucleophilic attack by the hydroxyl of serine on a phosphorus in a phosphodiester bond, forming a pentacoordinate phosphorane transition state stabilized by a source of positive charge (either the imidazole of His or the N-terminal amino group) in the dipeptide.
  • the imidazole group of histidine may serve as a general base to increase the nucleophilicity of serine's hydroxyl. It may also serve as a general acid catalyst to assist the leaving
  • In order to form this transition state, the dipeptide must first approach a phosphodiester bond and
  • if this dipeptide could be modified so that its specificity is increased, and if it could be confined in the region of a phosphate, the resulting site-specific nucleolytic molecule would have greatly enhanced nucleolytic activity.
  • the invention relates to a conjugate of Ser-His with a binding moiety (homing sequence) which provides the desired specificity.
  • This binding moiety is linked, directly or indirectly, to the C-terminal of Ser-His (or related moieties).
  • the binding agent may be a nucleic acid binding moiety, so that the conjugate is a site-specific nuclease, or a peptide binding moiety, so that the conjugate is a site-specific protease.
  • the binding moiety is preferably a PNA, as defined below.
  • synthetic site-specific nucleolytic molecules can be created that can assume the appropriate conformation for nucleic acid cleavage and can properly position the active site relative to the substrate to result in reliable cleavage at a predetermined location.
  • PNA will be used as an alternative to an oligonucleotide homing sequence, because its nucleosides are linked through peptide bonds instead of phosphodiester bonds.
  • PNA has been shown to form a double helix with single-stranded nucleic acid, and triple helices with double-stranded DNA.
  • the helical structure formed between PNA and oligonucleotides forms through Watson-Crick base-pairing, and the two strands are anti-parallel.
  • the double helix formed between PNA and single-stranded DNA has tighter binding than a double-stranded DNA double helix.
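Because binding is by Watson-Crick pairing in an anti-parallel orientation, the base sequence of a homing sequence can be derived from the target as its reverse complement. The sketch below illustrates only that string operation: the function names and example sequences are hypothetical, sequences are treated as plain DNA-alphabet strings, and nothing here models PNA chemistry, linker design or binding strength.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antiparallel_complement(sequence: str) -> str:
    """Reverse complement: the base sequence that pairs with `sequence`
    when the two strands run in opposite directions."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def find_site(substrate: str, homing: str) -> int:
    """0-based index in `substrate` where `homing` would pair anti-parallel,
    or -1 if the substrate has no perfectly complementary region."""
    return substrate.upper().find(antiparallel_complement(homing))


if __name__ == "__main__":
    target_region = "GATTACA"                  # hypothetical target, not from the patent
    homing = antiparallel_complement(target_region)
    substrate = "AA" + target_region + "CC"    # hypothetical substrate
    print(homing, find_site(substrate, homing))
    # A palindromic site such as the EcoRV sequence is its own reverse complement.
    print(antiparallel_complement("GATATC"))
```

Changing the target of such a conjugate then amounts to recomputing the homing sequence for a different region of the substrate, which reflects the modularity claimed for the Ser-His-linker design.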
  • PNA has been tried as a rare genome cutter. All of this suggests that PNA will be useful as a homing sequence for Ser-His to greatly accelerate the rate of bringing the Ser-His to the nucleic acid substrate.
  • Molecular modeling may be used in the design of the linker used to connect Ser-His with the PNA homing sequence.
  • Computer modeling has been used successfully in many biomolecular interactions, including protein/nucleic acid interactions, ligand/receptor binding, as well as in drug design.
  • These artificial site-specific nucleolytic molecules will have many useful properties. They should have much greater activity than the Ser-His dipeptide, since the homing sequence would greatly enhance the affinity between the molecule and the substrate and would confine the active site in the proper position relative to the target phosphate. Pairs of these molecules could be used to cut double-stranded DNA in the manner of restriction enzymes.
  • these site-specific nucleolytic molecules will not be limited to particular recognition sequences; they will be site-specific yet be able to be customized to target any specific sequence. Additionally, these molecules will be modular in nature; once the parameters required for the linker are calculated using molecular modeling it will be possible to vary the target sequence of the molecule by converting the homing sequence linked to the Ser-His-linker. These will be particularly useful in human genome and related research areas, which will benefit from the ability to target with precision any sequence without regard to available restriction enzyme recognition sequences. There are also innumerable potential biomedical applications for these molecules, which could be used to target DNA, RNA, and possibly even proteins.
  • ribozyme/antisense technologies can target any sequence, but without cleavage.
  • ribozymes can cleave RNA substrates, but are very limited in the sequences they can target.
  • the Ser-His-PNA molecules we hope to develop will combine both of these properties and, like ribozyme/antisense technologies, can be used in vivo.
  • the specificity of the enzyme need not be absolute, that is, it need not bind and cleave only the target molecule and no other molecule, provided that its preference for the target molecule as a substrate is sufficiently strong to render the enzyme useful.
  • Specificity may be measured by comparing activity against a target molecule with the predetermined target sequence with activity against a control molecule lacking that sequence.
  • the ratio is at least 10:1, more preferably at least 100:1, still more preferably at least 1000:1, under the conditions of interest.
  • the preferred control molecule has a control sequence obtained by random scrambling of the target sequence.
  • One may use a plurality of control molecules and determine whether the relative activity against the target versus the controls is such that there is a statistically significant difference, given the mean and s.d. of activity against the controls.
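The specificity measure described above can be sketched in a few lines. This is only an illustration: the function names, the tier labels, and the use of a simple z-score against the controls are assumptions made here, since the text does not name a particular statistical test.

```python
from statistics import mean, stdev


def specificity_ratio(target_activity, control_activity):
    """Activity against the target divided by activity against one control."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return target_activity / control_activity


def preference_tier(ratio):
    """Map a ratio onto the 10:1, 100:1 and 1000:1 thresholds quoted in the text."""
    if ratio >= 1000:
        return "still more preferred"
    if ratio >= 100:
        return "more preferred"
    if ratio >= 10:
        return "preferred"
    return "below preferred range"


def z_score_vs_controls(target_activity, control_activities):
    """Distance of the target activity from the control mean, in control s.d. units.

    Assumption: a z-score stands in for the unspecified significance test.
    Needs at least two control measurements with non-zero spread.
    """
    return (target_activity - mean(control_activities)) / stdev(control_activities)
```

For example, `preference_tier(specificity_ratio(500, 5))` classifies a 100:1 preference, and `z_score_vs_controls` can be applied to a panel of scrambled-sequence controls.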
  • the target sequence is an amino acid sequence of a target protein or a nucleotide sequence of a target nucleic acid which is specifically recognized by the BM.
  • the specificity of a nucleic acid is an exponential function of its length, as the probability that a random nucleic acid will be perfectly complementary is 1/4^n, where n is the length of the sequence, assuming that all four bases are equiprobable.
  • It may be prudent to search a nucleic acid database for potential inadvertent targets before selecting a particular sequence as the target sequence.
  • the probability that a random amino acid sequence will be identical to a given target sequence is 1/20^n, where n is the length of the sequence, assuming that all 20 amino acids are equiprobable.
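The two probabilities above are easy to tabulate. The sketch below computes them exactly and, as an added assumption not stated in the text, estimates how many chance matches to expect when a sequence of length n is compared against every window of a search space of a given size (one strand, equiprobable residues).

```python
from fractions import Fraction


def random_match_probability(length, alphabet_size):
    """Probability that a random sequence matches a given one: 1 / alphabet_size**length."""
    return Fraction(1, alphabet_size ** length)


def expected_chance_sites(length, search_space, alphabet_size=4):
    """Expected number of chance matches among all windows of a search space.

    Assumption: windows are treated as independent and residues as equiprobable.
    """
    windows = max(search_space - length + 1, 0)
    return float(windows * random_match_probability(length, alphabet_size))
```

A 15-base target has a chance-match probability of 1/4^15 (about one in 1.07 billion), which illustrates why longer homing sequences are more specific.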
  • the target sequence is any enzymatically exposed sequence of the target molecule.
  • the cleavage site is the location on the target molecule where cleavage occurs. Generally speaking, it will not be the target sequence, since that sequence will be occluded by the BM:TM complex. However, it will be near the target sequence, within the radius of action permitted by the length of the LM.
  • the cleavage site may be any point on the target molecule which is in range of the tethered EM while the BM is bound to the target sequence. Depending on the length and flexibility of the LM, certain points may be more often cleaved than others. If free EM has a preference for certain cleavage sites, it may be desirable to take these preferences into account when selecting a target sequence. That is, in choosing a target sequence, one may consider not only the affinity and specificity of the available BM for each potential target, but also the proximity of highly cleavable potential cleavage sites.
  • a site-specific enzyme conjugate may be formed by conjugating an Enzymatic Moiety (EM) to a Binding Moiety (BM), both as hereafter defined.
  • the EM may be linked directly or indirectly, and covalently or noncovalently, to the BM.
  • the indirect linkage may be by a linking moiety (LM) as hereafter defined.
  • the conjugate may be prepared either as a single unit, or by individually preparing the EM and BM and then conjugating the two. The present invention is not limited to any particular method of conjugation.
  • the binding moiety confers specificity on the conjugate, and increases the effective concentration of the bound target in the vicinity of the EM. While the intrinsic specificity of the EM is unchanged, since the effective concentration of the bound target is much higher than the effective concentration of any unbound molecules, the target is selectively cleaved.
  • the molecule containing the target sequence may not be cleaved exactly at the target sequence, but rather nearby, with the exact cleavage site being at a distance from the target sequence which is limited by the length of the linker moiety.
  • the preferred enzymatic moiety is the dipeptide Ser-His. However, it may be any of the related nonspecific enzymes set forth above.
  • the binding moiety (BM) is the component which provides the desired specificity. If the intended use of the conjugate is as a nuclease, the binding moiety will be specific for a particular nucleic acid sequence. If the intended use of the conjugate is as a protease, the binding moiety will be specific for a particular amino acid sequence. When the conjugate is a nuclease, the binding moiety is preferably a peptide or a nucleic acid.
  • a peptide BM has the advantage that, if the linking moiety is also a peptide, the entire conjugate may be synthesized as a fusion protein.
  • a peptidic NA-binding BM may be obtained by preparing a phage library displaying random oligopeptides, and screening for phage displaying oligopeptides that bind the nucleic acid target and which do not bind control nucleic acids.
  • the NA-binding BM may also be a nucleic acid (or nucleic acid analogue), in which case it is a nucleic acid
  • PNA peptide nucleic acid
  • PNAs are DNA analogues whose "building blocks" are normal DNA bases but whose backbone is made with peptide-like bonds instead of sugar-phosphate bonds.
  • the achiral backbone is made from N-(2-aminoethyl)-glycine units linked by amide bonds, and is uncharged.
  • PNAs can form Watson-Crick pairs with normal nucleotides.
  • PNA oligomers/polymers have higher thermal stability, stronger binding (relatively independent of salt concentration), more specific binding (one mismatch in a 15-mer PNA lowers the Tm by 8-20°C (15°C avg.); in a 15-mer DNA, by 4-16°C (11°C avg.)), and greater resistance to nucleases than DNAs. They are also protease-resistant.
  • PNA oligomers/polymers are preferably synthesized using a modified peptide synthesis protocol. Both Fmoc and tBoc methods are often used.
  • PNAs are described in the following references: Nielsen, P.E. et al., 1991, Science, 254:1497-1500.
  • the preferred PNAs are 10-15 bases in length
  • the linking moiety (LM) or tether may be any chemical structure which (1) sufficiently distances the EM from the BM so that neither substantially interferes with the other, and (2) brings the EM into sufficient proximity with the molecule bound by the BM so that the desired level of specificity of enzymatic activity is obtained.
  • the linkage of the EM to the BM increases the effective concentration of the EM in the vicinity of the recognition site of the BM.
  • If the linking moiety is perfectly flexible, but has a fixed length L in angstroms, then the EM must lie in a spherical volume centered on the location of the bound BM. That volume V equals L^3.
  • the effective concentration of the EM in the vicinity of the binding site is then 1/(V × 6.023 × 10^-4) molar.
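The effective-concentration estimate above can be worked through numerically. The factor 6.023 × 10^-4 is Avogadro's number multiplied by the 10^-27 litres in one cubic angstrom, so one molecule confined to V cubic angstroms corresponds to 1/(V × 6.023 × 10^-4) mol/L. The sketch below uses the text's V = L^3 by default; the `spherical` flag, which substitutes the geometric volume of a sphere of radius L, (4/3)πL^3, is an option added here for comparison and is not taken from the text.

```python
import math

# Avogadro's number (6.023e23 per mole, as quoted) times 1e-27 litres per cubic angstrom.
LITRE_MOLE_FACTOR = 6.023e-4


def effective_concentration_molar(linker_length_angstrom, spherical=False):
    """Effective molar concentration of one tethered EM within reach of the binding site.

    spherical=False uses V = L**3 as written in the text.
    spherical=True uses (4/3)*pi*L**3 (an assumption added for comparison).
    """
    length = linker_length_angstrom
    if length <= 0:
        raise ValueError("linker length must be positive")
    volume = (4.0 / 3.0) * math.pi * length ** 3 if spherical else length ** 3
    return 1.0 / (volume * LITRE_MOLE_FACTOR)
```

With a 10 angstrom linker and V = L^3, the estimate is roughly 1.66 M, far above the millimolar concentrations used for free Ser-His, which is the point of tethering.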
  • One class of linkers are oligopeptide linkers.
  • Such linkers may be based on interdomain linkers occurring naturally in multidomain proteins (especially enzymes with separate binding and catalytic domains), or on loops (reverse turns) naturally linking alpha helices or beta strands in proteins. Or they may be non-naturally occurring linkers.
  • the length of an oligopeptide linker may be predicted approximately on the basis of the number of residues in the linker, and the expected conformation of the linker. Typical translation per residue, in angstroms, is
  • the value of 3.8 angstroms/residue corresponds to a fully extended polypeptide.
  • the rms end-to-end distance is about (130n)^0.5. This reflects steric restrictions on the flexion of the chain. Both Gly and Pro reduce the rms distance, Gly because it directly introduces flexibility, and Pro because the chain tends to change directions. Examples of naturally occurring linkers include the sequences set forth in Argos, J. Mol. Biol., 211: 943-958
  • interdomain linkers were at least five amino acids in length.
  • the mean extension (from C-alpha to C-alpha) was 2.73 angstroms/AA.
  • the interdomain linkers were about average in flexibility, as reflected by the B (temperature) value of the linker compared to the appropriate mean and standard deviation for B for the appropriate length sequences of the protein as a whole.
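The per-residue figures quoted above give three rough length estimates for an n-residue oligopeptide linker. The sketch below simply evaluates them side by side; the function and key names are illustrative. Note that the rms expression (130n)^0.5 exceeds the fully extended length 3.8n when n is below about 9, so it is only meaningful for longer chains.

```python
import math

EXTENDED_RISE = 3.8      # angstroms per residue, fully extended polypeptide (from the text)
INTERDOMAIN_RISE = 2.73  # angstroms per residue, mean C-alpha to C-alpha extension (from the text)


def linker_length_estimates(n_residues):
    """Return rough end-to-end length estimates, in angstroms, for an n-residue linker."""
    if n_residues < 1:
        raise ValueError("need at least one residue")
    return {
        "fully_extended": EXTENDED_RISE * n_residues,
        "interdomain_mean": INTERDOMAIN_RISE * n_residues,
        "rms_end_to_end": math.sqrt(130.0 * n_residues),
    }
```

For a 20-residue linker this gives about 76, 55, and 51 angstroms respectively.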
  • the Hoffman patent recommends, inter alia, polyGly (e.g., Gly7 for 25 angstroms), polyGlu, polyAsp, Artemia (G(n)-LRRQIDLEVTGL-G(n)), Gly(1-3)-Ala12-Gly(1-3) [20-
  • Gly(1-3)-Asp(n)-Gly(1-3) (26-49 angstroms).
  • In phage display libraries, it is common to use one of the following sequences to link the displayed peptide or protein to the phage coat protein: GGGS, EGGGS, GGGGG, GGGGSSS, (GGGS)n, or other sequences rich in Gly, Ser,
  • linkers will be rich in glycine, which, by virtue of its lack of a side chain, typically confers flexibility on a peptide chain which incorporates it.
  • linker may be randomized as to length, composition, and/or specific sequence.
  • Linker amino acids are Gly, Ser, Pro, Asp, Asn, or Thr, chosen randomly and independently at each amino acid position, and there are n amino acids in the linker.
  • Library #5 but also allow Glu and Arg, which, although fairly large, are still hydrophilic.
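A randomized linker library of the kind described above can be sketched as follows. The six-residue alphabet comes from the text; treating it as the base to which Glu and Arg are added is an assumption, because the text does not say which alphabet "Library #5" uses. The function names are illustrative.

```python
import random

BASE_ALPHABET = ("Gly", "Ser", "Pro", "Asp", "Asn", "Thr")
# Assumption: the variant that also allows Glu and Arg extends the alphabet above.
EXTENDED_ALPHABET = BASE_ALPHABET + ("Glu", "Arg")


def library_size(n_residues, alphabet=BASE_ALPHABET):
    """Number of distinct linkers when each position is chosen independently."""
    return len(alphabet) ** n_residues


def sample_linker(n_residues, alphabet=BASE_ALPHABET, rng=None):
    """Draw one linker, choosing each residue randomly and independently."""
    rng = rng or random.Random()
    return "-".join(rng.choice(alphabet) for _ in range(n_residues))
```

Passing a seeded `random.Random` instance makes a sampled library reproducible. A 5-residue linker over the base alphabet has 6^5 = 7776 possible sequences.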
  • the types and the numbers of amino acids used in a linker will be determined based initially on the results of computer-aided simulation. For computer modeling techniques, see, e.g., section 4.5 below.
  • linkers of various amino acid composition and varied linker lengths will be made and tested to evaluate their relative effectiveness and flexibility as linkers, based on the kinetics and specificity of substrate cleavage.
  • linkers and PNAs are shown in Table 3.
  • a second class of linkers are nucleic acid linkers.
  • linkers composed of DNA, RNA, or analogues thereof. Such linkers are discussed in detail in Hanson, USP 5,844,107.
  • linkers include linkers formed by chemically reacting a bifunctional crosslinking agent with the EM and the BM. This agent has reactive end functions L1 and L2, which may be the same or different. It may be conjugated simultaneously to both the EM and the BM, or first to one and then to the other.
  • one end function may be reactive with the carboxy group at the C-terminal of the EM.
  • the C-terminal of the EM may be derivatized so that a different functionality, e.g., an amide or thiol, is presented, and the end function L1 being one reactive with the new functionality.
  • the other end function L2 must be reactive with an original or provided functionality of the BM.
  • Typical end functions are those reactive with carboxy, amino, and thiol groups.
  • the crosslinking agent is a chemical with at least two reactive functions which is reacted with the EM and BM to form the conjugate.
  • the linker is the portion of the original agent which is inherited by the conjugate.
  • If the agent were L1-(CH2)n-L2, the conjugate might be BM-(CH2)n-EM, and the linker is thus the -(CH2)n-.
  • the site of conjugation of the crosslinking agent to the BM need not be a single site. However, at least one, and preferably substantially all, of the conjugation sites must be such that the EM may be conjugated to the BM without substantially impairing the binding function of the BM or the desired enzymatic function of the EM. If necessary, a sensitive site may be protected with a suitable protecting group during crosslinking, the protective group being selectively removed afterward.
  • a single EM is conjugated to each BM.
  • multiple EMs are conjugated to each BM, by one or more linkers.
  • a single linking agent may have one or more L1s for conjugating EMs and one or more L2s for conjugating BMs.
  • a commercially available O-linker will be used to connect Ser-His and the PNAs (Table 3).
  • the length of the O-linker, when fully extended, is about 9.86 Å. However, the O-linker may assume different conformations when connected with the dipeptide and PNA or when it interacts with oligonucleotide substrate.
  • Linkers with one or two units of the O-linker will be made commercially and tested in the site-specific cleavage study to evaluate their relative effectiveness as linkers.
  • a Silicon Graphics Intergraph (SGI) computer and the molecular simulation software "Macromodel” may be used for computer simulation of molecular interactions between Ser-His, or Ser-His-linker-PNA and nucleic acids.
  • the information gained from computer simulation on oligopeptide/nucleic acid interaction will be used to assist in prediction of results of cleavage experiments, formulation of cleavage mechanisms, and design of chemical linkers for site-specific cleavage.
  • Important specific information that can be obtained from computer simulations includes spatial orientations of the oligopeptides relative to different DNA/RNA substrates, important distances between the functional groups of the oligopeptides and phosphodiester bonds of the substrates, and the energy levels of different conformations of the oligopeptides.
  • Another important objective of computer modeling is to determine the positions of the Ser-His moiety of the Ser-His-linker-PNA molecules relative to the phosphodiester bond on the DNA substrate. Such computer simulations will provide very valuable information for the design of linkers with the best length and the least steric hindrance, enabling efficient site-specific cleavage of the nucleic acid by Ser-His-linker-PNA molecules.
  • the library should be formulated such that (1) the members can bind to and cleave the target molecule, (2) members which so bind and act can be differentiated from those which do not, and (3) the successful members can be fully characterized.
  • the library may be synthesized so (1) the members are displayed on the surface of a living support (a cell or virus), (2) the members are displayed on the surface of a nonliving support (pin, bead, sheet, etc.), or (3) the members are provided in soluble form. It is necessary to be able to distinguish the successful members from the unsuccessful members; this may be done by physical separation, or by recognizing a change in a signal as a result of the binding. It is also necessary to characterize the successful binding member. If the member is a peptide or nucleic acid, it may be sequenced directly. The sequence of a peptide may be inferred if its coding sequence is sequenced.
  • the members may be displayed in a distinctive position on a support, or tagged with a distinctive tag, whereby their structures may be inferred.
  • a successful sequence may be inferred by comparison of related mixtures, as in Blake, infra.
  • phage display see Smith, Science, 228:1315-17 (1985), Harrison, Meth. Enzymol. 267:171-191 (1996), Ladner, USP 5,223,409.
  • the phage genome is engineered so that a random or semirandom peptide or protein is fused to a phage coat protein, so that the foreign peptide or protein is displayed on the surface of phage.
  • RNA peptide fusion see Roberts & Szostak, PNAS, 94:12297-12302 (1997).
  • bound and unbound members are separated by immobilizing the target and washing off library members which are not bound to target. If the action of the enzymatic moiety on the target were such as to result in a loss of binding, this could be problematic.
  • a nonbinding surrogate for the binding moiety e.g., Thr-His for Ser-His.
  • Another solution is to use an intracellular assay, especially one in which the cell dies if the target molecule is not cleaved. Alternatively, one could select for noncleavage, and identify the successful members by a technique akin to replica plating.
  • Ser-His worked over wide ranges of pH, temperature, and concentration. It also worked in various buffering systems. Preferred range for pH: 5.5-7.5; pH 6-6.5 more preferred. Preferred range for temperature: from 20°C up to 80°C; increased cleavage rate with increased temperature; temperature of 37-60°C more preferred. Preferred range for concentration: 1 mM to 20 mM; 5-10 mM more preferred, especially for nicking DNA.
  • Compatible buffer systems include but are not limited to Britton-Robinson (B-R) (contains borate, phosphate and acetate), phosphate buffer (PBS), citrate, and acetate buffer. Ser-His cleavage activity is inhibited with Tris buffer. Preferred range for incubation time: 1 hour to 48 hours, but is dependent on concentration and temperature; shorter durations may be effective with higher temperatures or concentrations; longer durations will result in more cleavage.
  • a preferred protocol for nick translation of DNA with Ser-His and similar compounds is the following:
  • Steps 5-9 are the same as the procedure used in the standard Nick Translation Kit (Roche), with the exception that DNA Polymerase I is used instead of the Enzyme Mixture provided in the kit, which also contains DNase I.
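The preferred ranges listed above lend themselves to a simple planning check. The sketch below encodes only the numbers stated in the text; the function name, the dictionary keys, and the idea of returning a list of warnings are illustrative assumptions, not part of any described protocol.

```python
# Preferred ranges as stated in the text (inclusive bounds).
PREFERRED_RANGES = {
    "ph": (5.5, 7.5),
    "temperature_c": (20.0, 80.0),
    "concentration_mm": (1.0, 20.0),
    "incubation_h": (1.0, 48.0),
}
# Tris is the only buffer the text reports as inhibitory.
INHIBITORY_BUFFERS = {"tris"}


def check_conditions(ph, temperature_c, concentration_mm, incubation_h, buffer_name):
    """Return a list of warnings for conditions outside the preferred ranges."""
    values = {
        "ph": ph,
        "temperature_c": temperature_c,
        "concentration_mm": concentration_mm,
        "incubation_h": incubation_h,
    }
    warnings = []
    for name, (low, high) in PREFERRED_RANGES.items():
        if not low <= values[name] <= high:
            warnings.append(f"{name}={values[name]} is outside the preferred range {low}-{high}")
    if buffer_name.strip().lower() in INHIBITORY_BUFFERS:
        warnings.append(f"{buffer_name} buffer is reported to inhibit Ser-His cleavage")
    return warnings
```

An empty list means every parameter falls inside the stated preferred range; it does not by itself predict the extent of cleavage.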
  • any other protocol which utilizes Ser-His, or a related molecule, and which achieves acceptable results may be used in place of that set forth above.
  • the Ser-His may be replaced with one of the other enzymes of the present invention, e.g., Cys-His or Ser-His-Asp, and the reaction conditions (e.g., concentrations, temperature, pH, incubation time) may be varied, and the purification step may be replaced with a different purification procedure or possibly omitted altogether.
  • In Steps 5-8, a different polymerase may be used, the reaction conditions may be varied, the label may be different, and the purification step may be altered or possibly omitted altogether.
  • Step 9 is a QC step and its nature will depend on the choice of label.
  • the nicking step can be shortened by a further increase of the incubation temperature. An incubation time of 60 min or shorter can be achieved by increasing the incubation temperature from 50°C to 60°C or higher.
  • the nick translation procedure has one more step than the conventional nick translation procedure in that Ser-His has to be removed by centrifugation in a spin column after the nicking reaction and before the polymerization step.
  • This additional step should not be a problem, since conventional nick translation, like Ser-His nick translation, also requires a spin column step at the end of the procedure to separate labeled probe from unincorporated precursors (radioactive or non-radioactive substrates of polymerization).
  • the requirement for such a spin column and a centrifuge for the separation means that the Ser-His nick translation procedure requires exactly the same spin column (except that two columns are needed instead of the one used in conventional nick translation) and centrifuge as the conventional procedure, without any additional equipment.
  • the nuclease of the present invention is used in conjunction with labeled nucleotides to label a nucleic acid, such as a probe.
  • The labeled nucleotides may be the normal nucleotides G, A, T (U for RNA) and C, or may be unusual nucleotides such as inosine.
  • the label may be radioactive or nonradioactive.
  • Suitable radioactive labels include 32P, 33P, and 35S; suitable nonradioactive labels include biotin (avidin), fluorophores, chromophores, and other molecules capable of generating a suitable signal and compatible with the target molecule and target nucleotides. Digoxigenin is especially preferred.
  • combinatorial libraries of molecules other than peptides may be used, mutatis mutandis.
  • the principal difference between these libraries and peptide libraries is that the libraries cannot be obtained by expression of partially degenerate DNAs in cells .
  • Non-peptide libraries include nucleic acid libraries, as well as Examples of candidate simple libraries which might be evaluated include derivatives of the following: Cyclic Compounds Containing One Hetero Atom. Heteronitrogen: pyrroles, pentasubstituted pyrroles, pyrrolidines, pyrrolines, prolines, indoles, beta-carbolines, pyridines, dihydropyridines
  • Amino acids are the basic building blocks with which peptides and proteins are constructed. Amino acids possess both an amino group (-NH2) and a carboxylic acid group (-COOH). Many amino acids, but not all, have the structure NH2-CHR-COOH, where R is hydrogen, or any of a variety of functional groups.
  • Twenty amino acids are genetically encoded: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine. Of these, all save Glycine are optically isomeric; however, only the L-form is found in humans. Nevertheless, the D-forms of these amino acids do have biological significance; D-Phe, for example, is a known analgesic.
  • amino acids are also known, including: 2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid; 2-Aminoisobutyric acid; 3-Aminoisobutyric acid; 2-Aminopimelic acid;
  • Peptides are constructed by condensation of amino acids and/or smaller peptides.
  • the amino group of one amino acid (or peptide) reacts with the carboxylic acid group of a second amino acid (or peptide) to form a peptide
  • a peptide is composed of a plurality of amino acid residues joined together by peptidyl (-NHCO-) bonds.
  • a biogenic peptide is a peptide in which the residues are all genetically encoded amino acid residues; it is not necessary that the biogenic peptide actually be produced by gene expression.
  • the peptides of the present invention include peptides whose sequences are disclosed in this specification, or sequences differing from the above solely by no more than one nonconservative substitution and/or one or more conservative substitutions, preferably no more than a single conservative substitution.
  • the substitutions may be of non-genetically encoded (exotic) amino acids, in which case the resulting peptide is nonbiogenic.
  • a conservative substitution is a substitution of one amino acid for another of the same exchange group, the exchange groups being defined as follows
  • a highly conservative substitution, which is preferred, is Arg/Lys/His, Asp/Glu, Asn/Gln, Leu/Ile/Met/Val, Phe/Trp/Tyr, or Gly/Ser/Ala. Additional peptides within the present invention may be identified by systematic mutagenesis of the lead peptides, e.g.
  • each amino acid position may be either the original amino acid or alanine (alanine being a semi-conservative substitution for all other amino acids), and/or
  • mutants are tested for activity, and, if active, are considered to be within "peptides of the present invention". Even inactive mutants contribute to our knowledge of structure-activity relationships and thus assist in the design of peptides, peptoids, and peptidomimetics.
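The highly conservative exchange groups and the alanine-substitution idea above can be expressed compactly. The group memberships below are those listed in the text; the function names and the representation of a peptide as a list of three-letter codes are illustrative choices.

```python
# Highly conservative exchange groups as listed in the text.
HIGHLY_CONSERVATIVE_GROUPS = (
    {"Arg", "Lys", "His"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Leu", "Ile", "Met", "Val"},
    {"Phe", "Trp", "Tyr"},
    {"Gly", "Ser", "Ala"},
)


def is_highly_conservative(original, replacement):
    """True if the two different residues share one of the groups above."""
    if original == replacement:
        return False
    return any(original in g and replacement in g for g in HIGHLY_CONSERVATIVE_GROUPS)


def alanine_scan(peptide):
    """Return single-position alanine mutants of a peptide given as a list of residues."""
    mutants = []
    for i, residue in enumerate(peptide):
        if residue != "Ala":
            mutants.append(peptide[:i] + ["Ala"] + peptide[i + 1:])
    return mutants
```

For instance, `alanine_scan(["Ser", "His", "Asp"])` yields the three single-alanine variants of Ser-His-Asp, each of which would then be tested for activity as described above.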
  • substitutions of exotic amino acids for the original amino acids take the form of
  • the exotic amino acids may be alpha or non-alpha amino acids (e.g., beta alanine). They may be alpha amino acids with 2 R groups on the Cα, which groups may be the same or different. They may be dehydro amino acids (HOOC-
  • Cyclization is a common mechanism for stabilization of peptide conformation, thereby achieving improved association of the peptide with its ligand and hence improved biological activity. Cyclization is usually achieved by intra-chain cystine formation, or by formation of a peptide bond between side chains or between the N- and C-terminals. Cyclization was usually achieved with peptides in solution, but several publications have appeared recently that describe cyclization of peptides on beads.
  • a peptoid is an analogue of a peptide in which one or more of the peptide bonds are replaced by pseudopeptide bonds, which may be the same or different.
  • pseudopeptide bonds may be: Carba Ψ(CH2-CH2)
  • a peptidomimetic is a molecule which mimics the biological activity of a peptide, by substantially duplicating the pharmacologically relevant portion of the conformation of the peptide, but is not a peptide or peptoid as defined above.
  • the peptidomimetic has a molecular weight of less than 700 daltons.
  • Designing a peptidomimetic usually proceeds by: (a) identifying the pharmacophoric groups responsible for the activity; (b) determining the spatial arrangements of the pharmacophoric groups in the active conformation of the peptide; and (c) selecting a pharmaceutically acceptable template upon which to mount the pharmacophoric groups in a manner which allows them to retain their spatial arrangement in the active conformation of the peptide.
  • Step (a) may be carried out by preparing mutants of the active peptide and determining the effect of the mutation on activity. One may also examine the 3D structure of a complex of the peptide and the receptor for evidence of interactions (e.g., the fit of a side chain of the peptide into a cleft of the receptor; potential sites for hydrogen bonding, etc.).
  • Step (b) generally involves determining the 3D structure of the active peptide, in the complex, by NMR spectroscopy or X-ray diffraction studies.
  • the initial 3D model may be refined by an energy minimization and molecular dynamics simulation.
  • Step (c) may be carried out by reference to a template database, see Wilson, et al., Tetrahedron, 49:3655-63 (1993).
  • the templates will typically allow the mounting of 2-8 pharmacophores, and have a relatively rigid structure. For the latter reason, aromatic structures, such as benzene, biphenyl, phenanthrene and benzodiazepine, are preferred.
  • orthogonal protection techniques see Tuchscherer, et al., Tetrahedron, 17:3559-75 (1993).
  • Analogues of the disclosed peptides, and other compounds with activity of interest may be identified by assigning a hashed bitmap structural fingerprint to the compound, based on its chemical structure, and determining the similarity of that fingerprint to that of each compound in a broad chemical database.
  • the fingerprints are determined by the fingerprinting software commercially distributed for that purpose by Daylight Chemical Information Systems, Inc., according to the software release current as of January 8, 1999. In essence, this algorithm generates a bit pattern for each atom, and for its nearest neighbors, with paths up to 7 bonds long. Each pattern serves as a seed to a pseudorandom number generator, the output of which is a set of bits which is logically ORed to the developing fingerprint.
  • the fingerprint may be fixed or variable size.
  • the database may be SPRESI'95 (InfoChem GmbH), Index
  • a compound is an analogue of a reference compound if it has a Daylight fingerprint with a similarity (Tanimoto coefficient) of at least 0.85 to the Daylight fingerprint of the reference compound.
  • the compounds of the present invention have a similarity of at least 0.85, more preferably at least 0.9, still more preferably at least 0.95, to Ser-His, or to any oligopeptide scoring 2+ or better in Table 2.
  • a compound is also an analogue of a reference compound if it may be conceptually derived from the reference compound by isosteric replacements.
  • Classical isosteres are those which meet Erlenmeyer's definition: "atoms, ions or molecules in which the peripheral layers of electrons can be considered to be identical".
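The seed-and-OR scheme described above can be illustrated with a toy example. This is not the Daylight software or its actual algorithm: the path patterns are supplied by the caller as plain strings (a real system would enumerate them from the molecular graph), and the choice of CRC32 as the seed function, a 1024-bit fingerprint, and four bits per pattern are all assumptions made for illustration.

```python
import random
import zlib


def hashed_fingerprint(patterns, n_bits=1024, bits_per_pattern=4):
    """Build a fixed-size fingerprint, stored as a Python int, from pattern strings.

    Each pattern seeds its own pseudorandom generator; the bit positions it
    produces are ORed into the developing fingerprint.
    """
    fingerprint = 0
    for pattern in patterns:
        # CRC32 gives a seed that is stable across runs (unlike the built-in hash()).
        rng = random.Random(zlib.crc32(pattern.encode("utf-8")))
        for _ in range(bits_per_pattern):
            fingerprint |= 1 << rng.randrange(n_bits)
    return fingerprint
```

Because bits are only ever ORed in, the fingerprint of a substructure's patterns is always a subset of the fingerprint of any molecule containing those patterns, which is what makes such fingerprints useful for screening.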
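The similarity thresholds above refer to the Tanimoto coefficient, which for two bit-string fingerprints is the number of bits set in both divided by the number of bits set in either. A minimal sketch follows, with fingerprints represented as Python integers; that representation, the function names, and the convention of returning 0.0 for two empty fingerprints are assumptions made here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints stored as non-negative ints."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return bin(fp_a & fp_b).count("1") / union


def is_analogue(fp_reference, fp_candidate, threshold=0.85):
    """Apply the similarity cut-off quoted in the text (0.85 by default)."""
    return tanimoto(fp_reference, fp_candidate) >= threshold
```

Raising `threshold` to 0.9 or 0.95 corresponds to the more preferred and still more preferred levels of similarity.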
  • amino acids histidine (His) and serine (Ser) function together as key catalytic amino acids in the active sites of such diverse enzymes as the serine- and thiol-proteases, lipases, and esterases.
  • Ser and His are also conserved in the intein-extein junctions of the phylogenetically widespread self-splicing proteins and at the N- and C-termini of the homing endonucleases spliced from them.
  • Ser-His is the shortest peptide ever reported to show cleavage activity with multiple categories of natural substrates.
  • Oligopeptides and cleavage substrates were purchased as an acetate salt from Sigma and Bachem (HPLC purified), or as the dipeptide from Research Genetics. Other oligopeptides were purchased from either Sigma or Research Genetics. The powdered oligopeptides were dissolved in double deionized and sterilized (dds) H2O, and were then either filter- or autoclave-sterilized. Plasmid DNA pBR322 and λ-DNA were purchased from Life Science Technology.
  • a 60-mer single-stranded oligonucleotide 5'-CGGATTACCA GGGATTTCAG TCGATGTACA CGTTCGTCAC ATCTCATCTA CCTCCCGGTT-3' was purchased from Integrated DNA Technologies.
  • the 5' end of the oligonucleotide was labeled with [γ-32P]ATP (Amersham) by T4 polynucleotide kinase.
  • Bovine serum albumin (BSA) and lysozyme were purchased from Sigma and were dissolved in ddsH2O.
  • the carboxyl ester p-NPA was purchased from Sigma and was dissolved in isopropanol.
  • Ser-His related oligopeptides were individually mixed with a cleavage substrate (bovine serum albumin, 5 mg) in Britton-Robinson (B-R) buffer (equal amounts of phosphate, borate, and acetate), to buffer reactions in the pH range 5-9, and ddsH2O to a final volume of 20 µL in PCR reaction tubes, sealed, and incubated in a GeneAmp PCR System (Perkin-Elmer 9600) at designated temperatures (e.g., 50°C) for pre-determined periods of time (6 to 48 hours).
  • B-R Britton-Robinson
  • the cleavage reaction of p-NPA (2 mM) with Ser-His was carried out in triplicate in a 96-well microtiter plate at a designated temperature (e.g., room temperature) in B-R buffer (40 mM, pH 6) in a volume of 100 µL, and was monitored, recorded, and analyzed using a SPECTRAmax 250 microtiter plate reader system at a wavelength of 400 nm. Cleavage by Ser-His was compared with that by chymotrypsin (pH 7.8).
  • λ-DNA was incubated with or without dipeptide in B-R buffers of varying pH at either 37°C or 50°C for 24, 48, and 72 hr. All samples were subjected to electrophoresis in a 1% agarose gel.
  • DNA cleavage fragments which were then incubated for 24 hours at 12 °C with T4 DNA ligase in a ligation buffer containing ATP.
  • the ligation reaction samples were subsequently subjected to electrophoresis on a 1% agarose gel alongside negative control samples for the cleavage and ligation reactions, which were incubated without Ser-His and T4 DNA ligase, respectively.
  • the nucleolytic activity in samples incubated with Ser-His could be detected over wide ranges of pH (from 5 to 9), with a pH value near the pKa of imidazole (pH 6) being optimal for cleavage at 37°C. It is interesting to note that the pKa for the imidazole group of His is about 6, suggesting the importance of the imidazole being positively charged in the reaction.
  • the optimal pH became even more acidic when the incubation temperature was increased to 50°C.
  • the rate of cleavage was also temperature-dependent; incubation at 50°C resulted in faster DNA cleavage than at 37°C (Fig. 1A), and even higher rates of cleavage were observed at 60°C (data not shown).
  • the DNA cleavage rate was found to be also affected by Ser-His concentration. At a given constant DNA substrate concentration, the higher the Ser-His concentration, the faster the cleavage rate until the concentration reached 20 mM. A Ser-His concentration higher than 20 mM resulted in a reduced cleavage rate (data not shown).
  • a circular plasmid DNA, pBR322, was also used as a substrate for cleavage. In the presence of Ser-His, the DNA band corresponding to the supercoiled form disappeared first with a concomitant increase of the relaxed form. As the incubation continued, the relaxed form decreased and a new linear form appeared.
  • the distance between the upper and lower bands at each nucleotide position is approximately 1/4 of the distance between corresponding bands of consecutive nucleotide positions. This difference in distance between lower and upper bands coincides with the difference in average molecular mass of a nucleotide with a 3'-hydroxyl (~330 Daltons) and a nucleotide with a 3'-phosphate (~330 + 80 Daltons). This cleavage pattern is not consistent with the cleavage mechanisms of natural nucleases, which generate single bands at each oligonucleotide position.
  • Ser-His samples were either filter-sterilized or autoclaved, and incubated with DNA in the presence or absence of EDTA, followed by agarose gel analysis.
  • the results of this study indicate that autoclaved Ser-His is as active as the filter-sterilized Ser-His in DNA cleavage (Table 1), whether in the presence or absence of EDTA, suggesting that the observed DNA cleavage activity is not due to polypeptide nuclease contamination.
  • the sulfhydryl side chain of Cys can serve as a nucleophile, as is the case in the active sites of natural thiol-proteases2.
  • the His residue cannot be replaced by any of the amino acids tested, including those with positively charged side chains (Table 2).
  • the cleavage activities of Ser-His are reduced or lost when an amino acid is added to its N-terminus but are retained when one or more amino acids are added to the C-terminus. It is interesting to note that the cleavage activities of Ser-His-Asp, which contains the amino acids of the catalytic triad, are at least as efficient as those of Ser-His under conditions optimized for Ser-His (Table 2).
  • Ser-His was found also to cleave the ester p-NPA.
  • p-NPA showed a rapid linear increase in optical density (OD) at 400 nm over time, which is indicative of cleavage of the p-NPA to p-nitrophenol. This change in OD was found to be dependent on the concentration of Ser-His, as well as on pH and temperature (data not shown).
  • Transition metals particularly Fe2+ and Cu2+
  • Transition metals are known to cleave DNA in the presence of EDTA and other reducing agents, but failed to cleave DNA under reaction conditions optimized for Ser-His (Table 1).
  • Ser-His Ser-His
  • the presence of two 3' cleavage products at each nucleotide position generated from DNA cleavage by Ser-His and the successful ligation of the DNA cleavage fragments are strong evidence against metal-assisted cleavage of the DNA16 but are indicative of the 3'-hydroxyls and 5'-phosphates that support a hydrolysis mechanism.
  • polypeptide enzymes that use an amino acid residue with a hydroxyl group (Ser) or a thiol group (Cys) and a His residue in their active sites to perform peptide or ester bond cleavage.
  • Ser or Cys
  • His a catalytic dyad in these protease active sites is a recurring theme apparent from the evolution of these enzymes.
  • subtilisin is a bacterial serine protease that has very low amino acid sequence homology to chymotrypsin; yet through convergent evolution, it also utilizes the Ser/His combination in its active site 17 .
  • Ser and His are likewise conserved in the active sites of lipases and esterases.
  • a Ser/His dyad was also discovered in the active site of a catalytic antibody that catalyzes the hydrolysis of norleucine and methionine phenyl esters, indicating that antibodies can converge on the active site structures that have been selected by natural enzyme evolution. Protein self-splicing provides another example of the Ser/His catalytic dyad.
  • This peptide bond cleaving process invariably uses Ser or Cys at the N-terminus and His (plus an asparagine (Asn)) at the C-terminus of an internal protein sequence (intein) to enable cleavage at the splice junctions and the rejoining of the external protein sequences (exteins). More interestingly, the spliced intein always has Ser or Cys at its N-terminus and His-Asn at its C-terminus, and functions as a homing endonuclease to cleave chromosomal DNA.
  • Ser and His residues may function not only in the intein splicing reaction, but in the subsequent DNA cleavage as well.
  • the common feature of these various enzyme active sites is embodied in the dipeptide Ser-His, which can itself cleave DNA, proteins, and at least one ester.
  • Computer modeling has predicted a low energy conformation of Ser-His that closely matches the relative orientations of the Ser and His residues in the chymotrypsin active site (data not shown).
  • the dipeptide is suspected to function similarly to the chymotrypsin active site by employing hydrolysis to cleave protein, ester, and even phosphodiester substrates.
  • the requisite N-terminal position of the Ser may be an indication that Ser uses its own α-amino group as a general base for improving the nucleophilicity of the hydroxyl group, as appears to be the case in the hydrolysis of amide bonds by penicillin acylase.
  • the requirement of the His and the optimal cleavage activities near its pKa suggest a possible role for the imidazole group as a general acid in protonating the leaving groups in the cleavage reactions.
  • the dipeptide Ser-His is the shortest peptide ever reported to have multiple cleavage activities.
  • Results of preliminary experiments indicate that in addition to DNA, protein, and ester cleavage, Ser-His is also capable of cleaving RNA (data not shown).
  • Ser-His and related oligopeptides may have played important roles, either independently or as cofactors to RNA, in the hypothetical "RNA world" from which the modern "protein world” emerged.
  • the ability of Ser-His to retain its multiple cleavage activities when amino acids are added internally or to its C-terminus demonstrates the extraordinary evolutionary capacity of the dipeptide Ser-His.
  • the "standard nick translation kit" is the Nick Translation Kit, Cat. No. 976,776 (Roche Diagnostics GmbH, Roche Molecular Biochemicals, Sandhofer Strasse 116, D-68305, Mannheim, Germany), used in accordance with directions associated with version 3, October 1999.
  • the hybridization strength of the probes was also considered, by the following procedure. Equal amounts of λ DNA probe (40 ng) were used in each hybridization reaction.
  • Probes were labeled with DNA Polymerase I in Nick Translation buffer.
  • a more than 100% increase in the labeling specificity in nick translation can be consistently achieved using Ser-His as the DNA nicking agent compared to conventional DNase I with an optimal reaction condition specified by a commercial kit (Roche).
  • the nicking time by Ser-His can be adjusted (shortened) by variation (increase) of nicking reaction temperature. Reaction temperatures of 50, 60 and 70 °C were examined. Computer modeling of Ser-His indicates that the minimal energy conformation of Ser-His is very similar to that of the Ser and His residues in the active site of chymotrypsin, suggesting that Ser-His may cleave DNA using a mechanism similar to that of chymotrypsin, which is both a protease and an esterase.
  • references cited herein, including journal articles or abstracts, published, corresponding, prior or related U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference.
  • any description of a class or range as being useful or preferred in the practice of the invention shall be deemed a description of any subclass (e.g., a disclosed class with one or more disclosed members omitted) or subrange

Abstract

A compound other than Ser-His, having the structure (Ser/Cys)-Xaa_m-His-Xaa_n (EM'), where 0<=(m+n)<=12, is useful as a nuclease or protease. A compound having the structure EM-L-BM, where EM is EM' above or Ser-His, L is a linker and BM is a nucleotide or amino acid sequence binding moiety, is also useful as a nuclease or protease. The compounds EM and EM-L-BM which act as nucleases are useful in labeling a nucleic acid by nick translation.

Description

Dipeptide seryl-histidine and related oligopeptides cleave DNA, protein, and a carboxyl ester
BACKGROUND OF THE INVENTION
Field of the Invention
The invention relates to novel compounds with nuclease and protease activity, and to the use of these and related compounds as nucleases and proteases. Use of the compounds as nicking agents in nick translation is of particular interest.
Description of the Background Art
Enzymes Generally
Biological enzymes are polypeptides or polyribonucleotides that catalyze biochemical reactions in organisms. Although polyribonucleotide enzymes (ribozymes) have been found in various organisms, they are rare in comparison to polypeptide enzymes. Polypeptide enzymes contain active sites where chemical substrates are catalytically converted into products by the enzymes. The active sites consist of multiple amino acid residues, which are usually not directly linked but are held in a precise three-dimensional conformation by various biochemical and biophysical forces. It is this precise three-dimensional conformation that creates a unique cleft and biochemical micro-environment that allows only one or a few chemical substrates to gain entry and to be reacted upon. An active site is a small part of an enzyme, but the rest of the enzyme is important for the maintenance of the precise conformation of the active site. It is the side chains of the amino acids that participate in the enzymatic reaction. Thus, the amino acids that are involved in the catalytic reactions in the active sites are most often those with reactive side chains: positive or negative charges (Lys, Arg, or Asp, Glu), polar groups such as -OH (Ser, Thr, Tyr), -SH (Cys, Met), or an imidazole group (His). Also, it is almost always the coordinated participation of two or more side chain groups that allows the specific and efficient catalytic reaction to occur.
In general, the most useful enzymes are the ones which are highly specific, since they are less likely to cause undesirable side reactions. Unfortunately, the number of available specific enzymes is much smaller than the number of target sites of interest. There is an unfulfilled need for an efficient method of generating an enzyme specific to an arbitrary target, especially a peptide or nucleic acid target.
The active site of an enzyme may, but need not, be identical to its binding site. The binding site residues are those directly involved in its binding to the substrate. The active site residues are those directly involved in the enzymatic modification of the bound substrate.
One family of biological enzymes is the proteases. These are enzymes that use proteins as their substrate and catalyze protein cleavage (proteolytic) reactions. There are several subgroups of proteases, classified by the key catalytic amino acid residue in the active site of the enzyme. In serine proteases such as chymotrypsin and trypsin, amino acid residues Ser, His, and Asp form a catalytic triad, and cleave the protein substrate by a hydrolysis mechanism. Within the catalytic triad, Ser and His have been determined to be the most important residues, since the substitution of either of the two residues with unrelated amino acids essentially abolishes the proteolytic activity of the enzyme, whereas the substitution of Asp allows the enzyme to maintain a significantly reduced but measurable proteolytic activity. In thiol proteases, the Ser in the active site is replaced with Cys. Since the -SH group in Cys is chemically very similar to the -OH group in Ser, thiol proteases catalyze the proteolytic reaction by a hydrolysis mechanism that is very similar to the one used by serine proteases.
The nucleic acid-binding enzymes include methylases, ligases, polymerases, replicases, and nucleases. Nucleases are enzymes that cleave nucleic acids (DNA and RNA). DNA-cleaving nucleases include restriction endonucleases, homing endonucleases, topoisomerases, and nucleases involved in genetic recombination and DNA repair. DNA nucleases cleave DNA by catalytic hydrolysis of the phosphodiester bonds, which are very resistant to non-catalytic hydrolysis. Most nuclease-catalyzed phosphodiester bond cleavage proceeds by P-O bond cleavage. The hydrolytic (P-O) cleavage of the phosphodiester bond proceeds through an SN2(P) mechanism that involves the generation of an electron-rich pentacoordinate phosphorane as a reactive intermediate.
Nucleases include both exonucleases, which degrade the ends of a nucleic acid, and endonucleases, which can attack an internal site in the nucleic acid. A nuclease may have both exo- and endonuclease activity. Nucleases also differ in their degree of processivity, which is their ability to repeatedly attack the same substrate before releasing it. A nuclease with high processivity will cause (given appropriate conditions) more degradation than one with low processivity. A nuclease may, but need not, be specific for single or double-stranded nucleic acid, for DNA or for RNA, and for particular nucleic acid sequences. If it is sequence-specific, it may recognize one sequence but cleave the nucleic acid somewhere other than the recognition site. The degree of specificity may vary; it is not an all-or-nothing proposition.
Amino acids with charged or polar side chains are involved in the active sites of nucleases. Asp and Glu, two negatively charged amino acids, and Tyr, an amino acid with an OH group on its side chain, are located in the active site and participate in the DNA cleavage carried out by the exonuclease activity of DNA polymerase I.
In the active site of staphylococcal nuclease, Asp, Glu, as well as two positively charged Arg residues are involved in the DNA cleavage. Amino acids Ser and His are not usually found in the active site of DNA nucleases. However, His is often found in RNA nucleases (RNases) such as ribonuclease A and RNase T1. These observations suggest that although Ser-His is able to cleave DNA, it was not the combination selected by nature, for one reason or another, for catalyzing the DNA cleavage reaction.
Interestingly, one exception can be found in homing endonucleases, which are a group of enzymes whose catalytic activity results in self-propagation. The sequences that code for these endonucleases usually interrupt genes by localizing as open reading frames in introns or as in-frame spacers in protein-coding sequences. The target of a homing endonuclease is its cognate intronless or spacerless allele. The endonuclease initiates a DNA mobility or "homing" event by making a double-strand cut in its target. Interestingly, the homing endonuclease resulting from an in-frame polypeptide spacer can first function as a self-splicing protein cleavage enzyme (specifically termed an intein). After the intein cleaves itself out of the "host" protein, the intein functions as a homing endonuclease by cutting a target DNA at specific sites. In other words, the in-frame spacer, or the intein, can function first as a protease and then as an endonuclease (DNase) by cleaving both protein and DNA. Most interestingly, the N-terminus and the C-terminus of inteins, which participate in the self-splicing, always contain a Ser (or Cys) and a His residue, respectively. Although it is not known how inteins function as endonucleases, the terminal Ser and His may be involved. This is the only known case in nature in which a natural polypeptide can function as a protease and as an endonuclease, and at the same time invariably contains Ser and His residues at its ends.
Nucleases contain both substrate binding and catalysis sites. These two sites can be next to each other or overlapping. EcoRV is one of the restriction endonucleases that have been studied in detail. EcoRV recognizes the palindromic double-stranded sequence GATATC and cleaves at the phosphodiester bond between the first T and A, generating a blunt end. This restriction specificity is achieved with retention of catalytic prowess. A change of a single base pair in the recognition sequence lowers the cleavage rate more than a millionfold. EcoRV is a dimer of identical subunits, and binds DNA so that the twofold axis of the target site coincides with the twofold axis of the enzyme. Thus, the symmetry of the endonuclease matches the symmetry of its targets. The EcoRV endonuclease searches DNA for its GATATC target sequence by diffusing along its major groove. Specifically, a surface loop from a β turn of each subunit makes contact with the major groove. When the specific sequence is encountered, a large structural rearrangement occurs in both the enzyme and its DNA target. In this induced fit, DNA becomes kinked by 50 degrees at the center of the hexanucleotide recognition site. Each recognition loop forms six hydrogen bonds, all in the major groove, with the outer two base pairs of a GAT half site. Most significantly, Mg++, which is essential for hydrolysis, enters the catalytic site and becomes coordinated only when the target sequence is encountered.
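The recognition behavior described for EcoRV can be illustrated with a short sketch. It is only a string-level illustration, not a model of the enzyme: the function names are hypothetical, recognition is treated as an exact match (the text reports that a single base-pair change lowers the rate more than a millionfold rather than to zero), and only the top strand is shown.

```python
# Illustrative sketch only: locate EcoRV sites (GATATC) and the blunt cut position.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
ECORV_SITE = "GATATC"
CUT_OFFSET = 3  # the cut falls between GAT and ATC on both strands (blunt end)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site: str) -> bool:
    """A double-stranded site is palindromic if it equals its reverse complement."""
    return site == reverse_complement(site)


def ecorv_cut_positions(seq: str) -> list[int]:
    """Return the 0-based index of the first base after each cut."""
    positions = []
    start = seq.find(ECORV_SITE)
    while start != -1:
        positions.append(start + CUT_OFFSET)
        start = seq.find(ECORV_SITE, start + 1)
    return positions


def digest(seq: str) -> list[str]:
    """Split the top strand at every EcoRV cut position."""
    cuts = [0] + ecorv_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]
```

Because GATATC equals its own reverse complement, the same offset applies to both strands, which is why the cut leaves a blunt end rather than an overhang.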
Restriction enzymes are found in many microorganisms (bacteria) and have protective functions for the host. These enzymes recognize specific target sequences, and cleave either within or outside of this sequence. While restriction enzymes are very useful in biological research as a means of nucleic acid manipulation, they are limited in the number of target sequences they can recognize. A site-specific nucleic acid cleavage molecule that can recognize and cleave any specific sequence would be highly desirable.
Nonspecific nucleic acid cleavage agents include transition metal ions, particularly Fe++ and Cu++, and reducing agents such as ascorbate. However, these agents cleave nucleic acids by mechanisms other than hydrolysis, such as oxidation. Thus, the cleavage products usually lose one or more bases at the cleavage sites.
Nick Translation
A nick is the cleavage of just one strand in a double-stranded nucleic acid. (If the substrate is nicked a second time at the same site, the result is the complete cleavage of the double-stranded nucleic acid.) Thus, nicking agents necessarily have some potential to cleave nucleic acids as well as nick them, as a result of successive nicks. One of the many utilities of certain DNA-binding enzymes has been in nick translation. Nick Translation is a procedure commonly used in molecular biology laboratories for the labeling of DNA probes by radioactive or nonradioactive means. Numerous companies, such as Boehringer Mannheim Biochemicals (BMB), Promega, and others, sell nick translation kits for this procedure, which rely on the ability of DNase I to nick DNA and of DNA Polymerase I to incorporate labeled nucleotides into nicks in the substrate DNA generated by DNase I. Deoxyribonuclease I (DNase I) is an enzyme isolated from bovine pancreas and is an endonuclease that hydrolyzes double-stranded and single-stranded DNA to a complex mixture of mono- to oligonucleotides with 5'-phosphate and 3'-hydroxyl termini. In the presence of Mg++, DNase I attacks each strand of DNA independently (nicks) and the sites of cleavage are distributed in a statistically random fashion. In the presence of Mn++, DNase I cleaves both strands of DNA (double-stranded breaks) at approximately the same site to yield fragments of DNA that are blunt-ended or have protruding termini only one or two nucleotides in length. Escherichia coli DNA polymerase I adds nucleotide residues to the 3'-hydroxyl terminus that is created when one strand of the double-stranded DNA molecule is nicked. In addition, the enzyme, by virtue of its 5' to 3' exonucleolytic activity, can remove nucleotides from the 5' side of the nick.
The elimination of nucleotides from the 5' side and the sequential addition of nucleotides to the 3' side results in movement of the nick (nick translation) along the DNA (Kelly et al. 1970). By replacing the preexisting nucleotides with highly radioactive or labeled non-radioactive nucleotides, it is possible to prepare 32P- and other-labeled DNA (Maniatis et al. 1975, and DIG labeled probe).
As the nicks in the substrate DNA are translated in the 5' to 3' direction, labeled nucleotides are incorporated into the DNA, generating a randomly labeled DNA probe.
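The movement of a nick described above can be pictured with a toy model. This is a minimal sketch under stated assumptions, not a simulation of the enzymes: one strand is written 5' to 3' as a string, a nick is a gap between two adjacent positions, each polymerase step replaces the next base on the 5'-phosphate side of the nick with a labeled copy (shown in lowercase), and the function names are hypothetical.

```python
def nick_translate(strand: str, nick_index: int, steps: int) -> tuple[str, int]:
    """Toy nick translation on one strand written 5'->3'.

    nick_index is the index of the first base downstream of the nick.
    Each step replaces that base with a labeled copy (lowercase) and
    moves the nick one position toward the 3' end.
    Returns the updated strand and the new nick position.
    """
    if not 0 <= nick_index <= len(strand):
        raise ValueError("nick_index must lie within the strand")
    bases = list(strand)
    end = min(nick_index + steps, len(bases))  # the nick cannot pass the 3' end
    for i in range(nick_index, end):
        bases[i] = bases[i].lower()  # unlabeled nucleotide out, labeled one in
    return "".join(bases), end


def labeled_fraction(strand: str) -> float:
    """Fraction of positions carrying a labeled (lowercase) nucleotide."""
    return sum(base.islower() for base in strand) / len(strand)
```

In this picture the sequence itself never changes; only the fraction of labeled positions grows as the nick travels, which is what makes the product usable as a probe for the original sequence.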
The specific activity of the nick-translated DNA probe depends not only on the specific activity of the dNTPs, but also on the extent of nucleotide replacement of the template. This can be controlled by varying the amount of DNase I in the reaction. The aim is to establish conditions that will result in incorporation of about 30% of the [α-32P]dNTPs into DNA.
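The 30% target can be expressed as simple arithmetic. The sketch below assumes, hypothetically, that radioactivity is read out as counts per minute (cpm) for the whole reaction and for the fraction incorporated into DNA; the measurement method, the function names, and the example numbers are illustrative assumptions and are not taken from the text.

```python
def percent_incorporation(incorporated_cpm: float, total_cpm: float) -> float:
    """Percentage of the input label that ended up in DNA."""
    if total_cpm <= 0:
        raise ValueError("total_cpm must be positive")
    return 100.0 * incorporated_cpm / total_cpm


def specific_activity(incorporated_cpm: float, dna_micrograms: float) -> float:
    """Incorporated counts per microgram of probe DNA (cpm/ug)."""
    if dna_micrograms <= 0:
        raise ValueError("dna_micrograms must be positive")
    return incorporated_cpm / dna_micrograms


# Hypothetical numbers: 1.0e7 cpm added, 3.0e6 cpm recovered in 0.5 ug of DNA.
example_percent = percent_incorporation(3.0e6, 1.0e7)
example_activity = specific_activity(3.0e6, 0.5)
```

With these made-up inputs the incorporation comes out at the 30% the text aims for; more extensive nicking would raise the number at the cost of smaller fragments.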
The size of DNA after nick translation also depends on the amount of DNase I added to the reaction and the amount of DNase contaminating the preparation of DNA polymerase.
The standard Nick Translation Kit supplies an Enzyme Mixture containing DNA Polymerase I and an undisclosed amount of DNase I. The nicking activity of DNase I, therefore, cannot be modulated independently from the polymerase activity of DNA Polymerase I. Excessive nicking of the DNA substrate by DNase I in the Nick Translation reaction results in some degree of fragmentation of the DNA and in suboptimal labeling of the probes. The Nick Translation reaction cannot be shortened to avoid excessive nicking without a concomitant loss in polymerization and labeling.
Role of Serine and Histidine in Enzymatic Activity
Serine and histidine are found in the binding sites of both proteases and nucleases. The amino acids serine (Ser) and histidine (His) also function together in the active sites of various natural proteases, esterases and lipases, but not in any natural nucleases, as direct participants in enzymatic reactions1-7. Ser and His are directly involved in the peptide bond and ester bond cleavage reactions of the serine proteases chymotrypsin, trypsin, and elastase1,8-10. Through complex folding of polypeptides in protein enzymes, the side chain functional groups of serine and histidine residues that may be distantly separated in the amino acid sequence are brought together in highly specific orientations in the active sites of these enzymes to allow them to cooperate in enzymatic reactions.
Furthermore, the active sites of trypsin, chymotrypsin, and elastase all use Ser and His with Asp to form a catalytic triad. Although Asp forms a catalytic triad with Ser and His in the active sites of serine proteases1, lipases3-5, and esterases6,7, the Ser/His dyad has been shown to be sufficient for the cleavage reactions8-11. With its hydroxyl functional group, serine usually plays one of two roles in enzymatic reactions, serving either as a hydrogen donor or as a nucleophile. It is worth noting that phosphorylation of Ser in polypeptides is a key signal transduction step in all organisms, suggesting an early coupling of Ser and phosphate.
The imidazole functional group of histidine is likewise important in many enzymatic reactions, serving as a general acid or base in catalysis. Histidine, for example, is known to be an essential amino acid residue in the active sites of all RNases examined to date.
Interestingly, serine and histidine have never been found to function together in carrying out the nucleic acid cleavage of natural DNases or RNases. Instead, the anionic amino acids glutamate (Glu) or aspartate (Asp), the cationic lysine (Lys) or arginine (Arg), and sometimes a hydrogen donor such as tyrosine (Tyr) are typically found to be involved in the DNA cleavage reactions of most DNases and restriction endonucleases, while natural RNases use histidine (His) in conjunction with Lys and/or Arg in their active sites.
The exonuclease active site of DNA polymerase I does not contain either Ser or His, see Beese and Steitz, EMBO J. 10:25-33 (1991). That is likewise true of the active site of staphylococcal nuclease, see Loll and Lattman, Proteins Struct. Funct. Genet., 5:183-201 (1989), and of EcoRV, see Winkler, et al., EMBO J., 12:1781-1795 (1993). The active sites of Ribonuclease A and RNase T1 contain His but not Ser.
The roles of Ser (or related amino acid residues) and His in the peptide bond and ester bond cleavage reactions of various enzymes are well documented. It was not known before the present inventors' work that the dipeptide Ser-His could itself exhibit these activities or that Ser-His could cleave DNA. In a Chinese Patent Abstract, 96114313.4 filed December 13, 1996, published August 27, 1997, it was disclosed that Ser-His, diisopropyl phosphoryl serine, and diisopropyl phosphoryl threonine could all cleave DNA (and it was suggested that they could also cleave RNA). It was speculated that "if coupled with nucleic acid sequence recognition system", the resulting site-specific artificial enzyme construct would be useful as a therapeutic agent in gene therapy. However, no particulars were given as to how to couple it to a binding moiety or as to what binding moieties might be used. Also, there was no suggestion as to any modification of Ser-His other than conjugating it to a nucleic acid binding moiety. The abstract did not disclose or suggest any protease or esterase activities of Ser-His, and did not disclose or suggest that, besides being able to cleave DNA, it could also nick it in a manner useful in nick translation labeling protocols.
SUMMARY OF THE INVENTION
Because the Ser/His dyad is highly conserved and performs essential catalytic roles in the active sites of many proteases, lipases, and esterases, coupled with the consideration that polypeptide enzymes likely evolved around oligopeptides that possessed in a primitive degree the fundamental functions of the active sites of modern protein enzymes, we hypothesized that the dipeptide Ser-His and related oligopeptides might exhibit rudimentary cleavage activities. We also hypothesized that if the Ser/His dyad is capable of ester bond cleavage, then the dipeptide Ser-His, unencumbered by the tertiary structure of a polypeptide active site, may also be capable of phosphodiester bond cleavage. To test these hypotheses, Ser-His and related oligopeptides were incubated with linear and circular DNAs, proteins, and the carboxyl ester p-NPA. Analysis of the incubation products indicated cleavage of all these substrates, and numerous control reactions unequivocally demonstrated the cleavage activities of Ser-His and the related oligopeptides.
Some unique configuration of the hydroxyl group of serine, the imidazole group of histidine, and possibly the N-terminal amino group of this simple molecule is sufficient to enable the dipeptide to function as if it were a nucleolytically active site in and of itself. This dipeptide can randomly cleave both single- and double-stranded DNA and single-stranded RNA, and to the best of our knowledge it is the shortest oligopeptide ever reported to have enzymatic or enzyme-like activity. When incubated with 10 mM Ser-His at 37°C or 50°C and pH ranging from 5 to 9, bacteriophage λ-DNA or circular plasmid DNA pBR322 are gradually degraded into smears as evidenced by electrophoresis on agarose gels, whereas these DNA samples remain uncleaved under the same conditions when Ser-His is not added.
Ser-His per se is a nonspecific nuclease. Such nucleases have utility, for example as nicking agents, in place of (or in addition to) DNase I, in the process of nick translation. The dipeptide seryl-histidine (Ser-His) can also nick DNA. The nicking activity of the dipeptide can be modulated with changes in concentration and incubation temperature. The nicking activity of Ser-His is lower than that of DNase I and it is much easier to control, which makes it more suitable as the nicking agent in Nick Translation. Substitution of Ser-His for DNase I as the nicking agent in Nick Translation shows that both linear λDNA and circular pBR322 DNA substrates can be labeled using the modified Nick Translation reaction, and that the labeled probes generated from these DNA substrates nicked with Ser-His were larger in size and had higher specific activities than control probes labeled using the standard Nick Translation reaction. Furthermore, hybridizations performed with probes generated from DNA nicked with Ser-His show stronger hybridization than probes generated with the Nick Translation Kit.
We have also discovered that Ser-His can act as a nonspecific proteinase; it may be used unchanged as such.
The present invention also relates to derivatives and analogues of Ser-His which have been engineered to have specificity for a particular substrate. These may be used as specific proteinases, esterases, nucleases, etc.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1. Nucleic acid cleaving dipeptide Ser-His.
A. Two-dimensional structure of Ser-His. Because of free rotations of various bonds, Ser-His can assume different three-dimensional conformations. B. Three-dimensional representation of the minimal energy conformation of Ser-His. Please note the proximity of the hydroxyl and imidazole groups in this conformation, and C. Three-dimensional representation of a conformation.
Figure 2. Cleavage of single-stranded DNA or RNA by Ser-His as indicated by changes in absorbance at 260 nm. λ-DNA, single-stranded oligo DNA, or poly(U) RNA were individually incubated with or without Ser-His in Britton-Robinson buffer (pH 6) at 50°C. At predetermined time intervals, a portion of the incubated solution was removed and measured for its absorbance at 260 nm. A. A260 changes of λ-DNA and single-stranded oligonucleotide DNA coincubated with Ser-His, and B. A260 changes of RNA coincubated with Ser-His. S-H = Ser-His, Oligo = single-stranded oligonucleotide (60 nts).
Figure 3. Two proposed mechanisms for DNA cleavage by Ser-His: A. Single phosphate mechanism. B. Dual phosphate mechanism.
Figure 4. Computer-generated models for interaction of Ser-His with single-stranded DNA substrate. A. Single phosphate mechanism. Both the hydroxyl and imidazole groups of Ser-His are in the vicinity of a single phosphodiester bond of the DNA molecule. B. Dual phosphate mechanism. The hydroxyl and the imidazole groups of Ser-His are separately positioned near two adjacent phosphodiester bonds on the DNA molecule. Ser-His is likely to use the dual phosphate mechanism to cleave DNA.
Figure 5. Two possible scenarios for cleavage of phosphodiester bonds in single-stranded DNA by Ser-His. These two cleavage modes result in different 5' and 3' ends in the cleavage products as shown by mode (A) or mode (B). Arrows indicate the site of cleavage.
Figure 6. Schematic representation of site-specific single-stranded DNA cleavage by a Ser-His-linker-PNA molecule. A 60mer oligonucleotide, whose sequence is derived from the luciferase gene, is end-labeled at its 5' end and used as the substrate. The PNA moiety of the cleaving molecule contains 15 bases that are complementary to a region of the substrate sequence. If site-specific cleavage is achieved, the cleaved product, an end-labeled 22mer, will be detected on a PAGE autoradiograph as a band shorter than the original 60mer substrate.
Figure 7: Flow chart of Nick Translation with Ser-His.
Figure 8: diagram showing site of hydrolysis of a phosphodiester bond by Ser-His.
Figure 9: diagram showing nicking mechanism.
Figure 10: diagram showing labeling mechanism.
Figure 11: bar charts showing that DNA probes produced by Ser-His treatment of (A) lambda DNA and (B) pBR322 exhibited higher specific activity than probes produced by nick translation kit.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS OF THE INVENTION
1. Definitions
The term "nuclease" includes enzymes which cleave DNA, RNA or both, in single and/or double-stranded form, and in linear and/or circular form. It includes both exo- and endonucleases, and both sequence-specific and non-sequence-specific enzymes.
The term "protease" includes enzymes which cleave peptides and enzymes which cleave proteins. It includes both sequence-specific and non-sequence-specific enzymes.
2. Target Molecule
The target molecule may be any protein (including peptides) or nucleic acid (including RNAs and DNAs).
The target nucleic acid molecules of particular interest are probes suitable for nick translation, which results in the incorporation of a labeled nucleotide into the target molecule.
3. Non-sequence-specific Enzymes Based on Ser-His
In one embodiment, the enzyme of the present invention is a non-sequence-specific enzyme, such as Ser-His and certain related molecules. These may be referred to, collectively, as Ser-His-Like Enzymes, or "SHLE" compounds, for short. The SHLE compounds include peptides, and peptoids, peptidomimetics and analogues thereof. These non-sequence-specific enzymes are of value in utilities where specificity is not important or is even detrimental. One such utility is nicking nucleic acids preparatory to their labeling by nick translation. In a preferred form, the non-sequence-specific enzyme has the following structure:
(Ser/Cys)-Xaa_m-His-Xaa_n, where 0 <= (m+n) <= 12. Preferably, (m+n) <= 8.
This formula includes all of the molecules with cleavage activities of ++ or better in Table 2, and likewise excludes the Table 2 molecules with lesser activity. Table 2 shows that the Ser can be replaced with Cys, but not with the related amino acids Thr or Asp. Whether it can be replaced with Gly or Ala is not known at this time. The His cannot be replaced with Arg or Lys, although they are positively charged, too. It is clear from Table 2 that the Ser (or Cys) is preferably at the amino terminal of the enzyme. However, the Ser (or Cys) and the His need not be adjacent to each other; at least a three amino acid separation is tolerated, and there is no reason to believe that a greater separation could not be accepted. Moreover, it is clear that at least one amino acid may be placed after the His. Again, there is no reason to believe that this is an upper limit on the C-terminal post-His moiety.
The limitation on m+n in the formula above is thus arbitrary, although in Table 2, the maximum value of m+n was four. This maximum observed value for m+n is merely a more preferred embodiment, as is m<=3 and n<=1.
A series of combinatorial libraries may be used to systematically test all oligopeptides with this formula for nucleic acid or peptide cleavage activity. In each library, the values of m and n are fixed, so that the library is of fixed length. In one embodiment, the library members are expressed in cells, and the peptides are displayed on the surface of the cells or of phage produced by the cells. This has the advantage of amplifiability. Each Xaa may be encoded by a degenerate codon which allows all 20 amino acids; such codons include the NNN and NNK codons. Alternatively, one or more Xaa positions may be restricted, e.g., to a mixture of Ser, His, Asp, and Gly (the four alternatives explored in Table 1), or possibly to one further including Thr and/or Ala, which are fairly frequently exchanged with Ser and/or Cys in families of homologous proteins.
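As an illustration of the degenerate codons mentioned above, the following minimal Python sketch expands NNN and NNK under the standard genetic code and counts the codons, distinct amino acids, and stop codons each allows. The function and variable names are hypothetical, and the standard nuclear genetic code is assumed; a host with a different code would need a different table.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC ambiguity symbols needed here: N = any base, K = G or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]


def summarize(degenerate_codon):
    """Return (number of codons, distinct amino acids, stop codons)."""
    residues = [CODON_TABLE[c] for c in expand(degenerate_codon)]
    amino_acids = set(residues) - {"*"}
    return len(residues), len(amino_acids), residues.count("*")


for name in ("NNN", "NNK"):
    print(name, summarize(name))
```

Under these assumptions both codons reach all 20 amino acids, while NNK uses half as many codons and admits fewer stop codons, which is one common reason for choosing it.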
If we limit the Xaa to four possible amino acids, then the maximum number of different sequences for the longest possible peptide is 4^12, which is about 1.7 x 10^7, and well within the typical diversity range of contemporaneous peptide libraries.
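The diversity figure above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, the N-terminal residue is treated as fixed (Ser or Cys, not both), and the second function simply sums over every (m, n) split allowed by the formula.

```python
def fixed_length_library_size(alphabet_size, variable_positions):
    """Number of sequences when every Xaa position draws from the same alphabet."""
    return alphabet_size ** variable_positions


def whole_formula_size(alphabet_size, max_total=12):
    """Sequences summed over all (m, n) with 0 <= m + n <= max_total.

    For a given total k = m + n there are k + 1 ways to place the His.
    """
    return sum((k + 1) * alphabet_size ** k for k in range(max_total + 1))


print(fixed_length_library_size(4, 12))   # four allowed residues, 12 Xaa positions
print(fixed_length_library_size(20, 12))  # all 20 residues, for comparison
print(whole_formula_size(4))
```

With all 20 amino acids allowed at each position the longest library grows far beyond what a restricted four-residue alphabet requires, which is the practical argument for restricting the Xaa positions.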
Ser is encoded by TCN, AGT and AGC. Cys is encoded by TGT and TGC. Hence, it is preferable to prepare separate Ser-XXX and Cys-XXX libraries, where "XXX" denotes the remainder of the peptide. Once prepared, the libraries may be screened separately, or pooled together. Most of the entire set of formula peptides may be prepared in one step if, in synthesizing the encoding DNA, the DNA is synthesized in steps in which one adds one or more codons in each step, rather than just one base. In that case, in step 2, one adds a mixture of Xaa, Xaa-Xaa, Xaa-Xaa-Xaa, etc. His is added in step 3 and another variable length mixture in step 4. This approach does not, however, produce those subsets in which m, or n, or both, are zero. Those would need to be handled separately.
A variation would be, in step 2, to add a mixture of Xaa-His, Xaa-Xaa-His, Xaa-Xaa-Xaa-His, etc., effectively merging steps 2 and 3 above and also assuring that the m=0 situation is covered. One would then set aside an aliquot of the DNA (for which n=0) and, in step 3, add a mixture of Xaa, Xaa-Xaa, Xaa-Xaa-Xaa, etc. This variation produces the entire set of formula peptides in one operation.
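The difference between the two synthesis schemes can be made concrete by enumerating which (m, n) pairs each one reaches. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that the merged step 2 mixture also contains His alone (m=0), which is how the m=0 case would be covered.

```python
MAX_TOTAL = 12  # upper bound on m + n in the formula


def all_formula_pairs(max_total=MAX_TOTAL):
    """Every (m, n) allowed by 0 <= m + n <= max_total."""
    return {(m, n) for m in range(max_total + 1) for n in range(max_total + 1 - m)}


def stepwise_scheme(max_total=MAX_TOTAL):
    """Step 2 adds one or more Xaa, step 3 adds His, step 4 adds one or more Xaa."""
    return {(m, n) for m in range(1, max_total) for n in range(1, max_total + 1 - m)}


def merged_scheme(max_total=MAX_TOTAL):
    """Step 2 adds Xaa_m-His (assumed to include m = 0); an aliquot is set
    aside (n = 0) and the rest receives one or more Xaa in step 3."""
    with_his = [m for m in range(max_total + 1)]
    aliquot = {(m, 0) for m in with_his}
    extended = {(m, n) for m in with_his for n in range(1, max_total + 1 - m)}
    return aliquot | extended


missing = all_formula_pairs() - stepwise_scheme()
print(len(all_formula_pairs()), len(stepwise_scheme()), len(missing))
print(merged_scheme() == all_formula_pairs())
```

Every pair missed by the stepwise scheme has m=0 or n=0, matching the statement that those subsets need separate handling.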
Alternatively, the library may be prepared by chemical means, e.g., synthesized on beads or at particular grid positions of a support. This has the advantage that members with high nuclease activity are not lost by inhibiting growth of or even killing a host cell.
Preferably, the library is one of soluble peptides rather than one of peptides immobilized on a support (which could be a nonliving support such as a pin or bead, or a biological support such as a cell or phage) , as the support or the linker to the support could affect the activity.
If the library is one of soluble peptides, some form of deconvolution method, such as that described by Blake, USP 5,565,325, may be used to identify the active peptides. If the library is one in which the peptides are immobilized, it is desirable to use one of the flexible linkers commonly used in the combinatorial peptide art.
However, it is contemplated that certain chemical alterations are acceptable. These chemical alterations include individual or combined modifications at the hydroxyl group, imidazole group, N-terminus, or C-terminus of Ser-His.
A small peptide such as seryl-histidine is easily amenable to modification via two synthetic processes. The necessary modification can be imparted through either 1) chemical modification of each amino acid residue followed by amide bond formation to produce the dipeptide, or 2) modification of the dipeptide as an intact molecular unit. Modification of the serine hydroxyl group can be accomplished in a number of ways, with a large number of attendant groups possible. The easiest modification to conduct will be acylation of the hydroxyl functionality. In general, the conditions needed are an acylating agent such as acetic anhydride, acetyl chloride, or other derivatives. Usually a base is required, and possibly a catalyst such as dimethylaminopyridine. Ethers will also be useful derivatives to prepare from the serine hydroxyl group. Methyl ethers can be prepared from serine, methyl iodide and a base, methyl Meerwein reagent, methyl sulfate, or other similar alkylating agents. Benzyl and substituted benzyl ethers could be placed to control the electronics of the ether group. Silyl ethers are important modifications due to their perceived lack of hydrogen bonding ability. These are easily prepared from a substituted silyl chloride/triflate, base and catalyst. Subsequently, steric bulk and/or electronic differences can be applied to the hydroxyl moiety with the goal of increasing or decreasing affinity.
The imidazole functionality of the histidine residue can be easily functionalized in three different areas: 1) the τ N-H group, 2) the π N group, or 3) C-2 of the imidazole ring. Generally, sulfonyl derivatives of the N-H group (Me2NSO2Cl, Et3N) can be easily prepared. Alkyl derivatives, carbamates, and phenacyl groups have all been placed on the N-H group. Substitution at C-2 of the imidazole ring has been accomplished, placing F, CF3, and substituted alkyl derivatives. Conditions include an alkyl or fluoro acid, AgNO3, sulfuric acid, and ammonium persulfate to conduct a radical oxidative decarboxylation.
In identifying candidate non-peptide replacements for Ser-His, it may be useful to compare the structure of Ser-His to that of the proposed replacement using "structural descriptors", calculating an overall structural similarity between the compounds. The structural descriptors which may be used include, but are not limited to, those listed in Patterson, et al. (1996), Klebe and Abraham (1993), Cummins, et al. (1996), and Matter (1997). Conventional mathematical methods may be used to select or weight the descriptors.
The 2D fingerprint method described in Matter, et al. (1997) is of particular interest. In essence, the compound is analyzed for the presence or absence of particular molecular fragments, the results being encoded in a binary format. In order to encode the status of a large number of fragments, without assigning a bit to each fragment, each fragment was projected using a pseudorandomization algorithm into a bitstring of limited size (i.e., fewer bits than the total number of unique fragments in the compounds of the database). In addition, the presence of 60 specific functional groups, rings or atoms was encoded in 60 of the total 988 bits. Details are given in UNITY Chemical Information Software, version 2.5, Reference Guide, pp. 45-58, Tripos Inc., 1699 Hanley Rd., St. Louis, MO 63144.
A similar approach is described in Martin, et al., J. Med. Chem., 38:1431-6 (1995). A "Daylight fingerprint" routine was used to search a molecule for all substructures up to seven bonds long and set one bit in a 2048-bit string for each fragment found. A "hashing" algorithm randomly assigned each fragment to one of the possible bits.
Alternatively, the nonspecific enzymes described in this section may be used as enzymatic moieties of sequence-specific artificial enzymes, as further described below.
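The hashed-fingerprint idea described above can be sketched in Python as follows. This is not the UNITY or Daylight implementation: the fragment strings are hypothetical placeholders, CRC-32 stands in for the unspecified hashing algorithm, and the Tanimoto coefficient is used here as one common similarity measure, which the text itself does not name.

```python
import zlib


def hashed_fingerprint(fragments, n_bits=2048):
    """Fold a collection of fragment labels into an n_bits-wide bitstring.

    Each fragment is hashed to one bit position; different fragments may
    collide on the same bit, which is the price of a fixed-size string.
    """
    bits = 0
    for fragment in fragments:
        position = zlib.crc32(fragment.encode("utf-8")) % n_bits
        bits |= 1 << position
    return bits


def tanimoto(fp_a, fp_b):
    """Shared bits divided by bits set in either fingerprint (0.0 if none are set)."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0
    return bin(fp_a & fp_b).count("1") / union


# Hypothetical fragment lists for a reference compound and a candidate.
reference = hashed_fingerprint(["C-O", "C-N", "C=O", "N-C=N", "C-C-O"])
candidate = hashed_fingerprint(["C-O", "C-N", "C=O", "C-S"])
print(tanimoto(reference, candidate))
```

A real workflow would generate the fragments with a cheminformatics toolkit rather than by hand; the sketch only shows how fragment presence is reduced to bits and compared.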
4. Sequence-Specific Artificial Enzymes Based on Ser-His
Although Ser-His has been shown capable of nucleic acid cleavage, this nucleolytic activity is extremely low compared to restriction enzymes. This is not surprising since restriction enzymes typically consist of hundreds of amino acid residues folded in highly complicated three-dimensional configurations to produce active sites capable of catalyzing reactions with extreme efficiency and specificity, whereas Ser-His is a simple dipeptide lacking in the structural complexity and chemical diversity of natural polypeptide enzymes. Modern DNases, RNases, and restriction endonucleases use structural motifs such as zinc fingers to recognize and bind specific regions on the substrate, contributing a thermodynamic advantage by properly positioning and confining the pertinent active site functional groups relative to the substrate and providing a kinetic advantage by promoting rapid association of enzyme and substrate in solution.
The relatively low nucleolytic activity of the free Ser-His dipeptide is likely due to its low affinity for its target (presumably the phosphodiester bonds) in the nucleic acid substrates. Ser-His appears to cleave DNA through a hydrolysis mechanism which yields new 3' and 5' termini.
The two proposed mechanisms for the interaction of the dipeptide with its substrate (Figs. 3A and 3B) illustrate that certain conditions must be met for an encounter between Ser-His and DNA to result in cleavage of the substrate. Cleavage is believed to be the consequence of an SN2 nucleophilic attack by the hydroxyl of serine on a phosphorus in a phosphodiester bond, forming a pentacoordinate phosphorane transition state stabilized by a source of positive charge (either the imidazole of His or the N-terminal amino group) in the dipeptide. At pH values near the pKa of histidine (6.5-7), the imidazole group of histidine may serve as a general base to increase the nucleophilicity of serine's hydroxyl. It may also serve as a general acid catalyst to assist departure of the leaving group from the pentacoordinate phosphorane reactive intermediate.
In order to form this transition state, the dipeptide must first approach a phosphodiester bond and assume an appropriate conformation that allows the relevant functional groups to participate in the reaction. Because it lacks the structural and chemical complexity of other nucleases, Ser-His does not have a great deal of affinity for its target. The negative charge on the phosphates presents a barrier to nucleophilic attack, which is precisely why nucleic acids are not susceptible to rapid degradation in water. In order to overcome the electrostatic repulsion between the phosphate target and the hydroxyl nucleophile, the dipeptide must approach the phosphate while in a conformation that presents either the positive imidazole or amino group along with the nucleophilic hydroxyl group. Since the conformation of the dipeptide is rapidly and continuously changing, and since it has no specificity for the target, the probability of a random collision resulting in cleavage is very low.
If this dipeptide could be modified so that its specificity could be increased and if it could be confined in the region of a phosphate, the resulting site-specific nucleolytic molecule would have greatly enhanced nucleolytic activity.
The successful design of an artificial site-specific nucleolytic molecule would require that the active site be able to assume the appropriate conformation and that it be able to be properly positioned relative to the substrate such that reliable cleavage would result at a predetermined location. The difficulties associated with modifying the active site of a polypeptide nuclease to target a specific sequence, without inhibiting its ability to assume the proper conformation, are alleviated considerably when working with seryl-histidine. We have found that additional amino acid residues can be linked to the C-terminal of Ser-His without obliterating the molecule's nucleolytic activity (Table 2). The structural simplicity of the dipeptide is an asset in this case, because fewer constraints are involved in its nucleolytically active conformation than in the formation of the active sites of protein enzymes, which require coordination of multiple polypeptide chains, each having many degrees of freedom.
Thus, in another embodiment, the invention relates to a conjugate of Ser-His with a binding moiety (homing sequence) which provides the desired specificity. This binding moiety is linked, directly or indirectly, to the C-terminal of Ser-His (or related moieties). The binding agent may be a nucleic acid binding moiety, so that the conjugate is a site-specific nuclease, or a peptide binding moiety, so that the conjugate is a site-specific protease.
When the conjugate is a site-specific nuclease, the binding moiety is preferably a PNA, as defined below. By linking a PNA homing sequence to the C-terminal of Ser-His, artificial site-specific nucleolytic molecules can be created that can assume the appropriate conformation for nucleic acid cleavage and can properly position the active site relative to the substrate to result in reliable cleavage at a predetermined location. PNA will be used as an alternative to an oligonucleotide homing sequence, because its bases are linked through peptide bonds instead of phosphodiester bonds. PNA has been shown to form a double helix with single-stranded nucleic acid, and triple helices with double-stranded DNA. The helical structure formed between PNA and oligonucleotides arises through Watson-Crick base-pairing, and the two strands are anti-parallel. The double helix formed between PNA and single-stranded DNA has tighter binding than a double-stranded DNA double helix. Additionally, PNA has been tried as a rare genome cutter. All these suggest that PNA will be useful as a homing sequence for Ser-His to greatly accelerate the rate of bringing the Ser-His to the nucleic acid substrate.
Molecular modeling may be used in the design of the linker used to connect Ser-His with the PNA homing sequence. Computer modeling has been used successfully in many biomolecular interactions, including protein/nucleic acid interactions, ligand/receptor binding, as well as in drug design.
These artificial site-specific nucleolytic molecules will have many useful properties. They should have much greater activity than the Ser-His dipeptide, since the homing sequence would greatly enhance the affinity between the molecule and the substrate and would confine the active site in the proper position relative to the target phosphate. Pairs of these molecules could be used to cut double-stranded DNA in the manner of restriction enzymes. However, unlike restriction enzymes, the activity of these site-specific nucleolytic molecules will not be limited to particular recognition sequences; they will be site-specific yet be able to be customized to target any specific sequence. Additionally, these molecules will be modular in nature; once the parameters required for the linker are calculated using molecular modeling, it will be possible to vary the target sequence of the molecule by changing the homing sequence linked to the Ser-His-linker. These will be particularly useful in human genome and related research areas, which will benefit from the ability to target with precision any sequence without regard to available restriction enzyme recognition sequences. There are also innumerable potential biomedical applications for these molecules, which could be used to target DNA, RNA, and possibly even proteins.
The development of artificial site-specific nucleolytic molecules will also advance or complement other technologies. For example, these molecules will prove useful with ribozyme/antisense technologies. Antisense molecules (either regular oligonucleotides or PNA) can target any sequence, but without cleavage. Conversely, ribozymes can cleave RNA substrates, but are very limited in the sequences they can target. The Ser-His-PNA molecules we hope to develop will combine both of these properties and, like ribozyme/antisense technologies, can be used in vivo. These unique characteristics of the artificial site-specific nucleolytic molecules will complement the utility of ribozymes and antisense molecules in studying gene expression and regulation, and will be especially useful in down-regulating or blocking biological pathways associated with the pathological overexpression of abnormal genes. Additionally, we expect that technologies such as the use of PNA, and the computer modeling we will use to design our linkers, could also be positively impacted by our work. Because Ser-His-PNA is a novel molecule, its design may raise new scientific questions that may in turn lead to new uses for PNA or improved modeling software.
4.1. Specificity
For the purposes of the present invention, the specificity of the enzyme need not be absolute, that is, it need not bind and cleave only the target molecule and no other molecule, provided that its preference for the target molecule as a substrate is sufficiently strong to render the enzyme useful. Specificity may be measured by comparing activity against a target molecule with the predetermined target sequence with activity against a control molecule lacking that sequence. Preferably, the ratio is at least 10:1, more preferably at least 100:1, still more preferably at least 1000:1, under the conditions of interest. The preferred control molecule has a control sequence obtained by random scrambling of the target sequence. One may use a plurality of control molecules and determine if the relative activity against the target versus the controls is such that there is a statistically significant difference, given the mean and s.d. of activity against the controls.
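The comparison described above can be sketched as a small Python helper. The function name and the example numbers are hypothetical; the sketch reports the target-to-control activity ratio and a z-score against the controls, which is one simple way (an assumption here, not a method stated in the text) to judge whether the difference is large relative to the control spread.

```python
from statistics import mean, stdev


def specificity_summary(target_activity, control_activities):
    """Return (ratio to mean control activity, z-score versus the controls).

    control_activities must hold at least two values so that a sample
    standard deviation can be computed, and they must not all be equal.
    """
    control_mean = mean(control_activities)
    control_sd = stdev(control_activities)
    ratio = target_activity / control_mean
    z_score = (target_activity - control_mean) / control_sd
    return ratio, z_score


# Hypothetical cleavage activities in arbitrary units.
ratio, z = specificity_summary(1000.0, [8.0, 10.0, 12.0])
print(f"ratio {ratio:.0f}:1, z = {z:.1f}")
```

With many scrambled controls, a formal significance test could replace the z-score; the ratio is what maps onto the 10:1, 100:1 and 1000:1 preferences.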
4.2. Target Sequence
If the enzyme is a sequence-specific artificial enzyme, the target sequence is an amino acid sequence of a target protein or a nucleotide sequence of a target nucleic acid which is specifically recognized by the BM. The specificity of a nucleic acid is an exponential function of its length, as the probability that a random nucleic acid will be perfectly complementary is 1/4^n, where n is the length of the sequence, assuming that all four bases are equiprobable.
It may be prudent to search a nucleic acid database for potential inadvertent targets before selecting a particular sequence as the target sequence.
Similarly, the probability that a random amino acid sequence will be identical to a given target sequence is 1/20^n, where n is the length of the sequence, assuming that all 20 amino acids are equiprobable.
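The two probabilities above follow the same rule, 1/(alphabet size)^n. A minimal Python sketch, with a hypothetical function name and exact fractions to avoid rounding:

```python
from fractions import Fraction


def random_match_probability(length, alphabet_size):
    """Chance that a random sequence matches a given target exactly,
    assuming every symbol is equiprobable and independent."""
    return Fraction(1, alphabet_size ** length)


# A 15-base nucleic acid target (alphabet of 4) and a 5-residue peptide
# target (alphabet of 20), chosen only as examples.
print(random_match_probability(15, 4))
print(float(random_match_probability(15, 4)))
print(random_match_probability(5, 20))
```

Because real sequences are not random, this is a lower bound on intuition rather than a substitute for the database search suggested above.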
Of course, it is possible that two different amino acid sequences will have similar 3D conformations and hence both be recognized by the binding moiety.
If it is found to be difficult to identify a sufficiently specific BM for a particular target sequence of a target molecule, one may choose a different sequence of the target molecule as the target sequence.
If the enzyme is a nonspecific enzyme, such as Ser-His per se, the target sequence is any enzymatically exposed sequence of the target molecule.
4.3. Cleavage Sequence
The cleavage site is the location on the target molecule where cleavage occurs. Generally speaking, it will not be the target sequence, since that sequence will be occluded by the BM:TM complex. However, it will be near the target sequence, within the radius of action permitted by the length of the LM.
If the EM cleaves the target molecule indiscriminately, the cleavage site may be any point on the target molecule which is in range of the tethered EM while the BM is bound to the target sequence. Depending on the length and flexibility of the LM, certain points may be more often cleaved than others. If free EM has a preference for certain cleavage sites, it may be desirable to take these preferences into account when selecting a target sequence. That is, in choosing a target sequence, one may consider not only the affinity and specificity of the available BM for each potential target, but also the proximity of highly cleavable potential cleavage sites.
4.4. Site-Specific Enzyme Conjugate
A site-specific enzyme conjugate may be formed by conjugating an Enzymatic Moiety (EM) to a Binding Moiety (BM), both as hereafter defined. The EM may be linked directly or indirectly, and covalently or noncovalently, to the BM. The indirect linkage may be by a linking moiety (LM) as hereafter defined. The conjugate may be prepared either as a single unit, or by individually preparing the EM and BM and then conjugating the two. The present invention is not limited to any particular method of conjugation.
In essence, the binding moiety confers specificity on the conjugate, and increases the effective concentration of the bound target in the vicinity of the EM. While the intrinsic specificity of the EM is unchanged, since the effective concentration of the bound target is much higher than the effective concentration of any unbound molecules, the target is selectively cleaved.
Of course, it should be appreciated that the molecule containing the target sequence may not be cleaved exactly at the target sequence, but rather nearby, with the exact cleavage site being at a distance from the target sequence which is limited by the length of the linker moiety.
4.4.1. Enzymatic Moiety (EM)
The preferred enzymatic moiety is the dipeptide Ser-His. However, it may be any of the related nonspecific enzymes set forth above.
4.4.2. Binding Moiety (BM)
The binding moiety (BM) is the component which provides the desired specificity. If the intended use of the conjugate is as a nuclease, the binding moiety will be specific for a particular nucleic acid sequence. If the intended use of the conjugate is as a protease, the binding moiety will be specific for a particular amino acid sequence. When the conjugate is a nuclease, the binding moiety is preferably a peptide or a nucleic acid.
A peptide BM has the advantage that, if the linking moiety is also a peptide, the entire conjugate may be synthesized as a fusion protein.
A peptidic NA-binding BM may be obtained by preparing a phage library displaying random oligopeptides, and screening for phage displaying oligopeptides that bind the nucleic acid target and do not bind control nucleic acids.
The NA-binding BM may also be a nucleic acid (or nucleic acid analogue), in which case it is a nucleic acid (or analogue) which is substantially complementary, and more preferably perfectly complementary, to the target NA sequence.
While the design of a complementary sequence is trivial (G binds to C, A binds to T or U, and vice versa), linking an oligonucleotide with the Ser-His dipeptide is technically more difficult and also creates the potential for self-cleavage.
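The pairing rule just stated can be written out as a short Python sketch. The function name is hypothetical, and the sketch assumes an antiparallel binding orientation, so the complement is reversed; whether a particular analogue is best written in that orientation is a design choice outside this example.

```python
# Watson-Crick partners; U (RNA targets) pairs like T.
_PARTNER = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def complementary_homing_sequence(target):
    """Return the antiparallel complement of a DNA or RNA target,
    written with the bases A, C, G and T."""
    return "".join(_PARTNER[base] for base in reversed(target.upper()))


print(complementary_homing_sequence("ATGC"))
print(complementary_homing_sequence("augc"))
```

Any character other than A, C, G, T or U raises a KeyError, which is a deliberately strict way of rejecting malformed input.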
These problems may be overcome by using a nucleic acid analogue which can be linked to peptides and which is not susceptible to self-cleavage by the conjugate. An example is a PNA ("peptide nucleic acid"). PNAs are DNA analogues whose "building blocks" are normal DNA bases but whose backbone is made with peptide-like bonds instead of sugar-phosphate bonds. Specifically, the achiral backbone is made from N-(2-aminoethyl)-glycine units linked by amide bonds, and is uncharged. The standard monomers A, C, G and T are attached by methylene carbonyl linkages. PNAs can form Watson-Crick pairs with normal nucleotides. PNA oligomers/polymers have higher thermal stability, stronger binding (relatively independent of salt concentration), more specific binding (1 mismatch in a 15-mer PNA lowers the Tm by 8-20°C (15°C avg.); in a 15-mer DNA, by 4-16°C (11°C avg.)) and greater resistance to nucleases than DNAs. They are also protease-resistant.
PNA oligomers/polymers are preferably synthesized using a modified peptide synthesis protocol. Both Fmoc and tBoc methods are often used.
PNAs are described in the following references:
Nielsen, P.E., et al., 1991, Science, 254:1497-1500.
Egholm, et al., 1992, J. Am. Chem. Soc., 114:1895-1897.
Nielsen, et al., 1993, Nucl. Acids Res., 21:197-200.
Hanvey, J.C., et al., 1992, Science, 258:1481-1485.
Egholm, et al., 1993, Nature, 365:566-568.
Wittung, P., et al., 1994, Nature, 368:561-563.
Peffer, N.J., et al., 1993, PNAS USA, 90:10648-10652.
See also Nielsen and Egholm, Peptide Nucleic Acids: Protocols and Applications (1999).
The preferred PNAs are 10-15 bases in length.
4.4.3. Linking Moiety
The linking moiety (LM) or tether may be any chemical structure which (1) sufficiently distances the EM from the BM so that neither substantially interferes with the other, and (2) brings the EM into sufficient proximity with the molecule bound by the BM so that the desired level of specificity of enzymatic activity is obtained.
The linkage of the EM to the BM increases the effective concentration of the EM in the vicinity of the recognition site of the BM. If the linking moiety is perfectly flexible, but has a fixed length L in angstroms, then the EM must lie in a spherical volume centered on the location of the bound BM. That volume V is approximately L^3 cubic angstroms (the exact volume of a sphere of radius L is (4/3)πL^3). The effective concentration of the EM in the vicinity of the binding site is then 1/(V x 6.023 x 10^-4) molar.
One class of linkers are oligopeptide linkers. Such linkers may be based on interdomain linkers occurring naturally in multidomain proteins (especially enzymes with separate binding and catalytic domains), or on loops (reverse turns) naturally linking alpha helices or beta strands in proteins. Or they may be non-naturally occurring linkers.
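The effective-concentration estimate given earlier in this section (one EM confined to a volume of about L^3 cubic angstroms) can be evaluated with the short Python sketch below. The function and constant names are hypothetical. The constant 6.023 x 10^-4 is taken from the text; it equals Avogadro's number times the number of liters in one cubic angstrom (10^-27).

```python
# Avogadro's number (6.023e23 per mole, as used in the text) times
# liters per cubic angstrom (1e-27): converts "one molecule per
# cubic angstrom" into moles per liter.
MOLECULES_PER_A3_TO_MOLAR = 6.023e-4


def effective_concentration_molar(linker_length_angstrom):
    """Effective molarity of one tethered EM, using V = L**3 as in the text."""
    volume_a3 = linker_length_angstrom ** 3
    return 1.0 / (volume_a3 * MOLECULES_PER_A3_TO_MOLAR)


for length in (10.0, 25.0, 50.0):
    print(f"L = {length:4.0f} A -> {effective_concentration_molar(length):.3g} M")
```

Because the volume grows as the cube of L, doubling the tether length lowers the estimated effective concentration eightfold, which is the trade-off between reach and local concentration.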
The length of an oligopeptide linker may be predicted approximately on the basis of the number of residues in the linker, and the expected conformation of the linker. Typical translation per residue, in angstroms, is
antiparallel beta sheet     3.4
parallel beta sheet         3.2
right-handed alpha helix    1.5
3-10 helix                  2.0
pi helix                    1.15
polyproline I               1.9
polyproline II              3.12
polyglycine I               3.8
polyglycine II              3.1
The value of 3.8 angstroms/residue corresponds to a fully extended polypeptide.
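A linker length estimate based on the per-residue translations listed above is simply the residue count times the translation for the assumed conformation. The Python sketch below uses the values exactly as given in this section; the dictionary and function names are hypothetical.

```python
# Translation per residue in angstroms, as listed in this section.
TRANSLATION_PER_RESIDUE = {
    "antiparallel beta sheet": 3.4,
    "parallel beta sheet": 3.2,
    "right-handed alpha helix": 1.5,
    "3-10 helix": 2.0,
    "pi helix": 1.15,
    "polyproline I": 1.9,
    "polyproline II": 3.12,
    "polyglycine I": 3.8,
    "polyglycine II": 3.1,
}


def estimated_linker_length(n_residues, conformation):
    """Approximate end-to-end span (angstroms) of a linker held in one conformation."""
    return n_residues * TRANSLATION_PER_RESIDUE[conformation]


# The same 7-residue linker spans very different distances depending on
# the conformation it adopts.
for conformation in ("polyglycine I", "right-handed alpha helix"):
    print(conformation, round(estimated_linker_length(7, conformation), 1))
```

The estimate treats the linker as rigid in a single conformation, so it is an upper-range guide rather than a prediction for a flexible chain.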
The rms end-to-end distance of a linker of n residues, were the ends free to move, would be (t*n)^0.5, where t is the average translation per residue.
For long polypeptide chains of n residues of amino acids other than Gly and Pro, the rms end-to-end distance is about (130n)^0.5. This reflects steric restrictions on the flexion of the chain. Both Gly and Pro reduce the rms distance, Gly because it directly introduces flexibility, and Pro because the chain tends to change directions.
Examples of naturally occurring linkers include the sequences set forth in Argos, J. Mol. Biol., 211:943-958 (1989). Argos identified linkers by examining tertiary protein structures for the presence of multiple domains, and determining which sequences linked each pair of domains. While Argos examined 1988 data, it is of course feasible to extend his analysis to post-1988 structures, and to use other methods than his of identifying domains.
Argos identified the following linkers: MMA, PSPDV, EVTDV, FIDSSKYT, GYDSTKFK, FIDTTAYT, TSKSAD, KADAR, AGPLHGLAN, GMTEMN, QLGVLPRA, QGSIY, PQHTGT, RGLVV, TDPKFITWS, RVIGSGC, ASCTTN, VAKVTQG, AIFGGFK, TGKSSIMR, NTQA, SSDI, QKGAVTPVKNQGS, EGHPVTSEP, GSGQS, VIAVGAVDS, QDITTAGN, KSQGE, DTYENVYP, RGSTEAA, QTEGRLDN, SLPE, NQDVD, TIEVS, MPSTPHESQ, VPGRLP, KALENPTRP, EGKEL, TAGLIYQ, TSK, ADIK, STDKP, RERIPERVV, NRNPVN, QRFNSAN, SDDFAAG, PSLLKGPS, NGRLPDADK, SKAFEK, SAESA, KRAD.
As shown in Argos Table 3, no amino acid was completely excluded from linkers. However, the following amino acids occurred more often in linkers than their normal frequency of occurrence in proteins: Thr, Ser, Pro, Asp, Gly, Lys, Gln, Asn, and Ala. These amino acids should therefore be considered the preferred components for oligopeptide linkers. The amino acids least favored in Argos' linkers were Trp, Cys and His.
Most of the interdomain linkers were at least five amino acids in length. The mean extension (from C-alpha to C-alpha) was 2.73 angstroms/AA.
The interdomain linkers were about average in flexibility, as reflected by the B (temperature) value of the linker compared to the appropriate mean and standard deviation for B for the appropriate length sequences of the protein as a whole.
It is interesting to compare these statistics with the normalized frequencies of occurrence of amino acid residues in reverse turns in globular proteins. The preferred residues are, in descending order: Pro (1.91), Gly (1.64), Asp (1.41), Ser (1.33), Asn (1.28), Tyr (1.05), Thr (1.03).
It is known in the art to construct fusion proteins wherein two domains are artificially linked into a single polypeptide chain. Examples include: VH-VL by (GGGGS)x3, EGKSSGSGSESKST, or KESGSVSSEQLAQFRSLD; SOD monomers by PVPSTPPTPSPSTPPTPSM. Methods of determining peptide linkers are discussed in, e.g., Huston, USP 5,258,498, Ladner, USP 4,946,778, and Hoffman, USP 5,545,727. The Hoffman patent recommends, inter alia, polyGly (e.g., Gly7 for 25 angstroms), polyGlu, polyAsp, Artemia (Gn-LRRQIDLEVTGL-Gn), Gly(1-3)-Ala(12)-Gly(1-3) (20-40 angstroms), Gly(1-3)-Pro(12-16)-Gly(1-3) (21-48 angstroms), and Gly(1-3)-Asp(n)-Gly(1-3) (26-49 angstroms).
In phage display libraries, it is common to use one of the following sequences to link the displayed peptide or protein to the phage coat protein: GGGS, EGGGS, GGGGG, GGGGSSS, (GGGS)xn, or other sequences rich in Gly, Ser, Pro, Asp, Asn, or Thr.
Typically linkers will be rich in glycine, which, by virtue of its lack of a side chain, typically confers flexibility on a peptide chain which incorporates it.
Other small amino acids with polar groups (e.g., Ser, Thr) are helpful, as they interact with solvent, but are unlikely to interact strongly and fortuitously with other portions of the joined EM and BM. Pairs of dibasic amino acids are undesirable because they are frequent sites for protease activity.
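The two composition guidelines above (favor glycine, avoid pairs of dibasic residues) lend themselves to a quick screen of candidate linkers. The Python sketch below is illustrative: the function names are hypothetical, and a "dibasic pair" is assumed to mean two adjacent Lys/Arg residues (KK, KR, RK or RR).

```python
BASIC_RESIDUES = set("KR")  # Lys and Arg, one-letter codes


def has_dibasic_pair(linker):
    """True if any two adjacent residues are both Lys or Arg."""
    return any(a in BASIC_RESIDUES and b in BASIC_RESIDUES
               for a, b in zip(linker, linker[1:]))


def glycine_fraction(linker):
    """Fraction of residues that are glycine."""
    return linker.count("G") / len(linker)


# Candidate linkers taken from this section.
for linker in ("GGGGSGGGGSGGGGS", "EGKSSGSGSESKST", "KRAD"):
    print(linker, has_dibasic_pair(linker), round(glycine_fraction(linker), 2))
```

Such a screen only flags obvious liabilities; it does not replace the experimental comparison of linkers described in this section.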
Once an EM and a BM are chosen, it is possible to screen for suitable linkers by preparing a library of fusion proteins in which the linker is randomized. The linker may be randomized as to length, composition, and/or specific sequence.
For example:
Library #1: L = (Gly)n, where n is varied.
Library #2: L = Gly or Ser at each position, and the number of positions n is also varied. The number of sequences is the sum of 2^n for n = nmin to nmax.
Library #3: like #2, but L = Gly or Thr at each position.
Library #4: the linker amino acids are Gly, Ser, Pro, Asp, Asn, or Thr, chosen randomly and independently at each amino acid position, and there are n amino acids in the linker.
Library #5: like #4, but also allowing Glu and Arg, which, although fairly large, are still hydrophilic.
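The library sizes implied by the randomization schemes above can be estimated with a short calculation. The following is a minimal sketch, assuming each linker position is chosen independently from a fixed set of residues; the function name and the length range of 1 to 10 residues are illustrative assumptions and are not taken from this specification.

```python
def linker_library_size(alphabet_size: int, n_min: int, n_max: int) -> int:
    """Count distinct linker sequences of length n_min..n_max (inclusive)
    when each position is drawn independently from `alphabet_size` residues."""
    if alphabet_size < 1 or n_min < 0 or n_max < n_min:
        raise ValueError("need alphabet_size >= 1 and 0 <= n_min <= n_max")
    return sum(alphabet_size ** n for n in range(n_min, n_max + 1))


# Illustrative length range of 1 to 10 residues (an assumption, not from the text).
gly_only = linker_library_size(1, 1, 10)     # all-Gly linkers: one sequence per length
gly_or_ser = linker_library_size(2, 1, 10)   # Gly or Ser at each position: sum of 2**n
six_residue = linker_library_size(6, 1, 10)  # Gly, Ser, Pro, Asp, Asn, or Thr
print(gly_only, gly_or_ser, six_residue)
```

The count grows geometrically with the number of allowed residues, which is why the broader alphabets would normally be paired with a narrower length range.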
While all-Gly linkers have been used in artificial fusion proteins, see, e.g., Hoffman USP 5,545,727, they do not appear to be common in natural proteins. Hence, they are an acceptable, but not the most preferred, embodiment. Because PNA is synthesized through peptide bond linkage that is also used by oligopeptide synthesis, an oligopeptide linker is preferred if the BM is PNA-based.
The types and the numbers of amino acids used in a linker will be determined based initially on the results of computer-aided simulation. For computer modeling techniques, see, e.g., section 4.5 below.
Multiple linkers of various amino acid composition and varied linker lengths will be made and tested to evaluate their relative effectiveness and flexibility as linkers, based on the kinetics and specificity of substrate cleavage.
Several possible linkers and PNAs are shown in Table 3. A second class of linkers is nucleic acid linkers.
These are linkers composed of DNA, RNA, or analogues thereof. Such linkers are discussed in detail in Hanson, USP 5,844,107.
Other possible linkers include linkers formed by chemically reacting a bifunctional crosslinking agent with the EM and the BM. This agent has reactive end functions L1 and L2, which may be the same or different. It may be conjugated simultaneously to both the EM and the BM, or first to one and then to the other. By way of example, one end function may be reactive with the carboxy group at the C-terminal of the EM. Or the C-terminal of the EM may be derivatized so that a different functionality, e.g., an amide or thiol, is presented, the end function L1 being one reactive with the new functionality. Similarly, the other end function L2 must be reactive with an original or provided functionality of the BM. Typical end functions are those reactive with carboxy, amino, and thiol groups.
For clarity of nomenclature, the crosslinking agent is a chemical with at least two reactive functions which is reacted with the EM and BM to form the conjugate. The linker is the portion of the original agent which is inherited by the conjugate. Thus, if the agent were L1-(CH2)n-L2, the conjugate might be BM-(CH2)n-EM, and the linker is thus the -(CH2)n-.
The site of conjugation of the crosslinking agent to the BM need not be a single site. However, at least one, and preferably substantially all, of the conjugation sites must be such that the EM may be conjugated to the BM without substantially impairing the binding function of the BM or the desired enzymatic function of the EM. If necessary, a sensitive site may be protected with a suitable protecting group during crosslinking, the protective group being selectively removed afterward.
In some embodiments, a single EM is conjugated to each BM. In other embodiments, multiple EMs are conjugated to each BM, by one or more linkers. A single linking agent may have one or more L1s for conjugating EMs and one or more L2s for conjugating BMs. In a preferred embodiment, a commercially available O-linker will be used to connect Ser-His and the PNAs (Table 3). The length of the O-linker, when fully extended, is about 9.86 Å. However, the O-linker may assume different conformations when connected with the dipeptide and PNA or when it interacts with oligonucleotide substrate. Linkers with one or two units of the O-linker will be made commercially and tested in the site-specific cleavage study to evaluate their relative effectiveness as linkers.
4.5. Computer Modeling
A Silicon Graphics Intergraph (SGI) computer and the molecular simulation software "Macromodel" may be used for computer simulation of molecular interactions between Ser-His, or Ser-His-linker-PNA, and nucleic acids. The information gained from computer simulation of oligopeptide/nucleic acid interaction will be used to assist in prediction of results of cleavage experiments, formulation of cleavage mechanisms, and design of chemical linkers for site-specific cleavage. Important specific information that can be obtained from computer simulations includes spatial orientations of the oligopeptides relative to different DNA/RNA substrates, important distances between the functional groups of the oligopeptides and phosphodiester bonds of the substrates, and the energy levels of different conformations of the oligopeptides. Another important objective of computer modeling is to determine the positions of the Ser-His moiety of the Ser-His-linker-PNA molecules relative to the phosphodiester bond on the DNA substrate. Such computer simulations will provide valuable information for the design of linkers with the best length and the least steric hindrance, enabling efficient site-specific cleavage of the nucleic acid by Ser-His-linker-PNA molecules.
4.6. Combinatorial Analysis
It is possible that in order to obtain a functional site-specific enzyme conjugate, it will be desirable to explore several different linkers, or to modify a naturally occurring binding moiety. It is possible to simultaneously and systematically explore an enormous number of possible linkers and/or binding moiety mutants by synthesizing and screening appropriate combinatorial libraries.
In essence, this means that the library should be formulated such that (1) the members can bind to and cleave the target molecule, (2) members which so bind and act can be differentiated from those which do not, and (3) the successful members can be fully characterized.
The library may be synthesized so (1) the members are displayed on the surface of a living support (a cell or virus), (2) the members are displayed on the surface of a nonliving support (pin, bead, sheet, etc.), or (3) the members are provided in soluble form. It is necessary to be able to distinguish the successful members from the unsuccessful members; this may be done by physical separation, or by recognizing a change in a signal as a result of the binding. It is also necessary to characterize the successful binding member. If the member is a peptide or nucleic acid, it may be sequenced directly. The sequence of a peptide may be inferred if its coding sequence is sequenced. Or the members may be displayed in a distinctive position on a support, or tagged with a distinctive tag, whereby their structures may be inferred. Finally, a successful sequence may be inferred by comparison of related mixtures, as in Blake, infra.
One popular library system is phage display, see Smith, Science, 228:1315-17 (1985), Harrison, Meth. Enzymol. 267:171-191 (1996), Ladner, USP 5,223,409. The phage genome is engineered so that a random or semirandom peptide or protein is fused to a phage coat protein, so that the foreign peptide or protein is displayed on the surface of phage. Another is ribosome display, see Mattheakis, Proc. Nat. Acad. Sci. (USA), 91:9022-6 (1994); Hanes & Pluckthun, PNAS 94:4937-42 (1997).
A third is RNA peptide fusion, see Roberts & Szostak, PNAS, 94:12297-12302 (1997). There are many other systems as well.
In some systems, bound and unbound members are separated by immobilizing the target and washing off library members which are not bound to target. If the action of the enzymatic moiety on the target were such as to result in a loss of binding, this could be problematic. One solution is use of a nonbinding surrogate for the binding moiety, e.g., Thr-His for Ser-His. Another solution is to use an intracellular assay, especially one in which the cell dies if the target molecule is not cleaved. Alternatively, one could select for noncleavage, and identify the successful members by a technique akin to replica plating.
5. Reaction Conditions
Ser-His worked over wide ranges of pH, temperature, and concentration. It also worked in various buffering systems. Preferred range for pH: 5.5-7.5; pH 6-6.5 more preferred. Preferred range for temperature: from 20°C up to 80°C, with increased cleavage rate at increased temperature; 37°C-60°C more preferred. Preferred range for concentration: 1 mM to 20 mM; 5-10 mM more preferred, especially for nicking DNA. Compatible buffer systems include but are not limited to Britton-Robinson (B-R) (contains borate, phosphate and acetate), phosphate buffer (PBS), citrate, and acetate buffer. Ser-His cleavage activity is inhibited by Tris buffer. Preferred range for incubation time: 1 hour to 48 hours, depending on concentration and temperature; shorter durations may be effective with higher temperatures or concentrations; longer durations will result in more cleavage.
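The preferred ranges stated above can be encoded as a simple lookup for checking a planned reaction. This is only an illustrative sketch: the dictionary keys, the function name, and the returned labels are hypothetical, and the numeric ranges are copied from the paragraph above (reading the concentration range as 1 mM to 20 mM).

```python
# Ranges transcribed from the Reaction Conditions paragraph; key names are hypothetical.
PREFERRED = {
    "pH": (5.5, 7.5),
    "temperature_C": (20.0, 80.0),
    "ser_his_mM": (1.0, 20.0),
    "incubation_h": (1.0, 48.0),
}
MORE_PREFERRED = {
    "pH": (6.0, 6.5),
    "temperature_C": (37.0, 60.0),
    "ser_his_mM": (5.0, 10.0),
}
INHIBITORY_BUFFERS = {"tris"}  # Ser-His cleavage is reported as inhibited in Tris


def _within(ranges, conditions):
    """True if every listed parameter lies inside its (low, high) range."""
    return all(lo <= conditions[key] <= hi for key, (lo, hi) in ranges.items())


def classify_conditions(conditions, buffer_name):
    """Label a planned reaction relative to the stated preferred ranges."""
    if buffer_name.strip().lower() in INHIBITORY_BUFFERS:
        return "inhibited"
    if not _within(PREFERRED, conditions):
        return "outside preferred range"
    if _within(MORE_PREFERRED, conditions):
        return "more preferred"
    return "preferred"


nicking = {"pH": 6.5, "temperature_C": 50.0, "ser_his_mM": 10.0, "incubation_h": 4.0}
print(classify_conditions(nicking, "Britton-Robinson"))
```

The example conditions are those of the nicking reaction in the nick translation protocol of section 6.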
6. Nick Translation
A preferred protocol for nick translation of DNA with Ser-His and similar compounds is the following:
1. Use linear or circular double-stranded DNA as the substrate.
2. The conditions for the nicking reaction with Ser-His are as follows: 20 ng/µL DNA, 10 mM Ser-His, and 40 mM Britton-Robinson buffer (pH 6.5).
3. Incubate a 20 µL reaction (400 ng DNA) in a sterile microfuge tube at 50°C for 4 hours. Include a control without Ser-His.
4. Pass through a G-50 column pre-equilibrated with distilled H2O. Use a clinical centrifuge (setting 5) and spin for 2 minutes.
Steps 5-9 are the same as the procedure used in the standard Nick Translation Kit (Roche), with the exception that DNA Polymerase I is used instead of the Enzyme Mixture provided in the kit, which also contains DNase I.
5. Use 100 ng of the DNA collected from the column for a 20 µL nick translation reaction. Add the following to the DNA in a sterile microfuge tube: 2 µL 10X Nick Translation buffer, 1 µL each of dATP, dGTP, and dTTP (0.4 mM), 5 µL [α-32P]dCTP (5 µM, 10 mCi/mL), 2 µL DNA Polymerase I (1 unit/µL), and enough distilled H2O to bring the total volume to 20 µL.
6. Incubate at 15°C for 1 hour.
7. Add 5 µL EDTA (0.5 M, pH 8.0) to stop the nick translation reaction and 5 µL yeast tRNA (2.75 mg/mL) to carry the labeled DNA through the column.
8. Pass through a G-50 column pre-equilibrated with distilled H2O to remove unincorporated dNTPs. Use a clinical centrifuge (setting 5) and spin for 4 minutes.
9. Measure the specific activity with a scintillation counter.
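The volume bookkeeping in step 5 can be checked with a few lines of arithmetic. The sketch below reads the protocol volumes as microliters and assumes, purely for illustration, that the DNA eluted from the first column is still at about 20 ng/µL (i.e., no dilution or loss on the column); the function and variable names are hypothetical.

```python
# Fixed reagent volumes (µL) for the 20 µL labeling reaction, as listed in step 5.
REAGENT_VOLUMES_UL = {
    "10X Nick Translation buffer": 2.0,
    "dATP": 1.0,
    "dGTP": 1.0,
    "dTTP": 1.0,
    "[alpha-32P]dCTP": 5.0,
    "DNA Polymerase I": 2.0,
}


def dna_and_water_volumes(dna_ng, dna_ng_per_ul, total_ul=20.0):
    """Return (DNA volume, water volume) in µL needed to reach total_ul."""
    dna_ul = dna_ng / dna_ng_per_ul
    water_ul = total_ul - sum(REAGENT_VOLUMES_UL.values()) - dna_ul
    if water_ul < 0:
        raise ValueError("DNA is too dilute to fit in the reaction volume")
    return dna_ul, water_ul


# Nicking reaction: 20 ng/µL in 20 µL gives the 400 ng quoted in step 3.
nicked_dna_ng = 20.0 * 20.0

# Assumed eluate concentration of 20 ng/µL (an assumption, not stated in the protocol).
dna_ul, water_ul = dna_and_water_volumes(dna_ng=100.0, dna_ng_per_ul=20.0)
print(nicked_dna_ng, dna_ul, water_ul)
```

If the eluate is more dilute than about 12.5 ng/µL, 100 ng of DNA no longer fits alongside the 12 µL of fixed reagents, and the function raises an error rather than returning a negative water volume.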
Any other protocol which utilizes Ser-His, or a related molecule, and which achieves acceptable results may be used in place of that set forth above. For example, in the nicking phase (steps 1-4), the Ser-His may be replaced with one of the other enzymes of the present invention, e.g., Cys-His or Ser-His-Asp, the reaction conditions (e.g., concentrations, temperature, pH, incubation time) may be varied, and the purification step may be replaced with a different purification procedure or possibly omitted altogether. Similarly, in the polymerizing phase (steps 5-8), a different polymerase may be used, the reaction conditions may be varied, the label may be different, and the purification step may be altered or possibly omitted altogether. Step 9 is a QC step and its nature will depend on the choice of label. The nicking step can be shortened by a further increase of incubation temperature. An incubation time of 60 min or shorter can be achieved by increasing the incubation temperature from 50°C to 60°C or higher.
Currently, the Ser-His nick translation procedure has one more step than the conventional nick translation procedure, in that Ser-His has to be removed by centrifugation in a spin column after the nicking reaction and before the polymerization step. This additional step should not be a problem, since conventional nick translation, like Ser-His nick translation, also requires a spin column step at the end of the procedure to separate labeled probe from unincorporated precursors (radioactive or non-radioactive substrates of polymerization). The Ser-His nick translation procedure therefore requires the same type of spin column (two columns instead of the one used in conventional nick translation) and the same centrifuge as the conventional procedure, without any additional equipment.
7. Labeling
In the nick translation labeling embodiment, the nuclease of the present invention is used in conjunction with labeled nucleotides to label a nucleic acid, such as a probe. The labeled nucleotides may be the normal nucleotides G, A, T (U for RNA) and C, or may be unusual nucleotides such as inosine. The label may be radioactive or nonradioactive.
Suitable radioactive labels include 32P, 33P, and 35S; suitable nonradioactive labels include biotin (avidin), fluorophores, chromophores, and other molecules capable of generating a suitable signal and compatible with the target molecule and target nucleotides. Digoxigenin is especially preferred.
8. Combinatorial Non-Peptide Libraries
Where this specification teaches the use of a combinatorial peptide library to identify a useful molecule or moiety, combinatorial libraries of molecules other than peptides may be used, mutatis mutandis. The principal difference between these libraries and peptide libraries is that the libraries cannot be obtained by expression of partially degenerate DNAs in cells.
Known non-peptide libraries include nucleic acid libraries. Examples of candidate simple libraries which might be evaluated include derivatives of the following:
Cyclic Compounds Containing One Hetero Atom
Heteronitrogen: pyrroles; pentasubstituted pyrroles; pyrrolidines; pyrrolines; prolines; indoles; beta-carbolines; pyridines; dihydropyridines; 1,4-dihydropyridines; pyrido[2,3-d]pyrimidines; tetrahydro-3H-imidazo[4,5-c]pyridines; isoquinolines; tetrahydroisoquinolines; quinolones; beta-lactams; azabicyclo[4.3.0]nonen-8-one amino acid
Heterooxygen: furans; tetrahydrofurans; 2,5-disubstituted tetrahydrofurans; pyrans; hydroxypyranones; tetrahydroxypyranones; gamma-butyrolactones
Heterosulfur: sulfolenes
Cyclic Compounds with Two or More Hetero Atoms
Multiple heteronitrogens: imidazoles; pyrazoles; piperazines; diketopiperazines; arylpiperazines; benzylpiperazines; benzodiazepines; 1,4-benzodiazepine-2,5-diones; hydantoins; 5-alkoxyhydantoins; dihydropyrimidines; 1,3-disubstituted-5,6-dihydropyrimidine-2,4-diones; cyclic ureas; cyclic thioureas; quinazolines; chiral 3-substituted-quinazoline-2,4-diones; triazoles; 1,2,3-triazoles; purines
Heteronitrogen and Heterooxygen: diketomorpholines; isoxazoles; isoxazolines
Heteronitrogen and Heterosulfur: thiazolidines; N-acylthiazolidines; dihydrothiazoles; 2-methylene-2,3-dihydrothiazoles; 2-aminothiazoles; thiophenes; 3-aminothiophenes; 4-thiazolidinones; 4-metathiazanones; benzisothiazolones
For details on synthesis of libraries, see Nefzi, et al., Chem. Rev., 97:449-72 (1997), and references cited therein.
Amino Acids and Peptides
Amino acids are the basic building blocks with which peptides and proteins are constructed. Amino acids possess both an amino group (-NH2) and a carboxylic acid group (-COOH). Many amino acids, but not all, have the structure NH2-CHR-COOH, where R is hydrogen, or any of a variety of functional groups.
Twenty amino acids are genetically encoded: Alanine, Arginine, Asparagine, Aspartic Acid, Cysteine, Glutamic Acid, Glutamine, Glycine, Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Proline, Serine, Threonine, Tryptophan, Tyrosine, and Valine. Of these, all save Glycine are optically isomeric; however, only the L-form is found in humans. Nevertheless, the D-forms of these amino acids do have biological significance; D-Phe, for example, is a known analgesic.
Many other amino acids are also known, including: 2-Aminoadipic acid; 3-Aminoadipic acid; beta-Aminopropionic acid; 2-Aminobutyric acid; 4-Aminobutyric acid (Piperidinic acid); 6-Aminocaproic acid; 2-Aminoheptanoic acid; 2-Aminoisobutyric acid; 3-Aminoisobutyric acid; 2-Aminopimelic acid; 2,4-Diaminobutyric acid; Desmosine; 2,2'-Diaminopimelic acid; 2,3-Diaminopropionic acid; N-Ethylglycine; N-Ethylasparagine; Hydroxylysine; allo-Hydroxylysine; 3-Hydroxyproline; 4-Hydroxyproline; Isodesmosine; allo-Isoleucine; N-Methylglycine (Sarcosine); N-Methylisoleucine; N-Methylvaline; Norvaline; Norleucine; and Ornithine.
Peptides are constructed by condensation of amino acids and/or smaller peptides. The amino group of one amino acid (or peptide) reacts with the carboxylic acid group of a second amino acid (or peptide) to form a peptide (-NHCO-) bond, releasing one molecule of water. Therefore, when an amino acid is incorporated into a peptide, it should, technically speaking, be referred to as an amino acid residue.
A peptide is composed of a plurality of amino acid residues joined together by peptidyl (-NHCO-) bonds. A biogenic peptide is a peptide in which the residues are all genetically encoded amino acid residues; it is not necessary that the biogenic peptide actually be produced by gene expression.
The peptides of the present invention include peptides whose sequences are disclosed in this specification, or sequences differing from the above solely by no more than one nonconservative substitution and/or one or more conservative substitutions, preferably no more than a single conservative substitution. The substitutions may be of non-genetically encoded (exotic) amino acids, in which case the resulting peptide is nonbiogenic.
A conservative substitution is a substitution of one amino acid for another of the same exchange group, the exchange groups being defined as follows:
I Gly, Pro, Ser, Ala (Cys) (and any nonbiogenic, neutral amino acid with a hydrophobicity not exceeding that of the aforementioned a.a.'s)
II Arg, Lys, His (and any nonbiogenic, positively-charged amino acids)
III Asp, Glu, Asn, Gln (and any nonbiogenic negatively-charged amino acids)
IV Leu, Ile, Met, Val (Cys) (and any nonbiogenic, aliphatic, neutral amino acid with a hydrophobicity too high for I above)
V Phe, Trp, Tyr (and any nonbiogenic, aromatic neutral amino acid with a hydrophobicity too high for I above).
Note that Cys belongs to both I and IV.
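The exchange-group definition lends itself to a simple lookup. The following is a minimal sketch covering only the genetically encoded residues listed in groups I-V (nonbiogenic amino acids are omitted); the dictionary and function names are illustrative and not part of this specification.

```python
# Exchange groups I-V for the genetically encoded residues only.
EXCHANGE_GROUPS = {
    "I": {"Gly", "Pro", "Ser", "Ala", "Cys"},
    "II": {"Arg", "Lys", "His"},
    "III": {"Asp", "Glu", "Asn", "Gln"},
    "IV": {"Leu", "Ile", "Met", "Val", "Cys"},  # Cys is listed in both I and IV
    "V": {"Phe", "Trp", "Tyr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one exchange group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in EXCHANGE_GROUPS.values()
    )


print(is_conservative("Ser", "Cys"))  # share group I
print(is_conservative("Cys", "Val"))  # share group IV
print(is_conservative("Ser", "His"))  # groups I and II do not overlap
```

Because Cys sits in two groups, conservativeness under this definition is not transitive: Ser/Cys and Cys/Val are both conservative, while Ser/Val is not.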
A highly conservative substitution, which is preferred, is Arg/Lys/His, Asp/Glu, Asn/Gln, Leu/Ile/Met/Val, Phe/Trp/Tyr, or Gly/Ser/Ala. Additional peptides within the present invention may be identified by systematic mutagenesis of the lead peptides, e.g.:
(a) separate synthesis of all possible single substitution (especially of genetically encoded AAs) mutants of each lead peptide, and/or
(b) simultaneous binomial random alanine-scanning mutagenesis of each lead peptide, so each amino acid position may be either the original amino acid or alanine (alanine being a semi-conservative substitution for all other amino acids), and/or
(c) simultaneous random mutagenesis sampling conservative substitutions of some or all positions of each lead peptide, the number of sequences in total sequence space for a given experiment being such that any sequence, if active, is within detection limits (typically, this means not more than about 10^10 different sequences).
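The sizes of the mutant sets in approaches (a) and (b) follow directly from the peptide length. The sketch below is a rough illustration, assuming substitutions are limited to the 20 genetically encoded amino acids and reading the detection limit as about 10^10 sequences; the function names are hypothetical.

```python
DETECTION_LIMIT = 10 ** 10  # approximate upper bound on distinct sequences


def single_substitution_mutants(length: int, alphabet: int = 20) -> int:
    """Approach (a): each position replaced by each of the other residues."""
    return length * (alphabet - 1)


def alanine_scan_variants(non_ala_positions: int) -> int:
    """Approach (b): each non-Ala position is either original or Ala."""
    return 2 ** non_ala_positions


def max_scannable_positions(limit: int = DETECTION_LIMIT) -> int:
    """Largest k with 2**k <= limit, using integer bit length."""
    return limit.bit_length() - 1


print(single_substitution_mutants(2))   # a dipeptide such as Ser-His
print(alanine_scan_variants(2))
print(max_scannable_positions())
```

For short oligopeptides both sets are tiny compared with the detection limit, so the limit only becomes a practical constraint for approach (c) or for much longer sequences.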
The mutants are tested for activity, and, if active, are considered to be within "peptides of the present invention". Even inactive mutants contribute to our knowledge of structure-activity relationships and thus assist in the design of peptides, peptoids, and peptidomimetics.
Preferably, substitutions of exotic amino acids for the original amino acids take the form of
(I) replacement of one or more hydrophilic amino acid side chains with another hydrophilic organic radical, not more than twice the volume of the original side chain, or
(II) replacement of one or more hydrophobic amino acid side chains with another hydrophobic organic radical, not more than twice the volume of the original side chain.
The exotic amino acids may be alpha or non-alpha amino acids (e.g., beta alanine). They may be alpha amino acids with 2 R groups on the Cα, which groups may be the same or different. They may be dehydro amino acids (HOOC-C(NH2)=CHR).
Cyclic Peptides
Many naturally occurring peptides are cyclic. Cyclization is a common mechanism for stabilization of peptide conformation, thereby achieving improved association of the peptide with its ligand and hence improved biological activity. Cyclization is usually achieved by intra-chain cystine formation, or by formation of a peptide bond between side chains or between N- and C-terminals. Cyclization has usually been carried out with peptides in solution, but several publications have appeared recently that describe cyclization of peptides on beads.
Peptoid
A peptoid is an analogue of a peptide in which one or more of the peptide bonds are replaced by pseudopeptide bonds, which may be the same or different. Such pseudopeptide bonds may be:
Carba Ψ(CH2-CH2)
Depsi Ψ(CO-O)
Hydroxyethylene Ψ(CHOH-CH2)
Ketomethylene Ψ(CO-CH2)
Methylene-oxy CH2-O-
Reduced CH2-NH
Thiomethylene CH2-S-
Thiopeptide CS-NH
N-modified -NRCO-
Peptidomimetic
A peptidomimetic is a molecule which mimics the biological activity of a peptide, by substantially duplicating the pharmacologically relevant portion of the conformation of the peptide, but is not a peptide or peptoid as defined above. Preferably the peptidomimetic has a molecular weight of less than 700 daltons.
Designing a peptidomimetic usually proceeds by: (a) identifying the pharmacophoric groups responsible for the activity; (b) determining the spatial arrangements of the pharmacophoric groups in the active conformation of the peptide; and (c) selecting a pharmaceutically acceptable template upon which to mount the pharmacophoric groups in a manner which allows them to retain their spatial arrangement in the active conformation of the peptide.
Step (a) may be carried out by preparing mutants of the active peptide and determining the effect of the mutation on activity. One may also examine the 3D structure of a complex of the peptide and the receptor for evidence of interactions (e.g., the fit of a side chain of the peptide into a cleft of the receptor, potential sites for hydrogen bonding, etc.).
Step (b) generally involves determining the 3D structure of the active peptide, in the complex, by NMR spectroscopy or X-ray diffraction studies. The initial 3D model may be refined by an energy minimization and molecular dynamics simulation.
Step (c) may be carried out by reference to a template database, see Wilson, et al., Tetrahedron, 49:3655-63 (1993). The templates will typically allow the mounting of 2-8 pharmacophores, and have a relatively rigid structure. For the latter reason, aromatic structures, such as benzene, biphenyl, phenanthrene and benzodiazepine, are preferred. For orthogonal protection techniques, see Tuchscherer, et al., Tetrahedron, 17:3559-75 (1993).
For more information on peptoids and peptidomimetics, see USP 5,811,392, USP 5,811,512, USP 5,578,629, USP 5,817,879, USP 5,817,757, USP 5,811,515.
Analogues
Also of interest are analogues of the disclosed peptides, and other compounds with activity of interest. Analogues may be identified by assigning a hashed bitmap structural fingerprint to the compound, based on its chemical structure, and determining the similarity of that fingerprint to that of each compound in a broad chemical database. The fingerprints are determined by the fingerprinting software commercially distributed for that purpose by Daylight Chemical Information Systems, Inc., according to the software release current as of January 8, 1999. In essence, this algorithm generates a bit pattern for each atom, and for its nearest neighbors, with paths up to 7 bonds long. Each pattern serves as a seed to a pseudorandom number generator, the output of which is a set of bits which is logically ORed into the developing fingerprint. The fingerprint may be of fixed or variable size.
The database may be SPRESI'95 (InfoChem GmbH), Index Chemicus (ISI), MedChem (Pomona/Biobyte), World Drug Index (Derwent), TSCA93 (EPA), Maybridge organic chemical catalog (Maybridge), Available Chemicals Directory (MDLIS Inc.), NCI96 (NCI), Asinex catalog of organic compounds (Asinex Ltd.), or IBIOScreen SC and NP (Inter BioScreen Ltd.), or an inhouse database.
A compound is an analogue of a reference compound if it has a Daylight fingerprint with a similarity (Tanimoto coefficient) of at least 0.85 to the Daylight fingerprint of the reference compound. Preferably, the compounds of the present invention have a similarity of at least 0.85, more preferably at least 0.9, still more preferably at least 0.95, to Ser-His, or to any oligopeptide scoring 2+ or better in Table 2. A compound is also an analogue of a reference compound if it may be conceptually derived from the reference compound by isosteric replacements.
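The Tanimoto coefficient used for this similarity test is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below illustrates only that comparison step on fingerprints represented as sets of on-bit positions; it does not reproduce the Daylight fingerprinting software, and the function names and example bit sets are hypothetical.

```python
def tanimoto(bits_a, bits_b) -> float:
    """Tanimoto coefficient of two fingerprints given as on-bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def is_analogue(reference_bits, candidate_bits, threshold: float = 0.85) -> bool:
    """Apply the similarity threshold stated in the text (default 0.85)."""
    return tanimoto(reference_bits, candidate_bits) >= threshold


# Made-up bit positions, for illustration only.
reference = range(0, 20)     # bits 0..19 set
close_match = range(2, 21)   # bits 2..20 set: 18 shared, 21 in the union
poor_match = range(5, 25)    # bits 5..24 set: 15 shared, 25 in the union

print(tanimoto(reference, close_match), is_analogue(reference, close_match))
print(tanimoto(reference, poor_match), is_analogue(reference, poor_match))
```

Raising `threshold` to 0.9 or 0.95 gives the more preferred cut-offs mentioned above.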
Homologues are compounds which differ by an increase or decrease in the number of methylene groups in an alkyl moiety.
Classical isosteres are those which meet Erlenmeyer's definition: "atoms, ions or molecules in which the peripheral layers of electrons can be considered to be identical". Classical isosteres include:
Monovalents: F, OH, NH2, CH3; Cl, SH, PH2; Br; I
Bivalents: -O-; -S-; -Se-; -Te-
Trivalents: -N=; -P=; -As=; -Sb=; -CH=
Tetravalents: =C=; =Si=; =N+=; =P+=; =As+=; =Sb+=
Annular: -CH=CH-; -S-; -O-; -NH-
Nonclassical isosteric pairs include -CO- and -SO2-, -COOH and -SO3H, -SO2NH2 and -PO(OH)NH2, -H and -F, -OC(=O)- and C(=O)O-, and -OH and -NH2.
Miscellaneous
For the purpose of any definition of relevant compounds, the recitation of a class of compounds shall be deemed a recitation also of that class less as applicable the compounds of the Chinese Patent Abstract, i.e., Ser-His, diisopropyl phosphoryl serine, and diisopropyl phosphoryl threonine.
Examples
The amino acids histidine (His) and serine (Ser), or amino acids similar to Ser, function together as key catalytic amino acids in the active sites of such diverse enzymes as the serine- and thiol-proteases, lipases, and esterases. Ser and His are also conserved in the intein-extein junctions of the phylogenetically widespread self-splicing proteins and at the N- and C-termini of the homing endonucleases spliced from them. Here we show that the dipeptide seryl-histidine (Ser-His) and related oligopeptides can themselves cleave DNA, protein, and the ester p-nitrophenyl acetate (p-NPA) over wide ranges of pH and temperature. Denaturing polyacrylamide gel electrophoresis (PAGE) of 5'-end labeled DNA samples incubated with Ser-His reveals a pattern of two bands per nucleotide position, consistent with the generation of both 3'-hydroxyl and 3'-phosphate DNA cleavage fragments, as would be expected of phosphodiester hydrolysis by Ser-His. To the best of our knowledge, Ser-His is the shortest peptide ever reported to show cleavage activity with multiple categories of natural substrates. The amenability of the dipeptide to variation through addition of amino acid residues, either internally or to the C-terminus, while retaining its multiple cleavage activities, combined with its reactivity over wide ranges of pH and temperature, demonstrates the evolutionary capacity of the Ser/His dyad and evokes many questions about possible roles it may have played in molecular evolution and its potential role as a core for selection of oligopeptides with enhanced cleavage activities and target specificity.
Example 1
EXPERIMENTAL PROTOCOLS
Oligopeptides and cleavage substrates. Ser-His was purchased as an acetate salt from Sigma and Bachem (HPLC purified), or as the dipeptide from Research Genetics. Other oligopeptides were purchased from either Sigma or Research Genetics. The powdered oligopeptides were dissolved in double deionized and sterilized (dds) H2O, and were then either filter- or autoclave-sterilized. Plasmid DNA pBR322 and λ-DNA were purchased from Life Science Technology. A 60-mer single-stranded oligonucleotide, 5'-CGGATTACCA GGGATTTCAG TCGATGTACA CGTTCGTCAC ATCTCATCTA CCTCCCGGTT-3', was purchased from Integrated DNA Technologies. The 5' end of the oligonucleotide was labeled with [γ-32P]ATP (Amersham) by T4 polynucleotide kinase. Bovine serum albumin (BSA) and lysozyme were purchased from Sigma and were dissolved in ddsH2O. The carboxyl ester p-NPA was purchased from Sigma and was dissolved in isopropanol.
Cleavage of Proteins. Ser-His-related oligopeptides were individually mixed with a cleavage substrate (bovine serum albumin, 5 µg) in Britton-Robinson (B-R) buffer (equal amounts of phosphate, borate, and acetate), to buffer reactions in the pH range 5-9, and ddsH2O to a final volume of 20 µL in PCR reaction tubes, sealed, and incubated in a GeneAmp PCR System (Perkin-Elmer 9600) at designated temperatures (e.g., 50°C) for pre-determined periods of time (6 to 48 hours). After incubation, a portion of the incubated solution was removed and analyzed by 1% agarose gel electrophoresis (for pBR322 and λ-DNA), by 10% and 15% denaturing PAGE (for the single-stranded 5'-end-labeled oligonucleotide), or by 10% reducing PAGE (for proteins). The cleavage of BSA by chymotrypsin (100 nM, pH 7.8, 37°C, 40 min.) was compared.
The cleavage reaction of p-NPA (2 mM) with Ser-His was carried out in triplicate in a 96-well microtiter plate at a designated temperature (e.g., room temperature) in B-R buffer (40 mM, pH 6) in a volume of 100 µL, and was monitored, recorded, and analyzed using a SPECTRAmax 250 microtiter plate reader system at a wavelength of 400 nm. Cleavage by Ser-His was compared with that by chymotrypsin (pH 7.8).
Cleavage of λ-DNA
λ-DNA was incubated with or without dipeptide in B-R buffers of varying pH at either 37°C or 50°C for 24, 48, and 72 hr. All samples were subjected to electrophoresis in a 1% agarose gel.
Cleavage of plasmid DNA (pBR322).
The same buffers were used as above. Incubation was at 37 °C only.
Cleavage of 5' end-labeled ss DNA. An oligonucleotide (60 mer), radiolabeled at its 5' end, was incubated with or without Ser-His in B-R buffer pH 6.5 at 50°C for 48 hr. Following incubation, the samples were subjected to 10% or 15% PAGE. The latter shows higher resolution of the bands.
Ligation reaction of λ-DNA cleaved by Ser-His. λ-DNA (20 ng/µL) was incubated with Ser-His (10 mM) in Britton-Robinson buffer (40 mM, pH 6.5) in a reaction volume of 20 µL for either 9 hours or 10 hours at 50°C to generate λ-DNA cleavage fragments, which were then incubated for 24 hours at 12°C with T4 DNA ligase in a ligation buffer containing ATP. The ligation reaction samples were subsequently subjected to electrophoresis on a 1% agarose gel alongside negative control samples for the cleavage and ligation reactions, which were incubated without Ser-His and T4 DNA ligase, respectively.
RESULTS
DNA Cleavage. Linear bacteriophage λ-DNA at a final concentration of 20 ng/µL (60 µM in phosphodiester bonds), when incubated with 10 mM Ser-His in 40 mM Britton-Robinson buffer, was gradually degraded into smears of progressively smaller fragments of heterogeneous sizes, as revealed by agarose gel electrophoresis after 72 hours of incubation. In contrast, DNA samples incubated under the same conditions without Ser-His remained intact. The nucleolytic activity in samples incubated with Ser-His could be detected over wide ranges of pH (from 5 to 9), with a pH value near the pKa of imidazole (pH 6) being optimal for cleavage at 37°C. It is interesting to note that the pKa for the imidazole group of His is about 6, suggesting the importance of the imidazole being positively charged in the reaction. The optimal pH became even more acidic when the incubation temperature was increased to 50°C. The rate of cleavage was also temperature-dependent; incubation at 50°C resulted in faster DNA cleavage than at 37°C (Fig. 1A), and even higher rates of cleavage were observed at 60°C (data not shown).
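The phosphodiester concentration quoted for the DNA substrate can be checked by a unit conversion. The sketch below reads the DNA concentration as 20 ng/µL (400 ng in a 20 µL reaction, per the protocol in section 6) and assumes an average nucleotide mass of about 330 Da with roughly one phosphodiester bond per nucleotide; the function name is hypothetical.

```python
def phosphodiester_uM(dna_ng_per_ul: float, nt_mass_da: float = 330.0) -> float:
    """Approximate phosphodiester-bond concentration in micromolar.

    ng/µL is numerically equal to mg/L; dividing by g/mol gives mmol/L,
    and multiplying by 1000 converts that to µmol/L.
    """
    return dna_ng_per_ul / nt_mass_da * 1000.0


bonds_uM = phosphodiester_uM(20.0)
ser_his_uM = 10.0 * 1000.0            # 10 mM Ser-His expressed in µM
molar_excess = ser_his_uM / bonds_uM  # dipeptide molecules per phosphodiester bond

print(round(bonds_uM, 1), round(molar_excess))
```

On these assumptions the bond concentration comes out near 60 µM, so the dipeptide is present in a molar excess of well over one hundred-fold relative to the bonds it cleaves.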
The DNA cleavage rate was also found to depend on the Ser-His concentration. At a given constant DNA substrate concentration, the higher the Ser-His concentration, the faster the cleavage rate, until the concentration reached 20 mM. Ser-His concentrations higher than 20 mM resulted in a reduced cleavage rate (data not shown).
To demonstrate that the nucleolytic activity was not restricted to linear DNA, a circular plasmid DNA, pBR322, was also used as a substrate for cleavage. In the presence of Ser-His, the DNA band corresponding to the supercoiled form disappeared first, with a concomitant increase of the relaxed form. As the incubation continued, the relaxed form decreased and a new linear form appeared. At the final stages of incubation, the linear form was degraded into a heterogeneous smear similar to that seen with λ-DNA.
These DNA cleavage results illustrate that the nucleolytic activity has no sequence specificity, which was further demonstrated in a cleavage experiment using a radiolabeled oligonucleotide substrate. A single-stranded (ss) oligonucleotide of 60 bases was labeled with 32P at its 5' end and incubated with Ser-His. Following denaturing PAGE (10%), a ladder of cleaved oligonucleotides with relatively uniform intensity and spacing of bands was revealed, indicating that cleavage occurs at all 4 nucleotide positions of the ssDNA with no pronounced base preference. When run for a longer time on a 15% denaturing PAGE gel, two bands were visible at each oligonucleotide position. This cleavage pattern is consistent with DNA hydrolysis by Ser-His, wherein the P-O bond on either side of a phosphodiester would be expected to be susceptible to hydrolysis, resulting in both 3'-hydroxyl (lower band) and 3'-phosphate (upper band) cleavage products at each nucleotide position on the gel. The distance between the upper and lower bands at each nucleotide position is approximately 1/4 of the distance between corresponding bands of consecutive nucleotide positions. This difference in distance between lower and upper bands coincides with the difference in average molecular mass between a nucleotide with a 3'-hydroxyl (~330 Daltons) and a nucleotide with a 3'-phosphate (~330 + 80 Daltons). This cleavage pattern is not consistent with the cleavage mechanisms of natural nucleases, which generate single bands at each oligonucleotide position.
To further eliminate the possibility of DNA cleavage resulting from contaminating nucleases or metal ions, Ser-His samples were either filter-sterilized or autoclaved, and incubated with DNA in the presence or absence of EDTA, followed by agarose gel analysis. The results of this study indicate that autoclaved Ser-His is as active as filter-sterilized Ser-His in DNA cleavage (Table 1), whether in the presence or absence of EDTA, suggesting that the observed DNA cleavage activity is not due to polypeptide nuclease contamination.
Furthermore, negative controls identical to the reaction solutions, except that they did not contain Ser-His, showed no DNA cleavage activity at any of the tested pH and temperature conditions, even after 72 hours of incubation, demonstrating that the observed cleavage activity is associated exclusively with addition of the Ser-His solution. Ser-His samples purchased from three different sources, including an HPLC-purified Ser-His, all exhibit DNA cleavage activity (Table 1). Moreover, DNA samples incubated under the same conditions with varying concentrations of added Cu2+ or Fe2+, with or without EDTA and without Ser-His, displayed no cleavage activity (Table 1). These findings, combined with the distinctive cleavage pattern, provide strong evidence of the DNA cleavage activity of Ser-His.
Ligation of DNA Cleavage Fragments. To investigate the DNA cleavage mechanism employed by Ser-His, cleavage fragments generated from incubation of λ-DNA with Ser-His were then incubated with T4 DNA ligase. The upward shift in fragment sizes on a 1% agarose gel observed for the samples incubated with DNA ligase is evidence that fragments with 3'-hydroxyls and 5'-phosphates were among the DNA cleavage products generated by Ser-His, since DNA ligase requires free 3'-hydroxyls and 5'-phosphates for ligation. The generation of these terminal groups is consistent with hydrolysis of phosphodiesters by Ser-His, and would not be characteristic of cleavage of the DNA by a free-radical mechanism.
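The doublet spacing reported for the 15% denaturing gel in the oligonucleotide experiment above rests on a simple mass ratio. The following sketch reproduces only that arithmetic, using the approximate masses quoted in the description (about 330 Da per nucleotide and about 80 Da for a terminal phosphate). It is not a model of electrophoretic mobility, which also depends on charge, and the names are hypothetical.

```python
NUCLEOTIDE_MASS_DA = 330.0  # approximate average nucleotide mass quoted in the text
PHOSPHATE_MASS_DA = 80.0    # approximate added mass of a 3'-phosphate quoted in the text


def doublet_offset_fraction(
    nucleotide_mass: float = NUCLEOTIDE_MASS_DA,
    phosphate_mass: float = PHOSPHATE_MASS_DA,
) -> float:
    """Mass added by a 3'-phosphate, as a fraction of one nucleotide step."""
    return phosphate_mass / nucleotide_mass


if __name__ == "__main__":
    print(doublet_offset_fraction())
```

The ratio is about 0.24, which agrees with the observation that the two bands at each position are separated by approximately 1/4 of the spacing between consecutive nucleotide positions.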
Protein Cleavage. Since the serine proteases conserve Ser and His in their active sites, we speculated that Ser-His might also be able to cleave protein. BSA was used as a substrate to test this hypothesis. As revealed by PAGE, BSA is indeed cleaved by Ser-His into a smear of progressively smaller fragments, with optimal reaction conditions similar to those for DNA cleavage. The proteolytic activity of Ser-His was likewise apparent using lysozyme as a second protein substrate (data not shown). Various related oligopeptides were tested for proteolytic activity, and the results show a consistent pattern in modifications to Ser-His and corresponding changes in cleavage activity.
Roles of Functional Groups in DNA and Protein Cleavage. To study the roles of Ser and His in DNA and protein cleavage, different amino acid residues were used in substitution of Ser or His, or were added internally or to the N- or C-terminus of Ser-His. These oligopeptides were incubated individually with either DNA or BSA to test their respective cleavage activities. Solutions of the amino acids Ser and/or His do not exhibit DNA or protein cleavage activities (Table 2), indicating that Ser and His must be covalently linked to exhibit cleavage activities. His-Ser, a dipeptide identical to Ser-His in chemical composition but in reverse sequence, is also inactive (Table 2). The cleavage activities are also lost when Ser is replaced with any other amino acid except cysteine (Cys) or threonine (Thr), the latter of which retains only minimal activity. Like the hydroxyl of Ser, the sulfhydryl side chain of Cys can serve as a nucleophile, as is the case in the active sites of natural thiol-proteases (ref. 2). The His residue cannot be replaced by any of the amino acids tested, including those with positively charged side chains (Table 2). The cleavage activities of Ser-His are reduced or lost when an amino acid is added to its N-terminus but are retained when one or more amino acids are added to the C-terminus. It is interesting to note that the cleavage activities of Ser-His-Asp, which contains the amino acids of the catalytic triad, are at least as efficient as those of Ser-His under conditions optimized for Ser-His (Table 2). It was also found that amino acids could be added between Ser and His without abolishing the cleavage activities (Table 2). The effects of these modifications of Ser-His on DNA and protein cleavage were parallel (Table 2), showing that the same agent was cleaving both substrates and that its activity was related to the modifications of Ser-His in a highly predictable fashion.
This provides strong evidence that the dipeptide and related oligopeptides are the cleaving agents, and implicates the hydroxyl (or sulfhydryl) functional groups of the N-terminal amino acid residues and the imidazole functional group of histidine as the requisite groups for cleavage.
Ester Cleavage. Like the serine protease chymotrypsin, which cleaves proteins and carboxyl esters, Ser-His was also found to cleave the ester p-NPA. When incubated with Ser-His at room temperature, p-NPA showed a rapid linear increase in optical density (OD) at 400 nm over time, which is indicative of cleavage of the p-NPA to p-nitrophenol. This change in OD was found to be dependent on the concentration of Ser-His, as well as on pH and temperature (data not shown).
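A linear increase in OD at 400 nm, as described for the p-NPA assay, is usually summarized by the slope of OD against time. The sketch below shows one way such a slope could be estimated by ordinary least squares. The function name and the readings in the example are hypothetical placeholders chosen for illustration; they are not data from this study.

```python
from typing import Sequence


def od_slope_per_min(times_min: Sequence[float], od400: Sequence[float]) -> float:
    """Least-squares slope of OD400 versus time (OD units per minute)."""
    n = len(times_min)
    if n != len(od400) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_od = sum(od400) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_od) for t, y in zip(times_min, od400))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical readings, for illustration only (not measured values).
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    readings = [0.05, 0.07, 0.09, 0.11, 0.13]
    print(od_slope_per_min(times, readings))
```

Comparing slopes obtained at different Ser-His concentrations, pH values, or temperatures would give a simple numerical form of the dependence described above.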
DISCUSSION
Although the finding that a dipeptide can cleave protein, ester, and DNA is surprising, the results of extensive experimentation controvert the possibility that these activities arise from contaminants, such as nucleases or transition metals, known to cleave DNA. Cleavage of the substrates is only observed after addition of Ser-His and does not occur with the Britton-Robinson buffer alone (Table 1). Autoclaved and/or filtered Ser-His samples retain cleavage activities and still cleave DNA in the presence or absence of 1 mM EDTA, even at 65 °C (Table 1), effectively eliminating the possibility of protein nuclease contamination. Transition metals, particularly Fe2+ and Cu2+, are known to cleave DNA in the presence of EDTA and reducing agents, but failed to cleave DNA under reaction conditions optimized for Ser-His (Table 1). Furthermore, the presence of two 3' cleavage products at each nucleotide position generated from DNA cleavage by Ser-His, and the successful ligation of the DNA cleavage fragments, are strong evidence against metal-assisted cleavage of the DNA (ref. 16) and are indicative of the 3'-hydroxyls and 5'-phosphates that support a hydrolysis mechanism. Combined with the observation of a consistent and predictable pattern of parallel cleavage of DNA and protein by the various oligopeptides tested (Table 2), these facts provide overwhelming evidence that Ser-His and the related oligopeptides are themselves the cleaving agents.
There are many examples in nature of polypeptide enzymes that use an amino acid residue with a hydroxyl group (Ser) or a thiol group (Cys) and a His residue in their active sites to perform peptide or ester bond cleavage. The serine- and thiol-proteases, for example, form two of the four known families of modern proteases and can cleave both peptides and esters. The use of Ser (or Cys) and His as a catalytic dyad in these protease active sites is a recurring theme apparent from the evolution of these enzymes. For example, subtilisin is a bacterial serine protease that has very low amino acid sequence homology to chymotrypsin; yet through convergent evolution, it also utilizes the Ser/His combination in its active site (ref. 17). Ser and His are likewise conserved in the active sites of lipases and esterases. A Ser/His dyad was also discovered in the active site of a catalytic antibody that catalyzes the hydrolysis of norleucine and methionine phenyl esters, indicating that antibodies can converge on the active site structures that have been selected by natural enzyme evolution. Protein self-splicing provides another example of the Ser/His catalytic dyad. This peptide bond cleaving process, recently found in organisms of all three kingdoms, invariably uses Ser or Cys at the N-terminus and His (plus an asparagine (Asn)) at the C-terminus of an internal protein sequence (intein) to enable cleavage at the splice junctions and the rejoining of the external protein sequences (exteins). More interestingly, the spliced intein always has Ser or Cys at its N-terminus and His-Asn at its C-terminus, and functions as a homing endonuclease to cleave chromosomal DNA. It seems reasonable, given our findings, to speculate that these Ser and His residues may function not only in the intein splicing reaction, but in the subsequent DNA cleavage as well.
The common feature of these various enzyme active sites is embodied in the dipeptide Ser-His, which can itself cleave DNA, proteins, and at least one ester. Computer modeling has predicted a low energy conformation of Ser-His that closely matches the relative orientations of the Ser and His residues in the chymotrypsin active site (data not shown). Though the mechanisms by which Ser-His cleaves the various substrates have not been fully elucidated, the dipeptide is suspected to function similarly to the chymotrypsin active site by employing hydrolysis to cleave protein, ester, and even phosphodiester substrates. The requisite N-terminal position of the Ser may be an indication that Ser uses its own α-amino group as a general base for improving the nucleophilicity of the hydroxyl group, as appears to be the case in the hydrolysis of amide bonds by penicillin acylase. The requirement of the His and the optimal cleavage activities near its pKa suggest a possible role for the imidazole group as a general acid in protonating the leaving groups in the cleavage reactions. Because Ser-His and the related oligopeptides are not constrained in fixed conformations, the possible interactions of the functional groups are varied and highly complex. In addition to the λ-DNA cleavage fragment ligation experiment discussed previously, the DNA cleavage mechanism is currently being investigated by labeling the DNA cleavage fragments to identify the different termini produced by cleavage. Combined with a study of the reaction kinetics currently underway, these experiments will resolve whether or not Ser-His functions enzymatically in these reactions and will help clarify the exact roles of the functional groups involved and the mechanisms employed. Answers to these fundamental questions may allow biochemical mimicry, through in vitro selection systems, of early bio-molecular evolution of peptide enzymes.
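The link drawn above between the pH optimum and the imidazole pKa can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of a group that is protonated at a given pH. The sketch below assumes a pKa of 6.0 for the imidazole group, as stated in the Results, and treats it as an isolated titratable group. That is a simplification, since the effective pKa in the dipeptide may differ, and the function name is hypothetical.

```python
IMIDAZOLE_PKA = 6.0  # approximate pKa of the His imidazole group, as stated in the text


def fraction_protonated(ph: float, pka: float = IMIDAZOLE_PKA) -> float:
    """Fraction of the group in its protonated (acid) form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Scan the pH range over which cleavage activity was reported (pH 5 to 9).
    for ph in (5.0, 6.0, 7.0, 8.0, 9.0):
        print(ph, fraction_protonated(ph))
```

At a pH equal to the pKa, half of the imidazole groups are protonated and half are not, so both forms are available. This is compatible with, but does not by itself demonstrate, the general acid role proposed above.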
CONCLUSION
To the best of our knowledge, the dipeptide Ser-His is the shortest peptide ever reported to have multiple cleavage activities. Results of preliminary experiments indicate that in addition to DNA, protein, and ester cleavage, Ser-His is also capable of cleaving RNA (data not shown). Because of its ability to interact with multiple classes of biological molecules over such wide ranges of physical and chemical conditions, Ser-His and related oligopeptides may have played important roles, either independently or as cofactors to RNA, in the hypothetical "RNA world" from which the modern "protein world" emerged. The ability of Ser-His to retain its multiple cleavage activities when amino acids are added internally or to its C-terminus demonstrates the extraordinary evolutionary capacity of the dipeptide Ser-His.
Example 2: Use of Ser-His in Nick Translation
For the purpose of this specification, the "standard nick translation kit" is the Nick Translation Kit, Cat. No. 976,776 (Roche Diagnostics GmbH, Roche Molecular Biochemicals, Sandhofer Strasse 116, D-68305 Mannheim, Germany), used in accordance with the directions associated with version 3, October 1999.
To determine the relative specific activities of the lambda DNA probes produced by Ser-His and by the standard nick translation kit, the following procedure was followed.
All samples were labeled with 2 units of DNAP-I in Nick Translation buffer.
Nicking reaction: λDNA (20 ng/μL) was incubated at 50 °C for 4 hr. The Ser-His sample was nicked with 10 mM Ser-His in B-R buffer (pH 6.5). The DNase I sample was nicked with DNase I (standard Nick Translation). The NC sample (uncut λDNA) was incubated without DNase I or Ser-His. A G-50 Sephadex column was used to purify the nicked DNA.
It was evident from the resulting gel that the probes produced by Ser-His were larger and higher in specific activity than those produced by the standard nick translation kit.
The hybridization strength of the probes was also considered, by the following procedure. Equal amounts of λDNA probe (40 ng) were used in each hybridization reaction.
Probes were labeled with DNA Polymerase I in Nick Translation buffer.
Hybridization temperature = 60 °C, overnight
Washing temperature = 86 °C
Exposure time = 2 hr
The Ser-His derived pBR322 probe generated stronger hybridization.
Similar results were obtained when probes were made from pBR322. In further work, it has been found that:
1. A more than 100% increase in the labeling specificity in nick translation can be consistently achieved using Ser-His as the DNA nicking agent, compared to conventional DNase I under the optimal reaction conditions specified by a commercial kit (Roche).
2. The nicking time by Ser-His can be adjusted (shortened) by variation (increase) of the nicking reaction temperature. Reaction temperatures of 50, 60 and 70 °C were examined.
Computer modeling of Ser-His indicates that the minimal energy conformation of Ser-His is very similar to that of the Ser and His residues in the active site of chymotrypsin, suggesting that Ser-His may cleave DNA using a mechanism similar to that of chymotrypsin, which is both a protease and an esterase.
Table 1
Chemical Reagents Affecting DNA Cleavage of Ser-His

Chemicals Compatible with                Chemicals that Inhibit
DNA Cleavage of Ser-His                  the DNA Cleavage of Ser-His

Britton-Robinson buffer (boric acid,     Strong Inhibitor
acetic acid, and phosphoric acid)          Tris.HCl (pH = 6.0-8.5)
Phosphate buffer (Na+ containing)        Weak Inhibitors
MgCl2                                      NaCl
EDTA                                       KCl
Table 2
DNA and protein cleavage activities of Ser-His and related oligopeptides

Oligopeptide           DNA cleavage activity*    Protein cleavage activity*
Ser                    -†                        -
His                    -                         -
Ser+His                -                         -
His-Ser                -                         -
Ser-His                +++++                     +++++
Cys-His                +++                       +++
Thr-His                +                         +
Asp-His                -                         -
Ser-Arg                -                         -
Ser-Lys                -                         -
Gly-Ser-His            +                         +
His-Ser-His            -                         -
Ser-His-Gly            +++                       +++
Ser-His-His            +++                       +++
Ser-His-Asp            +++++                     +++++
Ser-Gly-His-His        ++                        ++
Ser-Gly-Gly-His-His    ++                        ++

* The cleavage assays were performed at the conditions optimal for Ser-His cleavage activities: in Britton-Robinson buffer (pH 6.0; 10 mM oligopeptide, 20 ng/μl of BSA), incubated at 50 °C for 48 hr, and the cleavage products were separated and visualized by agarose or acrylamide gel electrophoresis.
† "-" stands for no detectable cleavage activity; each "+" stands for approximately 20% of Ser-His cleavage activity. "±" stands for marginally detectable cleavage activity.
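The statement that the DNA and protein cleavage activities run in parallel across all oligopeptides can be checked mechanically once Table 2 is written down as data. The sketch below is a transcription of the table into a Python dictionary, with each "+" counted as one activity unit (about 20% of Ser-His activity, per the footnote). The entries are copied as they appear in this text, the peptide names with internal residues are reconstructed from the flattened table, and the helper names are hypothetical.

```python
# Transcription of Table 2: oligopeptide -> (DNA cleavage, protein cleavage).
TABLE_2 = {
    "Ser": ("-", "-"),
    "His": ("-", "-"),
    "Ser+His": ("-", "-"),
    "His-Ser": ("-", "-"),
    "Ser-His": ("+++++", "+++++"),
    "Cys-His": ("+++", "+++"),
    "Thr-His": ("+", "+"),
    "Asp-His": ("-", "-"),
    "Ser-Arg": ("-", "-"),
    "Ser-Lys": ("-", "-"),
    "Gly-Ser-His": ("+", "+"),
    "His-Ser-His": ("-", "-"),
    "Ser-His-Gly": ("+++", "+++"),
    "Ser-His-His": ("+++", "+++"),
    "Ser-His-Asp": ("+++++", "+++++"),
    "Ser-Gly-His-His": ("++", "++"),
    "Ser-Gly-Gly-His-His": ("++", "++"),
}


def score(symbols: str) -> int:
    """Number of '+' marks; '-' scores zero."""
    return symbols.count("+")


def activities_are_parallel(table: dict) -> bool:
    """True if every peptide has the same score against DNA and protein."""
    return all(score(dna) == score(protein) for dna, protein in table.values())


def active_peptides(table: dict) -> list:
    """Names of peptides with any detectable DNA cleavage, sorted alphabetically."""
    return sorted(name for name, (dna, _) in table.items() if score(dna) > 0)


if __name__ == "__main__":
    print(activities_are_parallel(TABLE_2))
    print(active_peptides(TABLE_2))
```

Listing the active peptides this way also makes the sequence pattern easy to see: every fully active entry has Ser or Cys at the N-terminus and a His residue downstream.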
Table 3
Determination of origin of DNA cleavage activity

Reaction Conditions*                          DNA cleavage activity
Britton-Robinson buffer (B-R)                 No
Ser-His† (filtered) + B-R                     Yes
Ser-His† (autoclaved) + B-R                   Yes
Ser-His† ± 1 mM EDTA + B-R                    Yes
Ser-His† + B-R at 65 °C                       Yes
FeSO4 (1 μM or 10 μM) + EDTA + B-R            No
CuSO4 (1 μM or 10 μM) + EDTA + B-R            No

* All reactions were performed at 37 °C in 40 mM Britton-Robinson buffer (pH 6.0) with 20 ng/μl λ-DNA in a total volume of 20 μl.
† Ser-His purchased from three different sources was used in this study (see Methods).
REFERENCES
1. Steitz, T. A.; Shulman, R. G. Annu. Rev. Biophys. Bioeng. 1982, 11, 419-444.
2. Kamphuis, I. G.; Drenth, J.; Baker, E. N. J. Mol. Biol. 1985, 182, 317-329.
3. Brady, L.; Brzozowski, A. M.; Derewenda, Z. S.; Dodson, E.; Dodson, G.; Tolley, S.; Turkenburg, J. P.; Christiansen, L.; Huge-Jensen, B.; Norskov, L.; Thim, L.; Menge, U. Nature 1990, 343, 767-770.
4. Contreras, J. A.; Karlsson, M.; Osterlund, T.; Laurell, H.; Svensson, A.; Holm, C. J. Biol. Chem. 1996, 271, 31426-31430.
5. Schrag, J. D.; Li, Y.; Wu, S.; Cygler, M. Nature 1991, 351, 761-764.
6. Sussman, J. L.; Harel, M.; Frolow, F.; Oefner, C.; Goldman, A.; Toker, L.; Silman, I. Science 1991, 253, 872-879.
7. Anthonsen, H. W.; Baptista, A.; Drabløs, F.; Martel, P.; Petersen, S. B.; Sebastiao, M.; Vaz, L. Biotechnol. Annu. Rev. 1995, 1, 315-371.
8. Carter, P.; Wells, J. A. Nature 1988, 332, 564-568.
9. Corey, D. R.; Willett, W. S.; Coombs, G. S.; Craik, C. S. Biochemistry 1995, 34, 11521-11527.
10. Craik, C. S.; Roczniak, S.; Largman, C.; Rutter, W. J. Science 1987, 237, 909-913.
11. Paetzel, M.; Dalbey, R. E. Trends Biochem. Sci. 1997, 22, 28-31.
12. Gerlt, J. A. In Nucleases, 2nd Ed.; Linn, S. M.; Lloyd, R. S.; Roberts, R. J., Eds.; Cold Spring Harbor Laboratory Press, 1993; pp. 1-34.
13. Hertzberg, R. P.; Dervan, P. B. J. Am. Chem. Soc. 1982, 104, 313-315.
14. Pei, D.; Schultz, P. G. In Nucleases, 2nd Ed.; Linn, S. M.; Lloyd, R. S.; Roberts, R. J., Eds.; Cold Spring Harbor Laboratory Press, 1993; pp. 317-340.
15. Sigman, D. S.; Chen, C.-H. B. Annu. Rev. Biochem. 1990, 59, 207-236.
16. Hertzberg, R. P.; Dervan, P. B. Biochemistry 1984, 23, 3934-3945.
17. Stryer, L. In Biochemistry, 3rd Ed.; W. H. Freeman: New York, 1988; p. 227.
18. Zhou, G. W.; Guo, J.; Huang, W.; Fletterick, R. J.; Scanlan, T. S. Science 1994, 265, 1059-1064.
19. Kane, P. M.; Yamashiro, C. T.; Wolczyk, D. F.; Neff, N.; Goebl, M.; Stevens, T. H. Science 1990, 250, 651-657.
20. Lambowitz, A. M.; Belfort, M. Annu. Rev. Biochem. 1993, 62, 587-622.
21. Xu, M.-Q.; Southworth, M. W.; Mersha, F. B.; Hornstra, L. J.; Perler, F. B. Cell 1993, 75, 1371-1377.
22. Perler, F. B. Cell 1998, 92, 1-4.
23. Perler, F. B.; Xu, M. Q.; Paulus, H. Curr. Opin. Chem. Biol. 1997, 3, 292-299.
24. Hanes, J.; Pluckthun, A. Proc. Natl. Acad. Sci. USA 1997, 94, 4937-4942.
25. Roberts, R. W.; Szostak, J. W. Proc. Natl. Acad. Sci. USA 1997, 94, 12297-12302.
26. Scott, J. K.; Smith, G. P. Science 1990, 249, 386-390.
27. Roth, A.; Breaker, R. R. Proc. Natl. Acad. Sci. USA 1998, 95, 6027-6031.
28. Joyce, G. F.; Orgel, L. E. In The RNA World; Gesteland, R. F.; Atkins, J. F., Eds.; Cold Spring Harbor Laboratory Press, 1993; pp. 1-26.
Citation of documents herein is not intended as an admission that any of the documents cited herein is pertinent prior art, or an admission that any of the cited documents is considered material to the patentability of any of the claims of the present application. All statements as to the date or representation as to the contents of these documents are based on the information available to the applicant and do not constitute any admission as to the correctness of the dates or contents of these documents.
The appended claims are to be treated as a non-limiting recitation of preferred embodiments.
In addition to those set forth elsewhere, the following references are hereby incorporated by reference, in their most recent editions as of the time of filing of this application: Kay, Phage Display of Peptides and Proteins: A Laboratory Manual; the John Wiley and Sons Current Protocols series, including Ausubel, Current Protocols in Molecular Biology; Coligan, Current Protocols in Protein Science; Coligan, Current Protocols in Immunology; Current Protocols in Human Genetics; Current Protocols in Cytometry; Current Protocols in Pharmacology; Current Protocols in Neuroscience; Current Protocols in Cell Biology; Current Protocols in Toxicology; Current Protocols in Field Analytical Chemistry; Current Protocols in Nucleic Acid Chemistry; and Current Protocols in Human Genetics; and the following Cold Spring Harbor Laboratory publications: Sambrook, Molecular Cloning: A Laboratory Manual; Harlow, Antibodies: A Laboratory Manual; Manipulating the Mouse Embryo: A Laboratory Manual; Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual; Drosophila Protocols; Imaging Neurons: A Laboratory Manual; Early Development of Xenopus laevis: A Laboratory Manual; Using Antibodies: A Laboratory Manual; At the Bench: A Laboratory Navigator; Cells: A Laboratory Manual; Methods in Yeast Genetics: A Laboratory Course Manual; Discovering Neurons: The Experimental Basis of Neuroscience; Genome Analysis: A Laboratory Manual Series; Laboratory DNA Science; Strategies for Protein Purification and Characterization: A Laboratory Course Manual; Genetic Analysis of Pathogenic Bacteria: A Laboratory Manual; PCR Primer: A Laboratory Manual; Methods in Plant Molecular Biology: A Laboratory Course Manual; Manipulating the Mouse Embryo: A Laboratory Manual; Molecular Probes of the Nervous System; Experiments with Fission Yeast: A Laboratory Course Manual; A Short Course in Bacterial Genetics: A Laboratory Manual and Handbook for Escherichia coli and Related Bacteria; DNA Science: A First Course in Recombinant DNA Technology; Methods in Yeast Genetics: A Laboratory Course Manual; Molecular Biology of Plants: A Laboratory Course Manual.
All references cited herein, including journal articles or abstracts, published, corresponding, prior or related U.S. or foreign patent applications, issued U.S. or foreign patents, or any other references, are entirely incorporated by reference herein, including all data, tables, figures, and text presented in the cited references. Additionally, the entire contents of the references cited within the references cited herein are also entirely incorporated by reference.
Reference to known method steps, conventional method steps, known methods or conventional methods is not in any way an admission that any aspect, description or embodiment of the present invention is disclosed, taught or suggested in the relevant art.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art (including the contents of the references cited herein), readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance presented herein, in combination with the knowledge of one of ordinary skill in the art.
Any description of a class or range as being useful or preferred in the practice of the invention shall be deemed a description of any subclass (e.g., a disclosed class with one or more disclosed members omitted) or subrange contained therein, as well as a separate description of each individual member or value in said class or range.
The description of preferred embodiments individually shall be deemed a description of any possible combination of such preferred embodiments, except for combinations which are impossible (e.g., mutually exclusive choices for an element of the invention) or which are expressly excluded by this specification. If an embodiment of this invention is disclosed in the prior art, the description of the invention shall be deemed to include the invention as herein disclosed with such embodiment excised.

Claims

1. A compound having the structure (Ser/Cys)-(Xaa)m-His-(Xaa)n, where 0 <= (m+n) <= 12, where Xaa is an amino acid or an amino acid derivative, said compound having nuclease or protease activity, with the proviso that said compound is not the dipeptide Ser-His.
2. A compound having the structure EM-L-BM, where EM denotes an enzymatic moiety and is (Ser/Cys)-(Xaa)m-His-(Xaa)n, where 0 <= (m+n) <= 12, where Xaa is an amino acid or an amino acid derivative, said EM having nuclease or protease activity; where L denotes a linker moiety; where BM denotes a binding moiety, said binding moiety specifically binding a predetermined nucleotide or amino acid sequence, said compound having nuclease or protease activity against a nucleic acid or polypeptide, respectively, having said predetermined nucleotide or amino acid sequence.
3. In a method of labeling a nucleic acid by nick translation, in which said nucleic acid is incubated first with a nicking agent, and then with a polymerase and labeled nucleotides, the improvement comprising the nicking agent being an enzymatic agent having the structure (Ser/Cys)-(Xaa)m-His-(Xaa)n, where 0 <= (m+n) <= 12, where Xaa is an amino acid or an amino acid derivative, said agent having nicking activity against said nucleic acid.
4. The method of claim 3 in which the labeled nucleotides are radiolabeled.
5. The method of claim 3 in which the labeled nucleotides are nonradioactively labeled.
6. The method of claim 5 in which the label is digoxigenin.
7. The method of claim 3 in which the nucleic acid is linear.
8. The method of claim 3 in which the nucleic acid is circular.
9. The method of claim 3 in which the incubation with the nicking agent is carried out at a temperature in the range of 20 to 80°C, a pH in the range of 5.5 to 7.5, and the incubation time is in the range of 1-48 hours.
PCT/US2001/043079 2000-11-14 2001-11-14 Dipeptide seryl-histidine and related oligopeptides cleave dna, protein, and a carboxyl ester WO2002040631A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US24796900P 2000-11-14 2000-11-14
US60/247,969 2000-11-14

Publications (2)

Publication Number Publication Date
WO2002040631A2 true WO2002040631A2 (en) 2002-05-23
WO2002040631A3 WO2002040631A3 (en) 2003-06-19

Family

ID=22937100

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/043079 WO2002040631A2 (en) 2000-11-14 2001-11-14 Dipeptide seryl-histidine and related oligopeptides cleave dna, protein, and a carboxyl ester

Country Status (1)

Country Link
WO (1) WO2002040631A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2714928B1 (en) * 2011-05-27 2017-08-02 Life Technologies Corporation Methods for manipulating biomolecules
WO2017137495A1 (en) 2016-02-09 2017-08-17 Fresenius Medical Care Deutschland Gmbh Blood treatment with inactivation of circulating nucleic acids

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11196819A (en) * 1998-01-13 1999-07-27 Mikio Shimizu Use of specific enzyme activity of amino acid, dipeptide and dinucleotide (part 2)

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE CA [Online] CHEMICAL ABSTRACTS SERVICE, COLUMBUS, OHIO, US; KULIK, WILLEM ET AL: "Fast atom bombardment tandem mass spectrometry for amino acid sequence determination in tripeptides" retrieved from STN Database accession no. 113:6782 CA XP002227765 & BIOMEDICAL & ENVIRONMENTAL MASS SPECTROMETRY (1989), 18(10), 910-17 , 1989, *
DATABASE WPI , 1999 Derwent Publications Ltd., London, GB; AN 1999-471871 XP002227766 "New modified amino acid specific enzyme active useful biochemical catalyst" & JP 11 196819 A (M SHIMIZU), 27 July 1999 (1999-07-27) *
Y LI ET AL.: "Peptide seryl-histidine and related oligopeptides cleave DNA, protein and a carboxyl ester " BIOORGANIC & MEDICINAL CHEMISTRY., vol. 8, no. 12, December 2000 (2000-12), pages 2675-2680, XP002227764 ELSEVIER SCIENCE LTD., GB ISSN: 0968-0896 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2714928B1 (en) * 2011-05-27 2017-08-02 Life Technologies Corporation Methods for manipulating biomolecules
EP3260557A1 (en) * 2011-05-27 2017-12-27 Life Technologies Corporation Methods for manipulating biomolecules
US11542535B2 (en) 2011-05-27 2023-01-03 Life Technologies Corporation Methods for manipulating biomolecules
WO2017137495A1 (en) 2016-02-09 2017-08-17 Fresenius Medical Care Deutschland Gmbh Blood treatment with inactivation of circulating nucleic acids
DE102016001407A1 (en) 2016-02-09 2017-08-24 Fresenius Medical Care Deutschland Gmbh Blood treatment with inactivation of free nucleic acids

Also Published As

Publication number Publication date
WO2002040631A3 (en) 2003-06-19

Similar Documents

Publication Publication Date Title
Zuo et al. Functional domains of the human splicing factor ASF/SF2.
Ambler The structure of β-lactamases
Cobianchi et al. Phosphorylation of human hnRNP protein A1 abrogates in vitro strand annealing activity
Porse et al. The antibiotic thiostrepton inhibits a functional transition within protein L11 at the ribosomal GTPase centre
Tsai et al. Molecular modeling of the three-dimensional structure of the bacterial RNase P holoenzyme
Wower et al. Labeling the peptidyltransferase center of the Escherichia coli ribosome with photoreactive tRNA (Phe) derivatives containing azidoadenosine at the 3'end of the acceptor arm: a model of the tRNA-ribosome complex.
Merryman et al. Nucleotides in 16S rRNA protected by the association of 30S and 50S ribosomal subunits
US20200140835A1 (en) Engineered CRISPR-Cas9 Nucleases
US11840685B2 (en) Inhibition of unintended mutations in gene editing
Curran et al. Alteration of the enzymic specificity of human angiogenin by site-directed mutagenesis
Bochtler et al. Similar active sites in lysostaphins and D‐Ala‐D‐Ala metallopeptidases
Blaschke et al. [29] Protein engineering by expressed protein ligation
Steuer et al. Chimeras of the homing endonuclease PI‐SceI and the homologous Candida tropicalis intein: a study to explore the possibility of exchanging DNA‐binding modules to obtain highly specific endonucleases with altered specificity
Lynn et al. Peptide sequencing and site‐directed mutagenesis identify tyrosine‐319 as the active site tyrosine of Escherichia coli DNA topoisomerase I
Jeon et al. Toward protein-cleaving catalytic drugs: Artificial protease selective for myoglobin
Vassilenko et al. Topography of 16 S RNA in 30 S subunits and 70 S ribosomes accessibility to cobra venom ribonuclease
Heitman How the EcoRI endonuclease recognizes and cleaves DNA
Liu et al. Truncation of amino-terminal tail stimulates activity of human endonuclease III (hNTH1)
Auge-Gouillou et al. The ITR binding domain of the Mariner Mos-1 transposase
WO2011005598A1 (en) Compositions and methods for the rapid biosynthesis and in vivo screening of biologically relevant peptides
WO2002040631A2 (en) Dipeptide seryl-histidine and related oligopeptides cleave dna, protein, and a carboxyl ester
Mathonet et al. Active TEM‐1 β‐lactamase mutants with random peptides inserted in three contiguous surface loops
EP3663310A1 (en) Tale rvd specifically recognizing dna base modified by methylation and application thereof
Christ et al. A Model for the PI-SceI·DNA Complex Based on Multiple Base and Phosphate Backbone-specific Photocross-links
Déclais et al. Structural recognition between a four-way DNA junction and a resolving enzyme

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): CA JP US

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP