WO2002038796A9 - Methods for determining protease cleavage site motifs - Google Patents

Methods for determining protease cleavage site motifs

Info

Publication number
WO2002038796A9
Authority
WO
WIPO (PCT)
Prior art keywords
protease
peptides
peptide
amino acid
Application number
PCT/US2001/046777
Other languages
French (fr)
Other versions
WO2002038796A2 (en)
WO2002038796A3 (en)
Inventor
Benjamin E Turk
Lewis C Cantley
Original Assignee
Beth Israel Hospital
Benjamin E Turk
Lewis C Cantley
Application filed by Beth Israel Hospital, Benjamin E Turk, Lewis C Cantley
Priority to AU2002230630A
Publication of WO2002038796A2
Publication of WO2002038796A3
Publication of WO2002038796A9

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6848 Methods of protein analysis involving mass spectrometry
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/81 Protease inhibitors
    • C07K14/8107 Endopeptidase (E.C. 3.4.21-99) inhibitors
    • C07K14/8146 Metalloprotease (E.C. 3.4.24) inhibitors, e.g. tissue inhibitor of metallo proteinase, TIMP
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K5/00 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof
    • C07K5/04 Peptides containing up to four amino acids in a fully defined sequence; Derivatives thereof containing only normal peptide links
    • C07K5/08 Tripeptides
    • C07K5/0802 Tripeptides with the first amino acid being neutral
    • C07K5/0812 Tripeptides with the first amino acid being neutral and aromatic or cycloaliphatic
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K7/00 Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
    • C07K7/04 Linear peptides containing only normal peptide links
    • C07K7/06 Linear peptides containing only normal peptide links having 5 to 11 amino acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/34 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase
    • C12Q1/37 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving hydrolase involving peptidase or proteinase
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6818 Sequencing of polypeptides
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6842 Proteomic analysis of subsets of protein mixtures with reduced complexity, e.g. membrane proteins, phosphoproteins, organelle proteins

Definitions

  • protease inhibitors are in widespread clinical use.
  • MMP: matrix metalloprotease
  • HtrA protease, which leads to decreased survival of the bacterium in mice when inactivated (44).
  • the H1L gene product of variola virus, which causes smallpox, is 98% identical to vaccinia virus protease, a metalloprotease involved in viral polyprotein processing and essential for maturation of the viral particle (45, 46).
  • the equine encephalitis viruses carry homologs of the Sindbis virus nsP2 protease responsible for processing of a nonstructural polyprotein to produce essential components for replication of the viral genome (47, 48).
  • Viral agents which apparently do not carry genes for proteases have been shown in some cases to employ host proteases in their life cycle, as in the processing of envelope glycoproteins of the Ebola and Marburg hemorrhagic fever viruses (49). Such proteases offer an unexplored avenue for the development of drugs which would be useful as therapy in the event of exposure to biological weapons.
  • Protease inhibition is particularly promising as a strategy for the treatment of anthrax.
  • Inhalation of Bacillus anthracis spores gives rise to systemic anthrax, a condition nearly always fatal in humans (50). Spores germinate within macrophages and emerge as rapidly dividing vegetative bacteria. Within several days the bacteria spread to the bloodstream, where they multiply to high levels and produce a toxin which results in death of the host.
  • Anthrax toxin comprises three protein components, protective antigen (PA), edema factor (EF) and lethal factor (LF), which are active in binary combinations (51).
  • PA: protective antigen
  • EF: edema factor
  • LF: lethal factor
  • the combination of PA and EF, also called edema toxin, impairs neutrophil function and gives rise to the edema associated with cutaneous anthrax.
  • LeTx: lethal toxin
  • Intravenous injection of LeTx is alone sufficient to cause death in experimental animals, and strains lacking LF or PA are greatly attenuated (52, 53).
  • the crucial cellular target for LeTx appears to be the macrophage (54).
  • Treatment of macrophages or macrophage cell lines with LeTx results in high levels of inflammatory cytokine production, activation of the oxidative burst, and eventual cell lysis (54-56). These effects are likely to contribute to death from infection by crippling host defense against the pathogen and by causing a shock-like syndrome.
  • LeTx functions as a classical two-component bacterial toxin, with PA acting to translocate the enzymatically active component, LF, into the cytosol (Fig. 1) (51).
  • PA binds to the surface of target cells by interaction with an unidentified receptor. Subsequent cleavage by furin or a furin-like proprotein convertase enzyme removes a 20 kDa fragment to generate the N-terminally truncated PA63 (57).
  • PA63 assembles into a heptameric ring structure which binds to LF (58, 59).
  • LF is a zinc-dependent metalloprotease belonging to the same superfamily (clan MA) as the prototypical bacterial protease thermolysin (60-62).
  • the cleavage of proteins in the host cell cytosol appears to be essential for its biological activity (63).
  • MEK or MAP kinase kinase family protein kinases
  • protease cleavage site motifs that would permit the design of additional protease inhibitors.
  • inhibitors of proteases of human pathogens including the B. anthracis anthrax lethal factor protease.
  • the invention provides novel methodology for the rapid determination of protease cleavage site motifs using a mixture-based oriented peptide library approach.
  • the cleavage site motif for a protease involves residues both amino- and carboxy-terminal to the scissile bond (the unprimed and primed sides, respectively, where the cleavage site for a protease is defined as ...P3-P2-P1-P1'-P2'-P3'..., and cleavage occurs between the P1 and P1' residues).
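The position nomenclature above can be made concrete with a short sketch. It is illustrative only: the function name `label_positions`, the `scissile_index` parameter, and the example peptide are assumptions introduced here, not part of the patent text.

```python
def label_positions(peptide: str, scissile_index: int) -> list[tuple[str, str]]:
    """Label each residue of a peptide relative to the scissile bond.

    scissile_index is the number of residues on the amino-terminal
    (unprimed) side, so cleavage falls between peptide[scissile_index - 1]
    (P1) and peptide[scissile_index] (P1').
    """
    if not 0 < scissile_index < len(peptide):
        raise ValueError("the scissile bond must lie inside the peptide")
    labels = []
    for i, residue in enumerate(peptide):
        if i < scissile_index:
            # Unprimed side: numbering increases away from the bond toward the N-terminus.
            labels.append((f"P{scissile_index - i}", residue))
        else:
            # Primed side: numbering increases away from the bond toward the C-terminus.
            labels.append((f"P{i - scissile_index + 1}'", residue))
    return labels


# Hypothetical hexapeptide cleaved after its third residue.
print(label_positions("AVYPME", 3))
```

For the hypothetical peptide AVYPME cleaved after the third residue, Y is labeled P1 and P is labeled P1'.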
  • the methods involve the initial determination of the primed side motif and the successive determination of the unprimed side motif.
  • the primed side motif is preferably determined by partial digestion of a completely random mixture of peptides (preferably dodecamers) blocked (e.g., acetylated) at the amino terminus.
  • the digested mixture is subjected to amino-terminal sequencing by Edman degradation. Unreacted intact peptides and the amino-terminal fragments of reacted peptides remain blocked and do not contribute to the sequenced pool; only the carboxy-terminal fragments are sequenced.
  • the relative amounts of each amino acid present in a given cycle indicate the preference for that residue at a particular site, so that the first sequencing cycle affords information about the P1' position, the second cycle about the P2' position, and so on.
  • a second peptide library is synthesized which fixes one or more of the primed positions by incorporating optimal amino acid residues determined in the initial screen.
  • the fixed positions are preceded by several degenerate residues which correspond to the unprimed positions.
  • the library preferably is prepared with the amino terminus free and with a carboxy-terminal tag (e.g., biotin) to permit removal of the uncleaved peptides and the carboxy-terminal portions of the peptides in the library after protease cleavage.
  • the library is partially digested with the protease, the reaction mixture is quenched, and undigested peptides and carboxy-terminal fragments which retain the carboxy-terminal tag are removed (e.g., biotin-tagged fragments are removed with immobilized avidin).
  • the remaining amino-terminal fragments are subjected to amino-terminal sequencing, and the selectivities are determined from the relative abundance of each amino acid in a given sequencing cycle (preference values for particular amino acids) as before.
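As a rough sketch of how sequencing cycles map to subsite positions in the two screens described above: for the blocked random library only carboxy-terminal fragments are sequenced, so cycle k reports on position Pk'. For the oriented library the amino-terminal fragments are sequenced from their free amino terminus, so the mapping depends on the fragment length. The helper names below are hypothetical, and the unprimed mapping assumes that cleavage occurs at the intended bond immediately before the first fixed residue.

```python
def primed_position(cycle: int) -> str:
    """Cycle k of a sequenced carboxy-terminal fragment reports on Pk'."""
    if cycle < 1:
        raise ValueError("sequencing cycles are numbered from 1")
    return f"P{cycle}'"


def unprimed_position(cycle: int, fragment_length: int) -> str:
    """Cycle k of an amino-terminal fragment of known length reports on
    P(fragment_length - k + 1), assuming cleavage at the intended bond."""
    if not 1 <= cycle <= fragment_length:
        raise ValueError("cycle must fall within the fragment")
    return f"P{fragment_length - cycle + 1}"


# Assumed example: an oriented library of the form MAXXXXX|LRGAARE, taken to be
# cleaved before the fixed Leu, yields a seven-residue amino-terminal fragment.
for cycle in range(1, 8):
    print(cycle, unprimed_position(cycle, 7))
```

Under that assumption, cycles 3 through 7 of the amino-terminal fragment correspond to the degenerate positions P5 through P1.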
  • methods for determining an amino acid sequence motif for a cleavage site of a protease are provided.
  • the methods include: a) contacting the protease with a peptide library containing one or more degenerate residues under conditions which allow for cleavage of a substrate by the protease; b) allowing the protease to cleave peptides within the degenerate peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; c) determining the amino acid sequences of the population of cleaved carboxy-terminal peptides; and d) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
  • the methods also include isolating the population of cleaved carboxy-terminal peptides from the non-cleaved peptides and cleaved amino-terminal peptides.
  • the degenerate peptide library is a soluble synthetic peptide library and/or the peptide library contains all degenerate amino acid residues.
  • the peptide library will omit cysteine residues to avoid the formation of disulfide bonds.
  • the peptides of the degenerate peptide library are blocked at their N-termini to prevent Edman degradation.
  • the peptides of the degenerate peptide library are labeled at their N-termini and/or C-termini with a binding molecule, preferably biotin.
  • the N-termini are labeled with a first binding molecule and the C-termini are labeled with a second binding molecule.
  • the cleaved carboxy-terminal peptides are isolated from the non-cleaved peptides and cleaved amino-terminal peptides by contacting the population of cleaved peptides with a substrate that binds the first binding molecule.
  • the methods for determining the protease cleavage site motif include determining the N-terminal (unprimed) residues of the cleavage site. The knowledge of the C-terminal (primed) residues of the cleavage site is used to orient a second library with respect to the cleavage site.
  • Such methods include: a) obtaining a second peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues carboxy-terminal to a scissile peptide bond, and one or more degenerate residues amino-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined for the C-terminal (primed) residues; b) contacting the protease with the second peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the second peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved amino-terminal
  • the second peptide library is a soluble synthetic peptide library.
  • the amino termini of the peptides in the second peptide library are unblocked. Libraries with blocked termini can be used if more convenient, in which case the step of determining the amino acid sequences comprises unblocking the amino termini prior to sequencing the peptides.
  • the step of separating cleaved amino-terminal peptides and cleaved carboxy-terminal peptides comprises affinity isolation of the uncleaved peptides and the cleaved carboxy-terminal peptides from the cleaved amino-terminal peptides, preferably by biotin-avidin binding.
  • the degenerate (first) peptide library comprises peptides comprising the formula (Xaa)n (SEQ ID NO: 104). In these libraries, Xaa is any amino acid and n is preferably an integer from 3-20 inclusive.
  • the protease cleaves a peptide before or after a known amino acid Zaa and the degenerate peptide library comprises peptides comprising the formula (Xaa)n-Zaa-(Xaa)m (SEQ ID NO: 105).
  • Zaa is a non-degenerate amino acid (P1 or P1') that forms part of the scissile bond
  • Xaa is any amino acid
  • n and m preferably are integers from 1-10 inclusive.
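A minimal sketch of the (Xaa)n-Zaa-(Xaa)m library formula, written as a one-letter template in which `X` marks a degenerate position. The helper names are hypothetical, and the 19-residue alphabet assumes a library that omits cysteine, as the text describes for some embodiments.

```python
# The 20 standard amino acids minus cysteine (19 residues).
AMINO_ACIDS = "ADEFGHIKLMNPQRSTVWY"


def library_template(n: int, fixed: str, m: int) -> str:
    """Build a template such as 'XXXLXX' for (Xaa)n-Zaa-(Xaa)m."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    return "X" * n + fixed + "X" * m


def library_diversity(template: str, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct sequences if every X independently takes any
    residue in the alphabet."""
    return len(alphabet) ** template.count("X")


template = library_template(3, "L", 2)   # hypothetical fixed residue Leu
print(template, library_diversity(template))
```

With five degenerate positions and 19 possible residues, the theoretical diversity is 19^5 = 2,476,099 sequences.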
  • the degenerate peptide library comprises peptides comprising the formula (Zaa)n-(Xaa)m (SEQ ID NO: 106).
  • Zaa is a non-degenerate amino acid amino-terminal to a scissile bond
  • Xaa is any amino acid
  • n and m preferably are integers from 1-10 inclusive.
  • the second peptide library comprises peptides comprising the formula (Xaa)n-(Zaa)m (SEQ ID NO: 107).
  • Zaa is an amino acid carboxy-terminal to a scissile bond (primed amino acid)
  • Xaa is an amino acid amino-terminal to the scissile bond (unprimed amino acid)
  • n and m preferably are integers from 1-10 inclusive.
  • each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
  • the methods described herein are used in an iterative fashion to further determine protease cleavage site motifs.
  • the information gained from the use of the first (degenerate) library and the second (oriented) library is used to re-examine the sequence of the C-terminal (primed) residues of the cleavage site.
  • the methods include: a) preparing a third peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues amino-terminal to a scissile peptide bond, and one or more degenerate residues carboxy-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined in claim 10; b) contacting the protease with the third peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the third peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved carboxy-terminal peptides from
  • the third peptide library comprises peptides comprising the formula (Zaa)n-(Xaa)m (SEQ ID NO: 108).
  • Xaa is any amino acid and is carboxy-terminal to a scissile bond (primed amino acid)
  • Zaa is an amino acid that is amino-terminal to the scissile bond (unprimed amino acid)
  • n and m preferably are integers from 1-10 inclusive.
  • each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved amino-terminal peptides.
  • some of the Xaa amino acids can be non-degenerate, in accordance with the information determined from cleavage of the first library.
  • the peptides within the peptide library do not contain cysteine residues.
  • the protease is a matrix metalloproteinase.
  • the protease is a proteolytic enzyme that mediates the pathogenesis of a pathogen; pathogens include biological warfare agents.
  • preferred proteases are selected from the group consisting of lethal factor of B. anthracis, Pla and YopJ proteases of Yersinia, and the smallpox H1L metalloprotease. Most preferably the protease is lethal factor of B. anthracis.
  • the protease is selected from the group consisting of proteases of pathogenic organisms, cathepsin family proteases, tumor necrosis factor-alpha converting enzyme (TACE), calpains, caspases, beta-site amyloid precursor protein-cleaving enzyme (BACE; beta-secretase), presenilins, membrane-type serine proteases, furin and other proprotein convertases, proteasome components, and proteases affecting the blood clotting cascade.
  • Other proteases include cysteine proteases, aspartyl proteases and serine proteases.
  • the amino acid sequence motif for a cleavage site of the protease is determined by calculating a preference value for each amino acid at each degenerate position, wherein the preference value for a particular amino acid is determined by dividing the amount of the particular amino acid by the average amount per amino acid in that cycle to obtain a first value for the particular amino acid, and then dividing each first value by the relative amount of that particular amino acid in the starting mixture, and selecting amino acid residues that have a preference value of greater than 1.0 at a degenerate position for inclusion at a position corresponding to the degenerate position in the amino acid sequence motif.
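The preference-value calculation described above can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the function names and the example amounts are invented, and the sketch assumes that the relative amount of each amino acid in the starting mixture is expressed on a scale where 1.0 means equimolar representation.

```python
def preference_values(cycle_amounts: dict[str, float],
                      starting_relative: dict[str, float]) -> dict[str, float]:
    """Compute preference values for one sequencing cycle.

    cycle_amounts: amount of each amino acid detected in the cycle.
    starting_relative: relative amount of each amino acid in the starting
    mixture (assumed scale: 1.0 = equimolar representation).
    """
    average = sum(cycle_amounts.values()) / len(cycle_amounts)
    prefs = {}
    for aa, amount in cycle_amounts.items():
        first_value = amount / average                  # normalize within the cycle
        prefs[aa] = first_value / starting_relative[aa]  # correct for library bias
    return prefs


def selected_residues(prefs: dict[str, float], threshold: float = 1.0) -> list[str]:
    """Residues whose preference value exceeds the threshold."""
    return sorted(aa for aa, value in prefs.items() if value > threshold)


# Invented amounts for a four-residue toy alphabet.
amounts = {"L": 30.0, "A": 10.0, "G": 5.0, "E": 15.0}
relative = {"L": 1.0, "A": 1.0, "G": 1.0, "E": 1.25}
prefs = preference_values(amounts, relative)
print(prefs, selected_residues(prefs))
```

In this toy cycle the average amount is 15, so Leu has a preference value of 2.0 and is the only residue selected; Glu scores 1.0 before correction but 0.8 after accounting for its over-representation in the starting mixture.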
  • protease inhibitors or protease substrates including a sequence determined according to the foregoing methods are provided.
  • inhibitors of matrix metalloproteinase protease activity are provided.
  • the inhibitors include a noncleavable peptide molecule comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6, or a fragment thereof that inhibits matrix metalloproteinase protease activity.
  • inhibitors of Bacillus anthracis lethal factor protease activity include a noncleavable peptide molecule comprising SEQ ID NO:69, or a fragment thereof that inhibits lethal factor protease activity.
  • amino acid sequence comprises SEQ ID NO:70.
  • inhibitors of Bacillus anthracis lethal factor protease activity consist essentially of a compound selected from the group consisting of 2-thioacetyl-Tyr-Pro-Met-amide, N-acetyl-Lys-Val-Tyr-Pro-hydroxamic acid (SEQ ID NO:72), N-acetyl-Lys-Val-Tyr-βAla-hydroxamic acid (SEQ ID NO:73) and N-acetyl-Lys-Pro-Thr-Pro-hydroxamic acid (SEQ ID NO:74).
  • inhibitors of Bacillus anthracis lethal factor protease activity include SEQ ID NO:76, or a fragment thereof that inhibits lethal factor proteolytic activity.
  • the inhibitors include at least one group that chelates the active site metal ion incorporated at either the amino-terminus or the carboxy-terminus.
  • Preferred groups that chelate the active site metal ion are selected from the group consisting of thioacetyl groups, carboxylate groups, phosphonate groups, phosphoramidate groups and hydroxamic acids.
  • preferred inhibitors are peptides or peptide analogs consisting of 3-25 amino acids.
  • Inhibitors of protease activity that compete for binding to the protease with the foregoing inhibitors also are provided in another aspect of the invention, as are compositions comprising any of the foregoing inhibitors (including the competitive inhibitors) and a pharmaceutically acceptable carrier.
  • methods for determining an amino acid sequence motif for a binding site of a protease include: a) contacting the protease with an oriented peptide library containing one or more degenerate residues under conditions which allow for binding of a substrate by the protease; b) allowing the protease to bind peptides within the degenerate peptide library having a binding site for the protease to form protease-peptide complexes; c) isolating the protease-peptide complexes from the unbound peptides; d) releasing the peptides from the protease-peptide complexes; e) isolating the peptides previously bound to the protease; f) determining the amino acid sequences of the peptides; and g) determining an amino acid sequence motif for a binding site of the protease based upon the relative abundance of different amino acid residue
  • the peptides in the oriented peptide library include a carboxy-terminal hydroxamic acid group.
  • the peptides include the amino acid sequence MAXXXXX-hydroxamate (SEQ ID NO:77).
  • the peptide library is contacted with the protease by application of the library to a substrate to which the protease is immobilized.
  • the protease-peptide complexes are isolated by washing the protease-peptide complexes in a buffer that permits binding.
  • the peptides are eluted from the protease-peptide complexes by incubating the protease-peptide complexes with an elution solution.
  • the elution solution comprises either low pH or a metal chelator.
  • protease binding molecules are provided that include an amino acid sequence motif for a binding site of a protease determined according to the foregoing methods.
  • intramolecularly-quenched fluorogenic peptide protease substrates include a lethal factor protease cleavage motif sequence or a matrix metalloprotease cleavage motif flanked by a fluorescent group and a fluorescence quenching moiety.
  • the fluorescent group is attached to the motif sequence at the amino terminus and the quenching moiety is attached to the peptide at the carboxy terminus.
  • Preferred amino-terminal fluorescent groups include a methoxycoumarinacetyl (Mca) group
  • preferred carboxy-terminal quenching moieties include a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety.
  • Mca and Dap(Dnp) are used together.
  • fluorescent groups and quenchers include aminobenzoyl groups or a tryptophan residue as the fluorophore with either a dinitrophenyl group or a nitrotyrosine group as the quencher, and Edans (5-(2-aminoethyl)aminonaphthalene-1-sulfonic acid) as the fluorophore with dabcyl (4-(4-dimethylaminophenylazo)benzoic acid) as the quencher.
  • Still other fluorogenic reagents include those where the fluorophore is at the C-terminus. Upon cleavage, there is an increase in fluorescence.
  • Fluorogenic reagents of this type include aminomethylcoumarins or aminonaphthalenesulfonamides.
  • intramolecularly-quenched fluorogenic protease substrates are provided.
  • the substrates include a lethal factor protease cleavage motif sequence or a matrix metalloprotease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra.
  • the fluorescent proteins are cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP), or green fluorescent protein (GFP) and red fluorescent protein (RFP).
  • protease substrates that contain the cleavage site for a protease of interest placed between the transmembrane segment of a membrane-anchored transcription factor and its transcriptional activation domain, which allows release of the transcriptional activation domain to be regulated by the protease.
  • the lethal factor protease cleavage motif sequence includes SEQ ID NO:69, and more preferably the motif sequence is SEQ ID NO:70.
  • a particularly preferred inhibitor is Mca-KKVYPYPME-Dap(Dnp).
  • the matrix metalloprotease cleavage motif sequence includes an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6. According to another aspect of the invention, methods for identifying protease inhibitors are provided.
  • the methods include: a) providing a protease and a cleavable protease substrate, wherein the uncleaved substrate is distinguishable from the cleaved substrate, wherein the cleavable protease substrate comprises a motif sequence determined according to any of the methods of the invention, b) contacting the protease with a candidate protease inhibitor compound and the cleavable substrate under conditions that permit cleavage of the substrate, and c) detecting the amounts of cleaved and uncleaved substrate as a measure of the presence of a protease inhibitor, wherein detection of a lesser amount of cleaved substrate than is present when the protease is not contacted with the candidate protease inhibitor compound indicates that the candidate protease inhibitor compound is a protease inhibitor.
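The decision rule in step c) above can be sketched in a few lines. The function names and numbers are hypothetical; the sketch applies the stated criterion literally (less cleaved substrate with the candidate than without) and leaves out replicates and noise thresholds that a practical screen would likely need.

```python
def percent_inhibition(cleaved_with_candidate: float,
                       cleaved_without_candidate: float) -> float:
    """Percent reduction in cleaved substrate relative to the no-candidate control."""
    if cleaved_without_candidate <= 0:
        raise ValueError("the control must show measurable cleavage")
    return 100.0 * (1.0 - cleaved_with_candidate / cleaved_without_candidate)


def is_inhibitor(cleaved_with_candidate: float,
                 cleaved_without_candidate: float) -> bool:
    """True if less substrate is cleaved in the presence of the candidate."""
    return cleaved_with_candidate < cleaved_without_candidate


# Invented readings in arbitrary fluorescence units.
print(percent_inhibition(25.0, 100.0), is_inhibitor(25.0, 100.0))
```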
  • the cleavable protease substrate is an intramolecularly-quenched fluorogenic peptide protease substrate comprising a protease cleavage motif sequence flanked by a fluorescent group and a fluorescence quenching moiety, or an intramolecularly-quenched fluorogenic protease substrate comprising a protease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra.
  • Preferred protease cleavage motifs in the substrates include lethal factor protease cleavage motif sequences comprising SEQ ID NO:69, preferably SEQ ID NO:70, and matrix metalloprotease cleavage motif sequences comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
  • methods for identifying protease inhibitors include: a) providing a protease, a protease inhibitor that binds the protease, and a candidate protease inhibitor compound, b) contacting the protease with the candidate protease inhibitor compound and the protease inhibitor under conditions that permit binding of the protease inhibitor to the protease, wherein either or both of the candidate protease inhibitor compound and the protease inhibitor are detectable, and wherein either or both of the candidate protease inhibitor compound and the protease inhibitor comprises a sequence determined according to the methods of the invention, c) separating the protease from the unbound protease inhibitor and unbound candidate protease inhibitor compound, and d) detecting the amounts of detectable protease inhibitor and/or the detectable candidate protease inhibitor compound bound to the protease as a measure of the presence of a candidate protease inhibitor compound that
  • the methods include testing the activity of the protease in the presence of the candidate protease inhibitor compound, wherein a greater reduction in protease activity in the presence of the candidate protease inhibitor compound than in the absence of the candidate protease inhibitor compound indicates that the candidate protease inhibitor compound is a protease inhibitor.
  • the candidate protease inhibitor compound or the protease inhibitor comprises a lethal factor protease cleavage motif sequence comprising SEQ ID NO:69, preferably SEQ ID NO:70.
  • the candidate protease inhibitor compound or the protease inhibitor comprises a matrix metalloprotease cleavage motif sequence comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
  • the candidate protease inhibitor compound is a small organic molecule.
  • Protease inhibitors identified according to these methods also are provided, as are uses of the protease inhibitors in the preparation of a medicament.
  • Fig. 1 is a schematic drawing depicting an overview of the peptide library method.
  • Fig. 2 shows the cleavage-site specificity of MMP-7 (matrilysin).
  • Fig. 2A shows the relative distribution of amino acid residues at positions C-terminal to the MMP-7 cleavage site, determined by sequencing a partial digest of the N-terminally blocked random dodecamer library Ac-XXXXXXXXXX (SEQ ID NO:7). Data are normalized so that a value of 1 corresponds to the average quantity per amino acid in a given sequencing cycle and would indicate no selectivity. Because of poor yield during sequencing, tryptophan was not included in the analysis. Averages of two experiments are shown with standard deviations.
  • Fig. 2B shows the specificity N-terminal to the MMP-7 cleavage site.
  • data shown were obtained using the library MAXXXXXLRGAARE(K-biotin) (SEQ ID NO: 8).
  • the P3 proline library MGXXPXXLRGGGEE(K-biotin) (SEQ ID NO: 9) was used. Glycine, glutamine, and threonine were omitted because of high interfering background peaks on the sequencer. Data were normalized as in Fig. 2A.
  • Fig. 3 shows that MMP-2 can act as a neurocan-processing enzyme in vitro.
  • Purified neonatal rat brain neurocan was digested at 37°C for 2 h with varying concentrations of MMPs as indicated in the absence or presence of the MMP inhibitor GM6001. Reaction mixtures were quenched with EDTA and chondroitinase-digested before SDS-PAGE and silver staining.
  • Fig. 4 depicts FRET substrates for visualizing protease activity in living cells.
  • Cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP) are fused with an intervening linker bearing an optimal LF cleavage site. Irradiation of the uncleaved construct at the CFP excitation wavelength results in transfer of energy to the YFP molecule and emission at its wavelength. Upon cleavage, FRET is disrupted and emission now occurs at the CFP emission wavelength. The ratio of emission at the two wavelengths provides a readout of the extent of cleavage.
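    The ratiometric readout described for the CFP/YFP construct can be sketched as follows. This is only an illustrative sketch: the function names, the calibration ratios for fully uncleaved and fully cleaved substrate, and the linear interpolation between them are assumptions made here, not part of the described method (the true relationship between the ratio and the fraction cleaved need not be linear).

    ```python
    def fret_ratio(cfp_emission: float, yfp_emission: float) -> float:
        """Donor (CFP) over acceptor (YFP) emission; the ratio rises as the linker is cleaved."""
        if yfp_emission <= 0:
            raise ValueError("YFP emission must be positive")
        return cfp_emission / yfp_emission


    def fraction_cleaved(ratio: float, ratio_uncleaved: float, ratio_cleaved: float) -> float:
        """Rough estimate of the extent of cleavage.

        Assumption: linear interpolation between two calibration ratios
        (hypothetical values measured on uncleaved and fully cleaved substrate).
        The result is clamped to the range 0..1.
        """
        frac = (ratio - ratio_uncleaved) / (ratio_cleaved - ratio_uncleaved)
        return min(1.0, max(0.0, frac))
    ```

    For instance, with hypothetical calibration ratios of 0.5 (uncleaved) and 2.5 (cleaved), a measured ratio of 1.5 would be read as roughly half-cleaved under this simplification.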
  • the invention relates to methods for determining cleavage site motifs for proteases, substrate peptides that contain such motifs, including fluorogenic substrates, and inhibitors containing at least a portion of such motifs.
  • the invention also relates to identification of substrate proteins by using the motifs identified to scan databases for proteins containing the motifs. Recognition of the target substrate by a protease depends in part on complementarity between the protease active site and the sequence surrounding the scissile bond in the substrate. Determination of protease cleavage site motifs has several important applications. Specificity information can be used to design highly sensitive and specific synthetic fluorogenic substrates that enable high-throughput screening for small-molecule inhibitors.
  • Analogs of optimized substrates tailor-made to the class of protease provide potent inhibitors useful as lead compounds in drug discovery and as tools in exploring the biological function of the enzyme.
  • knowledge of the optimal cleavage motif for a protease helps identify possible in vivo protein substrates.
  • the invention is generally applicable in determining the protease cleavage site motifs for any protease, in identifying substrates and inhibitors for any protease, and so on.
  • Other proteases of interest will be known to one of skill in the art.
  • the invention pertains generally to the substrate specificity of proteases and to peptides which are substrates for proteases.
  • the invention provides methods that allow for the identification of an amino acid sequence motif for the cleavage site of a specific protease without having to identify, isolate and compare native substrates for the protease.
  • the methods of the invention are based upon selection of a subpopulation of peptides from a degenerate peptide library that are substrates for a protease.
  • the peptides within a peptide library that can be substrates for a protease are cleaved by the protease, converting them to amino-terminal peptides and carboxy-terminal peptides.
  • the peptides of the peptide library preferably are blocked at the amino termini, thereby preventing amino acid sequencing (e.g., by Edman degradation) of the amino-terminal peptides and uncleaved peptides.
  • Blocking of the amino terminus can be accomplished using any means known in the art.
  • Preferably the N-terminus is blocked by the covalent attachment of a moiety to all peptides after the synthesis of the library peptides is completed.
  • a preferred example of a blocking moiety is an acetyl group, and methods of acetylating peptides are well known in the art.
  • the carboxy-terminal peptides, which are unblocked by virtue of the cleavage by the protease, are sequenced and the relative abundance of each amino acid residue at each degenerate position of the carboxy-terminal peptides is determined.
  • the cleaved carboxy-terminal peptides can be separated from the remaining non-cleaved peptides and the amino-terminal peptides, thereby isolating the subpopulation of peptides that are the carboxy-terminal portions of substrate peptides for the protease.
  • the carboxy-terminal peptides can have a molecule attached to the end that permits isolation of these cleavage products (e.g., biotin or an epitope recognized by an antibody).
  • An amino acid sequence motif for the cleavage site of a protease can be determined from the most abundant amino acid residues at each degenerate position of the carboxy-terminal peptides.
  • the abundance of an amino acid at a position in the peptide provides a preference value for each amino acid at each degenerate position.
  • the preference value for a particular amino acid is determined by dividing the amount of the particular amino acid identified in a sequencing cycle by the average amount per amino acid in that cycle. This provides a raw value for the particular amino acid. To correct for bias in the library, it is preferred that the raw value is corrected by then dividing the raw value for each amino acid by the relative amount of that particular amino acid in the starting mixture.
  • Amino acid residues that have a preference value of greater than 1.0 at a degenerate position are considered to be a part of the cleavage site motif.
  • Higher preference values are preferred, e.g., 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, and so on.
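    The preference-value calculation for a single sequencing cycle can be sketched as below. The function name, the dictionary layout, and the example quantities are illustrative assumptions; only the two division steps (by the cycle average, then by the relative amount in the starting mixture) come from the description.

    ```python
    def preference_values(cycle_amounts, library_bias=None):
        """Preference values for one sequencing cycle (one degenerate position).

        cycle_amounts: dict mapping amino acid -> amount detected in this cycle
                       (e.g. pmol; hypothetical input format).
        library_bias:  optional dict mapping amino acid -> relative amount of that
                       amino acid in the starting mixture (1.0 = average).
        """
        average = sum(cycle_amounts.values()) / len(cycle_amounts)
        # Raw value: amount of each amino acid divided by the per-amino-acid average.
        raw = {aa: amount / average for aa, amount in cycle_amounts.items()}
        if library_bias is None:
            return raw
        # Corrected value: raw value divided by the relative abundance in the library.
        return {aa: raw[aa] / library_bias[aa] for aa in raw}
    ```

    A value of 1.0 means the residue was recovered at the average level for that cycle, i.e. no selectivity; made-up amounts of 30, 10 and 20 for L, A and G would give raw values of 1.5, 0.5 and 1.0.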
  • One can select cleavage site motifs based on the highest preference value at a particular peptide residue, or can select motifs based on a combination of two or more amino acids at a particular residue that have preference values above a certain cutoff score.
  • a partially degenerate library to determine the carboxy-terminal cleavage half-site. This may be preferred, for example, when the protease of interest does not cleave the totally degenerate library efficiently.
  • the use of a partially degenerate library also may be preferred if one knows that the protease requires a certain residue in the cleavage site. Examples of two situations are provided in the Examples below for B. anthracis lethal factor protease.
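    Both ways of picking a motif from preference values (the single highest-scoring residue, or every residue above a cutoff) can be sketched as follows. The data layout (one dictionary of preference values per degenerate position), the function name, and the use of "X" for a position with no residue above the cutoff are assumptions made for illustration.

    ```python
    def select_motif(position_prefs, cutoff=1.0, top_only=False):
        """Pick motif residues at each degenerate position.

        position_prefs: list of dicts, one per position, mapping
                        amino acid -> preference value.
        cutoff:         residues must score strictly above this value.
        top_only:       keep only the highest-scoring residue per position.
        Returns a list of residue lists; ["X"] marks a position with no
        residue above the cutoff (an illustrative convention).
        """
        motif = []
        for prefs in position_prefs:
            ranked = sorted(prefs.items(), key=lambda item: item[1], reverse=True)
            selected = [aa for aa, value in ranked if value > cutoff]
            if top_only:
                selected = selected[:1]
            motif.append(selected or ["X"])
        return motif
    ```

    Raising `cutoff` (for example to 1.5 or 2.0) yields a narrower motif, in line with the stated preference for higher values.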
  • the carboxy-terminal portion ofthe cleavage site motif is determined using a partially or totally degenerate peptide library
  • a partially degenerate second library using the knowledge of the carboxy-terminal cleavage half-site to orient the degenerate amino-terminal residues.
  • This second library is, therefore, an oriented degenerate peptide library (ODPL).
  • the amino termini of the peptides in the second library preferably are left unblocked to permit ready sequencing by Edman degradation, although in alternative embodiments the peptides can be blocked during the cleavage reaction and unblocked after removal of the carboxy-terminal peptides, to facilitate sequencing. In this latter instance, blocking agents that are readily removed will be preferred.
  • the carboxy-terminal (also referred to as C-terminal) peptide fragments are removed.
  • the C-termini of the peptides in the second library are labeled with a moiety that facilitates ready removal of the C-terminal peptide fragments after cleavage.
  • the C-terminal moiety is biotin, incorporated as a lysyl-biotin residue.
  • one or more K residues can be added to the peptides (e.g., at the C-terminus of the libraries) to promote water solubility.
  • Avidin molecules can be coupled to a substrate (e.g., a bead, resin, dipstick, magnetic bead) and used to bind biotin-linked peptides (uncleaved peptides) and biotin-linked peptide fragments (carboxy-terminal cleavage products).
  • the biotin-avidin binding pair is but one example of agents useful for removing the cleaved C-terminal peptide fragments and uncleaved peptides.
  • Other binding pairs known in the art include antibody-antigen pairs.
  • the isolated amino-terminal cleavage products are sequenced according to standard methodologies.
  • the peptides are sequenced by an automated peptide sequencer. Preference values for amino acids at positions of the N-terminal portion of the cleavage site are then determined as for the C-terminal sequence. Combining the N-terminal motif and the C-terminal motif sequences provides a complete cleavage motif sequence. The determination of N-terminal and C-terminal motif sequences can proceed in iterative fashion. Thus, a third round of motif determination can be based on the second round. The cleavage of the second library provides sequence information about the N-terminal (unprimed) residues of the protease cleavage site.
  • This sequence information can be used to design a third library which fixes the unprimed N-terminal residues in accordance with the experimentally determined cleavage motif.
  • This third library, like the first library, contains degenerate amino acid sequence in the portion of the peptides carboxy-terminal to the scissile bond (i.e., the primed residues).
  • the peptides of the third library preferably are blocked at the N-termini so that only the C-terminal cleaved peptides will yield sequence information.
  • the third library is subjected to protease cleavage and the sequence of the C-terminal peptide fragments is then determined. Preference values for the C-terminal residues of the cleavage motif are calculated, thus refining the C-terminal portion of the motif.
  • Substrates for the protease are designed based on the protease cleavage motif.
  • the substrates preferably are detectably labeled in a manner that permits detection of the cleavage products as distinct from the uncleaved peptide substrate.
  • One preferred example of a detectably labeled peptide is a fluorogenic peptide substrate.
  • fluorogenic substrates include two moieties linked to the ends of a substrate peptide. While linked in close proximity, the fluorogenic moieties have certain properties that change upon cleavage of the substrate peptide. For example, the moieties may be quenched in close proximity so that the uncleaved substrate peptide is not fluorescent.
  • fluorogenic reagents are amino terminal fluorescent methoxycoumarinacetyl groups and carboxy-terminal dinitrophenyl-diaminopropionic acid quenching moieties. Fluorogenic peptides also can be made using aminobenzoyl groups or a tryptophan residue as the fluorophore with either a dinitrophenyl group or a nitrotyrosine group as the quencher.
  • Edans (5-(2-aminoethyl)aminonaphthalene-1-sulfonic acid) can be used as the fluorophore with dabcyl (4-(4-dimethylaminophenylazo)benzoic acid) as the quencher.
  • Still other fluorogenic reagents include those where the fluorophore is at the C-terminus. Upon cleavage, there is an increase in fluorescence. Fluorogenic reagents of this type include aminomethylcoumarins or aminonaphthalenesulfonamides.
  • detectable cleavage substrates are to include in the substrate a molecule that affects a detectable process, preferably a process detectable in cellular assays. In such an approach, the molecule is inactive until the substrate is cleaved.
  • a membrane-anchored transcription factor such as ATF6 which is normally released from a cytoplasmic membrane by proteolytic cleavage to allow it to enter the nucleus and act as a transcription factor.
  • the cleavage site for a protease of interest is placed between the transmembrane segment of the membrane-anchored transcription factor and its transcriptional activation domain, which allows release of the transcriptional activation domain to be regulated by the protease.
  • the release of the transcriptional activation domain is monitored using standard reporter assays, such as a reporter gene assay in which a detectable protein product (green fluorescent protein, luciferase, etc.) is placed under the control of the transcription factor.
  • Other cleavage-activated processes known in the art also are adaptable to this purpose.
  • Specific high affinity protease inhibitors also can be designed to incorporate the cleavage site motif.
  • Inhibitors can be based on the entire protease cleavage site or on the C-terminal or N-terminal half-site motifs determined from cleavage of the first peptide library (and/or the third peptide library) and the second peptide library, respectively.
  • Many modifications to peptide structure are known that are useful in the preparation of protease inhibitors. These include modified bonds, modified amino acids, and moieties that interact with the protease to prevent cleavage.
  • a group which chelates the active site zinc ion can be incorporated at either the amino- or carboxy-terminus of an optimized peptide.
  • Peptides corresponding to primed residues (C-terminal motif) bearing amino-terminal thioacetyl groups can be synthesized using standard solid-phase chemistries.
  • peptides corresponding to unprimed residues (N-terminal motif) can incorporate a hydroxamic acid group in place of the carboxylic acid.
  • inhibitors for metalloproteinases include carboxylates, phosphonates, phosphoramidates, and "right-handed" hydroxamic acids (which cover the unprimed residues).
  • inhibitors include aldehydes, halomethylketones, acyloxyketones, diazomethylketones, vinyl sulfones, epoxides, and ketomethylene peptides.
  • inhibitors include statines and other inhibitors which span the cleavage site and incorporate a hydroxyethylene moiety.
  • inhibitors include chloromethylketones. Specific methods for synthesis and purification of the inhibitors are known in the art, and certain of these are described in more detail in the Examples below.
  • the natural substrates of the protease used in the methods of the invention can be determined by scanning existing amino acid sequence databases (e.g., Swiss-Prot) for the existence of proteins having sequences that match the cleavage site motif. Software packages that are useful for this purpose are known in the art. For example, the Scansite program (Yaffe, et al., Nat. Biotechnol. 19, 348-353, 2001) can be used. Identification of natural substrates provides additional substrates for testing of inhibitors; the cleavage of the substrates can be monitored in the absence and in the presence of varying concentrations of candidate inhibitors to assess their effectiveness in preventing the cleavage of a variety of naturally occurring protein molecules.
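    A very simple form of motif scanning can be sketched with a regular expression. This is not the Scansite method and does not score matches; it only reports exact occurrences of a motif. The motif representation (a list of allowed residues per position, with ["X"] meaning any residue), the function names, and the toy sequences are all assumptions made for illustration.

    ```python
    import re


    def motif_to_regex(motif):
        """Turn a per-position list of allowed residues into a compiled pattern.

        The pattern is wrapped in a lookahead so overlapping matches are found.
        """
        parts = []
        for residues in motif:
            if residues == ["X"]:
                parts.append(".")              # any residue allowed at this position
            elif len(residues) == 1:
                parts.append(residues[0])      # single required residue
            else:
                parts.append("[" + "".join(residues) + "]")
        return re.compile("(?=(" + "".join(parts) + "))")


    def scan(proteins, motif):
        """Return (protein name, 1-based start position, matched segment) tuples.

        proteins: dict mapping a name to a one-letter amino acid sequence.
        """
        pattern = motif_to_regex(motif)
        hits = []
        for name, sequence in proteins.items():
            for match in pattern.finditer(sequence):
                hits.append((name, match.start() + 1, match.group(1)))
        return hits
    ```

    In practice the sequences would be read from a database export rather than typed in, and candidate hits would still need experimental confirmation as substrates.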
  • the methods provided herein have the advantage that they can be used to determine a cleavage site motif for any protease, regardless of whether native substrates for that protease have been identified. Furthermore, since the methods involve selection of peptides which are cleaved most readily by a protease, the amino acid sequence motif determined by the methods represents the optimal cleavage site for that protease.
  • the cleavage-based methods require cleavage of peptides in a library. It may be desirable to determine motifs of protease binding rather than cleavage, particularly for the development of high-affinity uncleavable inhibitors. Accordingly, the invention also includes methods for determining protease binding motifs. In these methods, a protease is contacted with a library of noncleavable peptides. After washing away unbound peptides, the remaining peptides are eluted from the bound state and sequenced. As with the other methods described herein, the preference values of the peptide residues are then determined, and the protease binding motif is thereby determined.
  • protease is immobilized on a solid surface, such as a resin bead, that permits thorough removal of unbound peptides, such as by washing, and recovery of protease following removal of bound peptides (e.g., by alteration of salt concentration, pH, addition of metal chelators, etc.).
  • the peptide libraries for determining binding motifs preferably are oriented degenerate peptide libraries. Because the peptides are not cleaved in this method (which provides an orientation at the cleavage site), some other method of orientation is required to be able to extract meaningful sequence information (e.g., to prevent recognition of phased binding sites in the peptide library).
  • One approach to orientation of the binding site libraries for metalloproteinases is to synthesize peptide library mixtures bearing a carboxy-terminal hydroxamic acid group, which will serve to orient the library by forcing the binding of the peptides at the active site.
  • Another approach for peptide library orientation is to utilize protease cleavage motif information determined in accordance with other methods described herein to fix several of the residues in the peptide library to enhance the binding of the peptides at the active site.
  • the peptides synthesized for the libraries can be of any size that is readily recognized and cleaved by proteases (for determination of cleavage site motifs), or that is bound by proteases with high affinity (for determination of binding site motifs).
  • the size of the peptides can be determined empirically, although it is expected that a peptide length of 5-25 amino acids, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 and 25 amino acids, will work well for most applications of the methods described herein.
  • the peptides are 10-15 amino acids in length, most preferably 12 amino acids
  • Inhibitors, including half-site inhibitors, and substrate peptides will generally be of a similar length. It is possible, however, to use peptides that are longer if required or preferred for a particular application.
  • the peptides can incorporate natural and/or unnatural amino acids, and can be synthesized using standard solid-phase chemistries.
  • unnatural amino acids are provided below, and additional amino acids will be known to the skilled artisan.
  • the library does not contain cysteine residues so that disulfide bonds are not formed.
  • it may be desired to prepare noncleavable peptides containing the cleavage motif sequences.
  • noncleavable peptides are useful as specific inhibitors of proteases.
  • the peptides described herein preferably are non-hydrolyzable.
  • the individual peptide bonds which are susceptible to proteolysis can be replaced with non-hydrolyzable peptide bonds by in vitro synthesis of the peptide.
  • Non-hydrolyzable bonds include -psi[CH2NH]- reduced amide peptide bonds, -psi[COCH2]- ketomethylene peptide bonds, -psi[CH(CN)NH]- (cyanomethylene)amino peptide bonds, -psi[CH2CH(OH)]- hydroxyethylene peptide bonds, -psi[CH2O]- peptide bonds, and -psi[CH2S]- thiomethylene peptide bonds.
  • Nonpeptide analogs of peptides are also contemplated.
  • Peptide mimetic analogs can be prepared based on a cleavage motif sequence by replacement of one or more amino acid residues by nonpeptide moieties.
  • the nonpeptide moieties permit the peptide mimetic to retain its natural conformation, or stabilize a preferred, e.g., bioactive, conformation.
  • the substrate peptides, binding peptides and inhibitors of protease cleavage labeled as described herein are useful for screening compounds and libraries of compounds for protease inhibitory activity. As mentioned, high throughput screening of known compounds and libraries of compounds can be performed using these substrates according to known methodologies.
  • the invention further provides efficient methods of identifying pharmacological agents or lead compounds for agents useful for inhibiting or monitoring protease activity.
  • the screening methods involve assaying for compounds which are cleaved or which inhibit cleavage of a protease substrate. Such methods are adaptable to automated, high throughput screening of compounds.
  • assays for pharmacological agents are provided, including labeled in vitro protease cleavage assays, cell-based protease cleavage assays, etc.
  • in vitro protease cleavage assays are used to rapidly examine the effect of candidate pharmacological agents on the cleavage of a substrate by a specific protease.
  • the candidate pharmacological agents can be derived from, for example, combinatorial peptide or small molecule libraries. Convenient reagents for such assays are known in the art.
  • Peptides used in the methods of the invention are added to an assay mixture as isolated peptides.
  • Peptides can be produced recombinantly, or isolated from biological extracts, but preferably are synthesized in vitro.
  • Peptides encompass chimeric proteins comprising a fusion of a peptide having a particular cleavage site motif with one or more other polypeptides, e.g., fluorescent polypeptides.
  • Peptides may also be labeled with detectable compound(s) to provide a means of readily detecting whether the peptide is cleaved, e.g., by immunological recognition or by fluorescent labeling.
  • a typical assay mixture includes a peptide having a protease cleavage site motif and a candidate pharmacological agent.
  • a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a different response to the various concentrations.
  • one of these concentrations serves as a negative control, i.e., at zero concentration of agent or at a concentration of agent below the limits of assay detection.
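    The comparison of parallel assay mixtures against the zero-concentration negative control can be sketched as a percent-inhibition calculation. The function name, the input layout (agent concentration mapped to a cleavage signal such as fluorescence), and the optional background subtraction are assumptions made for illustration, not a prescribed analysis.

    ```python
    def percent_inhibition(signal_by_concentration, background=0.0):
        """Percent inhibition of cleavage at each agent concentration.

        signal_by_concentration: dict mapping agent concentration -> cleavage
            signal; must include 0.0, the negative control without agent.
        background: signal measured without protease (hypothetical, optional).
        """
        if 0.0 not in signal_by_concentration:
            raise ValueError("a zero-concentration negative control is required")
        control = signal_by_concentration[0.0] - background
        if control <= 0:
            raise ValueError("control signal must exceed background")
        return {
            conc: 100.0 * (1.0 - (signal - background) / control)
            for conc, signal in sorted(signal_by_concentration.items())
        }
    ```

    With made-up signals of 1000, 750 and 500 at agent concentrations of 0, 1 and 10, this gives 0%, 25% and 50% inhibition respectively.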
  • Candidate agents encompass numerous chemical classes, although typically they are organic compounds.
  • the candidate pharmacological agents are small organic compounds, i.e., those having a molecular weight of more than 50 yet less than about 2500.
  • Candidate agents comprise functional chemical groups necessary for structural interactions with polypeptides (e.g., protease cleavage sites), and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups and more preferably at least three of the functional chemical groups.
  • the candidate agents can comprise cyclic carbon or heterocyclic structure and/or aromatic or polyaromatic structures substituted with one or more ofthe above-identified functional groups.
  • Candidate agents also can be biomolecules such as peptides (preferably non-hydrolyzable for protease inhibitors), saccharides, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations thereof and the like.
  • the agent is a nucleic acid (i.e., aptamer)
  • the agent typically is a DNA or RNA molecule, although modified nucleic acids having non-natural bonds or subunits are also contemplated.
  • Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds.
  • libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced.
  • natural and synthetically produced libraries and compounds can be readily modified through conventional chemical, physical, and biochemical means.
  • known pharmacological agents may be subjected to directed or random chemical modifications such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs ofthe agents.
  • reagents such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc. which may be used to facilitate optimal protein-protein and/or protein-nucleic acid binding. Such a reagent may also reduce non-specific or background interactions of the reaction components.
  • reagents that improve the efficiency of the assay such as nuclease inhibitors, antimicrobial agents, and the like may also be used.
  • the mixture of the foregoing assay materials is incubated under conditions whereby, but for the presence of the candidate pharmacological agent, a protease cleaves a substrate (for protease inhibition studies), or specifically binds a protease inhibitor, e.g., a non-hydrolyzable peptide (for identifying compounds that compete with known inhibitors).
  • the order of addition of components, incubation temperature, time of incubation, and other parameters of the assay may be readily determined. Such experimentation merely involves optimization of the assay parameters, not the fundamental composition of the assay. Incubation temperatures typically are between 4°C and 40°C. Incubation times preferably are minimized to facilitate rapid, high throughput screening, and typically are between 1 minute and
  • a separation step may be used to separate bound from unbound components.
  • the separation step may be accomplished in a variety of ways. Conveniently, at least one of the components is immobilized on a solid substrate, from which the unbound components may be easily separated.
  • the solid substrate can be made of a wide variety of materials and in a wide variety of shapes, e.g., microtiter plate, microbead, dipstick, resin particle, etc.
  • the substrate preferably is chosen to maximize signal-to-noise ratios, primarily to minimize background binding, as well as for ease of separation and cost.
  • Separation may be effected for example, by removing a bead or dipstick from a reservoir, emptying or diluting a reservoir such as a microtiter plate well, rinsing a bead, particle, chromatographic column or filter with a wash solution or solvent.
  • the separation step preferably includes multiple rinses or washes.
  • the solid substrate is a microtiter plate
  • the wells may be washed several times with a washing solution, which typically includes those components of the incubation mixture that do not participate in specific binding or interaction such as salts, buffer, detergent, non-specific protein, etc.
  • the solid substrate is a magnetic bead
  • the beads may be washed one or more times with a washing solution and isolated using a magnet.
  • Detection may be effected using any convenient method.
  • the protease cleavage or binding typically alters a directly or indirectly detectable product, e.g., a cleaved substrate peptide.
  • one of the components usually comprises, or is coupled to, a detectable label.
  • labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence, optical or electron density, etc), or indirect detection (e.g., epitope tag such as the FLAG epitope, enzyme tag such as horseradish peroxidase, etc.).
  • the label may be bound to a protease substrate or inhibitor as described elsewhere herein or to the candidate pharmacological agent.
  • the label may be detected while bound to the solid substrate or subsequent to separation from the solid substrate.
  • Labels may be directly detected through optical or electron density, radioactive emissions, nonradiative energy transfers, etc. or indirectly detected with antibody conjugates, streptavidin-biotin conjugates, etc. Methods for detecting the labels are well known in the art.
  • the present invention includes automated drug screening assays for identifying compositions having the ability to inhibit protease cleavage of a substrate directly (by binding protease), or indirectly (by serving as cleavable decoy substrates).
  • the automated methods are carried out in an apparatus which is capable of delivering a reagent solution to a plurality of predetermined compartments of a vessel and measuring the change in a detectable molecule in the predetermined compartments.
  • Exemplary methods include the following steps. First, a divided vessel is provided that has one or more compartments which contain a protease substrate which, when exposed to a specific protease, has a detectable change in fluorescence.
  • the protease can be in a cell in the compartment, in solution, or immobilized within the compartment.
  • one or more predetermined compartments are aligned with a predetermined position (e.g., aligned with a fluid outlet of an automatic pipette) and an aliquot of a solution containing a compound or mixture of compounds being tested for its ability to inhibit protease cleavage is delivered to the predetermined compartment(s) with an automatic pipette.
  • the fluorescent protease substrate is also added with the compounds or following the addition of the compounds.
  • fluorescence emitted by the substrate in response to an excitation wavelength is measured for a predetermined amount of time, preferably by aligning said cell-containing compartment with a fluorescence detector.
  • fluorescence also is measured prior to adding the compounds to the compartments, to establish, e.g., background and/or baseline values for fluorescence.
  • the compounds can be added with or after addition of a substrate or inhibitor to the protease-containing compartments.
  • One of ordinary skill in the art can readily determine the appropriate order of addition of the assay components for particular assays.
  • the plate is moved, if necessary, so that assay wells are positioned for measurement of fluorescence emission. Because a change in the fluorescence signal may begin within the first few seconds after addition of test compounds, it is desirable to align the assay well with the fluorescence reading device as quickly as possible, with times of about two seconds or less being desirable.
  • fluorescence readings may be taken substantially continuously, since the plate does not need to be moved for addition of reagent.
  • the well and fluorescence-reading device should remain aligned for a predetermined period of time suitable to measure and record the change in fluorescence.
  • Where the apparatus is configured to detect fluorescence from above the plate, it is preferred that the bottoms of the wells are colored black to reduce the background fluorescence and thereby decrease the noise level in the fluorescence reader.
  • the apparatus of the present invention is programmable to begin the steps of an assay sequence in a predetermined first well (or rows or columns of wells) and proceed sequentially down the columns and across the rows of the plate in a predetermined route through well number n. It is preferred that the fluorescence data from replicate wells treated with the same compound are collected and recorded (e.g., stored in the memory of a computer) for calculation of fluorescence. To accomplish rapid compound addition and rapid reading of the fluorescence response, the fluorometer can be modified by fitting an automatic pipetter and developing a software program to accomplish precise computer control over both the fluorometer and the automatic pipetter.
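    Collecting fluorescence data from replicate wells treated with the same compound might look like the sketch below. The record format (compound label paired with a fluorescence change), the function name, and the choice of mean and sample standard deviation as the summary are assumptions made for illustration; the description does not specify how the stored readings are combined.

    ```python
    from collections import defaultdict
    from statistics import mean, stdev


    def summarize_replicates(readings):
        """Group fluorescence readings by compound and summarize them.

        readings: iterable of (compound label, fluorescence change) pairs,
                  one pair per well (hypothetical record format).
        Returns a dict mapping compound -> (mean, sample standard deviation);
        the deviation is reported as 0.0 when only one well is available.
        """
        grouped = defaultdict(list)
        for compound, value in readings:
            grouped[compound].append(value)
        return {
            compound: (mean(values), stdev(values) if len(values) > 1 else 0.0)
            for compound, values in grouped.items()
        }
    ```

    Wells would be fed in as they are read along the programmed route, so the summary can be computed once the last replicate has been recorded.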
  • the delay time between reagent addition and fluorescence reading can be significantly reduced.
  • both greater reproducibility and higher signal-to-noise ratios can be achieved as compared to manual addition of reagent because the computer repeats the process precisely time after time.
  • this arrangement permits a plurality of assays to be conducted concurrently without operator intervention.
  • reliability of the fluorescent dye-based assays as well as the number of assays that can be performed per day are advantageously increased.
  • Inhibitors of proteases identified by the methods described herein are useful to treat diseases or conditions that result from excessive or unwanted protease activity, including pathogenic infections, cancer, inflammatory diseases, etc.
  • an effective inhibitory amount of a protease inhibitor is administered to a subject.
  • the inhibitors also can be used in diagnostic applications, to detect specific proteases.
  • pathogens that express a specific protease can be detected in a subject, in a biological sample of the subject, or in various materials to assess contamination.
  • Inhibitors and other compounds that incorporate protease cleavage or binding site sequence motifs can be administered as part of a pharmaceutical composition.
  • Such a pharmaceutical composition may include the compounds in combination with any standard physiologically and/or pharmaceutically acceptable carriers which are known in the art.
  • the compositions should be sterile and contain a therapeutically effective amount of the inhibitor peptide or other therapeutic compound in a unit of weight or volume suitable for administration to a patient.
  • pharmaceutically acceptable means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredients.
  • physiologically acceptable refers to a non-toxic material that is compatible with a biological system such as a cell, cell culture, tissue, or organism. The characteristics of the carrier will depend on the route of administration.
  • Physiologically and pharmaceutically acceptable carriers include diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials which are well known in the art.
  • A therapeutically effective amount means that amount necessary to delay the onset of, inhibit the progression of, or halt altogether the particular condition being treated.
  • Therapeutically effective amounts specifically will be those which desirably influence protease activity.
  • A therapeutically effective amount will vary with the subject's age and condition, as well as the nature and extent of the disease in the subject, all of which can be determined by one of ordinary skill in the art. The dosage may be adjusted by the individual physician, particularly in the event of any complication.
  • A therapeutically effective amount typically varies from 0.01 ng/kg to about 1000 µg/kg, preferably from about 0.1 ng/kg to about 200 µg/kg, and most preferably from about 0.2 ng/kg to about 20 µg/kg, in one or more dose administrations daily, for one or more days.
  • The therapeutics of the invention can be administered by any conventional route, including injection or by gradual infusion over time.
  • The administration may, for example, be oral, intravenous, topical, intracranial, intraperitoneal, intramuscular, intracavity, intrarespiratory, subcutaneous, or transdermal.
  • The route of administration will depend on the composition of a particular therapeutic preparation of the invention and its intended use.
  • Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions.
  • Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate.
  • Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media.
  • Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's or fixed oils.
  • Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, for example, antimicrobials, anti-oxidants, chelating agents, and inert gases and the like.
  • Other delivery systems can include time-release, delayed release or sustained release delivery systems. Such systems can avoid repeated administrations of the active compounds of the invention, increasing convenience to the subject and the physician. Many types of release delivery systems are available and known to those of ordinary skill in the art.
  • They include polymer-based systems such as polylactic and polyglycolic acid, polyanhydrides and polycaprolactone; nonpolymer systems that are lipids including sterols such as cholesterol, cholesterol esters and fatty acids or neutral fats such as mono-, di- and triglycerides; hydrogel release systems; silastic systems; peptide-based systems; wax coatings, compressed tablets using conventional binders and excipients, partially fused implants and the like.
  • A pump-based hardware delivery system can be used, some of which are adapted for implantation.
  • A long-term sustained release implant also may be used.
  • "Long-term release," as used herein, means that the implant is constructed and arranged to deliver therapeutic levels of the active ingredient for at least 30 days, and preferably 60 days.
  • Long-term sustained release implants are well known to those of ordinary skill in the art and include some of the release systems described above. Such implants can be particularly useful in treating conditions characterized by unwanted protease activity by placing the implant near portions of a subject affected by such activity, thereby effecting localized, high doses of the compounds of the invention.
  • Example 1 Determination of complete protease cleavage site motifs using oriented peptide library mixtures
  • Reagents. Recombinant MT1-MMP catalytic domain and GM6001 were purchased from Chemicon (Temecula, CA), recombinant human MMP-2 from Oncogene Research Products (San Diego, CA), and other purified MMPs (native human MMP-1, recombinant human MMP-3 catalytic domain, recombinant human MMP-7, native monomeric MMP-9) from Calbiochem (San Diego, CA).
  • Peptide libraries were synthesized at the Tufts University Core Facility (Boston, MA). Degenerate positions X were prepared using iso-kinetic mixtures of the 19 naturally occurring L-amino acids excluding cysteine.
  • Biotinylated libraries were prepurified on a monomeric avidin column (Pierce Chemical Co., Rockford, IL). Peptides were applied to the column in PBS, washed extensively with PBS followed by 50 mM NH4OAc, eluted with 0.2 M HOAc, and lyophilized. Purified libraries were partially digested with protease in 20 µl reactions as described above. Ethylenediamine tetraacetate (EDTA) was added to 15 mM and the biotinylated fraction removed by rotating with 400 µl avidin agarose (Sigma Chemical Co., St. Louis, MO) in 600 µl 25 mM ammonium bicarbonate for 1 h at room temperature.
  • The mixture was transferred to a column, and the flowthrough was combined with five 200 µl wash fractions of 25 mM ammonium bicarbonate.
  • The material was evaporated to dryness under reduced pressure, suspended in 20 µl double-distilled water, and sequenced. Data were normalized as described above.
  • Peptide cleavage assay. Peptide cleavage was assayed by following the production of amine using fluorescamine (34). Amounts of product were determined by using the signal from a given peptide digested to completion with MMP-7 as a standard. For enzyme-peptide combinations in which the reaction rate was linear over substrate concentration [S] at 100 µM, values of kcat/KM were determined from initial rates (≤10% turnover) at that concentration (where KM >> [S]). Otherwise, catalytic parameters were obtained by determining initial rates at various substrate concentrations and fitting the data directly to the Michaelis-Menten equation using Kaleidagraph software. Assays were performed in triplicate. Enzyme concentrations used were based on protein concentration alone.
  • Neonatal rat brain neurocan (300 ng) was incubated with MMP as indicated in 10 µl buffer containing 20 mM HEPES, pH 7.4, 140 mM NaCl, and 2 mM CaCl2 for 2 h at 37°C, and quenched by adding 10 µl 20 mM EDTA.
  • Chondroitin sulfate chains were removed by chondroitinase treatment as described (35), and samples were run on 5% SDS-PAGE gels followed by silver staining.
  • Oriented peptide libraries have been used previously to determine the target sequence preferences of protein kinases (10) and protein interaction domains (11—13).
  • For proteases, a two-step method is used. We first determine the cleavage site motif C-terminal to the cleavage site by partial digestion and N-terminal sequencing of a completely random peptide mixture. Information from this first round of screening is used to design a second library in which strongly selected amino acids are fixed, allowing data on sites N-terminal to the cleavage site to be obtained. Reiteration of this process allows an optimal recognition sequence to be determined.
  • MMPs are a family of secreted enzymes, including collagenases, gelatinases, and stromelysins, that play a crucial role in defining the cellular environment through regulated degradation and processing of extracellular proteins (14, 15).
  • Previous work using large series of synthetic peptides (16-22), phage display libraries (2, 23), and mixture-based libraries (9, 24) has provided information on the cleavage site specificity of several MMPs. Data obtained with our approach are consistent with these previous findings and provide novel selectivity information as well.
  • The cleavage site motif for a protease involves residues both N- and C-terminal to the scissile bond (the unprimed and primed sides, respectively), with the cleavage site defined as ...P3-P2-P1-P1'-P2'-P3'..., where cleavage occurs between the P1 and P1' residues (25).
  • Our method involves the initial determination of the primed-side motif and subsequent determination of the unprimed-side motif (Fig. 1).
  • The primed-side motif is determined by partial digestion of a completely random mixture of peptide dodecamers acetylated at the N terminus.
  • The digested mixture is subjected to N-terminal sequencing by Edman degradation. Unreacted intact peptides and the N-terminal fragments of reacted peptides remain blocked and do not contribute to the sequenced pool; only the C-terminal fragments are sequenced.
  • The relative amounts of each amino acid present in a given cycle indicate the preference for that residue at a particular site, so that the first sequencing cycle affords information about the P1' position, the second cycle about the P2' position, and so on.
  • The primed-side motif for MMP-7 (matrilysin) determined in this manner is shown in Fig. 2A.
  • Sequences for the cleavage site motifs are as follows: MMP-7 (SEQ ID NO:1), MMP-1 (SEQ ID NO:2), MMP-2 (SEQ ID NO:3), MMP-9 (SEQ ID NO:4), MMP-3 (SEQ ID NO:5), and MT1-MMP (SEQ ID NO:6).
  • MMPs generally require hydrophobic amino acids at P1' and prefer either hydrophobic or basic amino acids at P2'.
  • Whereas MMP-1, MMP-2, and MMP-9 prefer small residues (alanine, glycine, or serine) at P3', MMP-3, MMP-7, and MT1-MMP select for methionine at that position.
  • The MMPs can also be distinguished on the basis of their relative tolerance for aromatic amino acids at P1'. Although all enzymes tested select aliphatic residues most strongly at P1', MMP-2, MMP-3, MMP-9, and MT1-MMP also had reasonable selections for phenylalanine and tyrosine at that position. This observation concurs with previous reports on MMP substrate specificity and has been rationalized in terms of the deeper hydrophobic S1' pocket in these MMPs, as determined by both crystallography and mutagenesis studies (15, 16, 26).
  • This secondary library has the sequence MAXXXXXLRGAARE(K-biotin) (SEQ ID NO:8), where X indicates a degenerate position, K-biotin is ε-(biotinamidohexanoyl)lysine, and the N terminus is unblocked.
  • The fixed LRG sequence in this library corresponds to the P1'-P3' positions and represents a consensus MMP motif. These fixed positions are preceded by several degenerate residues that correspond to the unprimed positions, so that cleavage is directed to the X-L bond.
  • The library is partially digested with the MMP, the reaction mixture is quenched, and undigested peptides and C-terminal fragments that retain the biotin tag are removed with immobilized avidin. The remaining N-terminal fragments are subjected to N-terminal sequencing, and the selectivities are determined from the relative abundance of each amino acid in a given sequencing cycle as before.
  • The secondary library was used to analyze the unprimed-side specificity of the six MMPs.
  • GPQG-IAGQ 0.15±0.02 22
  • a Peptides were N-terminally acetylated and C-terminally amidated, and peptide cleavage was assayed by fluorescamine detection of amine production.
  • The predicted optimal MMP-7 substrate is listed at the top. Cleavage sites are indicated with hyphens, and substitutions to the optimal peptide are indicated in boldface.
  • The collagen cleavage-site-spanning peptide is listed at bottom. Values are shown as a percentage of the Vmax/KM value for the consensus peptide.
  • Although MMPs share many common features in their consensus cleavage motifs (proline in P3, serine in P1, and leucine or methionine in P1'), the presence of subtle distinctions indicated that we might be able to discriminate among MMPs with optimized peptide substrates.
  • A peptide corresponding to the consensus motif for each MMP was synthesized, and catalytic parameters for cleavage of each peptide by the six MMPs studied were determined (Table 3). Parameters for cleavage of the collagen cleavage site-spanning octapeptide were also determined for comparison, and in every case the predicted optimal peptide was a significantly better substrate than the collagen peptide.
  • For each MMP, the consensus peptide was either the best peptide substrate tested for that enzyme (MMP-2, MMP-3, and MMP-7) or within twofold of the best peptide (MMP-9, MMP-1, and MT1-MMP).
  • Although the optimal motifs for cleavage by this family of highly related proteases are largely similar, it is possible to design peptides that are selectively cleaved by specific members.
  • MMP-7 180±20 7,900±900 9,700±400 12,000±1,500 120,000±20,000 22,000±3,000 12,000±600
  • MT1-MMP 3,600±200 6,100±300 4,300±300 3,700±300 10,300±700 5,500±300 6,900±500
  • a Peptides were designed and synthesized based on the data in Table 1.
  • The cleavage site is indicated by a hyphen.
  • The kcat/KM value was determined directly from the initial rate at a single substrate concentration under conditions where KM >> [S].
  • kcat and KM values (not shown) were determined by fitting initial rate data at varying substrate concentrations to the Michaelis-Menten equation.
  • The sequences for the consensus peptides are: collagen (SEQ ID NO:23), MMP-1 (SEQ ID NO:24), MMP-2 (SEQ ID NO:25), MMP-3 (SEQ ID NO:26), MMP-7 (SEQ ID NO:27), MMP-9 (SEQ ID NO:28), and MT1-MMP (SEQ ID NO:29).
  • Collagen-o_ (I) (bovine) GPQG-LLGA 34
  • Sequences above are: protein substrates (SEQ ID NO:51), peptide libraries (SEQ ID NO:52).
  • IPVS- RSG 53 Consensus peptide 100±7
  • Neurocan I AM-LRAP 65 MMP-2, MMP-3
  • PAI-3 (mouse) TAAA-ITGA 66 MMP-2
  • a Part A tabulates known MMP-2 cleavage sites (15, 36-39). Multiple sites in a single protein that were mapped following complete degradation of a given protein are not listed. For cases in which several cleavage sites have been identified in a single protein but one site clearly predominates, only the major site is given.
  • Part B compares the experimentally determined MMP-2 cleavage-site motif with residues found at each position in known protein substrates. The number of occurrences of each residue in the 21 sites listed in part A for residues that arise more than once is given in parentheses.
  • Part C shows cleavage rates relative to the MMP-2 consensus peptide for peptide substrates derived from several known MMP-2 protein substrates. Parameters were determined as for Table 2.
  • Part D lists predicted MMP cleavage sites from computer database searches using matrices derived from the cleavage motifs for several MMPs. Unless otherwise indicated, the sequence is from the human ortholog.
  • the chondroitin sulfate proteoglycan neurocan, whose predicted MMP-2 cleavage site corresponds to a known, developmentally regulated in vivo processing site (31).
  • MMP-2 could cleave neurocan at the predicted site.
  • Neonatal rat brain neurocan, purified as a mixture of the full-length proteoglycan and its C-terminal fragment, was treated with MMP-2 and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE; Fig. 3).
  • Treatment with low concentrations of MMP-2 resulted in the complete disappearance of full-length neurocan with a concomitant increase in the abundance of the C-terminal fragment. MMP-2 digestion also generated a faster-migrating band, which was confirmed to be the N-terminal fragment by immunoblotting with a monoclonal antibody (1F6, ref. 32) that recognizes the N terminus of neurocan. Proteolysis was completely blocked by the hydroxamate MMP inhibitor GM6001. Treatment with equimolar amounts of MMP-1 or MMP-9, which according to their library profiles are not predicted to cleave neurocan at its processing site, did not result in cleavage.
  • These results indicate that MMP-2 can specifically process neurocan in vitro and demonstrate that cleavage motifs based on library data can be used to identify novel protein substrates.
  • Residues at two sites (P3 and P2) important for MMP recognition were not previously known to promote cleavage by MMP-2, suggesting that neurocan would not have been predicted to be an MMP-2 substrate.
  • Whether MMP-2 acts as the authentic neurocan-processing enzyme in vivo remains to be determined and awaits analysis of MMP-2 knockout mice.
  • A limitation to the method is that for those proteases with major selectivity sites only on the unprimed side, information from first-round screening of the fully degenerate library would probably be insufficient in itself to produce secondary libraries capable of deriving the unprimed-side motif. In such cases, prior knowledge of the major specificity position would be required to design a library that fixes sites on both sides of the cleaved bond. Unlike previously described approaches, mixture-based libraries are rapid, provide data for both the primed and unprimed positions, and are theoretically applicable to any protease that can digest peptide substrates.
  • The method described above in Example 1 was used to determine the cleavage site motif for anthrax lethal factor (LF).
  • Native enzyme was purified from culture supernatants of Bacillus anthracis strain Sterne according to published procedures (60).
  • An LF-specific library was used: acetyl-KKKPTPXXXXXAK (SEQ ID NO:67), where X indicates a degenerate position.
  • The library was prepared with the sequence MXXXXXPYPMEDK(K-biotin) (SEQ ID NO:68), where X indicates a degenerate position and K-biotin is a biotinyllysine residue.
  • The library bears fixed optimal residues at P1'-P4', and a proline residue was fixed at the P1 position, as mutation of this residue to alanine was reported to eliminate cleavage of MEK-1 by LF (61).
  • Using this secondary library we were able to determine the LF motif for the unprimed positions (Table 5).
  • LF selects basic residues (lysine, arginine, and histidine) at the P6-P4 positions, and has a strong preference for hydrophobic (particularly aromatic) residues at the P2 position. Subtle preferences for proline and valine were seen at P3.
  • The lethal factor cleavage site motif is provided as SEQ ID NO:69:
  • Primed positions were determined using the library acetyl-KKKPTPXXXXXAK (SEQ ID NO:67), and unprimed positions were determined using the secondary library MXXXXXPYPMEDK(K-biotin) (SEQ ID NO:68). Selection values shown in parentheses are the relative amount of a given amino acid found at a given sequencing cycle, normalized so that a value of 1 corresponds to the average quantity per amino acid in that cycle and would indicate no selection. Only positive selections of 1.2 or over are listed.
  • This peptide serves as a tool to allow determination ofthe potency of LF inhibitors in vitro. As the peptide allows for rapid and facile monitoring of LF activity, it is also suitable for use in high-throughput screens of chemical libraries for LF inhibitors.
  • The Vmax/KM for LF cleavage of this consensus substrate was found to be significantly (14-fold) higher than for cleavage of an analogous peptide derived from the LF cleavage site in MEK-1 (Mca-KKPTP↓IQLN-Dap(Dnp); SEQ ID NO:71).
  • The peptide library methodology thereby allowed us to produce a substrate with much improved properties over what would have been possible based on prior knowledge alone.
  • Although fluorogenic peptide substrates provide useful tools for evaluating the activity of proteases in vitro, a means for evaluating activity within living cells is also desirable, since this would allow direct screening for inhibitors that are both cell-permeant and metabolically stable, which are essential properties for clinically useful compounds.
  • The optimal cleavage motif data is used to prepare fluorescent reporters that can be used to monitor activity within living cells (Fig. 4).
  • The strategy takes advantage of recent advances in the development of enhanced green fluorescent protein (GFP) derivatives which exhibit a variety of spectral properties (70). For example, the emission spectrum of the enhanced cyan fluorescent protein (CFP) overlaps with the excitation spectrum of yellow fluorescent protein (YFP).
  • Mammalian expression constructs are generated that insert the LF optimal cleavage site between a CFP/YFP pair, as well as a GFP/red fluorescent protein (RFP) pair, which also exhibits fluorescence resonance energy transfer (FRET).
  • Similar fusions are generated where the LF cleavage site is scrambled and thus not susceptible to cleavage, as well as constructs using the foot-and-mouth disease virus 2A autocatalytic processing site, which should undergo constitutive cleavage (74).
  • These constructs are tested by transient expression in a cell type which can be efficiently transfected (e.g., COS cells or 293T cells) and the cells treated with PA plus varying concentrations of LF.
  • Cells are observed by fluorescence microscopy to monitor changes in the FRET ratio upon LF treatment. Upon observation of a significant and reproducible decrease in FRET in these preliminary experiments, cell lines which stably express these fusions are generated by standard protocols. Stable lines facilitate screening of inhibitors in a high throughput manner by providing a population in which all cells express the fluorescent constructs, and by eliminating the need for a transfection step.
  • Because LF is a metalloproteinase, a group which chelates the active site zinc ion is incorporated at either the amino- or carboxy-terminus of an optimized peptide.
  • Such inhibitors can achieve remarkable potency and specificity by virtue of an avidity effect in which two separate binding groups, the metal chelator and the peptide moiety, are linked in a single molecule (75, 76).
  • Because LF has significant selectivity on either side of the scissile bond, two types of inhibitors are generated and tested for their ability to inhibit LF.
  • One type incorporating unprimed residues is synthesized bearing a hydroxamic acid group in place of the carboxylic acid.
  • Solid phase synthesis of peptide hydroxamates is carried out according to well-established procedures by employing a commercially available hydroxylamine-bearing resin from which the growing peptide chain can be synthesized by standard Fmoc chemistry (77).
  • Substrate-analogous peptide hydroxamates have been generated as potent inhibitors of several families of metalloproteinases, including matrix metalloproteinases and astacins (78, 79).
  • The other type of inhibitor generated incorporates primed residues and bears amino-terminal thioacetyl groups. Such thioacetyl peptides make potent inhibitors of thermolysin, a bacterial metalloprotease related to LF (80).
  • Thioacetyl peptides are generated on the solid phase by coupling 2-(acetylthio)acetyl succinimide to the amino-terminus of the resin-bound, side chain-protected peptide followed by liberation of the free thiol with a standard Fmoc chemistry deprotection cycle (piperidine in dimethyl formamide).
  • Peptide derivatives are purified by reversed-phase HPLC. Inhibitors of varying peptide length (3 to 5 amino acid residues) are synthesized to optimize this parameter empirically.
  • Inhibitors based on the P4-P1 residues of the optimal substrate, N-acetyl-Lys-Nal-Tyr-Pro-hydroxamic acid (SEQ ID NO:72) and N-acetyl-Lys-Val-Tyr-βAla-hydroxamic acid (SEQ ID NO:73), are prepared.
  • The potency of the candidate inhibitors in the inhibition of LF is initially determined in vitro, and their specificity for LF is evaluated by assaying for their ability to inhibit other metalloproteases. Next, the ability of the candidate inhibitors to prevent lysis of cultured macrophages treated with LT is evaluated. Compounds which perform well in cell culture are tested for their ability to protect mice from a lethal challenge with LT.
  • PTH phenylthiohydantoin
  • Amino acids close to the LF cleavage site are investigated. Accordingly, hydrophobic aliphatic and aromatic residues as well as proline analogs are investigated, in keeping with the general properties of the residues selected by LF in the P3 to P3' positions. A representative group of such amino acids is shown below.
  • The unnatural amino acid-containing libraries will be of similar complexity to our natural amino acid-containing libraries (about 20 distinct amino acids).
  • Four separate unnatural amino acid mixtures are prepared so that roughly 80 unnatural amino acids may be evaluated at each site.
  • Mixtures also include the optimal natural amino acid residue for each position to allow us to determine if any unnatural amino acid is an improvement over the natural one at a given position.
  • Two libraries are synthesized wherein either all of the primed or all of the unprimed positions are fixed to ensure that cleavage occurs at the intended scissile bond.
  • A library with the sequence KKKPYPXXXXGK (SEQ ID NO:75) was prepared in which the degenerate positions X contain a mixture of the following amino acids: A, Y, P, V, M, K, aminobutyric acid, allylglycine, S-methylcysteine, norvaline, norleucine, p-chlorophenylalanine, S-benzylcysteine, S-methoxybenzylcysteine, and β-cyclohexylalanine.
  • The results are summarized in the table below.
  • Nle norleucine
  • Cys(Me) S-methylcysteine
  • Nva norvaline
  • Cl-Phe p-chlorophenylalanine
  • Chx-Ala β-cyclohexylalanine
  • Cys(Bzl) S-benzylcysteine
  • Allylgly allylglycine.
  • The numbers in parentheses represent the preference values calculated as described in the Examples above.
  • The sequence of the LF cleavage site motif determined using the SEQ ID NO:75 library containing unnatural amino acids is KKKPYPXaa1Xaa2Xaa3Xaa4GK (SEQ ID NO:76), wherein the cleavage site primed amino acids (P1'-P2'-P3') are Xaa1-Xaa2-Xaa3.
  • At P1' and P3', unnatural amino acids were favored over the most highly selected natural ones.
  • The best natural amino acid in P1', methionine, differs from the best natural amino acid selected by LF at P1' when using the previous library KKKPTPXXXXXAK (SEQ ID NO:67), which was tyrosine. This is likely a consequence of fixing tyrosine at P2 in the newer library and suggests that application of these libraries in an iterative manner can be used for substrate optimization. Any novel selections which arise from the unnatural libraries are confirmed by incorporating them into fluorogenic peptide substrates to see if the new substrate is indeed an improvement over the previously defined consensus peptide. Hydroxamic acid and thioacetyl-peptide inhibitors similar to those described above also are synthesized and evaluated.
  • The library method described above selects for efficient substrates, which must undergo turnover, and not for tight-binding peptides per se.
  • Libraries are screened directly for peptides which bind to LF. This approach was used previously to generate a specific peptide inhibitor of the protein tyrosine kinase ZAP-70 (83).
  • Peptide library mixtures are synthesized bearing a carboxy-terminal hydroxamic acid group, which will serve to orient the library.
  • The library (for example, MAXXXXXX-hydroxamate; SEQ ID NO:77) is applied to a column containing immobilized LF, the column is washed extensively, and bound peptides are eluted with either low pH or a metal chelator.
  • The bound pool is then sequenced as usual to determine the preferences at each site for LF inhibitors. If necessary, analogous libraries are made containing the same unnatural amino acid mixtures which were used in the substrate screens described above. Consensus peptide hydroxamates are individually synthesized and evaluated as LF inhibitors.
  • Candidate LF inhibitors produced as a result of the library screens will be evaluated in a number of assays. Initially, Ki values for inhibition of LF cleavage of the peptide substrate are determined by fluorometric assay by titrating the concentration of inhibitor under initial rate conditions, using a fixed substrate concentration well below the KM. Compounds also are tested for their ability to inhibit the cleavage of a known protein substrate in cell lysates. We have been able to cleanly assay cleavage of MEK-4 by LF in cell lysates by immunoblotting. Compounds that do well in vitro also are tested for their ability to inhibit LF in live cells using the FRET substrate described above.
  • LF inhibition in cells is assayed by following MEK-4 cleavage in extracts from cells treated with LF plus PA in the presence of varying concentrations of inhibitor. Finally, the compounds are evaluated for their ability to inhibit lysis of macrophage cell lines by LeTx.
  • Cleavage site motif data is used to identify downstream substrates of the proteases analyzed.
  • Knowledge of the substrates for a protease is crucial to understanding its function at the molecular level, and may provide additional targets for therapeutic intervention.
  • Our laboratory has recently developed a World Wide Web-accessible computer program called Scansite (http://scansite.mit.edu/) for searching protein sequence databases for the presence of short peptide motifs (27).
  • Scansite offers substantial improvements over previously existing sequence database searching programs such as BLAST which are better suited to longer individual sequences.
  • Scansite searches are performed using matrices (of weighted amino acid preference by cleavage site position) corresponding to the cleavage site for the protease of interest against protein sequences in public databases belonging either to the organism itself or to the mammalian host as appropriate.
  • MEK-1 cleavage by LF is reported to impair its ability to activate Erk.
  • LeTx treatment of macrophages results in a strong though transient activation of Erk (62).
  • Proteins for which antibodies are available are tested by probing LF-treated lysates against untreated lysates on immunoblots. When antibodies are not available, cDNA clones encoding the protein of interest are acquired and used to construct either epitope-tagged mammalian expression vectors or bacterial GST-fusion constructs, which are used to evaluate whether the protein can be cleaved by LF.
  • A risk associated with 2D electrophoresis approaches is that many cellular proteins will escape detection due to inadequate sensitivity of silver staining or inefficient separation.
  • An alternative approach is undertaken using a recently developed expression cloning method based on the screening of small cDNA pools (86).
  • Small pool screening has been used successfully in a number of contexts, including the identification of substrates for caspase family proteases (87).
  • A cDNA library is subdivided into pools containing approximately 100 clones apiece. Pools are transcribed and translated in a reticulocyte lysate fed [35S]-methionine to metabolically label the proteins synthesized.
  • nonstructural proteinase is in the C-terminal half of nsP2 and functions both in cis and in trans. J. Virol. 63, 4653-4664 (1989).
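The selection values described above (the amount of each amino acid in a sequencing cycle, normalized so that 1 means no selection, with only values of 1.2 or over listed) and the matrix-based database searches can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the picomole yields are invented, the four-letter alphabet stands in for the 19 residues of a real library, and the log-sum window score is a generic position-weight scheme, not Scansite's actual scoring function.

```python
import math

# Hypothetical picomole yields per amino acid for two Edman cycles
# (cycle 1 ~ P1', cycle 2 ~ P2'). These numbers are illustrative only.
EXAMPLE_CYCLES = [
    {"L": 30.0, "M": 20.0, "A": 5.0, "G": 5.0},
    {"L": 10.0, "M": 10.0, "A": 10.0, "G": 10.0},
]


def selection_values(cycle_amounts):
    """Normalize one cycle so the average amino acid has a value of 1."""
    mean = sum(cycle_amounts.values()) / len(cycle_amounts)
    return {aa: amount / mean for aa, amount in cycle_amounts.items()}


def positive_selections(values, threshold=1.2):
    """Keep only residues selected at or above the reporting threshold."""
    return {aa: v for aa, v in values.items() if v >= threshold}


def build_matrix(cycles):
    """One column of selection values per sequencing cycle (motif position)."""
    return [selection_values(cycle) for cycle in cycles]


def score_window(matrix, window, floor=0.01):
    """Sum of log2 selection values; unlisted residues count as 1 (no selection).

    The floor avoids log(0) for residues that were never recovered.
    """
    total = 0.0
    for position, residue in enumerate(window):
        value = matrix[position].get(residue, 1.0)
        total += math.log2(max(value, floor))
    return total


def best_site(matrix, sequence):
    """Return (score, start index) of the best-scoring window.

    Assumes the sequence is at least as long as the matrix.
    """
    width = len(matrix)
    scored = [
        (score_window(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    matrix = build_matrix(EXAMPLE_CYCLES)
    print(positive_selections(matrix[0]))
    print(best_site(matrix, "GALGA"))
```

Scoring in log space makes a value of 1 contribute nothing, so positions without selection do not influence the ranking of candidate sites.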

Abstract

The invention provides methods for rapidly determining protease cleavage site motifs using a mixture-based oriented peptide library approach. The cleavage site motif for a protease involves residues both amino- and carboxy-terminal to the scissile bond (the unprimed and primed sides, respectively). The methods involve the initial determination of the primed side motif and the successive determination of the unprimed side motif. Iterative application of the methods is also provided. Substrates and inhibitors of proteases that include or compete for the cleavage site motifs determined using the methods also are provided, as are methods and compositions for using these substrates and inhibitors.

Description

METHODS FOR DETERMINING PROTEASE CLEAVAGE SITE MOTIFS
Government Support
This work was funded in part by the National Institutes of Health under grant numbers GM56203 and F32 GM19895-01. The government may have certain rights in this invention.
Background of the Invention
Proteolytic enzymes play widespread roles in a variety of essential biological processes, both as nonspecific mediators of protein degradation and in the catalysis of specific cleavage events. Misregulation of proteolysis has been implicated in neoplastic, autoimmune, and infectious disease, and a number of pathogens carry proteases essential for viability or infectivity. Proteases are therefore attractive targets for drug design, and several protease inhibitors are in clinical use. Currently inhibitors of two proteolytic enzymes, angiotensin converting enzyme and HIV protease, are in widespread clinical use (40, 41). A growing number of additional protease inhibitor drugs are in development, with several currently in clinical trials, such as matrix metalloprotease (MMP) inhibitors for cancer and arthritis (14). Proteases are therefore emerging as promising and general targets for therapeutic intervention.
One area of therapy that is particularly in need of improved agents is the treatment of infectious agents used in biological warfare. Most infectious agents regarded as potentially threatening in biological warfare or bioterrorism produce proteolytic enzymes essential for viability, reproduction, or virulence of the organism. Deletion of the gene encoding the Pla serine protease produced by the plague bacterium Yersinia pestis, for example, increases by one million fold the lethal dose of the bacterium for mice (42). The Yersinia YopJ protein is a cysteine protease and virulence factor which is important for evasion of the host immune system by the pathogen (43). Many bacteria, including Brucella and Yersinia species, produce the HtrA protease, which leads to decreased survival of the bacterium in mice when inactivated (44). The H1L gene product of variola virus, which causes smallpox, is 98% identical to vaccinia virus protease, a metalloprotease involved in viral polyprotein processing and essential for maturation of the viral particle (45, 46). The equine encephalitis viruses carry homologs of the sindbis virus nsP2 protease responsible for processing of a nonstructural polyprotein to produce essential components for replication of the viral genome (47, 48). Viral agents which apparently do not carry genes for proteases have been shown in some cases to employ host proteases in their life cycle, as in the processing of envelope glycoproteins of the Ebola and Marburg hemorrhagic fever viruses (49). Such proteases offer an unexplored avenue for the development of drugs which would be useful as therapy in the event of exposure to biological weapons.
Protease inhibition is particularly promising as a strategy for the treatment of anthrax. Inhalation of Bacillus anthracis spores gives rise to systemic anthrax, a condition nearly always fatal in humans (50). Spores germinate within macrophages and emerge as rapidly dividing vegetative bacteria. Within several days the bacteria spread to the bloodstream, where they multiply to high levels, producing a toxin which results in death of the host. Anthrax toxin is comprised of three protein components, protective antigen (PA), edema factor (EF) and lethal factor (LF), which are active in binary combinations (51). The combination of PA and EF, also called edema toxin, impairs neutrophil function and gives rise to edema associated with cutaneous anthrax. PA and LF together form lethal toxin (LeTx), which appears to be the major cause of death in systemic anthrax infection. Intravenous injection of LeTx is alone sufficient to cause death in experimental animals, and strains lacking LF or PA are greatly attenuated (52, 53). The crucial cellular target for LeTx appears to be the macrophage (54). Treatment of macrophages or macrophage cell lines with LeTx results in high levels of inflammatory cytokine production, activation of the oxidative burst, and eventual cell lysis (54-56). These effects are likely to contribute to death from infection by crippling host defense against the pathogen and by causing a shock-like syndrome.
LeTx functions as a classical two-component bacterial toxin, with PA acting to translocate the enzymatically active component, LF, into the cytosol (Fig. 1) (51). PA binds to the surface of target cells by interaction with an unidentified receptor. Subsequent cleavage by furin or a furin-like proprotein convertase enzyme removes a 20 kDa fragment to generate the N-terminally truncated PA63 (57). PA63 assembles into a heptameric ring structure which binds to LF (58, 59). Upon endocytosis, the acidic endosomal environment triggers conversion of the PA63 heptamer into a pore which facilitates entry of bound LF into the cell cytosol. LF is a zinc-dependent metalloprotease belonging to the same superfamily (clan MA) as the prototypical bacterial protease thermolysin (60-62). As point mutants at the LF catalytic center which result in loss of protease activity also impair the ability of LF to cause lysis of macrophages, the cleavage of proteins in the host cell cytosol appears to be essential for its biological activity (63). The only known substrates of LF identified thus far are MEK (or MAP kinase kinase) family protein kinases (61, 62, 64, 65). To date, six distinct MEKs (MEK-1, -2, -3, -4, -6 and -7) have been found to be LF substrates.
The only currently available therapy for anthrax involves antibiotics, which have a low success rate against the systemic form of the disease (50). Anthrax vaccines have been controversial due to issues with safety and efficacy and are currently not administered widely to civilians. Thus a need exists for alternative strategies to treat anthrax infection. Experimental therapies including a dominant negative form of PA which inhibits translocation have shown efficacy in laboratory animals but have not yet entered human trials (67). The crucial role of LF in the pathogenesis of anthrax suggests that it would make an ideal target for pharmacological intervention. Given the success of protease inhibition as a therapeutic strategy in other systems, the development of compounds which inhibit LF protease activity is an attractive strategy for producing anthrax drugs. To date, no specific inhibitors of LF have been reported. Accordingly, there is a need for a greater knowledge of protease cleavage site motifs that would permit the design of additional protease inhibitors. In particular, there is a need for inhibitors of proteases of human pathogens, including the B. anthracis anthrax lethal factor protease.
A number of methods for determining protease substrate specificity on the basis of peptide libraries have emerged recently. These include substrate phage display libraries, positional-scanning peptide libraries, and mixture-based peptide libraries. Phage display (1, 2), while generally applicable, is laborious and generates only a set of efficiently cleaved substrates rather than an exhaustive evaluation of the presence of each amino acid residue at each position around the cleavage site. Positional-scanning synthetic peptide libraries (3-5), based on detection of cleavage by the release of a C-terminal fluorogenic group, are rapid and enable analysis of all possible peptide sequences. Currently these libraries only provide sequence specificity N-terminal to the scissile bond and cannot be used with proteases that require amino acid residues C-terminal to the cleavage site; this restricts their use largely to serine and cysteine proteases. Digestion of library mixtures followed by N-terminal sequencing has been used to provide specificity information C-terminal to the cleavage site (6-8). Liquid chromatography-mass spectrometry analysis of peptide mixtures has also been described, but requires that a separate library for each position relative to the scissile bond be made in the context of a previously known good peptide substrate (9). Therefore current methods for determining protease cleavage sites lack both general applicability and speed. Accordingly, there is a need for improved methods of determining the cleavage site motifs of a variety of proteases.
Summary of the Invention
The invention provides novel methodology for the rapid determination of protease cleavage site motifs using a mixture-based oriented peptide library approach. The cleavage site motif for a protease involves residues both amino- and carboxy-terminal to the scissile bond (the unprimed and primed sides, respectively, where the cleavage site for a protease is defined as ...P3-P2-P1-P1'-P2'-P3'..., and cleavage occurs between the P1 and P1' residues). The methods involve the initial determination of the primed side motif and the successive determination of the unprimed side motif. The primed side motif is preferably determined by partial digestion of a completely random mixture of peptides (preferably dodecamers) blocked (e.g., acetylated) at the amino terminus. The digested mixture is subjected to amino-terminal sequencing by Edman degradation. Unreacted intact peptides and the amino-terminal fragments of reacted peptides remain blocked and do not contribute to the sequenced pool; only the carboxy-terminal fragments are sequenced. The relative amounts of each amino acid present in a given cycle indicate the preference for that residue at a particular site, so that the first sequencing cycle affords information about the P1' position, the second cycle about the P2' position, and so on. Subsequent determination of the specificity at the unprimed positions depends on data generated from the initial screen as follows. A second peptide library is synthesized which fixes one or more of the primed positions by incorporating optimal amino acid residues determined in the initial screen. The fixed positions are preceded by several degenerate residues which correspond to the unprimed positions. The library preferably is prepared with the amino terminus free and with a carboxy-terminal tag (e.g., biotin) to permit removal of the uncleaved peptides and the carboxy-terminal portion of the peptides in the library after protease cleavage.
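As an illustration of how data from the initial screen could feed the design of the second library, the following Python sketch labels each Edman cycle with its primed position and fixes the most abundant residue at each position. The function names, the choice of the single top-scoring residue, the default of five degenerate positions, and the form of the tag string are assumptions made for illustration only; they are not prescribed by the methods described above.

```python
from typing import Dict, List


def primed_position(cycle: int) -> str:
    """Map a 1-based Edman cycle of the C-terminal fragments to P1', P2', ..."""
    if cycle < 1:
        raise ValueError("sequencing cycles are numbered from 1")
    return f"P{cycle}'"


def best_primed_residues(cycles: List[Dict[str, float]]) -> Dict[str, str]:
    """Pick the most abundant residue in each cycle (hypothetical selection rule).

    `cycles[i]` maps one-letter residue codes to normalized abundance
    for sequencing cycle i + 1.
    """
    return {
        primed_position(i): max(cycle, key=cycle.get)
        for i, cycle in enumerate(cycles, start=1)
    }


def second_library_template(
    cycles: List[Dict[str, float]],
    n_degenerate: int = 5,
    tag: str = "(K-biotin)",
) -> str:
    """Degenerate unprimed positions (X), then fixed primed residues, then a C-terminal tag."""
    fixed = "".join(best_primed_residues(cycles).values())
    return "X" * n_degenerate + fixed + tag
```

With made-up abundances such as `[{"L": 2.4, "A": 0.7}, {"R": 1.8, "G": 1.1}]`, the sketch fixes Leu at P1' and Arg at P2' and places five degenerate positions ahead of them.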
To obtain the unprimed side motif, the library is partially digested with the protease, the reaction mixture is quenched, and undigested peptides and carboxy-terminal fragments which retain the carboxy-terminal tag are removed (e.g., biotin-tagged fragments are removed with immobilized avidin). The remaining amino-terminal fragments are subjected to amino-terminal sequencing, and the selectivities are determined from the relative abundance of each amino acid in a given sequencing cycle (preference values for particular amino acids) as before.
According to one aspect of the invention, methods for determining an amino acid sequence motif for a cleavage site of a protease are provided. The methods include: a) contacting the protease with a peptide library containing one or more degenerate residues under conditions which allow for cleavage of a substrate by the protease; b) allowing the protease to cleave peptides within the degenerate peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; c) determining the amino acid sequences of the population of cleaved carboxy-terminal peptides; and d) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
In some embodiments, the methods also include isolating the population of cleaved carboxy-terminal peptides from the non-cleaved peptides and cleaved amino-terminal peptides. In other embodiments, the degenerate peptide library is a soluble synthetic peptide library and/or the peptide library contains all degenerate amino acid residues. In certain preferred embodiments, the peptide library will omit cysteine residues to avoid the formation of disulfide bonds.
In preferred embodiments, the peptides of the degenerate peptide library are blocked at their N-termini to prevent Edman degradation. In other embodiments, the peptides of the degenerate peptide library are labeled at their N-termini and/or C-termini with a binding molecule, preferably biotin. For peptides labeled at both termini, it is preferred that the N-termini are labeled with a first binding molecule and the C-termini are labeled with a second binding molecule. In such methods, the cleaved carboxy-terminal peptides are isolated from the non-cleaved peptides and cleaved amino-terminal peptides by contacting the population of cleaved peptides with a substrate that binds the first binding molecule.
In another aspect of the invention, the methods for determining the protease cleavage site motif include determining the N-terminal (unprimed) residues of the cleavage site. The knowledge of the C-terminal (primed) residues of the cleavage site is used to orient a second library with respect to the cleavage site.
Such methods include: a) obtaining a second peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues carboxy-terminal to a scissile peptide bond, and one or more degenerate residues amino-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined for the C-terminal (primed) residues; b) contacting the protease with the second peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the second peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved amino-terminal peptides from non-cleaved peptides and cleaved carboxy-terminal peptides; e) determining the amino acid sequences of the population of cleaved amino-terminal peptides; and f) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved amino-terminal peptides.
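Because the amino-terminal fragments are sequenced from the free amino terminus of the library peptide, the cycle number counts inward toward the scissile bond rather than outward from it. The short Python sketch below shows one way to translate a cycle number into an unprimed position label. It assumes a library whose amino-terminal portion consists of a fixed leader of known length followed by a run of degenerate residues ending at P1 (for example, the MAXXXXX portion of a library of the kind listed for Fig. 2B); the function and parameter names are hypothetical.

```python
def unprimed_position(cycle: int, leader_len: int, n_degenerate: int) -> str:
    """Translate a 1-based Edman cycle into an unprimed position label.

    Assumes the residue immediately amino-terminal to the scissile bond (P1)
    is the last degenerate residue, i.e. residue number leader_len + n_degenerate.
    """
    last_unprimed = leader_len + n_degenerate
    if not 1 <= cycle <= last_unprimed:
        raise ValueError("cycle falls outside the amino-terminal fragment")
    return f"P{last_unprimed - cycle + 1}"
```

Under these assumptions, a two-residue leader followed by five degenerate positions places cycle 3 at P5 and cycle 7 at P1.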
In some preferred embodiments, the second peptide library is a soluble synthetic peptide library. In other preferred embodiments, the amino termini of the peptides in the second peptide library are unblocked. Libraries with blocked termini can be used if more convenient, in which case the step of determining the amino acid sequences comprises unblocking the amino termini prior to sequencing the peptides.
In other preferred embodiments, the step of separating cleaved amino-terminal peptides and cleaved carboxy-terminal peptides comprises affinity isolation of the uncleaved peptides and the cleaved carboxy-terminal peptides from the cleaved amino-terminal peptides, preferably by biotin-avidin binding.
In some embodiments of the methods, the degenerate (first) peptide library comprises peptides comprising the formula (Xaa)n (SEQ ID NO: 104). In these libraries, Xaa is any amino acid and n is preferably an integer from 3-20 inclusive.
In other embodiments of the methods, the protease cleaves a peptide before or after a known amino acid Zaa and the degenerate peptide library comprises peptides comprising the formula (Xaa)n-Zaa-(Xaa)m (SEQ ID NO: 105). In these libraries, Zaa is a non-degenerate amino acid (P1 or P1') that forms part of the scissile bond, Xaa is any amino acid and n and m preferably are integers from 1-10 inclusive.
In still other embodiments of the methods, the degenerate peptide library comprises peptides comprising the formula (Zaa)n-(Xaa)m (SEQ ID NO: 106). In these libraries, Zaa is a non-degenerate amino acid amino-terminal to a scissile bond, Xaa is any amino acid and n and m preferably are integers from 1-10 inclusive.
In further embodiments of the methods, the second peptide library comprises peptides comprising the formula (Xaa)n-(Zaa)m (SEQ ID NO: 107). In these libraries, Zaa is an amino acid carboxy-terminal to a scissile bond (primed amino acid), Xaa is an amino acid amino-terminal to the scissile bond (unprimed amino acid), and n and m preferably are integers from 1-10 inclusive. In these libraries, each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
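The library formulas above can be read as simple templates in which each Xaa contributes a full set of residues and each Zaa contributes exactly one. The following Python sketch expands such a formula into a one-letter template and counts the number of distinct sequences it represents. The helper names are hypothetical, and the option of omitting cysteine reflects the preference stated earlier for avoiding disulfide bond formation; the count is a theoretical diversity only and says nothing about how evenly the residues are represented in a synthesized mixture.

```python
STANDARD_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"  # the 20 common amino acids


def degenerate_alphabet(omit_cysteine: bool = True) -> str:
    """Residues allowed at each degenerate (Xaa) position."""
    return STANDARD_RESIDUES.replace("C", "") if omit_cysteine else STANDARD_RESIDUES


def expand_formula(n: int, fixed: str = "", m: int = 0) -> str:
    """Write (Xaa)n-(Zaa...)-(Xaa)m as a template, with 'X' marking degenerate positions."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be non-negative")
    return "X" * n + fixed + "X" * m


def theoretical_diversity(template: str, alphabet: str) -> int:
    """Number of distinct sequences encoded by the template."""
    return len(alphabet) ** template.count("X")
```

For example, `expand_formula(5, "LRG")` gives `XXXXXLRG`, which with a 19-residue alphabet encodes 19 to the fifth power distinct sequences.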
According to a further aspect of the invention, the methods described herein are used in an iterative fashion to further determine protease cleavage site motifs. For example, the information gained from the use of the first (degenerate) library and the second (oriented) library is used to re-examine the sequence of the C-terminal (primed) residues of the cleavage site. In these embodiments, the methods include: a) preparing a third peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues amino-terminal to a scissile peptide bond, and one or more degenerate residues carboxy-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined in claim 10; b) contacting the protease with the third peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the third peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved carboxy-terminal peptides from non-cleaved peptides and cleaved amino-terminal peptides; e) determining the amino acid sequences of the population of cleaved carboxy-terminal peptides; and f) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved carboxy-terminal peptides.
In some embodiments, the third peptide library comprises peptides comprising the formula (Zaa)n-(Xaa)m (SEQ ID NO: 108). In these libraries, Xaa is any amino acid and is an amino acid carboxy-terminal to a scissile bond (primed amino acid), Zaa is an amino acid that is amino-terminal to the scissile bond (unprimed amino acid), and n and m preferably are integers from 1-10 inclusive. Further, each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved amino-terminal peptides. In other embodiments, some of the Xaa amino acids can be non-degenerate, in accordance with the information determined from cleavage of the first library. In certain embodiments of the foregoing methods, the peptides within the peptide library do not contain cysteine residues.
In some preferred embodiments of the foregoing methods, the protease is a matrix metalloproteinase. In other preferred embodiments of the foregoing methods, the protease is a proteolytic enzyme that mediates the pathogenesis of a pathogen; pathogens include biological warfare agents. In such embodiments, preferred proteases are selected from the group consisting of lethal factor of B. anthracis, Pla and YopJ proteases of Yersinia, and the smallpox H1L metalloprotease. Most preferably the protease is lethal factor of B. anthracis.
In other embodiments of the foregoing methods, the protease is selected from the group consisting of proteases of pathogenic organisms, cathepsin family proteases, tumor necrosis factor-alpha converting enzyme (TACE), calpains, caspases, beta-site amyloid precursor protein-cleaving enzyme (BACE; beta-secretase), presenilins, membrane-type serine proteases, furin and other proprotein convertases, proteasome components, and proteases affecting the blood clotting cascade. Other proteases include cysteine proteases, aspartyl proteases and serine proteases.
In still other preferred embodiments of the foregoing methods, the amino acid sequence motif for a cleavage site of the protease is determined by calculating a preference value for each amino acid at each degenerate position, wherein the preference value for a particular amino acid is determined by dividing the amount of the particular amino acid by the average amount per amino acid in that cycle to obtain a first value for the particular amino acid, and then dividing each first value by the relative amount of that particular amino acid in the starting mixture, and selecting amino acid residues that have a preference value of greater than 1.0 at a degenerate position for inclusion at a position corresponding to the degenerate position in the amino acid sequence motif.
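The preference value calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the "relative amount in the starting mixture" is assumed here to be the amount of each amino acid in the undigested library divided by the average amount per amino acid in that library, so that an evenly represented residue has a relative amount of 1.0.

```python
from typing import Dict, List


def preference_values(
    cycle_amounts: Dict[str, float],
    starting_amounts: Dict[str, float],
) -> Dict[str, float]:
    """Preference value of each amino acid for one sequencing cycle.

    cycle_amounts: amount of each amino acid recovered in this cycle.
    starting_amounts: amount of each amino acid in the starting mixture
    (assumed normalization: divided by its own per-residue average).
    """
    mean_cycle = sum(cycle_amounts.values()) / len(cycle_amounts)
    mean_start = sum(starting_amounts.values()) / len(starting_amounts)
    prefs = {}
    for aa, amount in cycle_amounts.items():
        first_value = amount / mean_cycle              # 1.0 means no selectivity
        relative_start = starting_amounts[aa] / mean_start
        prefs[aa] = first_value / relative_start       # corrects for library bias
    return prefs


def selected_residues(prefs: Dict[str, float], threshold: float = 1.0) -> List[str]:
    """Residues whose preference value exceeds the threshold (strictly greater)."""
    return sorted(aa for aa, value in prefs.items() if value > threshold)
```

With illustrative amounts of 30, 10 and 20 for Leu, Ala and Gly in one cycle and an evenly represented starting mixture, the preference values are 1.5, 0.5 and 1.0, and only Leu is selected for the motif at that position.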
According to another aspect of the invention, protease inhibitors or protease substrates including a sequence determined according to the foregoing methods are provided.
In another aspect of the invention, inhibitors of matrix metalloproteinase protease activity are provided. The inhibitors include a noncleavable peptide molecule comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6, or a fragment thereof that inhibits matrix metalloproteinase protease activity.
Also provided are inhibitors of Bacillus anthracis lethal factor protease activity. The inhibitors include a noncleavable peptide molecule comprising SEQ ID NO:69, or a fragment thereof that inhibits lethal factor protease activity. Preferably the amino acid sequence comprises SEQ ID NO:70.
In further aspects of the invention, inhibitors of Bacillus anthracis lethal factor protease activity are provided that consist essentially of a compound selected from the group consisting of 2-thioacetyl-Tyr-Pro-Met-amide, α-acetyl-Lys-Val-Tyr-Pro-hydroxamic acid (SEQ ID NO:72), α-acetyl-Lys-Val-Tyr-βAla-hydroxamic acid (SEQ ID NO:73) and α-acetyl-Lys-Pro-Thr-Pro-hydroxamic acid (SEQ ID NO:74).
In yet another aspect of the invention, inhibitors of Bacillus anthracis lethal factor protease activity are provided that include SEQ ID NO:76, or a fragment thereof that inhibits lethal factor proteolytic activity.
In preferred embodiments, the inhibitors include at least one group that chelates the active site metal ion incorporated at either the amino-terminus or the carboxy-terminus. Preferred groups that chelate the active site metal ion are selected from the group consisting of thioacetyl groups, carboxylate groups, phosphonate groups, phosphoramidate groups and hydroxamic acids. For peptide inhibitors, preferred inhibitors are peptides or peptide analogs consisting of 3-25 amino acids. Inhibitors of protease activity that compete for binding to the protease with the foregoing inhibitors also are provided in another aspect of the invention, as are compositions comprising any of the foregoing inhibitors (including the competitive inhibitors) and a pharmaceutically acceptable carrier.
According to yet another aspect of the invention, methods for determining an amino acid sequence motif for a binding site of a protease are provided. The methods include: a) contacting the protease with an oriented peptide library containing one or more degenerate residues under conditions which allow for binding of a substrate by the protease; b) allowing the protease to bind peptides within the degenerate peptide library having a binding site for the protease to form protease-peptide complexes; c) isolating the protease-peptide complexes from the unbound peptides; d) releasing the peptides from the protease-peptide complexes; e) isolating the peptides previously bound to the protease; f) determining the amino acid sequences of the peptides; and g) determining an amino acid sequence motif for a binding site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the peptides.
In certain embodiments, the peptides in the oriented peptide library include a carboxy-terminal hydroxamic acid group. In preferred embodiments, the peptides include the amino acid sequence MAXXXXXX-hydroxamate (SEQ ID NO:77). In other embodiments, the peptide library is contacted with the protease by application of the library to a substrate to which the protease is immobilized. In yet other embodiments, the protease-peptide complexes are isolated by washing the protease-peptide complexes in a buffer that permits binding. In still other embodiments, the peptides are eluted from the protease-peptide complexes by incubating the protease-peptide complexes with an elution solution. Preferably the elution solution comprises either low pH or a metal chelator.
According to a further aspect of the invention, protease binding molecules are provided that include an amino acid sequence motif for a binding site of a protease determined according to the foregoing methods.
In still another aspect of the invention, intramolecularly-quenched fluorogenic peptide protease substrates are provided. The substrates include a lethal factor protease cleavage motif sequence or a matrix metalloprotease cleavage motif flanked by a fluorescent group and a fluorescence quenching moiety.
In some embodiments, the fluorescent group is attached to the motif sequence at the amino terminus and the quenching moiety is attached to the peptide at the carboxy terminus. Preferred amino-terminal fluorescent groups include a methoxycoumarinacetyl (Mca) group, and preferred carboxy-terminal quenching moieties include a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety. Preferably the Mca and Dap(Dnp) are used together. Other fluorescent groups and quenchers include aminobenzoyl groups or a tryptophan residue as the fluorophore with either a dinitrophenyl group or a nitrotyrosine group as the quencher, and Edans (5-(2-aminoethyl)aminonaphthalene-1-sulfonic acid) as the fluorophore with dabcyl (4-(4-dimethylaminophenylazo)benzoic acid) as the quencher. Still other fluorogenic reagents include those where the fluorophore is at the C-terminus. Upon cleavage, there is an increase in fluorescence. Fluorogenic reagents of this type include aminomethylcoumarins or aminonaphthalenesulfonamides.
According to yet another aspect of the invention, intramolecularly-quenched fluorogenic protease substrates are provided. The substrates include a lethal factor protease cleavage motif sequence or a matrix metalloprotease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra. Preferably the fluorescent proteins are cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP), or green fluorescent protein (GFP) and red fluorescent protein (RFP).
In another aspect of the invention, protease substrates are provided that contain the cleavage site for a protease of interest placed between the transmembrane segment of a membrane-anchored transcription factor and its transcriptional activation domain, which allows release of the transcriptional activation domain to be regulated by the protease.
In some preferred embodiments of the foregoing substrates, the lethal factor protease cleavage motif sequence includes SEQ ID NO:69, and more preferably the motif sequence is SEQ ID NO:70. Thus a particularly preferred inhibitor is Mca-KKVYPYPME-Dap(Dnp). In other preferred embodiments of the foregoing substrates, the matrix metalloprotease cleavage motif sequence includes an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
According to another aspect of the invention, methods for identifying protease inhibitors are provided. The methods include: a) providing a protease and a cleavable protease substrate, wherein the uncleaved substrate is distinguishable from the cleaved substrate, wherein the cleavable protease substrate comprises a motif sequence determined according to any of the methods of the invention, b) contacting the protease with a candidate protease inhibitor compound and the cleavable substrate under conditions that permit cleavage of the substrate, and c) detecting the amounts of cleaved and uncleaved substrate as a measure of the presence of a protease inhibitor, wherein detection of a lesser amount of cleaved substrate than is present when the protease is not contacted with the candidate protease inhibitor compound indicates that the candidate protease inhibitor compound is a protease inhibitor.
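The comparison in step c) can be expressed as a percent inhibition relative to a control reaction run without the candidate compound. The Python sketch below assumes that the amount of cleaved substrate is reported as a single signal (for example, a fluorescence reading from a quenched substrate) and that an optional background reading from a no-protease blank is subtracted; the function names, the background term, and the numbers in the usage note are illustrative assumptions rather than part of the methods described above.

```python
def percent_inhibition(
    signal_with_candidate: float,
    signal_without_candidate: float,
    background: float = 0.0,
) -> float:
    """Percent reduction in cleaved-substrate signal caused by the candidate compound."""
    control = signal_without_candidate - background
    if control <= 0:
        raise ValueError("control reaction shows no cleavage above background")
    treated = signal_with_candidate - background
    return 100.0 * (1.0 - treated / control)


def is_inhibitor(
    signal_with_candidate: float,
    signal_without_candidate: float,
    background: float = 0.0,
    min_percent: float = 0.0,
) -> bool:
    """True when less cleaved substrate is detected than in the control.

    min_percent is a hypothetical cutoff a screen might apply to ignore noise.
    """
    inhibition = percent_inhibition(
        signal_with_candidate, signal_without_candidate, background
    )
    return inhibition > min_percent
```

For instance, a reading of 250 units with the candidate against 1000 units without it corresponds to 75 percent inhibition; in practice a screen would likely set `min_percent` well above zero to allow for assay variability.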
In certain embodiments, the cleavable protease substrate is an intramolecularly-quenched fluorogenic peptide protease substrate comprising a protease cleavage motif sequence flanked by a fluorescent group and a fluorescence quenching moiety, or an intramolecularly-quenched fluorogenic protease substrate comprising a protease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra. Preferred protease cleavage motifs in the substrates include lethal factor protease cleavage motif sequences comprising SEQ ID NO:69, preferably SEQ ID NO:70, and matrix metalloprotease cleavage motif sequences comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
According to a further aspect ofthe invention, methods for identifying protease inhibitors are provided. The methods include: a) providing a protease, a protease inhibitor that binds the protease, and a candidate protease inhibitor compound, b) contacting the protease with the candidate protease inhibitor compound and the protease inhibitor under conditions that permit binding ofthe protease inhibitor to the protease, wherein either or both ofthe candidate protease inhibitor compound and the protease inhibitor are detectable, and wherein either or both ofthe candidate protease inhibitor compound and the protease inhibitor comprises a sequence determined according to the methods ofthe invention, c) separating the protease from the unbound protease inhibitor and unbound candidate protease inhibitor compound, and d) detecting the amounts of detectable protease inhibitor and/or the detectable candidate protease inhibitor compound bound to the protease as a measure ofthe presence of a candidate protease inhibitor compound that competes with the protease inhibitor for binding to the protease.
In certain embodiments, the methods include testing the activity of the protease in the presence of the candidate protease inhibitor compound, wherein a reduction in protease activity in the presence of the candidate protease inhibitor compound, relative to the activity in its absence, indicates that the candidate protease inhibitor compound is a protease inhibitor.
In some preferred embodiments, the candidate protease inhibitor compound or the protease inhibitor comprises a lethal factor protease cleavage motif sequence comprising SEQ ID NO:69, preferably SEQ ID NO:70. In other preferred embodiments, the candidate protease inhibitor compound or the protease inhibitor comprises a matrix metalloprotease cleavage motif sequence comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
In preferred embodiments, the candidate protease inhibitor compound is a small organic molecule. Protease inhibitors identified according to these methods also are provided, as are uses of the protease inhibitors in the preparation of a medicament.
These and other objects and features of the invention are described in greater detail below.
Brief Description of the Drawings
Fig. 1 is a schematic drawing depicting an overview of the peptide library method.
Fig. 2 shows the cleavage-site specificity of MMP-7 (matrilysin). Fig. 2A shows the relative distribution of amino acid residues at positions C-terminal to the MMP-7 cleavage site, determined by sequencing a partial digest of the N-terminally blocked random dodecamer library Ac-XXXXXXXXXXXX (SEQ ID NO:7). Data are normalized so that a value of 1 corresponds to the average quantity per amino acid in a given sequencing cycle and would indicate no selectivity. Because of poor yield during sequencing, tryptophan was not included in the analysis. Averages of two experiments are shown with standard deviations. Fig. 2B shows the specificity N-terminal to the MMP-7 cleavage site. For the P3 position, data shown were obtained using the library MAXXXXXLRGAARE(K-biotin) (SEQ ID NO:8). For all other positions, the P3 proline library MGXXPXXLRGGGEE(K-biotin) (SEQ ID NO:9) was used. Glycine, glutamine, and threonine were omitted because of high interfering background peaks on the sequencer. Data were normalized as in Fig. 2A.
Fig. 3 shows that MMP-2 can act as a neurocan-processing enzyme in vitro. Purified neonatal rat brain neurocan was digested at 37°C for 2 h with varying concentrations of MMPs as indicated, in the absence or presence of the MMP inhibitor GM6001. Reaction mixtures were quenched with EDTA and chondroitinase-digested before SDS-PAGE and silver staining.
Fig. 4 depicts FRET substrates for visualizing protease activity in living cells. Cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP) are fused with an intervening linker bearing an optimal LF cleavage site. Irradiation of the uncleaved construct at the CFP excitation wavelength results in transfer of energy to the YFP molecule and emission at its wavelength. Upon cleavage, FRET is disrupted and emission now occurs at the CFP emission wavelength. The ratio of emission at the two wavelengths provides a readout of the extent of cleavage.
Detailed Description of the Invention
The invention relates to methods for determining cleavage site motifs for proteases, substrate peptides that contain such motifs, including fluorogenic substrates, and inhibitors containing at least a portion of such motifs. The invention also relates to identification of substrate proteins by using the motifs identified to scan databases for proteins containing the motifs. Recognition of the target substrate by a protease depends in part on complementarity between the protease active site and the sequence surrounding the scissile bond in the substrate. Determination of protease cleavage site motifs has several important applications. Specificity information can be used to design highly sensitive and specific synthetic fluorogenic substrates that enable high-throughput screening for small-molecule inhibitors. Analogs of optimized substrates tailor-made to the class of protease provide potent inhibitors useful as lead compounds in drug discovery and as tools in exploring the biological function of the enzyme. Finally, knowledge of the optimal cleavage motif for a protease helps identify possible in vivo protein substrates. In addition to the metalloproteinases used herein, the invention is generally applicable in determining the protease cleavage site motifs for any protease, in identifying substrates and inhibitors for any protease, and so on.
Thus the methods of the invention can be applied to determine protease cleavage site motifs of, for example, medically interesting proteases including proteases of pathogenic organisms such as hepatitis C virus NS3 protease, cathepsin family proteases for cancer and arthritis, tumor necrosis factor-alpha converting enzyme (TACE) for arthritis and septic shock, calpains and caspases for stroke, beta-site amyloid precursor protein-cleaving enzyme (BACE; beta-secretase) and presenilins for Alzheimer's disease, membrane-type serine proteases for cancer, furin and other proprotein convertases for cancer, proteasome inhibitors for cancer, and proteases affecting the blood clotting cascade. Other proteases of interest will be known to one of skill in the art.
Thus the invention pertains generally to the substrate specificity of proteases and to peptides which are substrates for proteases. The invention provides methods that allow for the identification of an amino acid sequence motif for the cleavage site of a specific protease without having to identify, isolate and compare native substrates for the protease. The methods of the invention are based upon selection of a subpopulation of peptides from a degenerate peptide library that are substrates for a protease. In the methods, the peptides within a peptide library that can be substrates for a protease are cleaved by the protease, converting them to amino-terminal peptides and carboxy-terminal peptides.
The peptides of the peptide library preferably are blocked at the amino termini, thereby preventing amino acid sequencing (e.g., by Edman degradation) of the amino-terminal peptides and uncleaved peptides. Blocking of the amino terminus (also known as the N-terminus) can be accomplished using any means known in the art. Preferably the N-terminus is blocked by the covalent attachment of a moiety to all peptides after the synthesis of the library peptides is completed. A preferred example of a blocking moiety is an acetyl group, and methods of acetylating peptides are well known in the art.
The carboxy-terminal peptides, which are unblocked by virtue of the cleavage by the protease, are sequenced and the relative abundance of each amino acid residue at each degenerate position of the carboxy-terminal peptides is determined. Alternatively, if desired, the cleaved carboxy-terminal peptides can be separated from the remaining non-cleaved peptides and the amino-terminal peptides, thereby isolating the subpopulation of peptides that are the carboxy-terminal portions of substrate peptides for the protease. For example, the carboxy-terminal peptides can have a molecule attached to the end that permits isolation of these cleavage products (e.g., biotin or an epitope recognized by an antibody).
An amino acid sequence motif for the cleavage site of a protease can be determined from the most abundant amino acid residues at each degenerate position of the carboxy-terminal peptides. Upon sequencing of the cleaved peptides, the abundance of an amino acid at a position in the peptide provides a preference value for each amino acid at each degenerate position. The preference value for a particular amino acid is determined by dividing the amount of the particular amino acid identified in a sequencing cycle by the average amount per amino acid in that cycle. This provides a raw value for the particular amino acid. To correct for bias in the library, the raw value for each amino acid preferably is then divided by the relative amount of that amino acid in the starting mixture. Amino acid residues that have a preference value of greater than 1.0 at a degenerate position are considered to be a part of the cleavage site motif. Higher preference values are preferred, e.g., 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, and so on. Thus one can select cleavage site motifs based on the highest preference value at a particular peptide residue, or can select motifs based on a combination of two or more amino acids at a particular residue that have preference values above a certain cutoff score.
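The preference value calculation described above can be sketched in a few lines of Python. This is a minimal illustration only; the function names, the dictionary layout, and the convention that a relative amount of 1.0 denotes an equimolar residue in the starting mixture are assumptions made for the example and are not part of the disclosed method.

```python
from typing import Dict, List, Optional


def preference_values(
    cycle_amounts: Dict[str, float],
    library_relative: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return a preference value per amino acid for one sequencing cycle.

    cycle_amounts: amino acid -> amount recovered in this cycle (e.g., pmol).
    library_relative: optional amino acid -> relative amount in the starting
        mixture (assumed convention: 1.0 means equimolar representation).
    """
    if not cycle_amounts:
        raise ValueError("cycle_amounts is empty")
    mean_amount = sum(cycle_amounts.values()) / len(cycle_amounts)
    if mean_amount <= 0:
        raise ValueError("average amount per amino acid must be positive")
    # Raw value: amount of the residue divided by the cycle average.
    raw = {aa: amount / mean_amount for aa, amount in cycle_amounts.items()}
    if library_relative is None:
        return raw
    # Bias correction: divide by the residue's relative amount in the library.
    return {aa: value / library_relative[aa] for aa, value in raw.items()}


def select_motif_residues(values: Dict[str, float], cutoff: float = 1.0) -> List[str]:
    """Residues whose preference value exceeds the cutoff, highest first."""
    selected = [aa for aa, value in values.items() if value > cutoff]
    return sorted(selected, key=lambda aa: values[aa], reverse=True)
```

A value of exactly 1.0 indicates no selectivity and is therefore not selected; raising `cutoff` (e.g., to 1.5 or 2.0) restricts the motif to more strongly preferred residues.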
In addition to the totally degenerate peptide library described above, one can use a partially degenerate library to determine the carboxy-terminal cleavage half-site. This may be preferred, for example, when the protease of interest does not cleave the totally degenerate library efficiently. The use of a partially degenerate library also may be preferred if one knows that the protease requires a certain residue in the cleavage site. Examples of these two situations are provided in the Examples below for B. anthracis lethal factor protease.
Once the carboxy-terminal portion of the cleavage site motif is determined using a partially or totally degenerate peptide library, one prepares a partially degenerate second library, using the knowledge of the carboxy-terminal cleavage half-site to orient the degenerate amino-terminal residues. This second library is, therefore, an oriented degenerate peptide library (ODPL). The amino termini of the peptides in the second library preferably are left unblocked to permit ready sequencing by Edman degradation, although in alternative embodiments the peptides can be blocked during the cleavage reaction and unblocked after removal of the carboxy-terminal peptides, to facilitate sequencing. In this latter instance, blocking agents that are readily removed will be preferred.
After cleavage of the second library, the carboxy-terminal (also referred to as C-terminal) peptide fragments are removed. To facilitate the removal, the C-termini of the peptides in the second library are labeled with a moiety that facilitates ready removal of the C-terminal peptide fragments after cleavage. For example, in preferred embodiments, the C-terminal moiety is biotin, incorporated as a lysyl-biotin residue. Additionally, one or more K residues can be added to the peptides (e.g., at the C-terminus of the libraries) to promote water solubility. Avidin molecules can be coupled to a substrate (e.g., a bead, resin, dipstick, magnetic bead) and used to bind biotin-linked peptides (uncleaved peptides) and biotin-linked peptide fragments (carboxy-terminal cleavage products). The biotin-avidin binding pair is but one example of agents useful for removing the cleaved C-terminal peptide fragments and uncleaved peptides. Other binding pairs known in the art include antibody-antigen pairs.
Following removal of the C-terminal fragments and uncleaved peptides, the isolated amino-terminal cleavage products are sequenced according to standard methodologies. Preferably the peptides are sequenced by an automated peptide sequencer. Preference values for amino acids at positions of the N-terminal portion of the cleavage site are then determined as for the C-terminal sequence. Combining the N-terminal motif and the C-terminal motif sequences provides a complete cleavage motif sequence.
The determination of N-terminal and C-terminal motif sequences can proceed in iterative fashion. Thus, a third round of motif determination can be based on the second round. The cleavage of the second library provides sequence information about the N-terminal (unprimed) residues of the protease cleavage site. This sequence information can be used to design a third library which fixes the unprimed N-terminal residues in accordance with the experimentally determined cleavage motif. This third library, like the first library, contains degenerate amino acid sequence in the portion of the peptides carboxy-terminal to the scissile bond (i.e., the primed residues). Also like the first library, the peptides of the third library preferably are blocked at the N-termini so that only the C-terminal cleaved peptides will yield sequence information. The third library is subjected to protease cleavage and the sequence of the C-terminal peptide fragments is then determined. Preference values for the C-terminal residues of the cleavage motif are calculated, thus refining the C-terminal portion of the motif.
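Combining the two half-site motifs into a complete cleavage motif can be illustrated with a short Python sketch. The function name and the residues used in the usage note are hypothetical; the position labels follow the standard convention in which unprimed residues are numbered P1, P2, P3, and so on moving N-terminally away from the scissile bond, and primed residues are numbered P1', P2', and so on moving C-terminally.

```python
from typing import List, Tuple


def combine_half_sites(unprimed: List[str], primed: List[str]) -> List[Tuple[str, str]]:
    """Join N-terminal (unprimed) and C-terminal (primed) half-site motifs.

    unprimed: preferred residues in N-to-C order, the last entry being P1.
    primed: preferred residues in N-to-C order, the first entry being P1'.
    Each entry may hold one residue or several allowed residues (e.g., "LI").
    Returns (position label, residues) pairs spanning the scissile bond.
    """
    unprimed_labels = [f"P{i}" for i in range(len(unprimed), 0, -1)]
    primed_labels = [f"P{i}'" for i in range(1, len(primed) + 1)]
    return list(zip(unprimed_labels + primed_labels, unprimed + primed))
```

For instance, `combine_half_sites(["P", "L", "G"], ["L", "R"])` pairs P3 with P, P2 with L, P1 with G, P1' with L and P2' with R; these residues are placeholders chosen for illustration, not a motif reported here.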
Substrates for the protease are designed based on the protease cleavage motif. The substrates preferably are detectably labeled in a manner that permits detection of the cleavage products as distinct from the uncleaved peptide substrate. One preferred example of a detectably labeled peptide is a fluorogenic peptide substrate. In general, fluorogenic substrates include two moieties linked to the ends of a substrate peptide. While linked in close proximity, the fluorogenic moieties have certain properties that change upon cleavage of the substrate peptide. For example, the moieties may be quenched in close proximity so that the uncleaved substrate peptide is not fluorescent. Upon cleavage, the quenching is relieved and one (or both) of the cleavage products is fluorescent and thus readily detectable. Preferred examples of fluorogenic reagents are amino-terminal fluorescent methoxycoumarinacetyl groups and carboxy-terminal dinitrophenyl-diaminopropionic acid quenching moieties. Fluorogenic peptides also can be made using aminobenzoyl groups or a tryptophan residue as the fluorophore with either a dinitrophenyl group or a nitrotyrosine group as the quencher. Also Edans (5-(2-aminoethyl)aminonaphthalene-1-sulfonic acid) can be used as the fluorophore with dabcyl (4-(4-dimethylaminophenylazo)benzoic acid) as the quencher. Still other fluorogenic reagents include those where the fluorophore is at the C-terminus. Upon cleavage, there is an increase in fluorescence. Fluorogenic reagents of this type include aminomethylcoumarins or aminonaphthalenesulfonamides.
Another example is the use of two moieties that exhibit fluorescence resonance energy transfer (FRET) when placed in close proximity. Upon cleavage of a substrate peptide labeled with a pair of such moieties, the FRET is relieved and the fluorescent properties of the peptides change in a detectable manner. Certain fluorogenic moieties may be preferred for high-throughput screening of proteases in vitro, whereas other fluorogenic moieties may be preferred for testing protease activity in cells.
Another approach to the design of detectable cleavage substrates is to include in the substrate a molecule that affects a detectable process, preferably a process detectable in cellular assays. In such an approach, the molecule is inactive until the substrate is cleaved. One example of this approach is the use of a membrane-anchored transcription factor (such as ATF6) which is normally released from a cytoplasmic membrane by proteolytic cleavage to allow it to enter the nucleus and act as a transcription factor. In this strategy, the cleavage site for a protease of interest is placed between the transmembrane segment of the membrane-anchored transcription factor and its transcriptional activation domain, which allows release of the transcriptional activation domain to be regulated by the protease. The release of the transcriptional activation domain is monitored using standard reporter assays, such as a reporter gene assay in which a detectable protein product (green fluorescent protein, luciferase, etc.) is placed under the control of the transcription factor. Other cleavage-activated processes known in the art also are adaptable to this purpose.
Specific high-affinity protease inhibitors also can be designed to incorporate the cleavage site motif. Inhibitors can be based on the entire protease cleavage site or on the C-terminal or N-terminal half-site motifs determined from cleavage of the first peptide library (and/or the third peptide library) and the second peptide library, respectively. Many modifications to peptide structure are known that are useful in the preparation of protease inhibitors. These include modified bonds, modified amino acids, and moieties that interact with the protease to prevent cleavage.
For example, for metalloproteinases, a group which chelates the active site zinc ion can be incorporated at either the amino- or carboxy-terminus of an optimized peptide. Peptides corresponding to primed residues (C-terminal motif) bearing amino-terminal thioacetyl groups can be synthesized using standard solid-phase chemistries. Likewise, peptides corresponding to unprimed residues (N-terminal motif) can incorporate a hydroxamic acid group in place of the carboxylic acid. These inhibitors in the context of lethal factor inhibitors are described below. Other inhibitors for metalloproteinases include carboxylates, phosphonates, phosphoramidates, and "right-handed" hydroxamic acids (which cover the unprimed residues). For cysteine proteases, inhibitors include aldehydes, halomethylketones, acyloxyketones, diazomethylketones, vinyl sulfones, epoxides, and ketomethylene peptides. For aspartyl proteases, inhibitors include statines and other inhibitors which span the cleavage site and incorporate a hydroxyethylene moiety. For serine proteases, inhibitors include chloromethylketones. Specific methods for synthesis and purification of the inhibitors are known in the art, and certain of these are described in more detail in the Examples below.
The natural substrates of the protease used in the methods of the invention can be determined by scanning existing amino acid sequence databases (e.g., Swiss-Prot) for the existence of proteins having sequences that match the cleavage site motif. Software packages that are useful for this purpose are known in the art. For example, the Scansite program (Yaffe, et al., Nat. Biotechnol. 19, 348-353, 2001) can be used. Identification of natural substrates provides additional substrates for testing of inhibitors; the cleavage of the substrates can be monitored in the absence and in the presence of varying concentrations of candidate inhibitors to assess their effectiveness in preventing the cleavage of a variety of naturally occurring protein molecules.
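A database scan of this kind can be illustrated with a simple exact-match search in Python. This sketch is not the scoring procedure used by Scansite; it merely reports every window of a protein sequence in which each motif position is occupied by an allowed residue. The function names, the use of "X" for an unconstrained position, and the example sequence in the usage note are assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def motif_to_regex(motif: List[str]) -> "re.Pattern[str]":
    """Build a regular expression from a per-position motif.

    Each entry lists the residues allowed at that position in one-letter
    code (e.g., "LI"); the entry "X" means any residue is allowed.
    """
    parts = ["." if allowed == "X" else f"[{allowed}]" for allowed in motif]
    # A lookahead is used so that overlapping matches are all reported.
    return re.compile("(?=(" + "".join(parts) + "))")


def scan_proteins(proteins: Dict[str, str], motif: List[str]) -> List[Tuple[str, int, str]]:
    """Return (protein name, 1-based start position, matched segment) hits."""
    pattern = motif_to_regex(motif)
    hits = []
    for name, sequence in proteins.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits
```

As a usage example, scanning the made-up sequence `"AAPLGLRAA"` with the motif `["P", "X", "G", "LI"]` reports the segment `PLGL` beginning at residue 3. In practice the sequences would be read from a downloaded database file, and a weighted scoring scheme based on the preference values may be more informative than exact matching.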
The methods provided herein have the advantage that they can be used to determine a cleavage site motif for any protease, regardless of whether native substrates for that protease have been identified. Furthermore, since the methods involve selection of peptides which are cleaved most readily by a protease, the amino acid sequence motif determined by the methods represents the optimal cleavage site for that protease.
The cleavage-based methods require cleavage of peptides in a library. It may be desirable to determine motifs of protease binding rather than cleavage, particularly for the development of high-affinity uncleavable inhibitors. Accordingly, the invention also includes methods for determining protease binding motifs. In these methods, a protease is contacted with a library of noncleavable peptides. After washing away unbound peptides, the remaining peptides are eluted from the bound state and sequenced. As with the other methods described herein, the preference values of the peptide residues are then determined, and the protease binding motif is thereby determined.
This method requires that a large quantity of protease is contacted with the libraries of peptides. Preferably the protease is immobilized on a solid surface, such as a resin bead, that permits thorough removal of unbound peptides, such as by washing, and recovery of protease following removal of bound peptides (e.g., by alteration of salt concentration, pH, addition of metal chelators, etc.).
The peptide libraries for determining binding motifs preferably are oriented degenerate peptide libraries. Because the peptides are not cleaved in this method (cleavage being what provides an orientation at the cleavage site), some other method of orientation is required to be able to extract meaningful sequence information (e.g., to prevent recognition of phased binding sites in the peptide library). One approach to orientation of the binding site libraries for metalloproteinases is to synthesize peptide library mixtures bearing a carboxy-terminal hydroxamic acid group, which will serve to orient the library by forcing the binding of the peptides at the active site. Another approach for peptide library orientation is to utilize protease cleavage motif information determined in accordance with other methods described herein to fix several of the residues in the peptide library to enhance the binding of the peptides at the active site.
The peptides synthesized for the libraries can be of any size that is readily recognized and cleaved by proteases (for determination of cleavage site motifs), or that is bound by proteases with high affinity (for determination of binding site motifs). The size of the peptides can be determined empirically, although it is expected that a peptide length of 5-25 amino acids, e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 and 25 amino acids, will work well for most applications of the methods described herein. Preferably, the peptides are 10-15 amino acids in length, most preferably 12 amino acids (dodecamers). Inhibitors, including half-site inhibitors, and substrate peptides will generally be of a similar length. It is possible, however, to use peptides that are longer if required or preferred for a particular application.
The peptides can incorporate natural and/or unnatural amino acids, and can be synthesized using standard solid-phase chemistries. Nonlimiting examples of unnatural amino acids are provided below, and additional amino acids will be known to the skilled artisan. For some protease cleavage motif determinations, it will be preferred that the library does not contain cysteine residues so that disulfide bonds are not formed.
In some embodiments, it may be desired to prepare noncleavable peptides containing the cleavage motif sequences. For example, noncleavable peptides are useful as specific inhibitors of proteases. Thus, for use as inhibitors, the peptides described herein preferably are non-hydrolyzable. To provide such peptides, one may select peptides from a library of non-hydrolyzable peptides, such as peptides containing one or more D-amino acids or peptides containing one or more non-hydrolyzable peptide bonds linking amino acids.
Alternatively, one can determine cleavage motifs and then synthesize non-hydrolyzable peptides or modify peptides as necessary to reduce the potential for hydrolysis by proteases. For example, the individual peptide bonds which are susceptible to proteolysis can be replaced with non-hydrolyzable peptide bonds by in vitro synthesis of the peptide.
Many non-hydrolyzable peptide bonds are known in the art, along with procedures for synthesis of peptides containing such bonds. Non-hydrolyzable bonds include -psi[CH2NH]- reduced amide peptide bonds, -psi[COCH2]- ketomethylene peptide bonds, -psi[CH(CN)NH]- (cyanomethylene)amino peptide bonds, -psi[CH2CH(OH)]- hydroxyethylene peptide bonds, -psi[CH2O]- peptide bonds, and -psi[CH2S]- thiomethylene peptide bonds.
Nonpeptide analogs of peptides, e.g., those which provide a stabilized structure or lessened biodegradation, are also contemplated. Peptide mimetic analogs can be prepared based on a cleavage motif sequence by replacement of one or more amino acid residues by nonpeptide moieties. Preferably, the nonpeptide moieties permit the peptide mimetic to retain its natural conformation, or stabilize a preferred, e.g., bioactive, conformation. Particularly preferred are those mimetics that inhibit protease cleavage activity. One example of methods for preparation of nonpeptide mimetic analogs from peptides is described in Nachman et al., Regul. Pept. 57:359-370 (1995). "Peptide," as used herein, embraces all of the foregoing.
The substrate peptides, binding peptides and inhibitors of protease cleavage labeled as described herein are useful for screening compounds and libraries of compounds for protease inhibitory activity. As mentioned, high throughput screening of known compounds and libraries of compounds can be performed using these substrates according to known methodologies.
The invention further provides efficient methods of identifying pharmacological agents or lead compounds for agents useful for inhibiting or monitoring protease activity.
Generally, the screening methods involve assaying for compounds which are cleaved or which inhibit cleavage of a protease substrate. Such methods are adaptable to automated, high throughput screening of compounds. A wide variety of assays for pharmacological agents are provided, including labeled in vitro protease cleavage assays, cell-based protease cleavage assays, etc. For example, in vitro protease cleavage assays are used to rapidly examine the effect of candidate pharmacological agents on the cleavage of a substrate by a specific protease. The candidate pharmacological agents can be derived from, for example, combinatorial peptide or small molecule libraries. Convenient reagents for such assays are known in the art.
Peptides used in the methods of the invention are added to an assay mixture as isolated peptides. Peptides can be produced recombinantly, or isolated from biological extracts, but preferably are synthesized in vitro. Peptides encompass chimeric proteins comprising a fusion of a peptide having a particular cleavage site motif with one or more other polypeptides, e.g., fluorescent polypeptides. Peptides may also be labeled with detectable compound(s) to provide a means of readily detecting whether the peptide is cleaved, e.g., by immunological recognition or by fluorescent labeling.
A typical assay mixture includes a peptide having a protease cleavage site motif and a candidate pharmacological agent. Typically, a plurality of assay mixtures are run in parallel with different agent concentrations to obtain a different response to the various concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration of agent or at a concentration of agent below the limits of assay detection. Candidate agents encompass numerous chemical classes, although typically they are organic compounds. Preferably, the candidate pharmacological agents are small organic compounds, i.e., those having a molecular weight of more than 50 yet less than about 2500. Candidate agents comprise functional chemical groups necessary for structural interactions with polypeptides (e.g., protease cleavage sites), and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups and more preferably at least three of the functional chemical groups. The candidate agents can comprise cyclic carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above-identified functional groups. Candidate agents also can be biomolecules such as peptides (preferably non-hydrolyzable for protease inhibitors), saccharides, fatty acids, sterols, isoprenoids, purines, pyrimidines, derivatives or structural analogs of the above, or combinations thereof and the like. Where the agent is a nucleic acid (i.e., aptamer), the agent typically is a DNA or RNA molecule, although modified nucleic acids having non-natural bonds or subunits are also contemplated. Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds.
For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides, random or non-random peptide libraries, synthetic organic combinatorial libraries, phage display libraries of random peptides, and the like. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural and synthetically produced libraries and compounds can readily be modified through conventional chemical, physical, and biochemical means. Further, known pharmacological agents may be subjected to directed or random chemical modifications such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs of the agents.
A variety of other reagents also can be included in the mixture. These include reagents such as salts, buffers, neutral proteins (e.g., albumin), detergents, etc. which may be used to facilitate optimal protein-protein and/or protein-nucleic acid binding. Such a reagent may also reduce non-specific or background interactions of the reaction components. Other reagents that improve the efficiency of the assay such as nuclease inhibitors, antimicrobial agents, and the like may also be used.
The mixture of the foregoing assay materials is incubated under conditions whereby, but for the presence of the candidate pharmacological agent, a protease cleaves a substrate (for protease inhibition studies), or specifically binds a protease inhibitor, e.g., a non-hydrolyzable peptide (for identifying compounds that compete with known inhibitors). The order of addition of components, incubation temperature, time of incubation, and other parameters of the assay may be readily determined. Such experimentation merely involves optimization of the assay parameters, not the fundamental composition of the assay. Incubation temperatures typically are between 4°C and 40°C. Incubation times preferably are minimized to facilitate rapid, high throughput screening, and typically are between 1 minute and 10 hours.
After incubation, the presence or absence of protease cleavage or binding of a substrate is detected by any convenient method available to the user. For cell-free binding type assays, a separation step may be used to separate bound from unbound components. The separation step may be accomplished in a variety of ways. Conveniently, at least one of the components is immobilized on a solid substrate, from which the unbound components may be easily separated. The solid substrate can be made of a wide variety of materials and in a wide variety of shapes, e.g., microtiter plate, microbead, dipstick, resin particle, etc. The substrate preferably is chosen to maximize signal-to-noise ratios, primarily by minimizing background binding, as well as for ease of separation and cost. Separation may be effected, for example, by removing a bead or dipstick from a reservoir, emptying or diluting a reservoir such as a microtiter plate well, or rinsing a bead, particle, chromatographic column or filter with a wash solution or solvent. The separation step preferably includes multiple rinses or washes. For example, when the solid substrate is a microtiter plate, the wells may be washed several times with a washing solution, which typically includes those components of the incubation mixture that do not participate in specific binding or interaction such as salts, buffer, detergent, non-specific protein, etc. Where the solid substrate is a magnetic bead, the beads may be washed one or more times with a washing solution and isolated using a magnet.
Detection may be effected using any convenient method. The protease cleavage or binding typically alters a directly or indirectly detectable product, e.g., a cleaved substrate peptide. In the assays, one of the components usually comprises, or is coupled to, a detectable label. A wide variety of labels can be used, such as those that provide direct detection (e.g., radioactivity, luminescence, optical or electron density, etc.), or indirect detection (e.g., epitope tag such as the FLAG epitope, enzyme tag such as horseradish peroxidase, etc.). The label may be bound to a protease substrate or inhibitor as described elsewhere herein or to the candidate pharmacological agent.
A variety of methods may be used to detect the label, depending on the nature of the label and other assay components. For example, the label may be detected while bound to the solid substrate or subsequent to separation from the solid substrate. Labels may be directly detected through optical or electron density, radioactive emissions, nonradiative energy transfers, etc. or indirectly detected with antibody conjugates, streptavidin-biotin conjugates, etc. Methods for detecting the labels are well known in the art.
Thus the present invention includes automated drug screening assays for identifying compositions having the ability to inhibit protease cleavage of a substrate directly (by binding protease), or indirectly (by serving as cleavable decoy substrates). The automated methods are carried out in an apparatus which is capable of delivering a reagent solution to a plurality of predetermined compartments of a vessel and measuring the change in a detectable molecule in the predetermined compartments. Exemplary methods include the following steps. First, a divided vessel is provided that has one or more compartments which contain a protease substrate which, when exposed to a specific protease, has a detectable change in fluorescence. The protease can be in a cell in the compartment, in solution, or immobilized within the compartment. Next, one or more predetermined compartments are aligned with a predetermined position (e.g., aligned with a fluid outlet of an automatic pipette) and an aliquot of a solution containing a compound or mixture of compounds being tested for its ability to inhibit protease cleavage is delivered to the predetermined compartment(s) with an automatic pipette. The fluorescent protease substrate is also added with the compounds or following the addition of the compounds. Finally, fluorescence emitted by the substrate in response to an excitation wavelength is measured for a predetermined amount of time, preferably by aligning said cell-containing compartment with a fluorescence detector. Preferably, fluorescence is also measured prior to adding the compounds to the compartments, to establish, e.g., background and/or baseline values for fluorescence. For competition assays, the compounds can be added with or after addition of a substrate or inhibitor to the protease-containing compartments. One of ordinary skill in the art can readily determine the appropriate order of addition of the assay components for particular assays.
At a suitable time after addition of the reaction components, the plate is moved, if necessary, so that assay wells are positioned for measurement of fluorescence emission. Because a change in the fluorescence signal may begin within the first few seconds after addition of test compounds, it is desirable to align the assay well with the fluorescence reading device as quickly as possible, with times of about two seconds or less being desirable. In preferred embodiments of the invention, where the apparatus is configured for detection through the bottom of the well(s) and compounds are added from above the well(s), fluorescence readings may be taken substantially continuously, since the plate does not need to be moved for addition of reagent. The well and fluorescence-reading device should remain aligned for a predetermined period of time suitable to measure and record the change in fluorescence. When the apparatus is configured to detect fluorescence from above the plate, it is preferred that the bottoms of the wells be colored black to reduce the background fluorescence and thereby decrease the noise level in the fluorescence reader.
The apparatus of the present invention is programmable to begin the steps of an assay sequence in a predetermined first well (or rows or columns of wells) and proceed sequentially down the columns and across the rows of the plate in a predetermined route through well number n. It is preferred that the fluorescence data from replicate wells treated with the same compound are collected and recorded (e.g., stored in the memory of a computer) for calculation of fluorescence. To accomplish rapid compound addition and rapid reading of the fluorescence response, the fluorometer can be modified by fitting an automatic pipetter and developing a software program to accomplish precise computer control over both the fluorometer and the automatic pipetter. By integrating the fluorometer and the automatic pipetter and using a microcomputer to control the commands to both, the delay time between reagent addition and fluorescence reading can be significantly reduced. Moreover, both greater reproducibility and higher signal-to-noise ratios can be achieved as compared to manual addition of reagent because the computer repeats the process precisely time after time. This arrangement also permits a plurality of assays to be conducted concurrently without operator intervention. Thus, with automatic delivery of reagent followed by multiple fluorescence measurements, the reliability of the fluorescent dye-based assays as well as the number of assays that can be performed per day are advantageously increased.
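The well-visiting order and the handling of replicate wells described above can be illustrated with a minimal sketch. The 8 x 12 plate layout, the well labels, and the function names are assumptions made for illustration; the specification does not prescribe any particular plate format or software.

```python
from statistics import mean

def column_major_wells(n_rows=8, n_cols=12):
    """Yield well labels going down each column, then across to the next column.

    The 8 x 12 (96-well) default is an assumption, not a requirement of the method.
    """
    rows = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[:n_rows]
    for col in range(1, n_cols + 1):
        for row in rows:
            yield f"{row}{col}"

def baseline_corrected_mean(readings, baseline):
    """Average replicate wells after subtracting each well's pre-addition baseline.

    readings and baseline map well label -> fluorescence (arbitrary units).
    """
    return mean(readings[well] - baseline[well] for well in readings)

# Route through the plate: A1, B1, ..., H1, A2, B2, ...
order = list(column_major_wells())

# Two hypothetical replicate wells treated with the same compound.
signal = baseline_corrected_mean(
    {"A1": 120.0, "B1": 130.0},
    {"A1": 20.0, "B1": 20.0},
)
```

Subtracting a per-well baseline before averaging reflects the preference, stated above, for measuring fluorescence before the compounds are added.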
Inhibitors of proteases identified by the methods described herein are useful to treat diseases or conditions that result from excessive or unwanted protease activity, including pathogenic infections, cancer, inflammatory diseases, etc. For treatment of such conditions, an effective inhibitory amount of a protease inhibitor is administered to a subject. The inhibitors also can be used in diagnostic applications, to detect specific proteases. For example, pathogens that express a specific protease can be detected in a subject, in a biological sample of the subject, or in various materials to assess contamination. Inhibitors and other compounds that incorporate protease cleavage or binding site sequence motifs can be administered as part of a pharmaceutical composition. Such a pharmaceutical composition may include the compounds in combination with any standard physiologically and/or pharmaceutically acceptable carriers which are known in the art. The compositions should be sterile and contain a therapeutically effective amount of the inhibitor peptide or other therapeutic compound in a unit of weight or volume suitable for administration to a patient. The term "pharmaceutically acceptable" means a non-toxic material that does not interfere with the effectiveness of the biological activity of the active ingredients. The term "physiologically acceptable" refers to a non-toxic material that is compatible with a biological system such as a cell, cell culture, tissue, or organism. The characteristics of the carrier will depend on the route of administration. Physiologically and pharmaceutically acceptable carriers include diluents, fillers, salts, buffers, stabilizers, solubilizers, and other materials which are well known in the art.
When used therapeutically, the compounds of the invention are administered in therapeutically effective amounts. In general, a therapeutically effective amount means that amount necessary to delay the onset of, inhibit the progression of, or halt altogether the particular condition being treated. Therapeutically effective amounts specifically will be those which desirably influence protease activity. Generally, a therapeutically effective amount will vary with the subject's age and condition, as well as the nature and extent of the disease in the subject, all of which can be determined by one of ordinary skill in the art. The dosage may be adjusted by the individual physician, particularly in the event of any complication. A therapeutically effective amount typically varies from 0.01 ng/kg to about 1000 μg/kg, preferably from about 0.1 ng/kg to about 200 μg/kg and most preferably from about 0.2 ng/kg to about 20 μg/kg, in one or more dose administrations daily, for one or more days.
The therapeutics of the invention can be administered by any conventional route, including injection or by gradual infusion over time. The administration may, for example, be oral, intravenous, topical, intracranial, intraperitoneal, intramuscular, intracavity, intrarespiratory, subcutaneous, or transdermal. The route of administration will depend on the composition of a particular therapeutic preparation of the invention and its intended use.
Preparations for parenteral administration include sterile aqueous or non-aqueous solutions, suspensions, and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include water, alcoholic/aqueous solutions, emulsions or suspensions, including saline and buffered media. Parenteral vehicles include sodium chloride solution, Ringer's dextrose, dextrose and sodium chloride, lactated Ringer's or fixed oils. Intravenous vehicles include fluid and nutrient replenishers, electrolyte replenishers (such as those based on Ringer's dextrose), and the like. Preservatives and other additives may also be present such as, for example, antimicrobials, anti-oxidants, chelating agents, and inert gases and the like. Other delivery systems can include time-release, delayed release or sustained release delivery systems. Such systems can avoid repeated administrations of the active compounds of the invention, increasing convenience to the subject and the physician. Many types of release delivery systems are available and known to those of ordinary skill in the art. They include polymer-based systems such as polylactic and polyglycolic acid, polyanhydrides and polycaprolactone; nonpolymer systems that are lipids including sterols such as cholesterol, cholesterol esters and fatty acids or neutral fats such as mono-, di-, and triglycerides; hydrogel release systems; silastic systems; peptide-based systems; wax coatings, compressed tablets using conventional binders and excipients, partially fused implants and the like. In addition, a pump-based hardware delivery system can be used, some of which are adapted for implantation.
A long-term sustained release implant also may be used. "Long-term" release, as used herein, means that the implant is constructed and arranged to deliver therapeutic levels of the active ingredient for at least 30 days, and preferably 60 days. Long-term sustained release implants are well known to those of ordinary skill in the art and include some of the release systems described above. Such implants can be particularly useful in treating conditions characterized by unwanted protease activity by placing the implant near portions of a subject affected by such activity, thereby effecting localized, high doses of the compounds of the invention.
Examples
Example 1: Determination of complete protease cleavage site motifs using oriented peptide library mixtures
Experimental protocols
Reagents. Recombinant MT1-MMP catalytic domain and GM6001 were purchased from Chemicon (Temecula, CA), recombinant human MMP-2 from Oncogene Research Products (San Diego, CA), and other purified MMPs (native human MMP-1, recombinant human MMP-3 catalytic domain, recombinant human MMP-7, native monomeric MMP-9) from Calbiochem (San Diego, CA). Peptide libraries were synthesized at the Tufts University Core Facility (Boston, MA). Degenerate positions X were prepared using iso-kinetic mixtures of the 19 naturally occurring L-amino acids excluding cysteine. The exact proportions of each amino acid at degenerate positions were determined by Edman sequencing of an unacetylated portion of the library. Individual peptides were synthesized at Tufts or Genemed Synthesis (S. San Francisco, CA), purified by high-performance liquid chromatography, and characterized by mass spectrometry.
Peptide library methods. To determine the primed-side cleavage specificity, a 1 mM solution of Ac-XXXXXXXXXXXX (SEQ ID NO:7) in MMP reaction buffer (50 mM HEPES, pH 7.4, 200 mM NaCl, 10 mM CaCl2) was digested at 37°C to 5-10% cleavage and heated to 100°C for 2 min. A 10 μl aliquot was subjected to Edman sequencing on an Applied Biosystems (Foster City, CA) Procise 494 Automated Protein Sequencer. Amino acid preference in a given cycle was calculated by dividing the amount of a particular residue by the average amount per amino acid residue in that cycle. The data were then corrected for bias present in the library by dividing each value by the relative amount of that particular amino acid in the starting mixture.
Biotinylated libraries were prepurified on a monomeric avidin column (Pierce Chemical Co., Rockford, IL). Peptides were applied to the column in PBS, washed extensively with PBS followed by 50 mM NH4OAc, eluted with 0.2 M HOAc, and lyophilized. Purified libraries were partially digested with protease in 20 μl reactions as described above. Ethylenediamine tetraacetate (EDTA) was added to 15 mM and the biotinylated fraction removed by rotating with 400 μl avidin agarose (Sigma Chemical Co., St. Louis, MO) in 600 μl 25 mM ammonium bicarbonate for 1 h at room temperature. The mixture was transferred to a column, and the flowthrough was combined with five 200 μl wash fractions of 25 mM ammonium bicarbonate. The material was evaporated to dryness under reduced pressure, suspended in 20 μl double-distilled water, and sequenced. Data were normalized as described above.
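The two-step normalization of the sequencing data (division by the cycle average, then correction for library bias) may be sketched as follows. The function name, the three-residue example, and the convention that a library fraction of 1.0 means "present at the average level" are assumptions for illustration only; the numbers are hypothetical and are not experimental data.

```python
def preference_ratios(cycle_amounts, library_relative_amounts):
    """Return bias-corrected preference ratios for one Edman cycle.

    cycle_amounts: residue -> amount recovered in this cycle (e.g., pmol).
    library_relative_amounts: residue -> relative amount of that residue in the
        starting mixture (assumed here to be scaled so that 1.0 is the average).
    """
    average = sum(cycle_amounts.values()) / len(cycle_amounts)
    return {
        residue: (amount / average) / library_relative_amounts[residue]
        for residue, amount in cycle_amounts.items()
    }

# Hypothetical three-residue cycle, for illustration only.
ratios = preference_ratios(
    {"L": 30.0, "A": 10.0, "G": 20.0},
    {"L": 1.2, "A": 1.0, "G": 0.8},
)
```

A ratio above 1 indicates that the residue is enriched at that position relative to an unselected mixture; a ratio below 1 indicates that it is disfavored.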
Peptide cleavage assay. Peptide cleavage was assayed by following the production of amine using fluorescamine (34). Amounts of product were determined by using the signal from a given peptide digested to completion with MMP-7 as a standard. For enzyme-peptide combinations in which the reaction rate was linear over substrate concentration [S] at 100 μM, values of kcat/KM were determined from initial rates (<10% turnover) at that concentration (where KM >> [S]). Otherwise, catalytic parameters were obtained by determining initial rates at various substrate concentrations and fitting the data directly to the Michaelis-Menten equation using Kaleidagraph software. Assays were performed in triplicate. Enzyme concentrations used were based on protein concentration alone.
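The single-concentration estimate relies on the fact that the Michaelis-Menten rate, v = Vmax[S]/(KM + [S]), reduces to v ≈ (kcat/KM)[E][S] when KM >> [S]. A minimal sketch of that arithmetic follows; the function names and all numerical values are hypothetical and are chosen only to show the calculation.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate / (km + substrate)

def kcat_over_km(initial_rate, enzyme, substrate):
    """Estimate kcat/KM from one initial rate, valid only when KM >> [S].

    With concentrations in M and the rate in M/s, the result is in M^-1 s^-1.
    """
    return initial_rate / (enzyme * substrate)

# Hypothetical values: 1 nM enzyme, 100 uM substrate, rate of 1 nM/s.
estimate = kcat_over_km(initial_rate=1e-9, enzyme=1e-9, substrate=1e-4)

# With KM = 10 mM (100-fold above [S]) the linear approximation is within about 1%.
exact = michaelis_menten_rate(1e-4, vmax=1.0, km=1e-2)
linear = 1.0 * 1e-4 / 1e-2
```

When the rate is not linear in [S] at the chosen concentration, the approximation no longer holds and the full equation must be fitted, as described above.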
Neurocan digestion. Neonatal rat brain neurocan (300 ng), purified as described (31), was incubated with varying amounts of MMP as indicated in 10 μl buffer containing 20 mM HEPES, pH 7.4, 140 mM NaCl, and 2 mM CaCl2 for 2 h at 37°C, and quenched by adding 10 μl 20 mM EDTA. Chondroitin sulfate chains were removed by chondroitinase treatment as described (35), and samples were run on 5% SDS-PAGE gels followed by silver staining.
Results
Oriented peptide libraries have been used previously to determine the target sequence preferences of protein kinases (10) and protein interaction domains (11-13). For proteases, a two-step method is used. We first determine the cleavage site motif C-terminal to the cleavage site by partial digestion and N-terminal sequencing of a completely random peptide mixture. Information from this first round of screening is used to design a second library in which strongly selected amino acids are fixed, allowing data on sites N-terminal to the cleavage site to be obtained. Reiteration of this process allows an optimal recognition sequence to be determined.
To validate the method, we used it to determine the optimal cleavage site motifs for several MMPs. MMPs are a family of secreted enzymes, including collagenases, gelatinases, and stromelysins, that play a crucial role in defining the cellular environment through regulated degradation and processing of extracellular proteins (14, 15). Previous work using large series of synthetic peptides (16-22), phage display libraries (2, 23), and mixture-based libraries (9, 24) has provided information on the cleavage site specificity of several MMPs. Data obtained with our approach are consistent with these previous findings and provide novel selectivity information as well. Our results were validated by analysis of individually synthesized peptide substrates and by generating a series of optimized peptide substrates for the various MMPs. Using the specificity data, we have identified a number of likely substrates by scanning protein databases. One predicted MMP-2 cleavage site, the in vivo processing site for the brain-specific proteoglycan neurocan, was confirmed in vitro, substantiating our ability to predict cleavage sites from library data.
Determination of MMP cleavage site motifs. The cleavage site motif for a protease involves residues both N- and C-terminal to the scissile bond (the unprimed and primed sides, respectively), with the cleavage site for a protease defined as ...P3-P2-P1-P1'-P2'-P3'..., where cleavage occurs between the P1 and P1' residues (25). Our method involves the initial determination of the primed-side motif and subsequent determination of the unprimed-side motif (Fig. 1). The primed-side motif is determined by partial digestion of a completely random mixture of peptide dodecamers acetylated at the N terminus. The digested mixture is subjected to N-terminal sequencing by Edman degradation. Unreacted intact peptides and the N-terminal fragments of reacted peptides remain blocked and do not contribute to the sequenced pool; only the C-terminal fragments are sequenced. The relative amounts of each amino acid present in a given cycle indicate the preference for that residue at a particular site, so that the first sequencing cycle affords information about the P1' position, the second cycle about the P2' position, and so on. The primed-side motif for MMP-7 (matrilysin) determined in this manner is shown in Fig. 2A. These selectivities, in particular the strong preference for hydrophobic amino acids (particularly leucine) in P1' and the selection for methionine in P3', are in precise agreement with previous studies of cleavage rates for individual synthetic peptide variants of the MMP cleavage site from collagen-α1(I) (16, 21). The method was subsequently applied to five other MMPs: MMP-1 (collagenase-1), MMP-2 (gelatinase A), MMP-3 (stromelysin-1/transin), MMP-9 (gelatinase B), and MT1-MMP (MMP-14) (Table 1). The sequences for the cleavage site motifs are as follows: MMP-7 (SEQ ID NO:1), MMP-1 (SEQ ID NO:2), MMP-2 (SEQ ID NO:3), MMP-9 (SEQ ID NO:4), MMP-3 (SEQ ID NO:5), and MT1-MMP (SEQ ID NO:6).
Table 1. Cleavage-site motifs for six MMPsᵃ
Cleavage position
Enzyme Common name P5 P4 P3 P2 P1 P1' P2' P3'
MMP-7 Matrilysin P(1.4) V(1.4) P(1.6) L(1.7) S(1.8) L(8.4) V(1.7) M(1.5)
I(1.3) I(1.4) V(1.6) M(1.6) E(1.6) I(3.6) T(1.7) Y(1.3)
R(1.3) I(1.5) Y(1.4) N(1.3) M(2.5) I(1.5) Q(1.3)
A(1.3) M(1.5) K(1.5) R(1.3)
MMP-1 Collagenase-1 V(1.8) V(1.4) P(2.3) M(1.5) S(2.2) M(4.9) M(1.7) A(2.0)
I(1.4) Y(1.4) N(1.8) I(3.8) I(1.5) G(1.8)
L(1.4) A(1.8) L(3.1) K(1.4) S(1.6)
E(1.4) R(1.3)
MMP-2 Gelatinase A D(1.4) I(1.4) P(1.7) V(1.3) S(1.9) L(4.2) R(1.5) S(2.2)
L(1.3) V(1.3) V(1.6) A(1.3) G(1.4) M(2.8) Y(1.5) A(2.1)
F(1.3) I(1.5) A(1.4) I(2.6) K(1.4) G(2.1)
N(1.3) E(1.3) Y(1.9) M(1.4) I(1.3) F(1.8) I(1.4)
V(1.4)
MMP-9 Gelatinase B V(1.4) V(1.3) P(2.5) L(1.6) S(1.8) L(3.4) R(1.4) S(1.9)
V(1.6) Y(1.3) M(2.6) T(1.4) A(1.8) I(2.6) Y(1.4) G(1.6)
Y(2.1) V(1.3)
F(1.3) I(1.3)
MMP-3 Stromelysin-1 N(1.3) K(1.6) P(2.5) F(1.5) S(1.6) M(3.5) M(1.9) M(1.6)
I(1.3) V(1.4) V(1.4) Y(1.5) E(1.4) I(2.9) K(1.8) A(1.3)
I(1.4) I(1.4) L(1.4) L(2.5) I(1.7)
R(1.4) M(1.3) Y(2.4) R(1.6)
A(1.3) F(2.1)
MT1-MMP MMP-14 F(1.5) I(1.6) P(2.0) X S(1.8) L(3.5) R(1.4) M(1.4)
L(1.4) K(1.4) V(1.4) A(1.4) I(2.2) K(1.3) A(1.4)
D(1.3) V(1.3) M(2.2) Y(1.3)
I(1.3) D(1.3) Y(1.4) V(1.3) F(1.4)
ᵃ Quantities were determined from sequencing data as described for Figure 2, and values ≥1.3 are listed. X indicates no significant selectivity. All primed-side data were obtained using the library Ac-XXXXXXXXXXXX (SEQ ID NO:7), P3 data with the library MAXXXXXLRGAARE(K-biotin) (SEQ ID NO:8), and all other unprimed-side data with MGXXPXXLRGGGEE(K-biotin) (SEQ ID NO:9). Representative data from at least two separate experiments are shown.
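The way an entry in Table 1 follows from a sequencing cycle can be sketched as below: cycle n of the primed-side library reports on position Pn', and only residues at or above the cut-off are listed. Reading the footnote's 1.3 cut-off as a lower bound is an assumption, as are the function names. The L and I values are the MMP-7 P1' entries from the table; the G and D values are hypothetical placeholders for non-selected residues.

```python
def primed_position(cycle):
    """Edman cycle n of the acetylated random library reports on position Pn'."""
    return f"P{cycle}'"

def selected_residues(ratios, threshold=1.3):
    """List (residue, ratio) pairs at or above the threshold, strongest first."""
    hits = [(aa, ratio) for aa, ratio in ratios.items() if ratio >= threshold]
    return sorted(hits, key=lambda pair: -pair[1])

# L and I are from Table 1 (MMP-7, P1'); G and D are hypothetical.
cycle_1 = {"L": 8.4, "I": 3.6, "G": 0.4, "D": 0.2}
label = primed_position(1)
listed = selected_residues(cycle_1)
```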
In agreement with previous studies (15, 16-22), MMPs generally require hydrophobic amino acids at P1' and prefer either hydrophobic or basic amino acids at P2'. Whereas MMP-1, MMP-2, and MMP-9 prefer small residues (alanine, glycine, or serine) at P3', MMP-3, MMP-7, and MT1-MMP select for methionine at that position. The MMPs can also be distinguished on the basis of their relative tolerance for aromatic amino acids at P1'. Although all enzymes tested select aliphatic residues most strongly at P1', MMP-2, MMP-3, MMP-9, and MT1-MMP also had reasonable selections for phenylalanine and tyrosine at that position. This observation concurs with previous reports on MMP substrate specificity and has been rationalized in terms of the deeper hydrophobic S1' pocket in these MMPs, as determined by both crystallography and mutagenesis studies (15, 16, 26).
We took advantage of the common features among the MMP primed-side motifs to design a secondary library to allow determination of motifs on the unprimed side (Fig. 1). This secondary library has the sequence MAXXXXXLRGAARE(K-biotin) (SEQ ID NO:8), where X indicates a degenerate position, K-biotin is ε-(biotinamidohexanoyl)lysine, and the N terminus is unblocked. The fixed LRG sequence in this library corresponds to the P1'-P3' positions and represents a consensus MMP motif. These fixed positions are preceded by several degenerate residues that correspond to the unprimed positions, so that cleavage is directed to the X-L bond. The library is partially digested with the MMP, the reaction mixture is quenched, and undigested peptides and C-terminal fragments that retain the biotin tag are removed with immobilized avidin. The remaining N-terminal fragments are subjected to N-terminal sequencing, and the selectivities are determined from the relative abundance of each amino acid in a given sequencing cycle as before.
The secondary library was used to analyze the unprimed-side specificity of the six MMPs analyzed previously. The major specificity site N-terminal to the scissile bond for all MMPs tested was at P3, where proline is preferred most strongly, although for most of the enzymes valine and isoleucine appear to be reasonable substitutions (P3 data in Fig. 2B and Table 1). Although some preferences were noted at the other sites, they were generally much weaker, and, when compared with the previous literature, were probably greatly underestimated. This shortcoming is likely due to a failure to completely direct cleavage to the intended bond, given that efficient cleavage sites are predicted to arise with a significant frequency within the degenerate positions. To obtain reliable data at the remaining sites, we constructed a tertiary library with the sequence MGXXPXXLRGGGEE(K-biotin) (SEQ ID NO:9), where proline is fixed in the P3 position. Data for MMP-7 cleavage of the P3-proline tertiary library are shown in Fig. 2B. The main selectivities for MMP-7 on the unprimed side in addition to proline at P3 are for hydrophobic residues (leucine, methionine, and tyrosine) at P2, and for serine, alanine, or glutamic acid at P1. These observations are in good agreement with previous literature on MMP-7 specificity (2, 16, 21).
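Because the N-terminal fragment is sequenced from its own N terminus, the sequencing cycle that reports on a given unprimed subsite depends on the fragment length. A minimal sketch of that bookkeeping for the tertiary library is given below; the function name is hypothetical, the C-terminal K-biotin residue is omitted from the string, and cleavage is assumed to occur only at the intended X-L bond.

```python
def unprimed_cycle_map(n_terminal_fragment):
    """Map each Edman cycle (1-based) to (subsite label, library residue).

    The last residue of the fragment is P1, the one before it P2, and so on.
    """
    length = len(n_terminal_fragment)
    return {
        cycle: (f"P{length - cycle + 1}", residue)
        for cycle, residue in enumerate(n_terminal_fragment, start=1)
    }

library = "MGXXPXXLRGGGEE"                    # SEQ ID NO:9 without K-biotin
fragment = library[: library.index("LRG")]    # cleavage directed to the X-L bond
cycles = unprimed_cycle_map(fragment)
```

For this library the fragment is MGXXPXX, so cycles 3 and 4 report on P5 and P4, cycle 5 returns the fixed P3 proline, and cycles 6 and 7 report on P2 and P1.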
A summary of selectivities found using the P3-proline library for the six MMPs examined is given in Table 1. As with the primed-side motifs, there are a number of features for the unprimed-side motifs that are common to all MMPs examined. These include a selection for small residues (principally serine) at P1 and for hydrophobic residues at P2. There are some notable differences among the motifs for the various MMPs. For example, although the P2 site is generally hydrophobic, the particular residues vary from enzyme to enzyme (see Table 1). In addition, only MMP-2, MMP-3, and MMP-7 select glutamic acid at P1.
Validation of the method using individually synthesized peptide substrates. Agreement between our results and those in the literature offers support for the validity of our method. To further validate our approach, we prepared a consensus octapeptide substrate based on the MMP-7 motif and a series of peptides bearing substitutions to the motif, and determined the catalytic parameters for cleavage of each peptide by MMP-7 (Table 2). At each position we made substitutions corresponding to the residue found in the collagen cleavage site-spanning peptide, and, at several positions, additional substitutions. The Vmax/KM of the optimized peptide was found to be >600-fold greater than that of the collagen peptide. In each case, changing the predicted optimal residue to a suboptimal residue decreased the Vmax/KM for cleavage, and in cases of multiple substitutions made at the same position, the rank order of selectivity appears to be conserved between the library data and the individual peptide data (Fig. 2 and Table 2). An unexpected prediction is that in addition to proline, valine and isoleucine should make reasonable substitutions at P3. Substitution of valine for the P3 proline was found to cause a roughly twofold drop in the Vmax/KM value; however, replacing the proline with glycine at P3 caused a 20-fold decrease in activity, indicating that there is indeed a relative preference for valine at P3 for MMP-7.
Table 2. Kinetics of cleavage for a series of substituted peptides based on the predicted optimal cleavage motif for MMP-7ᵃ
Peptide sequence    Vmax/KM    SEQ ID NO
VPLS-LTMG    100 ± 13    10
GPLS-LTMG    60 ± 10    11
GVLS-LTMG    31 ± 3    12
GGLS-LTMG    2.8 ± 0.6    13
VHLS-LTMG    1.7 ± 0.4    14
VPQS-LTMG    40 ± 10    15
VPRS-LTMG    2.1 ± 0.1    16
VPLG-LTMG    75 ± 8    17
VPLS-ITMG    52 ± 7    18
VPLS-LAMG    22 ± 3    19
VPLS-LDMG    9 ± 2    20
VPLS-LTGG    31 ± 4    21
GPQG-IAGQ    0.15 ± 0.02    22
ᵃ Peptides were N-terminally acetylated and C-terminally amidated, and peptide cleavage was assayed by fluorescamine detection of amine production. The predicted optimal MMP-7 substrate is listed at the top. Cleavage sites are indicated with hyphens, and substitutions to the optimal peptide are indicated in boldface. The collagen cleavage-site-spanning peptide is listed at bottom. Values are shown as a percentage of the Vmax/KM value for the consensus peptide.
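The fold changes quoted in the text can be recomputed from the relative Vmax/KM values in Table 2. The sketch below uses only the table's central values (errors are ignored), and the function name is hypothetical. The P3-valine peptide (SEQ ID NO:12) is written as GVLS-LTMG on the assumption that the residue missing from the table entry is the unsubstituted P2 leucine.

```python
# Relative Vmax/KM (percent of the consensus peptide), central values from Table 2.
RELATIVE_RATE = {
    "VPLS-LTMG": 100.0,   # predicted optimal MMP-7 substrate
    "GPLS-LTMG": 60.0,    # P3 proline
    "GVLS-LTMG": 31.0,    # P3 valine (sequence assumed, see lead-in)
    "GGLS-LTMG": 2.8,     # P3 glycine
    "GPQG-IAGQ": 0.15,    # collagen cleavage-site-spanning peptide
}

def fold_decrease(reference, variant):
    """How many times slower the variant is cleaved than the reference."""
    return RELATIVE_RATE[reference] / RELATIVE_RATE[variant]

pro_to_val = fold_decrease("GPLS-LTMG", "GVLS-LTMG")      # roughly twofold
pro_to_gly = fold_decrease("GPLS-LTMG", "GGLS-LTMG")      # roughly 20-fold
optimal_vs_collagen = fold_decrease("VPLS-LTMG", "GPQG-IAGQ")
```

The P3 comparisons are made within the series that carries glycine at P4, so that only the P3 residue differs between the peptides being compared.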
Although MMPs share many common features in their consensus cleavage motifs (proline in P3, serine in P1, and leucine or methionine in P1'), the presence of subtle distinctions indicated that we might be able to discriminate among MMPs with optimized peptide substrates. A peptide corresponding to the consensus motif for each MMP was synthesized, and catalytic parameters for cleavage of each peptide by the six MMPs studied were determined (Table 3). Parameters for cleavage of the collagen cleavage site-spanning octapeptide were also determined for comparison, and in every case the predicted optimal peptide was a significantly better substrate than the collagen peptide. In each case the consensus peptide was either the best peptide substrate tested for that enzyme (MMP-2, MMP-3, and MMP-7) or within twofold of the best peptide (MMP-9, MMP-1, and MT1-MMP). Thus, even though the optimal motifs for cleavage by this family of highly related proteases are largely similar, it is possible to design peptides that are selectively cleaved by specific members.
Table 3. Kinetic parameters for the cleavage of a series of consensus peptides by six MMPsᵃ
kcat/KM (M⁻¹ s⁻¹)
Peptide substrateᵇ    Collagen    MMP-1 consensus    MMP-2 consensus    MMP-3 consensus    MMP-7 consensus    MMP-9 consensus    MT1-MMP consensus
Enzyme    GPQG-IAGQ    VPMS-MRGG    IPVS-LRSG    RPFS-MIMG    VPLS-LTMG    VPLS-LYSG    IPES-LRAG
MMP-1    27.4 ± 0.9    1,600 ± 100    98 ± 7    440 ± 20    300 ± 50    2,100 ± 300    870 ± 30
MMP-2    10,100 ± 400    24,000 ± 1,000    82,000 ± 6,000    4,600 ± 500    13,200 ± 400    61,000 ± 4,000    24,000 ± 3,000
MMP-3    160 ± 40    3,900 ± 400    2,300 ± 100    6,900 ± 200    2,400 ± 200    1,390 ± 30    1,500 ± 100
MMP-7    180 ± 20    7,900 ± 900    9,700 ± 400    12,000 ± 1,500    120,000 ± 20,000    22,000 ± 3,000    12,000 ± 600
MMP-9    8,400 ± 200    51,000 ± 3,000    11,500 ± 300    21,000 ± 4,000    20,000 ± 1,000    49,000 ± 3,000    12,600 ± 800
MT1-MMP    3,600 ± 200    6,100 ± 300    4,300 ± 300    3,700 ± 300    10,300 ± 700    5,500 ± 300    6,900 ± 500
ᵃ Peptides were designed and synthesized based on the data in Table 1. The cleavage site is indicated by a hyphen. Where possible, the kcat/KM value was determined directly from the initial rate at a single substrate concentration under conditions where KM >> [S]. For other cases kcat and KM values (not shown) were determined by fitting initial rate data at varying substrate concentrations to the Michaelis-Menten equation.
ᵇ The sequences for the consensus peptides are: collagen (SEQ ID NO:23), MMP-1 (SEQ ID NO:24), MMP-2 (SEQ ID NO:25), MMP-3 (SEQ ID NO:26), MMP-7 (SEQ ID NO:27), MMP-9 (SEQ ID NO:28), and MT1-MMP (SEQ ID NO:29).
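The statements that each consensus peptide outperforms the collagen peptide and is either the best substrate or within twofold of the best can be checked directly against the central values in Table 3. The following is a minimal sketch with hypothetical names; measurement errors are ignored.

```python
PEPTIDES = ["Collagen", "MMP-1", "MMP-2", "MMP-3", "MMP-7", "MMP-9", "MT1-MMP"]

# kcat/KM central values from Table 3; rows are enzymes, columns follow PEPTIDES.
KCAT_KM = {
    "MMP-1":   [27.4, 1600, 98, 440, 300, 2100, 870],
    "MMP-2":   [10100, 24000, 82000, 4600, 13200, 61000, 24000],
    "MMP-3":   [160, 3900, 2300, 6900, 2400, 1390, 1500],
    "MMP-7":   [180, 7900, 9700, 12000, 120000, 22000, 12000],
    "MMP-9":   [8400, 51000, 11500, 21000, 20000, 49000, 12600],
    "MT1-MMP": [3600, 6100, 4300, 3700, 10300, 5500, 6900],
}

def rate(enzyme, peptide):
    """Look up kcat/KM for one enzyme-peptide pair."""
    return KCAT_KM[enzyme][PEPTIDES.index(peptide)]

def best_peptide(enzyme):
    """Peptide with the highest kcat/KM for the given enzyme."""
    return max(PEPTIDES, key=lambda peptide: rate(enzyme, peptide))

def fold_below_best(enzyme):
    """Ratio of the best peptide's kcat/KM to that of the enzyme's own consensus."""
    return rate(enzyme, best_peptide(enzyme)) / rate(enzyme, enzyme)

summary = {enzyme: (best_peptide(enzyme), fold_below_best(enzyme)) for enzyme in KCAT_KM}
```

In this tabulation MMP-2, MMP-3, and MMP-7 each prefer their own consensus peptide, while MMP-1, MMP-9, and MT1-MMP prefer another peptide by less than twofold.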
Comparison of the experimentally determined motifs with known substrates. A survey of known cleavage sites determined in vitro for MMP-2 indicates that, as predicted from the library results, the most critical element is a hydrophobic residue at P1', which appears in all substrates and is typically leucine (Table 4A, B). Most of the sites include the P3 proline residue; two sites have valine at that position. As anticipated, the other positions contribute less to selectivity but still appear important. For example, more than half of the sites have a predicted residue at P2 (alanine or valine), P1 (alanine, asparagine, glycine, or serine), or P3' (alanine, glycine, or serine). Overall, the frequencies of appearance of amino acids at the various positions in known MMP-2 cleavage sites are in reasonable agreement with the relative order of selectivity observed with the peptide library (Table 4B).
Table 4. Known and potential MMP cleavage sitesᵃ
A
Protein    Cleavage site    SEQ ID NO
Aggrecan (bovine)    IPEN-FFGV    30
Big endothelin-1    VPYG-LGSP    31
Brevican/BEHAB (rat)    HPSA-FSEA    32
Collagen-α1(I) (bovine)    GPQG-IAGQ    33
Collagen-α2(I) (bovine)    GPQG-LLGA    34
Collagen-α1(X)    GPAG- SVL    35
Collagen-α1(X)    GPAG-I TK    36
Decorin    DAAS-LLGL    37
FGFR-1    RPAV-MTSP    38
Galectin-3    PPGA-YHGA    39
IGFBP-3    LRAY-LLPA    40
IL-1β    GPYE-LKAL    41
Laminin-5 γ2-chain (rat)    TAAA-LTSC    42
α2-Macroglobulin    GPEG-LRVG    43
α2-Macroglobulin    GHAR-LVHV    44
MCP-3    QPVG-INTS    45
Pregnancy zone protein    ELGT-YNVI    46
Pro-MMP-1    DVAQ-FV Y    47
Pro-MMP-2    DVAN-YNFF    48
SPARC    HPVG-LLAR    49
Substance P    KPQQ-FFGL    50
B
MMP-2 motif    P4 P3 P2 P1 P1' P2' P3'
Protein substrates    G(7) P(14) A(9) G(8) L(10) L(4) G(6)
D(3) V(2) Q(3) A(3) F(4) V(3) S(3)
H(2) A(2) V(2) Q(2) I(3) N(3) V(3)
E(2) N(2) Y(3) A(2)
Y(2)
G(2)
Peptide libraries    I P V S L R S
V V A G M K A
I A I Y G E Y M
The sequences above are: protein substrates (SEQ ID NO:51), peptide libraries (SEQ ID NO:52).
C
Peptide    SEQ ID NO    Substrate    Vmax/KM
IPVS-LRSG    53    Consensus peptide    100 ± 7
HPVG-LLAR    54    SPARC    190 ± 10
QPVG-INTS    55    MCP-3    7.2 ± 0.3
RPAV-MTSP    56    FGFR-1    8.7 ± 0.3
GPQG-IAGQ    57    Collagen    12.3 ± 0.5
D
Protein Site SEQ ID NO Likely protease
Betaglycan (rat) HVLN-LRST 58 MMP-2, MMP-7, MMP-9
Dentin DPES-IRSE 59 MMP-1, MTl-MMP
Integrin- v DP E-FKSH 60 MMP-2, MMP-3
Integrin-α6 RPIP-ITAS 61 MMP-7, MMP-9
Integrin-αx RVLG- KAH 62 MMP-3, MMP-7
Integrin-α9 KV N-LTDN 63 MMP-7, MMP-9
NG2 proteoglycan (rat) PPEA-LRGI 64 MMP-1, MMP-2
Neurocan (rat) I AM-LRAP 65 MMP-2, MMP-3
PAI-3 (mouse) TAAA-ITGA 66 MMP-2 a Part A tabulates known MMP-2 cleavage sites (15, 36—39). Multiple sites in a single protein that were mapped following complete degradation of a given protein are not listed. For cases in which several cleavage sites have been identified in a single protein but one site clearly predominates, only the major site is given.
Part B compares the experimentally determined MMP-2 cleavage-site motif with residues found at each position in known protein substrates. The number of occurrences of each residue in the 21 sites listed in part A for residues that arise more than once is given in parentheses.
Part C shows cleavage rates relative to the MMP-2 consensus peptide for peptide substrates derived from several known MMP-2 protein substrates. Parameters were determined as for Table 2.
Part D lists predicted MMP cleavage sites from computer database searches using matrices derived from the cleavage motifs for several MMPs. Unless otherwise indicated, the sequence is from the human ortholog.
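The tabulation described for Part B can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the table: the function names are hypothetical, the sites are assumed to be written in the `XXXX-XXXX` form used in Part A (P4 to P1, scissile bond, P1' to P4'), and only a subset of sites whose sequences are legible in this text is used, so the counts do not reproduce Part B.

```python
from collections import Counter

# Position labels for an 8-residue site written as "P4P3P2P1-P1'P2'P3'P4'".
POSITIONS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]


def tabulate_sites(sites):
    """Count how often each residue occurs at each position."""
    counts = {pos: Counter() for pos in POSITIONS}
    for site in sites:
        unprimed, primed = site.split("-")
        if len(unprimed) != 4 or len(primed) != 4:
            raise ValueError(f"expected XXXX-XXXX, got {site!r}")
        for pos, residue in zip(POSITIONS, unprimed + primed):
            counts[pos][residue] += 1
    return counts


def recurring(counts, minimum=2):
    """Keep only residues seen at least `minimum` times at a position."""
    return {
        pos: [(aa, n) for aa, n in counter.most_common() if n >= minimum]
        for pos, counter in counts.items()
    }


# Subset of the sites listed in Part A (plus the SPARC site as given in Part C).
sites = [
    "GPYE-LKAL", "TAAA-LTSC", "GHAR-LVHV", "QPVG-INTS",
    "ELGT-YNVI", "DVAN-YNFF", "HPVG-LLAR", "KPQQ-FFGL",
]
summary = recurring(tabulate_sites(sites))
print(summary["P1'"])  # residues that recur at P1' in this subset
```

Residues that occur only once are dropped, mirroring the rule that only residues arising more than once are listed.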
Based on the library results, most of these substrates would be predicted to be reasonable, though suboptimal, substrates for MMP-2. To test this prediction, we prepared peptides corresponding to several of these known cleavage sites and compared their rates of cleavage to that of the consensus MMP-2 peptide (Table 4C). Peptides derived from three of the substrates (FGFR1, MCP3, and collagen-α1(I)) were cleaved less efficiently than the consensus peptide, as predicted. The presence of a suboptimal cleavage site in a protein substrate may reflect the importance of protease exosites in substrate recognition or may indicate that the protein is a suboptimal substrate for purposes of regulated degradation. The peptide derived from the MMP-2 cleavage site in SPARC, however, was cleaved about twofold more efficiently than the consensus peptide. A possible explanation of this apparent discrepancy would be cooperativity between subsites, which would not be detected using completely random mixture libraries. Mixture-based libraries can reveal such cooperativity if multiple libraries containing different fixed positions are employed, as was shown for the phosphopeptide-binding specificity of 14-3-3 proteins (13).
Prediction and identification of novel protein substrates. We identified a number of novel potential substrates by searching the SWISS-PROT database using the Scansite program (27) with matrices derived from the library data for the various MMPs (Table 4D). Among these proteins is the serpin PAI-3, which bears a predicted MMP-2 site near the reactive bond and is likely to be prone to inactivation by MMPs by analogy with other serpins (28, 29). Cleavage motifs are also found in a number of integrin-α subunits. As degradation of integrin-β by MMP-7 has been previously reported (30), proteolysis of integrins may be a general mechanism by which MMPs regulate cell-matrix interactions. Several extracellular matrix components were also identified as possible substrates. Particularly interesting within this group is the brain-specific chondroitin sulfate proteoglycan neurocan, whose predicted MMP-2 cleavage site corresponds to a known, developmentally regulated in vivo processing site (31). We tested whether MMP-2 could cleave neurocan at the predicted site. Neonatal rat brain neurocan, purified as a mixture of the full-length proteoglycan and its C-terminal fragment, was treated with MMP-2 and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE; Fig. 3). Treatment with low concentrations of MMP-2 resulted in the complete disappearance of full-length neurocan with a concomitant increase in the abundance of the C-terminal fragment. MMP-2 digestion also generated a faster-migrating band, which was confirmed to be the N-terminal fragment by immunoblotting with a monoclonal antibody (1F6, ref. 32) that recognizes the N terminus of neurocan. Proteolysis was completely blocked by the hydroxamate MMP inhibitor GM6001. Treatment with equimolar amounts of MMP-1 or MMP-9, which according to their library profiles are not predicted to cleave neurocan at its processing site, did not result in cleavage.
These results indicate that MMP-2 can specifically process neurocan in vitro and demonstrate that cleavage motifs based on library data can be used to identify novel protein substrates. Significantly, the residues at two sites (P3 and P2) important for MMP recognition were not previously known to promote cleavage by MMP-2, suggesting that neurocan would not have been predicted to be an MMP-2 substrate. Whether MMP-2 acts as the authentic neurocan-processing enzyme in vivo remains to be determined and awaits analysis of MMP-2 knockout mice.
We have described the use of oriented-peptide library mixtures to determine protease cleavage site motifs. The motifs determined for six MMPs are in general agreement with prior work, and some previously unappreciated selections were observed. This study also provides the first data on the substrate specificity of MT1-MMP outside of the P1' position (33). The method requires the synthesis of a minimum number of distinct mixture libraries; eight positions surrounding the cleavage site were evaluated for six enzymes using only three separate libraries. The method should therefore allow motifs for large families of related enzymes to be deduced rapidly. For many proteases, no prior knowledge of the enzyme's cleavage specificity is required, because orienting residues for second-generation libraries are identified from initial screens of completely degenerate libraries. A limitation of the method is that for those proteases with major selectivity sites only on the unprimed side, information from first-round screening of the fully degenerate library would probably be insufficient in itself to produce secondary libraries capable of deriving the unprimed-side motif. In such cases, prior knowledge of the major specificity position would be required to design a library that fixes sites on both sides of the cleaved bond. Unlike previously described approaches, mixture-based libraries are rapid, provide data for both the primed and unprimed positions, and are theoretically applicable to any protease that can digest peptide substrates.
Example 2: Determination of the B. anthracis lethal factor cleavage site
The method described above in Example 1 was used to determine the cleavage site motif for anthrax lethal factor (LF). To obtain LF, native enzyme was purified from culture supernatants of Bacillus anthracis strain Sterne according to published procedures (60). In practice it was found that the completely degenerate dodecamer library was cleaved inefficiently by LF, so an LF-specific library (acetyl-KKKPTPXXXXXAK, where X indicates a degenerate position; SEQ ID NO:67) was prepared based on the LF cleavage site from MEK-1 (61, 62). Partial digestion of this library followed by Edman sequencing allowed the primed side motif for LF to be determined (Table 5). The strongest selectivity was seen at the P1' position, where LF appears to require a hydrophobic residue, with the strongest preference for tyrosine. A substantial selection for proline was seen at the P2' position. Selection at P3' and beyond was weaker, but included subtle preferences for asparagine and aliphatic hydrophobic residues at P3' and for glutamate and alanine at P4'. The primed side data were used to prepare a biotinylated secondary library to allow for determination of the motif for the unprimed side. The library was prepared with the sequence MXXXXXPYPMEDK(K-biotin) (SEQ ID NO:68), where X indicates a degenerate position and K-biotin is a biotinyllysine residue. The library bears fixed optimal residues at P1'-P4', and a proline residue was fixed at the P1 position, as mutation of this residue to alanine was reported to eliminate cleavage of MEK-1 by LF (61). Using this secondary library we were able to determine the LF motif for the unprimed positions (Table 5). LF selects basic residues (lysine, arginine, and histidine) at the P6-P4 positions, and has a strong preference for hydrophobic (particularly aromatic) residues at the P2 position. Subtle preferences for proline and valine were seen at P3. The lethal factor cleavage site motif is provided as SEQ ID NO:69:
Table 5: Lethal factor (LF) cleavage site motif
Cleavage position
P6 P5 P4 P3 P2 P1 P1' P2' P3' P4'
R (1.6) K (1.6) K (1.4) V (1.4) Y (3.0) P Y (2.8) P (1.8) N (1.4) E (1.6)
K (1.5) R (1.4) H (1.3) A (1.4) L (1.6) L (2.0) M (1.2) A (1.6)
S (1.4) H (1.3) P (1.3) F (1.5) I (1.9)
H (1.3) M (1.6) F (1.6)
Primed positions were determined using the library acetyl-KKKPTPXXXXXAK (SEQ ID NO:67), and unprimed positions were determined using the secondary library MXXXXXPYPMEDK(K-biotin) (SEQ ID NO:68). Selection values shown in parentheses are the relative amount of a given amino acid found at a given sequencing cycle, normalized so that a value of 1 corresponds to the average quantity per amino acid in that cycle and would indicate no selection. Only positive selections of 1.2 or over are listed.
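The normalization described in the note to Table 5 can be written out as a short Python sketch. The function names and the input amounts are hypothetical and serve only to illustrate the arithmetic: each amino acid's amount in a sequencing cycle is divided by the mean amount per amino acid in that cycle, so a value of 1 indicates no selection, and only values at or above the 1.2 cutoff are reported.

```python
def selection_values(cycle_amounts):
    """Normalize amounts from one sequencing cycle to the per-residue mean.

    `cycle_amounts` maps each amino acid to the quantity recovered in that
    cycle (any consistent unit, e.g. pmol). A result of 1.0 means no selection.
    """
    if not cycle_amounts:
        raise ValueError("no amino acids supplied")
    mean = sum(cycle_amounts.values()) / len(cycle_amounts)
    if mean <= 0:
        raise ValueError("mean amount must be positive")
    return {aa: amount / mean for aa, amount in cycle_amounts.items()}


def positive_selections(cycle_amounts, threshold=1.2):
    """List (residue, value) pairs at or above the threshold, strongest first."""
    values = selection_values(cycle_amounts)
    hits = [(aa, v) for aa, v in values.items() if v >= threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [(aa, round(v, 1)) for aa, v in hits]


# Hypothetical amounts for a four-residue mixture (illustrative numbers only).
example_cycle = {"Y": 30.0, "L": 20.0, "A": 5.0, "G": 5.0}
print(positive_selections(example_cycle))
```

In a real screen the dictionary would hold one entry per amino acid present at the degenerate position.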
Example 3: Preparation of fluorogenic peptides
Cleavage site specificity data were directly employed in the generation of efficient intramolecularly quenched fluorogenic peptide substrates, which allow protease activity to be monitored quantitatively (68, 69). We prepared such a peptide by flanking the P5-P4' positions of the LF consensus motif with an amino-terminal fluorescent methoxycoumarinacetyl group and a carboxy-terminal dinitrophenyl-diaminopropionic acid quenching moiety (Mca-KKVYP~YPME-Dap(Dnp); SEQ ID NO:70) ("~" indicates the location of the scissile bond). Exhaustive cleavage of the optimized substrate by LF was found to result in a 13.5-fold increase in fluorescence, and Edman sequencing of the reaction products indicated that cleavage occurred exclusively at the intended scissile bond.
This peptide serves as a tool to allow determination of the potency of LF inhibitors in vitro. As the peptide allows for rapid and facile monitoring of LF activity, it is also suitable for use in high-throughput screens of chemical libraries for LF inhibitors. The Vmax/KM for LF cleavage of this consensus substrate was found to be significantly (14-fold) higher than for cleavage of an analogous peptide derived from the LF cleavage site in MEK-1 (Mca-KKPTP~IQLN-Dap(Dnp); SEQ ID NO:71). The peptide library methodology thereby allowed us to produce a substrate with much improved properties over what would have been possible based on prior knowledge alone.
Example 4: Fluorescent reporter molecules for use in cells
Though fluorogenic peptide substrates provide useful tools for evaluating the activity of proteases in vitro, a means for evaluating activity within living cells is also desirable, since this would allow for the direct screening for inhibitors that are both cell-permeant and metabolically stable, which are essential properties for clinically useful compounds. The optimal cleavage motif data are used to prepare fluorescent reporters that can be used to monitor activity within living cells (Fig. 4). The strategy takes advantage of recent advances in the development of enhanced green fluorescent protein (GFP) derivatives which exhibit a variety of spectral properties (70). For example, the emission spectrum of the enhanced cyan fluorescent protein (CFP) overlaps with the excitation spectrum of yellow fluorescent protein (YFP). When the two proteins are in close proximity (less than 100 Å), excitation of CFP results in substantial fluorescence resonance energy transfer (FRET) to YFP, resulting in a dampening of the CFP emission and an increase in YFP emission (70, 71). Insertion of a protease cleavage site between CFP and YFP molecules results in a fusion protein which exhibits FRET until cleavage by the protease, which causes spatial separation of donor and acceptor. The resultant loss of FRET can be monitored by changes in the ratio of CFP emission to YFP emission (FRET ratio). This strategy, or variants thereof, has been used successfully to monitor activity of the proteases caspase-1, caspase-3 and calpain in living cells (72, 73). Mammalian expression constructs are generated that insert the LF optimal cleavage site between a CFP/YFP pair, as well as a GFP/red fluorescent protein (RFP) pair, which also exhibits FRET. As controls, similar fusions are generated where the LF cleavage site is scrambled and thus not susceptible to cleavage, as well as constructs using the foot-and-mouth disease virus 2A autocatalytic processing site, which should undergo constitutive cleavage (74).
These constructs are tested by transient expression in a cell type which can be efficiently transfected (e.g., COS cells or 293T cells) and the cells treated with PA plus varying concentrations of LF. Cells are observed by fluorescence microscopy to monitor changes in the FRET ratio upon LF treatment. Upon observation of a significant and reproducible decrease in FRET in these preliminary experiments, cell lines which stably express these fusions are generated by standard protocols. Stable lines facilitate screening of inhibitors in a high-throughput manner by providing a population in which all cells express the fluorescent constructs, and by eliminating the need for a transfection step.
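The FRET ratio readout can be illustrated with a short Python sketch. The function names and emission intensities below are hypothetical placeholders, not measurements. The ratio is taken as CFP emission divided by YFP emission, as defined in this Example, so cleavage of the reporter (loss of FRET) is expected to raise the ratio, because CFP emission recovers while YFP emission falls.

```python
def fret_ratio(cfp_emission, yfp_emission):
    """Ratio of donor (CFP) emission to acceptor (YFP) emission."""
    if yfp_emission <= 0:
        raise ValueError("YFP emission must be positive")
    return cfp_emission / yfp_emission


def ratio_fold_change(before, after):
    """Fold change in the CFP/YFP ratio between two (CFP, YFP) readings."""
    return fret_ratio(*after) / fret_ratio(*before)


# Hypothetical background-subtracted intensities (arbitrary units).
untreated = (100.0, 200.0)   # intact reporter: strong FRET, low CFP/YFP
lf_treated = (180.0, 90.0)   # cleaved reporter: FRET lost, high CFP/YFP
print(ratio_fold_change(untreated, lf_treated))
```

A scrambled-site control would be expected to give a fold change near 1 under the same treatment.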
Example 5: Generation of peptide-based protease inhibitors
As LF is a metalloproteinase, a group which chelates the active site zinc ion is incorporated at either the amino- or carboxy-terminus of an optimized peptide. Such inhibitors can achieve remarkable potency and specificity by virtue of an avidity effect in which two separate binding groups, the metal chelator and the peptide moiety, are linked in a single molecule (75, 76). Because LF has significant selectivity on either side of the scissile bond, two types of inhibitors are generated and tested for their ability to inhibit LF. One type, incorporating unprimed residues, is synthesized bearing a hydroxamic acid group in place of the carboxylic acid. Solid phase synthesis of peptide hydroxamates is carried out according to well-established procedures by employing a commercially available hydroxylamine-bearing resin from which the growing peptide chain can be synthesized by standard Fmoc chemistry (77). Substrate-analogous peptide hydroxamates have been generated as potent inhibitors of several families of metalloproteinases, including matrix metalloproteinases and astacins (78, 79).
[Figure: structures of the two inhibitor types: unprimed side residues with a carboxy-terminal hydroxamic acid moiety, and primed side optimal residues with an amino-terminal thioacetyl group.]
The other type of inhibitor incorporates primed residues and bears an amino-terminal thioacetyl group. Such thioacetyl peptides make potent inhibitors of thermolysin, a bacterial metalloprotease related to LF (80). Thioacetyl peptides are generated on the solid phase by coupling 2-(acetylthio)acetyl succinimide to the amino-terminus of the resin-bound, side chain-protected peptide, followed by liberation of the free thiol with a standard Fmoc chemistry deprotection cycle (piperidine in dimethyl formamide). Peptide derivatives are purified by reversed-phase HPLC. Inhibitors of varying peptide length (3 to 5 amino acid residues) are synthesized to optimize this parameter empirically.
An inhibitor based on the P1'-P3' residues of the optimal substrate, 2-thioacetyl-Tyr-Pro-Met-amide, was prepared. The Ki for this inhibitor was evaluated in the standard protease cleavage assays and found to be approximately 10 μM. The corresponding inhibitor based on the MEK-1 cleavage site could not be evaluated due to its insolubility in water and other solvents.
Inhibitors based on the P4-P1 residues of the optimal substrate, α-acetyl-Lys-Val-Tyr-Pro-hydroxamic acid (SEQ ID NO:72) and α-acetyl-Lys-Val-Tyr-βAla-hydroxamic acid (SEQ ID NO:73), are prepared. A similar inhibitor based on MEK-1, α-acetyl-Lys-Pro-Thr-Pro-hydroxamic acid (SEQ ID NO:74), is also prepared for comparative testing purposes.
The potency of the candidate inhibitors in the inhibition of LF is initially determined in vitro, and their specificity for LF is evaluated by assaying for their ability to inhibit other metalloproteases. Next, the ability of the candidate inhibitors to prevent lysis of cultured macrophages treated with LT is evaluated. Compounds which perform well in cell culture are tested for their ability to protect mice from a lethal challenge with LT.
As an approach for generating more potent inhibitors, new peptide library screens are conducted which incorporate non-proteinogenic "unnatural" amino acids. By expanding the repertoire of available residues, substrates are identified that are cleaved more efficiently than any that can be generated when restricted to the 20 natural amino acids, and corresponding inhibitors derived from such substrates are anticipated to have improved potency. Use of unnatural amino acids in our libraries requires slight modifications to our standard protocol, whereby libraries are analyzed by Edman degradation-based amino-terminal sequencing of the peptide mixtures. Quantitative analysis of peptides by Edman degradation is carried out by measuring the yield of the phenylthiohydantoin (PTH) adduct for a particular amino acid residue, which is identified based on its retention time on an HPLC column. To incorporate novel amino acids into the mixture, their PTH adducts need simply have a retention time distinct from those of all other components in the library. A vast number of unnatural amino acids have been identified, by others and by our own preliminary work, which are both commercially available in the Fmoc-protected form suitable for solid phase peptide synthesis and whose PTH adducts have unique retention times (81).
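The requirement that every PTH adduct in a mixture have a distinct retention time can be checked mechanically. The following Python sketch is an illustration under stated assumptions: the function name, the 0.3-minute minimum separation, and all retention times are hypothetical placeholders rather than measured values. It sorts the adducts by retention time and reports neighboring pairs that elute too close together to be quantified separately.

```python
def unresolved_pairs(retention_times, min_gap=0.3):
    """Return neighboring residue pairs whose PTH peaks are closer than min_gap.

    `retention_times` maps residue name -> HPLC retention time in minutes.
    Only neighbors in elution order need checking: if any two peaks are too
    close, at least one pair of neighbors is too close as well.
    """
    ordered = sorted(retention_times.items(), key=lambda item: item[1])
    return [
        (name_a, name_b)
        for (name_a, t_a), (name_b, t_b) in zip(ordered, ordered[1:])
        if t_b - t_a < min_gap
    ]


# Hypothetical retention times in minutes (placeholders, not real data).
candidate_mixture = {"Met": 17.6, "Nva": 17.5, "Nle": 20.0, "Chx-Ala": 23.1}
print(unresolved_pairs(candidate_mixture))
```

A mixture is usable for sequencing-based quantification only when this list is empty; otherwise one member of each reported pair would be moved to a different mixture.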
Initially, the amino acids close to the LF cleavage site are investigated. Accordingly, hydrophobic aliphatic and aromatic residues as well as proline analogs are investigated, in keeping with the general properties of the residues selected by LF in the P3 to P3' positions. A representative group of such amino acids is shown below.
[Figure: structures of a representative group of unnatural amino acids.]
Degenerate positions in unnatural amino acid-containing libraries are of similar complexity to those in our natural amino acid-containing libraries (about 20 distinct amino acids). Four separate unnatural amino acid mixtures are prepared so that roughly 80 unnatural amino acids may be evaluated at each site. Mixtures also include the optimal natural amino acid residue for each position to allow us to determine whether any unnatural amino acid is an improvement over the natural one at a given position. For each distinct mixture, two libraries are synthesized wherein either all of the primed or all of the unprimed positions are fixed to ensure that cleavage occurs at the intended scissile bond. We have conducted preliminary screens using unnatural amino acid-containing libraries. A library with the sequence KKKPYPXXXXGK (SEQ ID NO:75) was prepared in which the degenerate positions X contain a mixture of the following amino acids: A, Y, P, V, M, K, aminobutyric acid, allylglycine, S-methylcysteine, norvaline, norleucine, p-chlorophenylalanine, S-benzylcysteine, S-methoxybenzylcysteine, and β-cyclohexylalanine. The results are summarized in the table below.
Table 6: Screening of library containing unnatural amino acids for LF cleavage site residues
Cleavage site position P1' P2' P3'
Nle (1.8) Pro (1.7) Allylgly (1.7)
Cys(Me) (1.7) Cys(Bzl) (1.3) Nva (1.6)
M (1.6) Cys(Me) (1.4)
Nva (1.6) Cl-Phe (1.4) Chx-Ala (1.4)
Abbreviations: Nle, norleucine; Cys(Me), S-methylcysteine; Nva, norvaline; Cl-Phe, p-chlorophenylalanine; Chx-Ala, β-cyclohexylalanine; Cys(Bzl), S-benzylcysteine; Allylgly, allylglycine. The numbers in parentheses represent the preference values calculated as described in the Examples above. The sequence of the LF cleavage site motif determined using the SEQ ID NO:75 library containing unnatural amino acids is KKKPYPXaa1Xaa2Xaa3Xaa4GK (SEQ ID NO:76), wherein the cleavage site primed amino acids (P1'-P2'-P3') are Xaa1-Xaa2-Xaa3.
At two positions, P1' and P3', unnatural amino acids were favored over the most highly selected natural ones. In addition, the best natural amino acid at P1', methionine, differs from the best natural amino acid selected by LF at P1' when using the previous library KKKPTPXXXXXAK (SEQ ID NO:67), which was tyrosine. This is likely a consequence of fixing tyrosine at P2 in the newer library and suggests that application of these libraries in an iterative manner can be used for substrate optimization. Any novel selections which arise from the unnatural libraries are confirmed by incorporating them into fluorogenic peptide substrates to see whether the new substrate is indeed an improvement over the previously defined consensus peptide. Hydroxamic acid and thioacetyl-peptide inhibitors similar to those described above also are synthesized and evaluated.
Example 6: Identification of lethal factor binding motifs
The library method described above selects for efficient substrates, which must undergo turnover, and not for tight-binding peptides per se. As an alternative strategy, libraries are screened directly for peptides which bind to LF. This approach was used previously to generate a specific peptide inhibitor of the protein tyrosine kinase ZAP-70 (83). For LF, peptide library mixtures are synthesized bearing a carboxy-terminal hydroxamic acid group, which will serve to orient the library. The library (for example, MAXXXXXX-hydroxamate; SEQ ID NO:77) is applied to a column containing immobilized LF, the column is washed extensively, and bound peptides are eluted with either low pH or a metal chelator. The bound pool is then sequenced as usual to determine the preferences at each site for LF inhibitors. If necessary, analogous libraries are made containing the same unnatural amino acid mixtures which were used in the substrate screens described above. Consensus peptide hydroxamates are individually synthesized and evaluated as LF inhibitors.
Example 7: Evaluation of inhibitors
Candidate LF inhibitors produced as a result of the library screens will be evaluated in a number of assays. Initially, Ki values for inhibition of LF cleavage of the peptide substrate are determined by fluorometric assay by titrating the concentration of inhibitor under initial rate conditions, using a fixed substrate concentration well below the KM. Compounds also are tested for their ability to inhibit the cleavage of a known protein substrate in cell lysates. We have been able to cleanly assay cleavage of MEK-4 by LF in cell lysates by immunoblotting. Compounds that do well in vitro also are tested for their ability to inhibit LF in live cells using the FRET substrate described above. In the absence of a functional live cell FRET assay, LF inhibition in cells is assayed by following MEK-4 cleavage in extracts from cells treated with LF plus PA in the presence of varying concentrations of inhibitor. Finally, the compounds are evaluated for their ability to inhibit lysis of macrophage cell lines by LeTx.
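The reason for keeping the substrate concentration well below the KM can be made concrete with a small Python sketch. It assumes simple, reversible, competitive inhibition with no inhibitor depletion, which is an assumption of this illustration and is not stated in the text. Under that model and with [S] far below KM, the initial rate follows v = v0 / (1 + [I]/Ki), so v0/v - 1 is proportional to [I] with slope 1/Ki. The function name and the data are hypothetical.

```python
def estimate_ki(inhibitor_concs, rates, uninhibited_rate):
    """Estimate Ki from initial rates measured at [S] << KM.

    Assumes competitive inhibition, so v0/v - 1 = [I]/Ki. The slope is fit by
    least squares through the origin; Ki is returned in the same units as
    the inhibitor concentrations.
    """
    xs, ys = [], []
    for conc, rate in zip(inhibitor_concs, rates):
        if rate <= 0:
            raise ValueError("rates must be positive")
        xs.append(conc)
        ys.append(uninhibited_rate / rate - 1.0)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    if sum_xy <= 0:
        raise ValueError("no measurable inhibition in the data")
    return sum_xx / sum_xy  # 1 / slope


# Synthetic data generated from the model itself with Ki = 10 (e.g. in μM).
concs = [5.0, 10.0, 20.0, 40.0]
v0 = 100.0
rates = [v0 / (1.0 + c / 10.0) for c in concs]
print(estimate_ki(concs, rates, v0))
```

If the substrate concentration were comparable to the KM, the apparent value from this fit would overestimate Ki by a factor of 1 + [S]/KM for a competitive inhibitor.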
Compounds which appear promising based on their activity in cultured cells are synthesized on a larger scale and evaluated against LeTx in an animal model (52, 67). Fisher rats injected intravenously with LeTx alone die within several hours. Rats are injected with varying concentrations of PA, LF and inhibitor and evaluated for rescue from or delay of onset of a moribund state. Compounds which exhibit potency in the rat model are considered candidates for anti-anthrax therapy in humans.
Example 8: Identification of protein substrates
Cleavage site motif data are used to identify downstream substrates of the proteases analyzed. Knowledge of the substrates for a protease is crucial to understanding its function at the molecular level, and may provide additional targets for therapeutic intervention. Our laboratory has recently developed a world wide web-accessible computer program called Scansite (http://scansite.mit.edu/) for searching protein sequence databases for the presence of short peptide motifs (27). Scansite offers substantial improvements over previously existing sequence database searching programs such as BLAST, which are better suited to longer individual sequences.
Scansite searches are performed using matrices (of weighted amino acid preference by cleavage site position) corresponding to the cleavage site for the protease of interest against protein sequences in public databases belonging either to the organism itself or to the mammalian host as appropriate. In the case of LF, though a number of MEK proteins have been identified thus far as substrates, there is strong reason to suspect that other substrates exist which are important for the activity of LF on cells. No functional data linking MEK protein cleavage to macrophage cell lysis by LF have been reported. In addition, though MEK-1 cleavage by LF is reported to impair its ability to activate Erk, LeTx treatment of macrophages results in a strong though transient activation of Erk (62). Since activation of three major groups of MAP kinases (Erk, JNK and p38 proteins) is essential for full transcriptional and post-transcriptional upregulation of inflammatory cytokine production under other circumstances (84), it seems likely that cleavage of other substrates will turn out to be important for LF action.
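The idea of scanning a protein sequence with a position-weighted matrix can be sketched in Python. This is not Scansite's algorithm or scoring scheme; it substitutes a simple sum of log selection values, for which a higher score means a better match. The matrix holds only the top-row entries of Table 5 (the residues whose column assignment is unambiguous), P1 is left unweighted because proline was fixed there and has no selection value, unlisted residues are treated as neutral (1.0), and all names are hypothetical.

```python
import math

# One dict per position, P6 ... P4'. Values are Table 5 top-row selection values.
LF_MATRIX = [
    {"R": 1.6},  # P6
    {"K": 1.6},  # P5
    {"K": 1.4},  # P4
    {"V": 1.4},  # P3
    {"Y": 3.0},  # P2
    {},          # P1 (fixed as proline in the library; left unweighted here)
    {"Y": 2.8},  # P1'
    {"P": 1.8},  # P2'
    {"N": 1.4},  # P3'
    {"E": 1.6},  # P4'
]
N_UNPRIMED = 6  # residues before the scissile bond in each window


def score_window(window, matrix=LF_MATRIX):
    """Sum of log selection values; residues not in the matrix count as 1.0."""
    return sum(math.log(col.get(aa, 1.0)) for col, aa in zip(matrix, window))


def scan(sequence, matrix=LF_MATRIX, top=5):
    """Return the best-scoring windows as (score, P1 position, site) tuples.

    The P1 position is 1-based; the site is written with '-' at the bond.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        site = window[:N_UNPRIMED] + "-" + window[N_UNPRIMED:]
        hits.append((score_window(window, matrix), start + N_UNPRIMED, site))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits[:top]


# Toy sequence with the consensus embedded between glycine spacers.
toy_sequence = "GGGG" + "RKKVYPYPNE" + "GGGG"
for score, position, site in scan(toy_sequence, top=3):
    print(f"{score:.3f}  P1 at {position}  {site}")
```

A database search would apply the same scan to every protein sequence and rank all windows together.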
In order to identify such substrates, we have performed Scansite searches using matrices derived from the LF data shown in Table 5 against mammalian proteins in the Swiss-Prot database. Interestingly, MEK-1 appears as one of the top scoring hits, and MEK-4 also appears on the list of potential substrates (Table 7).
Table 7. Proteins with putative LF cleavage sites as determined by searching the Swiss-Prot database with Scansite.
Score Protein ID Protein name Cleavage site Seq Id No
0.11821 ATF6_HUMAN ATF-6 332 KKKEYM-LGLEAR 78
0.12231 MPK1_HUMAN MEK-1 7 KKKPTP-IQLNPA 79
0.12477 DYN2_HUMAN DYNAMIN 2 565 KEKKYM-LPLDNL 80
0.13073 TNR2_HUMAN TYPE II TNF-α RECEPTOR 294 KKKPLC-LQREAK 81
0.13101 IF2A_HUMAN eIF-2α 270 KRGVFN-VQMEPK 82
0.13621 P11B_HUMAN PI3-KINASE p110β SUBUNIT 505 KKQPYY-YPPFDK 83
0.13809 STK9_HUMAN SER/THR-PROTEIN KINASE 9 117 KVKSYI-YQLIKA 84
0.13965 KCCA_MOUSE CAM KINASE IIαC 232 KARAYD-FPSPEW 85
0.14079 P11A_BOVIN PI3-KINASE p110α SUBUNIT 947 KKKKFG-YKRERV 86
0.14292 RTC0_HUMAN SMALL GTPase TC10 86 RLRPLS-YPMTDV 87
0.14363 MPK4_HUMAN MEK-4 46 KRKALK-LNFANP 88
0.14588 PLK1_MOUSE POLO-LIKE KINASE 1 270 KKNEYS-IPKHIN 89
0.14674 CL36_HUMAN CLP-36 104 KRHPYK-MNLASE 90
0.15041 KSYK_HUMAN SYK TYROSINE KINASE 298 RIKSYS-FPKPGH 91
0.15080 CBL_HUMAN C-CBL PROTO-ONCOGENE 535 KDKPLP-VPPTLR 92
0.15145 BUB3_HUMAN BUB3 CHECKPOINT PROTEIN 221 QKKKYA-FKCHRL 93
0.15357 AKA8_RAT A-KINASE ANCHOR PROTEIN 8 307 KRKPFP-LYEEPD 94
0.15634 HS1_MOUSE HEMAT. LINEAGE PROTEIN 324 SRREYP-VPSLPT 95
0.16201 PRX1_HUMAN PROX1 484 FRHPFP-LPLMAY 96
0.16654 PRSA_HUMAN PROTEASOME REG. SUB. 6A 355 RKIEFP-MPNEEA 97
0.16945 SOS1_HUMAN SOS1 1034 KKYSYP-LKSPGV 98
0.17032 DDX3_HUMAN DEAD BOX PROTEIN 3 (HLP2) 267 RRKQYP-ISLVLA 99
0.17042 PTPE_MOUSE PTP-ε 113 PKKFFP-IPVEHL 100
0.17112 MAP4_HUMAN μTUBULE-ASSOC PROTEIN 4 893 RVKATP-MPSRPS 101
0.18210 ACA_HUMAN PP2A, REG. SUBUNIT 464 KKCPTP-MQNEIG 102
0.18348 FXM1_HUMAN FORKHEAD M1 (FKHL16) 17 KRRRLP-LPVQNA 103
The top 25 sites in cytoplasmic proteins are shown. When orthologs of the same protein from different species occurred, only the top ranking hit is shown.
Greater than 100 proteins were identified which possess strong LF cleavage motifs. Since too many candidates exist to test them all individually, the Scansite results are complemented by analyzing cleavage of proteins in cell lysates by two-dimensional (2D) gel electrophoresis followed by silver-staining. Any protein spots which either shift or disappear upon LF treatment are referenced against the list of proteins from the Scansite search, which can be sorted based on their molecular weights and isoelectric points, to help narrow the range of possible substrates. If possible, spots are excised from stained gels, digested with trypsin, and submitted for mass spectrometry to identify the protein directly (85). Individual candidates are tested in a number of ways. Proteins for which antibodies are available are tested by probing LF-treated lysates against untreated lysates on immunoblots. When antibodies are not available, cDNA clones encoding the protein of interest are acquired and used to construct either epitope-tagged mammalian expression vectors or bacterial GST-fusion constructs, which are used to evaluate whether the protein can be cleaved by LF.
A risk associated with 2D electrophoresis approaches is that many cellular proteins will escape detection due to inadequate sensitivity of silver staining or inefficient separation. In the case that we are unable to visualize new substrates by 2D electrophoresis, an alternative approach is undertaken using a recently developed expression cloning method based on the screening of small cDNA pools (86). Small pool screening has been used successfully in a number of contexts, including the identification of substrates for caspase family proteases (87). In this method, a cDNA library is subdivided into pools containing approximately 100 clones apiece. Pools are transcribed and translated in a reticulocyte lysate fed [35S]-methionine to metabolically label the proteins synthesized. Aliquots from the translation mix are incubated in the presence or absence of protease and analyzed by SDS-PAGE followed by autoradiography. Pools containing a protein which is cleaved by LF are then subdivided by a sib selection procedure until a single clone encoding the candidate substrate is isolated and sequenced to determine its identity. Putative substrates identified by this method are confirmed independently by testing the ability of LF to cleave either the endogenous protein in cell lysates or protein produced by expression in bacteria or mammalian cells.
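The sib selection step can be pictured as a recursive subdivision of positive pools. The Python sketch below is only a schematic of that logic: `pool_is_positive` is a hypothetical stand-in for the wet-lab readout (in vitro translation, incubation with or without LF, SDS-PAGE and autoradiography), and the function name and the choice of ten sub-pools per round are assumptions of this illustration.

```python
def sib_select(pool, pool_is_positive, n_subpools=10):
    """Subdivide a positive pool until single positive clones are isolated.

    `pool` is a list of clone identifiers; `pool_is_positive(pool)` returns
    True when the pool contains at least one clone whose product is cleaved.
    Returns the list of individual positive clones.
    """
    if n_subpools < 2:
        raise ValueError("need at least two sub-pools per round")
    if not pool or not pool_is_positive(pool):
        return []
    if len(pool) == 1:
        return list(pool)
    size = -(-len(pool) // n_subpools)  # ceiling division, at least 1
    hits = []
    for start in range(0, len(pool), size):
        hits.extend(sib_select(pool[start:start + size], pool_is_positive, n_subpools))
    return hits


# Simulated pool of 100 clones with one hidden substrate (illustrative only).
clones = [f"clone{i}" for i in range(100)]
hidden_substrates = {"clone37"}


def simulated_assay(pool):
    """Stand-in for the cleavage assay: positive if any substrate is present."""
    return any(clone in hidden_substrates for clone in pool)


print(sib_select(clones, simulated_assay))
```

With pools of 100 clones and ten sub-pools per round, a single substrate is isolated after two rounds of subdivision.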
References
1. Matthews, D.J. & Wells, J.A. Substrate phage: selection of protease substrates by monovalent phage display. Science 260, 1113-1117 (1993).
2. Smith, M.M., Shi, L. & Navre, M. Rapid identification of highly active and selective substrates for stromelysin and matrilysin using bacteriophage display libraries. J. Biol. Chem. 270, 6440-6449 (1995).
3. Rano, T.A. et al. A combinatorial approach for determining protease specificities: application to interleukin-1β converting enzyme (ICE). Chem. Biol. 4, 149-155 (1996).
4. Backes, B.J., Harris, J.L., Leonetti, F., Craik, C.S. & Ellman, J.A. Synthesis of positional-scanning libraries of fluorogenic peptide substrates to define the extended substrate specificity of plasmin and thrombin. Nat. Biotechnol. 18, 187-193 (2000).
5. Harris, J.L. et al. Rapid and general profiling of protease specificity by using combinatorial fluorogenic substrate libraries. Proc. Natl. Acad. Sci. USA 97, 7754-7759 (2000).
6. Birkett, A.J. et al. Determination of enzyme specificity in a complex mixture of peptide substrates by N-terminal sequence analysis. Anal. Biochem. 196, 137-143 (1991).
7. Petithory, J.R., Masiarz, F.R., Kirsch, J.F., Santi, D.V. & Malcolm, B.A. A rapid method for determination of endoproteinase substrate specificity: Specificity of the 3C proteinase from hepatitis A virus. Proc. Natl. Acad. Sci. USA 88, 11510-11514 (1991).
8. Arnold, D. et al. Substrate specificity of cathepsins D and E determined by N-terminal and C-terminal sequencing of peptide pools. Eur. J. Biochem. 249, 171-179 (1997).
9. Berman, J. et al. Rapid optimization of enzyme substrates using defined substrate mixtures. J. Biol. Chem. 267, 1434-1437 (1992).
10. Songyang, Z. et al. SH2 domains recognize specific phosphopeptide sequences. Cell 72, 767-778 (1993).
11. Songyang, Z. et al. Use of an oriented peptide library to determine the optimal substrates of protein kinases. Curr. Biol. 4, 973-982 (1994).
12. Songyang, Z. et al. Recognition of unique carboxyl-terminal motifs by distinct PDZ domains. Science 275, 73-77 (1997).
13. Yaffe, M. B. et al. The structural basis for 14-3-3:phosphopeptide binding specificity. Cell 91, 961-971 (1997).
14. Johnson, L.L., Dyer, R. & Hupe, D.J. Matrix metalloproteinases. Curr. Opin. Chem. Biol. 2, 466-471 (1998).
15. Woessner, J.F. & Nagase, H. Matrix metalloproteinases and TIMPs. (Oxford University Press, Oxford, UK; 2000).
16. Nagase, H. & Fields, G.B. Human matrix metalloproteinase specificity studies using collagen sequence-based synthetic peptides. Biopolymers 40, 399-416 (1996).
17. Fields, G.B., Van Wart, H.E. & Birkedal-Hansen, H. Sequence specificity of human skin fibroblast collagenase. J. Biol. Chem. 262, 6221-6226 (1987).
18. Teahan, J., Harrison, R., Izquierdo, M. & Stein, R.L. Substrate specificity of human fibroblast stromelysin. Hydrolysis of substance P and its analogues. Biochemistry 28, 8497-8501 (1989).
19. Netzel-Arnett, S., Fields, G., Birkedal-Hansen, H. & Van Wart, H.E. Sequence specificities of human fibroblast and neutrophil collagenases. J. Biol. Chem. 266, 6747-6755 (1991).
20. Niedzwiecki, L., Teahan, J., Harrison, R.K. & Stein, R.L. Substrate specificity of the human matrix metalloproteinase stromelysin and the development of continuous fluorometric assays. Biochemistry 31, 12618-12623 (1992).
21. Νetzel-Arnett, S. et al. Comparative sequence specificities of human 72- and 92 -kDa gelatinases (type IN coUagenases) and PUMP (matrilysin). Biochemistry 32, 6427-6432
(1993).
22. Νagase, H., Fields, C.G. & Fields, G.B. Design and characterization of a fluorogenic substrate selectively hydrolyzed by stromelysin 1 (matrix metalloproteinase-3). J. Biol. Chem. 269, 20952-20957 (1994). 23. Deng, S.-J. et al. Substrate specificity of human coUagenase 3 assessed using a phage- displayed peptide library. J. Biol. Chem. 275, 31422-31427 (2000).
24. McGeehan, G.M. et al. Characterization ofthe peptide substrate specificities of interstitial coUagenase and 92 -kDa gelatinase: implications for substrate optimization. J Biol.
Chem. 269, 32814-32820 (1994). 25. Schechter, I. & Berger, A. On the size ofthe active site in proteases. I. Papain.
Biochem. Biophys. Res. Commun. 27, 157-62 (1967).
26. Welch, A.R. et al. Understanding the PI ' specificity ofthe matrix metalloproteinases: effect of SI ' pocket mutations in matrilysin and stromelysin-1. Biochemistry 35, 10103—
10109 (1996). 27. Yaffe, M.B. et al. A motif-based profile scanning approach for genome- wide prediction of signaling pathways. Nat. Biotechnol. 19, 348-353 (2001). 28. Liu, Z. et al. The seφin αi-proteinase inhibitor is a critical substrate for gelatinase B/MMP-9 in vivo. Cell 102, 647-655 (2000).
29. Desrochers, P.E., Mookhtiar, K., Nan Wart, H.E., Hasty, K.A. & Weiss, SJ. Proteolytic inactivation of αi-proteinase inhibitor and αi-antichymotrypsin by oxidatively activated human neutrophil metalloproteinases. J. Biol. Chem. 267, 5005-5012 (1992).
30. von Bredow, D.C., Νagle, R.B., Bowden, G.T. & Cress, A.E. Cleavage of β4 integrin by matrilysin. Exp. Cell Res. 236, 341-345 (1997).
31. Rauch, U., Karthikeyan, L., Maurel, P., Margolis, R.U. & Margolis, R.K. Cloning and primary structure of neurocan, a developmentally regulated, aggregating chondroitin sulfate proteoglycan of brain. J. Biol. Chem. 267, 19536-19547 (1992).
32. Meyer-Puttlitz, B. et al. Chondroitin sulfate and chondroitin/keratan sulfate proteoglycans of nervous tissue: developmental changes of neurocan and phosphacan. J. Neurochem. 65, 2327-2337 (1995).
33. Mucha, A. et al. Membrane type-1 matrix metalloprotease and stromelysin-3 cleave more efficiently synthetic substrates containing unusual amino acids in their P1' positions. J. Biol. Chem. 273, 2763-2768 (1998).
34. Ridky, T.W. et al. Human immunodeficiency virus, type 1 protease substrate specificity is limited by interactions between substrate amino acids bound in adjacent enzyme subsites. J. Biol. Chem. 271, 4709-4717 (1996).
35. Rauch, U. et al. Isolation and characterization of developmentally regulated chondroitin sulfate and chondroitin keratan sulfate proteoglycans of brain identified with monoclonal antibodies. J. Biol. Chem. 266, 14785-14801 (1991).
36. Fernandez-Patron, C., Radomski, M.W. & Davidge, S.T. Vascular matrix metalloproteinase-2 cleaves big endothelin-1 yielding a novel vasoconstrictor. Circ. Res. 85, 906-911 (1999).
37. Nakamura, H. et al. Brevican is degraded by matrix metalloproteinases and aggrecanase-1 (ADAMTS4) at different sites. J. Biol. Chem. 275, 38885-38890 (2000).
38. McQuibban, G.A. et al. Inflammation dampened by gelatinase A cleavage of monocyte chemoattractant protein-3. Science 289, 1202-1206 (2000).
39. Sasaki, T. et al. Limited cleavage of extracellular matrix protein BM-40 by matrix metalloproteinases increases its affinity for collagens. J. Biol. Chem. 272, 9237-9243 (1997).
40. Jackson, E. K. and Garrison, J. C. Renin and angiotensin. In The Pharmacological Basis of Therapeutics, J. G. Hardman, L. E. Limbird, P. B. Molinoff, R. W. Ruddon and A. G. Gilman, eds. McGraw-Hill (New York) 733-758 (1996).
41. Korant, B. D. and Rizzo, C. J. The HIV protease and therapies for AIDS. Adv. Exp. Med. Biol. 421, 279-284 (1997).
42. Sodeinde, O. A. et al. A surface protease and the invasive character of plague. Science 258, 1004-1007 (1992).
43. Orth, K. et al. Disruption of signaling by Yersinia effector YopJ, a ubiquitin-like protein protease. Science 290, 1594-1597.
44. Tatum, F. M., Cheville, N. F. and Morfitt, D. Cloning, characterization and construction of htrA and htrA-like mutants of Brucella abortus and their survival in BALB/c mice. Microb. Pathog. 17, 23-36 (1994).
45. Massung, R. F. et al. Analysis of the complete genome of smallpox variola major virus strain Bangladesh-1975. Virology 201, 215-240 (1994).
46. Whitehead, S. S. and Hruby, D. E. A transcriptionally-controlled trans-processing assay: Identification of a vaccinia virus-encoded proteinase which cleaves precursor protein P25K. J. Virol. 68, 7603-7608 (1994).
47. Hardy, W. R. and Strauss, J. H. Processing the nonstructural proteins of Sindbis virus: nonstructural proteinase is in the C-terminal half of nsP2 and functions both in cis and in trans. J. Virol. 63, 4653-4664 (1989).
48. Strauss, J. H. and Strauss, E. G. The alphaviruses: Gene expression, replication, and evolution. Microbiol. Rev. 58, 491-562 (1994).
49. Basak, A., Zhong, M., Munzer, J. S., Chretien, M. and Seidah, N. G. Implication of the proprotein convertases furin, PC5 and PC7 in the cleavage of surface glycoproteins of Hong Kong, Ebola and respiratory syncytial viruses: a comparative analysis with fluorogenic peptides. Biochem J. 353, 537-545 (2001).
50. Dixon, T. C., Meselson, M., Guillemin, J. and Hanna, P. C. Anthrax. New Engl. J. Med. 341, 815-826 (1999).
51. Duesbery, N. S. and Vande Woude, G. F. Anthrax toxins. Cell. Mol. Life Sci. 55, 1599-1609 (1999).
52. Ivins, B. E., Ristroph, J. D. and Nelson, G. O. Influence of body weight on response of Fischer 344 rats to anthrax lethal toxin. Appl. Environ. Microbiol. 55, 2098-2100 (1989).
53. Pezard, C., Berche, P. and Mock, M. Contribution of individual toxin components to virulence of Bacillus anthracis. Infect. Immun. 59, 3472-3477 (1991).
54. Hanna, P. C., Acosta, D. and Collier, R. J. On the role of macrophages in anthrax. Proc. Natl. Acad. Sci. USA 90, 10198-10201 (1993).
55. Friedlander, A. M. Macrophages are sensitive to anthrax lethal toxin through an acid-dependent process. J. Biol. Chem. 261, 7123-7126 (1986).
56. Hanna, P. C., Kruskal, B. A., Ezekowitz, R. A. B., Bloom, B. R. and Collier, R. J. Role of macrophage oxidative burst in the action of anthrax lethal toxin. Mol. Medicine 1, 7-18 (1994).
57. Klimpel, K. R., Molloy, S. S., Thomas, G. and Leppla, S. H. Anthrax toxin protective antigen is activated by a cell surface protease with the sequence specificity and catalytic properties of furin. Proc. Natl. Acad. Sci. USA 89, 10277-10281 (1992).
58. Milne, J. C., Furlong, D., Hanna, P. C., Wall, J. S. and Collier, R. J. Anthrax protective antigen forms oligomers during intoxication of mammalian cells. J. Biol. Chem. 269, 20607-20612 (1994).
59. Petosa, C., Collier, R. J., Klimpel, K. R., Leppla, S. H. and Liddington, R. C. Crystal structure of the anthrax toxin protective antigen. Nature 385, 833-838 (1997).
60. Hammond, S. E. and Hanna, P. C. Lethal factor active-site mutations affect catalytic activity in vitro. Infect. Immun. 66, 2374-2378 (1998).
61. Duesbery, N. S. et al. Proteolytic inactivation of MAP-kinase-kinase by anthrax lethal factor. Science 280, 734-737 (1998).
62. Vitale, G. et al. Anthrax lethal factor cleaves the N-terminus of MAPKKs and induces tyrosine/threonine phosphorylation of MAPKs in cultured macrophages. Biochem. Biophys. Res. Commun. 248, 706-711 (1998).
63. Klimpel, K. R., Arora, N. and Leppla, S. H. Anthrax toxin lethal factor contains a zinc metalloprotease consensus sequence which is required for lethal toxin activity. Mol. Microbiol. 13, 1093-1100 (1994).
64. Pellizzari, R., Guidi-Rontani, C., Vitale, G., Mock, M., and Montecucco, C. Anthrax lethal factor cleaves MKK3 in macrophages and inhibits the LPS/IFNγ-induced release of NO and TNFα. FEBS Lett. 462, 199-204 (1999).
65. Vitale, G., Bernardi, L., Napolitani, G., Mock, M. and Montecucco, C. Susceptibility of mitogen-activated protein kinase kinase family members to proteolysis by anthrax lethal factor. Biochem. J. 352, 739-745 (2000).
66. Lewis, T. S., Shapiro, P. S. and Ahn, N. G. Signal transduction through MAP kinase cascades. Adv. Cancer Res. 74, 49-139 (1998).
67. Sellman, B. R., Mourez, M. and Collier, R. J. Dominant-negative mutants of a toxin subunit: an approach to therapy of anthrax. Science 292, 695-697 (2001).
68. Knight, C. G. Fluorometric assays of proteolytic enzymes. Meth. Enzymol. 248, 18-34 (1995).
69. Nagase, H., Fields, C. G. and Fields, G. B. Design and characterization of a fluorogenic substrate selectively hydrolyzed by stromelysin 1 (matrix metalloproteinase-3). J. Biol. Chem. 269, 20952-20957 (1994).
70. Heim, R. and Tsien, R. Y. Engineering green fluorescent protein for improved brightness, longer wavelengths and fluorescence resonance energy transfer. Curr. Biol. 6, 178-182 (1996).
71. Miyawaki, A. et al. Fluorescent indicators for Ca2+ based on green fluorescent proteins and calmodulin. Nature 388, 882-887 (1997).
72. Mahajan, N. P., Harrison-Shostak, D. C., Michaux, J. and Herman, B. Novel mutant green fluorescent protein protease substrates reveal the activation of specific caspases during apoptosis. Chem. Biol. 6, 401-409 (1999).
73. Vanderklish, P. W. et al. Marking synaptic activity in dendritic spines with a calpain substrate exhibiting fluorescence resonance energy transfer. Proc. Natl. Acad. Sci. USA 97, 2253-2258 (2000).
74. Ryan, M. D. and Drew, J. Foot-and-mouth disease virus 2A oligopeptide mediated cleavage of an artificial polyprotein. EMBO J. 13, 928-933.
75. Barrett, A. J. and Salvesen, G., eds. Proteinase Inhibitors, Elsevier (Amsterdam) 55-298 (1986).
76. Hajduk, P. J., Meadows, R. P. and Fesik, S. W. Discovering high-affinity ligands for proteins. Science 278, 497-499 (1997).
77. Mellor, S. L., McGuire, C. and Chan, W. C. N-Fmoc-aminooxy-2-chlorotrityl polystyrene resin: A facile solid-phase methodology for the synthesis of hydroxamic acids. Tetrahedron Lett. 38, 3311-3314 (1997).
78. Levy, D. E. et al. Matrix metalloproteinase inhibitors: A structure-activity study. J. Med. Chem. 41, 199-223 (1998).
79. Ovens, A., Joule, J. A. and Kadler, K. E. Design and synthesis of acidic dipeptide hydroxamate inhibitors of procollagen C-proteinase. J. Pept. Sci. 6, 489-495 (2000).
80. Holmquist, B. and Vallee, B. L. Metal-coordinating substrate analogs as inhibitors of metalloenzymes. Proc. Natl. Acad. Sci. USA 76, 6216-6220 (1979).
81. Liu, R. and Lam, K. S. Automatic Edman microsequencing of peptides containing multiple unnatural amino acids. Anal. Biochem. 295, 9-16 (2001).
82. Menard, A., Papini, E., Mock, M., & Montecucco, C. The cytotoxic activity of Bacillus anthracis lethal factor is inhibited by leukotriene A4 hydrolase and metallopeptidase inhibitors. Biochem J. 320, 687-691 (1996).
83. Nishikawa, K. et al. A peptide library approach identifies a specific inhibitor for the ZAP-70 protein-Tyr kinase. Mol. Cell 6, 969-974 (2000).
84. Guha, M. and Mackman, N. LPS induction of gene expression in human monocytes. Cell. Signal. 13, 85-94 (2001).
85. Lewis, T. S. et al. Identification of novel MAP kinase pathway signaling targets by functional proteomics and mass spectrometry. Mol. Cell 6, 1343-1354 (2000).
86. Lustig, K. D. et al. Small pool expression screening: Identification of genes involved in cell cycle control, apoptosis, and early development. Meth. Enzymol. 283, 83-99 (1997).
87. Cryns, V. L. et al. Specific proteolysis of the kinase protein kinase C-related kinase 2 by caspase-3 during apoptosis: Identification by a novel, small pool expression cloning strategy. J. Biol. Chem. 272, 29449-29453 (1997).
Equivalents
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims. All references disclosed herein are incorporated by reference in their entirety.
What is claimed is:

Claims

1. A method for determining an amino acid sequence motif for a cleavage site of a protease, comprising: a) contacting the protease with a peptide library containing one or more degenerate residues under conditions which allow for cleavage of a substrate by the protease; b) allowing the protease to cleave peptides within the degenerate peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; c) determining the amino acid sequences of the population of cleaved carboxy-terminal peptides; and d) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
2. The method of claim 1, further comprising isolating the population of cleaved carboxy-terminal peptides from the non-cleaved peptides and cleaved amino-terminal peptides.
3. The method of claim 1, wherein the degenerate peptide library is a soluble synthetic peptide library.
4. The method of claim 1, wherein the peptide library contains all degenerate amino acid residues.
5. The method of claim 1, wherein the peptides of the degenerate peptide library are blocked at their N-termini to prevent Edman degradation.
6. The method of claim 1, wherein the peptides of the degenerate peptide library are labeled at their N-termini or C-termini with a binding molecule.
7. The method of claim 6, wherein the binding molecule is biotin.
8. The method of claim 1, wherein the peptides of the degenerate peptide library are labeled at their N-termini with a first binding molecule and are labeled at their C-termini with a second binding molecule.
9. The method of claim 8, wherein the cleaved carboxy-terminal peptides are isolated from the non-cleaved peptides and cleaved amino-terminal peptides by contacting the population of cleaved peptides with a substrate that binds the first binding molecule.
10. The method of claim 1, further comprising a) obtaining a second peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues carboxy-terminal to a scissile peptide bond, and one or more degenerate residues amino-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined in claim 1; b) contacting the protease with the second peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the second peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved amino-terminal peptides from non-cleaved peptides and cleaved carboxy-terminal peptides; e) determining the amino acid sequences of the population of cleaved amino-terminal peptides; and f) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved amino-terminal peptides.
11. The method of claim 10, wherein the second peptide library is a soluble synthetic peptide library.
12. The method of claim 10, wherein the amino termini of the peptides in the second peptide library are unblocked.
13. The method of claim 10, wherein the amino termini of the peptides in the second peptide library are blocked and the step of determining the amino acid sequences comprises unblocking the amino termini prior to sequencing the peptides.
14. The method of claim 10, wherein the step of separating cleaved amino-terminal peptides and cleaved carboxy-terminal peptides comprises affinity isolation of the uncleaved peptides and the cleaved carboxy-terminal peptides from the cleaved amino-terminal peptides.
15. The method of claim 1, wherein the degenerate peptide library comprises peptides comprising the formula:
(Xaa)n (SEQ ID NO: 104)
wherein Xaa is any amino acid and n is an integer from 3-20 inclusive.
16. The method of claim 1, wherein the protease cleaves a peptide before or after a known amino acid Zaa and the degenerate peptide library comprises peptides comprising the formula:
(Xaa)n-Zaa-(Xaa)m (SEQ ID NO: 105)
wherein Zaa is a non-degenerate amino acid (P1 or P1') that forms part of the scissile bond, Xaa is any amino acid and n and m are integers from 1-10 inclusive.
17. The method of claim 1, wherein the degenerate peptide library comprises peptides comprising the formula:
(Zaa)n-(Xaa)m (SEQ ID NO: 106)
wherein Zaa is a non-degenerate amino acid amino-terminal to a scissile bond, Xaa is any amino acid and n and m are integers from 1-10 inclusive.
18. The method of claim 10, wherein the second peptide library comprises peptides comprising the formula:
(Xaa)n-(Zaa)m (SEQ ID NO: 107)
wherein Zaa is an amino acid carboxy-terminal to a scissile bond (primed amino acid), Xaa is an amino acid amino-terminal to the scissile bond (unprimed amino acid), and n and m are integers from 1-10 inclusive, and wherein each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved C-terminal peptides.
19. The method of claim 10, further comprising a) preparing a third peptide library, wherein the library is an oriented degenerate peptide library comprising one or more nondegenerate residues amino-terminal to a scissile peptide bond, and one or more degenerate residues carboxy-terminal to the scissile peptide bond, wherein the sequence of the nondegenerate residues is based on the amino acid sequence motif determined in claim 10; b) contacting the protease with the third peptide library under conditions which allow for cleavage of a substrate by the protease; c) allowing the protease to cleave peptides within the third peptide library having a cleavage site for the protease to form a population of cleaved peptides comprising amino-terminal peptides and carboxy-terminal peptides; d) isolating the population of cleaved carboxy-terminal peptides from non-cleaved peptides and cleaved amino-terminal peptides; e) determining the amino acid sequences of the population of cleaved carboxy-terminal peptides; and f) determining an amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved carboxy-terminal peptides.
20. The method of claim 19, wherein the third peptide library comprises peptides comprising the formula:
(Zaa)n-(Xaa)m (SEQ ID NO: 108)
wherein Xaa is any amino acid and is an amino acid carboxy-terminal to a scissile bond (primed amino acid), Zaa is an amino acid that is amino-terminal to the scissile bond (unprimed amino acid), and n and m are integers from 1-10 inclusive, and wherein each Zaa amino acid corresponds to the amino acid sequence motif for a cleavage site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the population of cleaved amino-terminal peptides.
21. The method of any of claims 1, 10 or 19, wherein peptides within the peptide library do not contain Cys.
22. The method of any of claims 1, 10 or 19, wherein the protease is a matrix metalloproteinase.
23. The method of any of claims 1, 10 or 19, wherein the protease is a proteolytic enzyme that mediates the pathogenesis of a pathogen.
24. The method of claim 23, wherein the pathogen is a biological warfare agent.
25. The method of claim 24, wherein the protease is selected from the group consisting of lethal factor of B. anthracis, Pla and YopJ proteases of Yersinia, and the smallpox HIL metalloprotease.
26. The method of claim 25, wherein the protease is lethal factor of B. anthracis.
27. The method of any of claims 1, 10 or 19, wherein the protease is selected from the group consisting of proteases of pathogenic organisms, cathepsin family proteases, tumor necrosis factor-alpha converting enzyme (TACE), calpains, caspases, beta-site amyloid precursor protein-cleaving enzyme (BACE; beta-secretase), presenilins, membrane-type serine proteases, furin and other proprotein convertases, proteasome components, and proteases affecting the blood clotting cascade.
28. The method of any of claims 1, 10 or 19, wherein the amino acid sequence motif for a cleavage site of the protease is determined by calculating a preference value for each amino acid at each degenerate position, wherein the preference value for a particular amino acid is determined by dividing the amount of the particular amino acid by the average amount per amino acid in that cycle to obtain a first value for the particular amino acid, and then dividing each first value by the relative amount of that particular amino acid in the starting mixture, and selecting amino acid residues that have a preference value of greater than 1.0 at a degenerate position for inclusion at a position corresponding to the degenerate position in the amino acid sequence motif.
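The preference-value calculation recited in claim 28 can be illustrated with a minimal sketch. The function names, the dictionary layout and the sample amounts below are hypothetical and are not taken from the specification; the sketch also assumes that the "relative amount in the starting mixture" is expressed so that an amino acid present at the average level has a relative amount of 1.0.

```python
def preference_values(cycle_amounts, starting_relative):
    """Compute a preference value per amino acid for one sequencing cycle.

    cycle_amounts: amount of each amino acid detected in this cycle
        (hypothetical units, e.g. pmol).
    starting_relative: relative amount of each amino acid in the starting
        mixture (assumed normalized so that the average is 1.0).
    """
    # Average amount per amino acid in this cycle.
    mean_amount = sum(cycle_amounts.values()) / len(cycle_amounts)
    prefs = {}
    for aa, amount in cycle_amounts.items():
        # First value: amount relative to the cycle average.
        first_value = amount / mean_amount
        # Preference value: corrected for abundance in the starting mixture.
        prefs[aa] = first_value / starting_relative[aa]
    return prefs


def select_residues(prefs, threshold=1.0):
    """Return residues whose preference value exceeds the threshold."""
    return sorted(aa for aa, value in prefs.items() if value > threshold)


# Illustrative (made-up) data for a single degenerate position.
example_cycle = {"L": 30.0, "A": 10.0, "G": 10.0, "P": 10.0}
example_start = {"L": 1.0, "A": 0.5, "G": 1.0, "P": 1.5}
example_prefs = preference_values(example_cycle, example_start)
example_selected = select_residues(example_prefs)
```

In this sketch a residue that is under-represented in the starting mixture (here "A") can exceed the 1.0 threshold even though its raw amount is below the cycle average, which is the purpose of the second division.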
29. A protease inhibitor or substrate comprising a sequence determined according to the method of any of claims 1, 10, or 19.
30. An inhibitor of matrix metalloproteinase protease activity comprising a noncleavable peptide molecule comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6, or a fragment thereof that inhibits matrix metalloproteinase protease activity.
31. The inhibitor of claim 30, wherein the inhibitor is a peptide or peptide analog consisting of 5-25 amino acids.
32. The inhibitor of any of claims 30 or 31, further comprising a group that chelates the active site metal ion incorporated at either the amino-terminus or the carboxy-terminus.
33. The inhibitor of claim 32, wherein the group that chelates the active site metal ion is selected from the group consisting of thioacetyl groups, carboxylate groups, phosphonate groups, phosphoramidate groups and hydroxamic acids.
34. An inhibitor of matrix metalloproteinase protease activity that competes for binding to matrix metalloproteinase with the inhibitor of any of claims 30-33.
35. A composition comprising the inhibitor of any of claims 30-34 and a pharmaceutically acceptable carrier.
36. An inhibitor of Bacillus anthracis lethal factor protease activity comprising a noncleavable peptide molecule comprising SEQ ID NO:69, or a fragment thereof that inhibits lethal factor protease activity.
37. The inhibitor of claim 36, wherein the amino acid sequence comprises SEQ ID NO:70.
38. The inhibitor of claim 36, wherein the inhibitor is a peptide or peptide analog consisting of 5-25 amino acids.
39. The inhibitor of any of claims 36-38, further comprising a group that chelates the active site metal ion incorporated at either the amino-terminus or the carboxy-terminus.
40. The inhibitor of claim 39, wherein the group that chelates the active site metal ion is selected from the group consisting of thioacetyl groups, carboxylate groups, phosphonate groups, phosphoramidate groups and hydroxamic acids.
41. An inhibitor of Bacillus anthracis lethal factor protease activity consisting essentially of a compound selected from the group consisting of 2-thioacetyl-Tyr-Pro-Met-amide, α-acetyl-Lys-Val-Tyr-Pro-hydroxamic acid (SEQ ID NO:72), α-acetyl-Lys-Val-Tyr-βAla-hydroxamic acid (SEQ ID NO:73) and α-acetyl-Lys-Pro-Thr-Pro-hydroxamic acid (SEQ ID NO:74).
42. An inhibitor of Bacillus anthracis lethal factor protease activity comprising SEQ ID NO:76, or a fragment thereof that inhibits lethal factor proteolytic activity.
43. An inhibitor of Bacillus anthracis lethal factor protease activity that competes for binding to lethal factor with the inhibitor of any of claims 36-42.
44. A composition comprising the inhibitor of any of claims 36-43 and a pharmaceutically acceptable carrier.
45. A method for determining an amino acid sequence motif for a binding site of a protease, comprising: a) contacting the protease with an oriented peptide library containing one or more degenerate residues under conditions which allow for binding of a substrate by the protease; b) allowing the protease to bind peptides within the degenerate peptide library having a binding site for the protease to form protease-peptide complexes; c) isolating the protease-peptide complexes from the unbound peptides; d) releasing the peptides from the protease-peptide complexes; e) isolating the peptides previously bound to the protease; f) determining the amino acid sequences of the peptides; and g) determining an amino acid sequence motif for a binding site of the protease based upon the relative abundance of different amino acid residues at each degenerate position within the peptides.
46. The method of claim 45, wherein the peptides in the oriented peptide library comprise a carboxy-terminal hydroxamic acid group.
47. The method of claim 45, wherein the peptides comprise the amino acid sequence MAXXXXXX-hydroxamate (SEQ ID NO:77).
48. The method of claim 45, wherein the peptide library is contacted with the protease by application of the library to a substrate to which the protease is immobilized.
49. The method of claim 45, wherein the protease-peptide complexes are isolated by washing the protease-peptide complexes in a buffer that permits binding.
50. The method of claim 45, wherein the peptides are eluted from the protease-peptide complexes by incubating the protease-peptide complexes with an elution solution.
51. The method of claim 50, wherein the elution solution comprises either low pH or a metal chelator.
52. A protease binding molecule comprising a sequence determined according to the method of claim 45.
53. An intramolecularly-quenched fluorogenic peptide protease substrate comprising a lethal factor protease cleavage motif sequence flanked by a fluorescent group and a fluorescence quenching moiety.
54. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 53, wherein the fluorescent group is attached to the lethal factor protease cleavage motif sequence at the amino terminus and the quenching moiety is attached to the peptide at the carboxy terminus.
55. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 54, wherein the amino terminal fluorescent group is a methoxycoumarinacetyl (Mca) group.
56. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 54, wherein the carboxy-terminal quenching moiety is a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety.
57. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 54, wherein the amino terminal fluorescent group is a methoxycoumarinacetyl (Mca) group and the carboxy-terminal quenching moiety is a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety.
58. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 57, wherein the lethal factor protease cleavage motif sequence is SEQ ID NO:69.
59. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 58, wherein the lethal factor protease cleavage motif sequence is SEQ ID NO:70 [Mca-KKVYPYPME-Dap(Dnp)].
60. An intramolecularly-quenched fluorogenic peptide protease substrate comprising a matrix metalloprotease cleavage motif sequence flanked by a fluorescent group and a fluorescence quenching moiety.
61. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 60, wherein the fluorescent group is attached to the matrix metalloprotease cleavage motif sequence at the amino terminus and the quenching moiety is attached to the peptide at the carboxy terminus.
62. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 61, wherein the amino terminal fluorescent group is a methoxycoumarinacetyl (Mca) group.
63. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 61, wherein the carboxy-terminal quenching moiety is a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety.
64. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 61, wherein the amino terminal fluorescent group is a methoxycoumarinacetyl (Mca) group and the carboxy-terminal quenching moiety is a dinitrophenyl-diaminopropionic acid Dap(Dnp) moiety.
65. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 64, wherein the matrix metalloprotease cleavage motif sequence comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
66. An intramolecularly-quenched fluorogenic protease substrate comprising a lethal factor protease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra.
67. The intramolecularly-quenched fluorogenic protease substrate of claim 66, wherein the fluorescent proteins are cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP).
68. The intramolecularly-quenched fluorogenic protease substrate of claim 66, wherein the fluorescent proteins are green fluorescent protein (GFP) and red fluorescent protein (RFP).
69. The intramolecularly-quenched fluorogenic protease substrate of claim 66, wherein the lethal factor protease cleavage motif sequence is SEQ ID NO:69.
70. The intramolecularly-quenched fluorogenic peptide protease substrate of claim 69, wherein the lethal factor protease cleavage motif sequence is SEQ ID NO:70.
71. An intramolecularly-quenched fluorogenic protease substrate comprising a matrix metalloprotease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra.
72. The intramolecularly-quenched fluorogenic protease substrate of claim 71, wherein the fluorescent proteins are cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP).
73. The intramolecularly-quenched fluorogenic protease substrate of claim 71, wherein the fluorescent proteins are green fluorescent protein (GFP) and red fluorescent protein (RFP).
74. The intramolecularly-quenched fluorogenic protease substrate of claim 71, wherein the matrix metalloprotease cleavage motif sequence comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
75. A method for identifying protease inhibitors, comprising providing a protease and a cleavable protease substrate, wherein the uncleaved substrate is distinguishable from the cleaved substrate, wherein the cleavable protease substrate comprises a sequence determined according to any of claims 1, 10 or 19, contacting the protease with a candidate protease inhibitor compound and the cleavable substrate under conditions that permit cleavage of the substrate, and detecting the amounts of cleaved and uncleaved substrate as a measure of the presence of a protease inhibitor, wherein detection of a lesser amount of cleaved substrate than is present when the protease is not contacted with the candidate protease inhibitor compound indicates that the candidate protease inhibitor compound is a protease inhibitor.
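The comparison recited in claim 75 (less cleaved substrate in the presence of the candidate compound than in its absence) can be sketched as follows. The function names and the idea of expressing the result as a percent inhibition are illustrative assumptions, not part of the claim; the inputs stand for any readout proportional to cleaved substrate, such as a fluorescence increase.

```python
def percent_inhibition(cleaved_without, cleaved_with):
    """Percent reduction in cleaved substrate caused by the candidate.

    cleaved_without: cleaved substrate measured with protease alone.
    cleaved_with: cleaved substrate measured with protease plus candidate.
    Both are hypothetical readouts in the same arbitrary units.
    """
    if cleaved_without <= 0:
        raise ValueError("control reaction must show measurable cleavage")
    return 100.0 * (1.0 - cleaved_with / cleaved_without)


def is_protease_inhibitor(cleaved_without, cleaved_with):
    """Apply the criterion of the claim: less cleavage with the candidate."""
    return cleaved_with < cleaved_without


# Illustrative (made-up) readouts in arbitrary fluorescence units.
control_signal = 200.0
candidate_signal = 50.0
inhibition = percent_inhibition(control_signal, candidate_signal)
flagged = is_protease_inhibitor(control_signal, candidate_signal)
```

A practical screen would normally also require the difference to exceed assay noise; that threshold is not specified in the claim and is left out of the sketch.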
76. The method of claim 75, wherein the cleavable protease substrate is an intramolecularly-quenched fluorogenic peptide protease substrate comprising a protease cleavage motif sequence flanked by a fluorescent group and a fluorescence quenching moiety.
77. The method of claim 75, wherein the cleavable protease substrate is an intramolecularly-quenched fluorogenic protease substrate comprising a protease cleavage motif sequence flanked by fluorescent proteins that have overlapping emission spectra.
78. The method of claim 76 or claim 77, wherein the protease cleavage motif is a lethal factor protease cleavage motif sequence comprising SEQ ID NO:69.
79. The method of claim 78, wherein the lethal factor protease cleavage motif sequence comprises SEQ ID NO:70.
80. The method of claim 76 or claim 77, wherein the protease cleavage motif is a matrix metalloprotease cleavage motif sequence comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
81. The method of claim 75, wherein the candidate protease inhibitor compound is a small organic molecule.
82. A protease inhibitor identified according to any one of claims 75-81.
83. Use of the protease inhibitor of claim 82 in the preparation of a medicament.
84. A method for identifying protease inhibitors, comprising providing a protease, a protease inhibitor that binds the protease, and a candidate protease inhibitor compound, contacting the protease with the candidate protease inhibitor compound and the protease inhibitor under conditions that permit binding of the protease inhibitor to the protease, wherein either or both of the candidate protease inhibitor compound and the protease inhibitor are detectable, and wherein either or both of the candidate protease inhibitor compound and the protease inhibitor comprises a sequence determined according to any of claims 1, 10, 19, or 45, separating the protease from the unbound protease inhibitor and unbound candidate protease inhibitor compound, and detecting the amounts of detectable protease inhibitor and/or the detectable candidate protease inhibitor compound bound to the protease as a measure of the presence of a candidate protease inhibitor compound that competes with the protease inhibitor for binding to the protease.
85. The method of claim 84, further comprising testing the activity of the protease in the presence of the candidate protease inhibitor compound, wherein a greater reduction in protease activity in the presence of the candidate protease inhibitor compound than in the absence of the candidate protease inhibitor compound indicates that the candidate protease inhibitor compound is a protease inhibitor.
86. The method of claim 84, wherein the candidate protease inhibitor compound or the protease inhibitor comprises a lethal factor protease cleavage motif sequence comprising SEQ ID NO:69.
87. The method of claim 86, wherein the lethal factor protease cleavage motif sequence comprises SEQ ID NO:70.
88. The method of claim 84, wherein the candidate protease inhibitor compound or the protease inhibitor comprises a matrix metalloprotease cleavage motif sequence comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5 and SEQ ID NO:6.
89. The method of claim 84, wherein the candidate protease inhibitor compound is a small organic molecule.
90. A protease inhibitor identified according to any one of claims 84-89.
91. Use of the protease inhibitor of claim 90 in the preparation of a medicament.
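
As an illustration only, the comparison recited in claim 75 (less cleaved substrate when the candidate compound is present than when it is absent) can be sketched as a small calculation. The function names, the example signal values, and the `min_percent` noise margin below are hypothetical and are not part of the claims; the cleaved-substrate amounts are assumed to be any positive readout proportional to cleavage, such as background-corrected fluorescence.

```python
def percent_inhibition(cleaved_with_candidate, cleaved_without_candidate):
    """Percent reduction in cleaved substrate relative to the no-candidate control.

    Both arguments are amounts of cleaved substrate in the same arbitrary units.
    """
    if cleaved_without_candidate <= 0:
        raise ValueError("control reaction must show measurable cleavage")
    return 100.0 * (1.0 - cleaved_with_candidate / cleaved_without_candidate)


def is_inhibitor(cleaved_with_candidate, cleaved_without_candidate, min_percent=0.0):
    """Apply the claim 75 criterion: less cleaved substrate with the candidate
    than without it. min_percent is a hypothetical margin for assay noise and
    is an assumption of this sketch, not something the claim specifies.
    """
    inhibition = percent_inhibition(cleaved_with_candidate, cleaved_without_candidate)
    return inhibition > min_percent


if __name__ == "__main__":
    # Hypothetical readouts: 1000 units without the candidate, 250 with it.
    print(percent_inhibition(250.0, 1000.0))
    print(is_inhibitor(250.0, 1000.0))
```

In practice a screen would likely set `min_percent` from replicate controls rather than leave it at zero, but that threshold choice lies outside what the claims recite.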
PCT/US2001/046777 2000-11-08 2001-11-08 Methods for determining protease cleavage site motifs WO2002038796A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002230630A AU2002230630A1 (en) 2000-11-08 2001-11-08 Methods for determining protease cleavage site motifs

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US24681500P 2000-11-08 2000-11-08
US60/246,815 2000-11-08

Publications (3)

Publication Number Publication Date
WO2002038796A2 WO2002038796A2 (en) 2002-05-16
WO2002038796A3 WO2002038796A3 (en) 2004-02-26
WO2002038796A9 WO2002038796A9 (en) 2004-04-29

Family

ID=22932332

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/046777 WO2002038796A2 (en) 2000-11-08 2001-11-08 Methods for determining protease cleavage site motifs

Country Status (2)

Country Link
AU (1) AU2002230630A1 (en)
WO (1) WO2002038796A2 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005538946A (en) * 2002-05-10 2005-12-22 ファルマシア・コーポレーション Peptide compounds and their use as protease substrates
WO2004012737A1 (en) * 2002-07-29 2004-02-12 Novartis Ag Use of arylsulfonamido-substituted hydroxamic acid matrix metalloproteinase inhibitors for the treatment or prevention of toxemia
US9193791B2 (en) 2010-08-03 2015-11-24 City Of Hope Development of masked therapeutic antibodies to limit off-target effects
AU2014324884B2 (en) * 2013-09-25 2020-03-26 Cytomx Therapeutics, Inc Matrix metalloproteinase substrates and other cleavable moieties and methods of use thereof
DK3628328T3 (en) 2014-01-31 2022-12-05 Cytomx Therapeutics Inc MATRIPTASE AND U PLASMINOGEN ACTIVATOR SUBSTRATES AND OTHER CLEAVABLE PARTS AND METHODS OF USING THEREOF
MA41374A (en) 2015-01-20 2017-11-28 Cytomx Therapeutics Inc MATRIX METALLOPROTEASE CLIVABLE AND SERINE PROTEASE CLIVABLE SUBSTRATES AND METHODS OF USE THEREOF
CN107305172B (en) * 2016-04-25 2019-10-11 中国科学院大连化学物理研究所 A kind of protein N-terminal enrichment method based on hydrophobic grouping modification
US20190360019A1 (en) * 2016-11-16 2019-11-28 Universal Biosensors Pty Ltd Cleavage event transduction methods and products

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU775164B2 (en) * 1999-04-09 2004-07-22 Trustees Of Tufts College Methods and reagents for determining enzyme substrate specificity, and uses related thereto

Also Published As

Publication number Publication date
WO2002038796A2 (en) 2002-05-16
WO2002038796A3 (en) 2004-02-26
AU2002230630A1 (en) 2002-05-21

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
COP Corrected version of pamphlet

Free format text: PAGES 1/4-4/4, DRAWINGS, REPLACED BY NEW PAGES 1/4-4/4; AFTER RECTIFICATION OF OBVIOUS ERRORS AUTHORIZED BY THE INTERNATIONAL SEARCH AUTHORITY

NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP