AU4239600A - Zymogen activation system

Info

Publication number: AU4239600A
Application number: AU42396/00A
Authority: AU
Inventors: Patricia Andrade-Gordon; Andrew Darrow; Jensen Qi
Original assignee: Ortho Mcneil Pharmaceutical Res Inc; Ortho Mcneil Pharmaceutical Research Inc
Current assignee: Ortho Mcneil Pharmaceutical Research Inc
Priority date: 1999-04-30
Filing date: 2000-04-13
Publication date: 2000-11-17
Also published as: EP1173590A2; WO2000066709A2; AR023818A1; WO2000066709A9; JP2003531567A; WO2000066709A3; CA2372907A1; EP1173590A4

Description

WO 00/66709 PCT/US00/09973

TITLE OF THE INVENTION

ZYMOGEN ACTIVATION SYSTEM

BACKGROUND OF THE INVENTION

Members of the trypsin/chymotrypsin-like (S1) serine protease family may play pivotal roles in a multitude of diverse physiological processes, including digestive processes and regulatory amplification cascades, through the proteolytic activation of inactive zymogen precursors. In many instances, protease substrates within these cascades are themselves the inactive form, or zymogen, of a "downstream" serine protease. Well-known examples of serine protease-mediated regulation include blood coagulation (Davie, et al. (1991). Biochemistry 30:10363-70), kinin formation (Proud and Kaplan (1988). Ann Rev Immunol 6:49-83) and the complement system (Reid and Porter (1981). Ann Rev Biochemistry 50:433-464). Although these proteolytic pathways have been known for some time, it is likely that the discovery of novel serine protease genes and their products will enhance our understanding of regulation within these existing cascades, and lead to the elucidation of entirely novel protease networks.

Maturation of a serine protease zymogen into an active form by proteolytic cleavage results in transformation into a protease of enhanced catalytic efficiency; the degree of this enhancement is termed zymogenicity (Tachias and Madison (1996). J Biol Chem 271:28749-28752). Zymogenicity varies widely among individual members of the serine protease family. Proteolytic cleavage of the conserved amino-terminal zymogen activation sequence results in an aliphatic amino acid, most frequently isoleucine (Ile-16, chymotrypsin numbering), becoming the new amino-terminal residue, whose amino group is protonated and thus positively charged.
The event that accompanies zymogen activation is the creation of a rigid substrate specificity pocket generated by a salt bridge between the aliphatic amino acid and a highly conserved aspartic acid residue (Asp-194, chymotrypsin numbering) one amino acid upstream from the active-site serine (Ser-195, chymotrypsin numbering) within the catalytic domain (Huber and Bode (1978). Acc Chem Res 11:114-22).

A major drawback in the expression of full-length serine protease cDNAs for biochemical and enzymological analyses has been the overwhelming potential for the production of inactive zymogen. These zymogen precursors often have little or no proteolytic activity and thus must be activated by one of the two methods currently available. One method relies on autoactivation (Little, et al. (1997). J Biol Chem 272:25135-25142), which may occur in homogeneous purified protease preparations, often requires high protein concentrations (J.Q., unpublished observations), and must be rigorously evaluated by the investigator. The second method uses a surrogate protease, such as trypsin, to cleave the desired serine protease. The surrogate protease must then be either inactivated (Takayama, et al. (1997). J Biol Chem 272:21582-21588) or physically removed from the desired activated protease (Hansson, et al. (1994). J Biol Chem 269:19420-6). In both methods, the exact conditions must be established empirically and the activating reactions monitored carefully, since inadequate activation or over-digestion would result in a heterogeneous population of active protease and inactive zymogen protein.

Some investigators studying particular members of the S1 serine protease family have exploited restriction proteinases for the activation of zymogens expressed in either bacteria (Wang, et al. (1995). Biol Chem 376:681-4) or mammalian cells (Yamashiro, et al. (1997). Biochim Biophys Acta 1350:11-14).
This method of generically activating recombinant zymogens clearly has the added value of permitting the systematic expression and activation of several serine proteases.

SUMMARY OF THE INVENTION

The present invention provides a series of DNA vectors allowing for the systematic expression of heterologous inactive zymogen proteases, which can subsequently be purified and proteolytically processed to generate the active enzyme product. The present invention provides a system that is amenable to parallel expression and activation, permitting rapid analysis of several S1 protease family members. The serine protease cDNA of interest expressed in the vector can be used for any one of a number of applications. Generating antisera, or producing enough purified protease for crystal growth to determine the three-dimensional crystallographic structure, are only two examples of such uses. The protein products of serine protease cDNAs generated within this particular zymogen activation system can be proteolytically activated, whereby the purified recombinant protein will become activated to an extent similar to its mature activated gene product counterpart purified from native or endogenous sources. Thus, the catalytic activity and substrate specificity of the expressed protease cDNA can be evaluated.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 - Shown schematically is the zymogen activation vector, which features a series of interchangeable modules represented by segments of different pattern and summarized in the Table. The arrowhead over the pro sequence indicates that sequences within this region can be cleaved with a restriction protease. The letters HDS represent the amino acids of the catalytic triad in the serine protease catalytic domain cassette.
Listed below are the various sequence modules we have employed for the secretory pre sequences, the zymogen activation pro sequences and the various C-terminal affinity/epitope tagging combinations we have designed and successfully used. These constructs can be generally used to express different serine proteases by the in-frame insertion of a particular cDNA fragment encoding only the conserved catalytic domain. The generic activation is achieved through the digestion of the purified zymogen using the appropriate restriction protease, enterokinase (EK) or factor Xa (FXa).

Figure 2 - The sequences of various activation constructs (SEQ.ID.NO.:1 through SEQ.ID.NO.:6) are presented. For each, the double-stranded nucleotide sequence is shown, below which segments are translated to reveal the pertinent amino acid sequence encoded by each respective module. The relevant restriction endonuclease sites are also included, along with the sequences derived from the SV40 Late polyadenylation sequences.

SEQ.ID.NO.:1 Construct: PFEK2-Stop
SEQ.ID.NO.:2 Construct: TEK3-1XHA-TAG
SEQ.ID.NO.:3 Construct: PFFXa-3XHA-TAG
SEQ.ID.NO.:4 Construct: PFEK1-6XHIS-TAG
SEQ.ID.NO.:5 Construct: CFEK2-6XHIS-TAG
SEQ.ID.NO.:6 Construct: CFEK2-HA6XHIS-TAG

Figure 3 - The sequence of the catalytic domain from the protease prostasin, inserted into the PFEK2-6XHIS-TAG activation construct (SEQ.ID.NO.:7).

Figure 4 - The sequence of the catalytic domain from the protease prostasin, inserted into the CFEK2-6XHIS-TAG activation construct (SEQ.ID.NO.:8).

Figure 5 - The sequence of the catalytic domain from the protease neuropsin, inserted into the PFEK1-6XHIS-TAG activation construct (SEQ.ID.NO.:9).

Figure 6 - The sequence of the catalytic domain from the protease O, inserted into the PFEK1-6XHIS-TAG activation construct (SEQ.ID.NO.:10).
Figure 7 - Polyacrylamide gel and Western blot analyses of the recombinant protease PFEK2-prostasin-6XHIS expressed, purified and activated from the activation construct of SEQ.ID.NO.:7 (Figure 3). Shown is the polyacrylamide gel containing samples of the serine protease PFEK2-prostasin-6XHIS stained with Coomassie Brilliant Blue (A). The relative molecular masses are indicated by the positions of protein standards (M). In the indicated lanes, the purified zymogen was either untreated (-) or digested with EK (+), which was used to cleave and activate the zymogen into its active form. A Western blot of the gel in A, probed with the anti-FLAG MoAb M2, is also shown (B, lanes 1 and 2). This demonstrates the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Since the FLAG epitope is located just upstream of the EK pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lanes. Also shown in panel B, the untreated or EK-digested PFEK2-prostasin-6XHIS was denatured in the absence of DTT, in order to retain disulfide bonds, prior to electrophoresis (lanes 3 and 4). Although equivalent amounts of sample were loaded into each lane of the gel in the Western blot of B, the anti-FLAG MoAb M2 appears to detect proteins better when they are pretreated with DTT (compare lane B1 with B3).

Figure 8 - Polyacrylamide gel and Western blot analyses of the recombinant protease CFEK2-prostasin-6XHIS expressed, purified and activated from the activation construct of SEQ.ID.NO.:8 (Figure 4). Shown is the polyacrylamide gel containing samples of the serine protease CFEK2-prostasin-6XHIS stained with Coomassie Brilliant Blue (A). The relative molecular masses are indicated by the positions of protein standards (M).
In the indicated lanes, the purified zymogen was either untreated (-) or digested with EK (+), which was used to cleave and activate the zymogen into its active form. A Western blot of the gel in A, probed with the anti-FLAG MoAb M2, is also shown (B, lanes 1 and 2). This demonstrates the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Since the FLAG epitope is located just upstream of the EK2 pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lanes. Also shown in panel B, the untreated or EK-digested CFEK2-prostasin-6XHIS was denatured in the absence of DTT, in order to retain disulfide bonds, prior to electrophoresis (lanes 3 and 4). Of significance in lane 4 is the retention of the FLAG epitope, indicating the formation of a disulfide bond between the cysteine in the CF pre sequence and a cysteine in the catalytic domain of prostasin, which is presumably Cys-122 (chymotrypsin numbering). Retention of the FLAG epitope, following EK cleavage and denaturation without DTT, is not observed using the prolactin pre sequence, which lacks a cysteine residue (compare lane 4 of Figure 7 with lane 4 of Figure 8). This documents that the CF pre sequence is capable of forming a light chain that is disulfide bonded to the heavy catalytic chain of the recombinant serine proteases when expressed in this system. It appears that in the absence of the reducing agent DTT, the EK-cleaved polypeptides have a reproducibly decreased mobility in the gel (compare lane B3 with B4).

Figure 9 - Polyacrylamide gel and Western blot analyses of the recombinant protease PFEK1-neuropsin-6XHIS expressed, purified and activated from the activation construct of SEQ.ID.NO.:9 (Figure 5). Shown is the polyacrylamide gel containing samples of the serine protease PFEK1-neuropsin-6XHIS stained with Coomassie Brilliant Blue (A). The relative molecular masses are indicated by the positions of protein standards (M). In the indicated lanes, the purified zymogen was either untreated (-) or digested with EK (+), which was used to cleave and activate the zymogen into its active form. A Western blot of the gel in A, probed with the anti-FLAG MoAb M2, is also shown. This demonstrates the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Since the FLAG epitope is located just upstream of the EK1 pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lane.

Figure 10 - Polyacrylamide gel and Western blot analyses of the recombinant protease PFEK1-protease O-6XHIS expressed, purified and activated from the activation construct of SEQ.ID.NO.:10 (Figure 6). Shown is the polyacrylamide gel containing samples of the novel serine protease PFEK1-protease O-6XHIS stained with Coomassie Brilliant Blue (A). The relative molecular masses are indicated by the positions of protein standards (M). In the indicated lanes, the purified zymogen was either untreated (-) or digested with EK (+), which was used to cleave and activate the zymogen into its active form. A Western blot of the gel in A, probed with the anti-FLAG MoAb M2, is also shown. This demonstrates the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease.
Since the FLAG epitope is located just upstream of the EK pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lane.

DETAILED DESCRIPTION OF THE INVENTION

DEFINITIONS:

The term "protein domain" as used herein refers to a region of a protein that can fold into a stable three-dimensional structure independently of the rest of the protein. This structure may maintain a specific function associated with the domain's function within the protein, including enzymatic activity, creation of a recognition motif for another molecule, or provision of structural components necessary for a protein to exist in a particular environment. Protein domains are usually evolutionarily conserved regions of proteins, both within a protein superfamily and within other protein superfamilies that perform similar functions.

The term "protein superfamily" as used herein refers to proteins whose evolutionary relationship may not be entirely established or may be distant by accepted phylogenetic standards, but which show similar three-dimensional structure or display a unique consensus of critical amino acids.

The term "fusion protein" as used herein refers to novel protein constructs that are the result of combining multiple protein domains or linker regions for the purpose of gaining the combined functions of the domains or linker regions. This is most often accomplished by molecular cloning of the nucleotide sequences to create a new polynucleotide sequence that codes for the desired protein. Alternatively, creation of a fusion protein may be accomplished by chemically joining two proteins together.

The term "linker region" or "linker domain" or similar such descriptive terms as used herein refers to stretches of polynucleotide or polypeptide sequence that are used in the construction of a cloning vector or fusion protein.
Functions of a linker region can include introduction of cloning sites into the nucleotide sequence, introduction of a flexible component or space-creating region between two protein domains, or creation of an affinity tag for specific molecule interaction. A linker region may also be introduced into a fusion protein without a specific purpose, as a compromise that results from choices made during cloning.

The term "pre-sequence" as used herein refers to a nucleotide sequence that encodes a secretion signal amino acid sequence. A wide variety of such secretion signal sequences are known to those skilled in the art, and are suitable for use in the present invention. Examples of suitable pre-sequences include, but are not limited to, prolactinFLAG, trypsinogen, and chymoFLAG.

The term "pro-sequence" as used herein refers to a nucleotide sequence that encodes a cleavage site for a restriction protease. A wide variety of cleavage sites for restriction proteases are known to those skilled in the art, and are suitable for use in the present invention. Examples of suitable pro-sequences include, but are not limited to, EK, FXa, and thrombin.

The term "cloning site" or "polycloning site" as used herein refers to a region of the nucleotide sequence contained within a cloning vector or engineered within a fusion protein that has one or more available restriction endonuclease consensus sequences. The use of a correctly chosen restriction endonuclease makes it possible to isolate a desired nucleotide sequence that codes for an in-frame sequence relative to a start codon and that yields a desirable protein product after transcription and translation. These nucleotide sequences can then be introduced into other cloning vectors, used to create novel fusion proteins, or used to introduce specific site-directed mutations.
It is well known by those in the art that cloning sites can be engineered at a desired location by silent mutations, conserved mutation, or introduction of a linker region that contains desired restriction enzyme consensus sequences. It is also well known by those in the art that the precise location of a cloning site can be flexible so long as the desired function of the protein or fragment thereof being cloned is maintained.

The term "tag" as used herein refers to a nucleotide sequence that encodes an amino acid sequence that facilitates isolation, purification or detection of a fusion protein containing the tag. A wide variety of such tags are known to those skilled in the art, and are suitable for use in the present invention. Suitable tags include, but are not limited to, HA-tag, His-tag, biotin, avidin, and antibody binding sites.

As used herein, "expression vectors" are defined as DNA sequences that are required for the transcription of cloned copies of genes and the translation of their mRNAs in an appropriate host. Such vectors can be used to express eukaryotic genes in a variety of hosts, such as bacteria including E. coli, blue-green algae, plant cells, insect cells, fungal cells including yeast cells, and animal cells.

The term "catalytic domain cassette" as used herein refers to a nucleotide sequence that encodes an amino acid sequence comprising at least the catalytic domain of the serine protease of interest. A wide variety of protease catalytic domains may be inserted into the expression vectors of the present invention, including those presently known to those skilled in the art, as well as those for which an encoding nucleotide sequence has not yet been isolated, once that nucleotide sequence is isolated.
As used herein, a "functional derivative" of the zymogen activation vector is a construct that possesses a biological activity (either functional or structural) that is substantially similar to the properties described herein. The term "functional derivatives" is intended to include the "fragments," "variants," "degenerate variants," "analogs" and "homologues" of the construct presented. The term "fragment" is meant to refer to any nucleic acid or polypeptide subset of the modules described as pre and pro sequences used for the activation of expressed zymogen precursors. The term "variant" is meant to refer to a construct or coding sequence module substantially similar in structure and function to either the entire zymogen activation construct molecule or to a fragment thereof. A construct is "substantially similar" to the zymogen activation construct if the molecules expressed from both have similar structural characteristics or possess similar biological properties, i.e., can be manipulated such that proteolytic cleavage of the expressed recombinant zymogen results in enhanced catalytic activity. Therefore, if the two molecules possess substantially similar activity, they are considered to be variants even if the structure of one of the molecules is not found in the other or even if the two amino acid sequences are not identical. The term "analog" refers to a molecule substantially similar in function to either the entire zymogen activation construct molecule or to a fragment thereof.

The present invention relates to DNA encoding an expression vector system, schematized in Figure 1, which permits post-translational modification, through limited proteolysis, to activate inactive zymogen precursor proteins in a highly controlled and reproducible fashion.
The expressed and processed protein is rendered in an activated form amenable to measurement of its catalytic activity, which often gives a more accurate representation of the mature protease gene product than is available from purified native tissue samples.

Any of a variety of procedures known in the art may be used to molecularly manipulate recombinant DNA to enable study of a particular serine protease using this system. These methods include, but are not limited to, direct functional expression of the serine protease cDNA following its insertion into and subsequent expression from this series of vectors. One method to obtain such a serine protease cDNA molecule is to screen a cDNA library constructed in a bacteriophage or plasmid shuttle vector with a labeled oligonucleotide probe designed from the amino acid sequence, or with a restriction fragment of the partial or related cDNA. This partial cDNA is obtained by the specific polymerase chain reaction (PCR) amplification of the cDNA fragments through the design of matching or degenerate oligonucleotide primers from the sequence of the cDNA or the amino acid sequence of the protein. Expressed sequence tags (ESTs) are also available for this purpose. Alternatively, the full-length cDNA of a published sequence may be obtained by specific PCR amplification through the design of matching oligonucleotide primers flanking the entire coding sequence. Insertion into the zymogen activation construct described herein requires only the isolation, through PCR amplification, of the catalytic domain (catalytic cassette) of the particular serine protease cDNA. The catalytic domain can then be subcloned into the zymogen activation construct in the proper translational register and orientation so as to produce a recombinant fusion protein.
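The requirement that the catalytic cassette be inserted "in the proper translational register" can be illustrated with a minimal sketch. It assumes the cassette is supplied as a plain DNA string whose first base is the first base of a codon; the function name `is_in_frame_cassette` and the example sequences are hypothetical and are not taken from the constructs of this specification.

```python
# Illustrative sketch only: checks whether a candidate catalytic cassette
# would keep downstream modules (for example a C-terminal tag) in frame.
# The function name and test sequences are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def is_in_frame_cassette(cassette: str) -> bool:
    """Return True if the cassette length is a multiple of three and
    it contains no stop codon in its own reading frame."""
    seq = cassette.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # a partial codon would shift the downstream frame
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    print(is_in_frame_cassette("ATTGTGGGCGGC"))  # four complete codons
    print(is_in_frame_cassette("ATTGTGGGCGG"))   # one base short
    print(is_in_frame_cassette("ATTTAAGGC"))     # internal stop codon
```

A check of this kind covers only register and premature termination; orientation and junction sequences at the cloning sites would still have to be confirmed by sequencing.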

The serine protease catalytic cassette obtained through the methods described above may be recombinantly expressed by molecular cloning into an expression vector containing a suitable promoter and other appropriate transcription regulatory elements, and transferred into prokaryotic or eukaryotic host cells to express a recombinant zymogen of the serine protease catalytic domain. Techniques for such manipulations are fully described in Sambrook, et al., Molecular Cloning: A Laboratory Manual, 2nd ed. (1989), 1-1626, and are well known to those in the art.

Specifically designed vectors allow the shuttling of DNA between hosts such as bacteria-yeast, bacteria-animal cells, bacteria-fungal cells or bacteria-invertebrate cells. An appropriately constructed expression vector should contain: an origin of replication for autonomous replication in host cells, selectable markers, a limited number of useful restriction enzyme sites, a potential for high copy number, and active promoters. A promoter is defined as a DNA sequence that directs RNA polymerase to bind to DNA and initiate RNA synthesis. A strong promoter is one that causes mRNAs to be initiated at high frequency. Expression vectors may include, but are not limited to, cloning vectors, modified cloning vectors, and specifically designed plasmids or viruses.

A variety of mammalian expression vectors may be used to express recombinant serine protease catalytic domain in a zymogen configuration in mammalian cells.
Commercially available mammalian expression vectors which may be suitable for recombinant protein expression include, but are not limited to, pCI Neo (Promega, Madison, WI), pMAMneo (Clontech, Palo Alto, CA), pcDNA3 (InVitrogen, San Diego, CA), pMC1neo (Stratagene, La Jolla, CA), pXT1 (Stratagene, La Jolla, CA), pSG5 (Stratagene, La Jolla, CA), EBO-pSV2-neo (ATCC 37593), pBPV-1(8-2) (ATCC 37110), pdBPV-MMTneo(342-12) (ATCC 37224), pRSVgpt (ATCC 37199), pRSVneo (ATCC 37198), pSV2-dhfr (ATCC 37146), pUCTag (ATCC 37460), and lZD35 (ATCC 37565).

A variety of bacterial expression vectors may be used to express recombinant serine protease catalytic domain in a zymogen form in bacterial cells. Commercially available bacterial expression vectors which may be suitable for recombinant protein expression include, but are not limited to, pET vectors (Novagen, Inc., Madison, WI), pQE vectors (Qiagen, Valencia, CA), and pGEX (Pharmacia Biotech Inc., Piscataway, NJ). In general, as is found for many mammalian cDNAs, bacterial expression of serine protease cDNA can result in insoluble recombinant proteins that must be renatured in order to refold the protein into the active conformation (Takayama, et al. (1997). J Biol Chem 272:21582-21588). Thus, we have focused our efforts on expression systems featuring eukaryotic cells.

A variety of fungal cell expression vectors may be used to express recombinant serine protease catalytic domain in a zymogen configuration in fungal cells such as yeast. Commercially available fungal cell expression vectors which may be suitable for recombinant protein expression include, but are not limited to, pYES2 (InVitrogen, San Diego, CA) and the Pichia expression vector (InVitrogen, San Diego, CA).

A variety of insect cell expression systems may be used to express recombinant serine protease catalytic domain in a zymogen form in insect cells.
Commercially available baculovirus transfer vectors which may be suitable for the generation of a recombinant baculovirus for recombinant protein expression in Sf9 cells include, but are not limited to, pFastBac1 (Life Technologies, Gaithersburg, MD), pAcSG2 (Pharmingen, San Diego, CA), and pBlueBacII (InVitrogen, San Diego, CA). In addition, a class of insect cell vectors which permit the expression of recombinant proteins in Drosophila Schneider line 2 (S2) cells is also available (InVitrogen, San Diego, CA).

DNA encoding the zymogen activation construct may be subcloned into an expression vector for expression in a recombinant host cell. Recombinant host cells may be prokaryotic or eukaryotic, including but not limited to bacteria such as E. coli, fungal cells such as yeast, mammalian cells including but not limited to cell lines of human, bovine, porcine, monkey and rodent origin, and insect cells including but not limited to Drosophila S2 (ATCC CRL-1963) and fall armyworm (Spodoptera frugiperda) Sf9 (ATCC CRL-1711) derived cell lines. Cell lines derived from mammalian species which may be suitable and which are commercially available include, but are not limited to, CV-1 (ATCC CCL 70), COS-1 (ATCC CRL 1650), COS-7 (ATCC CRL 1651), CHO-K1 (ATCC CCL 61), 3T3 (ATCC CCL 92), NIH/3T3 (ATCC CRL 1658), HeLa (ATCC CCL 2), C127I (ATCC CRL 1616), BS-C-1 (ATCC CCL 26), MRC-5 (ATCC CCL 171), L cells, and HEK-293 (ATCC CRL 1573).

The expression vector may be introduced into host cells via any one of a number of techniques including, but not limited to, transformation, transfection, protoplast fusion, lipofection, and electroporation. Pools of transfected cells may be cultured and analyzed for recombinant protein expression. Alternatively, the expression vector-containing cells are clonally propagated and individually analyzed to determine whether they produce recombinant protein.
Identification of host cell clones expressing recombinant serine protease catalytic domain in a zymogen configuration may be done by several means, including but not limited to immunological reactivity with antibodies directed against the amino acid sequence of the serine protease catalytic domain, if available.

The zymogen activation vector described herein contains modules encoding epitope tags for anti-FLAG and/or anti-HA monoclonal antibodies, which are readily available (Babco, Richmond, CA). Thus, levels of the expressed zymogen protein can be quantified by immunoaffinity and/or ligand affinity techniques. These can be employed by any one of a number of means, such as Western blotting, ELISA or RIA assays of conditioned media from transfected eukaryotic cells, or of transformed bacterial lysates, to detect the production of secreted recombinant serine protease catalytic domain in zymogen form. Since the FLAG epitope is located between the pre and pro sequences, and is removed upon proteolytic activation with either enterokinase (EK) or factor Xa (FXa), the disappearance of this tag is an effective measure of quantitative digestion (see Figures 7, 8, 9 and 10).

Several members of the S1 serine protease family appear to be membrane bound. They may be type II integral membrane proteases, anchored by the NH2-terminus, as is the case for hepsin (Leytus, et al. (1988). Biochemistry 27:1067-74) and EK (Kitamoto, et al. (1994). Proc. Natl. Acad. Sci. U. S. A. 91:7588-92), or anchored at the C-terminus, as exemplified by prostasin (Yu, et al. (1995). J. Biol. Chem. 270:13483-9). In these cases, the biochemical characterization of serine proteases generated in this system is facilitated in that only the catalytic portion is expressed and the trans-membrane domains are excluded. Thus, the expressed zymogens are soluble, which greatly facilitates purification, activation, and subsequent biochemical analyses.
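The use of FLAG disappearance as a readout of activation, described above, can be illustrated with a toy model of the fusion protein. This is a sketch under stated assumptions: the FLAG epitope is taken as the commonly cited DYKDDDDK, the factor Xa site as the commonly cited Ile-Glu-Gly-Arg with cleavage after the arginine, and the linker, the IVGG start of the catalytic chain and the function name `cleave_after_site` are hypothetical. None of these sequences is taken from SEQ.ID.NO.:1 through 10. The FXa site is used in the sketch because the canonical FLAG epitope itself ends in DDDDK, which would make a naive search for an EK site ambiguous.

```python
# Toy model of restriction-protease activation of a tagged zymogen.
# All sequences are illustrative assumptions, not the patent's constructs.

FLAG = "DYKDDDDK"   # assumed canonical FLAG epitope
FXA_SITE = "IEGR"   # assumed canonical factor Xa site; cut follows the Arg


def cleave_after_site(zymogen: str, site: str = FXA_SITE):
    """Split the zymogen immediately after the first occurrence of `site`.

    Returns (released_propeptide, catalytic_chain), or None if the
    site is absent and no cleavage can occur.
    """
    idx = zymogen.find(site)
    if idx == -1:
        return None
    cut = idx + len(site)
    return zymogen[:cut], zymogen[cut:]


if __name__ == "__main__":
    # FLAG + hypothetical linker + FXa site + catalytic chain + 6xHis tag
    toy_zymogen = FLAG + "GS" + FXA_SITE + "IVGG" + "HHHHHH"
    propeptide, chain = cleave_after_site(toy_zymogen)
    print("FLAG on released fragment:", FLAG in propeptide)
    print("FLAG on catalytic chain:  ", FLAG in chain)
    print("New N-terminal residue:   ", chain[0])
```

In this model the FLAG epitope leaves with the small N-terminal fragment while the His-tagged catalytic chain begins with an aliphatic residue, which mirrors the loss of anti-FLAG signal in the digested lanes of Figures 7 through 10.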
Expression of the catalytic domain by the generation of a catalytic cassette module precludes the difficulties one would encounter with the type II membrane-bound serine proteases, since the trans-membrane domain is within an extended non-catalytic NH2-terminus. The design of a soluble catalytic module of the C-terminally tethered serine proteases, however, would require trans-membrane prediction in order to determine how to truncate the catalytic domain upstream of the predicted trans-membrane segment. Identifying putative trans-membrane spanning regions within a particular polypeptide is often accomplished by measuring amino acid hydropathy within a stretch of the sequence being analyzed. There are currently sequence analysis algorithms that are capable of determining regional hydropathy (Kyte and Doolittle (1982). J. Mol. Biol. 157:105-32), enabling the prediction of a potential trans-membrane anchoring C-terminal tail within a given protease sequence.

We have found that activation with either of the two restriction proteases EK and FXa occurs efficiently when the purified serine protease zymogen is bound to Ni-NTA agarose beads. The proteolytic activity of Ni-NTA agarose bead-bound recombinant protease, once cleaved and activated, is unimpeded. The Ni-NTA agarose bead-bound proteases (protease beads) appear stable; their activity can be measured by sequential chromogenic assays, punctuated by intermittent washings, and they remain active through multiple rounds of assay. Although the stability of the protease beads will be determined by the properties of the particular protease being analyzed, these protease beads could potentially be applied where immobilization of the protease is required. An example might be the in vivo analysis of proteolytic activity. A protease bead preparation could be evaluated following subcutaneous or intramuscular delivery, and since the Ni-NTA agarose bead-bound protease would be unlikely to diffuse away, it would better approximate a localized accumulation of the protease in vivo than similarly delivered soluble preparations.
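The regional hydropathy analysis cited above for locating a C-terminal trans-membrane tail can be sketched as a sliding-window average over the Kyte and Doolittle scale. The scale values are those published by Kyte and Doolittle; the 19-residue window and the 1.6 cutoff are a commonly used heuristic assumed here for illustration, and the function names and test sequences are hypothetical.

```python
# Sliding-window hydropathy sketch (Kyte and Doolittle scale).
# Window size and threshold are assumed heuristics, not values
# specified in this document.

KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    return [
        sum(KD_SCALE[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def candidate_tm_windows(seq: str, window: int = 19,
                         threshold: float = 1.6) -> list:
    """Start indices (0-based) of windows whose mean exceeds the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score > threshold]


if __name__ == "__main__":
    soluble = "DEKRSTNQ" * 5                # hypothetical hydrophilic domain
    tethered = soluble + "L" * 20 + "KR"    # hypothetical hydrophobic tail
    print(candidate_tm_windows(soluble))
    print(candidate_tm_windows(tethered))
```

Under these assumptions, a catalytic cassette for a C-terminally tethered protease would be truncated upstream of the first reported window; the prediction is a guide for construct design and does not replace experimental confirmation of solubility.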
Another method of expression for recombinant proteins produced by the zymogen activation construct is the use of in vitro transcription/translation systems (Promega, Madison, WI). The addition of canine pancreatic microsomal membranes would permit membrane translocation and core glycosylation of the expressed zymogen catalytic domains by in vitro transcription/translation. Although these systems generally produce low amounts of translated product, in vitro translated zymogen catalytic domains of serine proteases with high specific activities could be detected following proteolytic activation. RNA transcribed from the zymogen activation construct in vitro may also be translated efficiently following microinjection into Xenopus laevis oocytes.

It is known that there is a substantial amount of redundancy in the various codons that code for specific amino acids. Therefore, this invention is also directed to those DNA sequences that contain alternative codons that code for the eventual translation of the identical amino acid. For purposes of this specification, a sequence bearing one or more replaced codons will be defined as a degenerate variation. Also included within the scope of this invention are mutations either in the DNA sequence or the translated protein that do not substantially alter the ultimate physical properties of the expressed protein. For example, substitution of an aliphatic amino acid for another aliphatic, an aromatic for another aromatic, an acidic for another acidic, or a basic for another basic amino acid may not cause a change in functionality of the polypeptide. Also, more apparently radical substitutions, including a charge reversal, may be made if the function of the residue is to maintain polypeptide solubility.

It is known that DNA sequences coding for a peptide may be altered so as to code for a peptide having properties that are different from those of the naturally occurring peptide. Methods of altering the DNA sequences include, but are not limited to, site-directed mutagenesis.
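The notion of a degenerate variation defined above can be illustrated with a short sketch that compares two DNA sequences through the standard genetic code. The function names and the two example sequences are hypothetical; both examples encode the enterokinase recognition peptide Asp-Asp-Asp-Asp-Lys using different codons.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna in frame 0 ('*' marks a stop codon)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variation(dna_a, dna_b):
    """True when the sequences differ but encode the identical peptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


# Two hypothetical coding sequences for the peptide DDDDK.
original = "GATGATGATGACAAG"
variant = "GACGACGACGATAAA"
same_peptide = is_degenerate_variation(original, variant)
```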

The S1 family of serine proteases is the largest family of peptidases (Rawlings and Barrett (1994). Methods Enzymol 244:19-61). As described above, members of this diverse family perform diverse functions including food digestion, blood coagulation and fibrinolysis, complement activation, as well as other immune or inflammatory responses. It is likely that these functions, in both normal physiology and during diseased states, currently under investigation by numerous laboratories, will become better understood in the near future. Such investigations will undoubtedly be aided by the ability to express large amounts of the active protease, which is then amenable to biochemical analyses. In addition, the discovery of novel S1 serine protease cDNAs will enhance our understanding of the complex pathways controlled by these enzymes. The zymogen activation construct described herein will facilitate the future biochemical characterization of these novel genes.

The present invention is also directed to methods for screening compounds that modulate the activity of proteins expressed from a zymogen activation construct. Compounds that modulate these activities may be DNA, RNA, peptides, proteins, or non-proteinaceous organic molecules. Compounds that modulate the function of proteins expressed from the zymogen activation constructs may be detected by a variety of assays. The assay may be a simple "yes/no" assay to determine whether there is a change in catalytic or enzymatic activity. The assay may be made quantitative by comparing the expression or function of a test sample with the levels of expression or function in a standard sample. Modulators identified in this process may be useful as therapeutic agents.

Kits containing the zymogen activation vector DNA may be prepared since these constructs will be generally useful to express, activate and characterize the activity of a wide variety of heterologous serine proteases. Such kits will be particularly beneficial, for example, to investigators in gene discovery for expressing novel serine proteases in order to determine their proteolytic specificity. Such a kit would comprise a compartmentalized carrier suitable to hold in close confinement at least one container.

The carrier would further comprise reagents such as recombinant protein or antibodies suitable for detecting the expressed proteins. The carrier may also contain a means for detection such as labeled antigen or enzyme substrates or the like.

In addition, the use of the methodology described herein has commercial value since it can be used to generate vast amounts of activated serine proteases which have potential utility in biochemical reactions or as therapeutic proteins.

The following examples illustrate the present invention without, however, limiting the same thereto.

EXAMPLE 1

Plasmid Manipulations

All molecular biological methods were in accordance with those previously described (Sambrook, et al. Molecular Cloning: A Laboratory Manual, 2nd ed., (1989). 1-1626). Oligonucleotides were purchased from Ransom Hill Biosciences (Ransom Hill, CA) (Table 1) and all restriction endonucleases and other DNA modifying enzymes were from New England Biolabs (Beverly, MA) unless otherwise specified. Constructs were initially made in the pCDNA3 (InVitrogen, San Diego, CA) or the pCIneo (Promega, Madison, WI) vectors and subsequently transferred into Drosophila expression vectors pRM63 and pFLEX64 as described below. The Drosophila expression vectors used are similar to those commercially available (InVitrogen, San Diego, CA). All construct manipulations were confirmed by dye terminator cycle sequencing using Applied Biosystems 373 fluorescent sequencers (Perkin Elmer, Foster City, CA).

The various modules used in the zymogen activation constructs are schematized in Figure 1. The bovine prolactin pre sequence (signal sequence) was fused upstream of the FLAG epitope in a manner similar to that previously described (Ishii, et al. (1993).
J Biol Chem

EK site (EK2 and EK3) were generated by direct double-stranded oligonucleotide insertions using the corresponding oligonucleotides. By design, these oligonucleotides once annealed would possess a 5'-Not I and a 3'-Xba I site such that they could be inserted into PFpCDNA3 or CFpCDNA3, which contain the prolactinFLAG and chymotrypsinogenFLAG pre sequences respectively, to generate a series of pre-pro sequence modules such as PFFXapCDNA3 and CFEK2pCDNA3 etc.

The other class of S1 serine proteases can be generally defined by several smaller serine proteases like trypsin, prostate specific antigen, and stratum corneum chymotryptic enzyme. This class, which we will refer to as type I, lacks the cysteine residue just upstream of the cleavage site yet contains a cysteine just downstream of the zymogen activation pro sequence. In the case of these trypsin-like S1 serine proteases, this cysteine (Cys-22 by chymotrypsinogen numbering) participates in disulfide bond formation with a cysteine in the catalytic domain (Cys-157) (Stroud, et al (1974). J Mol Biol 83:185-208; Kossiakoff et al. (1977). Biochemistry 16:654-64) and may have important consequences on catalytic activity and/or substrate specificity. In order to accommodate this other type of serine protease, two more EK cleavage modules for the zymogen activation constructs were generated (Figure 2).

Thus, to analyze the activity of a particular serine protease cDNA, the appropriate combination of pre-pro sequence that corresponds to the amino acid sequence of the particular serine protease can be used. For example, the trypsin-like type I serine protease could be expressed from a PFEK3 pre-pro sequence while a chymotrypsin-like type II protease may be better represented by the CFEK2 pre-pro modules.

Other pro sequences, and variations of them, are suitable for use in the present invention as pro sequences for cleavage by a restriction protease for activating the inactive zymogen produced by this system. These include, but are not limited to, the cleavage sites for the restriction proteases thrombin and PreScission Protease (Pharmacia Biotech Inc., Piscataway, NJ).
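When choosing among pro sequences for a given protease, it may help to confirm that the restriction protease site occurs only where intended. The sketch below scans a protein sequence for the commonly cited recognition motifs of enterokinase (DDDDK), factor Xa (IEGR), thrombin (LVPR) and PreScission Protease (LEVLFQ), reporting the index of the first residue after each cut. Only the residues on the N-terminal side of the scissile bond are matched, which is a simplification, and the two test peptides are our own reading of the FXa and EK pro modules rather than sequences quoted from the specification.

```python
# Recognition motifs on the N-terminal side of the scissile bond.
# Matching only this side is a simplifying assumption of the sketch.
SITES = {
    "enterokinase": "DDDDK",
    "factor Xa": "IEGR",
    "thrombin": "LVPR",
    "PreScission": "LEVLFQ",
}


def cleavage_positions(protein, motif):
    """0-based index of the first residue after each occurrence of motif."""
    protein = protein.upper()
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append(start + len(motif))
        start = protein.find(motif, start + 1)
    return hits


def scan(protein):
    """Map each restriction protease with at least one site to its cut positions."""
    found = {}
    for name, motif in SITES.items():
        positions = cleavage_positions(protein, motif)
        if positions:
            found[name] = positions
    return found


# Hypothetical pro/catalytic junction carrying a factor Xa site.
junction = "AALAAPFIEGRIVEGSD"
cuts = scan(junction)
new_n_terminus = junction[cuts["factor Xa"][0]:]
```

In this example the fragment released by factor Xa begins with isoleucine, consistent with the activation mechanism in which cleavage exposes a new aliphatic amino terminus.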

268:9780-6). This sequence module was generated by designing a series of 5 double stranded oligonucleotides having cohesive overhangs. These oligonucleotides were kinased, paired (PF-#1U with PF-#10L, PF-#2U with PF-#9L, PF-#3U with PF-#8L, PF-#4U with PF-#7L, PF-#5U with PF-#6L; Table 1) in 500 mM NaCl, and annealed in 5 separate reactions. Aliquots of the annealed oligonucleotides were combined, ligated and the product subjected to PCR with primers PF-#U and PF-#6L. This preparative reaction was performed using Amplitaq (Perkin Elmer, Foster City, CA) in the buffer supplied by the manufacturer with 10 cycles of 93 °C for 45 sec / 60 °C for 45 sec / 72 °C for 45 sec, followed by 5 min at 72 °C. The product was digested with Eco RI and Not I and ligated into the pCDNA3 vector cleaved with Eco RI and Not I followed by dephosphorylation with calf alkaline phosphatase. An isolate containing the desired sequence, designated prolactinFLAGpCDNA3 (PFpCDNA3), was used in subsequent manipulations.

Additional pre sequences such as the human trypsinogen I and chymotrypsinogenFLAG (ChymoFLAG or CF) (Figure 1) were generated by a direct double stranded oligonucleotide insertion using the corresponding oligonucleotides (Table 1). Since these two pre sequences are shorter than that of prolactin, the annealed duplexes were designed to contain 5'-Eco RI and 3'-Not I cohesive ends and thereby could be inserted into the corresponding sites of pCDNA3 directly.

Most members of the S1 protease family contain a cysteine residue just upstream from the cleavage site of the pro sequence in a conserved region. This cysteine residue (Cys-1 by chymotrypsin numbering) is disulfide bonded to another conserved cysteine within the catalytic domain (Cys-122) (Matthews et al. (1967). Nature (London) 214:652-6). We will refer to this class of S1 serine proteases as type II. It is possible that the existence of this catalytic cysteine residue 122 in the disulfide-bonded state is important for specific activity and/or substrate specificity. Consequently, in order to accommodate serine proteases of this type, we synthesized the CF pre sequence that will produce recombinant proteases containing a cysteine residue just upstream of the zymogen cleavage site.
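The direct oligonucleotide insertion strategy described above depends on the annealed duplex presenting the correct cohesive ends. As a minimal sketch, the following code anneals an upper and a lower oligonucleotide in silico and reports the 5' overhang left on each strand. The two sequences are the trypsinogen I pre sequence oligonucleotides as listed in Table 1 (TryplPre-U and TryplPre-L); the function names are hypothetical, and the sketch assumes both strands carry 5' overhangs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhangs(upper, lower):
    """Return (upper 5' overhang, lower 5' overhang) of the annealed duplex.

    Assumes each strand protrudes at its own 5' end, as for duplexes made
    to fit Eco RI, Not I or Xba I cut vectors. Returns None if no pairing
    is found.
    """
    upper, lower = upper.upper(), lower.upper()
    rc_lower = revcomp(lower)
    for k in range(len(upper)):
        paired = upper[k:]
        # The longest suffix of upper that matches the start of the
        # reverse-complemented lower strand defines the duplex region.
        if len(paired) <= len(rc_lower) and rc_lower.startswith(paired):
            return upper[:k], lower[:len(lower) - len(paired)]
    return None


# Trypsinogen I pre sequence oligonucleotides as given in Table 1.
tryp_u = "AATTCACCATGAATCCACTCCTGATCCTTACCTTTGTGGC"
tryp_l = "GGCCGCCACAAAGGTAAGGATCAGGAGTGGATTCATGGTG"
ends = five_prime_overhangs(tryp_u, tryp_l)
```

The overhangs recovered for this pair, AATT and GGCC, are those generated by Eco RI and Not I digestion respectively, in keeping with the stated design.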

Other pre sequences are suitable for use in the present invention as pre sequences for trafficking recombinant proteins into the secretory pathway of eukaryotic cells. These often include, but are not limited to, translational initiation methionine residues followed by a stretch of aliphatic amino acids. Export signal sequences target newly synthesized proteins to the endoplasmic reticulum of eukaryotic cells and the plasma membrane of bacteria. Although signal sequences contain a hydrophobic core region, they show great variation in both overall length and amino acid sequence. Recently, it has become clear that this variation allows signal sequences to specify different modes of targeting and membrane insertion. In the vast majority of instances, the signal peptide does not interfere with the secreted protein function following its cleavage by the signal peptidase (Martoglio and Dobberstein (1998). Trends Cell Biol 8:410-415). A variety of signal sequence modules, for general use in the secretion of expressed proteins, are currently commercially available (Invitrogen, San Diego, CA), and are suitable for use in the present invention as pre sequences.

Pro Sequence Generation

The EK cleavage site of human trypsinogen I was generated using PCR with the two primers EK1-U and EK1-L (Table 1). The template was an EST (W40511) identified through FASTA searches (Pearson and Lipman (1988). Proc Natl Acad Sci U. S. A. 85:2444-8) of dbEST and obtained from the I.M.A.G.E. consortium through Genome Systems Inc., St. Louis, MO. The purified plasmid DNA of W40511 was used as a template in preparative PCR reactions with Amplitaq (Perkin Elmer, Foster City, CA) in accordance with the manufacturer's recommendations with 15 cycles of 93 °C for 45 sec / 53 °C for 45 sec / 72 °C for 45 sec, followed by 5 min at 72 °C.
The PCR product was subcloned using the T/A vector pCR 2.1 (InVitrogen, San Diego, CA) and a clone with the desired sequence was chosen. The product was preparatively isolated by digestion using Not I and Xba I and subcloned downstream of the PF pre sequence between the Not I and Xba I sites in PFpCDNA3 to make PFEKpCDNA3. Additional pro sequences such as the FXa cleavage site and variations of the

C-terminal Affinity/Epitope Tags

Kinased, annealed double-stranded oligonucleotides containing 5'-Xba I and 3'-Not I cohesive ends were designed corresponding to either a stop codon, 6 histidine codons and a C-terminal stop codon (6XHISTAG), or a Hemagglutinin epitope tag with a C-terminal stop codon (HATAG) (Figure 1 and Table 1). These oligonucleotides were individually ligated between the Xba I and Not I sites in the plasmid vector pCIneo (Promega, Madison, WI). Likewise, oligonucleotides were designed corresponding to the Hemagglutinin epitope tag but lacking a C-terminal stop codon (HA-Nonstop). This kinased, annealed double-stranded oligonucleotide, containing Xba I cohesive termini, was reiteratively inserted upstream of the HATAG to generate a 3XHATAG epitope tag. In addition, the HA-Nonstop oligonucleotide was inserted upstream of the 6XHISTAG to generate a Hemagglutinin epitope/6XHIS affinity tag (HA6XHISTAG).

Zymogen Activation Vector Generation

The series of pre-pro sequences described above (e.g. PFFXa or CFEK2) were preparatively excised from the pCDNA3 vector using Eco RI and Xba I. The FXa sequence shown in Table 1, in particular, contains an Xba I site which becomes blocked by overlapping Dam methylation. To overcome this phenomenon, plasmid DNA of these FXa recombinants had to be transformed into and purified from a strain lacking Dam methylation (for example SCS110; Stratagene, La Jolla, CA) in order to cleave this site using the Xba I restriction enzyme.
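The Dam methylation problem noted above arises when the Xba I site (TCTAGA) overlaps a GATC sequence, which happens when the site is preceded by GA or followed by TC. A minimal sketch for flagging such sites in a construct sequence follows; the function name and test sequences are hypothetical illustrations.

```python
def dam_blocked_xbai_sites(seq):
    """Return start indices of Xba I sites (TCTAGA) that overlap a GATC.

    Dam methylase modifies the adenine in GATC; an overlap occurs in
    GATCTAGA (upstream GA) or TCTAGATC (downstream TC).
    """
    seq = seq.upper()
    blocked = []
    i = seq.find("TCTAGA")
    while i != -1:
        upstream = i >= 2 and seq[i - 2:i + 2] == "GATC"
        downstream = seq[i + 4:i + 8] == "GATC"
        if upstream or downstream:
            blocked.append(i)
        i = seq.find("TCTAGA", i + 1)
    return blocked


# Hypothetical junctions: the first overlaps a downstream GATC, the second does not.
blocked_example = dam_blocked_xbai_sites("AAATCTAGATCCGAG")
clean_example = dam_blocked_xbai_sites("AAATCTAGAGGCC")
```

A construct that returns a non-empty list would need to be propagated in a Dam-deficient host before Xba I digestion, as described for the FXa recombinants.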
The pre-pro sequences were ligated into the various C-terminal epitope or affinity tagged pCIneo constructs between their 5'-Eco RI and 3'-Xba I sites. Thus, these constructs all feature a pre sequence (prolactinFLAG, PF; chymotrypsinogenFLAG, CF; or trypsinogen, T) to direct secretion, in-frame with a pro sequence recognized by a restriction protease, EK (sites EK1, EK2, EK3) or factor Xa (site FXa), to permit the post-translational cleavage for zymogen activation. A unique Xba I restriction enzyme site immediately upstream of the epitope/affinity tags, described above, separates these pre-pro combinations (Figure 2). Due to the nature of the design, the Xba I site is critical to these vectors, and was chosen based on several criteria as follows. These include the observation that the "6-cutter" (a restriction enzyme recognizing 6 nucleotide bases in its specific cleavage site) restriction enzyme Xba I site is found infrequently within cDNAs, which greatly minimizes labor-intensive cloning steps in the generation of cDNA expression constructs for general use. Additionally, should one or more Xba I sites exist within a particular cDNA sequence one desires to insert into this vector, two other restriction enzymes (Spe I and Nhe I) are also rare 6-cutters which give rise to Xba I compatible cohesive ends. It should be noted that in this series of zymogen activation constructs, the translational register of the pre-pro sequences is distinct from that of the epitope/affinity tags. The resulting recombinants comprise a series of mammalian zymogen activation constructs in the pCIneo background.

For increased levels of expression, these pre-pro-epitope modules were individually shuttled into vectors capable of expression in Drosophila S2 cells. This was accomplished by preparatively isolating the individual pre-pro-Xba I-epitope/affinity-tag modules by digesting the mammalian pCIneo zymogen activation constructs with 5'-Eco RI and 3'-Hinc II. These modules were then inserted into the Eco RI and Hinc II sites of either an inducible Drosophila vector pRM63 containing the metallothionein promoter, or the constitutive Drosophila vector pFLEX64 containing the actin 5c promoter.

EXAMPLE 2

Acquisition of Serine Protease cDNAs

Acquisition of a full length cDNA corresponding to the serine protease prostasin

The full length cDNA for prostasin (Yu, et al. (1995). J Biol Chem 270:13483-9) was identified through FASTA searches of dbEST (Genbank accession number AA205604) and obtained from the I.M.A.G.E. consortium through Genome Systems, Inc., St. Louis, MO. The clone was sequenced for confirmation.

Acquisition of a full length cDNA corresponding to the novel protease 0

A full-length clone of a novel serine protease (Yoshida, et al., (1998). Biochim. Biophys. Acta, 1399:225-228), designated protease 0, was cloned and sequenced for confirmation using standard techniques known to those skilled in the art.

Acquisition of a full length cDNA corresponding to the human orthologue of protease neuropsin

A partial clone with homology to the murine neuropsin (Chen, et al. (1995). J Neurosci 15:5088-97) was also identified (Yoshida, et al., (1998). Gene, 213:9-16). The full-length cDNA of human neuropsin was obtained by screening a Uni-ZAP keratinocyte library, followed by in vivo excision and sequence analysis of positive purified plaques.

General plasmid manipulation

The purified plasmid DNA of these serine protease cDNAs was used as a template in 100 µl preparative PCR reactions with Amplitaq (Perkin Elmer, Foster City, CA) or Pfu DNA polymerase (Stratagene, La Jolla, CA) in accordance with the manufacturer's recommendations. Typically, reactions were run at 18 cycles of 93 °C for 30 sec / 53 to 65 °C for 30 sec / 72 °C for 90 sec, followed by 5 min at 72 °C using the Pfu DNA polymerase. The annealing temperatures used were determined for the particular construct by the PrimerSelect 3.11 program (DNASTAR Inc., Madison, WI). The primers of the respective serine proteases (Table 1), containing Xba I cleavable ends, were designed to flank the catalytic domains of these three proteases and generate Xba I catalytic cassettes (Figure 1). Since the protease prostasin is initially thought to be C-terminally membrane bound, and subsequently rendered soluble through proteolysis following secretion (Yu, et al. (1995). J Biol Chem 270:13483-9), a soluble form of prostasin was generated. This was accomplished by excluding the C-terminal 29 amino acids in the prostasin catalytic cassette by designing the C-terminal Xba I primer (prostasin(SOL) Xba-L, Table 1) to a position immediately upstream from the hydrophobic stretch of amino acids thought to represent a membrane tether.
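Before designing Xba I-flanked primers for a new catalytic cassette, one may wish to check the catalytic domain for internal sites and, if Xba I is present, fall back on Spe I or Nhe I, which leave Xba I-compatible ends. A minimal sketch follows; the recognition sequences are the standard ones for these enzymes, while the function name, the preference order, and the test sequences are hypothetical.

```python
# Recognition sites; all three enzymes leave a 5'-CTAG overhang.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {"XbaI": "TCTAGA", "SpeI": "ACTAGT", "NheI": "GCTAGC"}


def choose_cloning_enzyme(insert):
    """Return the first enzyme (Xba I preferred) with no site inside insert.

    Returns None when the insert contains sites for all three enzymes.
    """
    insert = insert.upper()
    for name, site in ENZYMES.items():
        if site not in insert:
            return name
    return None


# Hypothetical catalytic domain fragment containing an internal Xba I site.
enzyme = choose_cloning_enzyme("ATGTCTAGAGCCATTGTG")
```

The function inspects only the cDNA portion to be cloned, so the flanking primer-encoded sites should be excluded from the sequence passed in.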

The preparative PCR products were phenol/CHCl3 (1:1) extracted once, CHCl3 extracted, and then EtOH precipitated with glycogen (Boehringer-Mannheim Corp., Indianapolis, IN) carrier. The precipitated pellets were rinsed with 70 % EtOH, dried by vacuum, and resuspended in 80 µl H2O, 10 µl 10x restriction buffer number 2 and 1 µl 100x BSA (New England Biolabs, Beverly, MA). The products were digested for at least 3 hours at 37 °C with 200 units Xba I restriction enzyme (New England Biolabs, Beverly, MA). The Xba I digested products were phenol/CHCl3 (1:1) extracted once, CHCl3 extracted, EtOH precipitated, rinsed with 70 % EtOH, and dried by vacuum. For purification from contaminating template plasmid DNA, the products were electrophoresed through 1.0 % low melting temperature agarose (Life Technologies, Gaithersburg, MD) gels in TAE buffer (40 mM TRIS-Acetate, 1 mM EDTA pH 8.3) and excised from the gel. Aliquots of the excised products were routinely used for in-gel ligations with the appropriate Xba I digested, dephosphorylated and gel purified zymogen activation vector. Once inserted in the correct orientation, these cassettes were in the proper translational register with the NH2-terminal pre-pro sequence and the C-terminal epitope/affinity tag. PCR products directly cloned, as described above, were sequenced for confirmation. Only clones having confirmed sequences were chosen to isolate the Xba I catalytic cassette for subsequent subcloning into additional vectors of the series when desired.

EXAMPLE 3

Expression of Recombinants in Drosophila S2 Cells

Drosophila S2 cells (ATCC, CRL-1963) (1 x 10^7 cells) were co-transfected with 1 µg of DESneo plasmid and 19 µg of recombinant zymogen activation construct purified DNA (Qiagen, Valencia, CA) using the calcium phosphate precipitation method (Wigler, et al. (1977). Cell 11:223-32). Transfected cells were incubated at 22 to 24 °C for 24 hours. The calcium phosphate solution was removed and the cells washed twice with complete medium.

Cells were allowed to grow for 48 hours without selection. G418 was added to a final concentration of 400 µg/ml. Cells were split approximately every 5 to 8 days. A stable population of G418-resistant cells was obtained in 4-5 weeks. Subcultures of the stable transfected S2 cells (2 x 10^6 cells/ml in serum-free medium) were induced to express recombinant serine proteases for 40 hours by the addition of CuSO4 to a final concentration of 1.0 mM.

EXAMPLE 4

Purification and Activation of Recombinant Serine Proteases

The serum-free culture medium from stable S2 cells was used to purify secreted recombinant serine proteases. The medium was concentrated 4 to 5 fold using an appropriate Centriprep concentrator for the calculated molecular weight of the protein (Amicon Inc., Beverly, MA). 150 µl of 50 % Ni-NTA slurry (Qiagen, Valencia, CA) was added to 5 to 10 ml of the concentrated medium and mixed by shaking at 4 °C for 60 minutes. The zymogen-bound resins were washed 3 times with 1.5 ml of wash buffer (10 mM TRIS-HCl (pH 8.0), 300 mM NaCl, and 15 mM imidazole), followed by a 1.5 ml wash with distilled H2O. Enterokinase cleavage was carried out by adding enterokinase (Novagen, Inc., Madison, WI; or Sigma, St. Louis, MO) to the zymogen-bound Ni-NTA beads in a 150 µl volume at room temperature overnight with gentle shaking in a buffer containing 20 mM TRIS-HCl (pH 7.4), 50 mM NaCl, and 2.0 mM CaCl2. The resins were then washed twice with 1.5 ml wash buffer. The activated serine proteases were eluted with elution buffer (20 mM TRIS-HCl (pH 7.8), 250 mM NaCl, and 250 mM imidazole). Eluted protein concentration was determined by a Micro BCA Kit (Pierce, Rockford, IL) using bovine serum albumin as a standard.

Amidolytic activities of the activated serine proteases were monitored by release of para-nitroaniline (pNA) from the synthetic substrates indicated in Table 2. The chromogenic substrates used in these studies were all commercially available (Bachem California Inc., Torrance, CA; American Diagnostica Inc., Greenwich, CT; Kabi Pharmacia Hepar Inc., Franklin, OH). Assay mixtures contained chromogenic substrates at 500 µM in 10 mM TRIS-HCl (pH 7.8), 25 mM NaCl, and 25 mM imidazole. Release of pNA was measured over 120 min at 37 °C on a micro-plate reader (Molecular Devices, Menlo Park, CA) with a 405 nm absorbance filter. The initial reaction rates (Vmax, mOD/min) were determined from plots of absorbance versus time using Softmax (Molecular Devices, Menlo Park, CA). The specific activities (nmole pNA produced/min/µg protein) of the activated proteases for the various substrates are presented in Table 2. No measurable chromogenic amidolytic activity was detected with the purified unactivated zymogens.

EXAMPLE 5

Electrophoresis and Western Blotting Detection of Recombinant Serine Proteases

Samples of the purified zymogens or activated proteases, denatured in the presence or absence of the reducing agent dithiothreitol (DTT), were analyzed by SDS-PAGE (Bio Rad, Hercules, CA) stained with Coomassie Brilliant Blue.
For Western blotting, the FLAG-tagged serine proteases expressed from transient or stable S2 cells were detected with anti-FLAG M2 antibody (Babco, Richmond, CA). The secondary antibody was a goat-anti-mouse IgG (H+L), horseradish peroxidase-linked F(ab')2 fragment (Boehringer Mannheim Corp., Indianapolis, IN) and was detected by the ECL kit (Amersham, Arlington Heights, IL).

Figure 7 demonstrates PFEK2-prostasin-6XHIS function by showing the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Since the FLAG epitope is located just upstream of the EK pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lanes. Also shown in panel B, the untreated or EK digested PFEK2-prostasin-6XHIS was denatured in the absence of DTT, in order to retain disulfide bonds, prior to electrophoresis (lanes 3 and 4). Although equivalent amounts of sample were loaded into each lane of the gel in the Western blot of B, the anti-FLAG MoAb M2 appears to detect proteins better when pretreated with DTT (compare lane B1 with B3).

Figure 8 demonstrates CFEK2-prostasin-6XHIS function by showing the quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Since the FLAG epitope is located just upstream of the EK2 pro sequence, cleavage with EK generates a FLAG-containing polypeptide which is too small to be retained in the polyacrylamide gel, and is therefore not detected in the +EK lanes. Also shown in panel B, the untreated or EK digested CFEK2-prostasin-6XHIS was denatured in the absence of DTT, in order to retain disulfide bonds, prior to electrophoresis (lanes 3 and 4).
Of significance in lane 4 is the retention of the FLAG epitope, indicating the formation of a disulfide bond between the cysteine in the CF pre sequence and a cysteine in the catalytic domain of prostasin, which is presumably Cys-122 (chymotrypsin numbering). Retention of the FLAG epitope, following EK cleavage and denaturation without DTT, is not observed using the prolactin pre sequence, which lacks a cysteine residue (compare lane 4 of Figure 7 with lane 4 of Figure 8). This documents that the CF pre sequence is capable of forming a light chain that is disulfide bonded to the heavy catalytic chain of the recombinant serine proteases when expressed in this system. It appears that in the absence of the reducing agent DTT, the EK cleaved polypeptides have a reproducibly decreased mobility in the gel (compare lane B3 with B4) for reasons that remain uncertain.

Figure 9 demonstrates function of PFEK1-neuropsin-6XHIS by showing quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease. Figure 10 demonstrates function of PFEK1-protease O-6XHIS by showing quantitative cleavage of the expressed and purified zymogen to generate the processed and activated protease.
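The specific activities described in Example 4 are expressed as nmole pNA produced/min/µg protein, whereas the plate reader reports initial rates in mOD/min. One way to make that conversion is through the Beer-Lambert law, sketched below. The molar extinction coefficient for pNA at 405 nm (9920 per M per cm), the optical path length, the assay volume and the function name are all assumptions of this sketch; none of these values is stated in the specification, and the path length in a micro-plate well in particular depends on the fill volume.

```python
def specific_activity(rate_mod_per_min, protein_ug, volume_ul,
                      epsilon_per_m_cm=9920.0, path_cm=1.0):
    """Convert an initial rate in mOD/min to nmol pNA/min/ug protein.

    epsilon_per_m_cm and path_cm are assumed values and should be
    replaced by those measured for the instrument and plate in use.
    """
    abs_per_min = rate_mod_per_min / 1000.0                  # mOD -> absorbance units
    molar_per_min = abs_per_min / (epsilon_per_m_cm * path_cm)  # Beer-Lambert: c = A / (e * l)
    nmol_per_min = molar_per_min * (volume_ul * 1e-6) * 1e9      # mol/L * L -> mol -> nmol
    return nmol_per_min / protein_ug


# Hypothetical reading: 9.92 mOD/min from 1 ug protease in a 200 ul assay.
activity = specific_activity(9.92, protein_ug=1.0, volume_ul=200.0)
```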

Table 1

SEQ. ID NO.:  Oligo Name            Sequence
15            Stop-U                OTAGATAGO
16            Stop-L                GGCCGCTAT
17            HA-Stop-U             CTAGATACCCCTACGATGTGCCCGATTACGCCTAGC
18            HA-Stop-L             GGCCGCTAGGCGTAATCGGGCACATCGTAGGGGTAT
19            HA-Nonstop-U          CTAGATACCCCTACGATGTGCCCGATTAOGCCG
20            HA-Nonstop-L          CTAGCGGCGTAATCGGGCACATCGTAGGGGTAT
21            6XHIS-U               OTAGACATCACCATCACCATCACTAGC
22            6XHIS-L               GGCCGCTAGTGATGGTGATGGTGATGT
23            PF-#1U                TGAATTCACCACCATGGACAGCAAAGGTTCGTCG
24            PF-#2U                CAGAAAGGGTCOCGCCTGCTCCTGCTGCTG
25            PF-#3U                GTGGTGTCAAATCTACTCTTGTGCCAGGGT
26            PF-#4U                GTGGTCTCCGACTACAAGGACGACGACGAC
27            PF-#5U                GTGGACGCGGCOGCATTATTA
28            PF-#6L                TMTAATGCGGCCGCGTCCACGTCGTCGTCGTCCT
29            PF-#7L                TGTAGTCGGAGACCACACCCT
30            PF-#8L                GGCACAAGAGTAGATTTGACACCACCAGOA
31            PF-#9L                GCAGGAGCAGGCGGGACCOTTTCTGCGACG
32            PF-#10L               AACCTTTGCTGTCCATGGTGGTGAATTCA
33            TryplPre-U            AATTCACCATGAATCCACTCCTGATCCTTACCTTTGTGGC
34            TryplPre-L            GGCCGCCACAAAGGTAAGGATCAGGAGTGGATTCATGGTG
35            CF-#1U                AATTCACCACCATGGCTTTCCTCTGGCTCCTCTCCTGCTGGGCCCTCCTGGGTAC
36            CF-#2L                CCAGGAGGGCCCAGCAGGAGAGGAGCCAGAGGAMAGCCATGGTGGTG
37            CF-#3U                CACCTTCGGCTGCGGGGTCCCCGACTACAAGGACGACGACGAOGO
38            CF-#4L                GGCCGCGTCGTCGTCGTCCTTGTAGTCGGGGACCCCGCAGCCGAAGGTGGTAC
39            EK1-U                 GTGGCGGCCGCTCTTGCTGCCCCCTTTGA
40            EK1-L                 TTCTCTAGACAGTTGTAGCCCCCMCGA
41            EK2-U                 GGCCGOTCTTGCTGCCCCCTTTGATGATGATGACMGATCGTTGGGGGCTATGCT
42            EK2-L                 CTAGAGCATAGCCCCCAACGATCTTGTCATOATCATCAMGGGGGCAGCAAGAGC
43            EK3-U                 GGCCGCTCTTGCTGCCCCCTTTGATGATGATGACMGATCGTTGGGGGCTATTGT
44            EK3-L                 CTAGACAATAGCCCCCAACGATCTTGTCATCATCATCAMGGGGGCAGCAAGAGC
45            FXa-U                 GGCCGCTCTTGCTGCCCCCTTTATCGAGGGGCGCATTGTGGAGGGCTCGGAT
46            FXa-L                 CTAGATOCGAGCCCTCCACAATGCGCCCCTCGATAMGGGGGOAGCMAGAGC
47            prostasin Xba-U       AGCAGTCTAGAGGCCGGTCAGTGGCCCTGGCA
48            prostasin(SOL) Xba-L  GCTGGTCTAGAGCTGAAGGCCAGGTGGC
49            neuropsin Xba-U       GGTATCTAGAGCCCTTGCTGCCTATGATC
50            neuropsin Xba-L       ACTGTCTAGAACCCCATTCGCAGCCTTGGC
51            protease 0 Xba-U      TCGATCTAGAAAAGCACTCCCAGCCCTGGCAG
52            protease 0 Xba-L      GTCCTCTAGAATTGTTCTTCATCGTCTCCTGG

Protease cDNA     Genbank Acc.#
h Trypsinogen I   W40511
h Prostasin       AA205604
h Neuropsin       2604309
h Protease 0      2723646

Table 2

Specific activity (nmole pNA produced/min/µg protein)

Recombinant Protease    H-D-Pro-HHT-Arg-pNA   H-D-Lys(CBO)-Pro-Arg-pNA   H-D-Val-Leu-Lys-pNA   H-DL-Val-Leu-Arg-pNA
PFEK2-prostasin-6XHIS   0.055±0.002           0.870±0.022                N.D.                  0.251±0.005
CFEK2-prostasin-6XHIS   0.116±0.011           1.317±0.024                N.D.                  0.384±0.003
PFEK1-neuropsin-6XHIS   0.463±0.014           0.731±0.004                0.158±0.001           0.938±0.002
PFEK1-protease 0-6XHIS  0.058±0.002           0.022±0.000                N.D.                  0.006±0.000

N.D. = Not Determined

References Cited

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403-10.

Chen, Z.-L., Yoshida, S., Kato, K., Momota, Y., Suzuki, J., Tanaka, T., Ito, J., Nishino, H., Aimoto, S., Kiyama, H., and Shiosaka, S. (1995). Expression and activity-dependent changes of a novel limbic-serine protease gene in the hippocampus. J. Neurosci. 15, 5088-97.

Davie, E. W., Fujikawa, K., and Kisiel, W. (1991).
The coagulation cascade: initiation, maintenance, and regulation. Biochemistry 30, 10363-70.

Hansson, L., Stroemqvist, M., Baeckman, A., Wallbrandt, P., Carlstein, A., and Egelrud, T. (1994). Cloning, expression, and characterization of stratum corneum chymotryptic enzyme. A skin-specific human serine proteinase. J. Biol. Chem. 269, 19420-6.

Huber, R., and Bode, W. (1978). Structural basis of the activation and action of trypsin. Acc. Chem. Res. 11, 114-22.

Ishii, K., Hein, L., Kobilka, B., and Coughlin, S. R. (1993). Kinetics of thrombin receptor cleavage on intact cells. Relation to signaling. J. Biol. Chem. 268, 9780-6.

Kitamoto, Y., Yuan, X., Wu, Q., McCourt, D. W., and Sadler, J. E. (1994). Enterokinase, the initiator of intestinal digestion, is a mosaic protease composed of a distinctive assortment of domains. Proc. Natl. Acad. Sci. U. S. A. 91, 7588-92.

Kossiakoff, A. A., Chambers, J. L., Kay, L. M., and Stroud, R. M. (1977). Structure of bovine trypsinogen at 1.9 Å resolution. Biochemistry 16, 654-64.

Kyte, J., and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-32.

Leytus, S. P., Loeb, K. R., Hagen, F. S., Kurachi, K., and Davie, E. W. (1988). A novel trypsin-like serine protease (hepsin) with a putative transmembrane domain expressed by human liver and hepatoma cells. Biochemistry 27, 1067-74.

Little, S. P., Dixon, E. P., Norris, F., Buckley, W., Becker, G. W., Johnson, M., Dobbins, J. R., Wyrick, T., Miller, J. R., Mackellar, W., Hepburn, D., Corvalan, J., Mcclure, D., Liu, X., Stephenson, D., Clemens, J., and Johnstone, E. M. (1997). Zyme, a novel and potentially amyloidogenic enzyme cDNA isolated from Alzheimer's disease brain. J. Biol. Chem. 272, 25135-25142.

Martoglio, B., and Dobberstein, B. (1998). Signal sequences: more than just greasy peptides. Trends Cell Biol. 8, 410-415.

Matthews, B. W., Sigler, P. B., Henderson, R., and Blow, D. M. (1967). Three-dimensional structure of tosyl-α-chymotrypsin. Nature (London) 214, 652-6.

Pearson, W. R., and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U. S. A. 85, 2444-8.

Proud, D., and Kaplan, A. P. (1988). Kinin formation: mechanisms and role in inflammatory disorders. Annu. Rev. Immunol. 6, 49-83.

Rawlings, N. D., and Barrett, A. J. (1994). Families of serine peptidases. Methods Enzymol. 244, 19-61.

Reid, K. B. M., and Porter, R. R. (1981). The proteolytic activation systems of complement. Annual Review of Biochemistry 50, 433-464.

Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd ed.: Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

Stroud, R. M., Kay, L. M., and Dickerson, R. E. (1974). Structure of bovine trypsin. Electron density maps of the inhibited enzyme at 5 Å and 2.7 Å resolution. J. Mol. Biol. 83, 185-208.

Tachias, K., and Madison, E. L. (1996). Converting tissue-type plasminogen activator into a zymogen. J. Biol. Chem. 271, 28749-28752.

Takayama, T. K., Fujikawa, K., and Davie, E. W. (1997). Characterization of the precursor of prostate-specific antigen. Activation by trypsin and by human glandular kallikrein. J. Biol. Chem. 272, 21582-21588.

Wang, Z.-m., Rubin, H., and Schechter, N. M. (1995). Production of active recombinant human chymase from a construct containing the enterokinase cleavage site of trypsinogen in place of the native propeptide sequence. Biol. Chem. Hoppe-Seyler 376, 681-4.

Wigler, M., Silverstein, S., Lee, L.-S., Pellicer, A., Cheng, Y.-C., and Axel, R. (1977). Transfer of purified Herpes virus thymidine kinase gene to cultured mouse cells. Cell (Cambridge, Mass.) 11, 223-32.

Yamashiro, K., Tsuruoka, N., Kodama, S., Tsujimoto, M., Yamamura, Y., Tanaka, T., Nakazato, H., and Yamaguchi, N. (1997). Molecular cloning of a novel trypsin-like serine protease (neurosin) preferentially expressed in brain. Biochim. Biophys. Acta 1350, 11-14.

Yoshida, S., Taniguchi, M., Hirata, A., and Shiosaka, S. (1998). Sequence analysis and expression of human neuropsin cDNA and gene. Gene 213, 9-16.

Yoshida, S., Taniguchi, M., Suemoto, T., Oka, T., He, X., and Shiosaka, S. (1998). cDNA cloning and expression of a novel serine protease, TLSP1. Biochim. Biophys. Acta 1399, 225-228.

Yu, J. X., Chao, L., and Chao, J. (1995). Molecular cloning, tissue-specific expression, and cellular localization of human prostasin mRNA. J. Biol. Chem. 270, 13483-9.

SEQUENCE LISTING

<110> Darrow, Andrew
      Qi, Jensen
      Andrade-Gordon, Patricia

<120> Zymogen Activation System

<130> ORT-993

<140>
<141>

<160> 52

<170> PatentIn Ver. 2.0

<210> 1
<211> 361
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 1
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt gatgatgatg acaagatcgt tgggggctat 180
gctctagata gcggccgctt ccctttagtg agggttaatg cttcgagcag acatgataag 240
atacattgat gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg 300
tgaaatttgt gatgctattg ctttatttgt aaccattata agctgcaata aacaagttga 360
c 361

<210> 2
<211> 301
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 2
gaattcacca tgaatccact cctgatcctt acctttgtgg cggccgctct tgctgccccc 60
tttgatgatg atgacaagat cgttgggggc tattgtctag atacccctac gatgtgcccg 120
attacgccta gcggccgctt ccctttagtg agggttaatg cttcgagcag acatgataag 180
atacattgat gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg 240
tgaaatttgt gatgctattg ctttatttgt aaccattata agctgcaata aacaagttga 300
c 301

<210> 3
<211> 484
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 3
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt atcgaggggc gcattgtgga gggctcggat 180
ctagataccc ctacgatgtg cccgattacg ccgctagata cccctacgat gtgcccgatt 240
acgccgctag ataccactac gatgtgcccg attacgccgc tagatacccc tacgatgtgc 300
ccgattacgc ctagcggccg cttcccttta gtgagggtta atgcttcgag cagacatgat 360
aagatacatt gatgagtttg gacaaaccac aactagaatg cagtgaaaaa aatgctttat 420
ttgtgaaatt tgtgatgcta ttgctttatt tgtaaccatt ataagctgca ataaacaagt 480
tgac 484

<210> 4
<211> 382
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 4
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt gatgatgatg acaagatcgt tgggggctac 180
aactgtctag acatcaccat caccatcact agcggccgct tccctttagt gagggttaat 240
gcttcgagca gacatgataa gatacattga tgagtttgga caaaccacaa ctagaatgca 300
gtgaaaaaaa tgctttattt gtgaaatttg tgatgctatt gctttatttg taaccattat 360
aagctgcaat aaacaagttg ac 382

<210> 5
<211> 352
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 5
gaattcacca ccatggcttt cctctggctc ctctcctgct gggccctcct gggtaccacc 60
ttcggctgcg gggtccccga ctacaaggac gacgacgacg cggccgctct tgctgccccc 120
tttgatgatg atgacaagat cgttgggggc tatgctctag acatcaccat caccatcact 180
agcggccgct tccctttagt gagggttaat gcttcgagca gacatgataa gatacattga 240
tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg 300
tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagttg ac 352

<210> 6
<211> 385
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene vectors.
<400> 6
gaattcacca ccatggcttt cctctggctc ctctcctgct gggccctcct gggtaccacc 60
ttcggctgcg gggtccccga ctacaaggac gacgacgacg cggccgctct tgctgccccc 120
tttgatgatg atgacaagat cgttgggggc tatgctctag atacccctac gatgtgcccg 180
attacgccgc tagacatcac catcaccatc actagcggcc gcttcccttt agtgagggtt 240
aatgcttcga gcagacatga taagatacat tgatgagttt ggacaaacca caactagaat 300
gcagtgaaaa aaatgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 360
tataagctgc aataaacaag ttgac 385

<210> 7
<211> 1169
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 7
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt gatgatgatg acaagatcgt tgggggctat 180
gctctagagg ccggtcagtg gccctggcag gtcagcatca cctatgaagg cgtccatgtg 240
tgtggtggct ctctcgtgtc tgagcagtgg gtgctgtcag ctgctcactg cttccccagc 300
gagcaccaca aggaagccta tgaggtcaag ctgggggccc accagctaga ctcctactcc 360
gaggacgcca aggtcagcac cctgaaggac atcatccccc accccagcta cctccaggag 420
ggctcccagg gcgacattgc actcctccaa ctcagcagac ccatcacctt ctcccgctac 480
atccggccca tctgcctccc tgcagccaac gcctccttcc ccaacggcct ccactgcact 540
gtcactggct ggggtcatgt ggccccctca gtgagcctcc tgacgcccaa gccactgcag 600
caactcgagg tgcctctgat cagtcgtgag acgtgtaact gcctgtacaa catcgacgcc 660
aagcctgagg agccgcactt tgtccaagag gacatggtgt gtgctggcta tgtggagggg 720
ggcaaggacg cctgccaggg tgactctggg ggcccactct cctgccctgt ggagggtctc 780
tggtacctga cgggcattgt gagctgggga gatgcctgtg gggcccgcaa caggcctggt 840
gtgtacactc tggcctccag ctatgcctcc tggatccaaa gcaaggtgac agaactccag 900
cctcgtgtgg tgccccaaac ccaggagtcc cagcccgaca gcaacctctg tggcagccac 960
ctggccttca gctctagaca tcaccatcac catcactagc ggccgcttcc ctttagtgag 1020
ggttaatgct tcgagcagac atgataagat acattgatga gtttggacaa accacaacta 1080
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 1140
ccattataag ctgcaataaa caagttgac 1169

<210> 8
<211> 1142
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 8
gaattcacca ccatggcttt cctctggctc ctctcctgct gggccctcct gggtaccacc 60
ttcggctgcg gggtccccga ctacaaggac gacgacgacg cggccgctct tgctgccccc 120
tttgatgatg atgacaagat cgttgggggc tatgctctag aggccggtca gtggccctgg 180
caggtcagca tcacctatga aggcgtccat gtgtgtggtg gctctctcgt gtctgagcag 240
tgggtgctgt cagctgctca ctgcttcccc agcgagcacc acaaggaagc ctatgaggtc 300
aagctggggg cccaccagct agactcctac tccgaggacg ccaaggtcag caccctgaag 360
gacatcatcc cccaccccag ctacctccag gagggctccc agggcgacat tgcactcctc 420
caactcagca gacccatcac cttctcccgc tacatccggc ccatctgcct ccctgcagcc 480
aacgcctcct tccccaacgg cctccactgc actgtcactg gctggggtca tgtggccccc 540
tcagtgagcc tcctgacgcc caagccactg cagcaactcg aggtgcctct gatcagtcgt 600
gagacgtgta actgcctgta caacatcgac gccaagcctg aggagccgca ctttgtccaa 660
gaggacatgg tgtgtgctgg ctatgtggag gggggcaagg acgcctgcca gggtgactct 720
gggggcccac tctcctgccc tgtggagggt ctctggtacc tgacgggcat tgtgagctgg 780
ggagatgcct gtggggcccg caacaggcct ggtgtgtaca ctctggcctc cagctatgcc 840
tcctggatcc aaagcaaggt gacagaactc cagcctcgtg tggtgcccca aacccaggag 900
tcccagcccg acagcaacct ctgtggcagc cacctggcct tcagctctag acatcaccat 960
caccatcact agcggccgct tccctttagt gagggttaat gcttcgagca gacatgataa 1020
gatacattga tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt 1080
gtgaaatttg tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagttg 1140
ac 1142

<210> 9
<211> 1049
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 9
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt gatgatgatg acaagatcgt tgggggctac 180
aactgtctag aaccccattc gcagccttgg caggcggcct tgttccaggg ccagcaacta 240
ctctgtggcg gtgtccttgt aggtggcaac tgggtcctta cagctgccca ctgtaaaaaa 300
ccgaaataca cagtacgcct gggagaccac agcctacaga ataaagatgg cccagagcaa 360
gaaatacctg tggttcagtc catcccacac ccctgctaca acagcagcga tgtggaggac 420
cacaaccatg atctgatgct tcttcaactg cgtgaccagg catccctggg gtccaaagtg 480
aagcccatca gcctggcaga tcattgcacc cagcctggcc agaagtgcac cgtctcaggc 540
tggggcactg tcaccagtcc ccgagagaat tttcctgaca ctctcaactg tgcagaagta 600
aaaatctttc cccagaagaa gtgtgaggat gcttacccgg ggcagatcac agatggcatg 660
gtctgtgcag gcagcagcaa aggggctgac acgtgccagg gcgattctgg aggccccctg 720
gtgtgtgatg gtgcactcca gggcatcaca tcctggggct cagacccctg tgggaggtcc 780
gacaaacctg gcgtctatac caacatctgc cgctacctgg actggatcaa gaagatcata 840
ggcagcaagg gctctagaca tcaccatcac catcactagc ggccgcttcc ctttagtgag 900
ggttaatgct tcgagcagac atgataagat acattgatga gtttggacaa accacaacta 960
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 1020
ccattataag ctgcaataaa caagttgac 1049

<210> 10
<211> 1052
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 10
gaattcacca ccatggacag caaaggttcg tcgcagaaat cccgcctgct cctgctgctg 60
gtggtgtcaa atctactctt gtgccagggt gtggtctccg actacaagga cgacgacgac 120
gtggacgcgg ccgctcttgc tgcccccttt gatgatgatg acaagatcgt tgggggctac 180
aactgtctag aaaagcactc ccagccctgg caggcagccc tgttcgagaa gacgcggcta 240
ctctgtgggg cgacgctcat cgcccccaga tggctcctga cagcagccca ctgcctcaag 300
ccccgctaca tagttcacct ggggcagcac aacctccaga aggaggaggg ctgtgagcag 360
acccggacag ccactgagtc cttcccccac cccggcttca acaacagcct ccccaacaaa 420
gaccaccgca atgacatcat gctggtgaag atggcatcgc cagtctccat cacctgggct 480
gtgcgacccc tcaccctctc ctcacgctgt gtcactgctg gcaccagctg cctcatttcc 540
ggctggggca gcacgtccag cccccagtta cgcctgcctc acaccttgcg atgcgccaac 600
atcaccatca ttgagcacca gaagtgtgag aacgcctacc ccggcaacat cacagacacc 660
atggtgtgtg ccagcgtgca ggaagggggc aaggactcct gccagggtga ctccgggggc 720
cctctggtct gtaaccagtc tcttcaaggc attatctcct ggggccagga tccgtgtgcg 780
atcacccgaa agcctggtgt ctacacgaaa gtctgcaaat atgtggactg gatccaggag 840
acgatgaaga acaattctag acatcaccat caccatcact agcggccgct tccctttagt 900
gagggttaat gcttcgagca gacatgataa gatacattga tgagtttgga caaaccacaa 960
ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg tgatgctatt gctttatttg 1020
taaccattat aagctgcaat aaacaagttg ac 1052

<210> 11
<211> 328
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 11
Met Asp Ser Lys Gly Ser Ser Gln Lys Ser Arg Leu Leu Leu Leu Leu
1               5                   10                  15
Val Val Ser Asn Leu Leu Leu Cys Gln Gly Val Val Ser Asp Tyr Lys
            20                  25                  30
Asp Asp Asp Asp Val Asp Ala Ala Ala Leu Ala Ala Pro Phe Asp Asp
        35                  40                  45
Asp Asp Lys Ile Val Gly Gly Tyr Ala Leu Glu Ala Gly Gln Trp Pro
    50                  55                  60
Trp Gln Val Ser Ile Thr Tyr Glu Gly Val His Val Cys Gly Gly Ser
65                  70                  75                  80
Leu Val Ser Glu Gln Trp Val Leu Ser Ala Ala His Cys Phe Pro Ser
                85                  90                  95
Glu His His Lys Glu Ala Tyr Glu Val Lys Leu Gly Ala His Gln Leu
            100                 105                 110
Asp Ser Tyr Ser Glu Asp Ala Lys Val Ser Thr Leu Lys Asp Ile Ile
        115                 120                 125
Pro His Pro Ser Tyr Leu Gln Glu Gly Ser Gln Gly Asp Ile Ala Leu
    130                 135                 140
Leu Gln Leu Ser Arg Pro Ile Thr Phe Ser Arg Tyr Ile Arg Pro Ile
145                 150                 155                 160
Cys Leu Pro Ala Ala Asn Ala Ser Phe Pro Asn Gly Leu His Cys Thr
                165                 170                 175
Val Thr Gly Trp Gly His Val Ala Pro Ser Val Ser Leu Leu Thr Pro
            180                 185                 190
Lys Pro Leu Gln Gln Leu Glu Val Pro Leu Ile Ser Arg Glu Thr Cys
        195                 200                 205
Asn Cys Leu Tyr Asn Ile Asp Ala Lys Pro Glu Glu Pro His Phe Val
    210                 215                 220
Gln Glu Asp Met Val Cys Ala Gly Tyr Val Glu Gly Gly Lys Asp Ala
225                 230                 235                 240
Cys Gln Gly Asp Ser Gly Gly Pro Leu Ser Cys Pro Val Glu Gly Leu
                245                 250                 255
Trp Tyr Leu Thr Gly Ile Val Ser Trp Gly Asp Ala Cys Gly Ala Arg
            260                 265                 270
Asn Arg Pro Gly Val Tyr Thr Leu Ala Ser Ser Tyr Ala Ser Trp Ile
        275                 280                 285
Gln Ser Lys Val Thr Glu Leu Gln Pro Arg Val Val Pro Gln Thr Gln
    290                 295                 300
Glu Ser Gln Pro Asp Ser Asn Leu Cys Gly Ser His Leu Ala Phe Ser
305                 310                 315                 320
Ser Arg His His His His His His
                325

<210> 12
<211> 319
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 12
Met Ala Phe Leu Trp Leu Leu Ser Cys Trp Ala Leu Leu Gly Thr Thr
1               5                   10                  15
Phe Gly Cys Gly Val Pro Asp Tyr Lys Asp Asp Asp Asp Ala Ala Ala
            20                  25                  30
Leu Ala Ala Pro Phe Asp Asp Asp Asp Lys Ile Val Gly Gly Tyr Ala
        35                  40                  45
Leu Glu Ala Gly Gln Trp Pro Trp Gln Val Ser Ile Thr Tyr Glu Gly
    50                  55                  60
Val His Val Cys Gly Gly Ser Leu Val Ser Glu Gln Trp Val Leu Ser
65                  70                  75                  80
Ala Ala His Cys Phe Pro Ser Glu His His Lys Glu Ala Tyr Glu Val
                85                  90                  95
Lys Leu Gly Ala His Gln Leu Asp Ser Tyr Ser Glu Asp Ala Lys Val
            100                 105                 110
Ser Thr Leu Lys Asp Ile Ile Pro His Pro Ser Tyr Leu Gln Glu Gly
        115                 120                 125
Ser Gln Gly Asp Ile Ala Leu Leu Gln Leu Ser Arg Pro Ile Thr Phe
    130                 135                 140
Ser Arg Tyr Ile Arg Pro Ile Cys Leu Pro Ala Ala Asn Ala Ser Phe
145                 150                 155                 160
Pro Asn Gly Leu His Cys Thr Val Thr Gly Trp Gly His Val Ala Pro
                165                 170                 175
Ser Val Ser Leu Leu Thr Pro Lys Pro Leu Gln Gln Leu Glu Val Pro
            180                 185                 190
Leu Ile Ser Arg Glu Thr Cys Asn Cys Leu Tyr Asn Ile Asp Ala Lys
        195                 200                 205
Pro Glu Glu Pro His Phe Val Gln Glu Asp Met Val Cys Ala Gly Tyr
    210                 215                 220
Val Glu Gly Gly Lys Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro Leu
225                 230                 235                 240
Ser Cys Pro Val Glu Gly Leu Trp Tyr Leu Thr Gly Ile Val Ser Trp
                245                 250                 255
Gly Asp Ala Cys Gly Ala Arg Asn Arg Pro Gly Val Tyr Thr Leu Ala
            260                 265                 270
Ser Ser Tyr Ala Ser Trp Ile Gln Ser Lys Val Thr Glu Leu Gln Pro
        275                 280                 285
Arg Val Val Pro Gln Thr Gln Glu Ser Gln Pro Asp Ser Asn Leu Cys
    290                 295                 300
Gly Ser His Leu Ala Phe Ser Ser Arg His His His His His His
305                 310                 315

<210> 13
<211> 288
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 13
Met Asp Ser Lys Gly Ser Ser Gln Lys Ser Arg Leu Leu Leu Leu Leu
1               5                   10                  15
Val Val Ser Asn Leu Leu Leu Cys Gln Gly Val Val Ser Asp Tyr Lys
            20                  25                  30
Asp Asp Asp Asp Val Asp Ala Ala Ala Leu Ala Ala Pro Phe Asp Asp
        35                  40                  45
Asp Asp Lys Ile Val Gly Gly Tyr Asn Cys Leu Glu Pro His Ser Gln
    50                  55                  60
Pro Trp Gln Ala Ala Leu Phe Gln Gly Gln Gln Leu Leu Cys Gly Gly
65                  70                  75                  80
Val Leu Val Gly Gly Asn Trp Val Leu Thr Ala Ala His Cys Lys Lys
                85                  90                  95
Pro Lys Tyr Thr Val Arg Leu Gly Asp His Ser Leu Gln Asn Lys Asp
            100                 105                 110
Gly Pro Glu Gln Glu Ile Pro Val Val Gln Ser Ile Pro His Pro Cys
        115                 120                 125
Tyr Asn Ser Ser Asp Val Glu Asp His Asn His Asp Leu Met Leu Leu
    130                 135                 140
Gln Leu Arg Asp Gln Ala Ser Leu Gly Ser Lys Val Lys Pro Ile Ser
145                 150                 155                 160
Leu Ala Asp His Cys Thr Gln Pro Gly Gln Lys Cys Thr Val Ser Gly
                165                 170                 175
Trp Gly Thr Val Thr Ser Pro Arg Glu Asn Phe Pro Asp Thr Leu Asn
            180                 185                 190
Cys Ala Glu Val Lys Ile Phe Pro Gln Lys Lys Cys Glu Asp Ala Tyr
        195                 200                 205
Pro Gly Gln Ile Thr Asp Gly Met Val Cys Ala Gly Ser Ser Lys Gly
    210                 215                 220
Ala Asp Thr Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Asp Gly
225                 230                 235                 240
Ala Leu Gln Gly Ile Thr Ser Trp Gly Ser Asp Pro Cys Gly Arg Ser
                245                 250                 255
Asp Lys Pro Gly Val Tyr Thr Asn Ile Cys Arg Tyr Leu Asp Trp Ile
            260                 265                 270
Lys Lys Ile Ile Gly Ser Lys Gly Ser Arg His His His His His His
        275                 280                 285

<210> 14
<211> 289
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Fusion gene with homo sapien serine protease catalytic domain
<400> 14
Met Asp Ser Lys Gly Ser Ser Gln Lys Ser Arg Leu Leu Leu Leu Leu
1               5                   10                  15
Val Val Ser Asn Leu Leu Leu Cys Gln Gly Val Val Ser Asp Tyr Lys
            20                  25                  30
Asp Asp Asp Asp Val Asp Ala Ala Ala Leu Ala Ala Pro Phe Asp Asp
        35                  40                  45
Asp Asp Lys Ile Val Gly Gly Tyr Asn Cys Leu Glu Lys His Ser Gln
    50                  55                  60
Pro Trp Gln Ala Ala Leu Phe Glu Lys Thr Arg Leu Leu Cys Gly Ala
65                  70                  75                  80
Thr Leu Ile Ala Pro Arg Trp Leu Leu Thr Ala Ala His Cys Leu Lys
                85                  90                  95
Pro Arg Tyr Ile Val His Leu Gly Gln His Asn Leu Gln Lys Glu Glu
            100                 105                 110
Gly Cys Glu Gln Thr Arg Thr Ala Thr Glu Ser Phe Pro His Pro Gly
        115                 120                 125
Phe Asn Asn Ser Leu Pro Asn Lys Asp His Arg Asn Asp Ile Met Leu
    130                 135                 140
Val Lys Met Ala Ser Pro Val Ser Ile Thr Trp Ala Val Arg Pro Leu
145                 150                 155                 160
Thr Leu Ser Ser Arg Cys Val Thr Ala Gly Thr Ser Cys Leu Ile Ser
                165                 170                 175
Gly Trp Gly Ser Thr Ser Ser Pro Gln Leu Arg Leu Pro His Thr Leu
            180                 185                 190
Arg Cys Ala Asn Ile Thr Ile Ile Glu His Gln Lys Cys Glu Asn Ala
        195                 200                 205
Tyr Pro Gly Asn Ile Thr Asp Thr Met Val Cys Ala Ser Val Gln Glu
    210                 215                 220
Gly Gly Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys
225                 230                 235                 240
Asn Gln Ser Leu Gln Gly Ile Ile Ser Trp Gly Gln Asp Pro Cys Ala
                245                 250                 255
Ile Thr Arg Lys Pro Gly Val Tyr Thr Lys Val Cys Lys Tyr Val Asp
            260                 265                 270
Trp Ile Gln Glu Thr Met Lys Asn Asn Ser Arg His His His His His
        275                 280                 285
His

<210> 15
<211> 9
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 15
ctagatagc 9

<210> 16
<211> 9
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 16
ggccgctat 9

<210> 17
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 17
ctagataccc ctacgatgtg cccgattacg cctagc 36

<210> 18
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 18
ggccgctagg cgtaatcggg cacatcgtag gggtat 36

<210> 19
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 19
ctagataccc ctacgatgtg cccgattacg ccg 33

<210> 20
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 20
ctagcggcgt aatcgggcac atcgtagggg tat 33

<210> 21
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 21
ctagacatca ccatcaccat cactagc 27

<210> 22
<211> 27
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 22
ggccgctagt gatggtgatg gtgatgt 27

<210> 23
<211> 34
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 23
tgaattcacc accatggaca gcaaaggttc gtcg 34

<210> 24
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 24
cagaaagggt cccgcctgct cctgctgctg 30

<210> 25
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 25
gtggtgtcaa atctactctt gtgccagggt 30

<210> 26
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 26
gtggtctccg actacaagga cgacgacgac 30

<210> 27
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 27
gtggacgcgg ccgcattatt a 21

<210> 28
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 28
taataatgcg gccgcgtcca cgtcgtcgtc gtcct 35

<210> 29
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 29
tgtagtcgga gaccacaccc t 21

<210> 30
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 30
ggcacaagag tagatttgac accaccagca 30

<210> 31
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 31
gcaggagcag gcgggaccct ttctgcgacg 30

<210> 32
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 32
aacctttgct gtccatggtg gtgaattca 29

<210> 33
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 33
aattcaccat gaatccactc ctgatcctta cctttgtggc 40

<210> 34
<211> 40
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 34
ggccgccaca aaggtaagga tcaggagtgg attcatggtg 40

<210> 35
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 35
aattcaccac catggctttc ctctggctcc tctcctgctg ggccctcctg ggtac 55

<210> 36
<211> 47
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 36
ccaggagggc ccagcaggag aggagccaga ggaaagccat ggtggtg 47

<210> 37
<211> 45
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 37
caccttcggc tgcggggtcc ccgactacaa ggacgacgac gacgc 45

<210> 38
<211> 53
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 38
ggccgcgtcg tcgtcgtcct tgtagtcggg gaccccgcag ccgaaggtgg tac 53

<210> 39
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 39
gtggcggccg ctcttgctgc cccctttga 29

<210> 40
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 40
ttctctagac agttgtagcc cccaacga 28

<210> 41
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 41
ggccgctctt gctgccccct ttgatgatga tgacaagatc gttgggggct atgct 55

<210> 42
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 42
ctagagcata gcccccaacg atcttgtcat catcatcaaa gggggcagca agagc 55

<210> 43
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 43
ggccgctctt gctgccccct ttgatgatga tgacaagatc gttgggggct attgt 55

<210> 44
<211> 55
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 44
ctagacaata gcccccaacg atcttgtcat catcatcaaa gggggcagca agagc 55

<210> 45
<211> 52
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 45
ggccgctctt gctgccccct ttatcgaggg gcgcattgtg gagggctcgg at 52

<210> 46
<211> 52
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 46
ctagatccga gccctccaca atgcgcccct cgataaaggg ggcagcaaga gc 52

<210> 47
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 47
agcagtctag aggccggtca gtggccctgg ca 32

<210> 48
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 48
gctggtctag agctgaaggc caggtggc 28

<210> 49
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 49
ggtatctaga gcccttgctg cctatgatc 29

<210> 50
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 50
actgtctaga accccattcg cagccttggc 30

<210> 51
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 51
tcgatctaga aaagcactcc cagccctggc ag 32

<210> 52
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: oligonucleotide
<400> 52
gtcctctaga attgttcttc atcgtctcct gg 32
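The complementary oligonucleotide pairs in the listing, for example EK2-U and EK2-L (SEQ.ID.NO.:41 and SEQ.ID.NO.:42), can be checked for the cohesive ends they would present once annealed. The following Python sketch is illustrative only and is not part of the specification: the function names and the assumption that each strand carries a 4-nucleotide 5' overhang are hypothetical choices made for the example.

```python
# Illustrative sketch, not part of the specification.
# Assumption: each annealed strand carries a 4-nt 5' overhang.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# SEQ.ID.NO.:41 (EK2-U) and SEQ.ID.NO.:42 (EK2-L), copied from the listing.
EK2_U = "ggccgctcttgctgccccctttgatgatgatgacaagatcgttgggggctatgct"
EK2_L = "ctagagcatagcccccaacgatcttgtcatcatcatcaaagggggcagcaagagc"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(upper: str, lower: str, overhang: int = 4):
    """Return (upper 5' overhang, duplex core, lower 5' overhang).

    Raises ValueError if the two strands are not complementary
    outside the assumed overhangs.
    """
    upper, lower = upper.upper(), lower.upper()
    core = upper[overhang:]
    if core != reverse_complement(lower[overhang:]):
        raise ValueError("oligonucleotides are not complementary")
    return upper[:overhang], core, lower[:overhang]


if __name__ == "__main__":
    up_end, core, low_end = anneal(EK2_U, EK2_L)
    print(up_end, len(core), low_end)
```

Under these assumptions the pair anneals over 51 base pairs and leaves GGCC and CTAG 5' overhangs, which would be compatible with NotI- and XbaI-digested vector ends.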

Claims

1. An expression vector comprising, in frame and in order, a pre sequence, a pro sequence, and a cloning site for in frame insertion of a catalytic domain cassette.

2. The expression vector of claim 1, additionally comprising a tag sequence in frame with the cloning site.

3. The expression vector of claim 2 wherein said vector comprises a DNA sequence selected from a group consisting of: SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:3, SEQ.ID.NO.:4, SEQ.ID.NO.:5, and SEQ.ID.NO.:6.

4. The expression vector of claim 1, wherein said vector contains a catalytic domain cassette inserted in frame into the cloning site.

5. A recombinant host cell containing the expression vector of claim 1.

6. A process for expression of a catalytic domain cassette, comprising:
(a) transferring the expression vector of claim 4 into suitable host cells; and
(b) culturing the host cells of step (a) under conditions that allow expression of the zymogen catalytic domain protein from the expression vector.

7. The process of claim 6, wherein said expression vector comprises a nucleotide sequence selected from a group consisting of SEQ.ID.NO.:1, SEQ.ID.NO.:2, SEQ.ID.NO.:3, SEQ.ID.NO.:4, SEQ.ID.NO.:5, SEQ.ID.NO.:6, SEQ.ID.NO.:7, SEQ.ID.NO.:8, SEQ.ID.NO.:9, and SEQ.ID.NO.:10.

8. A serine protease catalytic domain produced from the recombinant host cell containing the expression vector of claim 4, which functions as a serine protease when said protein is cleaved at a pre sequence by post-translational proteolysis.

9. The protease of claim 8, wherein said protease is bound to Ni-NTA silica or Ni-NTA agarose beads.

10. A method for identifying compounds that modulate the activity of proteases expressed and activated from the zymogen activation construct described herein, comprising:
(a) combining a modulator of the recombinant catalytic domain of a protease; and
(b) measuring an effect of the modulator on the protein.

10. The method of claim 9, wherein the effect of the modulator on the protease is inhibiting or enhancing its enzymatic activity.

11. The method of claim 9, wherein the effect of the modulator on the protease is stimulation or inhibition of proteolysis mediated by the expressed catalytic domain.

12. A compound active in the method of Claim 9, wherein said compound is a modulator of the expressed catalytic domain.

13. A pharmaceutical composition comprising a compound of Claim 12.

14. A kit comprising the expression vector of claim 1.
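As an illustration of the in-frame arrangement recited in claim 1, the 5' region of a vector sequence can be translated to confirm that the pre sequence, the epitope tag, and the pro sequence share one reading frame. The Python sketch below is not part of the specification. It uses the first 156 nucleotides of SEQ.ID.NO.:5 and assumes, for the purpose of the example, that the tag is identified by the motif DYKDDDD and the enterokinase recognition site by Asp-Asp-Asp-Asp-Lys (DDDDK); all function and variable names are hypothetical.

```python
# Illustrative sketch, not part of the specification.
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# First 156 nt of SEQ.ID.NO.:5 (through the codons preceding the XbaI site).
VECTOR_5PRIME = (
    "gaattcaccaccatggctttcctctggctcctctcctgctgggccctcctgggtaccacc"
    "ttcggctgcggggtccccgactacaaggacgacgacgacgcggccgctcttgctgccccc"
    "tttgatgatgatgacaagatcgttgggggctatgct"
)


def translate(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


pre_pro = translate(VECTOR_5PRIME)
tag_index = pre_pro.find("DYKDDDD")   # assumed epitope tag motif
ek_index = pre_pro.find("DDDDK")      # assumed enterokinase site
# Residues that would follow cleavage after the Lys of the assumed site.
mature_n_terminus = pre_pro[ek_index + 5:]

if __name__ == "__main__":
    print(pre_pro)
    print(tag_index, ek_index, mature_n_terminus)
```

The translated residues agree with the first 48 residues of SEQ.ID.NO.:12, and the residues following the assumed cleavage site begin with isoleucine, consistent with the Ile-16 activation described in the Background.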