NZ241011A

NZ241011A - Leader sequence for secreting heterologous polypeptides in yeast, vectors using it, yeast cells and process for producing heterologous polypeptides

Info

Publication number: NZ241011A
Application number: NZ241011A
Authority: NZ
Inventors: Lars Christiansen
Original assignee: Novo Nordisk As
Priority date: 1990-12-19
Filing date: 1991-12-17
Publication date: 1993-04-28
Also published as: AU660161B2; CZ119293A3; ZA919932B; FI932831A0; WO1992011378A1; CA2098731A1; PT99848A; IE914433A1; HUT68751A; SK62593A3; DK300090D0; AU9134891A; MX9102684A; FI932831A; EP0563175A1; IL100408A0; JPH06503957A; HU9301801D0; KR930703450A

Description

<div id="description" class="application article clearfix"> Patents Form No. 5

NEW ZEALAND
PATENTS ACT 1953
COMPLETE SPECIFICATION

A METHOD OF CONSTRUCTING SYNTHETIC LEADER SEQUENCES

WE, NOVO NORDISK A/S, a Danish company of Novo Alle, 2880 Bagsvaerd, DENMARK, hereby declare the invention, for which we pray that a patent may be granted to us, and the method by which it is to be performed, to be particularly described in and by the following statement:

FIELD OF INVENTION

The present invention relates to a method of constructing synthetic leader peptide sequences for secreting heterologous polypeptides in yeast, and yeast expression vectors for use in the method.

BACKGROUND OF THE INVENTION

Yeast organisms produce a number of proteins which are synthesized intracellularly, but which have a function outside the cell. Such extracellular proteins are referred to as secreted proteins. These secreted proteins are expressed initially inside the cell in a precursor or a pre-protein form containing a presequence ensuring effective direction of the expressed product across the membrane of the endoplasmic reticulum (ER). The presequence, normally named a signal peptide, is generally cleaved off from the desired product during translocation. Once it has entered the secretory pathway, the protein is transported to the Golgi apparatus. From the Golgi the protein can follow different routes that lead to compartments such as the cell vacuole or the cell membrane, or it can be routed out of the cell to be secreted to the external medium (Pfeffer, S.R. and Rothman, J.E., Ann. Rev. Biochem. 56 (1987), 829-852).

Several approaches have been suggested for the expression and secretion in yeast of proteins heterologous to yeast. European published patent application No. 88 632 describes a process by which proteins heterologous to yeast are expressed, processed and secreted by transforming a yeast organism with an expression vehicle harbouring DNA encoding the desired protein and a signal peptide, preparing a culture of the transformed organism, growing the culture and recovering the protein from the culture medium. The signal peptide may be the signal peptide of the desired protein itself, a heterologous signal peptide or a hybrid of native and heterologous signal peptide.

A problem encountered with the use of signal peptides heterologous to yeast might be that the heterologous signal peptide does not ensure efficient translocation and/or cleavage after the signal peptide.

The S. cerevisiae MFα1 (α-factor) is synthesized as a prepro form of 165 amino acids comprising a signal or prepeptide of 19 amino acids followed by a "leader" or propeptide of 64 amino acids, encompassing three N-linked glycosylation sites, followed by (LysArg(Asp/Glu, Ala)₂₋₃α-factor)₄ (Kurjan, J. and Herskowitz, I., Cell 30 (1982), 933-943). The signal-leader part of the preproMFα1 has been widely employed to obtain synthesis and secretion of heterologous proteins in S. cerevisiae.

Use of signal/leader peptides homologous to yeast is known from i.a. US patent specification No. 4,546,082, European published patent applications Nos. 116 201, 123 294, 123 544, 163 529, and 123 289 and EP 100561.

In EP 123 289 utilization of the S. cerevisiae a-factor precursor is described whereas WO 84/01153 indicates utilization of the Saccharomyces cerevisiae invertase signal peptide and EP 100561 utilization of the Saccharomyces cerevisiae PHO5 signal peptide for secretion of foreign proteins.

US patent specification No.
4,546,082, EP 116 201, 123 294, 123 544, and 163 529 describe processes by which the α-factor signal-leader from Saccharomyces cerevisiae (MFα1 or MFα2) is utilized in the secretion process of expressed heterologous proteins in yeast. By fusing a DNA sequence encoding the S. cerevisiae MFα1 signal/leader sequence at the 5' end of the gene for the desired protein, secretion and processing of the desired protein was demonstrated.

EP 206 783 discloses a system for the secretion of polypeptides from S. cerevisiae whereby the α-factor leader sequence has been truncated to eliminate the four α-factor peptides present on the native leader sequence so as to leave the leader peptide itself fused to a heterologous polypeptide via the α-factor processing site LysArgGluAlaGluAla. This construction is indicated to lead to efficient processing of smaller peptides (less than 50 amino acids). For the secretion and processing of larger polypeptides, the native α-factor leader sequence has been truncated to leave one or two α-factor peptides between the leader peptide and the polypeptide.

A number of secreted proteins are routed so as to be exposed to a proteolytic processing system which can cleave the peptide bond at the carboxy end of two consecutive basic amino acids. This enzymatic activity is in S. cerevisiae encoded by the KEX2 gene (Julius, D.A. et al., Cell 37 (1984b), 1075). Processing of the product by the KEX2 gene product is needed for the secretion of active S. cerevisiae mating factor α1 (MFα1 or α-factor) but is not involved in the secretion of active S. cerevisiae mating factor a.

Secretion and correct processing of a polypeptide intended to be secreted is obtained in some cases when culturing a yeast organism which is transformed with a vector constructed as indicated in the references given above.
In many cases, however, the level of secretion is very low or there is no secretion, or the proteolytic processing may be incorrect or incomplete. It is therefore the object of the present invention to provide leader peptides which ensure a more efficient expression and/or processing of heterologous polypeptides.

SUMMARY OF THE INVENTION

It has surprisingly been found possible to replace the α-factor leader peptide by a variety of different DNA sequences, thereby obtaining secretion of a heterologous polypeptide in yeast.

Based on this observation, a method has been developed by which random DNA fragments are cloned into yeast vectors downstream of a DNA sequence coding for a signal peptide and upstream of a DNA sequence coding for a heterologous polypeptide. After transformation with the vectors, yeast cells are screened for secretion of the heterologous polypeptide in question.

More specifically, the present invention relates to a method of constructing a synthetic leader peptide sequence for secreting heterologous polypeptides in yeast, the method comprising

(a) inserting a random DNA fragment into a yeast expression vector comprising the following sequence

5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3'

wherein
SP is a DNA sequence encoding a signal peptide,
Xn is a DNA sequence encoding n amino acids, wherein n is 0 or an integer from 1 to about 10,
RS is a restriction endonuclease recognition site for insertion of random DNA fragments, which site is provided at the junction of Xn and Xm,
Xm is a DNA sequence encoding m amino acids, wherein m is 0 or an integer from 1 to about 10,
(NZT)p is a DNA sequence encoding Asn-Xaa-Thr, wherein p is 0 or 1,
Xq is a DNA sequence encoding q amino acids, wherein q is 0 or an integer from 1 to about 10,
PS is a DNA sequence encoding a peptide defining a yeast processing site, and
*gene* is a DNA sequence encoding a heterologous polypeptide;

(b) transforming a yeast
host cell with the expression vector of step (a);

(c) culturing the transformed host cell of step (b) under appropriate conditions; and

(d) screening the culture of step (c) for secretion of the heterologous polypeptide.

In the present context, the expression "leader peptide" is understood to indicate a peptide whose function is to allow the heterologous polypeptide to be directed from the endoplasmic reticulum to the Golgi apparatus and further to a secretory vesicle for secretion into the medium (i.e. exportation of the expressed polypeptide across the cell wall or at least through the cellular membrane into the periplasmic space of the cell). The term "synthetic" used in connection with leader peptides is intended to indicate that the leader peptide constructed by the present method is one not found in nature.

The term "signal peptide" is understood to mean a presequence which is predominantly hydrophobic in nature and present as an N-terminal sequence of the precursor form of an extracellular protein expressed in yeast. The function of the signal peptide is to allow the heterologous protein which is to be secreted to enter the endoplasmic reticulum. The signal peptide is normally cleaved off in the course of this process. The signal peptide may be heterologous or homologous to the yeast organism producing the protein but, as explained above, a more efficient cleavage of the signal peptide may be obtained when it is homologous to the yeast organism in question.

The expression "heterologous polypeptide" is intended to indicate a polypeptide which is not produced by the host yeast organism in nature. In the method of the invention, the heterologous polypeptide is preferably one the secretion of which by transformed yeast cells may easily be detected, e.g. by established standard methods such as immunological screening by means of antibodies reactive with the polypeptide in question (cf.
for instance Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989) or by screening for a specific biological activity of the heterologous polypeptide. A positive result of the screening indicates that a leader peptide useful for the secretion of heterologous polypeptides in yeast has been constructed.

The expression "a random DNA fragment" is intended to indicate any sequence of DNA at least 3 nucleotides in length, for instance obtained by digesting genomic DNA (of any organism) with restriction endonuclease(s) or by preparing synthetic DNA, e.g. by the phosphoramidite method described by S.L. Beaucage and M.H. Caruthers, Tetrahedron Letters 22, 1981, pp. 1859-1869.

The peptide Asn-Xaa-Thr encoded by "(NZT)p" is an asparagine-linked glycosylation site. "Xaa" denotes any one of the known amino acids except Pro.

In another aspect, the present invention relates to a yeast expression cloning vector comprising the following sequence

5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3'

wherein SP, Xn, RS, Xm, (NZT)p, Xq, PS and *gene* are as defined above.

This vector may be used in the construction of leader peptide sequences according to the method described above.

In a further aspect, the present invention relates to a yeast expression vector comprising the following sequence

5'-SP-Xn-ranDNA-Xm-(NZT)p-Xq-PS-*gene*-3'

wherein SP, Xn, Xm, (NZT)p, Xq, PS and *gene* are as defined above, and ranDNA is a random DNA fragment inserted in a restriction endonuclease recognition site provided at the junction of Xn and Xm.

In this vector, the leader peptide sequence (once identified by the method of the invention) will be composed of the sequence Xn-ranDNA-Xm-(NZT)p-Xq. Such a vector may be used in the production of a heterologous polypeptide of interest.
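
The Asn-Xaa-Thr rule stated above (Xaa being any amino acid except Pro) can be checked mechanically on a candidate leader. The following is a minimal sketch, not part of the specification: the function name `find_nxt_sites` is hypothetical, peptides are assumed to be given in one-letter code, and the example string is the stretch of residues that the sequence listing shows immediately before the mature peptide.

```python
import re

# Asn-Xaa-Thr in one-letter code: N, then any residue except P, then T.
# The lookahead makes overlapping sites (e.g. "NNTT") all reported.
SEQUON = re.compile(r"(?=N[^P]T)")


def find_nxt_sites(peptide: str) -> list[int]:
    """Return 0-based start positions of Asn-Xaa-Thr sites in a peptide."""
    return [m.start() for m in SEQUON.finditer(peptide.upper())]


# Gln-Ala-Ile-Asp-Asn-Thr-Thr-Leu-Ala-Lys-Arg, as read from the listing.
print(find_nxt_sites("QAIDNTTLAKR"))  # [4]
print(find_nxt_sites("ANPTA"))        # [] because Xaa is Pro
```

A leader built with p = 1 should return exactly one hit from the (NZT) element, plus any sites contributed by the random fragment itself.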

In a still further aspect, the present invention relates to a process for producing a heterologous polypeptide in yeast, the process comprising culturing a yeast cell, which is capable of expressing a heterologous polypeptide and which is transformed with a yeast expression vector as described above including a leader peptide sequence constructed by the method of the invention, in a suitable medium to obtain expression and secretion of the heterologous polypeptide, after which the heterologous polypeptide is recovered from the medium.

DETAILED DISCLOSURE OF THE INVENTION

The length of the random DNA fragment inserted in the expression vector is not particularly critical. However, in order to be of a manageable length, the fragment preferably has a length of from 6 to about 600 base pairs. More preferably, the fragment has a length of from about 15 to about 300 base pairs. It is at present considered that a suitable length of the fragment is from about 30 to about 150 base pairs.

The random DNA fragment preferably encodes a high proportion of polar amino acids. These are selected from the group consisting of Glu, Asp, Lys, Arg, His, Thr, Ser, Asn and Gln. In the present context, the term "a high proportion of" is understood to indicate that the DNA fragment encodes a larger number of polar amino acids than do other DNA sequences of a corresponding length. Independently hereof, or in addition hereto, it may be advantageous that the fragment encodes at least one proline.

In the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3', n and/or m and/or q are preferably >1. In particular, all of n, m and q are >1.

There is some evidence (cf. WO 89/02463) to support that the presence of an asparagine-linked glycosylation site in the leader sequence may confer a higher secretion efficiency to the leader peptide. In the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3', p is therefore preferably 1.
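
The preferences stated above for the random fragment (length ranges, proportion of polar residues, presence of proline) lend themselves to a simple scoring helper. The sketch below is illustrative only: the function names are hypothetical, the "about" limits of the text are treated as hard thresholds for convenience, and the encoded peptide is assumed to be supplied in one-letter code.

```python
# Glu, Asp, Lys, Arg, His, Thr, Ser, Asn, Gln in one-letter code.
POLAR = frozenset("EDKRHTSNQ")


def polar_fraction(peptide: str) -> float:
    """Fraction of residues belonging to the polar group listed in the text."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("empty peptide")
    return sum(aa in POLAR for aa in peptide) / len(peptide)


def has_proline(peptide: str) -> bool:
    """True if the encoded peptide contains at least one proline."""
    return "P" in peptide.upper()


def length_class(n_bp: int) -> str:
    """Classify a fragment length against the ranges given in the text.

    Assumption: the 'about' limits are used as exact cut-offs.
    """
    if 30 <= n_bp <= 150:
        return "suitable"
    if 15 <= n_bp <= 300:
        return "more preferred"
    if 6 <= n_bp <= 600:
        return "preferred"
    return "outside stated ranges"


print(polar_fraction("EDKA"), has_proline("GPS"), length_class(90))
```

Since the text defines "a high proportion" only relative to other sequences of the same length, the fraction is best compared across candidate fragments rather than against a fixed cut-off.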

The signal peptide sequence (SP) may encode any signal peptide which ensures an effective direction of the expressed heterologous polypeptide into the secretory pathway of the cell. The signal peptide may be a naturally occurring signal peptide or functional parts thereof, or it may be a synthetic peptide. Suitable signal peptides have been found to be the α-factor signal peptide, the signal peptide of mouse salivary amylase, a modified carboxypeptidase signal peptide, the yeast BAR1 signal peptide or the Humicola lanuginosa lipase signal peptide, or a derivative thereof. The mouse salivary amylase signal sequence is described by O. Hagenbüchle et al., Nature 289, 1981, pp. 643-646. The carboxypeptidase signal sequence is described by L.A. Valls et al., Cell 48, 1987, pp. 887-897. The BAR1 signal peptide is disclosed in WO 87/02670. The H. lanuginosa lipase signal peptide is disclosed in EP 305 216.

The yeast processing site encoded by the DNA sequence PS may suitably be any paired combination of Lys and Arg, such as Lys-Arg, Arg-Lys, Lys-Lys or Arg-Arg, which permits processing of the heterologous polypeptide by the KEX2 protease of Saccharomyces cerevisiae or the equivalent protease in other yeast species (D.A. Julius et al., Cell 37, 1984, 1075 ff.). If KEX2 processing is not convenient, e.g. if it would lead to cleavage of the polypeptide product, a processing site for another protease may be selected instead comprising an amino acid combination which is not found in the polypeptide product, e.g. the processing site for FXa, Ile-Glu-Gly-Arg (cf. Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989).

The heterologous protein produced by the method of the invention may be any protein which may advantageously be produced in yeast.
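
Whether KEX2 processing is "convenient" for a given product comes down to whether the product itself contains a paired Lys/Arg combination. A minimal sketch of that check follows; the function name `dibasic_sites` is hypothetical, and the example joins the Leu-Ala-Lys-Arg end of the leader to the MI3 precursor B(1-29)-Ala-Ala-Lys-A(1-21) written in one-letter code.

```python
import re

# Any paired combination of Lys (K) and Arg (R): KR, RK, KK or RR.
# The lookahead reports overlapping pairs, e.g. both KK and KR in "KKR".
DIBASIC = re.compile(r"(?=([KR][KR]))")


def dibasic_sites(peptide: str) -> list[tuple[int, str]]:
    """Return (0-based position, pair) for every dibasic pair in a peptide."""
    return [(m.start(), m.group(1)) for m in DIBASIC.finditer(peptide.upper())]


# MI3 precursor: insulin B(1-29), Ala-Ala-Lys, insulin A(1-21).
MI3 = "FVNQHLCGSHLVEALYLVCGERGFFYTPK" + "AAK" + "GIVEQCCTSICSLYQLENYCN"

# Leader end (Leu-Ala-Lys-Arg) fused to MI3: the only pair is the intended site.
print(dibasic_sites("LAKR" + MI3))  # [(2, 'KR')]
```

An empty result for the product alone suggests that a Lys-Arg processing site can be used without cutting the product; any hit inside the product would point towards an alternative site such as the Ile-Glu-Gly-Arg site mentioned above.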
Examples of such proteins are aprotinin, tissue factor pathway inhibitor or other protease inhibitors, insulin or insulin precursors, human or bovine growth hormone, interleukin, glucagon, tissue plasminogen activator, transforming growth factor α or β, platelet-derived growth factor, enzymes, or a functional analogue thereof. In the present context, the term "functional analogue" is meant to indicate a polypeptide with a similar function as the native protein (this is intended to be understood as relating to the nature rather than the level of biological activity of the native protein). The polypeptide may be structurally similar to the native protein and may be derived from the native protein by addition of one or more amino acids to either or both the C- and N-terminal end of the native protein, substitution of one or more amino acids at one or a number of different sites in the native amino acid sequence, deletion of one or more amino acids at either or both ends of the native protein or at one or several sites in the amino acid sequence, or insertion of one or more amino acids at one or more sites in the native amino acid sequence. Such modifications are well known for several of the proteins mentioned above.

The random DNA fragment and the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' may be prepared synthetically by established standard methods, e.g. the phosphoramidite method described by S.L. Beaucage and M.H. Caruthers, Tetrahedron Letters 22, 1981, pp. 1859-1869, or the method described by Matthes et al., EMBO Journal 3, 1984, pp. 801-805. According to the phosphoramidite method, oligonucleotides are synthesized, e.g. in an automatic DNA synthesizer, purified, annealed, ligated and cloned into the yeast expression vector.
It should be noted that the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' need not be prepared in a single operation, but may be assembled from two or more oligonucleotides prepared synthetically in this fashion.

The random DNA fragment or one or more parts of the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' may also be of genomic or cDNA origin, for instance obtained by preparing a genomic or cDNA library and screening for DNA sequences coding for said parts (typically SP or *gene*) by hybridization using synthetic oligonucleotide probes in accordance with standard techniques (cf. Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989). In this case, a genomic or cDNA sequence encoding a signal peptide may be joined to a genomic or cDNA sequence encoding the heterologous protein, after which the DNA sequence may be modified by the insertion of synthetic oligonucleotides encoding the sequence Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS in accordance with well-known procedures.

Finally, the random DNA fragment and/or the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' may be of mixed synthetic and genomic, mixed synthetic and cDNA or mixed genomic and cDNA origin prepared by annealing fragments of synthetic, genomic or cDNA origin (as appropriate), the fragments corresponding to various parts of the entire DNA sequence, in accordance with standard techniques. Thus, it may be envisaged that the DNA sequence encoding the signal peptide or the heterologous polypeptide may be of genomic or cDNA origin, while the sequence Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS may be prepared synthetically.

Preferred DNA constructs encoding insulin precursors are as shown in Sequence Listings ID Nos. 1-13, or suitable modifications thereof.
Examples of suitable modifications of the DNA sequence are nucleotide substitutions which do not give rise to another amino acid sequence of the protein, but which may correspond to the codon usage of the yeast organism into which the DNA construct is inserted, or nucleotide substitutions which do give rise to a different amino acid sequence and therefore, possibly, a different protein structure. Other examples of possible modifications are insertion of three or multiples of three nucleotides into the sequence, addition of three or multiples of three nucleotides at either end of the sequence and deletion of three or multiples of three nucleotides at either end of or within the sequence.

The recombinant expression vector carrying the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' or 5'-SP-Xn-ranDNA-Xm-(NZT)p-Xq-PS-*gene*-3' may be any vector which is capable of replicating in yeast organisms. In the vector, either DNA sequence should be operably connected to a suitable promoter sequence. The promoter may be any DNA sequence which shows transcriptional activity in yeast and may be derived from genes encoding proteins either homologous or heterologous to yeast. The promoter is preferably derived from a gene encoding a protein homologous to yeast. Examples of suitable promoters are the Saccharomyces cerevisiae MFα1, TPI, ADH or PGK promoters.

The sequences shown above should also be operably connected to a suitable terminator, e.g. the TPI terminator (cf. T. Alber and G. Kawasaki, J. Mol. Appl. Genet. 1, 1982, pp. 419-434).

The recombinant expression vector of the invention further comprises a DNA sequence enabling the vector to replicate in yeast. Examples of such sequences are the yeast plasmid 2μ replication genes REP 1-3 and origin of replication. The vector may also comprise a selectable marker, e.g. the Schizosaccharomyces pombe TPI gene as described by P.R. Russell, Gene 40, 1985, pp. 125-130.
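
The reason the modifications above are stated in multiples of three nucleotides is that any other length shifts the reading frame of everything downstream. The sketch below illustrates this with the standard genetic code; the helper names `translate` and `insert_at` and the short example sequence are chosen for illustration only and are not taken from the specification.

```python
from itertools import product

# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def insert_at(dna: str, pos: int, fragment: str) -> str:
    """Return dna with fragment inserted before 0-based position pos."""
    return dna[:pos] + fragment + dna[pos:]


orf = "AAGAGATTCGTTAAC"                      # Lys-Arg-Phe-Val-Asn
print(translate(orf))                        # KRFVN
print(translate(insert_at(orf, 6, "GCT")))   # KRAFVN  (frame kept)
print(translate(insert_at(orf, 6, "G")))     # KRVR*   (frame shifted)
```

The same arithmetic applies to random fragments ligated into the cloning site: only inserts that restore the frame of the downstream coding sequence can give a secreted product.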

The procedures used to ligate the sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3', the random DNA fragment, the promoter and the terminator, respectively, and to insert them into suitable yeast vectors containing the information necessary for yeast replication, are well known to persons skilled in the art (cf., for instance, Sambrook, Fritsch and Maniatis, op. cit.). It will be understood that the vector may be constructed either by first preparing a DNA construct containing the entire sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' and subsequently inserting this fragment into a suitable expression vector, or by sequentially inserting DNA fragments containing genetic information for the individual elements (such as the signal peptide, the sequence Xn-3'-RS-5'-Xm-(NZT)p-Xq or the heterologous polypeptide) followed by ligation.

The yeast organism used in the method of the invention may be any suitable yeast organism which, on cultivation, produces large amounts of the heterologous polypeptide in question. Examples of suitable yeast organisms may be strains of the yeast species Saccharomyces cerevisiae, Saccharomyces kluyveri, Schizosaccharomyces pombe or Saccharomyces uvarum. The transformation of the yeast cells may for instance be effected by protoplast formation followed by transformation in a manner known per se. The medium used to cultivate the cells may be any conventional medium suitable for growing yeast organisms. The secreted heterologous protein, a significant proportion of which will be present in the medium in correctly processed form, may be recovered from the medium by conventional procedures including separating the yeast cells from the medium by centrifugation or filtration, precipitating the proteinaceous components of the supernatant or filtrate by means of a salt, e.g. ammonium sulphate, followed by purification by a variety of chromatographic procedures, e.g.
ion exchange chromatography, affinity chromatography, or the like.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is further illustrated with reference to the appended drawings wherein

Fig. 1 schematically shows the construction of pMT742δ;

Fig. 2 schematically shows the construction of pLaC202;

Fig. 3 shows the DNA sequence and derived amino acid sequence at the cloning site in pLaC202 for random DNA fragments (it should be noted that the sequence is cleaved in the unique ClaI site and that ligation without insertion of random DNA will lead to a change in the reading frame);

Fig. 4 schematically shows the construction of pLSC6315D#.

The invention is further described in the following examples which are not to be construed as limiting the scope of the invention as claimed.

EXAMPLES

Plasmids and DNA materials

All expression plasmids are of the C-POT type. Such plasmids are described in EP patent application No. 171 142 and are characterized in containing the Schizosaccharomyces pombe triose phosphate isomerase gene (POT) for the purpose of plasmid selection and stabilization. A plasmid containing the POT gene is available from a deposited E. coli strain (ATCC 39685). The plasmids furthermore contain the S. cerevisiae triose phosphate isomerase promoter and terminator (PTPI and TTPI). They are identical to pMT742 (M. Egel-Mitani et al., Gene 73, 1988, pp. 113-120) (see fig. 1) except for the region defined by the SphI-XbaI restriction sites encompassing the PTPI and the coding region for signal/leader/product.

The PTPI has been modified with respect to the sequence found in pMT742, only in order to facilitate construction work. An internal SphI restriction site has been eliminated by SphI cleavage, removal of single stranded tails and religation.
Furthermore, DNA sequences, upstream to and without any impact on the promoter, have been removed by Bal31 exonuclease treatment followed by addition of an SphI restriction site linker. This promoter construction present on a 373 bp SphI-EcoRI fragment is designated PTPIδ and when used in plasmids already described this promoter modification is indicated by the addition of a δ to the plasmid name, e.g. pMT742δ (fig. 1).

The assembly of various DNA fragments has occasionally taken place in a smaller E. coli plasmid of the pT7 type previously described (cf. WO 89/02463) only modified with respect to the PTPI as described above, i.e. pT7δ.

For random cloning described below, genomic DNA of various origins has been employed. S. cerevisiae DNA was isolated from strain MT663 (deposited on 7 December 1990 in the Deutsche Sammlung von Mikroorganismen und Zellkulturen under the provisions of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purpose of Patent Procedure with the deposit number DSM 6278). A. oryzae DNA was isolated from strain A1560 (IFO 4177).

Finally a number of synthetic DNA fragments have been employed, all of which were synthesized on an automatic DNA synthesizer (Applied Biosystems model 380A) using phosphoramidite chemistry and commercially available reagents (S.L. Beaucage and M.H. Caruthers (1981) Tetrahedron Letters 22, 1859-1869). The oligonucleotides were purified by polyacrylamide gel electrophoresis under denaturing conditions. Prior to annealing complementary pairs of such DNA single strands, these were kinased by T4 polynucleotide kinase and ATP.

All other methods and materials used are common state of the art knowledge (J. Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. 1989).

Example 1

Construction of pLaC202

The 490 bp SphI-ApaI fragment of pT7 196δ (cf. WO 89/02463, fig.
5) and the 179 bp HinfI-XbaI fragment of pT7.aMI3 (cf. WO 89/02463, fig. 1) were joined to the 11 kb XbaI-SphI fragment of pMT742 via the synthetic adaptor

NOR367/373:
    CAACCATCGATAACACCACTTTGGCTAAGAG
CCGGGTTGGTAGCTATTGTGGTGAAACCGATTCTCTAA

resulting in the plasmid pLaC202 (fig. 2 and 3 as well as Sequence Listing ID No. 1).

This vector containing a unique ClaI site constitutes one embodiment of the random DNA cloning vector in which the product gene codes for the insulin precursor MI3 (B(1-29)-Ala-Ala-Lys-A(1-21)). The following examples concern the leaders cloned via this construct.

Example 2

Construction of pLSC6315 and pLSC5210

Total DNA was isolated from S. cerevisiae strain MT663, and digested by TaqI, HinPI or TaqI + HinPI. The digests were separated according to size on a 1% agarose gel, and fragments smaller than 600 bp were isolated from each of the three digestions.

pLaC202, previously digested with ClaI and prevented from self-ligation by dephosphorylation with Calf Intestine Alkaline Phosphatase (CIAP), was mixed with the fragment pools described above and ligated. E. coli strain MT172 (MT172 = MC 1000 m⁺r⁻ araf leuB-6; MC 1000 (cf. M. Casadaban and S. Cohen, J. Mol. Biol. 138, 1980, p. 179)) was transformed with the above ligation mixture, and approx. 5000 ApR transformants for each mixture were obtained. Recombinant plasmids were prepared from each of the three types in pools encompassing all 5000 transformants. These plasmid pools were used to transform S. cerevisiae strain MT663 and the resulting TPI⁺ transformants were immunoscreened for MI3 secretion.

Among the surprisingly large number of positive transformants, the eight apparently most efficient were reisolated and the plasmid content isolated therefrom.
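
The fragment pools of Example 2 can be pictured with a small in-silico digest. The sketch below is a simplification under stated assumptions: it works on the top strand only, uses the commonly cited recognition sequences T^CGA for TaqI and G^CGC for HinPI (both leave a 5'-CG overhang, which is what makes the fragments compatible with the ClaI-cut vector), and the function names and the toy input sequence are hypothetical.

```python
import re

# Recognition sequences; both enzymes cut after the first base of the site.
SITES = {"TaqI": "TCGA", "HinPI": "GCGC"}
CUT_OFFSET = 1


def digest(seq: str, enzymes: list[str]) -> list[str]:
    """Cut a linear top-strand sequence at every site of the given enzymes."""
    seq = seq.upper()
    cuts = sorted(
        {m.start() + CUT_OFFSET
         for name in enzymes
         for m in re.finditer(f"(?={SITES[name]})", seq)}
    )
    bounds = [0, *cuts, len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments: list[str], max_bp: int = 600) -> list[str]:
    """Keep fragments smaller than max_bp, as in the gel isolation step."""
    return [f for f in fragments if len(f) < max_bp]


toy = "AAATCGATTTGCGCAA"  # made-up sequence with one TaqI and one HinPI site
print(digest(toy, ["TaqI"]))            # ['AAAT', 'CGATTTGCGCAA']
print(digest(toy, ["TaqI", "HinPI"]))   # ['AAAT', 'CGATTTG', 'CGCAA']
```

Only internal fragments of a real digest carry the CG overhang at both ends; the two terminal pieces of a linear input, as in the toy example, do not.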

As a result of this procedure, it is expected that most of the yeast transformants obtained have a heterogeneous population of plasmids, and to obtain true clones, a step of plasmid reisolation was therefore performed. The plasmid preparations from each of the eight reisolated yeast transformants were used to transform E. coli strain MT172 to ApR. Plasmids from 12 E. coli transformants for each of the eight yeast isolates were individually used to transform yeast strain MT663 to TPI⁺, and MI3 secreting transformants were identified by immunoscreening.

Sequencing of the inserts of the eight isolated pLaC202 derivatives showed three different sequences, two of which, pLSC6315 and pLSC5210, most efficiently support MI3 secretion. The sequences of the cloned DNA and flanking regions are shown in Sequence Listings ID Nos. 2 and 4, respectively.

Example 3

Modifications of pLSC6315

pLSC6315 was chosen for further modification of the cloned synthetic leader sequence.

pLSC6315 was digested with the ApaI endonuclease followed by treatment with the exonuclease Bal31. After phenol extraction the resulting DNA was digested with XbaI, and DNA fragments smaller than the original 367 bp ApaI-XbaI fragment were isolated.

pLaC202 was digested with ClaI, and the single stranded CG tails generated were removed, followed by XbaI digestion and isolation of the 11 kb XbaI-]ClaI[ fragment ("] [" indicates that the single-stranded tails have been trimmed off). This fragment was mixed with the pLSC6315 fragments isolated above and ligated (fig. 6).

The transformation and screening procedure described in example 2 was repeated, and pLSC6315D3 and pLSC6315D7 were isolated as plasmids supporting MI3 secretion more efficiently than the original pLSC6315 (cf. Sequence Listings ID Nos. 6 and 8, respectively).

Example 4

Construction of pLAO1-5

Total DNA from Aspergillus oryzae strain A1560 was treated as previously described for S.
cerevisiae DNA in example 2, and cloning and recloning was performed exactly as described in example 2, except that the number of E. coli transformants in the first cloning was reduced to approximately 3000 per ligation mixture. This experiment resulted in the isolation of five clones of A. oryzae DNA which in the pLaC202 context mediate secretion of the insulin precursor MI3 from S. cerevisiae.

Sequencing of the inserts showed 5 different sequences, two of which (pLAO2 and pLAO5) are more efficient where MI3 secretion is concerned. The sequences of the DNA inserts in pLAO2 and pLAO5 are shown, together with the flanking regions, in Sequence Listings ID Nos. 10 and 12, respectively.

Example 5

Yeast strains harbouring plasmids as described above were grown in YPD medium (Sherman, F. et al., Methods in Yeast Genetics, Cold Spring Harbor Laboratory 1981). For each strain 6 individual 5 ml cultures were shaken at 30°C for 60 hours, with a final OD600 of approx. 15. After centrifugation the supernatant was removed for HPLC analysis, by which the concentration of secreted insulin precursor was measured according to the method described by Leo Snel et al. (1987) Chromatographia 24, 329-332.

In table I the expression levels of insulin precursor, MI3, by use of leader sequences isolated according to the present invention, are given as a percentage of the level obtained with transformants of pMT742δ, utilizing the MFα(1) leader of S. cerevisiae.
Table I

Plasmid: MI3 expression level (% of pMT742)
pMT742: 100%
pLSC6315: 100%
pLSC5210: 60%
pLSC6315D3: 175%
pLSC6315D7: 120%
pLA02: 120%
pLA05: 60%

SEQUENCE LISTING

(1) GENERAL INFORMATION:
(i) APPLICANT: Novo Nordisk A/S
(ii) TITLE OF INVENTION: A Method of Constructing Synthetic Leader Sequences
(iii) NUMBER OF SEQUENCES: 13
(iv) CORRESPONDENCE ADDRESS: (A) ADDRESSEE: Novo Nordisk A/S, Patent Department (B) STREET: Novo Alle (C) CITY: Bagsvaerd (E) COUNTRY: Denmark (F) ZIP: DK-2880
(v) COMPUTER READABLE FORM: (A) MEDIUM TYPE: Floppy disk (B) COMPUTER: IBM PC compatible (C) OPERATING SYSTEM: PC-DOS/MS-DOS (D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA: (A) APPLICATION NUMBER: (B) FILING DATE: (C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION: (A) NAME: Thalsoe-Madsen, Birgit (C) REFERENCE/DOCKET NUMBER: 3540.204-WO
(ix) TELECOMMUNICATION INFORMATION: (A) TELEPHONE: +45 4444 8888 (B) TELEFAX: +45 4449 3256 (C) TELEX: 37304

(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 335 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGAATGAA AGTCTTCCTG CTGCTTTCCC TCATTGGATT CTGCTGGGCC 120
CAACCATCGA TAACACCACT TTGGCTAAGA GATTCGTTAA CCAACACTTG TGCGGTTCCC 180
ACTTGGTTGA AGCTTTGTAC TTGGTTTGCG GTGAAAGAGG TTTCTTCTAC ACTCCTAAGG 240
CTGCTAAGGG TATTGTCGAA CAATGCTGTA CCTCCATCTG CTCCTTGTAC CAATTGGAAA 300
ACTACTGCAA CTAGACGCAG CCCGCAGGCT CTAGA 335

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 492 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..468
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..309
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 310..468
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GAATTCATTC
AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-78 -75 -70
TGC TGG GCC CAA CCA TCG ATA GAT GGA ACA CAT TTT CCG AAC AAC AAT 159
Cys Trp Ala Gln Pro Ser Ile Asp Gly Thr His Phe Pro Asn Asn Asn
-65 -60 -55
GTC CCA ATA GAC ACA AGA AAA GAA GGA CTA CAG CAT GAT TAC GAT ACA 207
Val Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln His Asp Tyr Asp Thr
-50 -45 -40 -35
GAA ATT TTG GAG CAC ATT GGA AGC GAT GAG TTA ATT TTG AAT GAA GAG 255
Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu Ile Leu Asn Glu Glu
-30 -25 -20
TAT GTT ATT GAA AGA ACT TTG CAA GCC ATC GAT AAC ACC ACT TTG GCT 303
Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu Ala
-15 -10 -5
AAG AGA TTC GTT AAC CAA CAC TTG TGC GGT TCC CAC TTG GTT GAA GCT 351
Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala
1 5 10
TTG TAC TTG GTT TGC GGT GAA AGA GGT TTC TTC TAC ACT CCT AAG GCT 399
Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala
15 20 25 30
GCT AAG GGT ATT GTC GAA CAA TGC TGT ACC TCC ATC TGC TCC TTG TAC 447
Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr
35 40 45
CAA TTG GAA AAC TAC TGC AAC TAGACGCAGC CCGCAGGCTC TAGA 492
Gln Leu Glu Asn Tyr Cys Asn
50

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 131 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-78 -75 -70 -65
Pro Ser Ile Asp Gly Thr His Phe Pro Asn Asn Asn Val Pro Ile Asp
-60 -55 -50
Thr Arg Lys Glu Gly Leu Gln His Asp Tyr Asp Thr Glu Ile Leu Glu
-45 -40 -35
His Ile Gly Ser Asp Glu Leu Ile Leu Asn Glu Glu Tyr Val Ile Glu
-30 -25 -20 -15
Arg Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu Ala Lys Arg Phe Val
-10 -5 1
Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val
5 10 15
Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro
Lys Ala Ala Lys Gly Ile
20 25 30
Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn
35 40 45 50
Tyr Cys Asn

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 420 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..396
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..237
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 238..396
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-54 -50 -45
TGC TGG GCC CAA CCA TCG CTA TTG GAG TCA CTT ACG CTC GTT GAT GTT 159
Cys Trp Ala Gln Pro Ser Leu Leu Glu Ser Leu Thr Leu Val Asp Val
-40 -35 -30
GAC GCA CTG TCG GAT ATT GAT GTA CTT GTT GAG TCT GAA ACG CTT GTG 207
Asp Ala Leu Ser Asp Ile Asp Val Leu Val Glu Ser Glu Thr Leu Val
-25 -20 -15
CTT GTC GAT AAC ACC ACT TTG GCT AAG AGA TTC GTT AAC CAA CAC TTG 255
Leu Val Asp Asn Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu
-10 -5 1 5
TGC GGT TCC CAC TTG GTT GAA GCT TTG TAC TTG GTT TGC GGT GAA AGA 303
Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg
10 15 20
GGT TTC TTC TAC ACT CCT AAG GCT GCT AAG GGT ATT GTC GAA CAA TGC 351
Gly Phe Phe Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys
25 30 35
TGT ACC TCC ATC TGC TCC TTG TAC CAA TTG GAA AAC TAC TGC AAC 396
Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
40 45 50
TAGACGCAGC CCGCAGGCTC TAGA 420

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 107 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-54 -50 -45 -40
Pro Ser Leu Leu Glu Ser Leu Thr Leu Val Asp Val Asp Ala Leu Ser
-35 -30 -25
Asp Ile Asp Val Leu Val Glu Ser Glu Thr Leu Val Leu Val Asp Asn
-20 -15 -10
Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His
-5 1 5 10
Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr
15 20 25
Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile
30 35 40
Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
45 50

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 453 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..420
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..270
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 271..420
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-65 -60 -55
TGC TGG GCC CAA CCA ATA GAC ACA AGA AAA GAA GGA CTA CAG CAT GAT 159
Cys Trp Ala Gln Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln His Asp
-50 -45 -40
TAC GAT ACA GAA ATT TTG GAG CAC ATT GGA AGC GAT GAG TTA ACC CCG 207
Tyr Asp Thr Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu Thr Pro
-35 -30 -25
AAT GAA GAG TAT GTT ATT GAA AGA ACT TTG CAA GCC ATC GAT AAC ACC 255
Asn Glu Glu Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp Asn Thr
-20 -15 -10
ACT TTG GCT AAG AGA TTC GTT AAC CAA CAC TTG TGC GGT TCC CAC TTG 303
Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu
-5 1 5 10
GTT GAA GCT TTG TAC TTG GTT TGC GGT GAA AGA GGT TTC TTC TAC ACT 351
Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr
15 20 25
CCT AAG GCT GCT AAG GGT ATT GTC GAA CAA TGC TGT ACC TCC ATC TGC 399
Pro Lys
Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys
30 35 40
TCC TTG TAC CAA TTG GAA AAC TACTGCAACT AGACGCAGCC CGCAGGCTCT 450
Ser Leu Tyr Gln Leu Glu Asn
45 50
AGA 453

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 115 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-65 -60 -55 -50
Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln His Asp Tyr Asp Thr Glu
-45 -40 -35
Ile Leu Glu His Ile Gly Ser Asp Glu Leu Thr Pro Asn Glu Glu Tyr
-30 -25 -20
Val Ile Glu Arg Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu Ala Lys
-15 -10 -5
Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu
1 5 10 15
Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala Ala
20 25 30
Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln
35 40 45
Leu Glu Asn
50

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 459 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..435
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..276
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 277..435
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-67 -65 -60
TGC TGG GCC CAA CCT GTC CCA ATA GAC ACA AGA AAA GAA GGA CTA CAG 159
Cys Trp Ala Gln Pro Val Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln
-55 -50 -45 -40
CAT GAT TAC GAT ACA GAA ATT TTG GAG CAC ATT GGA AGC GAT GAG TTA 207
His Asp Tyr Asp Thr Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu
-35 -30 -25
ACC CCG AAT GAA GAG TAT GTT ATT GAA AGA ACT TTG CAA GCC ATC GAT 255
Thr Pro Asn Glu Glu Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp
-20 -15 -10
AAC ACC ACT TTG GCT AAG AGA
TTC GTT AAC CAA CAC TTG TGC GGT TCC 303
Asn Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser
-5 1 5
CAC TTG GTT GAA GCT TTG TAC TTG GTT TGC GGT GAA AGA GGT TTC TTC 351
His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe
10 15 20 25
TAC ACT CCT AAG GCT GCT AAG GGT ATT GTC GAA CAA TGC TGT ACC TCC 399
Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser
30 35 40
ATC TGC TCC TTG TAC CAA TTG GAA AAC TAC TGC AAC TAGACGCAGC 445
Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
45 50
CCGCAGGCTC TAGA 459

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 120 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-67 -65 -60 -55
Pro Val Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln His Asp Tyr Asp
-50 -45 -40
Thr Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu Thr Pro Asn Glu
-35 -30 -25 -20
Glu Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu
-15 -10 -5
Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu
1 5 10
Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys
15 20 25
Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu
30 35 40 45
Tyr Gln Leu Glu Asn Tyr Cys Asn
50

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 408 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..384
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..225
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 226..384
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-50 -45 -40
TGC TGG GCC CAA CCA TCG ATC TTG GAT TAT GTT GAC TTG GGT GCG GAA
159
Cys Trp Ala Gln Pro Ser Ile Leu Asp Tyr Val Asp Leu Gly Ala Glu
-35 -30 -25
CTG ATC TCC ATT CGT GGG TAT GAT AAC CTC AAC GAC GCG ATC GAT AAC 207
Leu Ile Ser Ile Arg Gly Tyr Asp Asn Leu Asn Asp Ala Ile Asp Asn
-20 -15 -10
ACC ACT TTG GCT AAG AGA TTC GTT AAC CAA CAC TTG TGC GGT TCC CAC 255
Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His
-5 1 5 10
TTG GTT GAA GCT TTG TAC TTG GTT TGC GGT GAA AGA GGT TTC TTC TAC 303
Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr
15 20 25
ACT CCT AAG GCT GCT AAG GGT ATT GTC GAA CAA TGC TGT ACC TCC ATC 351
Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile
30 35 40
TGC TCC TTG TAC CAA TTG GAA AAC TAC TGC AAC TAGACGCAGC CCGCAGGCTC 404
Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
45 50
TAGA 408

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 103 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-50 -45 -40 -35
Pro Ser Ile Leu Asp Tyr Val Asp Leu Gly Ala Glu Leu Ile Ser Ile
-30 -25 -20
Arg Gly Tyr Asp Asn Leu Asn Asp Ala Ile Asp Asn Thr Thr Leu Ala
-15 -10 -5
Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala
1 5 10
Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala
15 20 25 30
Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr
35 40 45
Gln Leu Glu Asn Tyr Cys Asn
50

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 372 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 76..348
(ix) FEATURE: (A) NAME/KEY: sig_peptide (B) LOCATION: 76..189
(ix) FEATURE: (A) NAME/KEY: mat_peptide (B) LOCATION: 190..348
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA 60
TAAACGATTA AAAGA ATG AAA GTC TTC CTG CTG CTT TCC CTC ATT GGA TTC 111
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe
-38 -35 -30
TGC TGG GCC CAA CCA TCG CAC ACT ACC ATC GGC ACC GCA ACT GAC AAA 159
Cys Trp Ala Gln Pro Ser His Thr Thr Ile Gly Thr Ala Thr Asp Lys
-25 -20 -15
AAC ATC GAT AAC ACC ACT TTG GCT AAG AGA TTC GTT AAC CAA CAC TTG 207
Asn Ile Asp Asn Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu
-10 -5 1 5
TGC GGT TCC CAC TTG GTT GAA GCT TTG TAC TTG GTT TGC GGT GAA AGA 255
Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg
10 15 20
GGT TTC TTC TAC ACT CCT AAG GCT GCT AAG GGT ATT GTC GAA CAA TGC 303
Gly Phe Phe Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys
25 30 35
TGT ACC TCC ATC TGC TCC TTG TAC CAA TTG GAA AAC TAC TGC AAC 348
Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
40 45 50
TAGACGCAGC CCGCAGGCTC TAGA 372

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 91 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Lys Val Phe Leu Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala Gln
-38 -35 -30 -25
Pro Ser His Thr Thr Ile Gly Thr Ala Thr Asp Lys Asn Ile Asp Asn
-20 -15 -10
Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser His
-5 1 5 10
Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr
15 20 25
Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser Ile
30 35 40
Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
45 50 </div>
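The feature tables in the sequence listing can be cross-checked with simple arithmetic. The Python sketch below is an illustration only and not part of the specification. It copies the sig_peptide and mat_peptide coordinates given for SEQ ID NOs 2, 4, 6, 8, 10 and 12, the protein lengths stated for SEQ ID NOs 3, 5, 7, 9, 11 and 13, and the Table I percentages, then converts each coordinate range into a residue count. The names `LISTING`, `residues` and `summarize` are hypothetical helpers introduced here. The region annotated sig_peptide is treated as the complete signal-plus-leader segment, which is how the negative residue numbering in the listing reads.

```python
# Values copied from the sequence listing and Table I above.
# plasmid: (sig_peptide range, mat_peptide range,
#           stated protein length, MI3 level in % of pMT742)
LISTING = {
    "pLSC6315":   ((76, 309), (310, 468), 131, 100),
    "pLSC5210":   ((76, 237), (238, 396), 107, 60),
    "pLSC6315D3": ((76, 270), (271, 420), 115, 175),
    "pLSC6315D7": ((76, 276), (277, 435), 120, 120),
    "pLA02":      ((76, 225), (226, 384), 103, 120),
    "pLA05":      ((76, 189), (190, 348), 91, 60),
}


def residues(span):
    """Return the number of codons in an inclusive, 1-based nucleotide range."""
    start, end = span
    length = end - start + 1
    if length % 3:
        raise ValueError(f"range {span} is not a whole number of codons")
    return length // 3


def summarize(listing):
    """Return (plasmid, leader aa, mature aa, lengths agree, MI3 level) rows."""
    rows = []
    for name, (sig, mat, stated, level) in listing.items():
        leader, mature = residues(sig), residues(mat)
        rows.append((name, leader, mature, leader + mature == stated, level))
    return rows


if __name__ == "__main__":
    for name, leader, mature, ok, level in summarize(LISTING):
        print(f"{name:11s} leader={leader:3d} aa  mature={mature:2d} aa  "
              f"matches stated length={ok}  MI3 level={level}%")
```

Run as a script, this prints one row per plasmid, so leader length can be read next to the Table I expression level. The check only confirms that the listing is internally consistent; it says nothing about why a given leader secretes more or less MI3.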

Claims

<div id="claims" class="application article clearfix printTableText"> WHAT I/WE CLAIM IS:

1. A method of constructing a synthetic leader peptide sequence for secreting heterologous polypeptides in yeast, the method comprising (a) inserting a random DNA fragment into a yeast expression vector comprising the following sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' wherein SP is a DNA sequence encoding a signal peptide, Xn is a DNA sequence encoding n amino acids, wherein n is 0 or an integer of from 1 to 10 amino acids, RS is a restriction endonuclease recognition site for insertion of random DNA fragments, which site is provided at the junction of Xn and Xm, Xm is a DNA sequence encoding m amino acids, wherein m is 0 or an integer from 1 to 10, (NZT)p is a DNA sequence encoding Asn-Xaa-Thr, wherein p is 0 or 1, Xq is a DNA sequence encoding q amino acids, wherein q is 0 or an integer from 1 to 10, PS is a DNA sequence encoding a peptide defining a yeast processing site, and *gene* is a DNA sequence encoding a heterologous polypeptide; (b) transforming a yeast host cell with the expression vector of step (a); (c) culturing the transformed host cell of step (b) under appropriate conditions; and (d) screening the culture of step (c) for secretion of the heterologous polypeptide.

2. A method according to claim 1, wherein the random DNA fragment inserted in the vector is of genomic or synthetic origin.

3. A method according to claim 1 or 2, wherein the random DNA fragment has a length of from 6 to 600 base pairs.

4. A method according to any of claims 1-3, wherein the random DNA fragment encodes a high proportion of polar amino acids.

5. A method according to any of claims 1-4, wherein the random DNA fragment encodes at least one proline.

6. A method according to claim 1, wherein n and/or m and/or q are >1.

7. A method according to claim 1, wherein p is 1.

8. A method according to claim 1, wherein SP is a DNA sequence encoding the α-factor signal peptide, the signal peptide of mouse salivary amylase, the carboxypeptidase signal peptide, the yeast BAR1 signal peptide, or the Humicola lanuginosa lipase signal peptide, or a derivative thereof.

9. A method according to claim 1, wherein PS is a DNA sequence encoding Lys-Arg, Arg-Lys, Lys-Lys, Arg-Arg or Ile-Glu-Gly-Arg.

10. A method according to claim 1, wherein the heterologous polypeptide is selected from the group consisting of aprotinin, tissue factor pathway inhibitor or other protease inhibitors, insulin or insulin precursors, human or bovine growth hormone, interleukin, glucagon, tissue plasminogen activator, transforming growth factor α or β, platelet-derived growth factor, enzymes, or a functional analogue thereof.

11. A yeast expression cloning vector comprising the following sequence 5'-SP-Xn-3'-RS-5'-Xm-(NZT)p-Xq-PS-*gene*-3' wherein SP is a DNA sequence encoding a signal peptide, Xn is a DNA sequence encoding n amino acids, wherein n is 0 or an integer of from 1 to 10 amino acids, RS is a restriction endonuclease recognition site provided at the junction of Xn and Xm, Xm is a DNA sequence encoding m amino acids, wherein m is 0 or an integer from 1 to 10, (NZT)p is a DNA sequence encoding Asn-Xaa-Thr, wherein p is 0 or 1, Xq is a DNA sequence encoding q amino acids, wherein q is 0 or an integer from 1 to 10, PS is a DNA sequence encoding a peptide defining a yeast processing site, and *gene* is a DNA sequence encoding a heterologous polypeptide.

12. A vector according to claim 11, wherein n and/or m and/or q are >1.

13. A vector according to claim 11, wherein p is 1.

14. A vector according to claim 11, wherein SP is a DNA sequence encoding the α-factor signal peptide, the signal peptide of mouse salivary amylase, the carboxypeptidase signal peptide or the yeast BAR1 signal peptide.

15. A vector according to claim 11, wherein PS is a DNA sequence encoding Lys-Arg, Arg-Lys, Arg-Arg, Lys-Lys or Ile-Glu-Gly-Arg.

16. A vector according to claim 11, wherein the heterologous polypeptide is selected from the group consisting of aprotinin, extrinsic pathway inhibitor or other protease inhibitors, insulin or insulin precursors, human or bovine growth hormone, interleukin, glucagon, tissue plasminogen activator, transforming growth factor α or β, platelet-derived growth factor, enzymes, or a functional analogue thereof.

17. A yeast expression vector comprising the following sequence 5'-SP-Xn-ranDNA-Xm-(NZT)p-Xq-PS-*gene*-3' wherein SP is a DNA sequence encoding a signal peptide, Xn is a DNA sequence encoding n amino acids, wherein n is 0 or an integer of from 1 to 10 amino acids, ranDNA is a random DNA fragment inserted in a restriction endonuclease recognition site provided at the junction of Xn and Xm, Xm is a DNA sequence encoding m amino acids, wherein m is 0 or an integer from 1 to 10, (NZT)p is a DNA sequence encoding Asn-Xaa-Thr, wherein p is 0 or 1, Xq is a DNA sequence encoding q amino acids, wherein q is 0 or an integer from 1 to 10, PS is a DNA sequence encoding a peptide defining a yeast processing site, and *gene* is a DNA sequence encoding a heterologous polypeptide, the sequence Xn-Xq encoding a leader peptide sequence.

18. A vector according to claim 17, wherein the random DNA fragment inserted in the vector is of genomic or synthetic origin.

19. A vector according to claim 17 or 18, wherein the random DNA fragment has a length of from 6 to 600 base pairs.

20. A vector according to any of claims 17-19, wherein the random DNA fragment encodes a high proportion of polar amino acids.

21. A vector according to any of claims 17-20, wherein the random DNA fragment encodes at least one proline.

22. A vector according to claim 17, wherein n and/or m and/or q are >1.

23. A vector according to claim 17, wherein p is 1.

24. A vector according to claim 17, wherein SP is a DNA sequence encoding the α-factor signal peptide, the signal peptide of mouse salivary amylase, the carboxypeptidase signal peptide, the yeast BAR1 signal peptide, or the Humicola lanuginosa lipase signal peptide, or a derivative thereof.

25. A vector according to claim 17, wherein PS is a DNA sequence encoding Lys-Arg, Arg-Lys, Arg-Arg, Lys-Lys or Ile-Glu-Gly-Arg.

26. A vector according to claim 17, wherein the heterologous polypeptide is selected from the group consisting of aprotinin, tissue factor pathway inhibitor or other protease inhibitors, insulin or insulin precursors, human or bovine growth hormone, interleukin, glucagon, tissue plasminogen activator, transforming growth factor α or β, platelet-derived growth factor, enzymes, or a functional analogue thereof.

27. A yeast cell which is capable of expressing a heterologous polypeptide and which is transformed with a yeast expression vector according to any of claims 11-16.

28. A yeast cell which is capable of expressing a heterologous polypeptide and which is transformed with a yeast expression vector according to any of claims 17-26.

29. A process for producing a heterologous polypeptide in yeast, the process comprising culturing a yeast cell, which is capable of expressing a heterologous polypeptide and which is transformed with a yeast expression vector according to any of claims 17-26 including a leader peptide sequence constructed by the method of claim 1, in a suitable medium to obtain expression and secretion of the heterologous polypeptide, after which the heterologous polypeptide is recovered from the medium.

30. A method of constructing a synthetic leader peptide sequence, substantially as herein described with reference to any one of the Examples.

31. A yeast expression cloning vector as claimed in claim 11, as specifically set forth herein.

32. A yeast cell transformed with a yeast expression vector of claim 31 or 32. NOVO NORDISK A/S By Their Attorneys BALDWIN SON & CAREY </div>
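The vector layout recited in claims 1, 11 and 17 can be written down as a small data structure. The Python sketch below is a hypothetical illustration and not part of the claims: the class `LeaderCassette`, the function `has_nxt_and_lys_arg` and the constant `PLA05_LEADER` are names introduced here. The class enforces only the numeric limits stated in the claims (n, m and q each from 0 to 10, p either 0 or 1). The helper checks a leader written in one-letter code for an Asn-Xaa-Thr tripeptide and a C-terminal Lys-Arg processing site, using residues -38 to -1 of SEQ ID NO:13 (the pLA05 leader) as the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LeaderCassette:
    """Numeric parameters of 5'-SP-Xn-RS-Xm-(NZT)p-Xq-PS-gene-3'."""
    n: int  # residues encoded by Xn
    m: int  # residues encoded by Xm
    p: int  # 1 if the Asn-Xaa-Thr codons (NZT) are present, else 0
    q: int  # residues encoded by Xq

    def __post_init__(self):
        # Limits taken from the claims: n, m, q in 0..10; p in {0, 1}.
        for name in ("n", "m", "q"):
            value = getattr(self, name)
            if not 0 <= value <= 10:
                raise ValueError(f"{name} must be 0..10, got {value}")
        if self.p not in (0, 1):
            raise ValueError(f"p must be 0 or 1, got {self.p}")

    def schematic(self, insert="RS"):
        """Render the layout; pass insert='ranDNA' for a filled vector."""
        parts = ["5'-SP", f"X{self.n}", insert, f"X{self.m}",
                 f"(NZT){self.p}", f"X{self.q}", "PS", "gene-3'"]
        return "-".join(parts)


# Asn-Xaa-Thr in one-letter code, with Xaa taken as any residue.
NXT = re.compile(r"N.T")

# Residues -38 to -1 of SEQ ID NO:13 (pLA05 leader), one-letter code.
PLA05_LEADER = "MKVFLLLSLIGFCWAQPSHTTIGTATDKNIDNTTLAKR"


def has_nxt_and_lys_arg(leader):
    """True if the leader has an Asn-Xaa-Thr motif and ends in Lys-Arg."""
    return bool(NXT.search(leader)) and leader.endswith("KR")


if __name__ == "__main__":
    # The parameter values below are arbitrary, chosen only to show the API.
    cassette = LeaderCassette(n=3, m=1, p=1, q=2)
    print(cassette.schematic())
    print(cassette.schematic(insert="ranDNA"))
    print(has_nxt_and_lys_arg(PLA05_LEADER))
```

The parameter values passed to `LeaderCassette` in the usage lines are arbitrary and are not the values of pLaC202 or any other plasmid of the Examples.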