CA1340280C

CA1340280C - Synthetic signal sequence for the transport of proteins in expression systems

Info

Publication number: CA1340280C
Application number: CA000492345A
Authority: CA
Inventors: Joachim Engels; Michael Leineweber; Eugen Uhlmann; Waldemar Wetekam
Original assignee: Hoechst AG
Current assignee: Hoechst AG
Priority date: 1984-10-06
Filing date: 1985-10-04
Publication date: 1998-12-22
Anticipated expiration: 2015-12-22
Also published as: PT81253B; JPS6188883A; PH30819A; ES8605579A1; IL76573A; NZ213717A; IE63262B1; DK453285A; ATE97445T1; GR852405B; DK175511B1; ES547600A0; DE3436818A1; HUT40164A; IL76573A0; EP0177827A2; IE852440L; AU4833385A; DK453285D0; AU595486B2

Abstract

The DNA of a natural signal sequence is modified by incorporation of cleavage sites for endonucleases and can thus be incorporated in any desired vectors by the modular construction principle. The vectors modified in this way then bring about transport of the coded protein out of the cytoplasm.

Description

A synthetic signal sequence for the transport of proteins in expression systems

In the cell, proteins are synthesized on the ribosomes, which are located in the cytoplasm. Proteins which are transported out of the cytoplasm carry on the amino terminal end a relatively short peptide chain which is eliminated enzymatically on passage through the cytoplasmic membrane, whereupon the mature protein is produced. This short peptide sequence is called a "signal peptide" or a presequence or leader sequence.

The signal sequence located at the amino terminal end has already been characterized for a large number of secretory proteins. In general, it is composed of a hydrophobic region of about 10 to 20 amino acids, which is called the core and to whose amino terminal end a short peptide sequence (the pre-core) is bonded, this usually having one positively charged amino acid (or several). Between the carboxy terminal end of the hydrophobic region and the amino terminal end of the mature transported protein there is a short peptide sequence (the post-core) which contains the splice site and ensures that the spatial arrangement is favorable.

It is known, from U.S. Patent 4,411,994, to couple the gene for a protein which is to be expressed with a bacterial gene which codes for an extracellular or periplasmic carrier protein in order thus to bring about the transport of the desired protein out of the cytoplasm. It is necessary for this process to isolate a bacterial gene, which is intrinsic to the host, for a periplasmic, outer membrane protein or an extracellular protein. This gene is then cut with a restriction enzyme, the gene for the protein which is to be transported is inserted into the cut which has been produced, and the host cell is transformed with a vector which contains the fusion gene thus formed. The isolation of the natural gene and its characterization for the selection of suitable cleavage sites is extremely complex. This complexity is avoided according to the invention by making use of a synthetic signal sequence.

Thus the invention relates to a synthetic signal sequence for the transport of proteins in expression systems, which comprises DNA essentially corresponding to a natural signal sequence but having one or more cleavage sites for endonucleases which are not present in the natural DNA.

The invention further relates to DNA of the Formula I (shown at the end of the description).

The invention further relates to a process for the transport expression of eukaryotic, prokaryotic or viral proteins in prokaryotic and eukaryotic cells, which comprises coupling the gene for the protein which is to be transported onto a DNA sequence as described above, incorporating this fusion gene into a vector, and transforming therewith a host cell which transports the expressed protein out of the cytoplasm.

The invention further relates to a hybrid vector comprising a DNA sequence as described above and a host organism containing such vector.

The invention will now be described in further detail by reference to the appended drawings:
Figure 1 shows the digestion of the plasmid pBR 322 with the restriction endonucleases EcoR I and Pvu II and then the filling in of the EcoR I cleavage site.
Figure 2 shows the plasmid pUC 9 containing the monkey preproinsulin DNA and the reaction sequence for the construction of the proinsulin DNA fragment.
Figure 3 shows the ligation of the chemically synthesized regulation region with the proinsulin DNA fragment.
Figure 4 shows how the hybrid plasmid pWI 6 is obtained.
Figure 5 shows the plasmid pWIP 1 having a DNA sequence I integrated in the correct direction of reading to the proinsulin gene.

The DNA should "essentially" correspond to that of a natural signal sequence. This is to be understood to mean that the expressed signal peptide is substantially or completely identical to the natural signal peptide; in the latter case, therefore, the only difference existing at the DNA level is that the synthetic DNA has at least one cleavage site that the natural DNA sequence does not contain. This incorporation of the cleavage site according to the invention thus means that there is a more or less extensive difference from the natural sequence, it being necessary under certain circumstances to have recourse to codons which are known to be less preferred by the particular host organism. However, surprisingly, this is not associated with any expression disadvantage. On the contrary, the specific "making to measure" of the synthetic gene is associated with so many advantages that any disadvantage owing to the use of "unnatural" codons is, in general, overcompensated by far. In fact, it has emerged that replacement of the start codon GTG, which occurs in the gene for alkaline phosphatase in E. coli, by ATG leads to a great increase in expression. A particular advantage of the invention is that the host cell has to produce less ballast protein because the gene which is to be expressed can be directly linked to the 3' end of the synthetic DNA signal sequence. Furthermore, advantages accrue in so far as it is possible in the construction of the synthetic DNA to provide DNA sequences, which protrude at the ends, for certain restriction recognition sites which allow cloning of this sequence and, in the case of disparate recognition sites, permit defined incorporation into a cloning vector. This makes possible incorporation into any desired vectors by the "modular construction principle".

Internal recognition sites for restriction enzymes permit any desired homologous or heterologous genes to be coupled on in the correct reading frame. It is also possible via these internal cleavage sites to introduce in a straightforward manner modifications in the DNA of the signal sequences, which lead to presequences which do not occur in nature.

These internal cleavage sites are advantageously placed in the regions upstream and downstream of the hydrophobic region, in particular in the post-core region, it being possible to modify the splice site and/or its adjacent region. Of course, it is also possible to modify the core region in a manner known per se.

Taking known rules into account (G. von Heijne, J. Mol. Biol. 173 (1984) 243-251) it is possible, via suitable cleavage sites in the gene section which codes for the carboxy terminal part of the prepeptide, to plan the signal peptidase splice site in such a manner that there is expression not of a fusion protein but directly of the desired, generally eukaryotic, peptide in its natural form. In general, genes of natural origin do not allow processing of this type.

Suitable signal sequences are in principle all signal sequences known from the literature (M.E.E. Watson, Nucleic Acids Res. 12 (1984), 5145 - 5164), modifications thereof and "idealized" signal sequences derived therefrom (D. Perlman and H.O. Halvorson, J. Mol. Biol. 167 (1983), 391 - 409).

Preferred host organisms are E. coli, Streptomyces, Staphylococcus species, such as S. aureus, Bacillus species, such as B. subtilis, B. amyloliquefaciens, B. cereus or B. licheniformis, Pseudomonas, Saccharomyces, Spodoptera frugiperda and cell lines of higher organisms, such as plant or animal cells.

In principle, it is possible to obtain by transport expression all those proteins of prokaryotic or eukaryotic origin which can pass through the membrane. However, peptide products which are of pharmaceutical significance, such as hormones, lymphokines, interferons, blood-coagulation factors and vaccines, which in nature are also coded as peptides with an amino-terminal presequence, are preferred. However, in the prokaryotic host organisms this eukaryotic presequence is not, as a rule, eliminated by the signal peptidases intrinsic to the host.

In E. coli, the genes for the periplasmic and outer-membrane proteins are suitable for transport expression, the former directing the product into the periplasm whereas the latter tend to direct onto the outer membrane.

The example which is given is the DNA signal sequence of the periplasmic protein alkaline phosphatase, which is very readily expressed in E. coli, but there is no intention to restrict the invention to this.

The presequence including the first twenty amino acids of alkaline phosphatase of E. coli is shown below:

  1  Met-Lys-Gln-Ser-Thr-Ile-Ala-Leu-Ala-Leu-
 11  Leu-Pro-Leu-Leu-Phe-Thr-Pro-Val-Thr-Lys-
 21  Ala-↓-Arg-Thr-Pro-Glu-Met-Pro-Val-Leu-Glu-
 31  Asn-Arg-Ala-Ala-Gln-Gly-Asn-Ile-Thr-Ala-
 41  Pro ...

↓ = preferred splice site of the signal peptidase

It has emerged that up to about 40, usually about 20, additional amino acids of the mature protein suffice for correct processing. However, in many cases fewer additional amino acids also suffice, for example about 10, advantageously about 5. Since a shorter protein chain means less stress on the protein biosynthesis system of the host cell, an advantageous embodiment of the invention is set out in DNA sequence I (shown at the end of the description), which codes for the presequence of alkaline phosphatase and an additional 5 amino acids of the mature protein. Apart from a few triplet modifications, namely those which introduce unique restriction enzyme cleavage sites and replace the start codon GTG by ATG, DNA sequence I corresponds to the natural sequence for alkaline phosphatase. At the ends of the coding strand are located protruding DNA sequences corresponding to the restriction endonuclease EcoR I, which permit incorporation into conventional cloning vectors, for example the commercially available plasmids such as pBR 322, pUC 8 or pUC 12. In addition, a number of other unique cleavage sites for restriction enzymes have been incorporated within the gene of DNA sequence I, and these, on the one hand, make it possible to couple heterologous genes onto the correct site and in the desired reading frame and, on the other hand, permit modifications to be carried out:
Restriction enzyme    Cut after nucleotide No. (in the coding strand)
Sau 3 A               19
Pvu I                 22
Hpa II                54  (present in the natural gene)
Nci I                 54  (present in the natural gene)
Alu I                 66
Hph I                 68
Ava II                70

Of course, it is also possible to construct the protruding sequences in such a manner that they correspond to different restriction enzymes, and this then permits incorporation into suitable vectors in a defined orientation. In this context, the expert will give consideration to whether the complexity associated with the construction of the gene and its specific incorporation is more important than the additional work of selection associated with incorporation in both orientations when the protruding ends are identical.
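The cut positions listed for DNA sequence I can be cross-checked against the coding strand. The following is a minimal sketch, not part of the original specification: the coding strand is taken from DNA sequence I of this document, while the recognition sequences and cut offsets in `ENZYMES` are the commonly catalogued specificities of these enzymes and are stated here as assumptions. The names `CODING`, `ENZYMES`, `cut_positions` and `cut_table` are illustrative only.

```python
import re

# Coding strand of DNA sequence I (5' -> 3'), 84 nucleotides, as printed in this document.
CODING = (
    "AATTCATGAAACAAAGCACGATCGCACTG"   # nucleotides 1-29
    "GCACTCTTACCGTTACTGTTTACCCCG"     # nucleotides 30-56
    "GTGACAAAAGCTCGGACCCCAGAAATGG"    # nucleotides 57-84
)

# Assumed specificities: (recognition sequence in IUPAC code,
# number of nucleotides between the start of the site and the cut in the coding strand).
ENZYMES = {
    "Sau 3A": ("GATC", 0),     # ^GATC
    "Pvu I": ("CGATCG", 4),    # CGAT^CG
    "Hpa II": ("CCGG", 1),     # C^CGG
    "Nci I": ("CCSGG", 2),     # CC^SGG
    "Alu I": ("AGCT", 2),      # AG^CT
    "Hph I": ("GGTGA", 13),    # GGTGA followed by 8 nucleotides, then the cut
    "Ava II": ("GGWCC", 1),    # G^GWCC
}

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "[CG]", "W": "[AT]"}


def cut_positions(seq, site, offset):
    """Return the 1-based nucleotide numbers after which the enzyme cuts."""
    # A lookahead is used so that overlapping occurrences are also found.
    pattern = "(?=" + "".join(IUPAC[base] for base in site) + ")"
    return [m.start() + offset for m in re.finditer(pattern, seq)]


def cut_table():
    """Map each enzyme name to its list of cut positions in the coding strand."""
    return {name: cut_positions(CODING, site, off)
            for name, (site, off) in ENZYMES.items()}


for enzyme, positions in cut_table().items():
    print(f"{enzyme:8s} cuts after nucleotide(s) {positions}")
```

Under these assumptions each enzyme has exactly one site in the coding strand, which is consistent with the description of the sites as unique. Only the coding strand is searched, so a site of an asymmetric enzyme lying on the opposite strand would not be reported by this sketch.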

DNA sequence I can be constructed of 6 oligonucleotides 26 - 31 bases in length by first synthesizing them chemically and then linking them enzymatically via sticky ends of 6 nucleotides. Incorporation of the synthetic gene into cloning vectors, for example into the commercially available plasmids mentioned, is carried out in a manner known per se.
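The division into six oligonucleotides can be illustrated with a short sketch. The upper-strand units (Ia, Ic, Ie) follow the three rows of DNA sequence II; the junctions of the lower-strand units (Ib, Id, If) are not legible in this copy, so the sketch assumes they lie 6 nucleotides downstream of the upper-strand junctions, which is what sticky ends of 6 nucleotides would imply. The function names are hypothetical.

```python
# Coding strand of DNA sequence I (5' -> 3'), as printed in this document.
CODING = (
    "AATTCATGAAACAAAGCACGATCGCACTG"
    "GCACTCTTACCGTTACTGTTTACCCCG"
    "GTGACAAAAGCTCGGACCCCAGAAATGG"
)


def complement(seq):
    """Base-by-base complement (no reversal)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def split_oligos(stagger=6):
    """Split the duplex into six structural units.

    Upper-strand units are returned 5' -> 3'; lower-strand units are returned
    3' -> 5', i.e. in the orientation in which they are printed under the
    coding strand. The lower junction positions are an assumption.
    """
    # Lower strand pairs with coding nucleotides 5..84 and carries the
    # protruding TTAA of the second Eco RI end.
    lower = complement(CODING[4:]) + "TTAA"
    upper_cuts = [0, 29, 56, len(CODING)]
    # Coding nucleotide n sits at lower-strand index n - 5, so a junction after
    # coding nucleotide (cut + stagger) is lower-strand index cut + stagger - 4.
    lower_cuts = [0] + [c + stagger - 4 for c in upper_cuts[1:-1]] + [len(lower)]
    units = {}
    for name, a, b in zip(("Ia", "Ic", "Ie"), upper_cuts, upper_cuts[1:]):
        units[name] = CODING[a:b]
    for name, a, b in zip(("Ib", "Id", "If"), lower_cuts, lower_cuts[1:]):
        units[name] = lower[a:b]
    return units


for name, seq in sorted(split_oligos().items()):
    print(name, len(seq), seq)
```

With this assumed stagger all six units fall in the stated range of 26 to 31 bases, which suggests the assumption is at least consistent with the text.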

As an example for the expression of a eukaryotic gene in E. coli using a presequence according to the invention, the synthesis of monkey proinsulin is described below: a DNA sequence is constructed in which the DNA sequence I, followed by the proinsulin gene (W. Wetekam et al., Gene 19 (1982) 179-183), is located on a connecting recognition site for EcoR I and downstream of a chemically synthesized regulation region, composed of a bacterial promoter, a lac operator and a ribosomal binding site (German Patent Application P 34 30 683.8), and 6 - 14 nucleotides away from the ribosomal binding site. The expressed proinsulin fusion peptide contains an additional 9 amino acids on the amino terminal end, and these can be eliminated enzymatically or chemically.

The incorporation of the synthetic gene into pUC 8 and the construction of expression plasmids which contain the eukaryotic genes coupled to DNA sequence I are carried out in a manner known per se. In this context, reference may be made to the textbook by Maniatis (Molecular Cloning, Maniatis et al., Cold Spring Harbor, 1982). The transformation of the hybrid plasmids thus obtained into suitable host organisms, advantageously E. coli, is likewise known per se and is described in detail in the abovementioned textbook. The isolation of the expressed proteins and their purification is likewise described.

In the examples which follow, some more embodiments of the invention are specifically illustrated, from which the large number of possible modifications (and combinations) is evident to the expert. Unless otherwise specified, percentage data in these examples relate to weight.

Examples

1. Chemical synthesis of a single-stranded oligonucleotide

The synthesis of the structural units of the gene is illustrated by the example of structural unit Ia of the gene, which comprises nucleotides 1 - 29 of the coding strand. The nucleoside at the 3' end, in the present case therefore guanosine (nucleotide No. 29), is covalently bonded via the 3'-hydroxy group, by known methods (M.J. Gait et al., Nucleic Acids Res. 8 (1980) 1081 - 1096), to silica gel (FRACTOSIL, supplied by Merck). For this purpose, first the silica gel is reacted with 3-triethoxysilylpropylamine with elimination of ethanol and formation of a Si-O-Si bond. The guanosine is reacted as the N2-isobutyryl-3'-O-succinoyl-5'-dimethoxytrityl ether with the modified carrier in the presence of para-nitrophenol and N,N'-dicyclohexylcarbodiimide, the free carboxy group of the succinoyl group acylating the amino radical of the propylamine group.

In the synthetic steps which follow, the base component is used as the monomethyl ester of the 5'-O-dimethoxytritylnucleoside-3'-phosphorous acid dialkylamide or chloride, the adenine being in the form of the N6-benzoyl compound, the cytosine being in the form of the N4-benzoyl compound, the guanine being in the form of the N2-isobutyryl compound, and the thymine, which contains no amino group, being without a protective group.

50 mg of the polymeric carrier containing 2 µmol of bound guanosine are treated successively with the following agents:

a) nitromethane
b) saturated zinc bromide solution in nitromethane containing 1% water
c) methanol
d) tetrahydrofuran
e) acetonitrile
f) 40 µmol of the appropriate nucleoside phosphite and 200 µmol of tetrazole in 0.5 ml of anhydrous acetonitrile (5 minutes)
g) 20% acetic anhydride in tetrahydrofuran containing 40% lutidine and 10% dimethylaminopyridine (2 minutes)
h) tetrahydrofuran
i) tetrahydrofuran containing 20% water and 40% lutidine
j) 3% iodine in collidine/water/tetrahydrofuran in the ratio by volume 5 : 4 : 1
k) tetrahydrofuran and
l) methanol.

In this context, the term "phosphite" is to be understood to be the monomethyl ester of the deoxyribose-3'-monophosphorous acid, the third valency being saturated by chloride or a tertiary amino group, for example a morpholino radical. The yields in each synthetic step can be determined after the detritylation reaction (b) in each case by spectrophotometry, measuring the absorption of the dimethoxytrityl cation at a wavelength of 496 nm.
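The yield bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the absorbance readings at 496 nm are assumed to be taken on equal volumes, so that the ratio of successive readings approximates the coupling yield of the intervening cycle, and the function names and the readings in the usage note are hypothetical rather than data from the specification.

```python
def stepwise_yields(absorbances):
    """Coupling yield of each cycle as the ratio of successive trityl readings."""
    return [later / earlier for earlier, later in zip(absorbances, absorbances[1:])]


def overall_yield(per_step_yield, couplings):
    """Overall yield after a number of couplings at a constant per-step yield."""
    return per_step_yield ** couplings


# Structural unit Ia has 29 nucleotides; the first one is bound to the carrier,
# so 28 coupling cycles are needed.
COUPLINGS_FOR_IA = 29 - 1
```

For example, `overall_yield(0.98, COUPLINGS_FOR_IA)` gives the fraction of full-length Ia expected if every cycle ran at a hypothetical 98% yield, which shows why the per-step yield is monitored after each detritylation.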

When the synthesis of the oligonucleotide is complete, the methyl phosphate protective groups on the oligomer are eliminated using p-thiocresol and triethylamine. The oligonucleotide is then removed from the solid carrier by treatment with ammonia for 3 hours. Treatment of the oligomers with concentrated ammonia for 2 to 3 days quantitatively eliminates the amino protective groups on the bases. The crude product thus obtained is purified by high-pressure liquid chromatography (HPLC) or by polyacrylamide gel electrophoresis.

The other structural units Ib - If of the gene are synthesized entirely correspondingly, their nucleotide sequences being evident from DNA sequence II (shown at the end of the description).

2. Enzymatic linkage of the single-stranded oligonucleotides to give DNA sequence I

The terminal oligonucleotides Ia and If are not phosphorylated. This prevents oligomerization via the protruding ends. For the phosphorylation of oligonucleotides Ib, Ic, Id and Ie, in each case 1 nmol of these compounds is treated with 5 nmol of adenosine triphosphate and 4 units of T4 polynucleotide kinase in 20 µl of 50 mM tris.HCl buffer (pH 7.6), 10 mM magnesium chloride and 10 mM dithiothreitol (DTT) at 37°C for 30 minutes. The enzyme is inactivated by heating at 95°C for 5 minutes. The oligonucleotides Ia to If are then combined and hybridized to give the double strand by heating them in a 20 mM KCl solution and then slowly (over the course of 2 hours) cooling to 1~°C. The ligation to give the DNA fragment according to DNA sequence I is carried out by reaction in 40 µl of 50 mM tris.HCl buffer (20 mM magnesium chloride and 10 mM DTT) using 100 units of T4 DNA ligase, at 15°C over the course of 18 hours.
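The quantities in the phosphorylation step translate into concentrations as in the following small sketch. It is a worked arithmetic example only; the helper name `micromolar` is hypothetical.

```python
def micromolar(nmol, microlitres):
    """Concentration in µmol/l for an amount in nmol dissolved in a volume in µl."""
    # nmol per µl equals mmol per l; the factor 1000 converts to µmol per l.
    return nmol * 1000.0 / microlitres


oligo_conc = micromolar(1, 20)   # 1 nmol of oligonucleotide in 20 µl
atp_conc = micromolar(5, 20)     # 5 nmol of ATP in 20 µl
molar_excess = atp_conc / oligo_conc

print(f"oligonucleotide: {oligo_conc} µM, ATP: {atp_conc} µM, excess: {molar_excess}-fold")
```

The kinase reaction therefore runs with a five-fold molar excess of ATP over 5' ends.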

The purification of the gene fragment is carried out by gel electrophoresis on a 10% polyacrylamide gel (without addition of urea, 20 x 40 cm, 1 mm thick), the marker substance used being ΦX 174 DNA (supplied by BRL) cut with Hinf I, or pBR 322 cut with Hae III.

3. Incorporation of the gene fragment in pUC 8

The commercially available plasmid pUC 8 is opened in a known manner and in accordance with the manufacturer's data using the restriction endonuclease EcoR I. The digestion mixture is fractionated by electrophoresis on a 5% polyacrylamide gel in a known manner, and the DNA is visualized by staining with ethidium bromide or by radioactive labeling ("Nick translation" method of Maniatis, loc. cit.). The plasmid band is then cut out of the acrylamide gel and separated from the polyacrylamide by electrophoresis.

4. Incorporation of DNA sequence I into an expression plasmid

The expression plasmid pWI 6 having the information for monkey proinsulin is constructed as follows:

10 µg of the plasmid pBR 322 are digested with the restriction endonucleases EcoR I and Pvu II and then the EcoR I cleavage site is filled in using Klenow polymerase in a fill-in reaction. Following fractionation by gel electrophoresis in a 5% polyacrylamide gel, the plasmid fragment of length 2293 Bp can be obtained by electroelution (Figure 1).

The monkey preproinsulin DNA integrated in the plasmid pBR 322 (Wetekam et al., Gene 19 (1982) 179 - 183) is isolated by digestion using the restriction endonucleases Hind III and Mst I (as a fragment of about 1250 Bp) and recloned into the plasmid pUC 9 as follows: the plasmid pUC 9 is cleaved with the enzyme Bam HI, the cleavage site is filled in using Klenow polymerase ("large fragment") in a standard fill-in reaction, subsequent cleavage with the restriction enzyme Hind III is carried out, and the DNA is separated from the other DNA fragments by gel electrophoresis in a 5% polyacrylamide gel. The isolated insulin DNA fragment of length about 1250 Bp is integrated into the opened plasmid.

To remove the untranslated region and the presequence, the pUC 9 plasmid thus modified is digested with Hae III, and the fragment of length 143 Bp is digested with Bal 31 under limiting enzyme conditions to eliminate the last two nucleotides from the presequence. This results in the first codon on the amino terminal end being TTT, which represents phenylalanine as the first amino acid of the B chain.

An adaptor which is specific for Eco RI is now ligated onto this fragment in a blunt-end ligation reaction:

a) 5' AAT TAT GAA TTC GCA ATG 3'
   3'      TA CTT AAG CGT TAC 5'
   (Eco RI)

b) 5' AAT TAT GAA TTC GCA AGA 3'
   3'      TA CTT AAG CGT TCT 5'
   (Eco RI)

In order to prevent polymerization of the adaptors they are used unphosphorylated in the ligation reaction (this being indicated in the figures by Eco RI-, in the same way as recognition sequences inactivated by, for example, filling in). The adaptor a) has a codon for methionine at the end, and the adaptor b) has a codon for arginine. Thus, the gene product obtained by variant a) is amenable to removal of the bacterial contribution by cleavage with cyanogen bromide, whereas variant b) allows trypsin cleavage.
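The properties of the two adaptors stated above can be checked with a short sketch. The sequences are copied from adaptors a) and b); the helper names are hypothetical, and the two-entry codon lookup covers only the terminal codons that occur here.

```python
# Upper strand written 5' -> 3', lower strand written 3' -> 5' as printed.
ADAPTORS = {
    "a": ("AATTATGAATTCGCAATG", "TACTTAAGCGTTAC"),
    "b": ("AATTATGAATTCGCAAGA", "TACTTAAGCGTTCT"),
}

# Only the two terminal codons used by the adaptors.
TERMINAL_CODON = {"ATG": "Met", "AGA": "Arg"}


def complement(seq):
    """Base-by-base complement (no reversal)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))


def analyse(upper, lower_3to5):
    """Return (protruding 5' end, strands pair correctly?, amino acid of last codon)."""
    overhang = len(upper) - len(lower_3to5)
    paired_part = upper[overhang:]
    pairs = complement(paired_part) == lower_3to5
    return upper[:overhang], pairs, TERMINAL_CODON[upper[-3:]]


for name, strands in ADAPTORS.items():
    print(name, analyse(*strands))
```

Both adaptors carry the protruding AATT of an Eco RI end and differ only in the final codon, which is what makes variant a) cleavable with cyanogen bromide (after methionine) and variant b) with trypsin (after arginine), as the text states.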

The ligation product is digested with Mbo II. After fractionation by gel electrophoresis, a DNA fragment of length 79 Bp having the information for amino acids Nos. 1 to 21 of the B chain is obtained.

The gene for the remaining information for the proinsulin molecule (including a G-C sequence from the cloning and 21 Bp from the pBR 322 connected to the stop codon) is obtained from the pUC 9 plasmid having the complete information for monkey preproinsulin by digestion with Mbo II/Sma I and isolation of a DNA fragment of length about 240 Bp. The correct ligation product of length about 320 Bp (including the adaptor of 18 Bp) is obtained by ligation of the two proinsulin fragments. This proinsulin DNA fragment thus constructed can now be ligated together with a regulation region via the Eco RI negative cleavage site.

Figure 2 shows the entire reaction sequence, where A, B and C denote the DNA for the particular peptide chains of the proinsulin molecule, Ad denotes the (dephosphorylated) adaptor (a or b) and Pre denotes the DNA for the presequence of monkey preproinsulin.

A chemically synthesized regulation region composed of a recognition sequence for Bam HI, the lac operator (O), a bacterial promoter (P) and a ribosomal binding site (RB), and having an ATG start codon, 6 to 14 nucleotides away from the RB and having a connected recognition sequence for Eco RI (Figure 3) is ligated, via the common Eco RI overlapping region, with the proinsulin gene fragment obtained according to the previous example. It is advantageous to choose the following synthetic regulation region (DNA sequence IIa from Table 2, corresponding to German Patent Application P 34 30 683.8):


5'  GATCCTAAATAAATTCTTGACATTTTTTAAA  3'
3'      GATTTATTTAAGAACTGTAAAAAATTT  5'
(Bam HI)                  P

5'  TAATTTGGTATAATGTGTGGAATTGTGAGCG  3'
3'  ATTAAACCATATTACACACCTTAACACTCGC  5'

5'  GAATAACAATTTCACAGAGGATCTAG      3'
3'  CTTATTGTTAAAGTGTCTCCTAGATCTTAA  5'
                 RB         (Eco RI)

The other synthetic regulation regions specified in Table 2 can be used likewise. However, it is also possible to choose a natural or derived (Perlman et al., loc. cit.) signal sequence known from the literature.

[Table 2: synthetic regulation regions. The table contents are not legible in this copy.]

Following double digestion with Sma I/Bam HI and a fill-in reaction of the Bam HI cleavage site with the Klenow fragment, the ligation product (about 420 Bp) is isolated by gel electrophoresis.

The fragment thus obtained can then, by a blunt-end ligation, be ligated into the pBR 322 part-plasmid of Figure 1 (Figure 4). The hybrid plasmid pWI 6 is obtained.

After transformation into the E. coli strain HB 101 and selection on ampicillin plates, the plasmid DNA of individual clones was tested for the integration of a 420 Bp fragment having the regulation region and the proinsulin gene shortened by Bal 31. In order to demonstrate the correct shortening of the proinsulin gene by Bal 31 (Figure 2), the plasmids having the integrated proinsulin gene fragment were sequenced starting from the Eco RI cleavage site. Of 60 sequenced clones, three had the desired shortening by two nucleotides (Figure 4).

1 µg of the plasmid pWI 6 is cut with the restriction enzyme Eco RI and then ligated together in the presence of 30 ng of DNA sequence I, at 16°C in 6 hours. After transformation into E. coli HB 101, plasmids are isolated from individual clones and tested for integration of DNA sequence I by means of restriction enzyme analysis. 7% of the clones contained the plasmid pWI 6 with integrated DNA sequence I.

The direction of this integration reaction can be unambiguously determined by standard methods of restriction enzyme analysis via double digestion with Hind III/Pvu I. The plasmid pWI 6 having a DNA sequence I integrated in the correct direction of reading to the proinsulin gene is shown as pWIP 1 in Figure 5.

This plasmid can then be transformed into various E. coli strains in order to test the synthetic capacity of the individual strains.

The expression of the presequence-proinsulin gene fusion in E. coli is determined as follows:

1 ml of a bacterial culture induced with IPTG (isopropyl β-D-thiogalactopyranoside) is stopped using PMSF (phenylmethylsulfonyl fluoride) in a final concentration of 5 x 10^-4 M at an optical density OD600 of 1.0 and at an induction time of 1 hour, cooled in ice and spun down. The cell sediment is then washed in 1 ml of buffer (10 mM tris.HCl, pH 7.6; 40 mM NaCl), spun down and resuspended in 200 µl of buffer (20% sucrose; 20 mM tris.HCl, pH 8.0; 2 mM EDTA), incubated at room temperature for 10 minutes, spun down and immediately resuspended in 500 µl of double-distilled H2O. After incubation in ice for 10 minutes, the shock-lysed bacteria are spun down and the supernatant is frozen. The proinsulin content of this supernatant is tested by a standard insulin RIA (Amersham).

The bacterial sediment is resuspended once more in 200 µl of lysozyme buffer (20% sucrose; 2 mg/ml lysozyme; 20 mM tris.HCl, pH 8.0; 2 mM EDTA), incubated in ice for 30 minutes, sonicated 3 x 10 seconds and then spun down. The supernatant resulting from this is tested for the content of proinsulin ("plasma fraction") in a radioimmunoassay.

Individual bacterial clones which contain the plasmid pWIP 1 were examined for their synthetic capacity and their ability to transport the proinsulin-presequence product. It was possible to demonstrate that all the bacterial clones, as expected, transported about 90% of the produced proinsulin into the periplasmic space. About 10% of the RIA activity of proinsulin was still found in the plasma fraction.

DNA sequence I

Triplet No.                  1   2   3
Amino acid                   Met Lys Gln
Nucleotide No.             5     10
Coding strand      5' AA TTC ATG AAA CAA
Non-coding strand  3'      G TAC TTT GTT

4   5   6   7   8   9   10  11  12  13
Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu
AGC ACG ATC GCA CTG GCA CTC TTA CCG TTA
TCG TGC TAG CGT GAC CGT GAG AAT GGC AAT

14  15  16  17  18  19  20  21  22  23
Leu Phe Thr Pro Val Thr Lys Ala Arg Thr
CTG TTT ACC CCG GTG ACA AAA GCT CGG ACC
GAC AAA TGG GGC CAC TGT TTT CGA GCC TGG

24  25  26
Pro Glu Met
CCA GAA ATG G       3'
GGT CTT TAC CTT AA  5'

DNA sequence II:

Ia
5' AA TTC ATG AAA CAA AGC ACG ATC GCA CTG
3'      G TAC TTT GTT TCG TGC TAG CGT GAC
(Eco RI)  Ib

Ic
GCA CTC TTA CCG TTA CTG TTT ACC CCG
CGT GAG AAT GGC AAT GAC AAA TGG GGC
Id

Ie
GTG ACA AAA GCT CGG ACC CCA GAA ATG G       3'
CAC TGT TTT CGA GCC TGG GGT CTT TAC CTT AA  5'
If                                 (Eco RI)
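As a cross-check on DNA sequence I, the coding strand can be translated and compared with the amino acids printed above it. The sketch below uses the standard genetic code; the variable and function names are illustrative, and the split into 21 presequence residues and 5 residues of the mature protein follows the description rather than being computed.

```python
from itertools import product

# Coding strand of DNA sequence I (5' -> 3'), as printed in this document.
CODING = (
    "AATTCATGAAACAAAGCACGATCGCACTG"
    "GCACTCTTACCGTTACTGTTTACCCCG"
    "GTGACAAAAGCTCGGACCCCAGAAATGG"
)

# Standard genetic code, codons enumerated in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA string whose length is a multiple of three (one-letter code)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Triplets 1-26 occupy nucleotides 6-83 of the coding strand.
peptide = translate(CODING[5:83])
presequence, mature_start = peptide[:21], peptide[21:]
print(presequence, mature_start)
```

The translation reproduces the 26 amino acids listed for DNA sequence I, that is, the presequence of alkaline phosphatase followed by the first five residues of the mature protein.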

Claims

1. A synthetic signal sequence for the transport of proteins in expression systems which comprises DNA
essentially corresponding to a natural signal sequence but having one or more cleavage sites for endonucleases which the natural DNA does not contain.

2. A signal sequence as claimed in claim 1, which contains internal cleavage sites upstream or downstream or upstream and downstream of the hydrophobic region.

3. A signal sequence as claimed in claim 1, which essentially corresponds to the natural signal sequence of alkaline phosphatase of E. coli.

4. A signal sequence as claimed in claim 1, which contains at the 3' end up to about 40 of the amino-terminal codons of the adjacent structural gene following downstream.

5. A signal sequence as claimed in claim 2, which essentially corresponds to the natural signal sequence of alkaline phosphatase of E. coli.

6. A signal sequence as claimed in claim 2, which contains at the 3' end up to about 40 of the amino-terminal codons of the structural gene following downstream.

7. A signal sequence as claimed in claim 3, which contains at the 3' end up to about 40 of the amino-terminal codons of the structural gene following downstream.

8. DNA of the formula I:

5' AA TTC ATG AAA CAA
3'      G TAC TTT GTT

AGC ACG ATC GCA CTG GCA CTC TTA CCG TTA
TCG TGC TAG CGT GAC CGT GAG AAT GGC AAT

CTG TTT ACC CCG GTG ACA AAA GCT CGG ACC
GAC AAA TGG GGC CAC TGT TTT CGA GCC TGG

CCA GAA ATG G       3'
GGT CTT TAC CTT AA  5'

9. A process for the transport expression of eukaryotic, prokaryotic or viral proteins in prokaryotic and eukaryotic cells, which comprises coupling the gene for the protein which is to be transported onto a DNA sequence as claimed in claim 1, incorporating this fusion gene into a vector, and transforming therewith a host cell which transports the expressed protein out of the cytoplasm.

10. The process as claimed in claim 9, wherein the synthetic DNA signal sequence codes for a protein intrinsic to the host.

11. A hybrid vector comprising a DNA sequence as claimed in claim 1.

12. A hybrid vector as claimed in claim 11, which is a hybrid plasmid containing the DNA sequence I as claimed in claim 8, inserted in an Eco RI cleavage site.

13. A host cell containing a vector as claimed in claim 11.

14. A host cell containing a vector as claimed in claim 12.

15. A host cell as claimed in claim 13, which is of the species E. coli.

16. A host cell as claimed in claim 14, which is of the species E. coli.