GENE EXPRESSION SYSTEM
Field of the invention
The invention relates to the expression of DNA, genes, cDNAs, proteins, peptides and parts thereof in the nematode worm C. elegans. In particular, the invention relates to methods of improving the translation of RNAs transcribed in C. elegans using a bacteriophage polymerase by introduction of a trans-splice recognition site recognised by an SL1 trans-splice recognition sequence into the DNA template transcribed by the bacteriophage polymerase.
Background to the invention
Eukaryotic versus prokaryotic expression.
Bacteriophage RNA polymerases, such as T7, T3 and SP6, and their corresponding promoters have been used extensively to drive the expression of heterologous genes in a variety of organisms. In co-pending International patent application No. WO 00/01846, Plaetinck et al. describe the use of the T7 system to express DNA, genes, cDNA, proteins and peptides or parts thereof and for the expression of double-stranded RNA (dsRNA) in the nematode model system C. elegans.
The bacteriophage expression systems are well known in the art for use in prokaryotic host cells, such as E. coli, and have the advantage that they provide simple and strong expression systems dependent only on one RNA polymerase and one well defined promoter. The application of such efficient expression systems in eukaryotic organisms is, however, not straightforward, mainly because messenger RNAs from eukaryotes and prokaryotes have a different structure, which has implications for translation efficiency and RNA stability.
Messenger RNAs of higher eukaryotes share a functionally essential 5' CAP structure. This structure is generated during a capping reaction that is linked exclusively to RNA polymerase II transcription. Prokaryotic RNA polymerases such as bacteriophage T3, T7 and SP6 polymerases do not provide messenger RNAs with such a CAP structure, leading to inefficient translation in eukaryotic systems (Fuerst et al., J. Mol. Biol. 206:333-348 (1989)).
One way to improve translation of uncapped mRNAs in eukaryotic systems is by the insertion of an internal ribosome entry site (IRES) sequence 5' of the coding sequence. For example, Elroy-Stein et al., Proc. Natl. Acad. Sci. USA 87:6743-6747 (1990), describe the cloning of the untranslated region of encephalomyocarditis virus (EMCV) downstream of the T7 promoter in order to enhance the efficiency of translation. In other systems translation of T7-derived transcripts may be enhanced by addition of a CAP structure derived from a capped transcript. For example, in Trypanosoma a 5' CAP structure is added to T7-generated RNA transcripts by a naturally occurring trans-splicing reaction (Wirtz et al., NAR 22:3887-3894 (1994)).
Trans-splicing in C. elegans.
In C. elegans many mRNAs contain an identical short leader sequence, designated the spliced leader (SL). This splice leader is donated by a small RNA (SL RNA) via a trans-splicing reaction. This trans-splicing was first observed by Krause et al., Cell 49:753-61 (1987). The splice leader RNA exists as a small nuclear ribonucleoprotein particle and has the trimethylguanosine cap that is characteristic of eukaryotic small nuclear RNAs. The trimethylguanosine cap present on the spliced leader RNA is transferred to the pre-mRNA during the trans-splicing reaction. Thereafter, the trimethylguanosine cap is maintained on the mature RNA (Van Doren et al., Mol. Cell. Biol. 10:1769-1772 (1990)). The trans-splicing signal for such a splice leader is essentially an intron missing only the 5' splice site, designated an 'outron'. An outron has essentially all the intron sequence including a trans-splice acceptor site homologous to a UUUCAG sequence preceded by an AU-rich region (Conrad et al., NAR 21:913-919 (1993)). Introduction of an outron into the 5' untranslated region of a C. elegans gene converts it to a trans-spliced gene (Conrad et al., EMBO J. 12:1249-1255 (1993); Conrad et al., Mol. Cell. Biol. 11:1921-1926 (1991)) and introduction of donor sites in a naturally trans-spliced C. elegans gene prevents trans-splicing and converts it into a more conventional gene.
Description of the invention.
Until recently, expression of heterologous and homologous genes in C. elegans was mainly achieved by linking an appropriate coding sequence to a selected C. elegans promoter. The present inventors have recently demonstrated that recombinant gene expression in C. elegans can be based on the prokaryotic T7 expression system (WO 00/01846). However, the present inventors found that the expression system was far from efficient, or at least that the resulting expression was much lower than would be expected from this T7-related expression system. It was concluded that this low expression was mainly due to RNA instability or translation arrest. Furthermore, it was reasoned that fundamental differences between prokaryotic and eukaryotic expression systems, particularly the requirement for capping of the 5' end of the mRNA for efficient translation in eukaryotic systems, were the main reason for this unexpectedly low expression. The inventors have now developed a solution to the problem of the inefficiency of the T7 system in eukaryotic host cells and organisms, particularly in C. elegans, and have constructed a generally applicable expression system which allows for the efficient expression of genes, DNA, cDNA, peptides and proteins under the regulation of the T7 promoter in C. elegans.
Therefore, in accordance with a first aspect of the invention there is provided a DNA construct comprising a bacteriophage promoter operably linked to an outron sequence.
It is an essential feature of the DNA construct of the invention that the bacteriophage promoter and the outron sequence are "operably linked", that is to say they are arranged in a relationship permitting them to function in their intended manner. In this case, the bacteriophage promoter is positioned upstream of the outron sequence such that it is capable of promoting transcription of the outron sequence upon binding of an appropriate RNA polymerase, with the outron sequence forming the extreme 5' end of the resulting transcript.
The DNA construct may further comprise at least one restriction enzyme recognition site positioned downstream of and proximal to the outron sequence.
Advantageously, the DNA construct may contain multiple restriction sites forming a multi-cloning site. The purpose of the restriction site/multi-cloning site is to facilitate cloning of a heterologous or homologous DNA fragment downstream of the outron sequence. A DNA construct comprising a bacteriophage promoter, an outron sequence and a restriction site/multi-cloning site may therefore be referred to hereinafter as an 'outron cloning construct'.
In an outron cloning construct it is advantageous for the restriction site/multi-cloning site to be positioned fairly proximal to the outron sequence (e.g. within 100 bp) such that a heterologous or homologous sequence inserted at this site may be co-transcribed with the outron sequence on a single mRNA. However, further sequence elements may be interposed between the outron sequence and the restriction site/multi-cloning site. For example, the general purpose vector pDW3123 described in the accompanying examples has a synthetic intron A sequence between the outron sequence and the multi-cloning site.
In one preferred embodiment of the invention, the DNA construct is a replicable cloning vector, such as, for example, a plasmid vector. In addition to the bacteriophage promoter, outron sequence and optional restriction site/multi-cloning site, the vector may further contain one or more of the general features commonly found in cloning vectors, for example an origin of replication to allow autonomous replication within a host cell and a selective marker, such as an antibiotic resistance gene. Although not essential, the vector may also contain a poly-adenylation signal to stabilize and process the 3' end of the mRNA transcribed from the bacteriophage promoter. A preferred example is the 3'UTR from the C. elegans unc-54 gene, but any other 3'UTR or polyadenylation signal may be used.
Outron-containing DNA constructs according to the invention may easily be constructed from the component sequence elements using standard recombinant techniques well known in the art and described, for example, in F. M. Ausubel et al. (eds.), Current Protocols In Molecular Biology, John Wiley & Sons, Inc. (1994).
Outron sequences for use in the constructs of the invention may be isolated from natural C. elegans genes using standard molecular biology techniques. For example, a natural outron sequence might be amplified using the polymerase chain reaction or an equivalent amplification technique using C. elegans genomic DNA as a template. Alternatively, synthetic outron sequences may be synthesised, for example, by annealing two complementary single stranded oligonucleotides, as illustrated in the accompanying examples. Once a DNA fragment comprising the outron sequence has been obtained it would be a matter of routine to assemble an outron construct by linking the outron in the correct orientation relative to the bacteriophage promoter.
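The oligonucleotide-annealing route can be illustrated with a short sketch. The Python below is a minimal check, under stated assumptions, that two synthetic strands pair exactly over their core and leave 5' overhangs for directional cloning. The sequences are hypothetical placeholders and are not o-GN59 or o-GN60; the 4 nt overhangs CTAG and CCGG are assumed here as the sticky ends left by XbaI and XmaI, the enzymes named in Example 1.

```python
# Hypothetical illustration only: these are NOT the o-GN59 / o-GN60 sequences.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def five_prime_overhangs(top, bottom, overhang_len=4):
    """Both strands are written 5'->3'.

    Returns (top_overhang, bottom_overhang) if the strands pair exactly
    once the 5' overhangs are set aside, otherwise None.
    """
    top, bottom = top.upper(), bottom.upper()
    if top[overhang_len:] != reverse_complement(bottom[overhang_len:]):
        return None
    return top[:overhang_len], bottom[:overhang_len]

# Placeholder core: an AT-rich stretch ending in an acceptor-like TTTCAG motif.
core = "AATTT" * 4 + "TTTCAG"
top = "CTAG" + core                         # assumed XbaI-compatible 5' overhang
bottom = "CCGG" + reverse_complement(core)  # assumed XmaI-compatible 5' overhang

print(five_prime_overhangs(top, bottom))    # ('CTAG', 'CCGG')
```

Because the two overhangs differ, an annealed duplex of this kind can ligate into a doubly digested vector in only one orientation, which is the point of linking the outron "in the correct orientation" relative to the promoter.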
The sequences of the commonly used bacteriophage promoters, e.g. T7, T3 and SP6, are well known in the art and oligonucleotides containing functional phage promoter sequences can be readily synthesised using standard oligonucleotide synthesis techniques. It would be a matter of routine to insert such a synthetic promoter sequence into, for example, a plasmid vector backbone containing, for example, an origin of replication, a selective marker and a suitable restriction site. Alternatively, one of the many plasmid vectors containing bacteriophage promoter sequences known in the art may be used as the starting point for the construction of a plasmid-based outron cloning vector. The known vectors generally contain, in addition to the phage promoter sequence, one or more restriction sites conveniently positioned downstream of the phage promoter and also a bacterial origin of replication and a selective marker. Once the vector backbone is in place the outron sequence may simply be inserted in the appropriate position downstream of the bacteriophage promoter.
In a particularly useful embodiment the invention provides a DNA construct for use in bacteriophage promoter-driven expression of a polypeptide in a eukaryotic host cell or organism. This construct comprises a bacteriophage promoter operably linked to a DNA sequence such that it is capable of initiating transcription of the DNA sequence upon binding of an appropriate RNA polymerase to the promoter, wherein the aforesaid DNA sequence comprises an outron sequence and at least one open reading frame positioned downstream of the outron sequence.
The open reading frame may be essentially any protein-encoding DNA sequence bounded by start and stop codons. This protein-encoding DNA sequence may include introns, as both trans-splicing and cis-splicing can occur together.
A DNA construct according to this embodiment of the invention, which may be referred to hereinafter as an 'outron expression construct', may be derived from an outron cloning construct by insertion of a heterologous or homologous protein-encoding DNA fragment into the restriction site/multi-cloning site. It is essential that the heterologous or homologous DNA fragment be inserted downstream of the outron sequence such that the two sequences may be co-transcribed, with the outron sequence forming part of the 5' untranslated region of the resulting mRNA.
The outron expression construct may advantageously form an expression vector, such as, for example, a plasmid vector. Most preferably, the expression vector will be one suitable for use in the nematode worm C. elegans. In addition to the bacteriophage promoter, outron sequence and protein-encoding DNA sequence (open reading frame), the expression vector may further contain one or more of the general features commonly found in expression vectors, for example an origin of replication to allow autonomous replication within a bacterial host cell and a selective marker, such as an antibiotic resistance gene. The vector may also contain a poly-adenylation signal to stabilize and process the 3' end of the mRNA transcribed from the bacteriophage promoter. A preferred example is the 3'UTR from the C. elegans unc-54 gene, but any other 3'UTR or polyadenylation signal may be used. An additional element, such as for example a synthetic intron, may be interposed between the outron sequence and the open reading frame.
It is important that the open reading frame is positioned downstream of and proximal to the outron sequence in the expression construct such that (i) the two elements are co-transcribed to form a single mRNA and (ii) the outron sequence forms part of the 5' untranslated region of the mRNA. If the appropriate splicing machinery and a supply of SL RNAs are provided by the eukaryotic host cell or organism then the uncapped 5' end of the pre-mRNA transcribed from the expression construct will be replaced with a capped splice leader via the trans-splicing reaction. This will greatly increase the efficiency of translation in a eukaryotic host system.
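The positional requirements described above (promoter upstream of the outron, outron upstream of the open reading frame) can be expressed as a simple ordering check. The following Python is a minimal sketch; the `Element` record, the element names and all coordinates are hypothetical and are not taken from pDW3123 or pDW3124.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    name: str   # e.g. "promoter", "outron", "orf" (illustrative labels)
    start: int  # 0-based start on the transcribed strand
    end: int    # exclusive end

def is_outron_expression_layout(elements):
    """True if promoter, outron and ORF occur in that order without overlap."""
    by_name = {e.name: e for e in elements}
    try:
        promoter, outron, orf = (by_name[n] for n in ("promoter", "outron", "orf"))
    except KeyError:
        return False  # a required element is missing
    return promoter.end <= outron.start and outron.end <= orf.start

# Hypothetical coordinates, for illustration only.
good = [Element("promoter", 0, 19), Element("outron", 25, 90), Element("orf", 140, 860)]
bad = [Element("promoter", 0, 19), Element("orf", 25, 745), Element("outron", 760, 825)]

print(is_outron_expression_layout(good))  # True
print(is_outron_expression_layout(bad))   # False
```

The check deliberately tolerates a gap between the outron and the open reading frame, since the description allows an additional element such as a synthetic intron to be interposed there.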
The use of an outron sequence at the extreme 5' end of the RNA provides a solution to the problem of reduced expression efficiency in eukaryotic systems wherever the type of promoter/polymerase used to drive gene expression leads to the production of uncapped transcripts, provided that the host cell or organism produces the spliced leader RNAs required for the trans-splicing reaction.
Outron sequences which may be utilised in accordance with the invention include naturally occurring outron sequences isolated from SL1-specific C. elegans genes (Conrad, R. Functional analysis of a C. elegans trans-splice acceptor. Nucleic Acids Res. 1993, 21(4), pp913-919; Conrad, R. SL1 trans-splicing specified by AU-rich synthetic RNA inserted at the 5' end of Caenorhabditis elegans pre-mRNA. RNA. 1995, 1(2), pp164-170) and also synthetic outron sequences which are functionally equivalent to the natural C. elegans outron sequences, including variants of naturally occurring C. elegans outrons. The phrase "functionally equivalent" means that the synthetic outron is recognised by the C. elegans trans-splicing machinery and can be trans-spliced to a C. elegans splice leader RNA, preferably the SL1 splice leader. Experimental evidence indicates that trans-splicing in C. elegans is signalled by an AU-rich intron-like sequence followed by a splice acceptor site (Conrad et al. 1993 and 1995). For the purposes of the present application the terms "outron" or "outron sequence" should be interpreted as referring to both the AU-rich region from the 5' end of the pre-mRNA to the trans-splice acceptor site and the trans-splice acceptor site itself. In connection with the DNA constructs of the invention, the terms "outron" and "outron sequence" refer to features present in the DNA which encodes the pre-mRNA.
The consensus splice acceptor site for trans-splicing of outrons and the consensus 3' splice acceptor site for cis-splicing of introns are essentially identical (UUUCAG). Moreover, a normally trans-spliced acceptor site can be efficiently cis-spliced when a donor splice site is inserted upstream within the outron sequence. It is therefore important that the outron constructs described herein do not contain any potential splice donor sequence upstream of the splice acceptor within the outron and downstream of the transcription start site such that it will be transcribed in the mRNA encoded by the construct. If such a site were present then there would be a potential for cis-splicing rather than trans-splicing.
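A screen for donor-like motifs upstream of the acceptor can be sketched as follows. This is an illustration under explicit assumptions: the pattern `GT[AG]AGT` is a generic donor-like consensus chosen here as a placeholder and is not taken from this description, the acceptor is searched as the DNA motif TTTCAG, and both test sequences are hypothetical.

```python
import re

def donor_like_sites(transcribed, acceptor="TTTCAG", donor_pattern=r"GT[AG]AGT"):
    """Return start positions of donor-like motifs upstream of the first acceptor.

    `transcribed` is the DNA sense-strand sequence from the transcription
    start site onward. The donor pattern is an assumed placeholder.
    """
    transcribed = transcribed.upper()
    acceptor_pos = transcribed.find(acceptor)
    if acceptor_pos == -1:
        raise ValueError("no acceptor motif found")
    upstream = transcribed[:acceptor_pos]
    return [m.start() for m in re.finditer(donor_pattern, upstream)]

# Hypothetical sequences, for illustration only.
clean = "AATTT" * 10 + "TTTCAG"
risky = "AATTTGTAAGTAATTT" + "AATTT" * 7 + "TTTCAG"

print(donor_like_sites(clean))  # []
print(donor_like_sites(risky))  # [5]
```

An empty list is the desired outcome for an outron construct; any hit marks a position where the transcript could, in principle, be cis-spliced instead of trans-spliced.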
It has also been observed that the overall length of the outron has an effect on the efficiency of trans-splicing, longer outrons in general working better than shorter ones (Conrad et al. 1995). Advantageously, the outron sequences for inclusion into the outron constructs described herein should be greater than about 50 nt in length.
A synthetic outron containing an AT stretch and a TTTTCAG sequence has been shown to be functional in C. elegans. As illustrated in the accompanying Examples, the insertion of an outron sequence into the 5' untranslated region of a GFP reporter construct, downstream of the promoter and upstream of the GFP open reading frame, is required for optimal expression of bacteriophage RNA polymerase transcribed reporter gene mRNA in C. elegans.
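The features attributed to a functional outron in this description (an AT-rich stretch ending in an acceptor motif, with length also reported to matter) can be summarised for a candidate sequence with a short script. The Python below is a minimal sketch; the candidate sequence is a hypothetical placeholder and is not the synthetic outron of Figure 1, and the function only reports values rather than predicting trans-splicing efficiency.

```python
def outron_features(five_prime_region, acceptor="TTTCAG"):
    """Summarise a candidate outron in DNA form.

    Looks for the first acceptor motif and reports its position, the length
    of the outron up to and including the acceptor, and the A+T fraction of
    the region upstream of the acceptor. Returns None if no motif is found.
    """
    dna = five_prime_region.upper()
    pos = dna.find(acceptor)
    if pos == -1:
        return None
    upstream = dna[:pos]
    at_count = sum(base in "AT" for base in upstream)
    return {
        "acceptor_start": pos,
        "outron_length": pos + len(acceptor),
        "upstream_at_fraction": at_count / len(upstream) if upstream else 0.0,
    }

# Hypothetical candidate: 50 nt of A/T, an acceptor motif, then the start of an ORF.
candidate = "AATTT" * 10 + "TTTCAG" + "ATGAGTAAAGGA"
features = outron_features(candidate)
print(features)
# {'acceptor_start': 50, 'outron_length': 56, 'upstream_at_fraction': 1.0}
```

Here the reported length of 56 nt would satisfy the "greater than about 50 nt" preference stated above.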
Suitable bacteriophage promoters which may be used in the DNA constructs according to the invention include T7, T3 and SP6 promoters, with T7 being the most preferred. As discussed above, these bacteriophage promoters have long been known to be useful tools in molecular biology since they can provide simple and strong expression systems dependent only on the binding of the specific or cognate RNA polymerase.
In a still further aspect, the invention provides a method for expressing a recombinant polypeptide in C. elegans, which method comprises: introducing an outron expression construct, as described above, said construct being an expression vector suitable for use in C. elegans, into a C. elegans strain which expresses an RNA polymerase specific for the bacteriophage promoter present in said DNA construct in one or more tissues or cell types.
An outron expression vector for use in this method may be constructed by inserting DNA encoding the polypeptide of interest into an outron cloning vector, as described above. The vector must be one which is suitable for use in C. elegans; plasmid-based vectors are the most preferred.
The C. elegans worms are preferably transgenic worms carrying a transgene capable of expressing the RNA polymerase in one or more tissues or cell types. The term "transgene capable of expressing" as used herein means a nucleic acid molecule comprising a nucleotide sequence encoding the polymerase operably linked to a promoter. The promoter may be any promoter which functions in C. elegans and may be general (i.e. active in substantially all tissues and cell types), tissue-specific, cell type-specific, constitutive, inducible etc. Most preferably, the promoter will exhibit tissue or cell type-specificity. With the use of a tissue or cell type-specific promoter of the appropriate specificity it is possible to control the site of RNA polymerase expression within C. elegans and hence control the site of expression of the recombinant polypeptide. Methods for the construction of transgenic C. elegans worms are known in the art and are particularly described by Craig Mello and Andrew Fire, Methods in Cell Biology, Vol 48, Ed. H.F. Epstein and D.C. Shakes, Academic Press, pages 452-480.
In a further aspect the invention provides a kit for use in recombinant expression of a polypeptide in C. elegans, the kit comprising an outron cloning construct, as described above, and optionally a supply of C. elegans nematode worms expressing an RNA polymerase specific for the bacteriophage promoter present in the said outron cloning construct in one or more tissues or cell types.
The kit might further contain control inserts and control constructs, e.g. reporter gene inserts and constructs which could be used to check the efficiency of cloning steps and transfection steps, respectively. It might also contain constructs which may be used as selectable markers in the transfection procedure, e.g. a rol-6 plasmid (see below).
The invention further provides methods for the construction of transgenic C. elegans expressing a recombinant polypeptide in one or more tissues or cell types. One such method comprises introducing an outron expression construct, as described above, said construct being an expression vector suitable for use in C. elegans comprising an open reading frame encoding the desired recombinant polypeptide, into a C. elegans strain which expresses an RNA polymerase specific for the bacteriophage promoter present in said DNA construct in one or more tissues or cell types, and isolating transgenic C. elegans lines which stably express the said polypeptide. The C. elegans strain expressing the polymerase is preferably a transgenic strain carrying a transgene capable of expressing the RNA polymerase in one or more tissues or cell types, as described above. As aforesaid, transgenic C. elegans lines can readily be constructed using standard techniques well known in the art.
In an alternative approach, the method may comprise introducing into a background C. elegans strain (i) an outron expression construct, as described above, said construct being an expression vector suitable for use in C. elegans comprising an open reading frame encoding the desired recombinant polypeptide, and (ii) a DNA construct suitable for expression of an RNA polymerase specific for the bacteriophage promoter present in the outron expression construct in one or more tissues or cell types of C. elegans, and isolating transgenic C. elegans lines which stably express the said polypeptide. The second DNA construct may, advantageously, be an expression vector comprising a nucleotide sequence encoding the polymerase operably linked to a promoter having the appropriate tissue or cell type specificity.
In carrying out the methods of the invention one may employ standard techniques well known in the art for construction and selection of transgenic C. elegans lines. Such techniques are described, for example, in Methods in Cell Biology, vol 48; Caenorhabditis elegans: Modern Biological Analysis of an Organism, ed. Epstein and Shakes, Academic Press, 1995. Foreign DNA (e.g. plasmid DNA) may be introduced into C. elegans using microinjection or ballistic transformation, as described in the applicant's co-pending International patent application No. WO 99/49066. In order to facilitate the selection of transgenic strains a marker plasmid may be co-introduced with the transgenes. A typical example is the plasmid pRF4 (Mello, C. C. et al., EMBO J. 10, 3959-3970 (1991)) which carries the rol-6 gene. C. elegans expressing rol-6 can be identified by screening for the roller phenotype. Any other C. elegans dominant selectable phenotypic marker, of which there are many known in the art, may be used to facilitate selection of transgenic lines. A useful example is green fluorescent protein (or any of the equivalent autonomous fluorescent proteins known in the art).
In a still further aspect the invention provides transgenic C. elegans worms which contain an outron expression construct, as described above, said construct being an expression vector suitable for use in C. elegans, and which further express an RNA polymerase specific for the bacteriophage promoter present in the outron expression construct in one or more tissues or cell types.
The present invention will be further understood with reference to the following non-limiting Examples, together with the accompanying drawings in which:
Figure 1 illustrates the construction of a T7-outron-GFP vector. (A) sequence of the synthetic outron produced by annealing oligonucleotides o-GN59 and o-GN60. (B) summary of the strategy used to construct vector pDW3124.
Figure 2 shows plasmid maps for pDW3123 (outron cloning vector) and pDW3124 (outron expression vector for GFP expression).
Figure 3 is a plasmid map of pGN148 which contains a T7 RNA polymerase coding sequence under the regulation of the C. elegans SERCA promoter.
Figure 4 illustrates the nucleotide sequence of pGN148.
Figure 5 illustrates the nucleotide sequence of pDW3123 annotated to show the positions of the T7 promoter, outron, synthetic intron A, multi-cloning site and unc-54 3' UTR sequences and also the ampicillin resistance gene.
Figure 6 illustrates the nucleotide sequence of pDW3124 annotated to show the positions of the T7 promoter, outron, synthetic intron A, GFP with introns and unc-54 3' UTR sequences and also the ampicillin resistance gene.
Example 1 - Construction of a T7-outron-GFP containing vector (pDW3124)
An SL1 trans-splice acceptor site (outron) was cloned into a vector downstream of the T7 promoter and upstream of the GFP to be expressed.
A synthetic outron consisting of two partially overlapping oligonucleotides (o-GN59 and o-GN60, see Figure 1) was inserted into an XbaI/XmaI digested T7 promoter GFP construct. Briefly, 25 μl o-GN59 and 25 μl o-GN60 (100 μM) were denatured for 5 minutes at 94°C, annealed for 30 minutes at 68°C then cooled to 4°C. 1 μl of XmaI/XbaI digested pDW3120 and 10 μl of the annealed oligos were then ligated using T4 ligase overnight at 16°C, transformed into competent E. coli and analysed by restriction digestion and DNA sequencing, all according to standard molecular biology procedures. The resulting vector was designated pDW3124 (Figures 1 and 2).
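The quantities in this annealing and ligation step can be checked with simple arithmetic, using the identity that microlitres multiplied by micromolar gives picomoles. The Python below is a worked sketch under two assumptions that the text does not state explicitly: that the 100 μM figure is the stock concentration of each oligonucleotide, and that annealing goes to completion.

```python
def picomoles(volume_ul, conc_um):
    """Amount in pmol: microlitres x micromolar = picomoles."""
    return volume_ul * conc_um

stock_um = 100.0      # assumed: stock concentration of each oligonucleotide
vol_each_ul = 25.0    # volume of each oligonucleotide in the annealing mix

pmol_each = picomoles(vol_each_ul, stock_um)       # amount of each strand
mix_volume_ul = 2 * vol_each_ul                    # total annealing volume
duplex_um = pmol_each / mix_volume_ul              # duplex concentration if fully annealed
pmol_in_ligation = picomoles(10.0, duplex_um)      # duplex carried into the ligation

print(pmol_each)         # 2500.0
print(duplex_um)         # 50.0
print(pmol_in_ligation)  # 500.0
```

On these assumptions the ligation receives about 500 pmol of annealed duplex, a large molar excess over the digested vector, which is typical when short annealed oligonucleotides are used as the insert.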
The outron contains an AU-rich sequence followed by a splice-acceptor site as described by Conrad et al., NAR 21:913-919 (1993) (see Figure 1).
Example 2 - Construction of a T7-outron MCS vector
A general purpose vector was constructed to facilitate expression of other DNA sequences in C. elegans under the control of the T7 promoter. This was done by digesting vector pDW3124 with HindII (position 179) and PvuII (position 1029) (partial digest) and re-ligating the blunt ends, resulting in vector pDW3123 (Figure 2).
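The effect of this digest and re-ligation on plasmid size can be sketched as a deletion between two coordinates. The Python below is illustrative only: it treats the stated positions 179 and 1029 as the exact blunt cut points (an assumption, since the text gives site positions rather than cut coordinates), and it uses a placeholder sequence of hypothetical length rather than the actual pDW3124 sequence.

```python
def delete_and_religate(plasmid, cut_a, cut_b):
    """Remove the segment between two blunt cut points and rejoin the ends.

    The circular plasmid is represented as a linear string whose origin
    lies outside the deleted segment.
    """
    if not 0 <= cut_a <= cut_b <= len(plasmid):
        raise ValueError("cut positions must lie within the plasmid, in order")
    return plasmid[:cut_a] + plasmid[cut_b:]

# Placeholder sequence; the 4000 bp length is hypothetical.
parent = "N" * 4000
product = delete_and_religate(parent, 179, 1029)

print(len(parent) - len(product))  # 850
```

Under these assumptions the re-ligated product is 850 bp shorter than the parent vector.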
Example 3 - The expression of heterologous genes in C. elegans regulated by the T7 promoter requires trans-splicing.
Wild-type C. elegans nematodes were co-injected with various combinations of the following test plasmids:
1) GFP reporter plasmid
GFP: pDW2020
outron-GFP: pDW2024
T7 promoter-GFP: pDW3120
T7 promoter-outron-GFP: pDW3124
2) T7 polymerase expression plasmid
SERCA T7 polymerase: pGN148
together with pRF4 (rol-6) as marker.
For every co-injection experiment, a total concentration of 200 ng DNA/μl was used (plasmid concentration was 50 ng/μl and carrier DNA was added up to 200 ng/μl). For every co-injection approximately 15 adult worms were injected.
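The composition of the injection mix follows from simple subtraction. The Python below is a minimal sketch that assumes the 50 ng/μl figure applies to each plasmid in the mix, a reading the text does not state explicitly; the function name and the example plasmid counts are illustrative.

```python
def carrier_concentration(n_plasmids, per_plasmid=50.0, total=200.0):
    """Carrier DNA (ng/ul) needed to bring the mix up to the total concentration.

    Assumes each plasmid is present at `per_plasmid` ng/ul.
    """
    used = n_plasmids * per_plasmid
    if used > total:
        raise ValueError("plasmids alone exceed the total DNA concentration")
    return total - used

# Reporter + polymerase plasmid + rol-6 marker: three plasmids.
print(carrier_concentration(3))  # 50.0
# Reporter + rol-6 marker only (no polymerase plasmid): two plasmids.
print(carrier_concentration(2))  # 100.0
```

Holding the total DNA concentration constant in this way keeps the injected DNA load comparable between experiments with and without the polymerase plasmid.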
F1 offspring showing the marker rol-6 phenotype were isolated and then selected for further study. The next generation (F2) of the roller lines were screened for GFP expression in the pharynx, vulva, tail and body wall muscles. These are the tissues in which the bacteriophage T7 RNA polymerase is known to be expressed when under the control of the C. elegans SERCA promoter (as in the construct pGN148).
The results are shown in Table 1 below, which indicates the number of lines expressing GFP vs total number of lines isolated.
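[Table 1: cell contents not recovered; cells are referred to in the text by column letter and row number]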
* GFP expression is most probably the result of recombination in the extrachromosomal array.
No GFP expression was observed in the experiments where the T7 RNA polymerase was absent (cells B2, C2, D2, E2).
In the experiments where the T7 RNA polymerase expressing vector was co-injected with GFP vectors without a T7 promoter, as in the cells B3 and C3, GFP expression was sometimes observed. This is probably due to recombination events in the extrachromosomal arrays, resulting in transcription of GFP directly from the SERCA promoter.
In the experiments where the T7 promoter-GFP construct and the SERCA T7 RNA polymerase were co-injected, no GFP expression could be observed (cell D3). In contrast, all of the lines isolated from the experiments where the GFP transcript contained an outron at its 5' end (n=13) expressed GFP (cell E3). The outron is a favourable target for SL1 trans-splicing. Since SL1 RNA molecules contain a 5' trimethylguanosine CAP structure which is transferred to the mature mRNA, this results in improved translation of the RNA and hence better expression of GFP. Without the outron the T7 RNA polymerase transcripts do not carry a CAP structure at their 5' end, leading to inefficient translation. The results of this experiment illustrate the importance of trans-splicing for efficient expression of heterologous and homologous genes transcribed by prokaryotic polymerases in C. elegans.
SEQUENCE LISTING
SEQ ID NO: 1 Oligonucleotide o-GN59
SEQ ID NO: 2 Oligonucleotide o-GN60
SEQ ID NO: 3 Plasmid pDW3123
SEQ ID NO: 4 Plasmid pDW3124
SEQ ID NO: 5 Plasmid pGN148