AU8021600B2


Info

Publication number: AU8021600B2
Authority: AU
Publication date: 2004-06-17

Description

WO 01/27299 PCT/US00/28431

TITLE

VECTORS AND METHODS FOR RECOMBINANT PROTEIN EXPRESSION

TECHNICAL FIELD OF THE INVENTION

The present invention relates to expression of recombinant proteins in eukaryotic cells.

BACKGROUND OF THE INVENTION

The development of expression systems for production of recombinant proteins is important for providing a source of a given protein for research or therapeutic use.

Expression systems have been developed for both prokaryotic cells, such as E. coli, and for eukaryotic cells, such as yeast (e.g., Saccharomyces, Pichia and Kluyveromyces spp.) and mammalian cells. Expression in mammalian cells is often preferred for manufacturing of therapeutic proteins, since post-translational modifications in such expression systems are more likely to resemble those occurring on endogenous proteins in a mammal than the type of post-translational modifications that occur in microbial expression systems.

Several vectors are available for expression in mammalian hosts, each containing various combinations of cis- and in some cases trans- regulatory elements to achieve high levels of recombinant protein in a minimal time frame. However, despite the availability of numerous such vectors, the level of expression of a recombinant protein achieved in mammalian systems is often lower than that obtained with a microbial expression system.

Additionally, because only a small percentage of cloned, transfected mammalian cells express high levels of the protein of interest, it can often take a considerably longer time to develop useful stably transfected mammalian cell lines than it takes for microbial systems.

The use of a dicistronic expression vector wherein a first open reading frame encodes a polypeptide of interest and a second open reading frame encodes a selectable marker, is one method that has been used to obtain recombinant proteins. A preferred marker for use in such systems is dihydrofolate reductase (DHFR), which has the advantage of being an amplifiable gene, allowing selection for cells having high copy numbers of the inserted DNA by culturing them in increasing levels of methotrexate (MTX). However, translation of the selectable marker gene is up to 100-fold less efficient than translation of the gene of interest, which reduces the efficiency of the selection process. Moreover, dicistronic expression vectors tend to undergo deletion or rearrangement under amplification conditions, in an uncontrolled manner, increasing the chances that amplified cells will no longer express the protein of interest.

Internal ribosome entry sites (IRES) are a type of regulatory element found in several viruses and cellular RNAs (reviewed in McBratney et al. Current Opinion in Cell Biology 5:961, 1993). IRES increase the efficiency of translation of the selectable marker gene, and are thus useful in enhancing both the selection and amplification process (Kaufman et al., Nucleic Acids Res. 19:4485, 1991). Nonetheless, the available evidence indicates that dicistronic mRNAs accumulate to lower levels than monocistronic mRNAs, possibly because of reduced mRNA stability of the longer message.

Because the amount of recombinant protein produced by a transfected cell is generally proportional to the amount of mRNA available for translation of the protein, the use of dicistronic expression vectors may result in low levels of production of the desired recombinant protein. Accordingly, there is a need in the art to develop improved methods that retain the utility of a selectable, amplifiable marker such as DHFR, while increasing the proportion of mRNAs encoding the desired recombinant protein. Moreover, there is a need to develop methods that facilitate selection of those transfectants that integrate into more transcriptionally active sites, and that allow production of useful levels of recombinant protein from mammalian cells in a relatively short period of time.

The discussion of documents, acts, materials, devices, articles and the like is included in this specification solely for the purpose of providing a context for the present invention. It is not suggested or represented that any or all of these matters formed part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed in Australia before the priority date of each claim of this application.

SUMMARY OF THE INVENTION

The present invention provides an expression vector comprising, in the following order, a promoter sequence, a first coding sequence, a polyadenylation site, and a second coding sequence, wherein no promoter sequence occurs between the internal polyadenylation site and the second coding sequence.
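The element order recited above can be illustrated with a short check. This is only a minimal sketch and not part of the specification: the element labels and the function name `has_internal_polya_layout` are hypothetical, and a vector is assumed to be represented as a simple ordered list of labels.

```python
from typing import List

# Hypothetical labels; the specification names the elements but defines no data format.
PROMOTER, CDS, POLYA, IRES = "promoter", "coding_sequence", "polyA", "IRES"


def has_internal_polya_layout(elements: List[str]) -> bool:
    """Return True if the ordered elements contain promoter -> first coding
    sequence -> internal polyA site -> second coding sequence, with no
    promoter between the internal polyA site and the second coding sequence."""
    try:
        promoter_at = elements.index(PROMOTER)
        first_cds_at = elements.index(CDS, promoter_at + 1)
        internal_polya_at = elements.index(POLYA, first_cds_at + 1)
        second_cds_at = elements.index(CDS, internal_polya_at + 1)
    except ValueError:
        return False  # a required element is missing or out of order
    # Elements such as an IRES may sit between the internal polyA site and
    # the second coding sequence, but a promoter may not.
    return PROMOTER not in elements[internal_polya_at + 1:second_cds_at]


# Illustrative layouts (assumed, not taken from the Examples).
with_ires = [PROMOTER, CDS, POLYA, IRES, CDS, POLYA]
two_promoters = [PROMOTER, CDS, POLYA, PROMOTER, CDS, POLYA]
```

Under these assumptions, `has_internal_polya_layout(with_ires)` is true, while `two_promoters` fails the check because a second promoter lies between the internal polyadenylation site and the second coding sequence.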

In one embodiment of the invention, an expression vector comprises a DNA encoding a first protein, operably linked to a DNA encoding a second protein, wherein a DNA encoding a polyadenylation (polyA) site is inserted between the DNA encoding the first protein of interest and the DNA encoding the second protein, such that the DNA encoding the internal polyadenylation site is operably linked to the DNA encoding the first protein. A preferred second protein is a selectable marker, preferably dihydrofolate reductase (DHFR); other amplifiable markers are also suitable for use in the inventive expression vectors.

The present invention also provides an expression vector comprising, in the following order, a promoter sequence, a first coding sequence, a polyadenylation site consisting essentially of nucleotides 80 through 222 of SEQ ID NO:1, and a second coding sequence, wherein no promoter sequence occurs between the internal polyadenylation site and the second coding sequence.

Preferably, the polyadenylation signal utilized to provide the internal polyadenylation site is an SV40 polyadenylation signal, more preferably, the late SV40 polyadenylation signal, and most preferably, a mutant version of the late polyadenylation signal. The preferred polyadenylation signals are presented in the Sequence Listing and described further below. In another embodiment of the invention, the polyadenylation signal is inducible.

The expression vector may further comprise an IRES sequence between the DNA encoding the first protein, and the DNA encoding the second protein, operably linked to both and downstream of the internal polyadenylation site.

Alternatively, the expression vector may comprise mRNA splice donor and acceptor sites substantially as described by Lucas et al. infra.

Another aspect of the invention comprises an expression vector into which a DNA encoding a protein can be inserted. Such an expression vector comprises a site into which a DNA encoding a recombinant, heterologous protein can be inserted (referred to as a cloning site), such that it is operably linked to an internal polyadenylation site and a DNA encoding a second protein (such as a selectable marker). Optionally, other regulatory elements may also be included, for example, an IRES sequence downstream of the internal polyadenylation site, or mRNA splice donor and acceptor sites substantially as described by Lucas et al. infra, operably linked to the internal polyadenylation site and the DNA encoding the second protein. An expression-augmenting sequence element (EASE) may also be included upstream of the cloning site, operably linked thereto.

Host cells can be transfected with the inventive expression vectors, yielding stable pools of transfected cells. Accordingly, another embodiment of the invention provides a transfected host cell; yet another embodiment provides stable pools of cells transfected with the inventive expression vector. Also provided are cell lines cloned from pools of transfected cells. Preferred host cells are mammalian cells. In a most preferred embodiment, the host cells are CHO cells.

The invention also provides a method for obtaining a recombinant protein, comprising transfecting a host cell with an inventive expression vector, culturing the transfected host cell under conditions promoting expression of the protein, and recovering the protein. In a preferred application of this invention, transfected host cell lines are selected with two selection steps, the first to select for cells expressing the dominant amplifiable marker, and the second step for high expression levels and/or amplification of the marker gene as well as the gene of interest. In a most preferred embodiment, the selection or amplification agent is methotrexate, an inhibitor of DHFR that has been shown to cause amplification of endogenous DHFR genes and transfected DHFR sequences.

Throughout the description and claims of the specification the word "comprise" and variations of the word, such as "comprising" and "comprises", is not intended to exclude other additives, components, integers or steps.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a diagram of constructs prepared in Example 1. The construct in which the SV40 early polyadenylation signal was included was designated SPA6; that in which the late polyadenylation signal was included was designated SPA4. A control construct was designated BGH.

DETAILED DESCRIPTION OF THE INVENTION

Expression vectors that retain the utility of a selectable, amplifiable marker such as DHFR, while increasing the proportion of mRNAs encoding a desired recombinant protein, are provided herein. The inventive expression vectors comprise a polyadenylation signal inserted between a first coding sequence and a second or subsequent coding sequence (referred to as an internal polyadenylation site). In the inventive vectors, transcripts originating at the promoter can be polyadenylated following the first coding sequence (monocistronic message) or after the second or subsequent coding sequence (multicistronic message). In one embodiment of the invention, the first coding sequence encodes a protein of interest, and the second (or subsequent) coding sequence encodes a selectable marker. In another embodiment, a second polyadenylation site follows the second or subsequent coding sequence, and is operably linked thereto. In this embodiment, the internal polyadenylation site thus becomes the first polyadenylation site.

Because many transcripts encode only the gene of interest and not the selectable marker, the inventive vectors produce less selectable marker protein, and only those transfectants that integrate into more transcriptionally active sites survive the selection process. Accordingly, use of the inventive expression vectors facilitates isolation of transfected pools and clones that express high levels of recombinant protein using lower levels of a selection agent than is possible in the absence of the internal polyadenylation signal.
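The reasoning above can be made concrete with a small calculation. This is a minimal sketch under a simplifying assumption that is not stated in the specification: every transcript from the promoter is polyadenylated either at the internal site (giving a monocistronic message) or at the downstream site (giving a dicistronic message). The function name and the usage value are hypothetical.

```python
def transcript_fractions(internal_polya_usage: float) -> dict:
    """Split transcripts from one promoter into two classes, given the
    fraction that is polyadenylated at the internal site.

    Monocistronic messages encode only the protein of interest;
    dicistronic messages encode both it and the selectable marker."""
    if not 0.0 <= internal_polya_usage <= 1.0:
        raise ValueError("internal_polya_usage must be between 0 and 1")
    return {
        "monocistronic": internal_polya_usage,
        "dicistronic": 1.0 - internal_polya_usage,
    }


# Hypothetical usage value chosen only for illustration.
example = transcript_fractions(0.75)
```

In this illustration, three quarters of the transcripts would encode only the protein of interest, and only the remaining quarter could be translated into the selectable marker, which is why a transfectant would need a more transcriptionally active integration site to survive a given level of selection agent.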

An additional benefit of utilizing the inventive expression vectors is that monocistronic messages may be more stable or more efficiently processed than dicistronic messages, potentially leading to increased accumulation of the message encoding the protein of interest, and hence to higher levels of protein production. Use of the inventive internal polyadenylation site will thus facilitate production of useful levels of recombinant protein by transfected cells in a relatively short period of time.

The inventive vectors and methods will also be useful in developing multicistronic vectors. Multicistronic expression vectors allow the coordinated expression of two or more genes (see, for example, Fussenegger et al., Biotechnol Prog 13:733; 1997).

Inserting a polyadenylation site after a first cistron would result in high level expression of the first cistron and lower level expression of any following cistrons. Potential applications of this technology would be to facilitate expression of large amounts of a therapeutic protein (or other, desired recombinant proteins) and lower amounts of other proteins such as selectable markers, transcription factors, enzymes involved in protein folding, and other proteins that regulate cell metabolism and expression.

In another embodiment, the polyadenylation site is inserted after the second or third (or subsequent) cistron. This would allow high expression of the first two (or three or more) cistrons, followed by lower expression of the cistron following the internal polyadenylation site. This embodiment will find use, for example, in recombinant antibody synthesis where the heavy and light chains are synthesized independently at high levels. A tricistronic vector is constructed with the heavy and light chains encoded by the first two cistrons. The polyadenylation site is inserted following the second cistron, allowing high level expression of the first two cistrons. The selectable marker is expressed from the third cistron (i.e., after the polyadenylation site) and would be expressed at lower levels.

Expression of Recombinant Proteins

As used herein, the term 'expression vector' is understood to describe a vector that comprises various regulatory elements, described in detail below, that are necessary for the expression of recombinant, heterologous proteins in cells. The expression vector can include signals appropriate for maintenance in prokaryotic or eukaryotic cells, and/or the expression vector can be integrated into a chromosome.

Recombinant expression vectors may include a coding sequence encoding a protein of interest (or fragment thereof), ribozymes, ribosomal mRNAs, antisense RNAs and the like. Preferably, the coding sequence encodes a protein or peptide. The coding sequence may be synthetic, a cDNA-derived nucleic acid fragment or a nucleic acid fragment isolated by polymerase chain reaction (PCR). The coding sequence is operably linked to suitable transcriptional or translational regulatory elements derived from mammalian, viral or insect genes. Such regulatory elements include a transcriptional promoter, a sequence encoding suitable mRNA ribosomal binding sites, and sequences which control the termination of transcription and translation (e.g., a polyadenylation signal), as described in detail below.

Expression vectors may also comprise non-transcribed elements such as a suitable promoter and/or enhancer linked to the gene to be expressed, other 5' or 3' flanking nontranscribed sequences, 5' or 3' non-translated sequences such as ribosome binding sites, a polyadenylation site, splice donor and acceptor sites, and transcriptional termination sequences. An origin of replication that confers the ability to replicate in a host, and a selectable gene to facilitate recognition of transfectants, may also be incorporated.

DNA regions are operably linked when they are functionally related to each other.

For example, DNA for a signal peptide (secretory leader) is operably linked to DNA for a polypeptide if it is expressed as a precursor which participates in the secretion of the polypeptide; thus, in the case of DNA encoding secretory leaders, operably linked means contiguous and in reading frame. A promoter is operably linked to a coding sequence if it controls the transcription of the sequence; and a ribosome binding site is operably linked to a coding sequence if it is positioned so as to permit translation.

Dicistronic expression vectors used for the expression of multiple transcripts have been described previously (Kim S.K. and Wold, Cell 42:129, 1985; Kaufman et al., 1991, supra). Dicistronic expression vectors comprise two cistrons, or open reading frames, capable of encoding two proteins, for example, a recombinant protein of interest and a selectable marker. An example of such a dicistronic expression vector is pCAVDHFR, a derivative of pCD302 (Mosley et al., Cell 1989) containing the coding sequence for mouse DHFR (Subramani et al., Mol. Cell. Biol. 1:854, 1981). Another example of such a dicistronic expression vector is the pCDE vector, a derivative of pCAVDHFR containing the murine encephalomyocarditis virus internal ribosomal entry site (nucleotides 260 through 824; Jang and Wimmer, Genes and Dev. 4:1560, 1990) cloned between the adenovirus tripartite leader and the DHFR cDNA coding sequence. Other types of expression vectors will also be useful in combination with the invention, for example, those described in U.S. patents 4,634,665 (Axel et al.) and 4,656,134 (Ringold et al.).

The transcriptional and translational control sequences in expression vectors to be used in transfecting cells may be provided by viral sources. For example, commonly used promoters and enhancers are derived from Polyoma, Adenovirus 2, Simian Virus 40 (SV40), and human cytomegalovirus. Viral genomic promoters, control and/or signal sequences may be utilized to drive expression, provided such control sequences are compatible with the host cell chosen. Examples of such vectors can be constructed as disclosed by Okayama and Berg (Mol. Cell. Biol. 3:280, 1983). Non-viral cellular promoters can also be used (e.g., the beta-globin and the EF-1alpha promoters), depending on the cell type in which the recombinant protein is to be expressed.

DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early and late promoter, enhancer, splice, and polyadenylation sites may be used to provide the other genetic elements required for expression of a heterologous DNA sequence. The early and late promoters are particularly useful because both are obtained easily from the virus as a fragment which also contains the SV40 viral origin of replication (Fiers et al., Nature 273:113, 1978). Smaller or larger SV40 fragments may also be used, provided the approximately 250 bp sequence extending from the Hind III site toward the BglI site located in the viral origin of replication is included.

In dicistronic expression vectors, a polyadenylation site inserted downstream of, and operably linked to, the second cistron (usually, a DNA encoding a selectable marker), is often used to regulate transcription and translation. Many such polyadenylation signals are known (see for example, Table 1 below). The present invention utilizes an internal polyadenylation signal, e.g., one that is inserted between the two cistrons of a dicistronic expression vector, in addition to a polyadenylation signal or other suitable regulatory element downstream of the second cistron.

Both the early and late polyadenylation signals of SV40 are useful in the instant invention. These sequences are encoded within the 237-base pair fragment between the BamHI site at nucleotide 2533 and the BclI site at nucleotide 2770 of the SV40 genome (Carswell and Alwine, Mol. Cell. Biol. 9:4248; 1989). Carswell and Alwine concluded that, of the two SV40 polyadenylation signals, the late signal was more efficient, most likely because it comprises both downstream and upstream sequence elements that facilitate efficient cleavage and polyadenylation.

Many polyadenylation signals are known in the art, and will also be useful in the instant invention. Examples include those shown in Table 1 below.

Table 1: Polyadenylation Signals

Polyadenylation signal | Reference
SV40 late polyA and deletion mutants thereof | Schek, N, Cooke, and J.C. Alwine (1992): Definition of the upstream efficiency element of the simian virus 40 late polyadenylation signal by using in vitro analysis. Mol. Cell. Biol. 12:5386-5393
HIV-1 polyA | Klasens, Das, and B. Berkhout (1998): Inhibition of polyadenylation by stable RNA secondary structure. Nucleic Acids Res. 26:1870-1876
β-globin polyA | Gil, and N.J. Proudfoot (1987): Position-dependent sequence elements downstream of AAUAAA are required for efficient rabbit β-globin mRNA formation. Cell 49:399-406
HSV TK polyA | Cole, C.N. and T.P. Stacy (1985): Identification of sequence in the herpes simplex virus thymidine kinase gene required for efficient processing and polyadenylation. Mol. Cell. Biol. 5:2104-2113
Polyomavirus polyA | Batt, D.B and G.G. Carmichael (1995): Characterization of the polyomavirus late polyadenylation signal. Mol. Cell. Biol. 15:4783-4790
Bovine growth hormone | Gimmi, Reff, and I. C. Deckman (1989): Alterations in pre-mRNA topology of the bovine growth hormone polyadenylation region decrease polyA site efficiency. Nucleic Acids Res. 17:6983-6998

Additional polyadenylation sites can be identified or constructed using methods that are known in the art. A minimal polyadenylation site is composed of AAUAAA and a second recognition sequence, generally a G/U rich sequence, found about 30 nucleotides downstream. In the Sequence Listing, the sequences are presented as DNA, rather than RNA, to facilitate preparation of suitable DNAs for incorporation into expression vectors.

When presented as DNA, the polyadenylation site is composed of AATAAA, with, for example, a G/T rich region downstream (see for example, nucleotides 123 through 128 and 151 through 187, respectively, of SEQ ID NO: 1).
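As an illustration of how a candidate site of this kind might be located in a DNA sequence, the following is a minimal sketch and not a method disclosed in the specification. The function names, the downstream window (10 to 50 nucleotides after the hexamer) and the 70% G/T threshold are assumptions chosen only for the example; real polyadenylation sites cannot be reliably predicted from these two features alone.

```python
from typing import List, Tuple

HEXAMER = "AATAAA"  # DNA form of the AAUAAA signal


def gt_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or T (0.0 for an empty string)."""
    return sum(base in "GT" for base in seq) / len(seq) if seq else 0.0


def find_candidate_polya_sites(
    dna: str,
    window_start: int = 10,   # assumed offset after the hexamer
    window_end: int = 50,     # assumed end of the downstream window
    min_gt: float = 0.7,      # assumed threshold for "G/T rich"
) -> List[Tuple[int, float]]:
    """Return (0-based hexamer position, G/T fraction of the downstream
    window) for each AATAAA followed by a G/T-rich window."""
    dna = dna.upper()
    hits = []
    pos = dna.find(HEXAMER)
    while pos != -1:
        hexamer_end = pos + len(HEXAMER)
        window = dna[hexamer_end + window_start:hexamer_end + window_end]
        frac = gt_fraction(window)
        if window and frac >= min_gt:
            hits.append((pos, frac))
        pos = dna.find(HEXAMER, pos + 1)  # continue past this match
    return hits


# Toy sequence constructed for the example; it is not SEQ ID NO:1.
toy = "CCCC" + HEXAMER + "C" * 10 + "GT" * 20 + "CCCC"
```

Calling `find_candidate_polya_sites(toy)` reports the single hexamer at position 4, whose downstream window consists entirely of G and T. A hexamer followed only by C residues would not be reported.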

Both sequences must be present to form an efficient polyadenylation site. The purpose of these sites is to recruit specific RNA binding proteins to the RNA. The AAUAAA binds cleavage polyadenylation specificity factor (CPSF; Murthy and Manley J.L. (1995), Genes Dev 9:2672-2683), and the second site, frequently a G/U sequence, binds to cleavage stimulatory factor (CstF; Takagaki Y. and Manley J.L. (1997) Mol Cell Biol 17:3907-3914). CstF is composed of several proteins, but the protein responsible for RNA binding is CstF-64, a member of the ribonucleoprotein domain family of proteins (Takagaki et al. (1992) Proc Natl Acad Sci USA 89:1403-1407).

The concentration of CstF-64 protein has been shown to be important in regulating usage of different polyadenylation sites in B-cells (Takagaki Y, Manley JL (1998) Mol Cell 2:761-771). Accordingly, an inducible polyadenylation site can be constructed based on this naturally occurring regulation of polyadenylation usage in B-cells, by controlling the interaction of CstF-64 with an mRNA of choice to induce polyadenylation. For example, the CstF-64 may be fused to the RNA binding domain of the MS2 phage coat protein, which binds a specific RNA sequence (ACAUGAGGAUUACCCAUGU; SEQ ID NO:4) distinct from the G/U rich element (Lowary and Uhlenbeck (1987) Nucleic Acids Res. 15:10483-10493). The target mRNA would contain an AAUAAA sequence and an MS2 coat protein RNA recognition sequence. By regulating the level of the MS2-CstF-64 fusion protein transcriptionally using standard inducible expression systems (for example, an Ecdysone-inducible mammalian expression system described by No et al. (1996) Proc Natl Acad Sci USA 93:3346-3351), the usage of the inducible polyA site could be controlled.

Polyadenylation may also be regulated by developing temperature-sensitive MS2 RNA binding domain mutants. MS2 RNA binding domain mutants may be generated using random mutagenesis, and screened for temperature sensitivity. When used as a fusion partner with CstF-64 as described above, the temperature-sensitive MS2 coat protein would be inactive and fail to bind RNA at 37°C; thus the internal polyA site would not function at this temperature. However, at reduced temperature, for example 32°C, the MS2 coat protein would be active, would recognize the RNA sequence target, and the message would be polyadenylated. Temperature regulation would be particularly useful for recombinant protein expression, since reducing the temperature of expression cultures is typically used to increase protein expression.

An additional technique that can be used in conjunction with the inventive vectors is described by Lucas et al. (Nucleic Acids Res. 24:1774; 1996). In an effort to increase production of a desired protein, Lucas et al. utilized mRNA splice donor and acceptor sites to develop stable clones that produced both a selectable marker and recombinant proteins. According to these investigators, the vectors they prepared resulted in the transcription of a high proportion of mRNA encoding the desired protein, and a fixed, relatively low level of the selection marker that allowed selection of stable transfectants.

Host Cells

Transfected host cells are cells which have been transfected (sometimes referred to as 'transformed') with heterologous DNA. Many techniques for transfecting cells are known; in one approach, cells are transfected with expression vectors constructed using recombinant DNA techniques and which contain sequences encoding recombinant proteins. Expressed proteins will preferably be secreted into the culture supernatant, but may be associated with the cell membrane, depending on the particular polypeptide that is expressed. Mammalian host cells are preferred for the instant invention. Various mammalian cell culture systems can be employed to express recombinant protein.

Examples of suitable mammalian host cell lines include the COS-7 lines of monkey kidney cells, described by Gluzman (Cell 23:175, 1981), CV-1/EBNA (ATCC CRL 10478), L cells, C127, 3T3, Chinese hamster ovary (CHO), HeLa and BHK cell lines.

A commonly used cell line is DHFR- CHO cells which are auxotrophic for glycine, thymidine and hypoxanthine, and can be transformed to the DHFR+ phenotype using DHFR cDNA as an amplifiable dominant marker. One such DHFR- CHO cell line, DXB11, was described by Urlaub and Chasin (Proc. Natl. Acad. Sci. USA 77:4216, 1980). Another example of a DHFR- CHO cell line is DG44 (see, for example, Kaufman, Meth. Enzymology 185:537 (1988)). Other cell lines developed for specific selection or amplification schemes will also be useful with the invention.

Numerous other eukaryotic cells will also be useful in the present invention, including cells from other vertebrates, and insect cells. Those of skill in the art will be able to select appropriate vectors, regulatory elements, transfection and culture schemes according to the needs of their preferred culture system.

Preparation of transfected mammalian cells

Several transfection protocols are known in the art, and are reviewed in Kaufman, supra. The transfection protocol chosen will depend on the host cell type and the nature of the protein of interest, and can be chosen based upon routine experimentation.

The basic requirements of any such protocol are first to introduce DNA encoding the protein of interest into a suitable host cell, and then to identify and isolate host cells which have incorporated the heterologous DNA in a stable, expressible manner.

One commonly used method of introducing heterologous DNA is calcium phosphate precipitation, for example, as described by Wigler et al. (Proc. Natl. Acad. Sci. USA 77:3567, 1980). DNA introduced into a host cell by this method frequently undergoes rearrangement, making this procedure useful for cotransfection of independent genes.

Polyethylene glycol-induced fusion of bacterial protoplasts with mammalian cells (Schaffner et al., Proc. Natl. Acad. Sci. USA 77:2163, 1980) is another useful method of introducing heterologous DNA. Protoplast fusion protocols frequently yield multiple copies of the plasmid DNA integrated into the mammalian host cell genome. This technique requires the selection and amplification marker to be on the same plasmid as the gene of interest.

Electroporation can also be used to introduce DNA directly into the cytoplasm of a host cell, as described by Potter et al. (Proc. Natl. Acad. Sci. USA 81:7161, 1988) or Shigekawa and Dower (BioTechniques 6:742, 1988). Unlike protoplast fusion, electroporation does not require the selection marker and the gene of interest to be on the same plasmid.

More recently, several reagents useful for introducing heterologous DNA into a mammalian cell have been described. These include Lipofectin® Reagent and Lipofectamine™ Reagent (Gibco BRL, Gaithersburg, MD). Both of these reagents are commercially available reagents used to form lipid-nucleic acid complexes (or liposomes) which, when applied to cultured cells, facilitate uptake of the nucleic acid into the cells.

Transfection of cells with heterologous DNA and selection for cells that have taken up the heterologous DNA and express the selectable marker results in a pool of transfected cells. Individual cells in these pools will vary in the amount of DNA incorporated and in the chromosomal location of the transfected DNA. After repeated passage, pools frequently lose the ability to express the heterologous protein. To generate stable cell lines, individual cells can be isolated from the pools and cultured (a process referred to as cloning), a laborious, time-consuming process. However, in some instances, the pools themselves may be stable (i.e., production of the heterologous recombinant protein remains stable). The ability to select and culture such stable pools of cells would be desirable as it would allow rapid production of relatively large amounts of recombinant protein from mammalian cells.

A method of amplifying the gene of interest is also desirable for expression of the recombinant protein, and typically involves the use of a selection marker (reviewed in Kaufman, supra). Resistance to cytotoxic drugs is the characteristic most frequently used as a selection marker, and can be the result of either a dominant trait (i.e., can be used independent of host cell type) or a recessive trait (i.e., useful in particular host cell types that are deficient in whatever activity is being selected for). Several amplifiable markers are suitable for use in the inventive expression vectors (for example, as described in Maniatis, Molecular Biology: A Laboratory Manual, Cold Spring Harbor Laboratory, NY, 1989; pgs 16.9-16.14).

Useful selectable markers for gene amplification in drug-resistant mammalian cells are shown in Table 1 of Kaufman, supra, and include DHFR-MTX resistance, P-glycoprotein and multiple drug resistance (MDR)-various lipophilic cytotoxic agents (e.g., adriamycin, colchicine, vincristine), and adenosine deaminase (ADA)-Xyl-A or adenosine and 2'-deoxycoformycin. Specific examples of genes that encode selectable markers are those that encode antimetabolite resistance such as the DHFR protein, which confers resistance to methotrexate (Wigler et al., 1980, Proc. Natl. Acad. Sci. USA 77:3567; O'Hare et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); the GPT protein, which confers resistance to mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072); the neomycin resistance marker, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin et al., 1981, J. Mol. Biol. 150:1); the Hygro protein, which confers resistance to hygromycin (Santerre et al., 1984, Gene 30:147); and the Zeocin™ resistance marker (available commercially from Invitrogen). In addition, the herpes simplex virus thymidine kinase (Wigler et al., 1977, Cell 11:223), hypoxanthine-guanine phosphoribosyltransferase (Szybalska & Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026), and adenine phosphoribosyltransferase (Lowy et al., 1980, Cell 22:817) genes can be employed in tk-, hgprt- and aprt- cells, respectively.

Other dominant selectable markers include microbially derived antibiotic resistance genes, for example neomycin, kanamycin or hygromycin resistance. However, these selection markers have not been shown to be amplifiable (Kaufman, supra).

Several suitable selection systems exist for mammalian hosts (Maniatis supra, pgs 16.9-16.15). Co-transfection protocols employing two dominant selectable markers have also been described (Okayama and Berg, Mol Cell Biol 5:1136, 1985).

A particularly useful selection and amplification scheme utilizes DHFR-MTX resistance. MTX is an inhibitor of DHFR that has been shown to cause amplification of endogenous DHFR genes (Alt et al., J Biol Chem 253:1357, 1978) and transfected DHFR sequences (Wigler et al., Proc. Natl. Acad. Sci. USA 77:3567, 1980). Cells are transfected with DNA comprising the gene of interest and DNA encoding DHFR in a dicistronic expression unit (Kaufman et al., 1991 supra and Kaufman et al., EMBO J 6:187, 1987). Transfected cells are grown in media containing successively greater amounts of MTX, resulting in greater expression of the DHFR gene, as well as the gene of interest.

Useful regulatory elements, described previously, can also be included in the plasmids or expression vectors used to transfect mammalian cells. The transfection protocol chosen, and the elements selected for use therein, will depend on the type of host cell used. Those of skill in the art are aware of numerous different protocols and host cells, and can select an appropriate system for expression of a desired protein, based on the requirements of their selected cell culture system(s).

Uses of the invention

The inventive vectors and methods will find use for the expression of a wide variety of recombinant polypeptides. Examples of such polypeptides include cytokines and growth factors, such as Interleukins 1 through 18, the interferons, RANTES, lymphotoxin-β, Fas ligand, flt-3 ligand, ligand for receptor activator of NF-kappa B (RANKL), TNF-related apoptosis-inducing ligand (TRAIL), CD40 ligand, OX40 ligand, 4-1BB ligand (and other members of the TNF family), thymic stroma-derived lymphopoietin, granulocyte colony stimulating factor, granulocyte-macrophage colony stimulating factor, mast cell growth factor, stem cell growth factor, epidermal growth factor, growth hormone, tumor necrosis factor, leukemia inhibitory factor, oncostatin-M, and hematopoietic factors such as erythropoietin and thrombopoietin.

Also included are neurotrophic factors such as brain-derived neurotrophic factor, ciliary neurotrophic factor, glial cell-line derived neurotrophic factor and various ligands for cell surface molecules Elk and Hek (such as the ligands for eph-related kinases, or LERKS). Descriptions of proteins that can be expressed according to the inventive methods may be found in, for example, Human Cytokines: Handbook for Basic and Clinical Research, Vol. II (Aggarwal and Gutterman, eds. Blackwell Sciences, Cambridge MA, 1998); Growth Factors: A Practical Approach (McKay and Leigh, eds., Oxford University Press Inc., New York, 1993) and The Cytokine Handbook (AW Thompson, ed.; Academic Press, San Diego CA; 1991).

Receptors for any of the aforementioned proteins may also be expressed using the inventive vectors and methods, including both forms of tumor necrosis factor receptor (referred to as p55 and p75), Interleukin-1 receptors (type 1 and Interleukin-4 receptor, receptor, Interleukin-17 receptor, Interleukin-18 receptor, granulocyte-macrophage colony stimulating factor receptor, granulocyte colony stimulating factor receptor, receptors for oncostatin-M and leukemia inhibitory factor, receptor activator of NF-kappa B (RANK), receptors for TRAIL, and receptors that comprise death domains, such as Fas or Apoptosis-Inducing Receptor (AIR).

Other proteins that can be expressed using the inventive vectors and methods include cluster of differentiation antigens (referred to as CD proteins), for example, those disclosed in Leukocyte Typing VI (Proceedings of the VIth International Workshop and Conference; Kishimoto, Kikutani et al., eds.; Kobe, Japan, 1996), or CD molecules disclosed in subsequent workshops. Examples of such molecules include CD27, CD39, CD40; and ligands thereto (CD27 ligand, CD30 ligand and CD40 ligand). Several of these are members of the TNF receptor family, which also includes 4-1BB, and the ligands are often members of the TNF family (as are 4-1BB ligand and OX40 ligand); accordingly, members of the TNF and TNFR families can also be expressed using the present invention.

Proteins that are enzymatically active can also be expressed according to the instant invention. Examples include metalloproteinase-disintegrin family members, various kinases (including streptokinase and tissue plasminogen activator as well as Death Associated Kinase Containing Ankyrin Repeats, and IKR 1), TNF-alpha Converting Enzyme, and numerous other enzymes. Ligands for enzymatically active proteins can also be expressed by applying the instant invention.

The inventive vectors and methods are also useful for expression of other types of recombinant proteins, including immunoglobulin molecules or portions thereof, and chimeric antibodies (e.g., an antibody having a human constant region coupled to a murine antigen binding region) or fragments thereof. Numerous techniques are known by which DNA encoding immunoglobulin molecules can be manipulated to yield DNAs capable of encoding recombinant proteins such as single chain antibodies, antibodies with enhanced affinity, or other antibody-based polypeptides (see, for example, Larrick et al., Biotechnology 7:934-938, 1989; Riechmann et al., Nature 332:323-327, 1988; Roberts et al., Nature 328:731-734, 1987; Verhoeyen et al., Science 239:1534-1536, 1988; Chaudhary et al., Nature 339:394-397, 1989).

Various fusion proteins can also be expressed using the inventive methods and vectors. Examples of such fusion proteins include proteins expressed as a fusion with a portion of an immunoglobulin molecule, proteins expressed as fusion proteins with a zipper moiety, and novel polyfunctional proteins such as fusion proteins of a cytokine and a growth factor (e.g., GM-CSF and IL-3, MGF and IL-3). WO 93/08207 and WO 96/40918 describe the preparation of various soluble oligomeric forms of a molecule referred to as CD40L, including an immunoglobulin fusion protein and a zipper fusion protein, respectively; the techniques discussed therein are readily applicable to other proteins.

As additional examples, DNAs based on one or more expressed sequence tags (ESTs) from a library of ESTs can be prepared, inserted into the inventive vector and expressed to obtain recombinant polypeptide. Moreover, DNAs isolated by use of ESTs (for example, by PCR or the application of other cloning techniques) can also be expressed by applying the instant invention. Information on the aforementioned polypeptides, as well as many others, can be obtained from a variety of public sources, including electronic databases such as GenBank. A particularly useful site is the website of the National Center for Biotechnology Information/National Library of Medicine/National Institutes of Health (www.ncbi.nlm.nih.gov). Those of ordinary skill in the art are able to obtain information needed to express a desired polypeptide and apply the techniques described herein by routine experimentation.

However, for purposes of this application, the definition of a protein of interest excludes genes encoding proteins that are typically used as selectable markers in cell culture such as auxotrophic, antimetabolite and/or antibiotic markers. Nevertheless, the invention does include the use of a selectable marker as an aid in selecting cells and/or amplifying clones that are genetically engineered to express a gene of interest. Preferably, the selectable marker gene is positioned adjacent to the gene of interest such that selection and/or amplification of the marker gene will select and/or amplify the adjacent gene.

The relevant disclosures of all references cited herein are specifically incorporated by reference. The following examples are intended to illustrate particular embodiments, and not limit the scope, of the invention. Those of ordinary skill in the art will readily recognize that additional embodiments are encompassed by the invention.

EXAMPLES

Example 1

This example describes the preparation of several expression vectors for the expression of a soluble form of a receptor for human Interleukin-4, referred to as sIL-4R.

Human IL-4R cDNA and protein are disclosed in US Patents 5,840,869, issued November 24, 1998; 5,599,905, issued February 4, 1997 and 5,856,296, issued January 1999. SEQ ID NOs:5 and 6 present the nucleotide and amino acid sequence (respectively) of human IL-4R. Amino acids -25 through -1 comprise a putative leader peptide; cleavage has been found to occur between amino acids -1 and 1, and between amino acids -3 and Amino acids 208 through 231 form a transmembrane region.

DNA encoding sIL-4R from amino acid -25 to amino acid 207 was used in the expression vectors.

The original expression vector, pCAVDHFR is a derivative of pCD302 (Mosley et al., Cell 89:335-348; 1989) containing the coding sequence for mouse DHFR (Subramani et al., Mol. Cell. Biol. 1:854, 1981). The pCDE vector is a derivative of pCAVDHFR containing the murine encephalomyocarditis virus IRES (nucleotides 260 through 824; Jang and Wimmer, Genes and Dev. 4:1560, 1990) cloned between the adenovirus tripartite leader and the DHFR cDNA coding sequence. An expression-augmenting sequence element (EASE) was included upstream of the CMV leader. The EASE is described in US Patent 6,027,915, issued February 22, 2000, and in USSN 09/435,377, filed November 5, 1999, now allowed.

To allow polyadenylation of the dicistronic message, the bovine growth hormone polyadenylation site was placed 3' of the DHFR gene. The plasmid pBGH is a standard dicistronic vector and serves as the control. The alternate polyadenylation vectors of the present invention were constructed by inserting various polyadenylation sites between the IL-4R and the IRES. The plasmids pSPA4, pSPA6, and pMLPA were constructed by inserting the late SV40 polyA site, the early SV40 polyA site, and a deletion mutant of the late SV40 polyA site, respectively. The deletion mutant late SV40 polyA site was constructed using PCR to isolate a fragment of the late SV40 polyA, nucleotides 80 through 222 of SEQ ID NO:1. A diagram of the various constructs is shown in Figure 1; the nucleotide sequences of the various polyA sites are shown in the Sequence Listing (SV40 late: SEQ ID NO:1; BGH: SEQ ID NO:2; SV40 early: SEQ ID NO:3).
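The coordinates used for the deletion mutant (nucleotides 80 through 222, the range recited in claim 4) are 1-based and inclusive, as is usual in a sequence listing, so the fragment is 143 bases long. A minimal Python sketch of that coordinate conversion follows. The helper name `subsequence` and the placeholder sequence are illustrative assumptions, not part of the disclosure; the actual SEQ ID NO:1 would be substituted for the placeholder.

```python
def subsequence(seq: str, start: int, end: int) -> str:
    """Return the fragment from 1-based position start through end, inclusive."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return seq[start - 1:end]


# Placeholder with the same length as SEQ ID NO:1 (222 nt); NOT the real sequence.
placeholder_late_polya = "acgt" * 55 + "ac"

fragment = subsequence(placeholder_late_polya, 80, 222)
print(len(fragment))
```

Keeping the conversion in one helper avoids the common off-by-one error of slicing with the listing coordinates directly.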

The plasmids were used in standard transfections to prepare transfected cells expressing IL-4R. Dihydrofolate reductase (DHFR)-deficient Chinese hamster ovary (CHO) DXB11 cells (Chasin and Urlaub, supra) were adapted to a DMEM:F12-based serum-free medium supplemented with 2mM L-glutamine, 90mM thymidine, mM hypoxanthine, 120 mM glycine, 5% Hy-soy peptone, and 100 mg/L insulin-like growth factor 1 (Rassmussen et al., Cytotechnology 28:31-42, 1998). For DHFR selection and methotrexate amplifications, the cells were cultured in the same medium lacking thymidine, hypoxanthine, and glycine. For methotrexate selection, methotrexate (MTX; Lederle Laboratories, Pearl River, NY) is added to the selection medium at appropriate concentrations. If neomycin selection is employed, 400 µg/ml of G418 (Gibco, Grand Island, NY) is included in the medium. The cells are transfected using calcium phosphate transfection (Wigler et al. supra), or Lipofectamine™ transfection as recommended by the supplier (Gibco BRL, Gaithersburg, MD). Lipofectamine™ Reagent is a commercially available reagent used to form lipid-nucleic acid complexes (or liposomes) which, when applied to cultured cells, facilitate uptake of the nucleic acid into the cells.

Example 2

This example describes a semi-quantitative polymerase chain reaction (PCR) technique that was used to confirm that the IL-4R and DHFR messages encoded by the plasmids described above were made, and to provide information on the relative levels of the various mRNAs. Cells were transfected and cultured as described, and mRNA was obtained using an RNeasy total RNA isolation kit (Qiagen, Chatsworth, CA) and treated with RNAse-free DNAse to diminish DNA contamination. Oligo-dT primers were used to prepare the first strand cDNA; a control primer for actin was included to facilitate quantification.

The first strand was amplified and the amount of input RNA determined using a GeneAmp 5700 from PE Biosystems (Foster City, CA). Thirty cycles of PCR were performed and real-time quantitation of the PCR products was achieved using the double-stranded DNA binding dye SYBR Green I (PE Biosystems, Foster City, CA). A standard curve was prepared using known amounts of actin cDNA, IL-4R cDNA, and DHFR cDNA. The amount of cDNA in each sample was normalized using the amount of actin cDNA. The relative amounts of IL-4R and DHFR in each sample are shown in Table 2.

Table 2

Construct    IL-4R/Actin    DHFR/Actin
pBGH         4.2            7.4
pMLPA        32.5

These data demonstrate that cells transfected with the alternate polyadenylation vector have about 8 times as much IL-4R specific message as the control, and the amount of DHFR is reduced 3.5-fold relative to the control. This technique can be used to evaluate additional polyadenylation signals for use in the inventive expression vectors.
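The roughly 8-fold figure quoted above follows directly from the actin-normalized IL-4R values in Table 2. The short Python sketch below shows the normalization and the ratio; the function names `normalize` and `fold_change` are hypothetical, chosen here only to illustrate the arithmetic.

```python
def normalize(target_amount: float, actin_amount: float) -> float:
    """Express a target cDNA amount relative to the actin cDNA amount."""
    if actin_amount <= 0:
        raise ValueError("actin amount must be positive")
    return target_amount / actin_amount


def fold_change(test_value: float, control_value: float) -> float:
    """Ratio of a normalized test value to the normalized control value."""
    if control_value <= 0:
        raise ValueError("control value must be positive")
    return test_value / control_value


# Actin-normalized IL-4R values as reported in Table 2.
il4r_control = 4.2   # control dicistronic vector (pBGH)
il4r_test = 32.5     # alternate polyadenylation vector

print(round(fold_change(il4r_test, il4r_control), 1))  # 7.7
```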

Example 3

This example describes an enzyme-linked immunosorbent assay (ELISA) that can be used to monitor production of recombinant proteins. The ELISA is well known in the art; adaptations of the techniques disclosed in Engvall et al., Immunochem. 8:871, 1971 and in U.S. Patent 4,703,004 have been used to monitor production of various recombinant proteins. In this assay, a first antibody specific for a protein of interest (usually a monoclonal antibody) is immobilized on a substrate (most often, a 96-well microtiter plate), then a sample containing the protein is added and incubated. A series of dilutions of a known concentration of the protein is also added and incubated, to yield a standard curve. After a wash step to remove unbound proteins and other materials, a second antibody to the protein is added. The second antibody is directed against a different epitope of the protein, and may be either a monoclonal antibody or a polyclonal antibody.

A conjugate reagent comprising an antibody that binds to the second antibody, conjugated to an enzyme such as horseradish peroxidase (HRP), is added, either after a second wash step to remove unbound protein, or at the same time the second antibody is added. Following a suitable incubation period, unbound conjugate reagent is removed by washing, and a developing solution containing the substrate for the enzyme conjugate is added to the plate, causing color to develop. The optical density readings at the correct wavelength give numerical values for each well. The values for the sample are compared with the standard curve values, permitting levels of the desired protein to be quantitated.
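Comparing a sample reading with the standard curve can be sketched in Python as a linear interpolation between the two standards that bracket the reading. This is a simplification chosen for illustration (plate-reader software commonly fits a curve instead), and the function name, units and standard values below are hypothetical rather than data from this document.

```python
from bisect import bisect_left


def interpolate_concentration(od, standards):
    """Estimate concentration from an optical density by linear interpolation.

    standards: list of (concentration, optical_density) pairs with distinct
    optical densities.
    """
    # Sort by optical density so neighbouring standards bracket the sample.
    points = sorted(standards, key=lambda p: p[1])
    ods = [p[1] for p in points]
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("reading lies outside the standard curve")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return points[i][0]
    (c_lo, od_lo), (c_hi, od_hi) = points[i - 1], points[i]
    return c_lo + (od - od_lo) * (c_hi - c_lo) / (od_hi - od_lo)


# Hypothetical standards as (ng/ml, optical density); not data from this document.
standards = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
print(interpolate_concentration(0.65, standards))
```

Readings outside the range of the standards are rejected rather than extrapolated, which mirrors the usual practice of diluting the sample and repeating the assay.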

To quantitate sIL-4R, an ELISA using two monoclonal antibodies (MAb) directed to different epitopes of IL-4R was developed. The first MAb (referred to as M10) was adsorbed onto plates overnight, and the peroxidase (HRP) conjugated second antibody (referred to as HRP-M8) was added after a wash step.

Example 4

This example describes the transfection of CHO cells with various constructs and compares the production of sIL-4R by pools of transfected cells. The various sIL-4R expression plasmids were transfected into CHO cells using Lipofectamine™.

Cells were first selected for the DHFR+ phenotype, then pooled and selected at different MTX concentrations. Pools of cells were grown for two to three days, then supernatant fluid was harvested and analyzed by ELISA as described in Example 3, and specific productivity (defined as µg of protein produced per day by 10^6 cells) was determined. The results of a representative experiment are shown in Table 3 below.

Table 3

Construct            Cells/ml x 10^6    % Viable    ELISA (µg protein)    Specific Productivity (µg/10^6 cells/day)
BGH, Control         1.98               94          0.3                   0.12
SPA4, SV40 late      1.57               89          2.3                   1.11
SPA6, SV40 early     2.42               91          3.2                   1.10
MLPA                 1.81               92          2.5                   1.08
PY, Polyoma virus    2.4                93          0.4                   0.14

These results demonstrated that the insertion of internal polyA sites in between a DNA encoding a desired recombinant protein and a DNA encoding a selectable marker can enhance expression of the desired recombinant protein from pools of transfected cells.

Example 5

This example illustrates the production of sIL-4R by pools of transfected CHO cells over time. A high level of expression was stable over many passages. Four independent transfections with the MLPA plasmid were performed substantially as described previously, and passaged over 20 generations. Expression was monitored from each culture individually, and specific productivity results were averaged; the averages are shown in Table 4.

Table 4

Passage number    Specific Productivity (µg/10^6 cells/day)    Standard deviation
                  1.22                                         0.41
                  1.26                                         0.39
                  1.18                                         0.55
18                1.09                                         0.49
20                1.02                                         0.62

As can be seen in the data from Table 4, expression remained stable over passages. Cells from passage 20 from two of the pools were then amplified in methotrexate and monitored for IL-4R expression; results are shown in Table 5. Amplified pools exhibited increased expression when compared to the unamplified pools.

Table 5

Passage number    Specific Productivity, Pool #1    Specific Productivity, Pool #2
27                1.95                              1.37
29                2.10                              1.58
33                2.08                              1.42

Example 6

This example illustrates the effect of internal polyA sites on clones of cells derived from transfected pools. BGH, SPA4 and SPA6 cells were cloned by limiting dilution in the presence of MTX. Several colonies were picked and screened for specific productivity of sIL-4R as described for the pools. Results are shown in Table 6.

Table 6

Construct    Clone    MTX Concentration    Cells/ml    % Viable    ELISA (µg protein)    Specific Productivity
BGH          3        200                  1.84        69          14.6                  4.25
SPA4         2        50                   2.26        89          16.2                  3.99
SPA6         10       100                  1.82        94          1.6                   0.47
SPA6         11       100                  1.46        83          7                     2.44
SPA6         13       100                  0.96        67          9.3                   4.40
SPA6         16       100                  1.02        73          0                     0

These results demonstrate that the clones picked from the pools transfected with expression vectors comprising an internal polyA site can express high levels of the desired recombinant protein. For the purposes of producing large amounts of recombinant protein for use as a pharmaceutical, clones are often reamplified in methotrexate. In order to evaluate the effect of an internal polyA site on the reamplification process, clone 2 from the SPA4 pool was reamplified by culturing the cells for several passages in increasing concentrations of methotrexate. Once the cells had recovered from the methotrexate amplification with viabilities of about 90%, the specific productivity was determined by culturing the cells for two to three days, harvesting the supernatant fluid, and assaying the supernatant fluid for IL-4R by ELISA; results are shown in Table 7.

Table 7

Construct    Methotrexate Concentration    Cells/ml    % Viable    ELISA (µg protein)    Specific Productivity
SPA4-2       50nM                          2.32        94          18.7                  4.58
SPA4-2       100nM                         1.66        91          21.1                  6.83
SPA4-2       150nM                         1.95        90          27.7                  7.86
SPA4-2       200nM                         1.93        91          28.8                  8.24

These results demonstrated that clones of cells transfected with expression vectors comprising an internal polyA site can be reamplified, and will be expected to evince higher specific productivity.

Example 7

This example describes the preparation of several expression vectors for the expression of recombinant proteins. An expression vector encoding a marker protein (secreted alkaline phosphatase or SEAP; Berger et al., Gene 66:1, 1988) is prepared substantially as described previously, using the MLPA polyA site internally; a polyA site other than BGH may be used as the terminal polyA site. Several changes are made to the IRES sequence within the expression vectors. As discussed in Davies and Kaufman (J. Virology 66:1924; 1992), the efficiency of translation of a second gene can be manipulated by altering the sequence of the IRES at or near the junction of the IRES with the second gene, in this case, DHFR. Table 8 depicts the nucleotide sequence added to the IRES; the first base indicated in the Table is directly after nucleotide 566 of the EMCV IRES (SEQ ID NO:7). Translational start sites (ATG) are underlined; the 3'ATG is the first ATG of muDHFR.

Table 8

Construct    DNA Sequence at IRES/DHFR junction
IX-312       ATTGCTCGAGATCCGTGCCATCATG (SEQ ID NO:8)
IXED-1       ATGATAATATG (SEQ ID NO:9)
IXED-3       ATGATAATATGGCCACAACCATG (SEQ ID NO:10)

Appending the nucleotide sequences to the IRES will modulate expression of DHFR sufficiently to increase the percentage of cells transfected without significantly decreasing the levels of the desired recombinant protein. The vectors (including control vectors) are used in standard transfections to prepare transfected cells expressing SEAP substantially as described herein. Expression levels of the marker protein, SEAP, are determined using a quantitative assay such as that available from CLONTECH Laboratories (Palo Alto, CA, USA; Yang et al., Biotechniques 2:1110, 1997).
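Because the ATG start sites in Table 8 were marked only by underlining, their positions can be recovered by scanning the junction sequences. The Python sketch below lists every ATG in each sequence and notes whether it lies in the same reading frame as the 3' ATG; the function names are hypothetical, and the frame check is simple positional bookkeeping, not a prediction of translational efficiency.

```python
def atg_positions(seq: str) -> list[int]:
    """Return the 1-based start position of every ATG in seq (overlaps allowed)."""
    s = seq.upper()
    return [i + 1 for i in range(len(s) - 2) if s[i:i + 3] == "ATG"]


def in_frame_with_last(seq: str) -> dict[int, bool]:
    """Map each ATG position to whether it shares a reading frame with the 3' ATG."""
    positions = atg_positions(seq)
    if not positions:
        return {}
    last = positions[-1]
    return {p: (last - p) % 3 == 0 for p in positions}


# Junction sequences as listed in Table 8.
junctions = {
    "IX-312": "ATTGCTCGAGATCCGTGCCATCATG",
    "IXED-1": "ATGATAATATG",
    "IXED-3": "ATGATAATATGGCCACAACCATG",
}

for name, seq in junctions.items():
    print(name, in_frame_with_last(seq))
```

Run as written, this reports a single ATG at position 23 for IX-312, ATGs at positions 1 and 9 for IXED-1, and ATGs at positions 1, 9 and 21 for IXED-3, with the ATG at position 1 out of frame with the 3' ATG in both IXED constructs.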

SEQUENCE LISTING

<110> Immunex Corporation
<120> VECTORS AND METHODS FOR RECOMBINANT PROTEIN EXPRESSION
<130> 2902-WO
<140> --to be assigned--
<141> 2000-10-12
<150> 60/159,177
<151> 1999-10-13
<160>
<170> PatentIn version

<210> 1
<211> 222
<212> DNA
<213>
<400> 1
atccagacat gataagatac attgatgagt ttggacaaac cacaactaga atgcagtgaa 60
aaaaatgctt tatttgtgaa atttgtgatg ctattgcttt atttgtaacc attataagct 120
gcaataaaca agttcaacaa caattgcatt cattttatgt ttcaggttca gggggaggtg 180
tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gt 222

<210> 2
<211> 285
<212> DNA
<213> Bovine
<400> 2
aattgtctag agctcgctga tcagcctcga ctgtgccttc tagttgccag ccatctgttg 60
tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct 120
aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg 180
gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggatg 240
cggtgggctc tatggcttct gaggcggaaa gaaccagctg gggca 285

<210> 3
<211> 222
<212> DNA
<213>
<400> 3
accacatttg tagaggtttt acttgcttta aaaaacctcc cacacctccc cctgaacctg 60
aaacataaaa tgaatgcaat tgttgttgaa cttgtttatt gcagcttata atggttacaa 120
ataaagcaat agcatcacaa atttcacaaa taaagcattt ttttcactgc attctagttg 180
tggtttgtcc aaactcatca atgtatctta tcatgtctgg at 222

<210> 4
<211> 19
<212> RNA
<213> RNA recognition sequence
<400> 4
acaugaggau uacccaugu

<210> 5
<211> 2478
<212> DNA
<213> Homo sapiens
<220>
<221> CDS
<222> (2478)
<220>
<221> mat_peptide
<222>
<220>
<221> sig_peptide
<222>
<400> 5
atg ggg tgg ctt tgc Met Gly Trp Leu Cys ggg ctc ctg ttc Gly Leu Leu Phe gtg agc tgc ctg Val Ser Cys Leu ctg ctg cag gtg gca agc tct ggg Leu Leu Gln Val Ala Ser Ser Gly aac atg Asn Met -1 1 aag gtc ttg Lys Val Leu

cag gag ccc 96 Gin Glu. Pro acc tgc gtc Thr Cys Val tcc gac tac atg Ser Asp Tyr met atc tct act tgc ile Ser Thr Cys tag aag atg Trp Lys Met aat ggt Asn Gly ccc acc aat tgc Pro Thr Asn Cys acc gag ctc cgc Thr Glu, Leu Arg ttg tao cag ctg Leu Tyr Gin Leu ttt ctg ctc tcc gaa gcc cac acg tgt Phe Leu Leu Ser Glu Ala His Thr Cys cct. gag aac aac Pro Glu Asn Asn ggc gog ggg tgc Gly Ala Gly Cys tgc cac ctg ctc Cys His Leu. Leu.

gat gao gtg gtc Asp Asp Val Val agt gcg Ser Ala gat aac tat aca ctg gac ctg tgg gct ggg cag cag ctg Asp Asn Tyr Thr Leu Asp Leu Trp Ala Gly Gin Gin Leu ctg tgg aag Leu Trp Lys WO 01/27299 WOO17299PTfUSOO/28431 ggc tcc ttc aag ccc agc gag cat gtg aaa ccc agg gcc Gly Ser Phe Lys Pro Ser Giu His Val. Lys Pro Arg Ala 95 100 cca gga aac Pro Gly Asn ctg aca Leu Thr 105 gtt cac acc aat gtc tcc gac act ctg ctg ctg acc tgg agc Val His Thr Asn Val Ser Asp Thr Leu Leu Leu Thr Trp Ser ccg tat ccc cct Pro Tyr Pro Pro aat tac ctg tat Asri Tyr Leu Tyr cat ctc acc tat His Leu Thr Tyr gtc aac att tgg agt gaa aac gac ccg gca gat ttc aga atc Vai Asn Ile Trp Ser Giu Asn Asp Pro Ala Asp Phe Arg Ile 140 1 A C tat aac Tyr Asn 150 gtg acc tac Val Thr Tyr tct ggg att Ser Gly Ile 170 gaa ccc tcc cc Giu Pro Ser Leu atc gca gcc agc Ile Ala Ala Ser acc ctg aag Thr Leu Lys 165 cag tgc tat Gin Cys Tyr tcc tac agg gca Ser Tyr Arg Ala gtg agg gcc tgg Val Arg Ala Trp aac acc Asn Thr 185 acc tgg agt gag Thr Trp Ser Giu agc ccc agc acc Ser Pro Ser Thr tgg cac aac tcc Trp His Asn Ser agg gag ccc ttc Arg Giu Pro Phe cag cac ctc ctg Gin His Leu Leu ggc gtc agc gtt Giy Val Ser Val tgc att gtc atc Cys Ile Vai Ile gcc gtc tgc ctg Ala Val Cys Leu tgc tat gtc agc Cys Tyr Val Ser atc acc Ile Thr 230 aag att aag Lys Ile Lys cgc ctc gtg Arg Leu Val 250 gaa tgg tgg gat Giu Trp TrP ASP att ccc aac cca Ile Pro Asn Pro gcc cgc agc Ala Arg Ser 245 cag tgg gag Gin Trp Glu gct ata. 
ata atc Ala Ile Ile Ile gat gct cag ggg Asp Ala Gin Gly aag cgg Lys Arg 265 tcc cga ggc cag Ser Arg Gly Gin cca gcc aag tgc Pro Ala Lys Cys cac tgg aag aat His Trp Lys Asn tgt ctt acc aag ctc ttg ccc tgt ttt ctg gag Cys Leu Thr Lys Leu Leu Pro Cys Phe LeU Giu 280 285 290 cac aac atg aaa His Asn Met Lys gat gaa gat cct Asp Giu Asp Pro aag gct gcc aaa Lys Ala Ala Lys atg cct ttC cag Met Pro Phe Gin ggc tct Gly Ser 310 1008 1056 gga aaa tca Gly Lys Ser gca tgg tgc cca gtg Ala Trp Cys Pro Val 315 gag atc agc aag aca Giu Ile Ser Lys Thr 320 gtc ctc tgg Vai Lou Trp 325 WO 01/27299 cca gag age Pro Glu Ser 330 PCT/USOO/28431 atc agc gtg gtg Ile Ser Val Val tgt gtg gag ttg ttt gag gcc ccg Cys Val Giu Leu Phe Glu Ala Pro 340 gtg gag Val Glu 345 tgt gag gag gag Cys Giu Glu Giu gag gta gag gaa Glu Val Giu Giu aaa ggg age ttc Lys Gly Ser Phe gca tcg cct gag Ala Ser Pro Glu ago agg gat gac Ser Arg Asp Asp cag gag gga agg Gin Glu Gly Arg ggc att gtg gc Gly Ile Val Ala cta aca gag age Leu Thr Giu Ser ttc ctg gac otg Phe Leu Asp Leu ctc gga Leu Gly 390 gag gag aat Giu Giu Asn ctt eca oct Leu Pro Pro 410 ggC ttt tgc cag Gly Phe Cys Gin gac atg ggg gag Asp Met Gly Glu tca tgc ctt Ser Cys Leu 405 gat gag ttc Asp Glu Phe teg gga agt acg agt get cac atg ccc Ser Gly Ser Thr Ser Ala His Met Pro 415 cea agt Pro Ser 425 gca ggg ccc aag Ala Gly Pro Lys gea ect ccc tgg Ala Pro Pro Trp, aag gag eag cet Lys Giu Gin Pro eac ctg gag cca agt cet oct gee agc His Leu Giu Pro Ser Pro Pro Ala Ser 445 acc cag agt cea Thr Gin Ser Pro 1104 1152 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 1728 1776 aae ctg act tgc Asn Leu Thr Cys gag acg ccc etc Glu Thr Pro Leu ate gca ggC aac Ile Ala Gly Asn ect get Pro Ala 470 tac cge age Tyr Arg Ser ctg ggt cca Leu Gly Pro 490 age aae tee etg Ser Asn Ser Leu cag tea ceg tgt ccc aga gag Gin Ser Pro Cys Pro Arg Glu 485 gao eca ctg etg Asp Pro Leu Leu aga cac ctg gag Arg His Leu Giu gta gaa CCC Val Giu Pro gag atg Glu Met 505 ccc tgt gte eec cag ete tot gag eca 
Pro Cys Val Pro Gin Leu Ser Glu Pro 510 act gtg ccc caa Thr Val Pro Gin gag eca gaa ace Giu Pro Giu Thr gag eag ate etc Glu Gin Ile Leu ega aat gte etc Arg Asn Vai Leu eat ggg gen get gen gee ccc gte tog His Gly Ala Ala Ala Ala Pro Val Ser 540 eec ace agt ggc Pro Thr Ser Gly tat cag Tyr Gin 550 gag ttt gta Glu Phe Val geg gtg gag cag Ala Val Glu Gin ggc ace cag gee Gly Thr Gin Ala agt gcg gtg Ser Ala Val 565 WO 01/27299 gtg ggc ttg Val Gly Leu 570 PCT/USOO/28431 ggt ccc eca gga Gly Pro Pro Gly gct ggt tac aag Ala Gly Tyr Lys ttc tca age Phe Ser Ser ctg ctt Leu Leu 585 gce age agt gct Ala Ser Ser Ala tcc cea gag aaa Ser Pro Glu Lys ggg ttt ggg get Gly Phe Gly Ala agt ggg gaa gag Ser Gly Glu Glu tat aag cet tte Tyr Lys Pro Phe eaa gae Gin Asp 610 ete att cet Leu Ile Pro tge eet ggg gae Cys Pro Gly Asp gcc cea gtc ect Ala Pro Vai Pro eec ttg ttc ace Pro Leu Phe Thr ttt gga Phe Gly 630 ctg gac agg Leu Asp Arg age tce eca Ser Ser Pro cca c ege agt Pro Pro Arg Ser eag age tea eat Gin Ser Ser His etc cca age Leu Pro Ser 645 gta gag gae Val Glu Asp gag eac etg ggt Giu His Leu Gly gag ceg ggg gaa Glu Pro Gly Glu atg eea Met Pro 665 aag eec eca. 
ett Lys Pro Pro Leu eag gag eag gee Gin Glu Gin Ala gac ccc ett gtg Asp Pro Leu Val age ctg gge agt Ser Leu Gly Ser att gtc tac tea Ile Val Tyr Ser ett ace tge eae Leu Thr Cys His 1824 1872 1920 1968 2016 2064 2112 2160 2208 2256 2304 2352 2400 2448 2478 tge gge eac etg Cys Gly His Leu cag tgt cat ggc Gin Cys His Giy gag gat ggt gge Glu Asp Gly Gly cag ace Gin Thr 710 ect gte atg Pro Val Met teg eee ect Ser Pro Pro 730 agt ec tge tgt Ser Pro Cys Cys tge tge tgt gga Cys Cys Cys Gly gac agg tee Asp Arg Ser 725 eea ggt ggg Pro Giy Gly aca ace eec etg Thr Thr Pro Leu gee eca gac ec Ala Pro Asp Pro gtt eca Val Pro 745 ctg gag gee agt Leu Giu Ala Ser tgt ceg gee tee Cys Pro Ala Ser gea cee teg ggc Ala Pro Ser Gly tea gag aag agt Ser Glu Lys Ser tee tea tea tee Ser Ser Ser Ser eat ec gee ect His Pro Ala Pro aat get cag age Asn Ala Gin Ser age eag ace Ser Gin Thr eec aaa ate gtg ase ttt Pro Lys Ile Val Asn Phe 785 gte tee Val Ser 790 gtg gga ccc Val Gly Pro tae atg agg gte Tyr Met Arg Val WO 01/27299 <210> 6 <211> 826 <212> PRT <213> Homno sapiens PCTJUS00128431 <400> 6 Met Leu Thr Asn Val1 Gly Asp Gly Leu Asn 120 Val Val Ser Asn Gly Trp Leu Gin Cys Val Gly Pro Phe Leu Ala Gly Asn Tyr Ser Phe Thr Val 105 Pro Tvr Leu Ser Cys Val Thr Leu Lys Pro His Thr Pro Pro Trp Ser 140 Leu Glu 155 Ser Tyr Trp Ser Gly Ser Met Ser 30 Ala His Leu Glu Val 110 Asn Asn Ser Ala Trp 190 Leu Asn Ile Glu Thr Leu Ala Val Asp Leu Pro Arg 160 Val Pro Val1 Val Cys Leu Pro Asp Gin Arg Leu 115 His Phe Ala Trp, Lys 195 WO 01/27299 PTUO/83 PCT/USOO/28431 Tyr 200 Cys Lys Arg Lys Cys 280 Asp Gly Pro Val Cys 360 Gly Glu Leu I Arg Ile Ile Leu Arg 265 Leu Giu Lys Glu G1u 345 Ala Ile 'iu 'ro Giu Pro Val Ile Lys Lys 235 Vai Ala 250 Ser Arg Thr Lys Asp Pro Ser Ala 315 Ser Ile 330 Cys Giu Ser Pro Val Ala Asn GiyC 395 Pro Ser C 410 Phe Leu 220 Giu Ile Giy Leu His 300 Trp, Ser Glu Giu krg 380 ;ly ;iy Glu 205 *Ala *Trp Ile Gin Leu 285 Lys Cys Vai Glu Ser 365 Leu Phe Ser 9 Gin Val Trp Ile Giu 270 Pro Ala Pro VIal Giu 
350 Ser rhr ,ys Phr His Cys Asp Gin 255 Pro Cys Aia Vai Arg 335 Giu A-rg Glu Gin Ser 415 Leu Leu Gin 240 Asp Ala Phe Lys Giu 320 Cys Vai Asp Ser Gin 400 Ala Leu Leu 225 Ile Aia Lys Leu Glu 305 Ile Val1 Giu Asp Leu 385 Asp His Leu 210 Cys Pro Gin Cys Glu 290 Met Ser Giu Glu Phe 370 Phe Met Met Gly Tyr Asn Giy Pro 275 His Pro Lys Leu Glu 355 Gin Leu Gly Pro Giy 435 Ser Vai Ser Ile 230 Ala Arg 245 Gin Trp Trp, Lys Met Lys Gin Gly 310 Val Leu 325 Giu Ala Gly Ser Gly Arg Leu Leu 390 Ser Cys 405 Asp Giu Pro Ser Ala Gly Pro Lys Giu Aia Pro Pro Trp Lys Giu Gin Pro WO 01/27299 Leu His I 440 Asn Leu 'I Pro Pro Ala Thr Pro Leu Tyr Leu Glu Pro 520 His Glu Val Leu Ser 600 Cys Leu Arc Gly Met 505 Glu Gly Phe Gly Leu 585 Ser Pro k.sp Ser Pro 490 Pro Pro Ala Val Leu 570 Ala Giy Gly Arg Phe 475 Asp Cys Giu Ala His 555 Gly Ser Giu A~sp Glu Ser Asn Ser Leu Pro Val Thr Ala 540 Ala Pro Ser Glu Pro 620 Pro Leu Pro Trp, 525 Ala Val Pro Ala Gly 605 Ala Pro *Leu Gin 510 Glu Pro Glu Gly Val 590 Tyr Pro Arg Ala 495 Leu Gin Val Gln Glu 575 Ser Lys Val Ser Ser 480 Arg Ser I le Ser Gly 560 Ala Pro Pro Pro Pro Ser Val 465 Gln His Glu Leu Ala 545 Gly Gly Glu Phe Val1 625 Gin Pro 450 Ile Ser Leu Pro Arg 530 Pro Thr Tyr Lys Gin 610 Pro Ser Thr Ala Pro Giu Thr 515 Arg Thr Gin Lys Cys 595 Asp Leu Ser *Gln Gly Cys Glu 500 Thr Asn Ser Ala Ala 580 Gly Leu Phe His Lys Ser Asn Pro 485 Val1 Val Val Gly Ser 565 Phe Phe Ile Thr Leu 645 V1al PCTUSOO/28431 Pro Asp 455 Pro Ala 470 Arg Glu Glu Pro Pro Gin Leu Gin 535 Tyr Gin 550 Ala Val Ser Ser Gly Ala Pro Gly 615 Phe Gly 630 Pro Ser Glu Asp 635 Ser Ser Pro GlU His Leu Gly Leu 650 655 Glu Pro Gly Clu 660 Asp Pro Leu Val Met Pro 665 Lys Pro Pro Leu GiGlGnAl Gln Glu Gln Ala WO 01127299 Asp Ser Leu 680 PCTIUSOOI28431 Giy Ser Gly Ile Val Tyr Ser Aia Leu Thr Cys His Leu Cys Gly His Leu Gin Cys His Gly Gin Glu 705 Asp Gly Gly Gin Thr 710 Pro Vai Met Ser Pro Pro 730 Ser Pro Cys Cys Cys Cys Cys Giy 725 Pro Giy Gly Thr Thr Pro Leu Ala Pro Asp Pro Val Pro 745 Leu Giu Aia 
Ser Cys Pro Ala Ser Ala Pro Ser Gly Ile Ser Giu Lys Ser 760 Asn Ala Gin Ser Ser 780 Ser Ser Ser Ser His Pro Ala Pro Ser Gin Thr Pro Ile Val Asn Phe Val Ser 790 Val Giy Pro Thr Tyr Met Arg Val Ser Tyr 795 800 <210> 7 <211> 566 <2i2> DNA <213> EMC <400> 7 ccctctccc tgcgtttgtc gaaacctggc aatgcaaggt aacaacgtct ctgcggccaa cgt tgtgagt ggggctgaag cacatgctt t gacgtggttt tccccCcCc tatatgttat cctgtcttct ctgttgaatg gtagcgaccc aagccacgtg tgga tagt tg gatgcccaga acatgtgttt tcctttgaaa ctaacgttac tttccaccat tgacgagcat tcgtgaagga tttgcaggca tataagatac tggaaagagt aggtacccca agtcgaggtt aacacg tggccgaagc at tgccgtct tcctaggggt agcagttcct gcggaacccc acctgcaaag caaatggctc ttgtatggga aaaaaacgtc cgcttggaat tttggcaatg ctttcccctc ctggaagctt ccacctggcg gcggcacaac tcctcaagcg tctgatctgg taggcccccc aaggccggtg tgagggcccg tcgccaaagg cttgaagaca acaggtgcct cccagtgcca tattcaacaa ggcCtcggtg gaaccacggg WO 01/272" WO 0127299PCTIUSOO/28431 <210> 8 <211> <212> DNA <213> EMCV <400> 8 attgctcgag atccgtgcca tcatg <210> 9 <211> 11 <212> DNA <213> EMCV <400> 9 atgataatat g 1 <210> <211> 23 <212> DNA <213> EMCV <400> atgataatat ggccacaacc aig 23

Claims

1. An expression vector comprising, in the following order, a promoter sequence, a first coding sequence, a polyadenylation site, and a second coding sequence, wherein no promoter sequence occurs between the internal polyadenylation site and the second coding sequence.

2. The expression vector of claim 1, wherein the second coding sequence encodes a selectable marker.

3. The expression vector of claim 2, wherein the selectable marker is dihydrofolate reductase.

4. The expression vector of any one of claims 1 to 3, wherein the internal polyadenylation site is selected from the group consisting of SV40 late polyadenylation site (SEQ ID NO:1), SV40 early polyadenylation site (SEQ ID NO:3), and a mutant SV40 late polyadenylation site consisting essentially of nucleotides 80 through 222 of SEQ ID NO:1.

5. The expression vector of any one of claims 1 to 4, wherein an internal ribosome entry site (IRES) is inserted between the DNA encoding the internal polyadenylation site and the DNA encoding the selectable marker such that the IRES is operably linked to the selectable marker.

6. A stable pool of cells transfected with an expression vector according to claim

7. A cell line cloned from the stable pool of cells of claim 6.

8. A method for obtaining a recombinant protein, comprising culturing a stable pool of cells according to claim 6 under conditions promoting expression of the protein, and recovering the protein.

9. The expression vector of any one of claims 1 to 5, which further comprises a second polyadenylation site following the second coding sequence and operably linked thereto.

10. An expression vector comprising, in the following order, a promoter sequence, a first coding sequence, a polyadenylation site consisting essentially of nucleotides 80 through 222 of SEQ ID NO:1, and a second coding sequence, wherein no promoter sequence occurs between the internal polyadenylation site and the second coding sequence.

11. A stable pool of mammalian cells transfected with an expression vector according to claim

12. A cell line cloned from the pool of claim 11.

13. A method for obtaining a recombinant protein, comprising culturing a stable pool of cells according to claim 11 under conditions promoting expression of the protein, and recovering the protein.

14. A method for obtaining a recombinant protein, comprising culturing a cell line according to claim 12 under conditions promoting expression of the protein, and recovering the protein.

15. A mammalian host cell containing an expression vector according to any one of claims 5, 9 or

16. The expression vector of claim 10, which further comprises a second polyadenylation site following the second coding sequence and operably linked thereto.

17. A mammalian host cell containing an expression vector according to claim 16.

DATED: 25 March, 2002

PHILLIPS ORMONDE FITZPATRICK
Attorneys for: IMMUNEX CORPORATION