WO2007062474A1

WO2007062474A1 - An expression system construct comprising IRES and uses thereof

Info

Publication number: WO2007062474A1
Application number: PCT/AU2006/001818
Authority: WO
Inventors: James Watson Goding; Adnan Sali
Original assignee: Monash University
Priority date: 2005-11-30
Filing date: 2006-11-30
Publication date: 2007-06-07

Abstract

The present invention relates generally to a nucleic acid expression system and methods for use therein. More particularly, the present invention provides a bicistronic expression system which, due to the design of a novel selection mechanism, exhibits improved selection efficiency relative to known bicistronic expression systems. The construct of the invention comprises an IRES operably linked to a selection marker. The expression system of the present invention now facilitates the high level expression of both soluble and membrane bound forms of a protein of interest. The transfected host cell systems, and molecule produced thereby, which utilise the nucleic acid expression system of the present invention are useful, inter alia, in a wide variety of diagnostic, therapeutic, prophylactic and research based settings.

Description

A NOVEL EXPRESSION SYSTEM AND USES THEREOF

FIELD OF THE INVENTION

The present invention relates generally to a nucleic acid expression system and methods for use therein. More particularly, the present invention provides a bicistronic expression system which, due to the design of a novel selection mechanism, exhibits improved selection efficiency relative to known bicistronic expression systems. The expression system of the present invention now facilitates the high level expression of both soluble and membrane bound forms of a protein of interest. The transfected host cell systems, and molecule produced thereby, which utilise the nucleic acid expression system of the present invention are useful, inter alia, in a wide variety of diagnostic, therapeutic, prophylactic and research based settings.

BACKGROUND OF THE INVENTION

The reference to any prior art in this specification is not, and should not be taken as, an acknowledgment or any form of suggestion that that prior art forms part of the common general knowledge in Australia.

Proteins are produced in systems for a wide range of applications in biology and biotechnology. These include research into cellular and molecular function, production of proteins as biopharmaceuticals or diagnostic reagents, and modification of the traits or phenotypes of livestock and crops.

A wide variety of recombinant host cell systems have been developed to facilitate the production of proteins. For example, bacterial host cell systems, insect host cell systems and mammalian host cell systems are all widely used. The choice of system depends entirely on the nature of the outcomes which are required. Bacterial expression systems, for instance, provide an inexpensive and rapid means of expressing a protein but have major limitations in their ability to post-translationally modify a translated protein. This is a particularly important issue in the context of biopharmaceuticals.

Biopharmaceuticals are usually proteins which exhibit an extracellular function, such as antibodies for immunotherapy or hormones or cytokines for eliciting a cellular response. Proteins with extracellular functions exit the cell via the secretory pathway and undergo post-translational modifications during secretion. These modifications (primarily glycosylation and disulfide bond formation) do not generally occur in bacteria. Moreover, the specific oligosaccharides attached to proteins by glycosylating enzymes are species and cell-type specific. These considerations often limit the choice of host cell for heterologous protein production to eukaryotic cells (Kaufman, 2000). For expression of human therapeutic proteins, host cells such as bacteria, yeast, or plant cells may be inappropriate. Even the subtle differences in protein glycosylation between rodents and humans, for example, can sometimes be sufficient to render proteins produced in rodent cells unacceptable for therapeutic use (Sheeley et al., 1997). The consequences of improper (i.e., non-human) glycosylation include immunogenicity, reduced functional half-life, and loss of activity. This limits the choice of host cells further, to human cell lines or to cell lines such as Chinese Hamster Ovary (CHO) cells, which may produce glycoproteins with human-like carbohydrate structures (Liu, 1992).

That being the case, there is an ongoing need to both refine existing expression systems and to develop new systems which facilitate more efficient expression in the context of any host cell type, although in particular in the context of mammalian host cell expression systems. To this end, the transfection of DNA into mammalian cells for generation of stable cell lines is an extremely inefficient process, and is nearly always made feasible by the use of "selectable markers" which generally rescue cells from death induced by drugs. The most popular drug selection system for this purpose is G418, which is a neomycin analogue which works by binding to ribosomes and inhibiting protein synthesis. G418 is attractive because it is a "dominant" marker that works in virtually any mammalian cell line, although the dose does need to be tailored to suit individual cells. Disadvantages of G418, however, include:

(a) expense;

(b) the common occurrence of "escape" cellular variants in which the cells grow but fail to express the gene of interest; and

(c) drug selection generally takes several weeks.

In terms of the selection of successfully transfected mammalian and non-mammalian cells, standard methods used to generate stable cell lines have generally required transfecting a host cell line with two expression cassettes: one expressing the gene of interest and the other expressing a selectable marker such as an antibiotic resistance marker. These cassettes can be introduced into the host cell either by cotransfecting two plasmids that each contain one of the expression cassettes or by using one plasmid that contains both cassettes, usually driven by separate promoters. Unfortunately, after transfection and selection often only 10-30% of the cells functionally express the gene of interest, or, in the case of cotransfection, the stable integration of only the cassette expressing the selectable marker. Additionally, the level of gene expression using standard methods cannot be predicted; gene expression is generally low and, because the selective pressure is only on the cassette that expresses the antibiotic resistance, expression levels can decrease over time.

Attempts to get round this problem have involved the generation of vectors in which the selectable marker is on the same plasmid as the gene of interest. Commonly, the transfection involves circular (uncut) plasmid DNA, but this results in difficulties because in order to be integrated into the chromosome, the circle has to be cleaved by endonucleases within the cell. Since the cell does not know what genes are of interest to the investigator, linearisation by the cell is essentially random, and if such cleavage occurs in an unfavorable site (eg within the promoter of the gene of interest, within the gene of interest, or between the gene of interest and the selectable marker gene), it is possible and indeed common to find many cells (often the majority) which grow but do not express the gene of interest. Whenever the gene of interest and the selectable marker part company by the mechanisms described above (or by other mechanisms), cells that do express the gene of interest can generally be obtained by cellular cloning and testing each clone for expression, but this process is laborious and slow.

However, a major advance came with the development of bicistronic vectors, in which the selectable marker and the gene of interest are derived from the same messenger RNA (mRNA), driven by the same promoter. This is made possible by the use of an Internal Ribosome Entry Site (IRES) found in certain viruses. Normally, a ribosome finishes a polypeptide chain when it encounters a stop codon, and does not rebind to the RNA even if an additional initiator methionine and an open reading frame exist downstream (i.e. on the 3' side) of the stop codon. However, if an IRES sequence is present after a stop codon in a messenger RNA molecule, the ribosome will re-attach to the mRNA and a second protein can be translated from the same RNA. If the second protein is a selectable marker (encoding drug resistance or other selectable property), then use of this marker will guarantee that the gene of interest (placed between the promoter and the IRES) will be expressed. A number of bicistronic vectors have been produced based on these concepts. However, even these vectors exhibit limitations in terms of the ease and efficiency of protein production. For example, there have been certain inherent efficiency limitations in terms of the manner in which the selectable markers have been designed and utilised. Further, and in another example, currently available vectors have been unsuitable for linearisation and chromosomal integration due to the presence of restriction sites in functionally crucial nucleic acid regions, including within the nucleic acid region encoding the protein of interest. In still another example, it is common experience to find that the great majority of stable transfectants express the gene of interest at quite low levels, and there is therefore a need for systems to enrich for those rare transfectant clones that express at high levels.

High expression can be encouraged by the use of carefully selected strong promoters that work well in the cell of interest, but this is not sufficient to ensure high expression. Expression levels do not correlate well with the number of copies of the transgene that have been integrated.
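The bicistronic principle described above can be illustrated with a deliberately simplified model. The following Python sketch is not part of the specification: the function name `translated_orfs` and the tuple representation of an mRNA are hypothetical, and the model assumes that every open reading frame ends in a stop codon and that ribosomes never reinitiate without an IRES.

```python
from typing import List, Tuple

# Each mRNA is listed 5' to 3' as (kind, name) tuples.
# Hypothetical kinds used for illustration only: "orf" and "ires".
Element = Tuple[str, str]


def translated_orfs(mrna: List[Element]) -> List[str]:
    """Return the names of the ORFs that would be translated.

    Simplifying assumptions (illustrative, not from the specification):
    - the first ORF is translated by ordinary initiation at the 5' end;
    - after a stop codon the ribosome does not rebind, so a downstream
      ORF is translated only if an IRES lies immediately upstream of it.
    """
    translated = []
    seen_first_orf = False
    ires_upstream = False
    for kind, name in mrna:
        if kind == "ires":
            ires_upstream = True
        elif kind == "orf":
            if not seen_first_orf:
                translated.append(name)
                seen_first_orf = True
            elif ires_upstream:
                translated.append(name)
            # An IRES only serves the ORF that directly follows it.
            ires_upstream = False
    return translated


bicistronic = [("orf", "gene of interest"), ("ires", "IRES"), ("orf", "selectable marker")]
no_ires = [("orf", "gene of interest"), ("orf", "selectable marker")]

print(translated_orfs(bicistronic))
print(translated_orfs(no_ires))
```

In this toy model the selectable marker is produced only from a transcript that also carries the upstream gene of interest, which is the linkage that bicistronic selection relies on.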

The factor that seems to be the most important in terms of achieving the good expression of a construct is the site of its integration (Harrison et al., 1995, Biochim Biophys Acta 1260, 147-56). This can be controlled by specialised and complex systems involving homologous recombination. There are also a number of systems involving gradually increasing the concentration of the selectable marker drug to force selection of high expressing clones, using methionine sulfoximine or methotrexate, but these systems have not fulfilled their initial promise and are not widely used.

Accordingly, there is a need for simple, reliable and robust selection systems which enable the growth only of cells expressing the gene of interest. Still further, there is a need for more simple and robust means to select highly expressing cells.

In work leading up to the present invention, a nucleic acid expression construct has been developed which overcomes the limitations inherent in currently available bicistronic vectors and which provides many of the improvements detailed above. Specifically, a bicistronic expression cassette has been designed which facilitates anti-folate salvage pathway based selection (in particular HAT-based selection) of transfectants and which can also be constructed to provide a mechanism to both generate and detect transfectants exhibiting high level expression of the gene of interest. The optional incorporation of a defined linearisation polylinker site overcomes the problems associated with linearisation which relies on the presence of randomly distributed restriction endonuclease sites, while the incorporation, or not, of appropriate signal sequences provides flexibility in terms of producing cytoplasmic, secreted or membrane bound forms of the protein of interest.

SUMMARY OF THE INVENTION

Throughout this specification and the claims which follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.

As used herein, the term "derived from" shall be taken to indicate that a particular integer or group of integers has originated from the species specified, but has not necessarily been obtained directly from the specified source. Further, as used herein the singular forms of "a", "an" and "the" include plural referents unless the context clearly dictates otherwise.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The subject specification contains amino acid and nucleotide sequence information prepared using the programme PatentIn Version 3.1, presented herein after the bibliography. Each amino acid and nucleotide sequence is identified in the sequence listing by the numeric indicator <210> followed by the sequence identifier (eg. <210>1, <210>2, etc). The length, type of sequence (amino acid, DNA, etc.) and source organism for each sequence is indicated by information provided in the numeric indicator fields <211>, <212> and <213>, respectively. Amino acid and nucleotide sequences referred to in the specification are identified by the indicator SEQ ID NO: followed by the sequence identifier (eg. SEQ ID NO:1, SEQ ID NO:2, etc.). The sequence identifier referred to in the specification correlates to the information provided in numeric indicator field <400> in the sequence listing, which is followed by the sequence identifier (eg. <400>1, <400>2, etc). That is, SEQ ID NO:1 as detailed in the specification correlates to the sequence indicated as <400>1 in the sequence listing.
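The correspondence between SEQ ID NO: identifiers and the numeric indicator fields can be sketched as a small parser. The following Python example is illustrative only: the function `parse_listing`, the record type `ListingEntry` and the two-entry sample text are hypothetical and are not taken from the sequence listing of this specification.

```python
import re
from dataclasses import dataclass
from typing import Dict


@dataclass
class ListingEntry:
    seq_id: int      # value of <210>, matched by <400>
    length: int      # value of <211>
    mol_type: str    # value of <212>
    organism: str    # value of <213>
    sequence: str = ""


def parse_listing(text: str) -> Dict[int, ListingEntry]:
    """Index a sequence listing by SEQ ID NO using the numeric indicators."""
    entries: Dict[int, ListingEntry] = {}
    fields: Dict[str, str] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = re.match(r"<(\d{3})>\s*(.*)", line)
        if match:
            tag, value = match.group(1), match.group(2).strip()
            if tag == "210":
                # A new <210> starts a new entry.
                fields = {"210": value}
                current = None
            elif tag == "400":
                if int(value) != int(fields["210"]):
                    raise ValueError("<400> does not match <210>")
                current = ListingEntry(int(value), int(fields["211"]),
                                       fields["212"], fields["213"])
                entries[current.seq_id] = current
            else:
                fields[tag] = value
        elif current is not None:
            # Sequence lines: drop spacing and trailing position counters.
            current.sequence += re.sub(r"[\s\d]", "", line)
    return entries


# Hypothetical two-entry listing used only to exercise the parser.
SAMPLE = """
<210> 1
<211> 12
<212> DNA
<213> artificial sequence
<400> 1
atggcc aagtaa     12

<210> 2
<211> 3
<212> PRT
<213> artificial sequence
<400> 2
Met Ala Lys
"""

listing = parse_listing(SAMPLE)
print(listing[1].mol_type, listing[1].sequence)
```

Here `listing[1]` plays the role of SEQ ID NO:1, that is, the entry whose <210> and <400> fields both carry the identifier 1.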

One aspect of the present invention provides a nucleic acid expression construct said construct comprising a promoter operably linked to both:

(i) a first nucleic acid region comprising either a nucleic acid sequence encoding a protein of interest or a nucleic acid sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest; and

(ii) a second nucleic acid region located in the 3' direction to the first nucleic acid region, said second nucleic acid region comprising an IRES operably linked to a nucleic acid sequence encoding one or more selectable markers, at least one of which selectable markers facilitates antifolate salvage pathway-based selection;

wherein the expression of said construct would result in translation of both said protein of interest and said one or more selectable markers.
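The arrangement recited in this aspect can be expressed as a simple structural check. The Python sketch below is illustrative only: the `Element` type, the element kinds and the function `has_required_layout` are hypothetical, and the set of salvage pathway markers is limited to the two markers named elsewhere in this specification (HPRT and thymidine kinase).

```python
from dataclasses import dataclass
from typing import List

# Markers named in this specification as supporting salvage pathway selection.
SALVAGE_MARKERS = {"HPRT", "thymidine kinase"}


@dataclass
class Element:
    kind: str  # hypothetical kinds: "promoter", "orf", "cloning_site", "ires", "marker"
    name: str


def has_required_layout(elements: List[Element]) -> bool:
    """Check a construct listed 5' to 3' against the layout of the first aspect.

    Requires: a promoter first; a first region holding either a coding
    sequence or a site into which one can be inserted; then, 3' of that
    region, an IRES followed by at least one salvage pathway marker.
    """
    kinds = [e.kind for e in elements]
    if not kinds or kinds[0] != "promoter":
        return False
    if "ires" not in kinds:
        return False
    ires_index = kinds.index("ires")
    first_region = elements[1:ires_index]
    if not any(e.kind in ("orf", "cloning_site") for e in first_region):
        return False
    downstream_markers = [e for e in elements[ires_index + 1:] if e.kind == "marker"]
    return any(m.name in SALVAGE_MARKERS for m in downstream_markers)


example = [
    Element("promoter", "example promoter"),
    Element("orf", "protein of interest"),
    Element("ires", "IRES"),
    Element("marker", "HPRT"),
    Element("marker", "GFP"),
]
print(has_required_layout(example))
```

An empty vector, in which the first region is only a cloning site, passes the same check, reflecting the "into which can be incorporated" alternative in part (i).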

Another aspect of the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both:

(i) a first DNA region comprising either a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding one or more selectable markers, at least one of which selectable markers facilitates HAT-based selection;

Yet another aspect of the present invention provides a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

(i) a first DNA region comprising either a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding one or more selectable markers, at least one of which selectable markers facilitates antifolate salvage pathway-based selection;

Still another aspect of the present invention provides a plasmid-derived DNA expression construct said construct comprising a SRα promoter operably linked to both:

(i) a first DNA region comprising a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

Yet still another aspect of the present invention provides a plasmid-derived DNA expression construct said construct comprising a cytomegalovirus promoter operably linked to both:

(i) a first DNA region comprising a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

Still yet another aspect of the present invention provides a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding one or more selectable markers, at least one of which selectable markers is thymidine kinase or functional fragment, homologue, derivative or mimetic thereof;

A further aspect of the present invention provides a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

(i) a first DNA region comprising a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding one or more selectable markers, at least one of which selectable markers is HPRT or functional fragment, homologue, derivative or mimetic thereof;

Yet another further aspect of the present invention provides a DNA expression construct comprising a promoter operably linked to both:

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding (a) HPRT or thymidine kinase and (b) GFP or functional fragment, homologue, derivative or mimetic thereof;

wherein the expression of said construct would result in translation of both said protein of interest and said HPRT or thymidine kinase and GFP.

Still another further aspect of the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both:

(i) a first DNA region comprising a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

(ii) a second DNA region located in the 3' direction to the first DNA region, said second nucleic acid region comprising an internal ribosome entry site operably linked to a DNA sequence encoding (a) HPRT or thymidine kinase and (b) alkaline phosphatase or functional fragment, homologue, derivative or mimetic thereof;

wherein the expression of said construct would result in translation of both said protein of interest and said HPRT or thymidine kinase and alkaline phosphatase.

Yet another further aspect of the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest and optionally a signal sequence and/or GST; and

(ii) second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT or thymidine kinase and optionally GFP, alkaline phosphatase and/or a PEST sequence; and optionally

(iii) an antibiotic resistance gene optionally under its own promoter control; and optionally

(iv) one or more linearisation polylinkers

wherein the expression of said construct would result in translation of both said first and second DNA regions.

In one embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and a signal sequence; and

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT and GFP; and

(iii) an antibiotic resistance gene optionally under its own promoter control

In a second embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and a signal sequence; and

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT, GFP and a PEST sequence; and

(iii) an antibiotic resistance gene optionally under its own promoter control; and optionally

(iv) a linearisation polylinker

In a third embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding thymidine kinase; and

(iii) an antibiotic resistance gene optionally under its own promoter control

In a fourth embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(iii) an antibiotic resistance gene optionally under its own promoter control

In a fifth embodiment, the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and GST; and

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT, GFP and a PEST sequence; and

(iii) an antibiotic resistance gene optionally under its own promoter control

In a sixth embodiment, the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a

(iii) an antibiotic resistance gene optionally under its own promoter control

In a seventh embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(iii) an antibiotic resistance gene optionally under its own promoter control

wherein the expression of said construct would result in translation of both said first and second DNA regions.

Yet another aspect of the present invention is directed to host cells which have been transfected with the construct of the present invention.

A further aspect of the present invention is directed to a method of producing a protein of interest said method comprising transfecting the construct as hereinbefore defined into a host cell, culturing said host cell for a time and under conditions sufficient to express said construct and optionally purifying said protein.

Still yet another aspect of the present invention extends to the protein product produced by the expression constructs of the present invention.

The present invention also extends to the use of the expressed protein of interest in the treatment and diagnosis of patients. To this end, the present invention encompasses in vivo administration of the transfected cells themselves and/or the in vivo or in situ transfection of cells with the constructs of the present invention.

Still another aspect of the present invention contemplates a pharmaceutical composition comprising either a protein generated by the method of the present invention or a construct as hereinbefore defined together with one or more pharmaceutically acceptable carriers.

Yet another aspect of the present invention is directed to a kit for facilitating the expression of a protein of interest, said kit comprising a construct as hereinbefore defined and optionally reagents useful for:

(i) facilitating incorporation into the subject construct of a nucleic acid molecule encoding a protein of interest where the construct is empty;

(ii) effecting transfection of the construct into a host cell; and

(iii) facilitating application of the selection means which have been incorporated into the construct.

Yet another aspect of the present invention is directed to a method for screening for an agent capable of interacting with a protein of interest, said method comprising contacting a putative modulatory agent with a host cell of the present invention, which host cell expresses the protein of interest in membrane bound form and detecting an altered expression phenotype.

Still another aspect of the present invention is directed to agents identified in accordance with the screening method defined herein and to said agents for use in the methods of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic representation of Vector pHAT-1 (also known as vector A).

Figure 2 is a graphical representation of the number of cells, shown on the vertical axis, against fluorescence intensity on the horizontal axis (log scale). The left hand panel shows the FACS profile of untransfected cells. The right hand panel shows the FACS profile after HAT selection of transfectants using vector A. It is clear that the entire profile has shifted to the right, indicating that essentially 100% of cells are expressing GFP (and therefore the gene of interest), although there is marked heterogeneity in the level of expression. Note in particular the minor subpopulation present with fluorescence intensity greater than 10² units. This population of very highly expressing cells can be easily enriched by flow cytometry (see Figure 3).
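The kind of analysis summarised in Figure 2 can be sketched numerically. The following Python example is illustrative only: the intensity values are invented, and the helper names `fraction_above` and `decade_histogram` are hypothetical; only the 10² threshold is taken from the figure description.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def fraction_above(intensities: Sequence[float], threshold: float = 100.0) -> float:
    """Fraction of events whose fluorescence exceeds the threshold (default 10^2)."""
    if not intensities:
        raise ValueError("no events supplied")
    return sum(1 for value in intensities if value > threshold) / len(intensities)


def decade_histogram(intensities: Sequence[float]) -> Dict[int, int]:
    """Count events per log10 decade, mirroring a log-scale horizontal axis."""
    return dict(Counter(math.floor(math.log10(v)) for v in intensities if v > 0))


# Invented fluorescence values standing in for a sorted cell population.
events = [3.0, 8.0, 15.0, 40.0, 90.0, 150.0, 1200.0]

print(fraction_above(events))
print(decade_histogram(events))
```

A sorting gate set at the same threshold would retain only the events counted by `fraction_above`, which corresponds to the highly expressing subpopulation that is enriched by repeated sorting.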

Figure 3 is a graphical representation of the enrichment of highly expressing cells by repeated rounds of FACS sorting.

Figure 4 is a schematic representation of Vector pHAT-2 (also known as vector B).

Figure 5 is a schematic representation of Vector pHAT-3 (also known as vector C).

Figure 6 is a schematic representation of Vector pHAT-4 (also known as vector D).

Figure 7 is a schematic representation of Vector pHAT-5 (also known as Vector E).

Figure 8 is a schematic representation of Vector pHAT-6 (also known as Vector F).

Figure 9 is a schematic representation of vector pHAT-7 (also known as Vector G).

Figure 10 is a schematic representation of Vector pHAT-8 (also known as Vector H).

Figure 11 is a schematic representation of Vector pHAT-9 (also known as Vector I).

Figure 12 is a schematic representation of Vector pHAT-3.shATX (also known as Vector C.shATX).

Figure 13 is a schematic representation of Vector pHAT-10.shATX (also known as Vector pdSRalHEM.shATX).

Figure 14 is a schematic representation of Vector pRIRESEC-1.

Figure 15 is a schematic representation of Vector pHAT-11 (also known as Vector K).

Figure 16 is a schematic representation of Vector pHAT-12 (also known as Vector L).

Figure 17 is a schematic representation of Vector pHAT-13 (also known as Vector K).

DETAILED DESCRIPTION OF THE INVENTION

The present invention is predicated, in part, on the development of a highly efficient and reliable nucleic acid expression system, more particularly a highly efficient and reliable bicistronic expression system. Whereas existing expression systems, even existing bicistronic expression systems, have exhibited significant disadvantages in terms of the selection of successfully transfected cells and the identification of high copy number expressors, the expression cassette of the present invention overcomes this problem by virtue of the introduction of an antifolate salvage pathway-based selection system which can itself be optionally designed to facilitate the generation of host cells expressing high levels of the protein of interest. Still further, the use of such a selection system, in particular a HAT-based selection system, provides improved choice with respect to potential host cell types since the generation of HAT sensitive cell lines is a relatively simple and routine procedure (see Ringertz NR and Savage RE Cell Hybrids. Academic Press 1976, especially pages 150-154, Selection of Hybrids Made from Drug Resistant Cells, which describes the process and gives examples). The expression construct of the present invention can also be optionally modified to incorporate additional selection markers, defined linearisation polylinker sites and specificity in terms of the post-translational localisation of the protein of interest. This development now provides a highly flexible and simple means of reliably producing protein molecules for use in a range of diagnostic, therapeutic, prophylactic or research based applications.

Accordingly, one aspect of the present invention provides a nucleic acid expression construct said construct comprising a promoter operably linked to both:

Reference to a "nucleic acid" should be understood as a reference to both deoxyribonucleic acid and ribonucleic acid thereof. To this end, the term "expression" refers to the transcription and translation of DNA or the translation of RNA resulting in the synthesis of a peptide, polypeptide or protein. A DNA construct, for example, corresponds to the construct which one may seek to transfect into a host cell for subsequent expression while an example of an RNA construct is the RNA molecule transcribed from a DNA construct, which RNA construct merely requires translation to generate the protein of interest. Reference to "expression product" is a reference to the product produced from the transcription and translation of a nucleic acid molecule. Although the present invention is preferably directed to the transfection of a DNA vector into a host cell for the purpose of producing a protein of interest, the construct of the present invention should be understood to extend to an RNA molecule corresponding to the subject DNA construct, whether generated via transcription of the DNA construct or otherwise synthetically generated. Preferably the subject nucleic acid molecule is a deoxyribonucleic acid molecule.

The present invention therefore more particularly provides a DNA expression construct said construct comprising a promoter operably linked to both:

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding one or more selectable markers, at least one of which selectable markers facilitates HAT-based selection;

Reference to a nucleic acid "expression construct" should be understood as a reference to a nucleic acid molecule which is transmissible to a host cell and designed to undergo transcription. The RNA molecule transcribed therefrom is preferably translated by the host cell machinery, although it should be understood that in some situations it may be the RNA product which is the required end product. In general, expression constructs are also referred to by a number of alternative terms, which terms are widely utilised interchangeably, including "expression cassette" and "vector".

The expression construct of the present invention may be generated by any suitable method including recombinant or synthetic techniques. To this end, the subject construct may be constructed from first principles, as would occur where an entirely synthetic approach is utilised, or it may be constructed by appropriately modifying an existing cloning vector.

Where one adopts the latter approach, the range of vectors which could be utilised as a starting point is extensive and includes, but is not limited to:

(i) Plasmids

Plasmids are small independently replicating pieces of cytoplasmic DNA, generally found in prokaryotic cells, which are capable of autonomous replication. Plasmids are commonly used in the context of molecular cloning due to their capacity to be transferred from one organism to another. Without limiting the present invention to any one theory or mode of action, plasmids can remain episomal or they can become incorporated into the genome of a host. Examples of plasmids which one might utilise include the bacterial derived pBR322, the pUC series, the Bluescript series, the pGEM series, the pLITMUS series and many others which would be well known to those of skill in the art.

(ii) Bacteriophage

Bacteriophages are viruses which infect and replicate in bacteria. They generally consist of a core of nucleic acid enclosed within a protein coat (termed the capsid). Depending on the type of phage, the nucleic acid may be either DNA (single or double stranded) or RNA (single stranded) and may be either linear or circular. Phages may be filamentous, polyhedral or polyhedral and tailed, the tails being tubular and having one or more tail fibres attached. Phages can generally accommodate larger fragments of foreign DNA than, for example, plasmids. Examples of phages include, but are not limited to, the E. coli lambda phages, P1 bacteriophage and the T-even phages (e.g. T4).

(iii) Baculovirus

These are any of a group of DNA viruses which multiply only in invertebrates and are generally classified in the family Baculoviridae. Their genome consists of double-stranded circular DNA.

(iv) Artificial Chromosomes

Artificial chromosomes such as yeast artificial chromosomes or bacterial artificial chromosomes.

(v) Hybrid vectors such as cosmids, phagemids and phasmids

Cosmids are generally derived from plasmids but also comprise cos sites for lambda phage while phagemids represent a chimaeric phage-plasmid vector. Phasmids generally also represent a plasmid-phage chimaera but are defined by virtue of the fact that they contain functional origins of replication of both. Phasmids can therefore be propagated either as a plasmid or a phage in an appropriate host strain.

(vi) Commercially available vectors which are themselves entirely synthetically generated or are modified versions of naturally occurring vectors. Examples of such vectors include, but are not limited to, pIRES1neo, pIRES1hyg, pIRES2-EGFP.

It would be understood by the person of skill in the art that the selection of an appropriate vector for modification, to the extent that one chooses to do this rather than synthetically generate a construct, will depend on a number of factors including the type of host cell in which it is desired to express the construct and the amount of DNA which is sought to be introduced to the construct. In terms of the former issue, it is generally understood that certain vectors are more readily transfected into certain host cell types. For example, the range of cell types which can act as a host for a given plasmid may vary from one plasmid type to another. In another example, baculoviruses are known to be more suitable for transfection of insect cells than plasmids. In still yet another example, the larger the DNA insert which is required to be inserted, the more limited the choice of vector from which the expression construct of the present invention is generated. To this end, the size of the inserted DNA can vary depending on factors such as the size of the DNA sequence encoding the protein of interest, the number of selection markers which are utilised and the incorporation of features such as PEST sequences, linearisation polylinker regions and the like. Each of these features is discussed in more detail hereinafter. Where a long piece of DNA is sought to be inserted it may be more appropriate to utilise a bacteriophage or hybrid vector, such as a cosmid vector. However, in a preferred embodiment, one would utilise a plasmid as the basis for generating the construct of the present invention due to its ease of availability, wide choice of plasmid types, the extensiveness of their molecular characterisation and relative ease of molecular manipulation.

The present invention therefore preferably provides a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

By "plasmid-derived" is meant that the subject construct corresponds to a plasmid which has been modified to introduce the features of the construct of the present invention. To this end, the plasmid which is modified may correspond to a naturally occurring plasmid or it may correspond to a plasmid which has already undergone some degree of modification. These pre-existing modifications may be entirely unrelated to the features which are characteristic of the construct of the present invention or they may conveniently correspond to one or more of these features, such as the use of a plasmid which has previously been modified to incorporate an IRES sequence. A corresponding meaning should be understood to apply to constructs which are "bacteriophage-derived", "baculovirus-derived", "cosmid-derived" etc.

The expression construct of the present invention may be of any form including circular or linear. In this context, a "circular" nucleotide sequence should be understood as a reference to the circular nucleotide sequence portion of any nucleotide molecule. For example, the nucleotide sequence may be completely circular, such as a plasmid, or it may be partly circular, such as the circular portion of a nucleotide molecule generated during rolling circle replication (this may be relevant, for example, where a construct is being replicated by this type of method rather than via a cellular based cloning system). In this context, the "circular" nucleotide sequence corresponds to the circular portion of this molecule. A "linear" nucleotide sequence should be understood as a reference to any nucleotide sequence which is in essentially linear form. The linear sequence may be a linear nucleotide molecule or it may be a linear portion of a nucleotide molecule which also comprises a non-linear portion such as a circular portion. An example of a linear nucleotide sequence includes, but is not limited to, a plasmid derived construct which has been linearised in order to facilitate its integration into the chromosomes of a host cell or a construct which has been synthetically generated in linear form. To this end, it should also be understood that the configuration of the construct of the present invention may or may not remain constant. For example, a circular plasmid-derived construct may be transfected into a host cell where it remains a stable circular episome which undergoes replication and transcription in this form. However, in another example, the subject construct may be one which is transfected into a cell in circular form but undergoes intracellular linearisation prior to chromosomal integration. 
This is not necessarily an ideal situation since such linearisation may occur in a random fashion and potentially cleave the construct in a crucial region thereby rendering it ineffective. However, in another example and in particular in the context of constructs which incorporate a defined linearisation polylinker region, the construct may be designed and generated in a circular form but may be linearised prior to transfection into a host cell, thereby enabling both the induction of a predetermined cleavage event and the chromosomal integration of a linear form of the construct.

The nucleic acid molecules which are utilised in the method of the present invention are derivable from any human or non-human source. Non-human sources contemplated by the present invention include primates, livestock animals (eg. sheep, pigs, cows, goats, horses, donkeys), laboratory test animals (eg. mice, hamsters, rabbits, rats, guinea pigs), domestic companion animals (eg. dogs, cats), birds (eg. chicken, geese, ducks and other poultry birds, game birds, emus, ostriches), captive wild or tamed animals (eg. foxes, kangaroos, dingoes), reptiles, fish, insects, prokaryotic organisms or synthetic nucleic acids. Non-human sources also include plant sources such as rice, wheat, maize, barley or canola.

It should be understood that the constructs of the present invention may comprise nucleic acid material from more than one source. For example, whereas the construct may originate from a bacterial plasmid, in modifying that plasmid to introduce the features defined herein nucleic acid material from non-bacterial sources may be introduced. These sources may include, for example, viral DNA (e.g. IRES DNA), mammalian DNA (e.g. the DNA encoding the protein of interest) or synthetic DNA (e.g. to introduce specific restriction endonuclease sites via a polylinker). Still further, the cell type in which it is proposed to express the subject construct may be different again in that it does not correspond to the same organism as all or part of the nucleic acid material of the construct. For example, a construct consisting of essentially bacterial and viral derived DNA may nevertheless be expressed in mammalian host cells, this being particularly likely where the protein of interest requires glycosylation or some other form of post-translational modification which cannot be provided by a bacterial host cell.

Reference to nucleic acid "derivatives" should be understood to include reference to fragments, parts, portions, mutants, or homologs of nucleic acid molecules from natural, synthetic or recombinant sources. "Functional derivatives" should be understood as derivatives which exhibit any one or more of the functional activities of nucleic acid molecules. The derivatives of said nucleic acid sequences include fragments having particular regions of the sequence fused to other proteinaceous or non-proteinaceous molecules (this being particularly useful, for example, in the context of detection tags, such as a fluorescent tag, or some other non-nucleic acid component which facilitates some aspect of its functioning such as its targetting or entry to a cell). The biotinylation of the nucleotide or nucleic acid sequence is an example of a "functional derivative" as herein defined. Derivatives of DNA sequences may be derived from single or multiple nucleotide substitutions, deletions and/or additions.

Analogs contemplated herein include, but are not limited to, modifications to the nucleotide or nucleic acid sequence such as modifications to its chemical makeup or overall conformation. This includes, for example, modification to the manner in which nucleotides or nucleic acid sequences interact with other nucleotides or nucleic acid sequences such as at the level of backbone formation or complementary base pair hybridisation. Analogs of the subject nucleic acid molecule include the polymeric form of ribonucleosides (adenosine, guanosine, uridine or cytidine) or deoxyribonucleosides (deoxyadenosine, deoxyguanosine, deoxythymidine, or deoxycytidine), or any phosphoester analogs thereof, such as phosphorothioates and thioesters, in single stranded form, double-stranded form or otherwise.

As detailed hereinbefore, the expression construct of the present invention facilitates the highly efficient and reliable generation of host cell transfectants which express a protein of interest. To this end, reference to "protein of interest" should be understood as a reference to any protein molecule which one is seeking to express. This protein may be a eukaryotic or a non-eukaryotic protein and it may correspond to all or just part of the naturally occurring form of the protein. For example, one may seek to merely express an epitopic region of an antigen or a binding region of a protein such as an enzyme. Alternatively, it may be desired to express the full-length form of the protein, such as would be required where one is seeking to produce large quantities of a fully functional molecule, for example for use as a drug. If the subject protein requires some form of post-translational modification in order to exhibit functionality, assuming that one is seeking to produce a functionally active protein, it would be necessary to ensure that the construct encoding this protein is expressed in a host cell which is capable of the required post-translational modification.

The term "protein" should be understood to encompass peptides, polypeptides and proteins. It should also be understood that these terms are used interchangeably herein. The protein may be glycosylated or unglycosylated and/or may contain a range of other molecules fused, linked, bound or otherwise associated to the protein such as lipids, carbohydrates or other peptides, polypeptides or proteins (such as would occur where the protein of interest is produced as a fusion protein with another molecule, for example GST). Reference hereinafter to a "protein" includes a protein comprising a sequence of amino acids as well as a protein associated with other molecules such as amino acids, lipids, carbohydrates or other peptides, polypeptides or proteins. The subject protein is also one which may or may not have undergone post-translational modification, depending on the host cell by which it was produced. Accordingly, the protein may or may not exhibit all or some of its usual functional attributes.

In terms of the expression of a protein of interest, it should be understood that the present invention encompasses both vectors which incorporate DNA encoding the protein of interest and "empty vectors", these being vectors which exhibit all of the features of the bicistronic vectors herein defined with the exception of the incorporation of the DNA encoding the protein of interest. The skilled person would appreciate that the present invention lies with the overall construction of the functional and structural features of the present vectors and not with whether or not the DNA encoding the protein of interest has actually been incorporated. One would appreciate that in terms of a commercial product, it is likely that vectors would be provided which do not incorporate the protein of interest, the final step of incorporating DNA encoding the protein of interest being taken by the party wishing to express that protein. Accordingly, reference to the first nucleic acid region comprising a "nucleic acid sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest" should be understood as a reference to a vector which has been designed to accommodate the insertion of DNA encoding a protein of interest into a defined site in the vector, but which incorporation has not actually occurred. Such a vector is also commonly termed an "empty vector".

Proteins which may be particularly useful to express include antibodies, blood clotting factors such as factor VII and VIII, fragments of cell surface receptors such as the TNF receptor, extracellular domains of membrane glycoproteins, glycoprotein hormones, growth factors and enzymes.

The construct of the present invention has been defined in terms of it comprising a promoter operably linked to first and second nucleic acid regions. As discussed in detail hereinafter, it should be understood that both of these nucleic acid regions, in addition to the construct as a whole, may comprise nucleic acid components additional to the components specifically detailed herein.

Without limiting the present invention to any one theory or mode of action, a promoter is a region of DNA to which RNA polymerase binds before initiating the transcription of DNA to RNA. The nucleotide at which transcription starts is designated +1 and nucleotides are numbered from this with negative numbers indicating upstream nucleotides and positive downstream nucleotides. Most bacterial promoters contain two consensus sequences that seem to be essential for the binding of the polymerase. The first, the Pribnow box, is at about -10 and has the consensus sequence 5'-TATAAT-3'. The second, the -35 sequence, is centred about -35 and has the consensus sequence 5'-TTGACA-3'. Most factors that regulate gene transcription do so by binding at or near the promoter and affecting the initiation of transcription. Much less is known about eukaryote promoters; each of the three RNA polymerases has a different promoter. RNA polymerase I recognizes a single promoter for the precursor of rRNA. RNA polymerase II, that transcribes all genes coding for polypeptides, recognizes many thousands of promoters. Most have the Goldberg-Hogness or TATA box that is centred around position -25 and has the consensus sequence 5'-TATAAAA-3'. Several promoters have a CAAT box around -90 with the consensus sequence 5'-GGCCAATCT-3'.
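By way of illustration only, the numbering convention and consensus sequences described above can be sketched in Python. The function name `find_motifs`, the `tss_index` argument and the toy sequence are hypothetical and form no part of the invention. Exact string matching is a simplifying assumption, since naturally occurring promoters frequently deviate from the consensus.

```python
# Consensus sequences quoted in the passage above.
CONSENSUS = {
    "Pribnow box (-10)": "TATAAT",
    "-35 sequence": "TTGACA",
    "TATA box (-25)": "TATAAAA",
    "CAAT box (-90)": "GGCCAATCT",
}


def find_motifs(sequence, tss_index):
    """Return exact consensus matches as (name, position relative to +1).

    `tss_index` is the 0-based index of the +1 nucleotide (a hypothetical
    input: the transcription start site has to be known beforehand).
    Upstream positions are negative and there is no position 0.
    """
    seq = sequence.upper()
    hits = []
    for name, motif in CONSENSUS.items():
        start = seq.find(motif)
        while start != -1:
            offset = start - tss_index
            position = offset + 1 if offset >= 0 else offset
            hits.append((name, position))
            start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Toy bacterial-style promoter: motifs placed to start at -35 and -10,
# with the +1 nucleotide ("A") at index 40.
toy = "C" * 5 + "TTGACA" + "C" * 19 + "TATAAT" + "C" * 4 + "A" + "C" * 9
print(find_motifs(toy, tss_index=40))
```

In this sketch the reported position is that of the first nucleotide of each motif, which is one of several possible conventions.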

Reference to "promoter" should therefore be understood as a reference to any suitable promoter which can be selected by the person of skill in the art. The promoter may be one which is naturally present in the vector which formed the basis for the generation of the subject construct or it may be one which has been introduced due to its unique properties. Preferably, the subject promoter is SRα or cytomegalovirus, although SRα is particularly preferred for use with mammalian cells due to its strong activity in myeloma cells and CHO cells. Reference to "SRα" and to promoters, in general, or any other protein herein discussed (such as HPRT, TK, IRES, DNA etc.), should be understood as a reference to all forms of these proteins and to functional derivatives, homologues, variants and mimetics thereof. This includes, for example, any isoforms which arise from alternative splicing of the subject protein or functional mutants or polymorphic variants of these proteins. Without limiting the present invention to any one theory or mode of action, SRα is a particularly strong promoter which encourages high level expression of the DNA encoding the protein of interest. Other promoters which could be utilised include, but are not limited to, any mammalian promoter, in particular those which give strong expression in the host cells of interest such as CMV, EF-1 alpha, SV40 early, β-actin, phosphoglycerokinase, thymidine kinase, HTLV-I long terminal repeat and the adenovirus major late promoter.

The present invention therefore preferably provides a plasmid-derived DNA expression construct said construct comprising an SRα promoter operably linked to both:

The present invention therefore preferably provides a plasmid-derived DNA expression construct said construct comprising a cytomegalovirus promoter operably linked to both:

Preferably, said construct is plasmid-derived.

Reference to the subject promoter being "operably linked" to the first and second nucleic acid regions should be understood as a reference to the promoter being incorporated into the construct such that it initiates transcription of both the first and second nucleic acid molecules as a single RNA transcript. Accordingly, one of skill in the art would appreciate that the promoter need not necessarily be positioned directly adjacent and 5' to the first nucleic acid region nor need the second nucleic acid region be positioned directly adjacent and 3' to the first nucleic acid region. In fact, and as hereinbefore briefly introduced, the subject construct may comprise additional nucleic acid components either within or adjacent to the subject first and second nucleic acid regions. These are discussed in more detail hereinafter. Of primary functional importance, however, is that the promoter generates a single RNA transcript which minimally comprises transcript corresponding to the first and second nucleic acid regions.
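By way of illustration only, the ordering requirement described above can be pictured as a check over an ordered list of elements. In the following Python sketch the class `Element`, the function `is_bicistronic_layout` and the element kinds and names are hypothetical labels chosen for the example and are not terms defined herein. The sketch checks 5' to 3' order only and says nothing about whether transcription would actually occur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str  # "promoter", "first_region", "ires", "marker_region" or "other"
    name: str


# The four components that must appear, in 5' -> 3' order.
REQUIRED_ORDER = ["promoter", "first_region", "ires", "marker_region"]


def is_bicistronic_layout(elements):
    """True if each required component occurs exactly once, in order.

    Elements of kind "other" (for example a polylinker) may sit anywhere
    between them, reflecting that the regions need not be directly adjacent.
    """
    kinds = [e.kind for e in elements if e.kind in REQUIRED_ORDER]
    return kinds == REQUIRED_ORDER


layout = [
    Element("promoter", "SR-alpha"),
    Element("other", "polylinker"),
    Element("first_region", "protein of interest"),
    Element("ires", "IRES"),
    Element("marker_region", "HPRT-GFP fusion"),
]
print(is_bicistronic_layout(layout))
```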

The construct of the present invention is effectively a bicistronic construct in that it has been designed such that the nucleic acid of interest and the selectable marker are not separated during host cell transfection. However, whereas bicistronic constructs have been previously known and used, the construct of the present invention represents a significant improvement in that, inter alia, it has been designed to utilise a more efficacious selection system, being an antifolate salvage pathway-based selection system (in particular a HAT-based selection system), the incorporation of which can itself be designed to provide functional outcomes such as more flexibility in terms of host cell choice (eg. cells can be used even where transfection efficiency is generally low) and the generation of host cells expressing high levels of the protein of interest. These features overcome limitations which have been inherent in previously available bicistronic vector based expression systems. The optional incorporation of still further features such as additional selection means (eg. GFP and/or GST selection means), a linearisation polylinker and means for facilitating adherent cell surface expression of the protein of interest renders the construct of the present invention highly advantageous.

Without limiting the present invention to any one theory or mode of action, most animal cells can synthesize the purine and pyrimidine nucleotides de novo from simple carbon and nitrogen compounds, rather than from already formed purines and pyrimidines. The folic acid antagonists amethopterin and aminopterin interfere with the donation of methyl and formyl groups by tetrahydrofolic acid in the early stages of de novo synthesis of glycine, purine nucleoside monophosphates, and thymidine monophosphate. These drugs are called antifolates, since they block reactions involving tetrahydrofolate, an active form of folic acid. Many cells, however, contain enzymes that can synthesize the necessary nucleotides from purine bases and thymidine if they are provided in the medium. These salvage pathways (herein referred to as "antifolate salvage pathways") bypass the metabolic blocks imposed by antifolates.

A number of mutant cell lines lacking the enzyme needed to catalyze one of the steps in a salvage pathway have been isolated. For example, cell lines lacking thymidine kinase or HPRT cannot utilise the antifolate salvage pathway since both of the enzymes are critically required for the functioning of this path. In order for such cells to grow in the presence of an antifolate they require restoration of the enzyme which they are lacking. The restitution of these enzymes is therefore useful as a selection marker. To this end, the cell culture medium most often used to select cells based on restoration of the antifolate salvage pathway is HAT medium. Normal cells can grow in HAT medium due to the fact that although aminopterin blocks de novo synthesis of purines and TMP, the thymidine in the media is transported into the cell and converted to TMP by thymidine kinase and the hypoxanthine is transported and converted into usable purines by HPRT. On the other hand, neither TK⁻ nor HPRT⁻ cells can grow in HAT medium because each lacks an enzyme of the salvage pathway. However, cells in which production of the missing enzyme has been restored will produce a functioning salvage pathway and grow in HAT medium.

Accordingly, reference to a nucleic acid sequence encoding a "selectable marker" should be understood as a reference to a nucleic acid molecule encoding a protein which facilitates the identification and/or isolation of cells expressing that protein. To this end, the expression construct of the present invention minimally encodes a protein which facilitates the application of antifolate salvage pathway-based selection. Reference to "antifolate salvage pathway-based selection" should be understood as a reference to a selection system in which host cells lacking the capacity to produce functional thymidine kinase or HPRT are exposed to an antifolate and grow only if the construct of the invention, which expresses either thymidine kinase or HPRT, has been successfully incorporated, thereby enabling functioning of the salvage pathway. Preferably, the subject selection system is a HAT-based selection system. Reference to "HAT-based selection" should be understood as a reference to a selection system in which host cells which are grown in the presence of hypoxanthine, aminopterin and thymidine (herein referred to as "HAT") are unable to grow unless the cell has been successfully transfected with the construct of the present invention. In terms of the present invention, the HAT based selection system is preferably one which is designed for use with a host cell lacking the capacity to produce either thymidine kinase or HPRT. Accordingly, restoration of the salvage pathway is preferably effected by designing the construct of the invention to express the gene encoding either thymidine kinase or HPRT, the choice of which enzyme depending on the nature of the host cell which is sought to be utilised.
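By way of illustration only, the logic of HAT-based selection described above reduces to a simple rule: in the presence of aminopterin a cell grows only if both salvage enzymes are available, whether endogenously or from the construct. The following Python sketch encodes that rule. The function name `grows_in_hat` and the string labels are hypothetical, and the model deliberately ignores expression level, medium composition and all other biological variables.

```python
def grows_in_hat(endogenous, construct_encodes=frozenset()):
    """Predict growth in HAT medium from the salvage enzymes available.

    Aminopterin blocks de novo synthesis, so the cell needs both salvage
    enzymes: thymidine kinase (thymidine -> TMP) and HPRT (hypoxanthine ->
    usable purines). An enzyme may be endogenous or construct-encoded.
    """
    available = set(endogenous) | set(construct_encodes)
    return {"TK", "HPRT"} <= available


# A normal cell, an HPRT-deficient host, and that host after transfection
# with a construct encoding HPRT.
print(grows_in_hat({"TK", "HPRT"}))
print(grows_in_hat({"TK"}))
print(grows_in_hat({"TK"}, construct_encodes={"HPRT"}))
```

The sketch also shows why the marker must be matched to the host cell: a construct encoding HPRT does not rescue a host lacking thymidine kinase.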

Accordingly, in one preferred embodiment the present invention provides a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

In another preferred embodiment there is provided a plasmid-derived DNA expression construct said construct comprising a promoter operably linked to both:

In accordance with those embodiments, said promoter is preferably cytomegalovirus or SRα.

More preferably, said construct is plasmid derived.

It has also been determined that although the construct of the present invention provides significant selection advantages by virtue of its design to function in a HAT-based selection system, still further flexibility and benefits can be provided by incorporating DNA encoding additional selection markers for expression as a fusion protein with thymidine kinase or HPRT. It should therefore be understood that although the present invention contemplates the incorporation of still further additional selection markers which are not intended to be expressed as a fusion protein with thymidine kinase or HPRT (eg. antibiotic resistance genes under the control of separate promoters or GST genes for expression as a fusion with the protein of interest - these features being discussed in more detail hereinafter) this particular embodiment of the invention is directed to the expression of two or more selectable markers, but preferably two, as a fusion protein. To this end, it has been determined that the production of a thymidine kinase or HPRT fusion with GFP results in both functional thymidine kinase/HPRT and GFP production, this being an important determination since the secondary and tertiary folding patterns of each of the expression products must occur correctly in order to result in appropriate functionality. When protein molecules are expressed as a fusion, their correct folding must still occur in order to result in functionality. However, due to the presence of the fused additional protein, this is not always possible. Accordingly, in the context of the expression of any fusion protein, the person of skill in the art must determine whether the correct folding of the fused proteins has occurred. This can be determined by any one of a range of techniques, including functional analyses.

Accordingly, said second selectable marker is preferably GFP (green fluorescent protein). Other markers which could be expressed as a fusion with TK or HPRT include red fluorescent protein, yellow fluorescent protein, cyan fluorescent protein or any other fluorescent protein or mimetic or functional derivative or variant. In particular, for the expression of antibodies or other proteins consisting of two or more different protein chain subunits, transfection with a combination of two such vectors, each expressing one of the requisite subunits, with different coloured fluorescent proteins as fusions with HPRT or TK and each containing DNA encoding one such subunit would allow isolation of cells expressing both subunits, using flow cytometry. For example, a vector containing GFP and DNA encoding an antibody heavy chain could be used to transfect a cell line which is also transfected with a vector containing red fluorescent protein (RFP) and DNA encoding an antibody light chain, and cells expressing both chains could be isolated by flow cytometry, selecting cells that are both red and green fluorescent. This process could be extended to encompass proteins made up of three or more different chains, with use of vectors containing different coloured fluorescent proteins. Cells expressing all subunits could be isolated by flow cytometry.
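By way of illustration only, the two-colour isolation step described above amounts to keeping those cells whose green and red fluorescence both exceed a gate. In the following Python sketch the function name `double_positive`, the tuple layout and the threshold values are hypothetical. In practice gates would be set from untransfected and single-colour control populations.

```python
def double_positive(cells, green_threshold, red_threshold):
    """Return IDs of cells whose green AND red signals exceed the gates.

    `cells` is an iterable of (cell_id, green_signal, red_signal) tuples,
    e.g. GFP reporting the heavy-chain vector and RFP the light-chain vector.
    """
    return [
        cell_id
        for cell_id, green, red in cells
        if green > green_threshold and red > red_threshold
    ]


# Made-up fluorescence values in arbitrary units.
events = [
    ("c1", 900, 850),  # both vectors expressed
    ("c2", 900, 20),   # green only
    ("c3", 15, 870),   # red only
    ("c4", 10, 12),    # neither
]
print(double_positive(events, green_threshold=100, red_threshold=100))
```

Extending the approach to three or more subunits would add one further signal and one further threshold per additional fluorescent protein.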

GFP has also been found to be particularly useful in that whereas a HAT based selection system provides a means of quickly, efficiently and simply selecting for successfully transfected cells in the context of a straightforward cell-culturing based system, the co-expression of GFP (which is less desirable for high throughput screening to identify transfectants due to the more complex screening mechanism which is required to be performed) nevertheless provides a particularly useful and accurate means of identifying transfectants which are expressing the protein of interest at high levels. Without limiting the present invention to any one theory or mode of action, one of the major factors in determining the expression levels of a recombinantly produced protein can be the site of integration of the gene encoding that protein in the host cell chromosome, to the extent that integration occurs rather than episomal replication and expression. Cells which exhibit favourable integration sites often exhibit higher levels of expression than cells which have not integrated the linearised construct in a favourable position. To the extent that one utilises a site directed technique for integrating heterologous DNA, such as homologous recombination, this problem may be minimised. However, since most recombinant expression systems are based on largely random integration of heterologous DNA, the selection of both successful transfectants and, thereafter, those transfectants producing the highest levels of the protein of interest is of importance.
Accordingly, although it is quite cumbersome to utilise GFP expression for the purpose of selecting successfully transfected cells, since this involves the application of sterile FACS sorting of large populations of cells, the application of this technique to the subpopulations of cells which have been determined, via HAT-based selection, to have been successfully transformed provides a less cumbersome but highly reliable and accurate means of identifying cells producing high levels of the protein of interest. However, it should also be understood that although GFP expression is a preferred method of achieving this objective, it is not the only means by which this objective can be achieved. Other selection markers suitable for use in this regard include secretory alkaline phosphatase, for example produced under the influence of a second IRES or alternatively via construction of a fusion protein between HPRT and secretory alkaline phosphatase, in which a cleavable secretory signal sequence is present between the two proteins allowing simultaneous expression of HPRT in the cytoplasm and secretion of alkaline phosphatase. The latter embodiment may be facilitated by a short spacer or "linker" consisting of several glycine residues (typically 5) immediately on the amino (5') side of the secretory signal sequence of alkaline phosphatase to allow access of the signal sequence to binding by the signal recognition particle. In those cases where the secretory alkaline phosphatase is synthesised from the same mRNA as the protein of interest, the secreted alkaline phosphatase can be detected in the culture supernatant by virtue of its enzyme activity. Clones with the highest alkaline phosphatase activity in the supernatant would also be expected to have the highest expression of the protein of interest.
The advantage of such a system is that hundreds of clones could be rapidly and easily screened for high expression by testing their supernatants, obviating the need for expensive and sophisticated equipment such as flow cytometry.

The present invention therefore preferably provides a DNA expression construct comprising a promoter operably linked to both:

In another preferred embodiment the present invention provides a DNA expression construct, said construct comprising a promoter operably linked to both:

(i) a first DNA region comprising a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest; and

(ii) a second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an internal ribosome entry site operably linked to a DNA sequence encoding (a) HPRT or thymidine kinase and (b) secretory alkaline phosphatase or functional fragment, homologue, derivative or mimetic thereof;

wherein the expression of said construct would result in translation of both said protein of interest and said HPRT or thymidine kinase and secretory alkaline phosphatase.

Preferably, said promoter is cytomegalovirus or SRα.

More preferably, said construct is plasmid-derived.

As detailed hereinbefore, the construct of the present invention is bicistronic in that its design ensures the production of both the protein of interest and the selectable markers of the second nucleic acid region. This is achieved first by virtue of placing the transcription of both nucleic acid molecules under the control of a single promoter such that a single transcript is generated. However, in order to facilitate the production of the protein of interest in a form which is not fused to the selectable markers as a single expression product, the second nucleic acid region is located immediately 3' to an IRES sequence, the IRES sequence itself being located 3' to the first nucleic acid region. Without limiting the present invention to any one theory or mode of action, RNA is generally translated from the first AUG downstream of the 5' cap.

Initiation of translation in this way also depends on the so-called "Kozak rules" which set out the sequences adjacent to the initiator AUG codon that are required for efficient initiation of translation. For example, if the first base of the AUG codon is assigned the number +1, position -3 is usually a purine (either A or G), and position +4 is usually G.
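By way of illustration only, the two positions described above can be checked programmatically. The following is a minimal sketch and not part of the exemplified constructs; the function name `kozak_check` and the test sequences are hypothetical, and only positions -3 and +4 are examined.

```python
PURINES = {"A", "G"}


def kozak_check(mrna: str, aug_index: int) -> dict:
    """Check positions -3 and +4 around an initiator AUG.

    Numbering follows the text: the A of AUG is +1 and there is no
    position 0, so -3 lies three bases upstream of the A, and +4 is
    the base immediately after the G of AUG.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA input
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = seq[aug_index - 3] if aug_index >= 3 else None
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else None
    return {"minus3_purine": minus3 in PURINES, "plus4_G": plus4 == "G"}


# Hypothetical example sequences (not taken from the specification)
good_context = kozak_check("GCCACCAUGG", 6)
poor_context = kozak_check("UUUUUUAUGC", 6)
```

A result in which both entries are true indicates that the AUG sits in the context described above; it does not by itself predict translation efficiency.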

However, translation can also be initiated directly at an IRES sequence. Accordingly, an IRES sequence facilitates the translation of RNA from an internal RNA site thereby enabling the translation of two or more separate protein coding sequences which are contained on the same mRNA transcript, without the requirement that they be translated as a fusion protein. In the context of the present invention, the IRES sequence is "operably linked" to the nucleic acid sequence encoding the selectable markers of the second nucleic acid region. Reference to "operably linked" in this context should be understood to have a meaning corresponding to the meaning earlier provided. Specifically, the subject IRES sequence is incorporated into the construct such that its RNA counterpart initiates translation of the second nucleic acid region. Accordingly, one of skill in the art would appreciate that the IRES sequence need not be positioned directly adjacent and 5' to the second nucleic acid region, although this would be preferable. Rather, the IRES sequence need only be positioned such that it achieves the objective of the translation of the second nucleic acid region.

It should be understood that the vectors of the present invention may be designed in polycistronic form. That is, the vector contains more than one IRES sequence, enabling the translation of more than two proteins from the one mRNA transcript. Without limiting the present invention to any one theory or mode of action, and by way of exemplification only, a construct of this type would be useful where a second IRES is used to drive the synthesis of a secretory form of alkaline phosphatase or any other secreted enzyme that is easily detected in the culture supernatant (Cullen et al., 1992, Methods in Enzymology 216:362-368). In this case it would be possible to select clones with high expression by a simple colour reaction in which culture supernatants are tested for enzyme activity, rather than the use of flow cytometry which requires expensive and sophisticated equipment. A small aliquot of culture supernatant could be quickly and routinely tested. The rapidity of development of a yellow colour would be a measure of the expression level. The relatively large dilution factor would render the absorbance of the phenol red in the culture medium negligible in relation to the colour produced by cleavage of the substrate by alkaline phosphatase.
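By way of illustration only, the ranking of clones by the rapidity of colour development could be sketched as follows. The absorbance readings, time points and clone names are hypothetical values invented for this example, and the least-squares slope is merely one assumed way of expressing the rate of colour development.

```python
def slope(times, values):
    """Least-squares slope of absorbance against time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def rank_clones(times, readings):
    """Return clone names ordered from fastest to slowest colour development."""
    rates = {clone: slope(times, vals) for clone, vals in readings.items()}
    return sorted(rates, key=rates.get, reverse=True)


# Hypothetical absorbance readings of diluted supernatants over 15 minutes
times_min = [0, 5, 10, 15]
readings = {
    "clone_A": [0.05, 0.10, 0.15, 0.20],
    "clone_B": [0.05, 0.25, 0.45, 0.65],
    "clone_C": [0.05, 0.06, 0.07, 0.08],
}
ranked = rank_clones(times_min, readings)
```

In this sketch the clone listed first would be the candidate expected to show the highest expression of the protein of interest, subject to confirmation by direct assay.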

As detailed hereinbefore, reference to the protein molecules of the present invention such as thymidine kinase, HPRT, GFP, alkaline phosphatase, hypoxanthine/aminopterin/thymidine (abbreviated as "HAT") in addition to molecules such as enhancers which are hereinafter described in more detail should be understood as a reference to all forms of these proteins and to functional fragments, homologues, derivatives or mimetics thereof, including isoforms, functional mutants or polymorphic variants. In particular, for example, molecules such as thymidine kinase and HPRT are found in a number of different organisms. Accordingly, it should be understood that the nucleic acid molecules for use in the construct of the present invention can be derived from any appropriate source, including synthetic or recombinant sources.

Still without limiting the present invention in any way, it should be understood that the constructs of the present invention are not limited to the nucleic acid components hereinbefore described in detail. Rather, the constructs may comprise additional nucleic acid components. These additional components may be either endogenously present, such as components which form part of a starting plasmid which is proposed to be modified in order to generate the construct of the present invention (these may or may not contribute additional functional features) or they may be inserted during construction of the plasmid in order to provide still further functional attributes additional to those hereinbefore discussed.

Examples of optional additional nucleic acid components which may form part of the constructs of the present invention include, but are not limited to:

(i) Signal sequences

The construct of the present invention can be designed to produce proteins which are cytoplasmically retained, secreted or membrane bound. The decision of what form a protein of interest is required to take will depend largely on the functional requirement of the protein. For example, anchored cell surface expression of a protein of interest provides a convenient means for screening for molecules which interact with the protein of interest such as antibodies, antagonists, agonists or the like particularly to the extent that the protein is expressed on the membrane of an adherent cell type. Still further membrane anchored forms of proteins may be particularly suitable for administration to an animal for the purpose of generating monoclonal antibodies to the protein since the host cells provide a convenient source of the protein which is likely to be correctly folded and have appropriate post-translational modifications such as glycosylation and disulphide bonds, and the host cell may provide adjuvant properties such as may be provided by antigenic differences to the recipient, notably in the major histocompatibility complex (MHC).

Cell surface expression may also greatly facilitate the screening of monoclonal antibodies by a modified form of solid phase assay (commonly known as ELISA, although immunofluorescence could be used instead of an enzymic colour reaction), because the relevant protein is simply presented on the surface of an adherent cell line such as Ltk- attached to tissue culture trays. This approach has many advantages. Firstly, the protein of interest need not be purified. Secondly, the relevant protein is likely to be correctly folded and in its native conformation, whereas proteins bound to plastic plates by passive adsorption are often denatured. Thirdly, the fact that the cells are adherent to the plates means that washing to remove unbound antibody can be done by simple flicking of the plates or suction and does not require centrifugation.

Alternatively, secreted proteins are particularly suitable where the protein is required to be harvested and purified, for example, for distribution and use as a drug.

Still further the subject protein may be cytoplasmically expressed. Cytoplasmic expression could be useful in many different situations. For example, it could be useful for evaluating the functional properties of the protein of interest inside cells, such as its effects on the control of cellular differentiation, growth, division, metabolism, motility, movement, invasion or metastasis. It could also be useful in situations in which a particular post-translational modification such as phosphorylation occurs within the cytoplasm.

In terms of designing the construct of the present invention such that the protein is appropriately expressed, it may be necessary to incorporate into the first nucleic acid region a DNA sequence encoding a signal sequence, preferably in cleavable form, where the protein is desired to be secreted. Without limiting the present invention to any one theory or mode of action, a signal sequence is a peptide which is present on proteins destined either to be secreted or to be membrane bound. Signal sequences are normally located at the N-terminus of the protein and are generally cleaved from the mature protein. The signal sequence generally interacts with the signal recognition particle and directs the ribosome to the endoplasmic reticulum where co-translational insertion takes place. Where the signal sequence is cleavable, it is generally removed by signal peptidase, this being a specific protease located on the cisternal face of the endoplasmic reticulum. The choice of signal sequence which is to be utilised will depend on the requirements of the particular situation and can be determined by the person of skill in the art. In the context of the exemplification provided herein, but without being limited in that regard, the influenza haemagglutinin signal sequence is utilised to facilitate secretion of the protein of interest while a non-cleavable amino terminal anchor sequence, derived from the anchor of murine NPP-1 (PC-1), permits the generation of type II membrane proteins. In the absence of any signal sequence the expression product will be cytoplasmically localised.

If a type I membrane protein is desired, both a 5' cleavable signal sequence at the amino end of the protein and a non-cleavable membrane anchor at the 3' (carboxy) end of the protein are required. These could be provided within the vector or one or both could be encoded by the DNA of the protein of interest.
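By way of illustration only, the relationship described above between the targeting elements present and the expected form of the protein can be summarised as a simple lookup. The function and parameter names are hypothetical, and combinations which are not described in the text are deliberately rejected rather than guessed.

```python
# Keys: (cleavable N-terminal signal, non-cleavable N-terminal anchor,
#        non-cleavable C-terminal anchor)
LOCALISATION_RULES = {
    (False, False, False): "cytoplasmic",
    (True, False, False): "secreted",
    (False, True, False): "type II membrane protein",
    (True, False, True): "type I membrane protein",
}


def predicted_localisation(cleavable_signal: bool,
                           n_terminal_anchor: bool,
                           c_terminal_anchor: bool) -> str:
    """Return the expected form of the protein for a described combination."""
    key = (cleavable_signal, n_terminal_anchor, c_terminal_anchor)
    if key not in LOCALISATION_RULES:
        raise ValueError("combination not described in the text")
    return LOCALISATION_RULES[key]


secreted_form = predicted_localisation(True, False, False)
type_two_form = predicted_localisation(False, True, False)
```

The lookup is a planning aid only; the actual localisation of any given expression product would need to be confirmed experimentally.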

The nucleic acid molecule encoding the signal sequence, to the extent that one is utilised, may be positioned in the construct at any suitable location which can be determined as a matter of routine procedure by the person of skill in the art but preferably immediately 5' to the nucleic acid sequence encoding the protein of interest (such that it can be expressed as an immediately adjacent fusion with the protein of interest) but 3' to the promoter such that expression of the signal sequence is placed under the control of the promoter. The DNA encoding the signal sequence is also required to be placed in frame with the gene of interest. In terms of the definitions provided herein, the nucleic acid sequence encoding the signal sequence preferably forms part of the first nucleic acid region.
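By way of illustration only, the requirement that the signal sequence be placed in frame with the gene of interest can be checked on the DNA level as sketched below. The sequences shown are short hypothetical examples and not the sequences of any vector described herein; the check assumes the standard stop codons and that the fusion begins with an ATG initiator.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna: str) -> list:
    """Split DNA into complete codons, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fusion_is_in_frame(signal_dna: str, gene_dna: str) -> bool:
    """True if signal + gene form one open reading frame from ATG to the end."""
    signal, gene = signal_dna.upper(), gene_dna.upper()
    if len(signal) % 3 != 0 or len(gene) % 3 != 0:
        return False  # a partial codon would shift the reading frame
    fused = codons(signal + gene)
    if not fused or fused[0] != "ATG":
        return False
    # The only stop codon permitted is the final codon of the fusion.
    return all(c not in STOP_CODONS for c in fused[:-1])


# Hypothetical toy sequences
in_frame = fusion_is_in_frame("ATGAAAGCT", "GGTCCTTAA")
frameshifted = fusion_is_in_frame("ATGAAAGC", "GGTCCTTAA")
premature_stop = fusion_is_in_frame("ATGTAAGCT", "GGTCCTTAA")
```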

(ii) Additional selection markers

As foreshadowed hereinbefore, in addition to the selection markers previously discussed, one may insert any number of further selection markers which are not necessarily required to be localised to the second nucleic acid region and are designed, for example, to facilitate the use of the vectors in a variety of ways, such as purification of the protein of interest. For example, the glutathione S-transferase (GST) gene fusion system provides a convenient means of harvesting the protein of interest. Without limiting the present invention to any one theory or mode of action, a GST-fusion protein can be purified, by virtue of the GST tag, using glutathione agarose beads. This particular tagging system is preferably used in the context of an E. coli expression system although it can also be utilised in the context of eukaryotic host cell expression systems. In utilising this type of selection marker, the person of skill in the art would appreciate that in order to facilitate cleavage of the purified protein from the GST-protein fusion, the fusion expression product can be designed such that it incorporates a cleavage site to release the protein of interest from the GST region. In the context exemplified herein, a pentaglycine kinker is encoded immediately prior to a thrombin cleavage site thereby facilitating efficient thrombin cleavage to release the protein of interest. Still further, in this particular embodiment the nucleic acid sequence encoding GST and the thrombin cleavage site is localised in the first nucleic acid region immediately 5' to the protein of interest but 3' to the promoter. Since GST expression facilitates purification of cytoplasmically expressed protein subsequently to cell lysis, it is not essential that a signal sequence is incorporated in order to facilitate secretion of the expression product. However, this should not be excluded and the present invention should be understood to extend to constructs encoding a secretable GST-protein fusion. This could be achieved, for example, by designing the sequence of the first nucleic acid region such that it encodes a cleavable signal sequence fused to a cleavable GST which is, in turn, fused to the protein of interest.

In another example, a fusion tag could be used which is itself a fusion between 360 bp of protein A (allowing purification of the secreted product) and beta lactamase (a bacterial enzyme which allows testing of supernatants by a simple colour reaction). Beta lactamase facilitates detection of the expression product in the absence of a specific assay for the protein of interest. The protein A/beta lactamase fusion is preferably separated from the protein of interest by a thrombin cleavage site and a pentaglycine kinker to facilitate thrombin cleavage, so that after the protein is purified on IgG beads, the tag can be easily removed.

Other fusion tags that could be included to facilitate purification of the protein of interest include staphylococcal protein A, streptococcal protein G, hexahistidine, calmodulin-binding peptides and maltose-binding protein (the latter is also useful to help ensure correct folding of the protein of interest).

Yet another selectable marker which one may seek to utilise is an antibiotic resistance gene. As detailed hereinbefore, antibiotic resistance genes have previously been utilised in the context of bicistronic vectors as the selection marker of choice. However, disadvantages associated with this selection system, particularly in the context of the use of the neomycin analogue G418 in mammalian cells, have led to the development of the HAT-based selectable bicistronic vector of the present invention. Nevertheless, antibiotic resistance can still provide certain alternative applications. In particular, to the extent that one is seeking to replicate the construct of the present invention prior to, for example, transfection of the construct population into the target host cell population (e.g. a mammalian host cell population), one may seek to conveniently do this via bacterial host cell expansion. This may be relevant, for example, where the protein of interest cannot be expressed in a functional form by prokaryotic cells due to their inability to appropriately post-translationally modify the protein product but where the starting copy number of the plasmid of interest is not conveniently increased in a mammalian host cell due to the slower expansion, more difficult transfection and more complex growth requirements of mammalian cells relative to bacterial cells. In terms of increasing the copy number of a construct of interest, it would be appreciated that this can be achieved using any one of a number of suitable techniques including bacterial expansion and in vitro amplification (e.g. via rolling circle amplification), although bacterial expansion does provide a particularly convenient technique. That being the case, selection of bacterial host cells merely comprising the construct, in particular as an episomal plasmid construct (i.e. irrespective of issues such as ability to express the construct, expression levels and the like), is conveniently achieved utilising an antibiotic resistance marker, the expression of which is driven by a bacterial promoter. Suitable antibiotic resistance genes include, but are not limited to, those conferring resistance to kanamycin, neomycin and ampicillin. Since it is not intended, nor even desirable, that this antibiotic resistance gene would be expressed or otherwise functional in the host cell population which is proposed to be utilised to produce the protein of interest, this gene is preferably not incorporated into the host cell as part of the first or second nucleic acid regions. Rather, it is separately inserted into a distinct region of the construct under the control of its own promoter, which promoter need not necessarily be one recognisable by the host cell population which will ultimately be used to express the protein. Accordingly, to the extent that it is proposed to multiply the construct in a prokaryotic host cell such as E. coli, one might preferably use a bacterial promoter. As detailed hereinbefore, the fact that these types of nucleic acid regions are either non-functional or redundant in terms of the expression of the construct in some host cell types is anticipated and not problematic in terms of the overall design of the construct of the invention.

(iii) Enhancers

Enhancers are eukaryotic control elements which can increase the expression of a gene by increasing its transcription. More specifically, enhancers generally increase the rate of transcription, this being achieved independently of the orientation or position of the enhancer. Accordingly, to the extent that an enhancer is sought to be utilised, it may be localised anywhere in the subject construct. In choosing an enhancer for use with a given construct, it would be appreciated that these elements are generally tissue specific. However, they are an extremely well-defined population of molecules and the selection of an appropriate enhancer for use would fall well within the skill of the person of skill in the art. For example, the κ light chain enhancer is suitable for use in the context of myeloma cell expression and can be inserted at any location in the vector which is intended to be transfected into those cells.

(iv) PEST sequences

Without limiting the present invention to any one theory or mode of action, PEST amino acid sequence regions are defined as regions rich in proline, glutamic acid, serine and threonine which confer susceptibility to rapid intracellular enzymatic proteolysis. Accordingly, these regions are effectively degradation susceptibility signals and can be conveniently utilised in the context of the constructs of the present invention in order to increase the transcription rate of the nucleic acid encoding the protein of interest. Preferably, the nucleic acid sequence encoding a PEST region is incorporated into the second nucleic acid region in order to destabilise the selectable markers expressed by this region. In the context of HPRT or thymidine kinase expression, a certain threshold level of this enzyme must be expressed in order to enable the cell to grow in HAT medium. Accordingly, by destabilising the HPRT or thymidine kinase expression product, the host cell in issue is forced to increase the transcription rate of the HPRT or thymidine kinase gene in order to provide sufficient HPRT or thymidine kinase levels to facilitate host cell growth. Accordingly, there effectively occurs enrichment for host cells exhibiting higher rates of transcription. Since the HPRT or thymidine kinase transcript is expressed as a single transcript encoding both the first and second nucleic acid regions, levels of the protein of interest are also effectively increased. To the extent that the incorporation of a nucleic acid sequence encoding a PEST sequence is desired, this can be conveniently inserted at the 3' end of the second nucleic acid region, that is after the HPRT, thymidine kinase, HGPRT/GFP fusion or thymidine kinase/GFP fusion sequences. It would be appreciated, however, that the PEST sequence may also be localised to any other suitable position.
Another factor that can influence protein stability and lifetime is the "N-terminal rule", which defines certain sequences at the N-termini of proteins that encourage stability or encourage proteolytic breakdown.
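By way of illustration only, a region rich in proline, glutamic acid, serine and threonine can be located in a candidate amino acid sequence by a simple composition scan, as sketched below. This is an assumed simplification and is not a published PEST scoring method; the window width of 12 residues and the example sequence are hypothetical choices made for this sketch.

```python
PEST_RESIDUES = set("PEST")  # proline, glutamic acid, serine, threonine


def pest_fraction(window: str) -> float:
    """Fraction of residues in the window that are P, E, S or T."""
    return sum(residue in PEST_RESIDUES for residue in window) / len(window)


def richest_window(protein: str, width: int = 12):
    """Return (start index, fraction) of the most PEST-rich window."""
    seq = protein.upper()
    if len(seq) < width:
        raise ValueError("sequence shorter than window width")
    starts = range(len(seq) - width + 1)
    best = max(starts, key=lambda i: pest_fraction(seq[i:i + width]))
    return best, pest_fraction(seq[best:best + width])


# Hypothetical sequence with an artificial PEST-rich stretch in the middle
example = "MKKLLVAAGG" + "PESTPESTPEST" + "KKLLAAGGVV"
start, fraction = richest_window(example)
```

A high fraction merely flags a candidate region; whether such a region in fact destabilises the expression product would need to be determined experimentally.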

(v) Linearisation Polylinker

In order to overcome the problems associated with the random distribution of naturally occurring restriction sites and the problems this can cause where these sites occur in multiple positions and/or are localised to important regions such as in the middle of genes, a linearisation polylinker may be inserted into the construct.

Choice of an appropriate site within the linearisation polylinker for linearisation by cleavage prior to transfection is facilitated by having several restriction sites within the linearisation polylinker or elsewhere to choose from, and may obviate the need for the active deletion of known but unwanted naturally occurring restriction sites, although the application of active deletion is not intended to be excluded. This can be particularly important in the context of random linearisation and integration of the construct into the host cell chromosome since cleavage in a nucleic acid region of functional importance can render the construct useless. Still further, although this can be technically overcome by choosing and designing host cell systems which retain the construct in episomal form, episomes do not exhibit the stability of chromosomally integrated DNA and are known to degrade after a time. The directed linearisation of a circular construct prior to its chromosomal integration using a restriction enzyme which targets a suitable cleavage site therefore provides the best means of overcoming this problem. Accordingly, reference to "linearisation polylinker" should be understood as a reference to a region of the construct which has been designed to incorporate one or more specific restriction sites. Preferably, the polylinker comprises multiple rare restriction sites which are not present in the genes of interest. Still further, the subject polylinker is designed such that sequences of 6-10 bases, preferably 6-9, 7-9, 6-8 or 7-8 are preferably required for cleavage thereby minimising the possibility that these sites would be naturally found in one of the genes of interest. Preferably, the subject polylinker comprises two, three, four, five, six, seven or more restriction sites. 
The linearisation polylinker may be inserted at any suitable site in the construct, for example at a position between the 3' side of the poly A addition signal and the 5' side of the promoter that is used to drive mRNA encoding the gene of interest. The choice and number of restriction sites which are included in a polylinker can be designed by the person of skill in the art to suit a given purpose. A preferred polylinker is exemplified herein and includes two or more of the following restriction sites which are absent from the remainder of the vector. The choice of which site to use will depend on individual circumstances, notably that this site should be absent from the sequence encoding the protein of interest:

- Xcm I

- Age I

- HpH I

- Bst E II

- BssH II

- Swa I

- Dra I

It should be understood, however, that although this particular polylinker has been designed to provide a broad selection of rare restriction sites for use, it may be modified by the addition, removal or modification of these sites. Further, although the specification exemplifies the insertion of one linearisation polylinker site, the construct may be designed to incorporate more than one such polylinker site.

Still further, any one or more other pre-existing restriction endonuclease sites may be deleted or substituted, in particular to omit any common restriction sites which might diminish the usefulness of the vector, on the basis that restriction sites are generally most useful if they are unique. Single unique restriction sites may also be introduced into the construct and should be understood to fall within the definition of "linearisation polylinker".
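By way of illustration only, the choice of a linearisation site which is absent from the sequence encoding the protein of interest could be sketched as follows. The recognition sequences shown are those commonly listed for the named enzymes and should be verified against a supplier's catalogue before use; Xcm I and HpH I are omitted from this sketch, and the insert sequences are hypothetical. Since each listed site is palindromic, searching one strand is sufficient.

```python
import re

# IUPAC codes used by the sites below (N = any base)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

# Assumed recognition sequences; verify against an enzyme catalogue.
SITES = {
    "Age I": "ACCGGT",
    "Bst E II": "GGTNACC",
    "BssH II": "GCGCGC",
    "Swa I": "ATTTAAAT",
    "Dra I": "TTTAAA",
}


def site_regex(site: str):
    """Compile a recognition sequence into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in site))


def usable_sites(insert_dna: str) -> list:
    """Return the enzymes whose site does not occur in the insert."""
    seq = insert_dna.upper()
    return [name for name, site in SITES.items()
            if not site_regex(site).search(seq)]


# Hypothetical insert containing BssH II, Swa I and Dra I sites
candidates = usable_sites("ATGGCGCGCAAATTTAAATGGTAA")
```

Only the enzymes returned by `usable_sites` would cut the construct solely within the polylinker, on the assumption that the sites are also absent from the remainder of the vector.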

(vi) Other

It would be appreciated that the person of skill in the art may introduce any other such components into the construct as are desired. These may include, for example, specific sites to facilitate directed homologous recombination in order to induce chromosomal integration at a specific chromosomal location, thereby avoiding the potential disadvantages of random insertion. It should also be understood that the constructs of the present invention may also comprise components which are residual from a vector which was utilised as a basis for generating the constructs of the present invention. These components may be redundant or non-functional and do not, therefore, necessarily require removal. Alternatively, they may themselves provide desirable features, such as endogenous antibiotic resistance genes, or they may be disadvantageous and require removal during construction of the constructs of the invention.

Accordingly, the present invention preferably provides a DNA expression construct, said construct comprising a promoter operably linked to both a first and second DNA region, which:

(ii) second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT or thymidine kinase and optionally GFP, alkaline phosphatase and/or a PEST sequence; and optionally

(iv) one or more linearisation polylinkers

Preferably, said signal sequence is the cleavable influenza haemagglutinin signal sequence or the non-cleavable NPP1 membrane anchor.

More preferably, said promoter is SRα or cytomegalovirus and said construct is plasmid-derived.

Still more preferably said antibiotic resistance gene is kanamycin, neomycin, G418 or ampicillin resistance driven by a promoter and regulatory sequences that allow expression in E. coli. Yet more preferably, said linearisation polylinker comprises one or more of the restriction sites Xcm I, Age I, HpH I, Bst E II, BssH II, Swa I or Dra I.

DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and a cleavable signal sequence; and

(iii) an antibiotic resistance gene optionally under its own promoter control

Preferably, said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a cleavable signal sequence, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is Neo/Kan resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

More preferably, said construct corresponds to pHAT-1 (also known as Vector A).

In a second embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which: (i) first DNA region comprises a DNA sequence encoding a protein of interest or a

(ii) a second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT, GFP and a PEST sequence; and

(iv) a linearisation polylinker

Preferably, said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a cleavable signal, and more preferably the influenza haemagglutinin signal sequence, said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allow expression in E. coli, and said linearisation polylinker comprises one or more of the restriction sites Xcm I, Age I, HpH I, Bst E II, BssH II, Swa I or Dra I.

More preferably, said construct corresponds to pHAT-2 (also known as Vector B). pHAT-2 was deposited with National Measurement Institute on 29 March 2006 under Accession Number NM06/00014.

In a third embodiment the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which: (i) first DNA region comprises a DNA sequence encoding a protein of interest or a

(iii) an antibiotic resistance gene optionally under its own promoter control

More preferably, said construct corresponds to pHAT-3 (also known as Vector C). pHAT-3 was deposited at National Measurement Institute on 29 March 2006 under Accession Number NM06/00013.

In another preferred embodiment said promoter is SRα or cytomegalovirus, said signal sequence is a non-cleavable signal sequence, and more preferably the membrane anchor sequence of NPP1, and said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

Still more preferably, said construct corresponds to pHAT-4, also known as Vector D.

In yet another preferred embodiment said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a non-cleavable signal sequence, and more preferably the cytoplasmic tail and membrane anchor sequence of NPP1, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

Yet more preferably, said construct corresponds to pHAT-9 (also known as Vector I).

(iii) an antibiotic resistance gene optionally under its own promoter control

Preferably, said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is cleavable, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

More preferably said construct corresponds to Vector pHAT-5 (also known as Vector E).

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and GST; and

(iii) an antibiotic resistance gene optionally under its own promoter control

Preferably said promoter is SRα or cytomegalovirus, said construct is plasmid-derived and said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

More preferably, said construct corresponds to vector pHAT-6 (also known as Vector F). pHAT-6 was deposited at National Measurement Institute on 29 March 2006 under Accession Number NM06/00012.

In a sixth embodiment, the present invention provides a DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region, which:

(i) first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and a signal sequence; and

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT and GFP; and

(iii) an antibiotic resistance gene optionally under its own promoter control

Preferably, said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is cleavable, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allow expression in E. coli.

More preferably, said construct corresponds to Vector pHAT-7 (also known as Vector G).

(iii) an antibiotic resistance gene optionally under its own promoter control wherein the expression of said construct would result in translation of both said first and second DNA regions.

More preferably, said construct corresponds to Vector pHAT-8 (also known as vector H).

In yet another preferred embodiment said construct corresponds to vector pHAT-3.shATX (also known as vector ChATX), which contains sequences encoding a secretory form of human autotaxin, or pHAT-10.shATX (also known as pdSRalHEm.shATX), which also encodes a secretory form of human autotaxin.

In still another embodiment, said construct corresponds to pHAT-11, pHAT-12 or pHAT-13.

Methods for generating the constructs of the present invention would be well known to those of skill in the art in terms of the application of standard recombinant techniques to facilitate the modification or de novo synthesis of a construct. Similarly, and as hereinbefore described, methods for amplifying the copy number of a construct prior to transfection of the target host cell population are also well known.

As detailed hereinbefore, the constructs of the present invention are transfected into host cells in order to facilitate their replication and/or expression in order to obtain the protein of interest. By "host cell" is meant any prokaryotic cell or eukaryotic cell which can be transformed or transfected with a nucleotide sequence. For example, contemplated herein are host cells suitable for cloning and/or expression of nucleotide sequences. The subject host cells may be ones which maintain the subject construct in episomal form for the purposes of replication and expression or they may be ones which either naturally or subsequently to an appropriate exogenous signal chromosomally integrate the construct subsequently to its linearisation. The choice of host cell for use will depend on a number of factors such as the features of the construct which is to be replicated and/or expressed (e.g. the expression of bacterial vs. eukaryotic promoters) and the requirements in terms of expression of the protein of interest. For example, constructs which merely require replication and harvesting may be transfected into fast growing bacterial cells where that construct comprises an antibiotic resistance gene under the control of a bacterial promoter. However, if the desired outcome is the expression of a protein which requires glycosylation, a eukaryotic host cell, in particular a mammalian host cell, will be required. Examples of bacterial host cells suitable for use include E. coli. Mammalian host cells include any cell type which exhibits a non-functional antifolate-salvage pathway. To this end, there are a number of widely available cell lines which lack either the thymidine kinase enzyme or the HPRT enzyme, including mouse myeloma cells which lack HPRT (NS-1, X63AG8, X63AG8.653, Sp/2, NS-0) and the adherent cell line Ltk-, which lacks thymidine kinase. Other HAT-sensitive lines include variants of the mouse myeloma MPC-11 and the rat myeloma line Y3.

It also should be understood that generating cell lines which are HAT sensitive is a matter of routine procedure and thereby allows one to customise the host cell type which is utilised. This is particularly valuable, for example, if one is seeking to express the construct of the invention in a cell type such as a CHO cell or some other type of adherent or non-adherent cell which exhibits functional or phenotypic properties which are particularly suitable for a given application. The generation of a HAT sensitive variant of any mammalian cell line can be achieved by a number of well known and long-established techniques including the targeted deletion of the HPRT or thymidine kinase genes by homologous recombination or the natural selection of spontaneously occurring variant cells which can be routinely selected by virtue of their resistance to either the thymidine analogue 5-bromodeoxyuridine (where the cell lacks thymidine kinase) or the guanine analogues 8-azaguanine and 6-thioguanine (where the cell lacks HPRT). Selection of such mutants may be facilitated by irradiation or chemical mutagens (see Ringertz NR and Savage RE, Cell Hybrids, Academic Press 1976, especially pages 150-154, Selection of Hybrids Made from Drug Resistant Cells, which describes the process and gives examples). Still further, it would be appreciated that means for culturing cell lines in order to facilitate HAT based selection would be well known to the person of skill in the art.
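By way of illustration only, the logic of HAT selection discussed above can be written as a short Python sketch. The class and function names are hypothetical and form no part of the constructs described. The sketch assumes that NS-1 cells retain thymidine kinase and that Ltk- cells retain HPRT, and it treats survival as a simple yes/no outcome rather than a quantitative one.

```python
# Minimal sketch of HAT selection logic. Names are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HostCell:
    name: str
    has_hprt: bool  # hypoxanthine phosphoribosyl transferase present
    has_tk: bool    # thymidine kinase present


def survives_hat(cell: HostCell, vector_marker: Optional[str] = None) -> bool:
    """Aminopterin blocks de novo synthesis, so growth in HAT medium
    requires both salvage enzymes, supplied by the host or by the vector."""
    hprt = cell.has_hprt or vector_marker == "HPRT"
    tk = cell.has_tk or vector_marker == "TK"
    return hprt and tk


# Assumed phenotypes: each line lacks exactly one salvage enzyme.
NS1 = HostCell("NS-1", has_hprt=False, has_tk=True)
LTK = HostCell("Ltk-", has_hprt=True, has_tk=False)

if __name__ == "__main__":
    for cell in (NS1, LTK):
        for marker in (None, "HPRT", "TK"):
            print(cell.name, marker, survives_hat(cell, marker))
```

Under these assumptions, a host is rescued only by the marker that matches its own deficiency, which is why the HPRT-based vectors are paired with HPRT-deficient myeloma lines and the thymidine kinase vectors with Ltk- cells.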

Accordingly, yet another aspect of the present invention is directed to host cells which have been transfected with the construct of the present invention.

It should be appreciated that reference to "transfecting" the construct into a host cell includes reference to both transferring the subject construct into the host cell of choice and subjecting the host cell to the particular selection means to which the construct has been designed to respond. It should also be understood that "purification" of the protein includes reference to both isolating/enriching the protein and, further, subjecting the protein to any additional treatment regimes such as cleavage of a GST portion where the protein has been expressed as a GST fusion. However, whereas in many situations it will be desirable to enrich for and/or purify the protein of interest, this may not always be the case. For example, the cell surface expression of a protein of interest is unlikely to involve a traditional purification step. Rather, these cultures may be used in their in situ form for the application of screening technology or the cells may be harvested, such as would occur if these cells were to be pooled and introduced into a mouse for the generation of monoclonal antibodies to the protein of interest. To this end, where the host cell population is appropriately selected, the cell itself may act to provide adjuvant like properties due to features such as an MHC phenotype which is distinct to that of the mouse into which it is being introduced.

Still yet another aspect of the present invention extends to the protein product produced by the expression constructs of the present invention. The present invention also extends to the use of the expressed protein of interest in the treatment and diagnosis of patients. To this end, the present invention encompasses in vivo administration of the transfected cells themselves and/or the in vivo or in situ transfection of cells with the constructs of the present invention.

Accordingly, another aspect of the present invention contemplates a pharmaceutical composition comprising either a protein generated by the method of the present invention or a construct as hereinbefore defined together with one or more pharmaceutically acceptable carriers.

(ii) effecting transfection of the construct into a host cell; and

(iii) reagents useful for facilitating application of the selection means which have been incorporated into the construct.

Yet another aspect of the present invention is directed to a method for screening for an agent capable of interacting with a protein of interest, said method comprising contacting a putative modulatory agent with a host cell of the present invention, which host cell expresses the protein of interest in membrane bound form and detecting an altered expression phenotype.

Reference to "detecting an altered expression phenotype" should be understood as a reference to detecting cellular state associated with the interaction, or absence thereof, of an interactive agent. Methods of detection include, for example, changes observable extracellularly such as the physical interaction, or not, of an agent. Still further, there is also encompassed methods of detecting functional changes to the host cell, to the extent that the protein that is expressed in the cell by transfection using the aforementioned vectors may be capable of inducing certain functional effects on the host cell. This method is particularly useful for elucidating gene function by virtue of effects of the transfected protein on cell survival, growth, division, differentiation, motility, invasiveness, metastasis, secretion of hormones and other changes in cell function, and for detecting antagonistic agents which function by binding to the protein of interest, such as antibodies. Methods for detecting such interactions are a matter of routine procedure and would be well known to a person of skill in the art.

Further features of the present invention are more fully described in the following non-limiting figures and/or examples.

EXAMPLE 1

MATERIALS AND METHODS

Vector A

This vector is a mosaic made up of a series of different elements.

Promoter.

The SRα promoter was chosen because it is a very strong promoter in a variety of cell types (see Takebe et al., 1988, Mol Cell Biol 8, 466-72).

The approximately 794 bp promoter/enhancer segment was cut out of the vector pME18S (Takebe et al., 1988, supra) using Hind III at the 5' end and EcoR I at the 3' end, and ligated into the vector pcDNA3 (Invitrogen) that had been cut with the same enzymes to create pcDNA3.SRα. pcDNA3 had a BamH I site in its polylinker, but since it lay between the EcoR I and Hind III sites it was eliminated during the cloning step.

The resulting construct had a single BamH I site which lay within the intron that is part of the SRα promoter region from pME18S. It was decided to eliminate this site from the SRα promoter because it was desired to include a BamH I site in the expression polylinker of vectors to be constructed using this promoter. This was done by cutting pcDNA3.SRα with BamH I, filling in the ends using T4 DNA polymerase and deoxynucleotide triphosphates (dNTPs), purifying the DNA by agarose electrophoresis and recircularising with T4 ligase. The elimination of the BamH I site created a Cla I site, but this site is methylated in most commonly used strains of E. coli.

The SRα promoter was excised with Hind III at its 5' end and EcoR I at its 3' end, and circularised by insertion of a double stranded synthetic oligonucleotide with single stranded overhangs to generate an Ase I site at the 5' end and an Nhe I site at its 3' end, and to eliminate the previous Hind III site at the 5' end and the previous EcoR I site at the 3' end.

The sequences of the upper and lower strands of the synthetic oligonucleotide are as follows:

upper strand: 5' AGCTATTAATGAATTCGCGGGATCCGCTAGC 3' (SEQ ID NO:1)

lower strand: 5' AATTGCTAGCGGATCCCGCGAATTCATTAAT 3' (SEQ ID NO:2)
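As an illustrative check only, the pairing of the two strands can be worked through in a few lines of Python. The helper names are hypothetical, and the sketch assumes that each strand carries a four-base 5' overhang. On that assumption the strands pair over 27 bases, leaving AGCT and AATT overhangs, which match the overhangs left by Hind III and EcoR I respectively.

```python
# Sketch: how SEQ ID NO:1 and SEQ ID NO:2 anneal. Helper names are illustrative.
UPPER = "AGCTATTAATGAATTCGCGGGATCCGCTAGC"  # SEQ ID NO:1, written 5'->3'
LOWER = "AATTGCTAGCGGATCCCGCGAATTCATTAAT"  # SEQ ID NO:2, written 5'->3'

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of the enzymes named in the text.
SITES = {"Ase I": "ATTAAT", "EcoR I": "GAATTC", "BamH I": "GGATCC", "Nhe I": "GCTAGC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(upper: str, lower: str, overhang: int = 4):
    """Return (5' overhang of upper, duplex core, 5' overhang of lower).

    Assumes both strands have a 5' overhang of the given length.
    """
    core = upper[overhang:]
    lower_as_top = reverse_complement(lower)  # lower strand read along the upper strand
    if lower_as_top[:len(core)] != core:
        raise ValueError("strands do not pair as assumed")
    return upper[:overhang], core, lower[:overhang]


def site_positions(seq: str) -> dict:
    """Map each enzyme name to the first index of its site (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


if __name__ == "__main__":
    left, core, right = anneal(UPPER, LOWER)
    print(left, core, right)
    print(site_positions(core))
```

The scan places the Ase I site at one end of the duplex and the Nhe I site at the other, with EcoR I and BamH I sites between them.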

The circularised DNA was cut with Ase I and Nhe I and gel purified for cloning.

Influenza haemagglutinin cleavable signal sequence

This was obtained from the vector pSHT (Madison and Bird (1992), Gene 121, 179-180), and modified by the polymerase chain reaction to incorporate an Nhe I site at its 5' end immediately prior to the initiator methionine (sequence of oligonucleotide primer for this modification is underlined) and an EcoR I site at its 3' end (oligonucleotide 3843) as follows:

The arrow indicates the site of cleavage by signal peptidase.

Nhe I 1/1 31/11

GAT GCTAGC ATG GCC ATC ATT TAT CTC ATT CTC CTG TTC ACA GCA (SEQ ID NO: 3)

Met ala ile ile tyr leu ile leu leu phe thr ala

BamH I, Sma I, Spe I, EcoR I

GTG AGA GGG GAT CCC GGG ACT AGT TAA CTA AGA ATTC (SEQ ID NO: 4)

val arg gly asp pro gly

<- TCA ATT GAT TCT TAAG CGA (SEQ ID NO: 6)

↑

<- Oligo 3482

(SEQ ID NO:5)

Position -3 is now A, which is favorable for initiation (Kozak).

Initial modifications to pIRES2-EGFP:

The BamH I site in the polylinker was eliminated by cutting with BamH I, filling in with dNTPs and DNA polymerase as above, followed by gel purification and recircularisation with T4 ligase. This created a methylated Cla I site where the BamH I site had been abolished.

A similar procedure was used to eliminate the Not I site at the 3' end of the enhanced green fluorescent protein (EGFP), which created an Fse I site.

The vector was then cut with Nhe I and EcoR I, and the modified influenza haemagglutinin sequence (see above) was inserted at the same sites.

The insertion of the signal sequence eliminated the following sites from the polylinker: Bgl II, Xho I, Sac I and BstB I.

The vector was then cut with Sal I and Sac II and a double stranded oligonucleotide adaptor inserted to create Xba I and Not I sites, generating vector 601.

At this stage, the vector still had the cytomegalovirus (CMV) promoter, bounded by an Ase I site at its 5' end and an Nhe I site at its 3' end. The vector was then cut with Ase I and Nhe I and the modified SRα promoter (see above) inserted at the same enzyme sites to generate vectors 626, 627 and 656, which are identical.

Hypoxanthine phosphoribosyl transferase (HPRT) gene:

The gene for HPRT was isolated from Leishmania donovani (the gift of Dr. Buddy Ullman, Seattle), and modified by PCR to incorporate a BstX I site at each end.

This was done in two stages. The 5' primer had an EcoR I site (underlined) followed by a BstX I site (double underlined) at its 5' end, while the 3' primer had a BamH I site (underlined) and a BstX I site (double underlined) at its 3' end, but the stop codon was omitted to allow synthesis of a fusion protein with EGFP, and care was taken to ensure that when the construct was cut with BstX I and ligated to EGFP, the result would be a continuous open reading frame to generate a fusion protein.

5' primer (12137):

5' GCGGAATTCCCACAACCATGGCAATGAGCAACTCGGCC 3' (SEQ ID NO:7)

3' primer (12136):

5' GCGGGATCCCCATGGTTGTGGCCACCTTGCTCTCCGG 3' (SEQ ID NO:8)
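The omission of a stop codon from the 3' primer can be checked, for illustration only, by reverse-complementing the primer to obtain the coding-sense strand and scanning all three reading frames. The function names below are hypothetical. The check covers only the bases supplied by the primer. It says nothing about the rest of the HPRT gene or about the junction formed after BstX I digestion and ligation to EGFP.

```python
# Sketch: scan the sense strand of the HPRT 3' primer for stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
PRIMER_3 = "GCGGGATCCCCATGGTTGTGGCCACCTTGCTCTCCGG"  # SEQ ID NO:8

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stops_by_frame(sense: str) -> dict:
    """Map each reading frame (0, 1, 2) to the start indices of stop codons."""
    hits = {}
    for frame in range(3):
        hits[frame] = [
            i for i in range(frame, len(sense) - 2, 3)
            if sense[i:i + 3] in STOP_CODONS
        ]
    return hits


# The 3' primer anneals to the coding strand, so its reverse complement
# reads in the same sense as the gene.
SENSE = reverse_complement(PRIMER_3)

if __name__ == "__main__":
    print(SENSE)
    print(stops_by_frame(SENSE))
```

No stop codon is found in any frame of the primer-encoded region, which is consistent with the statement that the stop codon was omitted.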

The resulting PCR product was cut with EcoR I and BamH I and ligated into Bluescript II that had been cut with the same enzymes. DNA sequencing showed that the sequence was correct with the exception of a single base change resulting in a change from asparagine to isoleucine, but this did not affect the enzymic activity. The HPRT gene was cut out of Bluescript with BstX I and ligated into the vector that had been cut with BstX I and treated with alkaline phosphatase to prevent self-religation. Clones with the correct orientation were identified by cutting with Pst I and Hind III/EcoR V, and the fusion protein was found to be functional because it was able to confer resistance to HAT medium when transfected into NS-1 mouse myeloma cells.

Vector B

Vector B can be considered to be a modified form of vector A, with the following changes:

Polylinker:

The polylinker was cut with BamH I and EcoR I, and an irrelevant insert of approximately 2,000 bp was inserted. This was later removed by cutting with BamH I and EcoR I and replaced by a short double stranded adaptor oligonucleotide such that the BamH I and EcoR I sites were preserved, separated by a six base "spacer" which does not contain any restriction sites. This reduces the number of restriction sites within the polylinker, but it makes the Sma I site unique (vector A has two Sma I sites in the polylinker).

Destabilization of the HPRT-GFP fusion protein:

As mentioned earlier, a Not I site at the 3' end of the EGFP gene had been deleted by cutting with Not I, filling in with dNTPs and DNA polymerase and religation, creating a new Fse I site.

"PEST" sequences are short amino acid sequences rich in certain amino acids, notably proline, glutamic acid, serine and threonine, which can destabilise proteins and greatly increase their rate of proteolytic degradation (Rogers et al. (1986), Science 234, 364-8.).

The PEST sequence from mouse ornithine decarboxylase was amplified from the vector pd2EGFP-1 (Clontech) by PCR using the following primers:

5' primer (12494; on 5' side of BsrG I site near C terminus of EGFP):

5' CCGGGATCACTCTCGGCATG 3' (SEQ ID NO:9)

3' primer (12496; to add Fse I site at 3' end of gene):

5' CCT GAA TTC TGG CCG GCC GCG CAT CTA CAC 3' (SEQ ID NO:10)

The EcoR I site (GAATTC) is underlined. The Fse I site (GGCCGGCC) is double underlined.

The amplified sequence was ligated into the vector that had been cut with BsrG I and Fse I. This is expected to result in more rapid degradation of both EGFP and HPRT (which are parts of the same fusion protein), which may have two beneficial effects: (a) reducing potential toxicity of EGFP if it is expressed at high levels and (b) reducing the expression levels of HPRT, meaning that only those integrants making large amounts of the relevant messenger RNA will be able to grow, and hence the large amounts of mRNA will result in higher expression of the gene of interest that has been inserted into the expression polylinker.
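The reasoning in (b) can be illustrated with a simple steady-state model, offered only as a sketch. It assumes first-order degradation, synthesis proportional to mRNA level, and a fixed HPRT level needed for survival in HAT medium. The function names, the threshold, the translation rate and the half-lives of 24 hours and 2 hours are all hypothetical values chosen for illustration, not measurements from this work.

```python
# Sketch: how a shorter protein half-life raises the mRNA level needed
# to reach a fixed HPRT threshold. All numbers are hypothetical.
import math


def steady_state_level(mrna: float, translation_rate: float, half_life_h: float) -> float:
    """Steady-state protein level when synthesis balances first-order decay."""
    degradation_rate = math.log(2) / half_life_h
    return translation_rate * mrna / degradation_rate


def min_mrna_for_threshold(threshold: float, translation_rate: float, half_life_h: float) -> float:
    """mRNA level at which the steady-state protein level equals the threshold."""
    return threshold * math.log(2) / (translation_rate * half_life_h)


if __name__ == "__main__":
    stable = min_mrna_for_threshold(threshold=100.0, translation_rate=5.0, half_life_h=24.0)
    destabilised = min_mrna_for_threshold(threshold=100.0, translation_rate=5.0, half_life_h=2.0)
    print(f"fold increase in required mRNA: {destabilised / stable:.1f}")
```

In this model the required mRNA level scales inversely with half-life, so a twelve-fold shorter half-life demands twelve-fold more message for the same HPRT level.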

Linearisation polylinker:

The vector was cut with BstB I (which lies just past the 3' end of the Kanamycin resistance gene), dephosphorylated with shrimp alkaline phosphatase to prevent religation to itself, and a double stranded phosphorylated oligonucleotide containing several restriction sites that are absent from the rest of the vector was inserted using T4 ligase. The sequences of the oligonucleotide strands were as follows:

Upper strand:

5' CGCCACCGGTGACCTGGCGCGCCATTTAAAT 3' (SEQ ID NO: 11)

Lower strand:

5' CGATTTAAATGGCGCGCCAGGTCACCGGTGG 3' (SEQ ID NO: 12)
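For illustration, the recognition sequences carried by this oligonucleotide can be located with a short scan of the upper strand. The recognition sequences in the table are the commonly published ones and should be checked against a current enzyme catalogue. N denotes any base, and the function names are hypothetical. Only the upper strand is scanned, which suffices for the palindromic sites. Hph I is asymmetric, so a complete search would also scan the lower strand. Whether each site is unique in the finished vector cannot be determined from the oligonucleotide alone.

```python
# Sketch: locate recognition sequences in SEQ ID NO:11, with N as a wildcard.
import re

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

SITES = {
    "Xcm I": "CCANNNNNNNNNTGG",
    "Age I": "ACCGGT",
    "Hph I": "GGTGA",
    "BstE II": "GGTNACC",
    "BssH II": "GCGCGC",
    "Swa I": "ATTTAAAT",
    "Dra I": "TTTAAA",
}

UPPER = "CGCCACCGGTGACCTGGCGCGCCATTTAAAT"  # SEQ ID NO:11


def site_regex(site: str) -> "re.Pattern[str]":
    """Compile a site into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[base] for base in site)
    return re.compile(f"(?=({body}))")


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Map each enzyme name to the list of 0-based start positions in seq."""
    return {
        name: [m.start() for m in site_regex(site).finditer(seq)]
        for name, site in sites.items()
    }


if __name__ == "__main__":
    for name, positions in find_sites(UPPER).items():
        print(f"{name}: {positions}")
```

Each of the seven enzymes has one recognition sequence in the upper strand, and the strand contains no intact BstB I sequence (TTCGAA).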

The insertion of this double stranded sequence had the effect of abolishing the BstB I site but creating the following new and unique restriction sites:

Xcm I, Age I, Hph I, BstE II, BssH II, Swa I (8 base cutter, so cutting is extremely rare), Dra I.

Vector C:

Thymidine kinase:

The herpes simplex thymidine kinase gene was isolated and modified by PCR, using the gene targeting vector pNT as the template.

The 5' primer (oligonucleotide 24217) had an extra BamH I (underlined) and BstX I site (double underlined) to facilitate cloning:

5' GCGGGATCCCCACAACCATGGATGGCTTCGTACCCCTGCCAT 3' (SEQ ID NO: 13)

The 3' primer (oligonucleotide 24216) included the stop codon (underlined) followed by an Fse I site (double underlined) and then an EcoR I site (underlined):

5' GCGGAATTCGGCCGGCCTCAGTTAGCCTCCCCCATCTC 3' (SEQ ID NO: 14)
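The layout described for the 3' primer can be illustrated by reverse-complementing it and translating the gene-derived part. This is a sketch only. The function names are hypothetical, and the reading frame is assumed to start at the first base of the reverse-complemented primer, which is the frame that places TGA as a codon.

```python
# Sketch: view the thymidine kinase 3' primer in the coding sense.
PRIMER_3 = "GCGGAATTCGGCCGGCCTCAGTTAGCCTCCCCCATCTC"  # SEQ ID NO:14

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


SENSE = reverse_complement(PRIMER_3)

# First position of each element on the coding-sense strand.
LAYOUT = {
    "stop codon (TGA)": SENSE.find("TGA"),
    "Fse I (GGCCGGCC)": SENSE.find("GGCCGGCC"),
    "EcoR I (GAATTC)": SENSE.find("GAATTC"),
}

if __name__ == "__main__":
    print(SENSE)
    print(translate(SENSE[:21]))
    print(LAYOUT)
```

In the assumed frame the primer encodes six residues followed by a stop codon, and the Fse I and EcoR I sites follow the stop codon in that order.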

The PCR was performed using a proofreading polymerase (Pfx) for accuracy. The product was purified, cut with BamH I and EcoR I, repurified by agarose gel electrophoresis and ligated into the plasmid Bluescript II at the BamH I and EcoR I sites.

The thymidine kinase gene was excised from Bluescript as follows. The construct was cut with Hind III and blunted with T4 DNA polymerase, which was then inactivated by heat treatment. The construct was then cut with BstX I to release the thymidine kinase gene, which now has a blunt 3' end and a BstX I site at the 5' end.

It should be noted that an EcoR I site remains at the 3' end of the thymidine kinase gene, which means that the EcoR I site in the expression polylinker is not unique. However, the patent also claims to include a modification of Vector C in which this site is omitted by excising the thymidine kinase gene from Bluescript by cutting with EcoR I at the 3' end (instead of Hind III) followed by blunting with T4 polymerase and blunt ligation to the filled in BsrG I site in the vector as described below.

Preparation of vector:

The expression polylinker of Vector A was modified to remove the Sac II and second Sma I site by cutting with Sac II and Sma I, blunting with T4 polymerase and religation.

The vector was then cut with BsrG I, which cuts near the 3' end of EGFP, and blunted with T4 DNA polymerase, and the polymerase inactivated by heat. It was then cut with BstX I and purified by agarose gel electrophoresis. This removed the HPRT-EGFP fusion protein gene, leaving behind a short stub at the 3' end which will be after the stop codon of the thymidine kinase gene and therefore irrelevant.

The modified thymidine kinase gene was then ligated in, with a BstX I site at its 5' end and a blunt site at its 3' end, replacing the HPRT-EGFP gene.

Vector D:

Vector D was modified from vector C by cutting with Nhe I and BamH I to remove the cleavable influenza haemagglutinin secretory signal sequence, which was replaced by a non-cleavable signal/membrane anchor sequence from mouse PC-1 (plasma cell membrane glycoprotein-1, also known as nucleotide pyrophosphatase/phosphodiesterase I or NPP1) as shown below:

Non-cleavable signal/membrane anchor of mouse NPP1:

This sequence was isolated and modified by PCR to introduce an Nhe I site at the 5' end (underlined, prior to the initiator methionine, which is double underlined) and a BamH I site (underlined) at the 3' end at the junction of the transmembrane region and the beginning of the extracellular region. In this way, proteins that are ligated in-frame to the BamH I site can be expressed as membrane proteins on the external surface of transfected cells.

5' primer (oligonucleotide 6581):

5' GCGATAGCTAGCACGATGGAGCGCGACGGC 3' (SEQ ID NO: 15)

3' primer (oligonucleotide 5248):

5' GCGTGGATCCAACCCAAATATACAACC 3' (SEQ ID NO: 16)

As for Vector C, it should be noted that an EcoR I site remains at the 3' end of the thymidine kinase gene, which means that the EcoR I site in the expression polylinker is not unique. However, this patent also claims to include a modification of Vector D in which this site is omitted by excising the thymidine kinase gene from Bluescript by cutting with EcoR I (instead of Hind III) at the 3' end followed by blunting with T4 polymerase and blunt ligation to the filled in BsrG I site in the vector as described below.

Vector E:

This vector was derived from Vector A to make it confer Ampicillin resistance on E. coli instead of Kanamycin resistance.

Vector preparation:

Vector A was cut with BstB I and Afl II to remove the Kanamycin/Neomycin resistance gene and purified by agarose gel electrophoresis.

Preparation of the Ampicillin resistance gene:

The ampicillin resistance gene, including the regulatory regions (-35, Pribnow box and Shine-Dalgarno sequence) was isolated and modified from the plasmid Bluescript II by PCR using the following oligonucleotides:

5' oligonucleotide (25417):

BamH I site is underlined. Afl II site is double underlined.

5' GCGGGATCCCTTAAGGTGGCACTTTTCGGGGAAAT 3' (SEQ ID NO:17)

3' oligonucleotide (24651):

EcoR I site is underlined. Cla I site is double underlined. BstB I site is underlined.

5' GCGGAATTCATCGATTTCGAATTACCAATGCTTAATCAGTGA 3' (SEQ ID NO: 18)

The ampicillin gene was isolated from Bluescript by PCR using the proofreading DNA polymerase Pfx for accuracy, and purified by agarose gel electrophoresis. It was then cut with BamH I and EcoR I and ligated into the same sites of the plasmid pEGFP2-IRES (Clontech). This vector confers resistance to Kanamycin but not Ampicillin. After ligation of the Ampicillin resistance gene into its polylinker, it was used to transform E. coli which were plated onto LB agar containing Ampicillin. Numerous colonies were obtained, proving that the Ampicillin resistance gene was functional. The Ampicillin resistance gene was then excised from pEGFP2-IRES using Afl II and BstB I, and ligated into vector A that had been cut with the same enzymes, as described above, and used to transform E. coli and plated onto Ampicillin containing plates. Numerous colonies were obtained, showing that the vector now conferred resistance to Ampicillin.

Vector F:

Preparation of Glutathione S-Transferase (GST) gene and polylinker:

The GST gene and polylinker were obtained from the bacterial expression vector pGEX-KT (Hakes et al., 1992, Analytical Biochemistry 202: 293-298), and modified by PCR to include an Nhe I site at the 5' end of the GST gene using the proofreading DNA polymerase Pfx for accuracy as follows:

5' primer (oligo 26285):

The Nhe I site is underlined.

5' GCG GCT AGC ATG TCC CCT ATA CTA GGT TAT TG 3' (SEQ ID NO:19)

3' primer (oligo 26284):

The Not I site is underlined.

5' TCT GCG GCC GCG ACA AGC TGT GAC CGT CTC C 3' (SEQ ID NO:20)

The PCR product was purified by agarose gel electrophoresis, cut with Nhe I and Not I, repurified, and ligated into Vector B which had been cut with the same enzymes. A similar construct was made which included the cytoplasmic tail of NPP1 between the BamH I and EcoR I sites, and was shown to express a fusion protein of the expected size which could be purified by affinity chromatography on glutathione-coupled agarose beads.

Typical electroporation of mammalian cells

Linearisation of vector:

DNA: 78 μl

BSA: 10 μl

NEB buffer 4: 10 μl

BstB I: 2 μl

Incubate at 65 °C for 1.5 hours (use heat block with steel lid to avoid condensation). Run a 5 μl aliquot on a gel to confirm linearisation.

Ethanol precipitation:

Extract 1x with 100 μl chloroform to kill the enzyme (optional).

Add 10 μl of 3 M KAc and 200 μl of 95% ethanol. Mix by inversion.

Spin at 13K for 5 minutes in Eppendorf or similar centrifuge.

Discard supernatant and wash pellet in 70% ethanol to remove salt.

DO NOT RESUSPEND DNA!
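The volumes given above for the digest and the precipitation can be checked with simple arithmetic, sketched here for illustration. The function names are hypothetical. Treating the buffer as a 10x stock, and taking the precipitated sample as the 100 μl digest less the 5 μl gel aliquot, are assumptions made for the sketch rather than statements from the protocol.

```python
# Sketch: volume arithmetic for the linearisation digest and ethanol
# precipitation. Volumes are in microlitres.
DIGEST_UL = {"DNA": 78.0, "BSA": 10.0, "NEB buffer 4": 10.0, "BstB I": 2.0}


def total_volume(components: dict) -> float:
    """Sum the component volumes of a reaction mix."""
    return sum(components.values())


def dilution_factor(component_ul: float, total_ul: float) -> float:
    """Fold dilution of one component in the final mix."""
    return total_ul / component_ul


def precipitation_mix(sample_ul, salt_ul, salt_molar, ethanol_ul, ethanol_fraction):
    """Return (salt molarity before ethanol is added, final ethanol fraction)."""
    salt_before = salt_molar * salt_ul / (sample_ul + salt_ul)
    ethanol_final = ethanol_fraction * ethanol_ul / (sample_ul + salt_ul + ethanol_ul)
    return salt_before, ethanol_final


if __name__ == "__main__":
    total = total_volume(DIGEST_UL)
    print(f"digest volume: {total} ul")
    print(f"buffer dilution: {dilution_factor(DIGEST_UL['NEB buffer 4'], total)}-fold")
    salt, ethanol = precipitation_mix(95.0, 10.0, 3.0, 200.0, 0.95)
    print(f"KAc before ethanol: {salt:.2f} M, final ethanol: {ethanol:.0%}")
```

On these assumptions the digest totals 100 μl with the buffer diluted ten-fold, and the precipitation is carried out at a little over 60% ethanol.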

Preparation of cells:

For NS-1 cells, spin down and wash 1-2x in sterile PBS. For Ltk- cells, pour off medium but keep it. Wash cells twice in saline in the flask. Trypsinise 2-3 flasks. Inactivate the trypsin with old medium. Spin down cells and resuspend in 2.4 ml PBS (for 1 control and 2 test groups).

Transfection:

Resuspend DNA in 50 μl sterile PBS in the hood.

Add all the DNA to 800 μl cells in sterile PBS (typically about 5-10 million cells; numbers are not critical), and transfer the mix into a 0.8 cm Gene Pulser cuvette.

For NS-1 cells: electroporate about 5 million cells; 250 V, 500 μF, 0.8 cm cuvette, time constant about 10 ms (volume in cuvette is 800 μl).

For Ltk- cells, pour off medium but keep it; wash in saline in flask, trypsinise 2-3 flasks, collect cells in PBS, and electroporate them at 240 V, 960 μF.
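For orientation, the electrical settings above can be related to one another with a sketch that assumes an ideal capacitor-discharge (exponential decay) pulse, for which the time constant equals resistance multiplied by capacitance. The function names are hypothetical, and the derived values are estimates under that assumption, not measurements.

```python
# Sketch: relations between electroporation settings for an ideal
# exponential-decay pulse. Input values are taken from the protocol above.

def implied_resistance_ohm(time_constant_s: float, capacitance_f: float) -> float:
    """Sample resistance implied by tau = R * C."""
    return time_constant_s / capacitance_f


def stored_energy_j(voltage_v: float, capacitance_f: float) -> float:
    """Energy stored in the capacitor before discharge: 0.5 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def field_strength_v_per_cm(voltage_v: float, gap_cm: float) -> float:
    """Nominal field strength across the cuvette gap."""
    return voltage_v / gap_cm


if __name__ == "__main__":
    # NS-1 settings: 250 V, 500 uF, 0.8 cm cuvette, time constant about 10 ms
    print(implied_resistance_ohm(10e-3, 500e-6), "ohm")
    print(stored_energy_j(250.0, 500e-6), "J")
    print(field_strength_v_per_cm(250.0, 0.8), "V/cm")
    # Ltk- settings: 240 V, 960 uF
    print(stored_energy_j(240.0, 960e-6), "J")
```

A time constant of about 10 ms at 500 μF corresponds to a sample resistance of roughly 20 ohms, which is in keeping with cells suspended in a conductive buffer such as PBS.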

Post transfection:

Immediately after electroporation, transfer all the contents of the cuvette including debris into a previously prepared flask containing DMEM/10% FCS.

Add HAT concentrate the next day.

Expect to see colonies within a week (10-14 days for Ltk- cells).

Analysis of expression:

Analysis of expression will depend on the individual system to be studied, but could include immunofluorescence (microscopy or flow cytometry) or analysis of biological activities such as enzyme activity.

RESULTS

Vector A

This bicistronic vector is very extensively modified from the Clontech pIRES series.

The promoter has been replaced with the very strong SRα promoter to encourage high expression of the gene of interest.

The promoter is followed by the cleavable signal sequence of influenza haemagglutinin, allowing cDNAs that are cloned in-frame at the BamH I site to be secreted into the medium.

The BamH I site is followed by a large polylinker with numerous sites for commonly used restriction enzymes.

The selectable marker is a fusion protein made up of hypoxanthine phosphoribosyl transferase (HPRT) combined with enhanced green fluorescent protein (EGFP) so that transfected cells are fluorescent. When the vector is linearised (at the BstB I site, for example), it can be transfected by electroporation into the very commonly used mouse myeloma cell lines that lack HPRT, and transfectants can be selected using HAT medium. If one million cells are transfected (an easy number), dozens or hundreds of transfectant clones are commonly obtained. If more clones are desired, it is easy to scale up.

Transfected colonies are visible after 3-5 days, and after about a week all untransfected cells are dead, and essentially 100% of the growing cells express the gene of interest, although at widely differing levels of expression as shown by the heterogeneous profiles seen in flow cytometry using the fluorescence activated cell sorter (FACS; Figure 2). The selection system is inexpensive and very robust and rapid. It is then possible, by repeated rounds of FACS sorting, to isolate rare clones that express much higher levels (Figure 3). Enrichment of 50-100 fold over the starting population may be possible by several rounds of FACS sorting in which the brightest cells are selected and grown.

If it is desired to obtain cytoplasmic expression, the Nhe I site that lies on the 5' side of the signal sequence can be used for cloning. If this is followed immediately by a Methionine codon (ATG), position -3 with respect to the A will be deoxyadenosine, which follows the Kozak rules for initiation of protein synthesis.

The presence of EGFP also allows Vector A to be used to transfect cells that do not lack HPRT (eg CHO cells), increasing its general usefulness. In this case, transfectants are selected by flow cytometry (FACS sorting).

Vector B

Vector B is a modified version of vector A, but with the following additional features:

1. The HPRT-GFP protein is destabilised by the presence of a "PEST" sequence at its 3' end, derived from ornithine decarboxylase. This results in a much shorter half-life of the fusion protein, which may be helpful in two ways:

(a) By reducing the expression of HPRT, those clones producing large amounts of mRNA are selected, and low expressing clones will not survive. In this way, expression of the gene of interest will be greater.

(b) Although controversial, there are claims that GFP can be toxic to cells, and high level expression of GFP is not always desirable. The reduced stability of the fusion protein may help minimise this potential problem.

2. The vector incorporates a unique feature of a "linearisation polylinker" which replaces the BstB I site in vector A. The linearisation polylinker comprises a cluster of seven unique restriction sites, all of which are potentially available for linearisation. This greatly increases the chances that one or more sites will be absent from the gene of interest, and virtually guarantees that linearisation will be possible without destroying the gene of interest.

3. The presence of EGFP also allows Vector B to be used to transfect cells that do not lack HPRT (eg CHO cells), increasing its general usefulness. In this case, transfectants are selected by flow cytometry (FACS sorting).
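The benefit of offering seven linearisation sites rather than one can be illustrated with a rough probability sketch. It assumes that the gene of interest behaves like a random sequence with equal base frequencies and that the sites occur independently of one another. Neither assumption holds exactly, since for example the Dra I sequence lies within the Swa I sequence. The 2,000 bp insert length and the function names are hypothetical.

```python
# Sketch: chance that at least one linearisation site is absent from an
# insert, under a random-sequence model with independent sites.
import math

# (enzyme, number of specified bases, orientations that count as a site)
# Hph I is asymmetric, so either orientation of its sequence counts.
LINEARISATION_SITES = [
    ("Xcm I", 6, 1),
    ("Age I", 6, 1),
    ("Hph I", 5, 2),
    ("BstE II", 6, 1),
    ("BssH II", 6, 1),
    ("Swa I", 8, 1),
    ("Dra I", 6, 1),
]


def p_absent(length_bp: int, specified_bases: int, orientations: int = 1) -> float:
    """Poisson estimate of the probability that a site does not occur."""
    expected = orientations * length_bp / 4 ** specified_bases
    return math.exp(-expected)


def p_at_least_one_absent(length_bp: int, sites=LINEARISATION_SITES) -> float:
    """Probability that at least one of the listed sites is missing."""
    p_all_present = 1.0
    for _name, bases, orientations in sites:
        p_all_present *= 1.0 - p_absent(length_bp, bases, orientations)
    return 1.0 - p_all_present


if __name__ == "__main__":
    insert_bp = 2000  # hypothetical insert length
    print(f"single six-base site absent: {p_absent(insert_bp, 6):.2f}")
    print(f"at least one of seven absent: {p_at_least_one_absent(insert_bp):.4f}")
```

Under this model a single six-base site is absent from a 2,000 bp insert only about 60% of the time, whereas the chance that at least one of the seven sites is absent exceeds 99%.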

Vector C

This vector is similar to Vector A, but the GFP-HPRT fusion protein is replaced by thymidine kinase (TK) from herpes simplex virus. This vector is designed for use in the adherent cell line Ltk-, which lacks thymidine kinase and is therefore killed in HAT medium but is able to be rescued by replacement of the TK gene.

This cell line was originally isolated in the 1940s, and a mutant lacking TK was produced in 1961. It is very robust and easy to grow, and is in the public domain. It has been widely used for transfection using HAT selection by cotransfection with a separate plasmid encoding TK. The advantages of the use of a bicistronic vector for this purpose are essentially the same as the use of HPRT as in Vectors A and B.

It is suitable for production of secreted proteins by transfected cells, by ligation of the gene of interest in-frame at the BamH I site, and a large number of other common restriction sites are available downstream for the 3' end of the gene of interest.

If cytoplasmic expression is desired, the Nhe I site should be used at the 5' end of the gene of interest, followed by an initiator methionine (ATG) codon. If this is done, the sequence will contain the favorable Kozak initiation sequence, where position -3 is a deoxyadenosine residue. Any of the remaining sites in the polylinker can be used for ligation of the 3' end.

Vector D

This vector is similar in its properties to Vector C, but instead of a cleavable signal sequence for secretion, a non-cleavable amino terminal signal/anchor sequence is present. If the cDNA of interest is cloned in-frame at the BamH I site, this allows generation of a type II membrane protein that will remain attached to the external surface of the plasma membrane of the Ltk- cells.

The special attractiveness of this vector lies in the ability to express cDNAs as membrane proteins on the surface of adherent cells. This property could be particularly useful for the production and screening for monoclonal antibodies, in a system similar to enzyme-linked immunosorbent assay (ELISA), but with several major advantages:

1. There is no need to purify the protein of interest (which is usually essential to coat the plates for ELISA).

2. When proteins are passively adsorbed onto plastic plates in ELISA, it is common to find that more than 90% of the protein molecules are denatured. In contrast, cell surface expression will only be possible if the protein of interest is correctly folded. This should greatly facilitate the generation of monoclonal antibodies to native proteins, and the proteins do not have to be pure. All that is required is a cDNA and the ability to modify the ends to provide in-frame ligation at the BamH I site, and to provide a suitable site at the 3' end for ligation. The availability of an extensive range of restriction sites in the polylinker should make this process easy. The ends of any cDNA can be easily and inexpensively modified by design of appropriate oligonucleotide primers and the polymerase chain reaction (PCR).

3. The fact that the cells are adherent means that no centrifugation steps are required for washing off the unbound antibodies. If readout using immunofluorescence is desired, the plate can be turned upside down and read under a low or medium powered lens as described by Stitz et al, J Immunol Methods, 1988, 106, 211-6. If membrane proteins are to be detected, the cells would not be fixed or permeabilised. If it is desired to analyse intracellular proteins, the gene of interest could be cloned using the Nhe I site at the 5' end to avoid the signal sequence, and any of the downstream restriction sites could be used for ligation of the 3' end.
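The primer-based modification of cDNA ends mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the inventors' procedure: the reading frame across the BamH I site is assumed to need no extra bases (adjust the hypothetical `frame_pad` parameter otherwise), the 3' site is passed in by the user (Not I is used below purely as an example), the `CGC` clamp and 18-base annealing length are arbitrary choices, and the coding sequence is invented.

```python
BAMHI = "GGATCC"  # BamH I recognition sequence


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_primers(cds, site_3prime, anneal=18, clamp="CGC", frame_pad=""):
    """Return (forward, reverse) primers that add BamH I at the 5' end
    and site_3prime at the 3' end of a coding sequence.

    frame_pad holds any bases needed between the BamH I site and the first
    codon to keep the insert in frame with the vector (an assumption here).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    if len(cds) < anneal:
        raise ValueError("coding sequence shorter than the annealing region")
    # Both sites are assumed palindromic, so checking one strand is enough.
    for site in (BAMHI, site_3prime):
        if site in cds:
            raise ValueError(f"internal {site} site would be cut during cloning")
    forward = clamp + BAMHI + frame_pad + cds[:anneal]
    reverse = clamp + site_3prime + revcomp(cds[-anneal:])
    return forward, reverse


# Hypothetical 12-codon coding sequence and Not I (GCGGCCGC) as the 3' site.
example_cds = "ATGGCCAAGCTGACCAGCGCCGTGCCCGTGCTGTAA"
fwd, rev = design_primers(example_cds, "GCGGCCGC")
print(fwd)
print(rev)
```

The check for internal sites matters because a BamH I or 3' site inside the cDNA would fragment the insert when the PCR product is digested before ligation.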

Vector E

Vector E is similar to Vector A but instead of Kanamycin resistance, the selectable marker for growth in E. coli is Ampicillin resistance, which is somewhat more robust than Kanamycin resistance (ie the concentration of Ampicillin is less critical than for Kanamycin).

Vector F

Vector F is similar to vector B but with highly significant differences. It is designed for cytoplasmic expression of the desired protein in transfected mammalian cells that lack HPRT (eg the commonly available mouse myeloma cell lines). The expressed protein is made as a fusion protein with glutathione S-transferase, which has been widely used for bacterial expression.

Production of the desired protein as a fusion with GST allows simple one-step purification of the fusion protein on glutathione-agarose beads (Smith, 2000. Methods in Enzymology vol 326, pages 254-270).

There are many reports of vectors that generate fusions with glutathione S-transferase in the cytoplasm of mammalian cells (Tsai et al 1997. Biotechniques 23: 794-800). See also GST vectors for yeast (Mitchell et al, Yeast 9:715 (1993); Romanos et al, Gene 152: 137 (1995)), baculovirus (Davies et al, Bio/Technology 11:933 (1993), Beekman et al, Gene 146: 285 (1994) and Wang et al, Virology 208:142 (1995)), mammalian cells (Rudert et al, Gene 169:281 (1996), Chumakov and Koeffler Gene 131:231 (1993) and Chatton et al, Bio/Techniques 18:142 (1995)) and finally one for Xenopus microinjection (MacNicol et al, Gene 196:25 (1997)).

None of these vectors were bicistronic and they lacked the major features of the current series of vectors. In particular, the fact that the vectors were not bicistronic means that they lacked the tight linkage between growth and expression that is possible with the vectors in this patent application. Vector F also has the unique and major advantages of (a) the selectable marker HPRT is fused to destabilised green fluorescent protein (GFP) allowing enrichment for highly expressing clones by FACS sorting and encouraging high level expression by destabilising the HPRT gene as has already been explained for Vector B, and (b) the presence of a linearisation polylinker to allow the choice of many restriction sites for linearisation, which also facilitates tight coupling between growth and expression.

In addition, Vector F has the significant advantage that it includes a pentaglycine "kinker" immediately prior to the thrombin cleavage site, which greatly improves the efficiency of thrombin cleavage to release the desired protein from the glutathione S-transferase (see Hakes et al, 1992, supra).
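One practical use of a linearisation polylinker is to pick an enzyme that does not also cut within the cloned gene of interest. The following Python sketch illustrates that choice under stated assumptions: the recognition sequences are the standard published ones for six of the enzymes named in the claims (the site listed as "HpH I" is left out because its identity is unclear from the text), the function names are hypothetical, and the insert sequence is invented.

```python
import re

# Standard recognition sequences; N stands for any base.
LINEARISATION_SITES = {
    "Xcm I": "CCANNNNNNNNNTGG",
    "Age I": "ACCGGT",
    "Bst E II": "GGTNACC",
    "Bss H II": "GCGCGC",
    "Swa I": "ATTTAAAT",
    "Dra I": "TTTAAA",
}


def site_regex(site: str) -> "re.Pattern[str]":
    """Compile a recognition sequence, expanding N to any of A, C, G, T."""
    return re.compile(site.replace("N", "[ACGT]"))


def non_cutters(insert: str) -> list:
    """Names of enzymes whose site is absent from the insert.

    All six sites are palindromic, so searching one strand is sufficient.
    """
    insert = insert.upper()
    return [
        name
        for name, site in LINEARISATION_SITES.items()
        if not site_regex(site).search(insert)
    ]


# Hypothetical insert containing an Age I site and a Dra I site.
example_insert = "ATGACCGGTTTTAAACCC"
print(non_cutters(example_insert))
```

Any enzyme returned by `non_cutters` would still need to be checked against the rest of the plasmid, since the sketch only examines the insert.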

Vector G

This vector is similar to vector E but the Sac II site and the second Sma I site in the polylinker have been deleted, so that the Sma I site that remains in the polylinker is now unique which makes the polylinker more useful.

Vector H

It is similar to vector C, but has Ampicillin resistance instead of Kanamycin resistance. It had the EcoR I site at the 3' end of the thymidine kinase gene removed so that the EcoR I site in the polylinker is now unique and hence useable.

Genes ligated in-frame at the BamH I site can be expressed as secreted proteins.

Vector I

Essentially identical to vector H, except that the cleavable influenza haemagglutinin secretion signal sequence was removed by cutting with Nhe I and BamH I, and replaced by the cytoplasmic tail and membrane anchor of mouse NPP1 (PC-1).

Genes cloned in-frame at the BamH I site can be expressed as membrane proteins.

Other variations

Vectors B, C, and D can be easily modified to change the antibiotic resistance marker from Kanamycin to Ampicillin.

It is also a straightforward matter to substitute the neomycin/kanamycin resistance gene for the HPRT gene in each of vectors A-F, which would allow the use of G418 as the selectable marker in mammalian cells, in which case any mammalian cell could be used, without having to have a mutation that deletes HPRT or TK (thymidine kinase).

Vector ChATX

This construct was linearised with BstB I and used to transfect an Ltk- cell line. It is designed to allow secretion of the extracellular domain of human autotaxin into the culture medium.

Vector pdSRalHEm.shATX

This construct was linearised with BstB I and used to transfect the NS-1 myeloma cell line. It is designed to allow secretion of the extracellular domain of human autotaxin into the culture medium.

EXAMPLE 2

Method

Vector was modified from pIRES2-EGFP (Clontech) to provide the following:

1. Influenza hemagglutinin signal sequence to allow secretion of cloned proteins.

2. Rearranged polylinker to suit requirements.

3. Red fluorescent protein was inserted to replace green fluorescent protein (see below).

Made in series of steps:

1. BamHI site at approx 706 deleted by cutting, filling in and religation.

2. Not I site at approx 2024 just after GFP stop codon deleted by cutting, filling in and religation.

3. Cut with Nhe I and EcoRI and PCR'd HA signal sequence inserted at these sites.

4. Cut with Sal I and Sac II and synthetic Xba I/Not I adapter inserted.

Ligation of Sal I/Xba I/Not I/Sac II linker

SR alpha promoter from pME 18S had BamHI site deleted from intron, and then promoter was cut out with Hind II and EcoRI and ligated to a short double stranded oligo with Hind III and EcoRI overhangs, but which did not recreate these sites.

RFP was PCR-ed using oligos 14328 (EcoRI and BstXI at 5' end) and 14329 (NotI and BamHI at 3' end). PCR product was cut with EcoRI and BamHI and parked into Litmus 28. It was cut out of there by partial digest using BstXI and NotI (BstXI is present in the sequence of RFP) and ligated into pIRES-EGFP (V577).

The NotI site at the 3' end of RFP at position 2266 was abolished by cutting, filling in and religation (this is V733).

V733 was cut with XmnI and BstBI and the fragment containing RFP was ligated into pIRES-EGFP, B-N-SS+N+X+SRalpha (V656). The XmnI site at position 1146 was lost during ligation.

Minipreps were checked with BstXI, FspI, and XmnI and proved to be correct.

Results

The bicistronic vector pRIRESEC-1 was designed to complement the pHAT vectors by allowing simultaneous expression of two different proteins in the same cell (for example, heavy and light chains of antibodies, but this general process could be used to express any two proteins in cells). In a typical application, the antibody light chain would be ligated into one of the pHAT vectors that provided HAT selection and Green Fluorescent Protein, and the construct transfected into HAT-sensitive cells and transfectants selected by HAT medium. High level expression of light chains can be achieved by flow cytometry, selecting the brightest 1% of cells, re-culturing and repeating the process as necessary to achieve a population of cells with the highest possible expression.

After high level expression of light chains is achieved, the cells are transfected with pRIRESEC-1 containing antibody heavy chain cDNA. A cleavable secretory signal sequence is provided in the vector (from influenza haemagglutinin), so the signal sequence of the antibody heavy chain should be omitted. Alternatively, if desired, the endogenous heavy chain signal sequence could be used, in which case the influenza signal sequence would be omitted and the 5' end of the endogenous antibody signal sequence would be ligated to the Nhe I site in the vector.

The transfected cells are then analysed by flow cytometry, and cells that possess both green and red fluorescence are isolated by cell sorting. Repeated rounds of cell sorting will allow selection of cells expressing both green fluorescence (indicating light chain expression) and higher and higher levels of red fluorescence, indicating expression of the antibody heavy chain. In this way, cells can be isolated which secrete intact antibodies at high levels. It should be noted that it is highly preferable to start with expression of light chains and then add heavy chains, because isolated heavy chains without light chains are thought to be toxic to cells (Argon, Y., Burrone, O. R., and Milstein, C. (1983). Molecular characterization of a nonsecreting myeloma mutant, Eur J Immunol 13, 301-5). By first selecting for high level light chain expression, the cells will express an excess of light chains and when the heavy chains are added, they will spontaneously pair up with light chains such that there are no unpaired heavy chains in the cell. Free light chains are generally not toxic to cells.
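The two sorting decisions described above (taking the brightest 1% of green cells, then isolating cells positive for both green and red) can be sketched in Python as follows. This is a simplified illustration, not instrument software: the function names, the gate values and the event data are all hypothetical, and a real sort would set its gates on the cytometer against untransfected controls.

```python
def brightest_indices(intensities, fraction=0.01):
    """Indices of the brightest `fraction` of events, brightest first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(intensities) * fraction))
    order = sorted(
        range(len(intensities)), key=lambda i: intensities[i], reverse=True
    )
    return order[:k]


def double_positive(events, green_gate, red_gate):
    """Indices of events at or above both gates; events are (green, red) pairs."""
    return [
        i
        for i, (green, red) in enumerate(events)
        if green >= green_gate and red >= red_gate
    ]


# Hypothetical green intensities for 200 events (first round, light chain only).
green_only = [float(i) for i in range(200)]
print(brightest_indices(green_only))

# Hypothetical (green, red) intensities after transfection with the second vector.
events = [(900.0, 20.0), (850.0, 700.0), (30.0, 800.0), (950.0, 990.0)]
print(double_positive(events, green_gate=500.0, red_gate=500.0))
```

Repeating the double-positive selection with a progressively higher `red_gate` corresponds to the successive rounds of sorting for increasing red fluorescence.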

It should be noted that the strategy given above could be used for any two proteins (ie not just for antibodies), and could be used with minor modifications for secretory, membrane or cytoplasmic proteins, depending on the presence or absence of signal sequences or membrane anchor sequences.

Those skilled in the art will appreciate that the invention described herein is susceptible to variations and modifications other than those specifically described. It is to be understood that the invention includes all such variations and modifications. The invention also includes all of the steps, features, compositions and compounds referred to or indicated in this specification, individually or collectively, and any and all combinations of any two or more of said steps or features.

BIBLIOGRAPHY

Argon, Y., Burrone, O.R., and Milstein, C. (1983). Molecular characterization of a nonsecreting myeloma mutant, Eur J Immunol 13: 301-5.

Beekman et al, Gene 146: 285 (1994)

Chatton et al, Bio/Techniques 18:142 (1995)

Chumakov and Koeffler Gene 131:231 (1993)

Cullen B.R. and Malim M.H., 1992, Secreted placental alkaline phosphatase as an eukaryotic reporter gene. Methods in Enzymology 216:362-368

Davies et al, Bio/Technology 11:933 (1993)

Hakes, D.J. and Dixon, J.E. (1992) New vectors for high level expression of recombinant proteins in bacteria. Analytical Biochemistry 202: 293-298

Harrison, T.M., Hudson, K., Munson, S.E., Uff, S., and Glassford, S. (1995). Derivation and partial analysis of two highly active myeloma cell transfectants. Biochim Biophys Acta 1260, 147-56

Kaufman, 2000

Liu, 1992

MacNicol et al, Gene 196:25 (1997)

Madison, E. L., and Bird, P. (1992). A vector, pSHT, for the expression and secretion of protein domains in mammalian cells., Gene 121, 179-180

Mitchell et al, Yeast 9:715 (1993)

Ringertz N.R. and Savage R.E. Cell Hybrids. Academic Press (1976), Selection of Hybrids Made from Drug Resistant Cells, 150-154

Rogers, S., Wells, R., and Rechsteiner, M. (1986). Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis, Science 234, 364-8

Romanos et al, Gene 152: 137 (1995)

Rudert et al, Gene 169:281 (1996)

Sheeley et al., 1997

Smith, D.B. 2000. Generating fusions to glutathione S-transferase. Methods in Enzymology vol 326, pages 254-270

Stitz, L., Hengartner, H., Althage, A., and Zinkernagel, R. M. (1988). An easy and rapid method to screen large numbers of antibodies against internal cellular determinants, J Immunol Methods 106, 211-6

Takebe, Y., Seiki, M., Fujisawa, J., Hoy, P., Yokota, K., Arai, K., Yoshida, M., and Arai, N. (1988). SR alpha promoter: an efficient and versatile mammalian cDNA expression system composed of the simian virus 40 early promoter and the R-U5 segment of human T-cell leukemia virus type 1 long terminal repeat, Mol Cell Biol 8, 466-72

Tsai, R. Y, and Reed, R.R. 1997. Using a eukaryotic GST fusion vector for proteins difficult to express in E. coli. Biotechniques 23: 794-800

Wang et al, Virology 208:142 (1995)

Wigler, M., Sweet, R., Sim, G. K., Wold, B., Pellicer, A., Lacy, E., Maniatis, T., Silverstein, S., and Axel, R. (1979). Transformation of mammalian cells with genes from procaryotes and eucaryotes, Cell 16, 777-85.

Claims

THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:

1. A nucleic acid expression construct said construct comprising a promoter operably linked to both:

(ii) a second nucleic acid region located in the 3' direction to the first nucleic acid region, said second nucleic acid region comprising one or more IRES operably linked to a nucleic acid sequence encoding one or more selectable markers, at least one of which selectable markers facilitates antifolate salvage pathway-based selection;

2. The nucleic acid expression construct of claim 1 wherein said nucleic acid is DNA.

3. The expression construct according to claim 2 wherein said expression construct is plasmid derived.

4. The expression construct according to claim 2 or 3 wherein said promoter is CMV, EF-1α, SV40 early, beta actin, phosphoglycerokinase, thymidine kinase, HTLV-I long terminal repeat or adenovirus major late promoter.

5. The expression construct according to claim 2 or 3 wherein said promoter is SRα.

6. The expression construct according to claim 2 or 3 wherein said promoter is cytomegalovirus.

7. The expression construct according to any one of claims 2 to 6 wherein said antifolate salvage pathway-based selection is HAT-based selection.

8. The expression construct according to claim 7 wherein said selectable marker which facilitates HAT-based selection is thymidine kinase or functional fragment, homologue, derivative or mimetic thereof.

9. The expression construct according to claim 7 wherein said selectable marker which facilitates HAT-based selection is HPRT or functional fragment, homologue, derivative or mimetic thereof.

10. The expression construct according to any one of claims 2, 3 or 7-9 wherein said second nucleic acid molecule encodes a second selectable marker, which second selectable marker is a fluorescent protein.

11. The expression construct according to claim 10 wherein said fluorescent protein is green fluorescent protein, red fluorescent protein, yellow fluorescent protein or cyan fluorescent protein.

12. The expression construct according to any one of claims 2, 3 or 7-9 wherein said nucleic acid molecule encodes a second selectable marker, which second selectable marker is a secretory alkaline phosphatase.

13. A DNA expression construct said construct comprising a promoter operably linked to both a first and second DNA region:

(i) which first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a DNA sequence encoding a protein of interest and optionally a signal sequence and/or fusion tag and

(ii) which second DNA region located in the 3' direction to the first DNA region, said second DNA region comprising one or more IRES sequences operably linked to a DNA sequence encoding HPRT or thymidine kinase and optionally GFP, alkaline phosphatase and/or a PEST sequence; and optionally

(iv) one or more linearisation polylinkers

14. The expression construct according to claim 13 wherein said signal sequence is the influenza haemagglutinin signal sequence.

15. The expression construct according to claim 13 wherein said signal sequence is a non-cleavable amino terminal anchor sequence derived from the anchor of murine NPP-1.

16. The expression construct according to claim 13 wherein said fusion tag is GST.

17. The expression construct according to claim 13 wherein said fusion tag is staphylococcal protein A, streptococcal protein G, hexahistidine, calmodulin-binding peptides or maltose binding protein.

18. The expression construct according to claim 13 wherein said antibiotic resistance gene is kanamycin, neomycin, G418 or ampicillin.

19. The expression construct according to claim 13 wherein said linearization polylinker requires the recognition of a sequence of 6, 7, 8, 9 or 10 bases for cleavage to occur.

20. The expression construct according to claim 19 wherein said expression construct comprises 2, 3, 4, 5, 6, 7 or more restriction sites.

21. The expression construct according to claim 20 wherein said restriction sites are selected from:

- Xcm I

- Age I

- HpH I

- Bst E II

- Bss H II

- Swa I

- Dra I.

22. The expression construct according to any one of claims 13-21 wherein said promoter is SRα or cytomegalovirus and said construct is plasmid-derived.

23. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA region, which:

(ii) second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES operably linked to a DNA sequence encoding HPRT and GFP; and

(iii) an antibiotic resistance gene optionally under its own promoter control

24. The expression construct according to claim 23 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a cleavable signal sequence, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is Neo/Kan resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

25. The expression construct according to claim 24 wherein said construct corresponds to pHAT-1.

26. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA which:

(i) which first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and a signal sequence; and

(ii) which second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES sequence operably linked to a DNA sequence encoding HPRT, GFP and a PEST sequence; and

(v) an antibiotic resistance gene optionally under its own promoter control; and optionally

(vi) a linearisation polylinker

27. The expression construct according to claim 26 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a cleavable signal, and more preferably the influenza haemagglutinin signal sequence, said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allows expression in E. coli and said linearisation polylinker comprises one or more of the restriction sites Xcm I, Age I, HpH I, Bst E II, BssH II, Swa I or Dra I.

28. The expression construct according to claim 27 wherein said construct corresponds to pHAT-2.

29. The expression construct according to claim 28 wherein said construct corresponds to the construct deposited in National Measurement Institute under Accession No. NM06/00014.

30. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA region:

(ii) which second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES sequence operably linked to a DNA sequence encoding thymidine kinase; and

(iii) an antibiotic resistance gene optionally under its own promoter control

31. The expression construct according to claim 30 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a cleavable signal sequence, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is Neo/Kan resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

32. The expression construct according to claim 31 wherein said construct corresponds to pHAT-3.

33. The expression construct according to claim 32 wherein said construct corresponds to the construct deposited with National Measurement Institute under Accession No. NM06/00013.

34. The expression construct according to claim 30 wherein said promoter is SRα or cytomegalovirus, said signal sequence is a non-cleavable signal sequence, and more preferably the membrane anchor sequence of NPP1, and said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

35. The expression construct according to claim 34 wherein said construct corresponds to pHAT-4.

36. The expression construct according to claim 30 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is a non-cleavable signal sequence, and more preferably the cytoplasmic tail and membrane anchor sequence of NPP1, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

37. The expression construct according to claim 36 wherein said construct corresponds to pHAT-9.

38. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA region:

(ii) which second DNA region is located in the 3' direction to the first DNA region, said second DNA region comprising an IRES sequence operably linked to a DNA sequence encoding HPRT and GFP; and

(iii) an antibiotic resistance gene optionally under its own promoter control

39. The expression construct according to claim 38 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid derived, said signal sequence is cleavable, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

40. The expression construct according to claim 39 wherein said construct corresponds to Vector pHAT-5.

41. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA region:

(i) which first DNA region comprises a DNA sequence encoding a protein of interest or a DNA sequence into which can be incorporated a nucleic acid sequence encoding a protein of interest and GST; and

(iii) an antibiotic resistance gene optionally under its own promoter control

42. The expression construct according to claim 41 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived and said antibiotic resistance gene is Kan/Neo resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

43. The expression construct according to claim 42 wherein said construct corresponds to vector pHAT-6.

44. The expression construct according to claim 43 wherein said construct corresponds to the construct deposited with the National Measurement Institute under Accession No. NM06/00013.

45. The expression construct according to claim 13 wherein said construct comprises a promoter operably linked to both a first and second DNA region:

(iii) an antibiotic resistance gene optionally under its own promoter control

46. The expression construct according to claim 45 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid-derived, said signal sequence is cleavable, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

47. The expression construct according to claim 46 wherein said construct corresponds to Vector pHAT-7.

48. The expression construct according to claim 47 wherein said construct comprises a promoter operably linked to both a first and second DNA region:

(iii) an antibiotic resistance gene optionally under its own promoter control

49. The expression construct according to claim 48 wherein said promoter is SRα or cytomegalovirus, said construct is plasmid derived, said signal sequence is cleavable, and more preferably the influenza haemagglutinin signal sequence, and said antibiotic resistance gene is ampicillin resistance driven by a promoter and regulatory sequences that allows expression in E. coli.

50. The expression construct according to claim 49 wherein said construct corresponds to Vector pHAT-8.

51. The expression construct according to claim 49 wherein said construct corresponds to vector pHAT-3.shATX or pHAT-10.shATX.

52. The expression construct corresponding to pHAT-11.

53. The expression construct corresponding to pHAT-12.

54. The expression construct corresponding to pHAT-13.

55. An isolated host cell, which host cell has been transfected with a construct according to any one of claims 1-54.

56. The host cell according to claim 55 wherein said cell is NS-1, X63AG, X63AG8.653, Spl2, NS-0, ltk-, MPC-11 or Y3.

57. A method of producing a protein of interest said method comprising transfecting the construct according to any one of claims 1-54 into a host cell, culturing said host cell for a time and under conditions sufficient to express said construct and optionally purifying said protein.

58. The protein product produced by the method according to claim 57.

59. Use of the expressed protein according to claim 58 in the treatment and diagnosis of patients.

60. A pharmaceutical composition comprising either a protein generated by the method of claim 52 or a construct according to any one of claims 1-54 together with one or more pharmaceutically acceptable carriers.

61. A kit for facilitating the expression of a protein of interest, said kit comprising a construct according to any one of claims 1-54 and optionally reagents useful for:

(ii) effecting transfection of the construct into a host cell; and

62. A method for screening for an agent capable of interacting with a protein of interest, said method comprising contacting a putative modulatory agent with a host cell according to claim 55, which host cell expresses the protein of interest in membrane bound form and detecting an altered expression phenotype.

63. Agents identified in accordance with the screening method of claim 62.

64. Use of the agents of claim 63 in the manufacture of a medicament for the treatment of a condition in a mammal.