WO2010018444A2

WO2010018444A2 - An expression vector and a method thereof

Info

Publication number: WO2010018444A2
Application number: PCT/IB2009/006517
Authority: WO
Inventors: Villoo Morawala Patell; Sunit Maitya; Ashutosh Vyas; Gopalakrishnan Chellappa
Original assignee: Avesthagen Limited
Priority date: 2008-08-12
Filing date: 2009-08-12
Publication date: 2010-02-18
Also published as: WO2010018444A3; BRPI0918008A2; AU2009280913A1; ZA201101882B; KR20110044769A; CN102177240A; EP2331693A2; MX2011001644A; CA2736580A1; JP2012506694A

Abstract

The present invention relates to vectors and compounds for the expression of recombinant soluble proteins. More particularly, the present invention relates to nucleic acid molecules, expression vectors, and host cells for the expression of recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein. The invention further relates to methods for preparing recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein using the host cells transfected with the expression vectors.

Description

AN EXPRESSION VECTOR AND A METHOD THEREOF

FIELD OF THE INVENTION

The present invention relates to vectors and compounds for the expression of recombinant soluble proteins. More particularly, the present invention relates to nucleic acid molecules, expression vectors, and host cells for the expression of recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein. The invention further relates to a method for preparing recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein using the host cells transfected with the expression vectors.

BACKGROUND OF THE INVENTION

In 1986, the US FDA approved human tissue plasminogen activator (tPA; Genentech, CA, USA), a protein produced in mammalian cells, for therapeutic use. Advances in cell culture and recombinant DNA technologies have facilitated the expression of a variety of similar proteins of therapeutic or other economic value using genetically engineered cells. Currently, several monoclonal antibodies have regulatory approval for therapeutic use, and several hundred are at various stages of development and approval. Like tPA, most of these proteins are expressed in immortalized Chinese hamster ovary (CHO) cells, but other cell lines, such as mouse myeloma (NS0), baby hamster kidney (BHK), and human embryonic kidney (HEK-293) cells, are also approved for recombinant protein production.

The expression of many biologically active therapeutics, which are derived from higher eukaryotic sources, often requires post-translational modifications which do not naturally occur in lower eukaryotic or prokaryotic cells, thus necessitating the use of cells derived from higher eukaryotic sources. Accordingly, a mammalian expression system is generally preferred for manufacturing most therapeutic proteins, since the required post-translational modifications are carried out in the cell line as well. A variety of mammalian cell expression systems are now available for expression of proteins. Generally, expression vectors use a strong viral or cellular promoter/enhancer to drive the expression of the recombinant gene. However, the level of expression of a recombinant protein achieved from these expression vectors/systems in mammalian cells is not commercially viable. Two critical issues need to be addressed during the production of therapeutics: (a) the time taken to provide the material, and (b) lowering the price of the material so that it is accessible to the masses, especially in developing or underdeveloped countries. Therefore, the biotechnology industry continues to look at new technologies and process development strategies that will reduce timelines and cost.

ENBREL (Etanercept), also known as TNFR:Fc fusion protein, is a recombinant fusion protein comprising the extracellular domain of the human tumor necrosis factor receptor superfamily, member 1B (p75) and the Fc domain of human IgG1. It is an important scientific advance for the treatment of rheumatoid arthritis: in many patients it has been shown to reduce the signs and symptoms of rheumatoid arthritis, polyarticular-course juvenile rheumatoid arthritis, ankylosing spondylitis, psoriatic arthritis, and psoriasis. Tumor Necrosis Factor (TNF) is a naturally occurring cytokine that is involved in normal inflammatory and immune responses. It plays an important role in the inflammatory processes of rheumatoid arthritis (RA), polyarticular-course juvenile rheumatoid arthritis (JRA) and the resulting joint pathology. Elevated levels of TNF are found in the synovial fluid of RA patients.

Production of recombinant proteins such as TNFR:Fc within mammalian cells can be difficult because of low genetic stability of the recombinant gene and/or the silencing of the recombinant gene. Several molecular mechanisms have been reported that may lead to gene silencing, including DNA methylation, negative position effects, and integration-dependent repression.

Thus, there is a need in the art to overcome the deficiencies of the known methods for producing recombinant fusion proteins, by providing expression systems with higher genetic stability during large scale production of such proteins. In order to facilitate production of large quantities of TNFR:Fc from cell culture, a novel expression vector has been developed. Use of this expression vector has been shown to increase the expression of the therapeutic protein. The cloning, expression, and purification of TNFR:Fc has been mentioned in this application.

SUMMARY OF INVENTION

The present disclosure includes one or more of the features recited in the appended claims and/or the following features which, alone or in any combination, may comprise patentable subject matter. The present invention is based on the discovery of novel Scaffold/Matrix Attachment Region (S/MAR) sequences, which may be used for increasing and stabilizing the expression yield of recombinant proteins in mammalian and other eukaryotic cells. The S/MAR sequences increase genetic stability of nearby transcription cassettes and inhibit gene silencing by interfering with mechanisms such as DNA methylation. Furthermore, the presence of S/MAR sequences is thought to decrease clone-to-clone variability through decreasing position effects.

Thus, in one embodiment, the present invention relates to an isolated nucleic acid having one or more nucleotide sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, variants and functional fragments thereof and sequences being at least 70% homologous thereto or sequences that hybridize to the isolated nucleic acid under stringent conditions.

In another embodiment, the present invention relates to an expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) and any combination(s) thereof. The S/MAR sequences may be selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants and functional fragments thereof and sequences being at least 70% homologous thereto as determined by pairwise DNA sequence alignment using matching methods like the BLAST (Basic Local Alignment Search Tool) algorithm.

In another embodiment, the present invention relates to an expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof, and a sequence encoding Tumor Necrosis Factor Alfa Receptor (TNFR) - IgG Fc fusion protein operably linked to one or more expression control elements.

The S/MAR sequence(s) may be located upstream or downstream of the transcriptional promoter within the expression vector. Further, the S/MAR sequence(s) may be located at a distance of from 0 to 10 kb from the sequence encoding Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

In a further embodiment, the present invention relates to a method for construction of an expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof. In another embodiment, the present invention relates to a host cell comprising an expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof.

In yet another embodiment, the present invention relates to a method for producing Tumor Necrosis Factor Alfa Receptor (TNFR) - IgG Fc fusion protein through recombinant expression using an expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof.

In a still further embodiment, the present invention relates to Tumor Necrosis Factor Alfa Receptor (TNFR) - IgG Fc fusion protein expressed by the expression vector carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof.

In yet another embodiment, the present invention relates to a method for producing Tumor Necrosis Factor Alfa Receptor (TNFR) - IgG Fc fusion protein by transfecting a mammalian cell with an expression construct carrying the gene encoding Tumor Necrosis Factor Alfa Receptor (TNFR) - IgG Fc fusion protein, and co-transfecting the same mammalian cell using a plasmid carrying Scaffold/Matrix Attachment Region (S/MAR) sequence(s) or any combination(s) thereof.

In a still further embodiment, the present invention relates to epigenetic and genetic factors that influence the biological activity of Scaffold/Matrix Attachment Region (S/MAR) sequence(s).

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is further explained with reference to the drawings in which:

Fig. 1 illustrates the construct of plasmid vector pCDNA3.1/TNFR:Fc;

Fig. 2 illustrates the construct of plasmid vector pCDNA3.1/MAR1/TNFR:Fc;

Fig. 3 illustrates the construct of plasmid vector pCDNA3.1/MAR2/TNFR:Fc;

Fig. 4 illustrates the construct of plasmid vector pCDNA3.1/MAR3/TNFR:Fc;

Fig. 5 illustrates the construct of plasmid vector pCDNA3.1/MAR4/TNFR:Fc;

Fig. 6 illustrates the construct of plasmid vector pCDNA3.1/MAR5/TNFR:Fc; and

Fig. 7 illustrates the construct of plasmid vector pCDNA3.1/MAR6/TNFR:Fc.

DETAILED DESCRIPTION OF THE INVENTION

The term "Scaffold/Matrix Attachment Region (S/MAR)" as used herein refers to non-consensus-like AT-rich DNA elements several hundred base pairs (bp) in length, which organize the nuclear DNA of the eukaryotic genome into some 60,000 chromatin domains, by periodic attachment to the protein scaffold or matrix of the cell nucleus. They are typically found in non-coding regions such as flanking regions, chromatin border regions, and introns.

By "at least 70% homology" is meant DNA wherein the nucleotide sequence is at least 70% homologous to a defined sequence, as measured by pairwise DNA sequence alignment using matching methods like the BLAST (Basic Local Alignment Search Tool) algorithm.
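As an illustration of how a 70% threshold might be checked once two sequences have already been aligned, the following minimal sketch computes percent identity over the aligned columns. It does not perform the alignment itself and is not the BLAST algorithm; the function names `percent_identity` and `meets_threshold`, the use of "-" as the gap character, and the choice to count gap columns as mismatches are all assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are written as "-", a column with a gap in only one
    sequence counts as a mismatch, and columns gapped in both are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the two aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from SEQ ID NO: 1 to 6.
    print(percent_identity("ATGC-ACGGG", "ATGCTACGGC"))
    print(meets_threshold("ATGC-ACGGG", "ATGCTACGGC"))
```

In practice the alignment, and the exact definition of the percentage, would come from the alignment tool in use; the sketch only shows the final comparison against a threshold.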

The term "functional fragments of SEQ ID NO: 1, 2, 3, 4, 5 and 6" as used herein means fragments of said sequences of a size large enough to have the desired effect on expression yields.

The term "flanking" as used herein means that the sequences in question are either directly connected to the expression vector or are connected by linking DNA sequences which may be up to 10 kb or more in length as long as such linking sequences do not interfere with the desired effect of the sequences. The expression vector comprises at a minimum a gene of interest and expression control elements operatively linked thereto.

The expression control elements comprise the usual regulatory elements such as transcriptional promoters, enhancers, repressors, RNA polymerase binding sites, polyadenylation sites, translation initiation signals, and translation termination signals, and their selection may be readily accomplished by one ordinarily skilled in the art.

The term "complement" refers to a nucleic acid sequence having a complementary nucleotide sequence and reverse orientation as compared to a reference nucleotide sequence. For example, the sequence 5' ATGCACGGG 3' is complementary to 5' CCCGTGCAT 3'.

As used herein, the term "variant" refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants possess biological activities that are the same or similar to those of the sequences in question.
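The definition of "complement" given above (complementary bases, reverse orientation) can be illustrated with a short sketch. The function name `reverse_complement` is chosen here for illustration, and the sketch assumes plain A/C/G/T input without ambiguity codes.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complement of `seq`, read in the reverse orientation."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # The example pair from the definition: 5' ATGCACGGG 3' and 5' CCCGTGCAT 3'.
    print(reverse_complement("ATGCACGGG"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation of this kind.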

Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 51%, more preferably at least 52%, more preferably at least 53%, more preferably at least 54%, more preferably at least 55%, more preferably at least 56%, more preferably at least 57%, more preferably at least 58%, more preferably at least 59%, more preferably at least 60%, more preferably at least 61%, more preferably at least 62%, more preferably at least 63%, more preferably at least 64%, more preferably at least 65%, more preferably at least 66%, more preferably at least 67%, more preferably at least 68%, more preferably at least 69%, more preferably at least 70%, more preferably at least 71%, more preferably at least 72%, more preferably at least 73%, more preferably at least 74%, more preferably at least 75%, more preferably at least 76%, more preferably at least 77%, more preferably at least 78%, more preferably at least 79%, more preferably at least 80%, more preferably at least 81%, more preferably at least 82%, more preferably at least 83%, more preferably at least 84%, more preferably at least 85%, more preferably at least 86%, more preferably at least 87%, more preferably at least 88%, more preferably at least 89%, more preferably at least 90%, more preferably at least 91%, more preferably at least 92%, more preferably at least 93%, more preferably at least 94%, more preferably at least 95%, more preferably at least 96%, more preferably at least 97%, more preferably at least 98%, and most preferably at least 99% identity to a sequence of the present invention.

The term "sequence motif", as used herein, refers to a certain nucleotide sequence of at least 2 nucleotides comprised in a larger oligonucleotide sequence. A sequence motif may occur once in an oligonucleotide sequence, or it may occur any number of times. For example, the oligonucleotide 5'-AUCAUCAUG-3' comprises three occurrences of the sequence motif 5'-AU-3', two occurrences each of the sequence motifs 5'-UC-3' and 5'-CA-3', and one occurrence of the sequence motif 5'-UG-3'.

The term "epigenetic factors" as used herein refers to any external process or factor that, in operation, affects the expression of a gene or a set of genes, and stands in contrast to "genetic factors", which refers to any internal process or factor, such as proteins, nucleic acids etc.
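The occurrence counts in the sequence motif example for the oligonucleotide AUCAUCAUG can be reproduced with a small sketch. The function name `count_motif` is an illustrative choice; occurrences are allowed to overlap, which is an assumption that makes no difference for the two-nucleotide motifs of the example.

```python
def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of `motif` in `sequence`, allowing overlaps."""
    if not motif:
        raise ValueError("motif must not be empty")
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Resume one position later so overlapping occurrences are found too.
        start = sequence.find(motif, start + 1)
    return count


if __name__ == "__main__":
    oligo = "AUCAUCAUG"
    for motif in ("AU", "UC", "CA", "UG"):
        print(motif, count_motif(oligo, motif))
```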

As used herein, the term "TNFR:Fc receptor protein" or "TNFR:Fc" refers to a protein having amino acid sequence similar to the extracellular domain of the human TNFRII (p75) protein and which can bind to its native ligand TNF-alpha, in turn inhibiting the TNF-alpha from binding to the cell membrane bound TNFRI or TNFRII. Of the two distinct forms of TNFR known to exist, the preferred TNFR of the present invention is the TNFRII (p75). Soluble TNFR constructs are devoid of the transmembrane region for facilitating secretion out of the cell. The soluble part of the TNFRII, which is the extracellular domain, is fused in frame to the Fc region of the human IgG1. The fusion protein of the present invention is biologically active, i.e. it can bind TNF in solution.

The term "biomolecule" as used herein refers to a substance, a compound or a component associated with a biological environment including, but not limited to, sugars, amino acids, peptides, proteins, oligonucleotides, polynucleotides, polypeptides, organic molecules, haptens, epitopes, biological cells, parts of biological cells, vitamins, hormones and the like.

The term "expression system" refers to any in vivo or in vitro biological system that is used to produce one or more proteins encoded by a polynucleotide.

The term "recombinant", as used herein means that a protein is derived from recombinant expression systems, which in this specification is a mammalian cell based expression system.

The term "isolated DNA sequence", as used herein refers to a DNA polymer in the form of a separate fragment or as a part of a larger DNA construct. Such sequences would be cloned in expression vectors and would enable isolation of the sequence in large amounts for identification, manipulation and recovery of the DNA fragment. Such sequences will be provided in an open reading form without any interruptions by non-translated DNA regions or by introns.

As used herein, the term "nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. DNA sequences encoding the proteins provided by this invention can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit.

"Chromosome" is an organized structure of DNA and proteins found inside the cell.

"Chromatin" is the complex of DNA and protein, found inside the nuclei of eukaryotic cells, which makes up the chromosome.

"DNA" or "Deoxyribonucleic Acid" contains genetic information. It is made up of different nucleotides A, G, T or C.

A "gene", as used herein refers to a deoxyribonucleotide (DNA) sequence coding for a given mature protein. It does not include untranslated flanking regions such as RNA transcription initiation signals, polyadenylation addition sites, promoters or enhancers.

As used herein, the term "transcriptional promoter" refers to a nucleic acid sequence that controls the expression of a coding sequence or functional RNA. Promoters may be derived from a native gene, or be composed of different elements derived from different promoters found in nature. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell and may be derived from genes encoding proteins either homologous or heterologous to the host cell. Examples of suitable promoters are the SV40 promoter, the MT-1 (metallothionein gene) promoter, the human cytomegalovirus immediate-early promoter etc.

"Transcriptional enhancer" refers to the sequence of gene that acts to initiate the transcription of the gene independent of the position or orientation of the gene.

"Transcriptional repressor" refers to the sequence of the gene that acts to inhibit the transcription of the gene independent of the position or orientation of the gene.

The term "signal peptide" refers to an amino terminal polypeptide preceding the secreted mature protein. In mature protein it is not present, as it is cleaved.

As used herein, the term "protein" refers to any polymer of two or more individual amino acids (whether or not naturally occurring) linked via peptide bonds, as occur when the carboxyl carbon atom of the carboxylic acid group bonded to the alpha-carbon of one amino acid (or amino acid residue) becomes covalently bound to the amino nitrogen atom of the amino group bonded to the alpha-carbon of an adjacent amino acid. These peptide bond linkages, and the atoms comprising them (i.e., alpha-carbon atoms, carboxyl carbon atoms (and their substituent oxygen atoms), and amino nitrogen atoms (and their substituent hydrogen atoms)) form the "polypeptide backbone" of the protein. In addition, as used herein, the term "protein" is understood to include the terms "polypeptide" and "peptide" (which, at times, may be used interchangeably herein).

The term "vector" as used herein refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors, usually derived from plasmids, function like a "molecular carrier", which will carry fragments of DNA into a host cell.

The vector may be any vector which may conveniently be subjected to recombinant DNA procedures, and the choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e., a vector which exists as an extra chromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into the host cell genome and replicated together with the chromosome(s) into which it has been integrated. The vector is preferably an expression vector in which an encoding DNA sequence is operably linked to additional segments required for transcription of the DNA. In general, the expression vector is derived from plasmid or viral DNA, or may contain elements of both. The term "operably linked" indicates that the segments are arranged so that they function in concert for their intended purposes, e.g. transcription initiates in a promoter and proceeds through the DNA sequence coding for the polypeptide.

The term "plasmids", as used herein refers to small circular double stranded polynucleotide structures of DNA found in bacteria and some other organisms. Plasmids can replicate independently of the host cell chromosome.

The term "replication" refers to the synthesis of DNA from its template DNA strand.

The term "transcription" refers to the synthesis of RNA from a DNA template.

The term "translation" refers to the synthesis of a polypeptide from messenger RNA.

The term "cis" refers to the placement of two or more DNA elements linked on the same plasmid. The term "trans" refers to the placement of two or more elements on two or more different plasmids.

The term "Orientation" refers to the order of nucleotides in the DNA sequence.

As used herein, the term "isolated nucleic acid fragment" refers to a polymer of DNA or RNA that is single or double stranded. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

The term "gene amplification" as used herein refers to the selective, repeated replication of a certain gene or genes without proportional increase in the copy number of other genes. It is an important and widespread developmental and evolutionary process in many organisms. Gene amplification can be classified in two categories: (i) developmentally regulated gene amplification, as seen in Xenopus oocytes, and (ii) spontaneously occurring gene amplification, such as the amplification of the lac region reported in Escherichia coli. The best known gene amplification in mammalian cells is that of the dihydrofolate reductase (DHFR) gene.

The term "transformation" refers to the transfer of a nucleic acid fragment into the genome of a host organism, resulting in genetically stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transformed" organisms.

The term "eukaryotic cell" refers to any cell from a eukaryotic organism whose cells are organized into complex structures by internal membrane and cytoskeleton. Any eukaryotic cell that can be used for gene/protein manipulation and also can be maintained under cell culture conditions and subsequently transfected would be included in this invention. Especially preferable cell types include stem cells, embryonic stem cells, Chinese hamster ovary cells (CHO), COS, BHK21, NIH3T3, HeLa, C2C12, HEK, MDCK, cancer cells, and primary differentiated or undifferentiated cells. The mammalian cells may include CHO cells, HeLa cells, baby hamster kidney (BHK) cells, COS cells, HEK 293 cells or other immortalized cell lines.

As used herein, the term "transfection" refers to the introduction of a foreign material like DNA into eukaryotic cells by any means of transfer. Different methods of transfection include, but are not limited to, calcium phosphate transfection, electroporation, lipofectamine transfection and DEAE-Dextran transfection. The term "transfected cell" refers to a eukaryotic cell into which the foreign DNA has been introduced. This DNA can be part of the host chromosome or replicate as an extra chromosomal element.

The term "co-transfection" refers to the method of simultaneously transfecting a eukaryotic cell with more than one exogenous gene foreign to the cell.

The term "transient gene expression" refers to a convenient method for the rapid production of small quantities of protein. COS cells are generally used for the characterization of transient expression.

The term "stable gene expression" refers to the preparation of stable cell lines that permanently express the gene of interest depending on the stable integration of plasmid into the host chromosome.

A novel eukaryotic expression vector has been constructed that comprises the Tumor Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein activity-encoding DNA, and drives the expression of TNFR:Fc activity when transfected into an appropriate cell line. The novel expression vector can be used to produce soluble TNFR:Fc. The recombinantly produced TNFR:Fc activity is useful in the treatment and prevention of varieties of disorders including rheumatoid arthritis, polyarticular-course juvenile rheumatoid arthritis, ankylosing spondylitis, psoriatic arthritis, and psoriasis.

The present invention relates to a novel eukaryotic expression vector used for producing soluble TNFR:Fc in increased quantity.

Prokaryotic expression systems were part of the early repertoire of research tools in molecular biology. The de novo synthesis of recombinant eukaryotic proteins in a prokaryotic system imposed a number of problems on the eukaryotic gene product. The two most critical among them were improper protein folding and assembly, and the lack of post-translational modification, principally glycosylation and phosphorylation. Prokaryotic systems do not possess all the appropriate protein synthesizing machinery to produce a structural and/or catalytically functional eukaryotic protein. Therefore, mammalian expression systems are generally preferred for manufacturing therapeutic proteins, for the simple reason that the post-translational modifications required will be addressed by the cellular system(s) in situ. A variety of mammalian expression systems are now available for either the transient or stable expression of recombinant genes. Generally, Chinese hamster ovary (CHO) cell stable expression systems (CHO SES) are used. Moreover, baby hamster kidney (BHK) cells, human embryonic kidney (HEK) 293 cells, mouse L-cells, and myeloma cell lines like J558L and Sp2/0, etc., are also employed as hosts for the establishment of stable transfectants.

However, the integration of foreign DNA into the genome of a host cell is an unpredictable process. It has been well documented that transgene expression is highly variable among cell lines, and that integration may cause unexpected changes in the phenotype. The reasons underlying the large variability in clonal expression levels include differing plasmid copy numbers and a phenomenon known as position effect, which was initially described in Drosophila melanogaster as position-effect variegation. The position of integration can influence transgene expression through at least three mechanisms: the activity of local regulatory elements, the local chromatin structure and the local state of DNA methylation. Two common approaches can be used to protect DNA from negative position effects or integration-dependent repression. One approach involves the direct integration of the transgene into a predetermined site that is transcriptionally active using site-specific recombination methods. Another method is to incorporate DNA sequence elements found in chromatin border regions into the expression vector, such that regardless of the integration site, the gene will be protected from surrounding chromatin influences. For recombinant protein expression, sequences that behave as chromatin borders and protect transfected genes from surrounding chromatin influences include insulator sequences and scaffold/matrix-attachment regions (S/MARs).

S/MARs are DNA sequences that bind isolated nuclear scaffolds or nuclear matrices in vitro with high affinity. Expression studies suggest that flanking a transgene with an insulator could reduce the position effect, thus suppressing clonal expression variability. S/MARs are relatively short (100-1000 bp long) sequences that anchor the chromatin loops to the nuclear matrix. S/MARs have been observed to flank the ends of domains encompassing various transcriptional units. It has also been shown that S/MARs bring together the transcriptionally active regions of chromatin such that transcription is initiated in the region of the chromosome that coincides with the surface of the nuclear matrix.

As such, they may define boundaries of independent chromatin domains, such that only the encompassing cis-regulatory elements control the expression of the genes within the domain. A number of possible functions have been discussed earlier for S/MARs, which include forming boundaries of chromatin domains, changing of chromatin conformations, participating in initiation of DNA replication and organizing the chromatin structure of a chromosome. S/MARs are common in centromere-associated DNA and telomeric arrays, and appear to be important in mitotic chromosome assembly and maintenance of chromosome shape during metaphase. Thus, S/MARs are involved in multiple independent processes during different stages of the cell cycle.

Analyses of experimentally identified S/MARs have revealed a typical element to be as short as 300 base pairs (bp) and up to several kilobases (kb) long. These S/MARs may contain several sequence motifs, including AT-rich nucleotide motifs (> 70% A-T). Most MARs appear to contain a MAR-specific sequence called the "MAR recognition signature", which is a bipartite sequence that consists of two individual sequences, AATAAYAA and AWWRTAANNWWGNNNC, within 200 bp of each other. Other sequences proposed to be indicative of MAR sequences are the DNA-unwinding motif (AATATATTAATATT), replication initiator protein sites (ATTA and ATTTA), homo-oligonucleotide repeats (e.g., the A-box AATAAAYAAA and the T-box TTWTWTTWTT), DNase I-hypersensitive sites, potential nucleosome-free stretches, polypurine-polypyrimidine tracts, and sequences that may adopt non-B-DNA or triple-helical conformations under conditions of negative supercoiling.
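Motifs such as AATAAYAA, the A-box and the T-box are written with IUPAC ambiguity codes (Y = C or T, W = A or T, and so on). A minimal sketch of how such motifs might be located in a candidate sequence, together with its A-T fraction, is given below. The function names, the choice to search a single strand only, and the demonstration sequence are assumptions made for illustration; this is not a validated S/MAR prediction method.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as AATAAYAA into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start positions of `motif` on the given strand.

    A zero-width lookahead is used so that overlapping matches are reported.
    """
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def at_fraction(sequence: str) -> float:
    """Fraction of A and T bases in the sequence (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


if __name__ == "__main__":
    # Hypothetical test sequence, not one of SEQ ID NO: 1 to 6.
    demo = "GGAATAACAAGGTTATATTATTCC"
    print(find_motif(demo, "AATAAYAA"))    # first half of the recognition signature
    print(find_motif(demo, "TTWTWTTWTT"))  # T-box
    print(at_fraction(demo) > 0.70)        # A-T richness check
```

A fuller scan would also search the reverse complement strand and check that the two halves of the recognition signature fall within 200 bp of each other.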

It was speculated that S/MARs could form genetic boundaries between chromosomal domains that independently organize into structures permissive or non-permissive for gene expression, referred to as euchromatin and heterochromatin domains, respectively. A transgene flanked by S/MAR elements may therefore constitute an autonomous chromatin domain whose expression would remain independent of the adjacent chromosomal environment. Recently, S/MARs have been shown to increase the expression of adjacent transgenes when co-inserted into a chromosomal environment. Alternatively, S/MARs may actively reconfigure chromatin around their chromosomal integration site and thereby prevent transgene silencing, for instance by mediating histone modifications or changes in sub-nuclear localization.

A well characterized 3-kb MAR element bordering the 5'-end of the chicken lysozyme (cLys) locus, the Drosophila Scs boundary element, hspSAP MAR, the mouse T-cell receptor TCRa, and the rat LAP locus control region, have all been reported to mediate permissive chromatin structures in a variety of systems. These elements were introduced in expression vectors and the resulting transgene expression was assayed by stable transfection of CHO cells. Most of these elements showed modest effects on the expression levels of the transgene in CHO cell pools. In contrast, cLysMAR was six-fold more effective than the second best of these elements. Moreover, a further four-fold increase in expression level was seen when two cLysMAR elements flanked the expression cassette.

Multiple copies of cLysMAR are large (6 kb), which restricts the general use of such chromatin elements given the size constraints of expression vectors. One approach to improve MAR versatility consists of co-transfecting the transgene expression cassette with various amounts of cLysMAR in trans on a separate plasmid. This was shown to enhance expression to levels higher than those obtained with plasmids bearing just one cLysMAR element. Thus, MAR-bearing plasmids can simply be added to current expression vectors, in co-transfections, to significantly increase expression levels. In another approach, short functional elements of cLysMAR were defined by deletion mutagenesis. These portions of MAR, when multimerized, were found to be as active as the full-length element, although of much smaller size (P.-A. Girod and N. Mermod, unpublished data).

Multimeric protein production faces one major problem: for efficient production, all subunits need to be coordinately synthesized at stoichiometric levels. Therefore, identical expression signals, such as promoters and 3'-regions, were used in the different expression cassettes. Two linearized plasmids encoding the heavy- and light-chain expression cassettes and harbouring adjacent chromatin elements of various kinds were transfected simultaneously into CHO cells, in the presence or absence of MAR elements. Once again, addition of cLysMAR elements increased the average and maximal productivities of isolated CHO cell clones by 5- to 10-fold.

One of the optimal settings consisted of adding the MAR elements both in cis, on each of the heavy and light chain expression vectors, and in trans, by further adding to the transfection mix an additional MAR-containing plasmid. The co-transfection of cLysMAR in cis and trans increased the mean expression level in a pool of stably co-transfected cells. Moreover, it reduced the expression variability in different clones, thus allowing the isolation of clones exhibiting high secretion levels at a higher frequency.

The chicken lysozyme 5' MAR was identified as one of the most active sequences in a study that compared the effect of various chromatin structure regulatory elements on transgene expression. It was also shown to increase the levels of regulated or constitutive transgene expression in various mammalian cell lines. Inclusion of the cLysMAR sequence increased overall expression of transgenes when transfected into a CHO cell line. As previously mentioned, mammalian expression systems are generally preferred for manufacturing therapeutic proteins, as such proteins require post-translational modifications. A variety of mammalian cell expression systems are now available for expression of proteins. However, the level of expression of the recombinant protein achieved from these expression vectors/systems in mammalian cells is not commercially viable.

ENBREL is a dimeric fusion protein consisting of the extracellular ligand-binding portion of the human 75-kilodalton (p75) tumor necrosis factor receptor (TNFR) linked to the Fc portion of human IgG1. The Fc component of ENBREL contains the CH2 domain, the CH3 domain and the hinge region, but not the CH1 domain of IgG1. ENBREL is produced by recombinant DNA technology in a Chinese hamster ovary (CHO) mammalian cell expression system. It consists of 934 amino acids and has an apparent molecular weight of approximately 150 kilodaltons.

ENBREL is an important scientific advance that has been shown to reduce the signs and symptoms of rheumatoid arthritis, polyarticular-course juvenile rheumatoid arthritis, ankylosing spondylitis, psoriatic arthritis, and psoriasis.

TNF is a naturally occurring cytokine that is involved in normal inflammatory and immune responses. It plays an important role in the inflammatory processes of rheumatoid arthritis (RA) and polyarticular-course juvenile rheumatoid arthritis (JRA), and in the resulting joint pathology. Elevated levels of TNF are found in the synovial fluid of RA patients. Two distinct receptors for TNF (TNFRs), a 55-kilodalton protein (p55) and a 75-kilodalton protein (p75), exist naturally as monomeric molecules on cell surfaces and in soluble forms.

Biological activity of TNF is dependent upon binding to either cell surface TNF receptor. A protein like ENBREL would inhibit the action of TNF by competitive inhibition, preventing the binding of TNF to its cellular receptors and thus rendering TNF biologically inactive.

The fusion of the extracellular domain of the TNFR to the Fc part of human IgG1 would allow dimerization of the molecule, leading to optimal pharmacokinetics of the protein by increasing the serum residence time. As both domains of the fusion protein are of human origin, the chimeric protein would ideally not induce an immune response, thus making it a suitable molecule for human use.

The present invention relates to a novel expression vector using the above-mentioned S/MAR to produce Enbrel in larger quantity. Upon isolation from culture media, products of expression of the DNA sequence display the biological activities of TNFR.Fc. Vector development, cloning and sub-cloning, transfection, fermentation and purification strategies are disclosed.

METHODOLOGY

Cloning and Construction of Expression Vector

In the pCDNA3.1 vector, the gene of interest is regulated by the human cytomegalovirus (CMV) immediate-early promoter/enhancer, which permits efficient, high-level expression of the recombinant protein. The gene of interest, TNFR:Fc, was cloned into the pCDNA3.1 vector using a directional TOPO expression kit. The positive transformants were initially verified by colony PCR and later by digestion with appropriate restriction enzymes. A double digest with BamHI and XhoI gave the expected pattern, and a further digest with ApaI and SacII also gave the expected pattern. The inserts were later sequence verified.
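The expected restriction pattern referred to above can be predicted in silico before the digest is run. The following is a minimal sketch assuming a circular plasmid and the standard recognition sites of the four enzymes named in the text (BamHI G^GATCC, XhoI C^TCGAG, ApaI GGGCC^C, SacII CCGC^GG). All four sites are palindromic, so scanning one strand is sufficient. The function names are hypothetical, and the actual construct sequence is not reproduced here.

```python
# Recognition site and top-strand cut offset (index of the first base after
# the cut, counted from the start of the site) for each enzyme.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "XhoI": ("CTCGAG", 1),   # C^TCGAG
    "ApaI": ("GGGCCC", 5),   # GGGCC^C
    "SacII": ("CCGCGG", 4),  # CCGC^GG
}


def cut_positions(seq, enzyme_names, circular=True):
    """Return sorted 0-based cut positions for the chosen enzymes."""
    seq = seq.upper()
    n = len(seq)
    cuts = set()
    for name in enzyme_names:
        site, offset = ENZYMES[name]
        # On a circular molecule, extend the search so that sites spanning
        # the arbitrary origin of the sequence are also found.
        search = seq + seq[:len(site) - 1] if circular else seq
        start = search.find(site)
        while start != -1:
            position = start + offset
            cuts.add(position % n if circular else position)
            start = search.find(site, start + 1)
    return sorted(cuts)


def fragment_sizes(seq, enzyme_names, circular=True):
    """Return fragment lengths, largest first, as they would appear on a gel."""
    n = len(seq)
    cuts = cut_positions(seq, enzyme_names, circular)
    if not cuts:
        return [n]  # uncut molecule
    if circular:
        sizes = [b - a for a, b in zip(cuts, cuts[1:])]
        sizes.append(n - cuts[-1] + cuts[0])  # fragment spanning the origin
    else:
        bounds = [0] + cuts + [n]
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
    return sorted(sizes, reverse=True)
```

Calling `fragment_sizes(plasmid_sequence, ["BamHI", "XhoI"])` on the full construct sequence would give the band sizes to compare against the gel. The sketch ignores methylation sensitivity and star activity.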

Isolated scaffold/matrix-attachment regions (S/MARs), here referred to as S/MAR sequences, were initially amplified by PCR, cloned into the pGEM-T Easy vector, and confirmed by appropriate restriction enzyme analysis and sequencing. Subsequently, the cloned S/MARs were inserted upstream of the CMV promoter in the pCDNA3.1 vector using restriction sites. The insertion was confirmed by restriction analysis.

Expression of recombinant TNFR

Recombinant expression vectors, in this case pCDNA3.1, were used to express the gene encoding the TNFR:Fc fusion protein. Recombinant expression vectors are replicable DNA constructs in which the DNA encoding the protein of interest is linked to certain gene elements that drive its expression. Such a transcription unit generally comprises transcriptional promoters or enhancers, appropriate transcription and translation initiation and termination sites, a coding sequence that encodes the protein of interest, and a selection marker that helps to differentiate between transfected and non-transfected mammalian cell line clones.
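The composition of such a transcription unit can be captured as a simple annotated feature map and checked for completeness. The following is an illustrative sketch only: the feature kinds, the coordinates and the function names are hypothetical and do not describe the actual vector. Positions are treated as coordinates on a linearized map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    kind: str   # e.g. "smar", "promoter", "coding_sequence"
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


# Element kinds assumed here to be required for a usable transcription unit.
REQUIRED_KINDS = {"promoter", "coding_sequence", "terminator", "selection_marker"}


def missing_elements(features):
    """Return the required element kinds absent from the feature map."""
    return REQUIRED_KINDS - {f.kind for f in features}


def smar_is_upstream_of(features, promoter_name):
    """True if every S/MAR ends at or before the start of the named promoter."""
    smars = [f for f in features if f.kind == "smar"]
    promoters = [f for f in features if f.name == promoter_name]
    if not smars or not promoters:
        return False
    promoter = promoters[0]
    return all(s.end <= promoter.start for s in smars)


# Hypothetical layout with placeholder coordinates.
layout = [
    Feature("S/MAR", "smar", 0, 800),
    Feature("CMV promoter/enhancer", "promoter", 850, 1500),
    Feature("TNFR:Fc", "coding_sequence", 1550, 3000),
    Feature("polyadenylation signal", "terminator", 3050, 3300),
    Feature("neomycin resistance", "selection_marker", 3500, 4300),
]
```

With this layout, `missing_elements(layout)` reports nothing missing and `smar_is_upstream_of(layout, "CMV promoter/enhancer")` confirms the S/MAR placement described for the construct.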

The expression of recombinant proteins in mammalian cells is particularly preferred as part of this invention since such proteins are known to be correctly folded thereby resulting in a fully functional conformation. The cell line that will be used for recombinant gene expression is the CHO-K1 cell line, and will be a homogenous population of cells. The transfected colony of CHO-K1 containing the stably integrated transcriptional unit encoding for the recombinant protein will be a monoculture, i.e. the cells will be the progeny of a single ancestral transformant.

The host cells will be transfected with expression vectors containing the complete transcriptional unit. The expressed TNFR:Fc fusion protein will be secreted into the culture supernatant. Elevated levels of the expression product are achieved by selecting cell lines using a selection marker, such as a gene conferring antibiotic (neomycin) resistance.

Many variations of the present invention will suggest themselves to those skilled in the art in light of the above description. Such obvious variations are within the full intended scope of the appended claims.

SEQUENCE LISTING

<110> Avesthagen Limited

PATELL, VILLOO MORAWALA DR

<120> An Expression Vector and a method thereof for TNF Alpha <130> <160> 6 <170> PatentIn version 3.5

<210> 1

<211> 2960

<212> DNA

<213> pCDNA3.1

<400> 1 gatccgtaat acaattgtac caggttttgg tttattacat gtgactgacg gcttcctgtg 60 cgtgctcagg aaacggcagt tgggcactgc actgcccggt gatggtgcca cggtggctcc 120 tgccgccttc tttgatattc actctgttgt atttcatctc ttcttgccga tgaaaggata 180 taacagtctg tgaggaaata cttggtattt cttctgatca gcgtttttat aagtaatgtt 240 gaatattgga taaggctgtg tgtcctttgt cttgggagac aaagcccaca gcaggtggtg 300 gttggggtgg tggcagctca gtgacaggag aggttttttt gcctgttttt tttgttgttt 360 ttttttttta agtaaggtgt tcttttttct tagtaaaatt tctactggac tgtatgtttt 420 gacaggtcag aaacatttct tcaaaagaag aaccttttgg aaactgtaca gcccttttct 480 ttcattccct ttttgctttc tgtgccaatg cctttggttc tgatttgcat tatggaaaac 540 gttgatcgga acttgaggtt tttatttata gtgtggcttg aaagcttgga tagctgttgt 600 tacatgagat accttattaa gtttaggcca gcttgatgct ttattttttt ccctttgaag 660 tagtgagcgt tctctggttt ttttcctttg aaactggcga ggcttagatt tttctaatgg 720 gattttttac ctgatgatct agttgcatac ccaaatgctt gtaaatgttt tcctagttaa 780 catgttgata acttcggatt tacatgttgt atatacttgt catctgtgtt tctagtaaaa 840 atatatggca tttatagaaa tacgtaattc ctgatttcct ttttttttta tctctatgct 900 ctgtgtgtac aggtcaaaca gacttcactc ctatttttat ttatagaatt ttatatgcag 960 tctgtcgttg gttcttgtgt tgtaaggata cagccttaaa tttcctagag cgatgctcag 1020 taaggcgggt tgtcacatgg gttcaaatgt aaaacgggca cgtttggctg ctgccttccc 1080 gagatccagg acactaaact gcttctgcac tgaggtataa atcgcttcag atcccaggga 1140 agtgtagatc cacgtgcata ttcttaaaga agaatgaata ctttctaaaa tattttggca 1200 taggaagcaa gctgcatgga tttgtttggg acttaaatta ttttggtaac ggagtgcata 1260 ggttttaaac acagttgcag catgctaacg agtcacagca tttatgcaga agtgatgcct 1320 gttgcagctg tttacggcac tgccttgcag tgagcgattt gcagataggg gtggggtgct 1380 ttgtgtcgtg ttcccacacg ctgccacaca gccacctccc ggaacacatc tcacctgctg 1440 ggtacttttc aaaccatctt agcagtagta gatgagttac tatgaaacag agaagttcct 1500 cagttggata ttctcatggg atgtcttttt tcccatgttg ggcaaagtat gataaagcat 1560 ctctatttgt aaattatgca cttgttagtt cctgaatcct ttctatagca ccacttattg 1620 cagcaggtgt aggctctggt gtggcctgtg tctgtgcttc aatcttttaa gcttctttgg 1680 aaatacactg 
acttgattga agtctcttga agatagtaaa cagtacttac ctttgatccc 1740 aatgaaatcg agcatttcag ttgtaaaaga attccgccta ttcataccat gtaatgtaat 1800 tttacacccc cagtgctgac actttggaat atattcaagt aatagacttt ggcctcaccc 1860 tcttgtgtac tgtattttgt aatagaaaat attttaaact gtgcatatga ttattacatt 1920 atgaaagaga cattctgctg atcttcaaat gtaagaaaat gaggagtgcg tgtgctttta 1980 taaatacaag tgattgcaaa ttagtgcagg tgtccttaaa aaaaaaaaaa agtaatataa 2040 aaaggaccag gtgttttaca agtgaaatac attcctattt ggtaaacagt tacattttta 2100 tgaagattac cagcgctgct gactttctaa acataaggct gtattgtctt cctgtaccat 2160 tgcatttcct cattcccaat ttgcacaagg atgtctgggt aaactattca agaaatggct 2220 ttgaaataca gcatgggagc ttgtctgagt tggaatgcag agttgcactg caaaatgtca 2280 ggaaatggat gtctctcaga atgcccaact ccaaaggatt ttatatgtgt atatagtaag 2340 cagtttcctg attccagcag gccaaagagt ctgctgaatg ttgcgttgcc ggagacctgt 2400 atttctcaac aaggtaagat ggtatcctag caactgcgga ttttaataca ttttcagcag 2460 aagtacttag ttaatctcta cctttaggga tcgtttcatc atttttagat gttatacttg 2520 aaatactgca taacttttag ctttcatggg ttcctttttt tcagccttta ggagactgtt 2580 aagcaatttg ctgtccaact tttgtgttgg tcttaaactg caatagtagt ttaccttgta 2640 ttgaagaaat aaagaccatt tttatattaa aaaatacttt tgtctgtctt cattttgact 2700 tgtctgatat ccttgcagtg ctcattatgt cagttctgtc agatattcag acatcaaaac 2760 ttaacgtgag ctcagtggag ttacagctgc ggttttgatg ctgttattat ttctgaaact 2820 agaaatgatg ttgtcttcat ctgctcatca aacacttcat gcagagttta aggctagtga 2880 gaaatgcata catttattga tactttttta aagtcaactt tttatcagat ttttttttca 2940 tttggaaata tattgttttc 2960

<210> 2 <211> 400 <212> DNA <213> pCDNA3.1

<400> 2 aaaaatatat atagcattat atatatttat ataatttata catataatta tatatataat 60 atatataatt atatataatg tataatatat tatatgtact atatatttac atatattata 120 taaatttata tataatatat ttatataaat atattatata tattttatat ataaattata 180 tatattatat aaaataaatt atatataata tataatgtat tatatataat atatatataa 240 tgaattatat atattatata tataaaatga atcatatata atatatatat aaaatgaatc 300 atatatatta tatataaaat gaatcatata tactatatac ataaaatgaa tcacatatat 360 tatatacata aaatgagtca catatattat atataaaatg 400

<210> 3 <211> 800 <212> DNA <213> pCDNA3.1

<400> 3 gtatctataa tatataaaat atattatata tggtatacat attatatata aaatatatta 60 tatatgttat acctattata tataaaatat attatatatg gtatacctat tatatataaa 120 atatactata tatgatatac atattatata taaaatatat tatatatggt atatatatta 180 tatataaaat atattatata tggtatatat attatatata aaatatatat gttacatgaa 240 aaatatatat aaaatatata atatataaaa tatattgtat agtatatata aaatatatta 300 tatataatat ataaaatata ttgtacatta tatataaaat atattatata taatatataa 360 aatatattat gtatattata taaaatatat tatatataat atataaatat attgtataat 420 atataaatat attatatata atatatttat atataatata ttttatatat tttatataaa 480 tatattatat ataatatatt ttatatattt tatatatata taaaatatat atggttttaa 540 gacagagtct tgctctgtca cccaggctag agtgtatata tatttatata tataatatat 600 aaaatatata atatataata tataaaatat ataatatata atattatata ttgtatataa 660 aatatattta atatattata tattatatat aaaatatatt atataatata taatatataa 720 tatttatatt tatattaaat ttatataata taatatttat attatataat atataaaata 780 tataatatat aaatatattt 800

<210> 4 <211> 400 <212> DNA <213> pCDNA3.1

<400> 4 atacactata gtatacacta tatacactat agtatacact atatacacta tatatagtat 60 actatatata cactatataa tatatatagt ataatagtat aagatactat ataacatatt 120 atatattata ctatatataa tatataatat attatatatt tatattttta taatatataa 180 atattatatt ttaaaaatat atattataaa atatttatta tattttatat tatatataaa 240 ataattataa tatatatatt ataaaatata tattataaaa tatttaaaaa tatataatat 300 aaaaatatat attttataat atataattat attatattat ataatatata atataatata 360 taatatataa tagtataata tactataata tacatataat 400

<210> 5

<211> 800

<212> DNA

<213> pCDNA3.1

<400> 5 aaaggtgaaa ggaatatata ttatatatat ttcaaacata tataagaata tatttgaaat 60 ataatatgat ataaattaat atattaatat taatataaat caatatatta atatattaat 120 ataatacatt taaatacata ttatataata taaaatacat tattatatta tacattatat 180 gatataaaat acattattat attatacatt atatgatata aaatacatta ttatattata 240 cattatgtga tatacaatac attattatat tatacattat gtgatataca atacattata 300 ttatacatta tatgatatac aatacattat tatattatat attatatgat atacaataca 360 ttattatatt atatattata tgatatacaa tacattattt tattatatat tatatgatat 420 acaatacatt attatatatt atatgatata aaatacatta ttatattata tattatatga 480 tataaaatac attatatatt ctatattata tgatataaaa tacattatat attctatatt 540 atatgatata aaatacatta tatattctat attatatgat ataaaataca ttatatattc 600 tatattatat gatataaaat acattatata ttctatatta tatgatataa aatacattat 660 atattctata ttatatgata taaaatacat tatatattct atattatatg atataaaata 720 cattatatat tctatattat atgatataaa atacattata tattctatat tatatgatat 780 aaaatacatc atactctata 800

<210> 6 <211> 800 <212> DNA <213> pCDNA3.1

<400> 6 tagtatatgt aacataatat atataatgtt atatataata tatacatata tacatatatg 60 ttaatatatg tatatattat atataaaata tttataaata tattatatat atttaataca 120 taatatttaa tatattaaat atatatattt aataattaat atatgtgtaa atatacatat 180 taaataatat ataatattaa tatataatta taatttatat attacatatt atatctatta 240 ttacatatga taatatataa ttatatatta atataatatt aaatatatat gtataaatat 300 agaatatata ttatatatat catatattat atattttata tattatatat attttatata 360 atatatatta tattatatta tatataatat atattatata ttctatatat aatatatatt 420 atataatata taaatatata atatataata tatataatat ataaaatata taataaatat 480 atattatata taaaatatat aatatataat atatagtata tataaaatat ataataaata 540 gtatatagta tatataaaat atataataaa tagtatatag tatatataaa atatataata 600 aatagtatat agtatatata aaatatataa tatataatat atagtatata taaaatatat 660 aatatataat atatagtata tataaaatat atactatata atatatagta tatataaaat 720 atataatata taatatatag tatatataaa atatataata tatataaaat atttataaat 780 ataaaatata ttagaaatat 800
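The listing entries above are given in the flattened form in which blocks of ten bases are interleaved with running position counts. The following is a minimal sketch, with hypothetical function names, of how such an entry could be reduced to a contiguous sequence and its length compared with the value declared under <211>. It assumes that the sequence contains only the letters a, c, g, t and n.

```python
import re


def parse_listing_sequence(entry_text):
    """Return the bare sequence from a flattened sequence-listing entry."""
    # Keep only what follows the <400> tag and its sequence number, if present.
    match = re.search(r"<400>\s*\d+", entry_text)
    body = entry_text[match.end():] if match else entry_text
    # Runs of base letters are kept; position counts and whitespace are dropped.
    return "".join(re.findall(r"[acgtn]+", body.lower()))


def declared_length(entry_text):
    """Return the integer following the <211> tag, or None if it is absent."""
    match = re.search(r"<211>\s*(\d+)", entry_text)
    return int(match.group(1)) if match else None


def length_matches(entry_text):
    """True if the parsed sequence length equals the declared <211> length."""
    return len(parse_listing_sequence(entry_text)) == declared_length(entry_text)
```

A mismatch reported by `length_matches` would indicate that bases were lost or duplicated when the listing was transcribed.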

Claims

WE CLAIM:

1. An isolated nucleic acid comprising one or more sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, complements, variants, and functional fragments thereof and sequences being at least 70% homologous thereto.

2. The isolated nucleic acid of claim 1, wherein the one or more sequences comprise S/MAR sequences.

3. The isolated nucleic acid of claim 2, wherein the one or more S/MAR sequences increase expression of a biomolecule when said sequences are used in an expression system.

4. The isolated nucleic acid of claim 3, wherein the one or more S/MAR sequences increase expression of the biomolecule independently of the orientation of said sequences in the expression system.

5. The isolated nucleic acid of claim 3, wherein the biomolecule is a protein.

6. The isolated nucleic acid of claim 3, wherein the biomolecule is a fusion protein.

7. The isolated nucleic acid of claim 6, wherein the fusion protein is recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

8. The isolated nucleic acid of claim 2, wherein the one or more S/MAR sequences contain one or more nucleotide sequence motifs.

9. The isolated nucleic acid of claim 8, wherein the one or more nucleotide sequence motifs includes at least one AT-rich nucleotide motif.

10. A method for constructing an expression vector having increased expression efficiency, the method comprising inserting the isolated nucleic acid of claim 1 into an expression vector.

11. The method according to claim 10, wherein the expression vector is a mammalian expression vector.

12. A vector comprising the isolated nucleic acid of claim 1.

13. The vector of claim 12, wherein said vector is a bacterial plasmid, a bacteriophage vector, a yeast episomal vector, an artificial chromosomal vector, or a viral vector.

14. The vector of claim 12, wherein said vector is a mammalian expression vector.

15. A method for producing a recombinant host cell, the method comprising introducing the isolated nucleic acid of claim 1 or the expression vector of claim 12 into a host cell.

16. The method according to claim 15, wherein the isolated nucleic acid or the expression vector is introduced by way of transfection.

17. The method according to claim 16, wherein the isolated nucleic acid gets integrated with the genome of the recombinant host cell upon transfection.

18. A host cell produced according to the method of claim 15.

19. A host cell comprising the vector of claim 12.

20. The host cell according to claim 18 or 19, wherein said host cell is a eukaryotic cell.

21. The host cell according to claim 18 or 19, wherein said host cell is a mammalian cell.

22. An expression vector comprising a nucleic acid molecule that comprises (a) a sequence encoding a protein operably linked to one or more expression control elements and (b) one or more S/MAR sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants, and functional fragments thereof and sequences being at least 70% homologous thereto.

23. The expression vector of claim 22, wherein the protein is a fusion protein.

24. The expression vector of claim 22, wherein the protein is recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

25. A host cell comprising the expression vector of claim 22.

26. The host cell of claim 25, wherein said host cell is a eukaryotic cell.

27. The host cell of claim 25, wherein said host cell is a mammalian cell.

28. An expression vector comprising a nucleic acid molecule that comprises (a) a sequence encoding Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein operably linked to one or more expression control elements and (b) one or more S/MAR sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants, and functional fragments thereof and sequences being at least 70% homologous thereto.

29. A host cell comprising the expression vector of claim 28.

30. The host cell of claim 29, wherein said host cell is a eukaryotic cell.

31. The host cell of claim 29, wherein said host cell is a mammalian cell.

32. The expression vector of claim 28, wherein the one or more expression control elements comprise at least one of transcriptional promoter, transcriptional enhancer, transcriptional repressor, polyadenylation site, origin of replication site, translation initiation signal and translation termination signal.

33. The expression vector of claim 32, wherein the one or more S/MAR sequences are located upstream of the transcriptional promoter.

34. The expression vector of claim 32, wherein the one or more S/MAR sequences are located downstream of the transcriptional promoter.

35. The expression vector of claim 32, wherein the one or more S/MAR sequences are located downstream of the translation termination signal.

36. The vector of claim 32, wherein the one or more S/MAR sequences are located upstream of the transcriptional promoter and downstream of the translation termination signal.

37. The vector of claim 32, wherein the one or more S/MAR sequences are located downstream of the transcriptional promoter and of the translation termination signal.

38. The expression vector of claim 28, wherein the one or more S/MAR sequences are located at a distance of 0 to 10 kb from the sequence encoding Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

39. The expression vector of claim 28, wherein the one or more S/MAR sequences are located at a distance of 0 to 10 kb from the origin of replication site.

40. The expression vector of claim 28, wherein said expression vector is used for recombinant expression of Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

41. A method for producing a protein, the method comprising the steps of (a) transfecting a mammalian cell with an expression vector comprising (I) a sequence encoding the protein and (II) one or more S/MAR sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants and functional fragments thereof and sequences being at least 70% homologous thereto; (b) culturing the transfected mammalian cell under conditions suitable for expression of the protein; and (c) isolating the expressed protein.

42. The method according to claim 41, wherein the protein is a fusion protein.

43. The method according to claim 41, wherein the protein is recombinant soluble Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein.

44. A method for producing Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein, the method comprising the steps of (a) transfecting a mammalian cell with an expression vector comprising (I) a sequence encoding Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein and (II) one or more S/MAR sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants and functional fragments thereof and sequences being at least 70% homologous thereto; (b) culturing the transfected mammalian cell under conditions suitable for expression of the fusion protein; and (c) isolating the expressed fusion protein.

45. The method according to claim 44 further comprising the step of purifying said isolated fusion protein.

46. Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein produced according to the method of claim 44.

47. A method for producing Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein, the method comprising the steps of (a) transfecting a mammalian cell with an expression vector comprising a sequence encoding Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein; (b) co-transfecting the mammalian cell with a plasmid comprising one or more S/MAR sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, and SEQ ID NO: 6, complements, variants and functional fragments thereof and sequences being at least 70% homologous thereto; (c) culturing the transfected mammalian cell under conditions suitable for expression of the fusion protein; and (d) isolating the expressed fusion protein.

48. The method according to claim 47 further comprising the step of purifying said isolated fusion protein.

49. Tumour Necrosis Factor Alfa receptor (TNFR) - Human IgG Fc fusion protein produced according to the method of claim 47.

50. A factor which influences the activity of one or more S/MAR sequences, wherein the one or more S/MAR sequences comprise one or more sequences selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6, complements, variants, and functional fragments thereof and sequences being at least 70% homologous thereto.

51. The factor of claim 50, wherein said factor is at least one of a genetic factor or an epigenetic factor.