WO2024155890A2

WO2024155890A2 - DNA construct configurations for production of protein biologics

Info

Publication number: WO2024155890A2
Application number: PCT/US2024/012163
Authority: WO
Inventors: Jeremy Minshull; Varsha SITARAMAN
Original assignee: DNA Twopointo Inc
Current assignee: DNA Twopointo Inc
Priority date: 2023-01-19
Filing date: 2024-01-19
Publication date: 2024-07-25
Anticipated expiration: 2025-07-19
Also published as: WO2024155890A9; WO2024155890A3

Abstract

DNA construct configurations are provided for production of protein biologics.

Description

DNA CONSTRUCT CONFIGURATIONS FOR PRODUCTION OF PROTEIN BIOLOGICS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority from U.S. Provisional Patent Application No. 63/480,638, filed on January 19, 2023, U.S. Provisional Patent Application No. 63/486,889, filed on February 24, 2023, and U.S. Provisional Patent Application No. 63/515,646, filed on July 26, 2023, each of which is incorporated by reference herein in its entirety.

SEQUENCE LISTING

[0002] A Sequence Listing has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. The XML copy, created on January 19, 2024, is named SEQ ATUM-VCON-PCT.xml and is 542.6 kilobytes in size.

BACKGROUND

[0003] Protein biologics are often produced from mammalian host cells by integration of heterologous polynucleotides encoding the one or more polypeptide chains that comprise the protein biologic into the genome of the host cell. Example host cells may include human cells such as human embryonic kidney (HEK) cells and rodent cells such as NS0 cells and Chinese hamster ovary (CHO) cells. The efficiency with which the protein biologic can be produced depends on the identities and configurations of the regulatory sequence elements and open reading frame (ORF) sequence elements encoding the polypeptide chains within the polynucleotide. These elements determine the amount of each polypeptide chain that the host cell can produce, and for protein biologics comprising more than one polypeptide chain (multi-chain biologics), the relative production of the different polypeptide chains. The efficiency of integration, the size of the heterologous DNA sequence that can be integrated, the number of copies of the heterologous DNA sequence that are integrated into each genome, and the type of genomic loci where integration occurs can often be improved by placing the heterologous DNA into a transposon.

[0004] Transposons comprise two ends that are recognized by a transposase. The transposase acts on the transposon to cut it from one DNA molecule and integrate it into another. The DNA between the two transposon ends is transposed by the transposase along with the transposon ends. Heterologous DNA flanked by a pair of transposon ends, such that it is recognized and transposed by a corresponding transposase, is referred to herein as a transposon or a synthetic transposon. Introduction of a synthetic transposon and a corresponding transposase into the nucleus of a eukaryotic cell may result in transposition of one or more copies of the transposon into the genome of the cell. Because a transposase integrates the entire DNA molecule between the two transposon ends into the host cell genome without rearrangement of the elements within the transposon, transposons are generally advantageous for maintaining the integrity of individual ORFs, and for ensuring that the organization of genetic elements is retained during the genomic integration process.

[0005] The amount of a polypeptide chain that a host cell produces is usually positively correlated with the number of copies of the ORF encoding the polypeptide that are integrated into the host cell genome. The amount of a polypeptide chain that a host cell produces is also usually correlated with the regulatory elements that are operably linked to the ORF: for example, the stronger the promoter that is operably linked to an ORF, the more of the polypeptide encoded by the ORF the host cell will produce.

[0006] However, a significant degree of complexity emerges when a heterologous DNA molecule integrated into a host cell comprises more than one ORF, each with operably linked regulatory sequence elements. This is because regulatory elements interact with each other: for example, a first promoter operably linked to a first ORF may interfere with the activity of a second promoter operably linked to a second ORF on the same DNA molecule, either reducing or increasing the activity of the second promoter compared with its activity in the absence of the first promoter. These interactions are difficult to predict and poorly understood, so it is often necessary to experimentally determine these effects for different combinations of promoters.

[0007] One example of a heterologous DNA having multiple ORFs, each with operably linked regulatory sequence elements, arises in the production of monoclonal antibodies. Monoclonal antibodies are protein biologics that typically comprise a pair of heterodimers, each heterodimer comprising two polypeptide chains, a “light” chain often comprising a kappa or lambda constant region sequence and a “heavy” chain often comprising an IgG, IgA, or IgM constant region sequence. During synthesis and secretion of the monoclonal antibody from the host cell, the light chain performs a chaperone function for the CH1 domain of the heavy chain, thereby enabling the cell to fold and secrete the heavy chain. Thus, production of the monoclonal antibody usually requires the light chain to be synthesized in some excess over the heavy chain. Over-production of the heavy chain relative to the light chain generally leads to an accumulation of the heavy chain within the cell, with negative consequences for cell growth and monoclonal antibody productivity.

[0008] A specific need exists for a DNA construct encoding both chains of a monoclonal antibody, the DNA construct characterized in that when expressed in the host cell, the regulatory elements operably linked to each ORF result in higher production of the light chain than the heavy chain.

[0009] A related example of a heterologous DNA having multiple ORFs, each with operably linked regulatory sequence elements, arises in the production of multi-specific antibodies. Multispecific antibody molecules often comprise three or four different antibody-like chains which have binding sites for at least two different epitopes. Bispecific antibodies can be produced by expressing both required heavy chains and light chains in a single cell. However, mispairing between chains may result in up to ten different antibody-like compounds being made by such a cell (see Schaefer et al., Proc Natl Acad Sci USA 108:11187-92, 2011), so that purification of the desired bispecific antibody may present a challenge. Mispairing of heavy chains with each other can be reduced by inserting an amino acid “knob” into the CH3 domain of one of the two heavy chains and a corresponding “hole” into the CH3 domain of the other so that the different heavy chains can more readily form heterodimers than homodimers, thus reducing formation of a non-bispecific antibody in which both heavy chains are the same (Ridgway et al., Protein Eng 9:617-21, 1996; Atwell et al., J Mol Biol 270:26-35, 1997; and US Patent No. 7,695,936). However, such mutations still leave four potential pairings of the two light chains with the two heavy chains, of which only one combination is correct. These individual chains need to be produced at ratios that favor assembly of the correct final molecule and ease of downstream purification. It may be difficult to predict the appropriate regulatory elements to operably link to each ORF encoding each chain a priori, since over-production of some chains may be needed to help other chains fold, mRNA for some chains may translate less well than mRNA for other chains, and protein produced from some chains may be less stable than protein produced from other chains. It is therefore often advantageous to determine empirically the expression ratios of different chains that produce the highest amount of correctly assembled molecule. Depending on the purification methods used to separate the correctly assembled molecule from incorrectly assembled/partially assembled molecules, it may be preferable to choose a chain expression ratio that does not lead to the highest amount of correctly assembled molecule but instead minimizes the amount produced of a difficult-to-remove impurity.

[0010] A specific need exists for one or more DNA constructs, each encoding both chains of a half-antibody, which may be combined to form a multi-specific antibody, the DNA construct(s) characterized in that when expressed in the host cell, the regulatory elements operably linked to each ORF result in a pre-determined chain expression ratio.
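The chain-pairing counts cited in paragraph [0009] (up to ten antibody-like products without heavy-chain engineering, and four remaining pairings once heavy-chain homodimers are excluded) can be reproduced by simple enumeration. The following Python sketch is illustrative only: the chain labels H1, H2, L1, and L2 are hypothetical, each molecule is modelled as an unordered pair of heavy/light arms, and the knob-into-hole case is idealized as permitting only the heavy-chain heterodimer.

```python
from itertools import combinations_with_replacement, product

HEAVY = ("H1", "H2")  # hypothetical labels for the two heavy chains
LIGHT = ("L1", "L2")  # hypothetical labels for the two light chains


def enumerate_assemblies(heavy_pairs):
    """Return the distinct four-chain molecules for the given heavy-chain pairs.

    Each molecule is an unordered pair of (heavy, light) arms.
    """
    molecules = set()
    for h_a, h_b in heavy_pairs:
        for l_a, l_b in product(LIGHT, repeat=2):
            # Sorting the two arms makes mirror-image molecules compare equal.
            molecules.add(tuple(sorted([(h_a, l_a), (h_b, l_b)])))
    return molecules


# No heavy-chain engineering: any two heavy chains may dimerize.
unrestricted = enumerate_assemblies(combinations_with_replacement(HEAVY, 2))

# Idealized knob-into-hole: only the H1/H2 heterodimer forms.
knob_hole = enumerate_assemblies([("H1", "H2")])

# Assumption for illustration: H1 pairs correctly with L1, and H2 with L2.
correct = (("H1", "L1"), ("H2", "L2"))

print(len(unrestricted))     # 10 distinct assemblies
print(len(knob_hole))        # 4 light-chain pairings remain
print(correct in knob_hole)  # True: exactly one of the four is the desired product
```

The enumeration counts possible species only; it says nothing about their relative abundance, which depends on the chain expression ratios discussed above.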

[0011] Finally, it is normal practice to screen many hundreds or thousands of clonal cell lines derived from a pool producing, e.g., a 4-chain heterodimer, in order to identify a clone that produces a high yield of product with a desirable level of heterodimer. Determination of the amount of multi-specific antibody and the amount of one or more incorrect assembly products produced by a cell pool or cell line is a process that can take a significant amount of time and labor.

[0012] A need exists for one or more DNA constructs and methods for making and using the DNA constructs to produce, e.g., a 4-chain heterodimer, that maintain the integrity of the DNA constructs and are extremely efficient, in order to minimize the number of cell pools or cell lines whose products must be assessed.

SUMMARY

[0013] In one aspect, a transposon is provided for the production in a mammalian cell of a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the transposon comprising: (A) a first transcriptional unit, the first transcriptional unit comprising: a first ORF and a second ORF operably linked by a first internal ribosome entry sequence (IRES), and wherein the first and second ORFs are operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in a mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to the first polypeptide and a second secretion signal fused to the second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and (B) a second transcriptional unit, the second transcriptional unit comprising: a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third and fourth ORFs are operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in a mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the 
fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides. In one aspect, the first and second polypeptides are respectively the first and second chains of a first half antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of a second half antibody. In one aspect, the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody. In one aspect, the first IRES and the second IRES are identical. In one aspect, the first IRES and the second IRES are different. In one aspect, at least one of the first IRES and the second IRES is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus. In one aspect, the first IRES and the second IRES are each independently selected from SEQ ID NOs: 248-281. In one aspect, one of the polypeptides encoded by the first transcriptional unit has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded by the second transcriptional unit. In one aspect, the transposon further comprises a nucleic acid sequence encoding a selectable marker. In one aspect, a multi-specific antibody produced by expression in a mammalian cell of the transposon is provided. In one aspect, an isolated mammalian cell comprising the transposon is provided. In one aspect, a monoclonal cell line prepared by isolating individual cells from a pool of cells whose genome comprises the transposon is provided.
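The single-transposon configuration of paragraph [0013] (two transcriptional units, each carrying two ORFs linked by an IRES under one set of regulatory elements) can be pictured as a small data model. This is a minimal sketch, not a representation used in the application: the class names, field names, placeholder element names, and the 5'-to-3' ordering shown by `layout()` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ORF:
    """An open reading frame encoding a secretion signal fused to a polypeptide."""
    secretion_signal: str
    polypeptide: str


@dataclass(frozen=True)
class TranscriptionalUnit:
    """Two ORFs linked by an IRES and sharing one set of regulatory elements."""
    promoter: str
    upstream_orf: ORF
    ires: str
    downstream_orf: ORF
    polyadenylation_signal: str
    enhancer: Optional[str] = None  # optional per the summary
    intron: Optional[str] = None    # optional per the summary

    def layout(self) -> str:
        # Assumed 5'-to-3' order; the summary lists the elements without fixing an order.
        parts = [self.enhancer, self.promoter, self.intron,
                 self.upstream_orf.polypeptide, self.ires,
                 self.downstream_orf.polypeptide, self.polyadenylation_signal]
        return " - ".join(p for p in parts if p is not None)


@dataclass(frozen=True)
class Transposon:
    """Two transcriptional units plus an optional selectable marker."""
    units: Tuple[TranscriptionalUnit, TranscriptionalUnit]
    selectable_marker: Optional[str] = None

    def encoded_polypeptides(self) -> List[str]:
        return [orf.polypeptide
                for unit in self.units
                for orf in (unit.upstream_orf, unit.downstream_orf)]


def make_unit(n: int, first: str, second: str) -> TranscriptionalUnit:
    # All element names are placeholders, not sequences from the application.
    return TranscriptionalUnit(
        promoter=f"promoter_{n}",
        upstream_orf=ORF("signal", first),
        ires=f"IRES_{n}",
        downstream_orf=ORF("signal", second),
        polyadenylation_signal=f"polyA_{n}",
    )


transposon = Transposon(units=(make_unit(1, "chain_1", "chain_2"),
                               make_unit(2, "chain_3", "chain_4")))
print(transposon.units[0].layout())
print(transposon.encoded_polypeptides())
```

Modelling the enhancer and intron as optional fields mirrors the "optionally" language of the summary; swapping which polypeptide sits upstream or downstream of the IRES is then a one-argument change when building each unit.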

[0014] In another aspect, a composition is provided for the production in a mammalian cell of a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the composition comprising: (A) a first transposon comprising a first ORF and a second ORF, wherein: (1) the first ORF is operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in the mammalian cell to express the first ORF, which encodes a first secretion signal fused to the first polypeptide, and wherein expression of the first ORF in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and removal of the first secretion signal to secrete the first polypeptide; (2) the second ORF is operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the second ORF, which encodes a second secretion signal fused to the second polypeptide, and wherein expression of the second ORF in the mammalian cell results in secretion of the second secretion signal fused to the second polypeptide and removal of the second secretion signal to secrete the second polypeptide; and (B) a second transposon comprising a third ORF and a fourth ORF, wherein: (1) the third ORF is operably linked to third regulatory elements including a third promoter, a third polyadenylation signal sequence, and optionally a third enhancer and a third intron, the third regulatory elements being active in the mammalian cell to express the third ORF, which encodes a third secretion signal fused to the third polypeptide, and wherein expression of the third ORF in the mammalian cell results in secretion of the third 
secretion signal fused to the third polypeptide and removal of the third secretion signal to secrete the third polypeptide; and (2) the fourth ORF is operably linked to fourth regulatory elements including a fourth promoter, a fourth polyadenylation signal sequence, and optionally a fourth enhancer and a fourth intron, the fourth regulatory elements being active in the mammalian cell to express the fourth ORF, which encodes a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the fourth ORF in the mammalian cell results in secretion of the fourth secretion signal fused to the fourth polypeptide and removal of the fourth secretion signal to secrete the fourth polypeptide. In one aspect, the first and second polypeptides are respectively the first and second chains of a first half antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of a second half antibody. In one aspect, the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody. In one aspect, one of the polypeptides encoded for by the first transposon has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the second transposon. In one aspect, the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by the same corresponding transposase. In one aspect, the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by different corresponding transposases. 
In one aspect, the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer a survival advantage under the same restrictive condition. In one aspect, the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer resistance to different selective conditions. In one aspect, a multi-specific antibody produced by expression in a mammalian cell of the first and second transposons is provided. In one aspect, an isolated mammalian cell comprising the first and second transposons is provided. In one aspect, a monoclonal cell line prepared by isolating single cells from a pool of cells whose genome comprises the first and second transposons is provided.

[0015] In another aspect, a composition is provided for the production in a mammalian cell of an antibody, the composition comprising a transposon, the transposon comprising a first transcriptional unit and a second transcriptional unit, the first transcriptional unit comprising: (A) a first ORF and a second ORF operably linked by an IRES, and wherein the first ORF and the second ORF are operably linked to first regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to a first polypeptide and a second secretion signal fused to a second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and (B) the second transcriptional unit comprising a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third ORF and the fourth ORF are operably linked to second regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides. 
In one aspect, the first and second polypeptides are respectively the first and second chains of the antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of the antibody. In one aspect, one of the polypeptides encoded for by the first or second ORF has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the third or fourth ORF. In one aspect, the ORFs that encode the same polypeptides do not have identical nucleic acid sequences, thereby reducing sequence repeat regions within the transposon and increasing its stability. In one aspect, introducing the transposon and a corresponding transposase into the mammalian cell results in a light chain:heavy chain production ratio of between about 1.5:1 and about 4:1. In one aspect, at least one of the first IRES and the second IRES is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus. In one aspect, each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281. In one aspect, the transposon further comprises a nucleic acid sequence encoding a selectable marker. In one aspect, an antibody produced by expression in a mammalian cell of the transposon is provided. In one aspect, an isolated mammalian cell comprising the transposon is provided. In one aspect, a monoclonal cell line prepared by isolating single cells from a pool of cells whose genome comprises the transposon is provided.
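The light chain:heavy chain production ratio mentioned in paragraph [0015] is a simple quotient that can be checked against the stated range. The sketch below is an illustration under assumptions: the function names are hypothetical, both chain amounts are taken to be measured in the same units, and the "about" qualifiers on the bounds are treated as exact limits of 1.5 and 4.0.

```python
def light_to_heavy_ratio(light_chain: float, heavy_chain: float) -> float:
    """Return the light:heavy production ratio for amounts in the same units."""
    if heavy_chain <= 0:
        raise ValueError("heavy-chain amount must be positive")
    return light_chain / heavy_chain


def within_window(ratio: float, low: float = 1.5, high: float = 4.0) -> bool:
    """Check a ratio against the range, treating the bounds as exact (an assumption)."""
    return low <= ratio <= high


ratio = light_to_heavy_ratio(light_chain=3.0, heavy_chain=1.5)  # illustrative amounts
print(ratio, within_window(ratio))
```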

[0016] In another aspect, a method is provided for constructing a mammalian cell line to produce a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the method comprising: (A) providing a first transposon comprising a first ORF and a second ORF, wherein: (1) the first ORF is operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in the mammalian cell to express the first ORF, which encodes a first secretion signal fused to the first polypeptide, and wherein expression of the first ORF in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and removal of the first secretion signal to secrete the first polypeptide; and (2) the second ORF is operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the second ORF, which encodes a second secretion signal fused to the second polypeptide, and wherein expression of the second ORF in the mammalian cell results in secretion of the second secretion signal fused to the second polypeptide and removal of the second secretion signal to secrete the second polypeptide; and (B) providing a second transposon comprising a third ORF and a fourth ORF, wherein: (1) the third ORF is operably linked to third regulatory elements including a third promoter, a third polyadenylation signal sequence, and optionally a third enhancer and a third intron, the third regulatory elements being active in the mammalian cell to express the third ORF, which encodes a third secretion signal fused to the third polypeptide, and wherein expression of the third ORF in the mammalian cell results in 
secretion of the third secretion signal fused to the third polypeptide and removal of the third secretion signal to secrete the third polypeptide; and (2) the fourth ORF is operably linked to fourth regulatory elements including a fourth promoter, a fourth polyadenylation signal sequence, and optionally a fourth enhancer and a fourth intron, the fourth regulatory elements being active in the mammalian cell to express the fourth ORF, which encodes a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the fourth ORF in the mammalian cell results in secretion of the fourth secretion signal fused to the fourth polypeptide and removal of the fourth secretion signal to secrete the fourth polypeptide; and (C) introducing the first and second transposons and their corresponding transposases into a mammalian cell, so that the first and second transposons are integrated into the genome of the mammalian cell. In one aspect, the first and second polypeptides are respectively the first and second chains of a first half antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of a second half antibody. In one aspect, the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody. In one aspect, one of the polypeptides encoded for by the first transposon has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the second transposon. In one aspect, the first and second transposons are introduced into the mammalian cell in an about 1:1 ratio. In one aspect, different amounts of the first and second transposons are introduced into the mammalian cell. 
In one aspect, the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer a survival advantage under the same restrictive condition. In one aspect, the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer resistance to different selective conditions. In one aspect, the method further comprises selecting cells that have integrated the first and second transposons into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell. In one aspect, the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by the same corresponding transposase. In one aspect, the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by different corresponding transposases. In one aspect, the method further comprises growing cells that have integrated the first and second transposons into their genomes under conditions where the cells produce and secrete the multi-specific antibody. In one aspect, the method further comprises preparing a plurality of pools of the mammalian cells, wherein each pool of cells has a different ratio of the first and second transposons introduced. 
In one aspect, the method further comprises comparing the amount of the multi-specific antibody and the amount of one or more incorrect assembly products produced by each pool in the plurality of pools and identifying a preferred pool for production of the multi-specific antibody. In one aspect, the method further comprises preparing monoclonal cell lines from a pool of cells whose genome comprises the first and second transposons. In one aspect, the method further comprises determining the amount of multi-specific antibody and the amount of one or more incorrect assembly products produced by each monoclonal cell line. In one aspect, the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, and fewer than 30. In one aspect, the method further comprises producing the multi-specific antibody using the mammalian cell line.
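The pool-comparison step of paragraph [0016] (prepare pools with different ratios of the two transposons, measure the correctly assembled antibody and the incorrect assembly products in each, then pick a preferred pool) can be sketched as a small selection routine. Everything below is hypothetical: the class and function names, the purity-threshold criterion, and the numeric values, which are made-up illustrative inputs in arbitrary units and not results from the application.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PoolResult:
    transposon_ratio: str      # first:second transposon ratio introduced into the pool
    correct_product: float     # correctly assembled antibody, arbitrary units
    incorrect_products: float  # summed incorrect assembly products, arbitrary units

    @property
    def purity(self) -> float:
        total = self.correct_product + self.incorrect_products
        return self.correct_product / total if total > 0 else 0.0


def preferred_pool(pools: Iterable[PoolResult], min_purity: float = 0.0) -> PoolResult:
    """Pick the pool with the most correct product among those meeting a purity floor."""
    eligible = [p for p in pools if p.purity >= min_purity]
    if not eligible:
        raise ValueError("no pool meets the purity threshold")
    return max(eligible, key=lambda p: p.correct_product)


# Made-up illustrative measurements; NOT data from the application.
pools = [
    PoolResult("1:1", correct_product=100.0, incorrect_products=40.0),
    PoolResult("2:1", correct_product=90.0, incorrect_products=10.0),
    PoolResult("1:2", correct_product=60.0, incorrect_products=30.0),
]

print(preferred_pool(pools).transposon_ratio)                   # highest yield: 1:1
print(preferred_pool(pools, min_purity=0.85).transposon_ratio)  # purity floor: 2:1
```

The two calls illustrate the trade-off noted in the background: depending on how hard an impurity is to remove, the preferred pool may be the one with less total product but a smaller fraction of incorrect assemblies.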

[0017] In another aspect, a method is provided for constructing a mammalian cell line to produce a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the method comprising: (A) providing a transposon comprising a first transcriptional unit and a second transcriptional unit, wherein: (1) the first transcriptional unit comprises a first ORF and a second ORF operably linked by a first IRES, and wherein the first and second ORFs are operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to the first polypeptide and a second secretion signal fused to the second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and (2) the second transcriptional unit comprises a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third and fourth ORFs are operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal 
fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides; and (B) introducing the transposon and a corresponding transposase into the mammalian cell, so that the transposon is integrated into the genome of the mammalian cell. In one aspect, the first and second polypeptides are respectively the first and second chains of a first half antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of a second half antibody. In one aspect, the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody. In one aspect, at least one of the first IRES and the second IRES is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus. In one aspect, each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281. In one aspect, one of the polypeptides encoded by the first transcriptional unit has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded by the second transcriptional unit. In one aspect, the transposon further comprises a nucleic acid sequence that encodes for a selectable marker. In one aspect, the method further comprises selecting cells that have integrated the transposon into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell. In one aspect, the method further comprises growing cells that have integrated the transposon into their genomes under conditions where the cells produce and secrete the multi-specific antibody. 
In one aspect, the method further comprises providing a plurality of transposons, wherein different regulatory elements are operably linked to the first or second transcriptional units, or the relative positions of the ORFs are modified, and different relative expression levels of each polypeptide are thereby obtained. In one aspect, the method further comprises preparing a plurality of pools of mammalian cells, wherein each pool of cells has a different transposon from the plurality of transposons introduced. In one aspect, more than one transposon from the plurality of transposons is introduced. In one aspect, the method further comprises comparing the amount of the multi-specific antibody and the amount of one or more incorrect assembly products produced by each pool in the plurality of pools and identifying a preferred pool for production of the multi-specific antibody. In one aspect, the method further comprises preparing monoclonal cell lines from a pool of cells whose genome comprises the transposon. In one aspect, the method further comprises determining the amount of multi-specific antibody and the amount of one or more incorrect assembly products produced by each monoclonal cell line. In one aspect, the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, fewer than 30. In one aspect, the method further comprises producing the multi-specific antibody using the mammalian cell line.
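The pool comparison described above can be sketched in code. The following Python sketch is illustrative only: the `PoolResult` record, the `preferred_pool` function, the purity threshold, and the ranking rule (highest amount of correctly assembled antibody among pools that meet a minimum purity) are assumptions introduced here and are not recited in the specification.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PoolResult:
    """Measurements for one pool of cells (hypothetical record layout)."""
    name: str
    antibody_mg_per_l: float    # amount of the multi-specific antibody
    incorrect_mg_per_l: float   # amount of incorrect assembly products

    @property
    def purity(self) -> float:
        """Fraction of the measured material that is the desired antibody."""
        total = self.antibody_mg_per_l + self.incorrect_mg_per_l
        return self.antibody_mg_per_l / total if total > 0 else 0.0


def preferred_pool(pools: Iterable[PoolResult],
                   min_purity: float = 0.0) -> Optional[PoolResult]:
    """Pick the pool with the most desired antibody among those meeting min_purity.

    The ranking rule is an assumption made for illustration; other criteria
    could equally be applied to the same measurements.
    """
    candidates = [p for p in pools if p.purity >= min_purity]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.antibody_mg_per_l)
```

In this sketch, raising `min_purity` trades total yield for a lower burden of incorrect assembly products; values passed to it would come from measurements such as those described for the Figures, not from the code itself.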

[0018] In another aspect, a method is provided for constructing a mammalian cell line to produce an antibody, the method comprising: (A) providing a transposon comprising a first transcriptional unit and a second transcriptional unit, wherein: (1) the first transcriptional unit comprises a first ORF and a second ORF operably linked by a first IRES, and wherein the first ORF and the second ORF are operably linked to first regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to a first polypeptide and a second secretion signal fused to a second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and (2) the second transcriptional unit comprises a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third ORF and the fourth ORF are operably linked to second regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides; and (B) introducing the transposon and 
a corresponding transposase into the mammalian cell, so that the transposon is integrated into the genome of the mammalian cell. In one aspect, the first and second polypeptides are respectively the first and second chains of the antibody. In one aspect, the third and fourth polypeptides are respectively the first and second chains of the antibody. In one aspect, one of the polypeptides encoded for by the first or second ORF has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the third or fourth ORF. In one aspect, the ORFs that encode the same polypeptides do not have identical nucleic acid sequences, thereby reducing sequence repeat regions within the transposon and increasing its stability. In one aspect, introducing the transposon and a corresponding transposase into the mammalian cell results in a light chain:heavy chain production ratio of between about 1.5:1 and about 4:1. In one aspect, at least one of the first IRES and the second IRES is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus. In one aspect, each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281. In one aspect, the transposon further comprises a nucleic acid sequence that encodes for a selectable marker. In one aspect, the method further comprises selecting cells that have integrated the transposon into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell. In one aspect, the method further comprises growing cells that have integrated the transposon into their genomes under conditions where the cells produce and secrete the antibody. 
In one aspect, the method further comprises preparing monoclonal cell lines from a pool of cells whose genome comprises the transposon. In one aspect, the method further comprises determining the amount of antibody produced by each monoclonal cell line. In one aspect, the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, and fewer than 30. In one aspect, the method further comprises producing the antibody using the mammalian cell line.

BRIEF DESCRIPTION OF THE FIGURES

[0019] The accompanying figures, which are incorporated in and constitute a part of the specification, are used merely to illustrate various example embodiments.

[0020] Figure 1 shows a schematic configuration of a DNA construct (100) for production of a 2-chain antibody or antibody-related molecule, including over-production of the light chain relative to the heavy chain, using first and second transcriptional units, each comprising a CMV enhancer, a promoter, an ORF, and a polyA tail, but not including an IRES.

[0021] Figure 2 shows a schematic configuration of a DNA construct (200) for production of a 2-chain antibody or antibody-related molecule, including over-production of the light chain relative to the heavy chain, using a single transcriptional unit comprising two ORFs and an IRES to link the ORFs.

[0022] Figure 3 shows a schematic configuration of a DNA construct (300) for production of a 2-chain antibody or antibody-related molecule, including over-production of the light chain relative to the heavy chain, using first and second transcriptional units, each transcriptional unit comprising two ORFs (one each for the light and heavy chain) and an IRES to link the ORFs, wherein the DNA construct comprises a transposon.

[0023] Figure 4 shows assembly products possible from a first and second light chain (L1 and L2 respectively) and a first and second heavy chain (H1 and H2 respectively). Desired half-antibodies (each, an “antibody-related molecule”) have L1 paired with H1 and L2 paired with H2. Pairing of these half-antibodies with each other results in the full heterodimeric antibody. Self-pairing of either of these half-antibodies results in homodimer 1 or homodimer 2. Mispairing of light and heavy chains (L1 with H2 or L2 with H1) produces mispaired half antibodies, which may assemble into aberrant products (not shown).
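The combinatorics described for Figure 4 can be enumerated directly. The Python sketch below is illustrative only; the function names are hypothetical, and the assignment of the label “homodimer 1” to the L1/H1 half-antibody and “homodimer 2” to the L2/H2 half-antibody is an assumption, since the text does not state which is which.

```python
from itertools import combinations_with_replacement, product

LIGHT_CHAINS = ("L1", "L2")
HEAVY_CHAINS = ("H1", "H2")
# Desired pairings as described for Figure 4: L1 with H1 and L2 with H2.
CORRECT_PAIRS = {("L1", "H1"), ("L2", "H2")}


def half_antibodies():
    """Return every light/heavy pairing, flagged as correctly paired or not."""
    return [
        {"light": lc, "heavy": hc, "correct": (lc, hc) in CORRECT_PAIRS}
        for lc, hc in product(LIGHT_CHAINS, HEAVY_CHAINS)
    ]


def classify_dimer(half_a, half_b):
    """Name the product formed when two half-antibodies pair with each other."""
    if not (half_a["correct"] and half_b["correct"]):
        return "aberrant (contains a mispaired half antibody)"
    if half_a["heavy"] != half_b["heavy"]:
        return "heterodimer"
    # Assumed labelling: homodimer 1 is built from L1/H1, homodimer 2 from L2/H2.
    return "homodimer 1" if half_a["heavy"] == "H1" else "homodimer 2"


def assembly_products():
    """Enumerate unordered pairs of half-antibodies and classify each one."""
    halves = half_antibodies()
    results = []
    for a, b in combinations_with_replacement(halves, 2):
        label = f'{a["light"]}{a["heavy"]} + {b["light"]}{b["heavy"]}'
        results.append((label, classify_dimer(a, b)))
    return results
```

Under these assumptions, four possible half-antibodies give ten unordered pairings, of which only one is the desired heterodimer; this is the sense in which chain ratios and construct configuration matter for product purity.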

[0024] Figures 5A and 5B show a schematic configuration of a pair of transposons (500) for production of a 4-chain multi-specific antibody, each transposon comprising two transcriptional units, each transcriptional unit comprising an ORF, and each ORF is operably linked to regulatory elements such that the ORFs are expressible in the host cell, but not including an IRES.

[0025] Figure 6 shows a schematic configuration of a pair of transposons (600) for production of a 4-chain multi-specific antibody, each transposon comprising a single transcriptional unit, each transcriptional unit comprising two ORFs (one for each chain of one half-antibody) and an IRES to link the ORFs.

[0026] Figure 7 shows a schematic configuration of a transposon (700) for production of a 3- or 4-chain multi-specific antibody, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs.

[0027] Figures 8A and 8B show a schematic configuration of a pair of transposons (800) for production of a 4-chain multi-specific antibody, each transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs (one for each chain of one half-antibody) and an IRES to link the ORFs.

[0028] Figure 9 shows a table (Table 1) comparing the production of three different monoclonal antibodies from transposons constructed essentially as shown in Figure 1 (mAb) or Figure 3 (mAb dual IRES). The transposons were integrated into the genomes of CHO cells, and the resultant cell pools were used to produce the monoclonal antibody. Concentrations of each antibody were measured at different times of the culture: 7 days (column B), 10 days (column C), 12 days (column D), and 14 days (column E). Measurements are shown for antibody 1 in the mAb configuration (row 1) or the dual IRES configuration (row 2); antibody 2 in the mAb configuration (row 3) or the dual IRES configuration (row 4); and antibody 3 in the mAb configuration (row 5) or the dual IRES configuration (row 6).

[0029] Figure 10 shows half antibodies and heterodimeric and homodimeric full assembly products from the 4-chain multi-specific antibody described in Examples 2 and 3.

[0030] Figure 11 shows hydrophobic interaction chromatography (HIC) traces of protein A-purified material from CHO pools whose genomes comprise transposons constructed similarly to the constructs shown in Figures 5A and 5B. The trace from purified material derived from a pool comprising only transposon 1 encoding the “hole” half antibody is shown as “Ctrl transposon 1.” The trace from purified material derived from a pool comprising only transposon 2 encoding the “knob” half antibody is shown as “Ctrl transposon 2.” In the absence of the other half-antibody, no heterodimer formation is possible, so half antibodies and homodimers can be identified as knob or hole-related impurities. Traces A, B, and C refer to HIC traces from material derived from pools of cells whose genomes comprise the transposons as indicated in column I of Table 2 (Figure 12).

[0031] Figure 12 shows a table (Table 2) comparing the heterodimer production and contaminant production resulting from various transfection ratios of transposons constructed similarly to the constructs shown in Figures 5A and 5B. Transposons were prepared and transfected into CHO cells. The amount of the first transposon (transposon 1) transfected (in pg) is shown in column B. The amount of transposon 2 transfected (in pg) is shown in column C. HIC was used to quantify the relative amounts of heterodimer (column D), hole-related contaminants (column E), knob-related contaminants (column F), and unidentified contaminants (column G) from protein A-purified material from each pool. The total amount of material purifiable from 1 L of culture is shown in column H. The corresponding HIC traces are shown in Figure 11, with the traces labelled as shown in column I.

[0032] Figure 13 shows a table (Table 3) comparing the heterodimer production and contaminant production resulting from various transfection ratios of transposons constructed similarly to the constructs shown in Figure 7. Columns B - I identify the regulatory elements used with reference to Figure 7. HIC was used to quantify the relative amounts of heterodimer (column J), hole-related contaminants (column K), knob-related contaminants (column L), and unidentified contaminants (column M) from protein A-purified material from each pool. The total amount of material purifiable from 1 L of culture is shown in column N. The corresponding HIC traces are shown in Figure 14, with the traces labelled as shown in column A.

[0033] Figure 14 shows HIC traces of protein A-purified material from CHO pools whose genomes comprise transposons constructed similarly to the constructs shown in Figure 7. Traces 1-9 refer to HIC traces from material derived from pools of cells whose genomes comprise the transposons as indicated in column A of Figure 13, Table 3. The control traces are as described for Figure 11.

[0034] Figure 15 shows a table (Table 4) comparing the heterodimer production and contaminant production resulting from various transposons constructed similarly to the constructs shown in Figure 7. Columns B - E and G - J identify the regulatory elements used with reference to Figure 7. Column F indicates which heavy chain (HC) sequence was encoded in the first transcriptional unit (726 Second ORF in Figure 7). Column K indicates which HC sequence was encoded in the second transcriptional unit (746 Fourth ORF in Figure 7). Cation Ion Exchange (cIEX) was used to quantify the relative amounts of heterodimer (column N), HC1-related contaminants (column O), and HC2-related contaminants (column M) from protein A-purified material from each pool. The total amount of material purifiable from 1 L of culture is shown in column L. The corresponding cIEX traces are shown in Figure 16, with traces labelled as shown in column A.

[0035] Figure 16 shows cIEX traces of protein A-purified material from CHO pools whose genomes comprise transposons constructed similarly to the constructs shown in Figure 7. Traces P1-P10 refer to cIEX traces from material derived from pools of cells whose genomes comprise the transposons as indicated in column A of Figure 15, Table 4.

[0036] Figure 17 shows a table (Table 5) showing the productivity of 16 monoclonal cell lines derived from the pool of cells described in Example 4 and shown in Figure 15, Table 4, as P8. Column A lists the clone name, column B shows the duration of the fed batch, column C shows the purified yield (mg/L), and column D shows the % of the protein A-purified material that was heterodimer. The pool is shown as the original measurements (P8 original), as well as a second fed batch run at the same time as the clones (P8 repeat).

[0037] Figure 18 shows a table (Table 6) showing the productivity of 10 monoclonal cell lines derived from the pool of cells described in Example 3 and shown in Figure 13, Table 3, as Sample 6. Column A lists the clone name, column B shows the duration of the fed batch, column C shows the purified yield (mg/L), and column D shows the % of the protein A-purified material that was heterodimer. The pool is shown as the original measurements (pool original), as well as a second fed batch run at the same time as the clones (pool repeat).

[0038] Figure 19 shows a table (Table 7) showing the productivity of 10 monoclonal cell lines derived from the pool of cells described in Example 2 and shown in Figure 12, Table 2, as Sample 3. Column A lists the clone name, column B shows the duration of the fed batch, column C shows the purified yield (mg/L), and column D shows the % of the protein A-purified material that was heterodimer. The pool is shown as the fed batch run at the same time as the clones.

[0039] Figure 20 shows a table (Table 8) comparing the production of monoclonal antibodies from three different transposons constructed essentially as shown in Figure 1 (mAb8), or Figure 3 (2x LC-IRES-HC, indicating that the first ORF and the third ORF encode for the light chain of the antibody; or LC-IRES-HC/HC-IRES-LC, indicating that the first ORF and the fourth ORF encode for the light chain of the antibody). The transposons were integrated into the genomes of CHO cells, and the resultant cell pools were used to produce the monoclonal antibody. Concentrations of each antibody were measured at different times of the culture: 7 days, 10 days, 12 days, and 14 days. Measurements are shown for the antibody in the 2x LC-IRES-HC configuration (row 1), the mAb configuration (row 2), and the LC-IRES-HC/HC-IRES-LC configuration (row 3).

DETAILED DESCRIPTION

I. Definitions

[0040] Use of the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. For example, reference to “a polynucleotide” may include a plurality of polynucleotides.

[0041] Terms such as “connected,” “attached,” “linked,” and “conjugated” are used interchangeably herein and encompass direct as well as indirect connection, attachment, linkage, or conjugation unless the context clearly dictates otherwise. Where a range of values is recited, each intervening integer value, and each fraction thereof, between the recited upper and lower limits of that range is also specifically disclosed, along with each subrange between such values. The upper and lower limits of any range can independently be included in or excluded from the range, and each range where either, neither, or both limits are included is also encompassed. Where a value being discussed has inherent limits, for example where a component can be present at a concentration of from 0 to 100%, or where the pH of an aqueous solution can range from 1 to 14, those inherent limits are specifically disclosed. Where a value is explicitly recited, values that are “about” (that is, within ±10%) the same quantity or amount as the recited value are also within the scope. Where a combination is disclosed, each sub-combination of the elements of that combination is also specifically disclosed. Conversely, where different elements or groups of elements are individually disclosed, combinations thereof are also disclosed. Where any element is disclosed as having a plurality of alternatives, examples in which each alternative is excluded singly or in any combination with the other alternatives are also hereby disclosed; more than one element can have such exclusions, and all combinations of elements having such exclusions are hereby disclosed.

[0042] Unless defined otherwise herein, all technical and scientific terms have the same meaning as commonly understood by one of ordinary skill in the relevant art. Singleton, et al., Dictionary of Microbiology and Molecular Biology, 2nd Ed., John Wiley and Sons, New York (1994), and Hale & Marham, The Harper Collins Dictionary of Biology, Harper Perennial, NY, 1991, provide one of skill with a general dictionary of many of the terms used herein. Unless otherwise indicated, nucleic acids are written left to right in 5’ to 3’ orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively. The terms defined immediately below are more fully defined by reference to the specification as a whole.

[0043] The “configuration” of a polynucleotide means the functional sequence elements within the polynucleotide and the order and direction of those elements.
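As an illustration of this definition, a configuration can be modelled as an ordered sequence of named elements, each with a direction. The Python sketch below is a minimal, hypothetical representation; the class names, the element names used in the usage note, and the `>`/`<` notation are assumptions made for illustration and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Element:
    """One functional sequence element, e.g. a promoter, ORF, IRES, or polyA signal."""
    name: str
    kind: str
    forward: bool = True   # direction of the element within the polynucleotide


@dataclass(frozen=True)
class Configuration:
    """Ordered, directed list of elements; order and direction both matter."""
    elements: Tuple[Element, ...]

    def describe(self) -> str:
        """Render the configuration with '>' for forward and '<' for reverse elements."""
        return " ".join(
            f"{e.name}{'>' if e.forward else '<'}" for e in self.elements
        )

    def flipped(self) -> "Configuration":
        """Return the same polynucleotide as read from the opposite strand."""
        return Configuration(
            tuple(Element(e.name, e.kind, not e.forward)
                  for e in reversed(self.elements))
        )
```

For example, a hypothetical unit built from a promoter, a light chain ORF, an IRES, a heavy chain ORF, and a polyadenylation signal has a different configuration from one in which the two ORFs are exchanged, even though both contain the same elements.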

[0044] The terms “corresponding transposon” and “corresponding transposase” are used to indicate an activity relationship between a transposase and a transposon. A transposase transposes its corresponding transposon.

[0045] The term “coupling element” or “translational coupling element” means a DNA sequence that allows the expression of a first polypeptide to be linked to the expression of a second polypeptide. IRES elements and cis-acting hydrolase elements are examples of coupling elements.

[0046] The terms “DNA sequence,” “RNA sequence,” or “polynucleotide sequence” refer to a contiguous nucleic acid sequence. The sequence can range from an oligonucleotide of 2 to 20 nucleotides in length to a full-length genomic sequence of thousands or hundreds of thousands of base pairs.

[0047] The term “expression construct” means any polynucleotide designed to transcribe an RNA, such as, for example, a construct that contains at least one promoter that is or may be operably linked to a downstream gene, coding region, or polynucleotide sequence (for example, a cDNA or genomic DNA fragment that encodes a polypeptide or protein, or an RNA effector molecule, for example, an antisense RNA, triplex-forming RNA, ribozyme, an artificially selected high affinity RNA ligand (aptamer), a double-stranded RNA, for example, an RNA molecule comprising a stem-loop or hairpin dsRNA, or a bi-finger or multi-finger dsRNA or a microRNA, or any RNA). An “expression vector” is a polynucleotide comprising a promoter that can be operably linked to a second polynucleotide. Transfection or transformation of the expression construct into a recipient cell allows the cell to express an RNA effector molecule, polypeptide, or protein encoded by the expression construct. An expression construct may be a genetically engineered plasmid, virus, recombinant virus, or an artificial chromosome derived from, for example, a bacteriophage, adenovirus, adeno-associated virus, retrovirus, lentivirus, poxvirus, or herpesvirus. Such expression vectors can include sequences from bacteria, viruses, or phages. Such vectors include chromosomal, episomal, and virus-derived vectors, for example, vectors derived from bacterial plasmids, bacteriophages, yeast episomes, yeast chromosomal elements, and viruses, vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, cosmids, and phagemids. An expression construct can be replicated in a living cell, or it can be made synthetically. 
The terms “expression construct,” “expression vector,” “vector,” and “plasmid” are used interchangeably herein to demonstrate the application of the invention in a general, illustrative sense, and are not intended to limit the invention to a particular type of expression construct.

[0048] The term “expression polypeptide” means a polypeptide encoded by a gene on an expression construct.

[0049] The term “expression system” means any in vivo or in vitro biological system that is used to produce one or more gene products encoded by a polynucleotide.

[0050] A “gene transfer system” refers to a vector or gene transfer vector, i.e., a polynucleotide comprising the gene to be transferred which is cloned into a vector (a “gene transfer polynucleotide” or “gene transfer construct”). A gene transfer system may also comprise other features to facilitate the process of gene transfer. For example, a gene transfer system may comprise a vector and a lipid or viral packaging mix for enabling a first polynucleotide to enter a cell, or it may comprise a polynucleotide that includes a transposon and a second polynucleotide sequence encoding a corresponding transposase to enhance productive genomic integration of the transposon. The transposases and transposons of a gene transfer system may be on the same nucleic acid molecule or on different nucleic acid molecules. The transposase of a gene transfer system may be provided as a polynucleotide or as a polypeptide.

[0051] Two elements are “heterologous” to one another if not naturally associated. For example, a nucleic acid sequence encoding a protein linked to a heterologous promoter means a promoter other than that which naturally drives expression of the protein. A heterologous nucleic acid flanked by transposon ends or inverted terminal repeats (“ITR”s) means a heterologous nucleic acid not naturally flanked by those transposon ends or ITRs, such as a nucleic acid encoding a polypeptide other than a transposase, including an antibody heavy or light chain. A nucleic acid is heterologous to a cell if not naturally found in the cell or if naturally found in the cell but in a different location (e.g., episomal or different genomic location) than the location described.

[0052] The term “host” means any prokaryotic or eukaryotic organism that can be a recipient of a nucleic acid. A “host” includes prokaryotic or eukaryotic organisms that can be genetically engineered. For examples of such hosts, see Maniatis et al., Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1982). As used herein, the terms “host,” “host cell,” “host system,” and “expression host” can be used interchangeably.

[0053] An “intron” is a segment of a DNA or RNA molecule that does not code for proteins and interrupts the sequence of genes.

[0054] An “IRES” or “internal ribosome entry site” is an RNA element that allows for translation initiation in a cap-independent manner, as part of the greater process of protein synthesis.

[0055] An “isolated” polypeptide or polynucleotide means a polypeptide or polynucleotide that has been either removed from its natural environment, produced using recombinant techniques, or chemically or enzymatically synthesized. Polypeptides or polynucleotides may be purified, that is, essentially free from any other polypeptide or polynucleotide and associated cellular products or other impurities.

[0056] The terms “nucleoside” and “nucleotide” include those moieties that contain not only the known purine and pyrimidine bases, but also other heterocyclic bases that have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, or other heterocycles. Modified nucleosides or nucleotides can also include modifications on the sugar moiety, for example, where one or more of the hydroxyl groups are replaced with halogen, aliphatic groups, or are functionalized as ethers, amines, or the like. The term “nucleotidic unit” is intended to encompass nucleosides and nucleotides.

[0057] An “open reading frame” or “ORF” means a portion of a polynucleotide that, when translated into amino acids, contains no stop codons. The genetic code reads DNA sequences in groups of three base pairs, which means that a double-stranded DNA molecule can be read in any of six possible reading frames: three in the forward direction and three in the reverse. An ORF typically also includes an initiation codon at which translation may start.
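The six reading frames can be made concrete with a short scan. The Python sketch below is illustrative only and makes simplifying assumptions that are not part of the definition: it uses the standard genetic code stop codons (TAA, TAG, TGA), treats ATG as the only initiation codon, and reports only stretches that begin at an ATG and are closed by an in-frame stop codon. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code
START_CODON = "ATG"                   # assumed sole initiation codon
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, offset):
    """Yield (start, end) for ATG-initiated, stop-terminated runs in one frame.

    `end` is exclusive and points at the stop codon, so seq[start:end]
    contains no stop codon.
    """
    start = None
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == START_CODON:
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i
            start = None


def six_frame_orfs(dna):
    """Scan three forward and three reverse frames; return (strand, frame, orf)."""
    dna = dna.upper()
    found = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            for start, end in orfs_in_frame(seq, offset):
                found.append((strand, offset, seq[start:end]))
    return found
```

A stretch that starts at an ATG but runs to the end of the sequence without meeting a stop codon is ignored by this sketch, which is a design simplification rather than a consequence of the definition above.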

[0058] The term “operably linked” refers to functional linkage between two sequences such that one sequence modifies the behavior of the other. For example, a first polynucleotide comprising a nucleic acid expression control sequence (such as a promoter, IRES sequence, enhancer, or array of transcription factor binding sites) and a second polynucleotide are operably linked if the first polynucleotide affects transcription and/or translation of the second polynucleotide. Similarly, a first amino acid sequence comprising a secretion signal, i.e., a subcellular localization signal, and a second amino acid sequence are operably linked if the first amino acid sequence causes the second amino acid sequence to be secreted or localized to a subcellular location.

[0059] A “piggyBac-like transposase” means a transposase with at least 20% sequence identity as identified using the TBLASTN algorithm to the piggyBac transposase from Trichoplusia ni (SEQ ID NO: 79), and as more fully described in Sarkar, A. et al. (2003). Mol. Gen. Genomics 270: 173-180. “Molecular evolutionary analysis of the widespread piggyBac transposon family and related ‘domesticated’ species,” incorporated herein by reference in its entirety, and further characterized by a DDE-like DDD motif, with aspartate residues at positions corresponding to D268, D346, and D447 of Trichoplusia ni piggyBac transposase on maximal alignment. PiggyBac-like transposases are also characterized by their ability to excise their transposons precisely with a high frequency. A “piggyBac-like transposon” means a transposon having transposon ends that are the same or at least 80%, including at least 90, 95, 96, 97, 98 or 99% identical to the transposon ends of a naturally occurring transposon that encodes a piggyBac-like transposase. A piggyBac-like transposon includes an ITR sequence of approximately 12-16 bases at each end. These repeats may be identical at the two ends, or the repeats at the two ends may differ at 1 or 2 or 3 or 4 positions in the two ITRs. The transposon is flanked on each side by a 4 base sequence corresponding to the integration target sequence that is duplicated on transposon integration (the “Target Site Duplication” or “Target Sequence Duplication” or “TSD”).
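The structural features of an integrated piggyBac-like transposon described above (a 4 base TSD on each side and ITRs of approximately 12-16 bases that may differ at up to 4 positions) can be checked with a short script. The Python sketch below is illustrative only: the function name, the default ITR length of 13, and the comparison of the left ITR against the reverse complement of the right ITR (on the understanding that the terminal repeats are inverted) are assumptions made here, not a method taken from the specification.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def check_integrated_transposon(seq, itr_len=13, tsd_len=4,
                                max_itr_mismatches=4):
    """Inspect a sequence laid out as TSD + left ITR + cargo + right ITR + TSD.

    Returns a dict reporting whether the flanking TSDs are identical and how
    many positions differ between the two inverted terminal repeats.
    """
    seq = seq.upper()
    if len(seq) < 2 * (tsd_len + itr_len):
        raise ValueError("sequence too short for the requested ITR and TSD lengths")
    left_tsd = seq[:tsd_len]
    right_tsd = seq[-tsd_len:]
    left_itr = seq[tsd_len:tsd_len + itr_len]
    right_itr = seq[-(tsd_len + itr_len):-tsd_len]
    # Inverted repeats are compared after reverse-complementing one of them.
    mismatches = sum(
        a != b for a, b in zip(left_itr, reverse_complement(right_itr))
    )
    return {
        "tsd_duplicated": left_tsd == right_tsd,
        "itr_mismatches": mismatches,
        "itr_consistent": mismatches <= max_itr_mismatches,
    }
```

The `itr_len` argument would need to be set to the actual repeat length of the transposon being examined; the sketch does not attempt to discover the ITR boundaries itself.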

[0060] The terms “polynucleotide,” “oligonucleotide,” “nucleic acid,” “nucleic acid molecule,” and “gene” are used interchangeably to refer to a polymeric form of nucleotides of any length, and may comprise ribonucleotides, deoxyribonucleotides, analogs thereof, or mixtures thereof. These terms refer only to the primary structure of the molecule. Thus, the terms include triple-, double-, and single-stranded DNA, as well as triple-, double-, and single-stranded RNA. The terms also encompass modified, for example by alkylation and/or by capping, and unmodified forms of the polynucleotide. More particularly, the terms “polynucleotide,” “oligonucleotide,” “nucleic acid,” and “nucleic acid molecule” include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), including tRNA, rRNA, hRNA, siRNA, and mRNA, whether spliced or unspliced, any other type of polynucleotide that is an N- or C-glycoside of a purine or pyrimidine base, and other polymers containing nonnucleotidic backbones, for example, polyamide (for example, peptide nucleic acids (“PNAs”)) and polymorpholino (commercially available from the Anti-Virals, Inc., Corvallis, Oreg., as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers providing that the polymers contain nucleobases in a configuration that allows for base pairing and base stacking, such as is found in DNA and RNA. There is no intended distinction in length between the terms “polynucleotide,” “oligonucleotide,” “nucleic acid,” and “nucleic acid molecule,” and these terms are used interchangeably herein. These terms include, for example, 3’-deoxy-2’,5’-DNA, oligodeoxyribonucleotide N3’ P5’ phosphoramidates, 2’-O-alkyl-substituted RNA, double- and single-stranded DNA, as well as double- and single-stranded RNA, and hybrids thereof, including for example hybrids between DNA and RNA or between PNAs and DNA or RNA, and also include known types of modifications, for example, labels, alkylation, “caps,” substitution of one or more of the nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (for example, methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, or the like) with negatively charged linkages (for example, phosphorothioates, phosphorodithioates, or the like), and with positively charged linkages (for example, aminoalkylphosphoramidates, aminoalkylphosphotriesters), those containing pendant moieties, such as, for example, proteins (including enzymes (for example, nucleases), toxins, antibodies, signal peptides, poly-L-lysine, or the like), those with intercalators (for example, acridine, psoralen, or the like), those containing chelates (of, for example, metals, radioactive metals, boron, oxidative metals, or the like), those containing alkylators, those with modified linkages (for example, alpha anomeric nucleic acids, or the like), as well as unmodified forms of the polynucleotide or oligonucleotide.

[0061] A “promoter” means a nucleic acid sequence sufficient to direct transcription of an operably linked nucleic acid molecule. A promoter can be used together with other transcription control elements (for example, enhancers) that are sufficient to render promoter-dependent gene expression controllable in a cell type-specific, tissue-specific, or temporal-specific manner, or that are inducible by external signals or agents; such elements may be within the 3’ region of a gene or within an intron. In one aspect, the promoter may be operably linked to a nucleic acid sequence, for example, a cDNA, a gene sequence, or an effector RNA coding sequence, in such a way as to enable expression of the nucleic acid sequence, or a promoter is provided in an expression cassette into which a selected nucleic acid sequence to be transcribed can be conveniently inserted.

[0062] The term “selectable marker” means a polynucleotide segment that allows one to select for or against a molecule or a cell that contains it, often under particular conditions. These markers can encode an activity, such as, but not limited to, production of RNA, a peptide, or a protein, or these markers can provide a binding site for RNA, peptides, proteins, inorganic and organic compounds, or compositions. Examples of selectable markers include, but are not limited to: (1) DNA segments that encode products that provide resistance against otherwise toxic compounds (e.g., antibiotics); (2) DNA segments that encode products that are otherwise lacking in the recipient cell (e.g., tRNA genes, auxotrophic markers); (3) DNA segments that encode products that suppress the activity of a gene product; (4) DNA segments that encode products that can be readily identified (e.g., phenotypic markers such as beta-galactosidase, GFP, and cell surface proteins); (5) DNA segments that bind products that are otherwise detrimental to cell survival and/or function; (6) DNA segments that otherwise inhibit the activity of any of the DNA segments described in Nos. 1-5 above (e.g., antisense oligonucleotides); (7) DNA segments that bind products that modify a substrate (e.g. restriction endonucleases); (8) DNA segments that can be used to isolate a desired molecule (e.g. specific protein binding sites); (9) DNA segments that encode a specific nucleotide sequence that can be otherwise non-functional (e.g., for PCR amplification of subpopulations of molecules); and/or (10) DNA segments, which when absent, directly or indirectly confer sensitivity to particular compounds.

[0063] Sequence identity can be determined by aligning sequences using algorithms, such as BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0 (Genetics Computer Group, 575 Science Dr., Madison, Wis.), using default gap parameters, or by inspection, and selecting the best alignment (i.e., the one resulting in the highest percentage of sequence similarity over a comparison window). Percentage of sequence identity is calculated by comparing two optimally aligned sequences over a window of comparison, determining the number of positions at which the identical residues occur in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of matched and mismatched positions not counting gaps in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. Unless otherwise indicated, the window of comparison between two sequences is defined by the entire length of the shorter of the two sequences. Identity or homology with respect to such sequences is defined herein as the percentage of amino acid residues in the candidate sequence that are identical with the known peptides, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent homology, and not considering any conservative substitutions as part of the sequence identity. N-terminal, C-terminal, or internal extensions, deletions, or insertions into the peptide sequence shall not be construed as affecting homology.
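By way of illustration only, the percentage calculation described above may be sketched as follows. The sketch assumes the two sequences have already been optimally aligned by some other tool and are supplied as equal-length strings in which a hyphen marks a gap; the function name `percent_identity` and the gap symbol are assumptions made for this example and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Matched positions are divided by matched plus mismatched positions;
    columns containing a gap in either sequence are not counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matched = 0
    compared = 0  # matched + mismatched positions, i.e. the window size
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the window size
        compared += 1
        if a == b:
            matched += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matched / compared
```

For example, the aligned pair `MKT-AYL` and `MKTGAFL` has six ungapped columns, five of which match, so the function returns 5/6 of 100.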

[0064] A “target nucleic acid” is a nucleic acid into which a transposon is to be inserted. Such a target can be part of a chromosome, episome, or vector.

[0065] An “integration target sequence” or “target sequence” or “target site” for a transposase is a site or sequence in a target DNA molecule into which a transposon can be inserted by a transposase. The piggyBac transposase from Trichoplusia ni inserts its transposon predominantly into the target sequence 5’-TTAA-3’. PiggyBac-like transposases transpose their transposons using a cut-and-paste mechanism, which results in duplication of their 4 base pair target sequence on insertion into a DNA molecule. The target sequence is thus found on each side of an integrated piggyBac-like transposon.
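As a non-limiting illustration, the target-site duplication described above may be modeled on plain strings. The sketch below is a simplified model and not a description of the enzymatic mechanism; the function names `find_target_sites` and `integrate`, and the toy sequences used with them, are assumptions made for this example.

```python
from typing import List

TARGET = "TTAA"  # 4 base pair target sequence of piggyBac-like transposases


def find_target_sites(dna: str, target: str = TARGET) -> List[int]:
    """Return the 0-based start positions of every target sequence in dna."""
    dna = dna.upper()
    sites = []
    pos = dna.find(target)
    while pos != -1:
        sites.append(pos)
        pos = dna.find(target, pos + 1)
    return sites


def integrate(dna: str, site: int, transposon: str, target: str = TARGET) -> str:
    """Insert a transposon (ITR to ITR, without flanking target sequences).

    The single target sequence at `site` is replaced by
    target + transposon + target, so the target sequence ends up
    on each side of the integrated transposon.
    """
    if dna[site:site + len(target)].upper() != target:
        raise ValueError("no target sequence at the requested position")
    return dna[:site] + target + transposon + target + dna[site + len(target):]
```

With the toy target `GGTTAACC`, integrating a placeholder transposon `NNNN` at position 2 gives `GGTTAANNNNTTAACC`, in which the TTAA tetranucleotide flanks the insert on both sides.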

[0066] The term “translation” refers to the process by which a polypeptide is synthesized by a ribosome “reading” the sequence of a polynucleotide.

[0067] A “transposase” is a polypeptide that catalyzes the excision of a corresponding transposon from a donor polynucleotide, for example a vector, and (providing the transposase is not integration-deficient) the subsequent integration of the transposon into a target nucleic acid. A transposase may be a piggyBac-like transposase. Other non-limiting, suitable transposases are disclosed in U.S. Patent No. 10,041,077B2, which is incorporated herein by reference in its entirety.

[0068] The term “transposition” refers to the action of a transposase in excising a transposon from one polynucleotide and then integrating it, either into a different site in the same polynucleotide, or into a second polynucleotide.

[0069] The term “transposon” means a polynucleotide that can be excised from a first polynucleotide, for instance, a vector, and be integrated into a second position in the same polynucleotide, or into a second polynucleotide, for instance, the genomic or extrachromosomal DNA of a cell, by the action of a corresponding trans-acting transposase. A transposon comprises a first transposon end and a second transposon end, which are polynucleotide sequences recognized by and transposed by a transposase. A transposon usually further comprises a first polynucleotide sequence between the two transposon ends, such that the first polynucleotide sequence is transposed along with the two transposon ends by the action of the transposase. Natural transposons frequently comprise DNA encoding a transposase that acts on the transposon. Transposons as claimed herein are “synthetic transposons,” comprising a heterologous polynucleotide sequence that is transposable by virtue of its juxtaposition between two transposon ends. A suitable transposon is a piggyBac-like transposon. Other non-limiting, suitable transposons are disclosed in U.S. Patent No. 10,041,077B2.

[0070] The term “transposon end” means the cis-acting nucleotide sequences that are sufficient for recognition by and transposition by a corresponding transposase. Transposon ends of piggyBac-like transposons comprise perfect or imperfect repeats such that the respective repeats in the two transposon ends are reverse complements of each other. These are referred to as ITRs or terminal inverted repeats (“TIR”s). A transposon end may or may not include an additional sequence proximal to the ITR that promotes or augments transposition.
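For illustration only, the reverse-complement relationship between the repeats in the two transposon ends may be checked as sketched below. The sketch assumes two repeats of equal length and counts mismatches, so that zero corresponds to a perfect repeat and a positive count to an imperfect repeat; the function names and the toy sequences are assumptions made for this example and are not sequences of the Sequence Listing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def repeat_mismatches(first_itr: str, second_itr: str) -> int:
    """Count mismatches between the first ITR and the reverse complement
    of the second ITR (0 means the repeats are perfect)."""
    first = first_itr.upper()
    second_rc = reverse_complement(second_itr).upper()
    if len(first) != len(second_rc):
        raise ValueError("this simplified check assumes equal-length repeats")
    return sum(1 for a, b in zip(first, second_rc) if a != b)
```

With the toy repeats `CCCTAGAA` and `TTCTAGGG` the count is zero, whereas `CCCTAGAA` and `TTCTAGGA` differ at one position and would be an imperfect repeat.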

[0071] The term “vector,” “DNA vector,” or “gene transfer vector” refers to a polynucleotide that is used to perform a “carrying” function for another polynucleotide. For example, vectors are often used to allow a polynucleotide to be propagated within a living cell, to allow a polynucleotide to be packaged for delivery into a cell, or to allow a polynucleotide to be integrated into the genomic DNA of a cell. A vector may further comprise additional functional elements, such as, for example, a transposon.

[0072] The disclosure refers to several genes and proteins for which it provides an example “SEQ ID NO:.” Unless otherwise apparent from the context, reference to a gene or protein should be understood as including the specific SEQ ID NO, as well as allelic, species, and induced variants thereof having at least 90, 95, or 99% identity thereto. Examples of allelic and species variants can be found in the SwissProt and other databases.

[0073] Mutations are sometimes referred to in the form XnY, wherein X is a wildtype amino acid, n is an amino acid position of X in a wildtype sequence, and Y is a replacement amino acid. If the mutation occurs in a sequence having a different number of amino acids than the wildtype sequence, it is present at the position in the sequence aligned with position n in the wildtype sequence when the respective sequences are maximally aligned.
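As a non-limiting illustration of the XnY notation, the sketch below parses a mutation such as Y6L and applies it to a wildtype sequence using 1-based positions. It covers only the case in which the mutated sequence has the same numbering as the wildtype sequence; the aligned-position case is not handled. The function names and the toy sequence are assumptions made for this example.

```python
import re
from typing import Tuple

# One-letter wildtype residue, 1-based position, one-letter replacement.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code: str) -> Tuple[str, int, str]:
    """Split a mutation in XnY form into (X, n, Y)."""
    match = _MUTATION.match(code)
    if match is None:
        raise ValueError(f"not in XnY form: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutation(wildtype: str, code: str) -> str:
    """Return the wildtype sequence with residue X at position n replaced by Y."""
    x, n, y = parse_mutation(code)
    if n < 1 or n > len(wildtype):
        raise ValueError(f"position {n} is outside the sequence")
    if wildtype[n - 1] != x:
        raise ValueError(f"expected {x} at position {n}, found {wildtype[n - 1]}")
    return wildtype[:n - 1] + y + wildtype[n:]
```

For the toy sequence `MSTEKYA`, the mutation Y6L yields `MSTEKLA`; a mutation whose stated wildtype residue does not match the sequence raises an error rather than being applied silently.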

II. Transposon Elements

[0074] Heterologous polynucleotides may be more efficiently integrated into a target genome if they are part of a transposon, for example so that they may be integrated by a transposase. A particular benefit of a transposon is that the entire polynucleotide between the transposon ITRs is integrated. This is in contrast with random integration, where a polynucleotide introduced into a eukaryotic cell is often fragmented at random in the cell, and only parts of the polynucleotide become incorporated into the target genome, usually at a low frequency. There are several different classes of transposon. piggyBac-like transposons include the piggyBac transposon from the looper moth Trichoplusia ni, Xenopus piggyBac-like transposons, Bombyx piggyBac-like transposons, Heliothis piggyBac-like transposons, Helicoverpa piggyBac-like transposons, Agrotis piggyBac-like transposons, Amyelois piggyBac-like transposons, piggyBat piggyBac-like transposons, and Oryzias piggyBac-like transposons. hAT transposons include TcBuster. Mariner transposons include Sleeping Beauty. Each of these transposons can be integrated into the genome of a mammalian cell by a corresponding transposase. Heterologous polynucleotides incorporated into transposons may be integrated into mammalian cells, as well as hepatocytes, neural cells, muscle cells, blood cells, embryonic stem cells, somatic stem cells, hematopoietic cells, embryos, zygotes, and sperm cells (some of which are open to being manipulated in an in vitro setting). Cells can also be pluripotent cells (cells whose descendants can differentiate into several restricted cell types, such as hematopoietic stem cells or other stem cells) or totipotent cells (i.e., a cell whose descendants can become any cell type in an organism, e.g., embryonic stem cells).

[0075] Gene transfer systems may comprise a transposon in combination with a corresponding transposase protein that transposes the transposon, or a nucleic acid that encodes the corresponding transposase protein and is expressible in the target cell. The nucleic acid encoding the transposase protein may be a DNA molecule or an mRNA molecule.

[0076] When there are multiple components of a gene transfer system, for example one or more polynucleotides comprising transposon ends flanking genes for expression in the target cell, and a transposase (which may be provided either as a protein or encoded by a nucleic acid), these components can be transfected into a cell at the same time, or sequentially. For example, a transposase protein or its encoding nucleic acid may be transfected into a cell prior to, at the same time, or after transfection of a corresponding transposon. Additionally, administration of either component of the gene transfer system may occur repeatedly, for example, by administering at least two doses of this component.

[0077] Transposase proteins may be encoded by polynucleotides including RNA or DNA. RNA molecules may include those with appropriate substitutions to reduce toxicity effects on the cell, for example, substitution of uridine with pseudouridine and substitution of cytosine with 5-methylcytosine. mRNA encoding the transposase may be prepared such that it has a 5’-cap structure to improve expression in a target cell. Example cap structures are a cap analog (G(5’)ppp(5’)G), an anti-reverse cap analog (3’-O-Me-m⁷G(5’)ppp(5’)G), a clean cap (m7G(5’)ppp(5’)(2’OmeA)pG), and an mCap (m7G(5’)ppp(5’)G). mRNA encoding the transposase may be prepared such that some bases are partially or fully substituted, for example uridine may be substituted with pseudouridine, and cytosine may be substituted with 5-methylcytosine. Any combinations of these caps and substitutions may be made. Similarly, the nucleic acid encoding the transposase protein or the transposon can be transfected into the cell as a linear fragment or as a circularized fragment, either as a plasmid or as recombinant viral DNA. If the transposase is introduced as a DNA sequence encoding the transposase, then the ORF encoding the transposase may be operably linked to a promoter that is active in the target mammalian cell.

[0078] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Xenopus transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 1, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 2. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 5 or SEQ ID NO: 6. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 7 or SEQ ID NO: 8. This transposon may be transposed by a corresponding Xenopus transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 9 or SEQ ID NO: 10, for example any of SEQ ID NOs: 9-41. The Xenopus transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. 
The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the polypeptide sequence of SEQ ID NO: 9: Y6L, Y6H, Y6V, Y6I, Y6C, Y6G, Y6A, Y6S, Y6F, Y6R, Y6P, Y6D, Y6N, S7G, S7V, S7D, E9W, E9D, E9E, M16E, M16N, M16D, M16S, M16Q, M16T, M16A, M16L, M16H, M16F, M16I, S18C, S18Y, S18M, S18L, S18Q, S18G, S18P, S18A, S18W, S18H, S18K, S18I, S18V, S19C, S19V, S19L, S19F, S19K, S19E, S19D, S19G, S19N, S19A, S19M, S19P, S19Y, S19R, S19T, S19Q, S20G, S20M, S20L, S20V, S20H, S20W, S20A, S20C, S20Q, S20D, S20F, S20N, S20R, E21N, E21W, E21G, E21Q, E21L, E21D, E21A, E21P, E21T, E21S, E21Y, E21V, E21F, E21M, E22C, E22H, E22R, E22L, E22K, E22S, E22G, E22M, E22V, E22Q, E22A, E22Y, E22W, E22D, E22T, F23Q, F23A, F23D, F23W, F23K, F23T, F23V, F23M, F23N, F23P, F23H, F23E, F23C, F23R, F23Y, S24L, S24W, S24H, S24V, S24P, S24I, S24F, S24K, S24Y, S24D, S24C, S24N, S24G, S24A, S26F, S26H, S26V, S26Q, S26Y, S26W, S28K, S28Y, S28C, S28M, S28L, S28H, S28T, S28Q, V31L, V31T, V31I, V31Q, V31K, A34L, A34E, L67A, L67T, L67M, L67V, L67C, L67H, L67E, L67Y, G73H, G73N, G73K, G73F, G73V, G73D, G73S, G73W, G73L, A76L, A76R, A76E, A76I, A76V, D77N, D77Q, D77Y, D77L, D77T, P88A, P88E, P88N, P88H, P88D, P88L, N91D, N91R, N91A, N91L, N91H, N91V, Y141I, Y141M, Y141Q, Y141S, Y141E, Y141W, Y141V, Y141F, Y141A, Y141C, Y141K, Y141L, Y141H, Y141R, N145C, N145M, N145A, N145Q, N145I, N145F, N145G, N145D, N145E, N145V, N145H, N145W, N145Y, N145L, N145R, N145S, P146V, P146T, P146W, P146C, P146Q, P146L, P146Y, P146K, P146N, P146F, P146E, P148M, P148R, P148V, P148F, P148T, P148C, P148Q, P148H, Y150W, Y150A,

Y150F, Y150H, Y150S, Y150V, Y150C, Y150M, Y150N, Y150D, Y150E, Y150Q, Y150K, H157Y, H157F, H157T, H157S, H157W, A162L, A162V, A162C, A162K, A162T, A162G, A162M, A162S, A162I, A162Y, A162Q, A179T, A179K, A179S, A179V, A179R, L182V,

L182I, L182Q, L182T, L182W, L182R, L182S, T189C, T189N, T189L, T189K, T189Q, T189V, T189A, T189W, T189Y, T189G, T189F, T189S, T189H, L192V, L192C, L192H, L192M, L192I, S193P, S193T, S193R, S193K, S193G, S193D, S193N, S193F, S193H, S193Q, S193Y, V196L, V196S, V196W, V196A, V196F, V196M, V196I, S198G, S198R, S198A, S198K, T200C, T200I, T200M, T200L, T200N, T200W, T200V, T200Q, T200Y, T200H, T200R, S202A, S202P, L210H, L210A, F212Y, F212N, F212M, F212C, F212A, N218V, N218R, N218T, N218C, N218G, N218I, N218P, N218D, N218E, A248S, A248L, A248H, A248C, A248N, A248I, A248Q, A248Y, A248M, A248D, L263V, L263A, L263M, L263R, L263D, Q270V, Q270K, Q270A, Q270C, Q270P, Q270L, Q270I, Q270E, Q270G, Q270Y, Q270N, Q270T, Q270W, Q270H, S294R, S294N, S294G, S294T, S294C, T297C, T297P, T297V, T297M, T297L, T297D, E304D, E304H, E304S, E304Q, E304C, S308R, S308G, L310R, L310I, L310V, L333M, L333W, L333F, Q336Y, Q336N, Q336M, Q336A, Q336T, Q336L, Q336I, Q336G, Q336F, Q336E, Q336V, Q336C, Q336H, A354V, A354W, A354D, A354C, A354R, A354E, A354K, A354H, A354G, C357Q, C357H, C357W, C357N, C357I, C357V, C357M, C357R, C357F, C357D, L358A, L358F, L358E, L358R, L358Q, L358V, L358H, L358C, L358M, L358Y, L358K, L358N, L358I, D359N, D359A, D359L, D359H, D359R, D359S, D359Q, D359E, D359M, L377V, L377I, V423N, V423P, V423T, V423F, V423H, V423C, V423S, V423G, V423A, V423R, V423L, P426L, P426K, P426Y, P426F, P426T, P426W, P426V, P426C, P426S, P426Q, P426H, P426N, K428R, K428Q, K428N, K428T, K428F, S434A, S434T, S438Q, S438A, S438M, T447S, T447A, T447C, T447Q, T447N, T447G, L450M, L450V, L450A, L450I, L450E, A462M, A462T, A462Y, A462F, A462K, A462R, A462Q, A462H, A462E, A462N, A462C, V467T, V467C, V467A, V467K, I469V, I469N, I472V, I472L, I472W, I472M, I472F, L476I, L476V, L476N, L476F, L476M, L476C, L476Q, P488E, P488H, P488K, P488Q, P488F, P488M, P488L, P488N, P488D, Q498V, Q498L, Q498G, Q498H, Q498T, Q498C, Q498E, Q498M, L502I, L502M, L502V, L502G, L502F, E517M, E517V, E517A, E517K, E517L, E517G, 
E517S, E517I, P520W, P520R, P520M, P520F, P520Q, P520V, P520G, P520D, P520K, P520Y, P520E, P520L, P520T, S521A, S521H, S521C, S521V, S521W, S521T, S521K, S521F, S521G, N523W, N523A, N523G, N523S, N523P, N523M, N523Q, N523L, N523K, N523D, N523H, N523F, N523C, I533M, I533V, I533T, I533S, I533F, I533G, I533E, D534E, D534Q, D534L, D534R, D534V, D534C, D534M, D534N, D534A, D534G, D534F, D534T, D534H, D534K, D534S, F576L, F576K, F576V, F576D, F576W, F576M, F576C, F576R, F576Q, F576A, F576Y, F576N, F576G, F576I, F576E, K577L, K577G, K577D, K577R, K577H, K577Y, K577I, K577E, K577V, K577N, I582V, I582K, I582R, I582M, I582G, I582N, I582E, I582A, I582Q, Y583L, Y583C, Y583F, Y583D, Y583Q, L587F, L587D, L587R, L587I, L587P, L587N, L587E, L587S, L587Y, L587M, L587Q, L587G, L587W, L587K, and L587T.

[0079] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Bombyx transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 42, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 43. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 44. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 45. This transposon may be transposed by a corresponding Bombyx transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 46 or SEQ ID NO: 47, for example any of SEQ ID NOs: 46-69. The Bombyx transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. 
The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the polypeptide sequence of SEQ ID NO: 46: Q85E, Q85M, Q85K, Q85H, Q85N, Q85T, Q85F, Q85L, Q92E, Q92A, Q92P, Q92N, Q92I, Q92Y, Q92H, Q92F, Q92R, Q92D, Q92M, Q92W, Q92C, Q92G, Q92L, Q92V, Q92T, V93P, V93K, V93M, V93F, V93W, V93L, V93A, V93I, V93Q, P96A, P96T, P96M, P96R, P96G, P96V, P96E, P96Q, P96C, F97Q, F97K, F97H, F97T, F97C, F97W, F97V, F97E, F97P, F97D, F97A, F97R, F97G, F97N, F97Y, H165E, H165G, H165Q, H165T, H165M, H165V, H165L, H165C, H165N, H165D, H165K, H165W, H165A, E178S, E178H, E178Y, E178F, E178C, E178A, E178Q, E178G, E178V, E178D, E178L, E178P, E178W, C189D, C189Y, C189I, C189W, C189T, C189K, C189M, C189F, C189P, C189Q, C189V, A196G, L200I, L200F, L200C, L200M, L200Y, A201Q, A201L, A201M, L203V, L203D, L203G, L203E, L203C, L203T, L203M, L203A, L203Y, N207G, N207A, L211G, L211M, L211C, L211T, L211V, L211A, W215Y, T217V, T217A, T217I, T217P, T217C, T217Q, T217M, T217F, T217D, T217K, G219S, G219A, G219C, G219H, G219Q, Q235C, Q235N, Q235H, Q235G, Q235W, Q235Y, Q235A, Q235T, Q235E, Q235M, Q235F, Q238C, Q238M, Q238H, Q238V, Q238L, Q238T, Q238I, R242Q, K246I, K253V, M258V, F261L, S263K, C271S, N303C, N303R, N303G, N303A, N303D, N303S, N303H, N303E, N303R, N303K, N303L, N303Q, I312F, I312C, I312A, I312L, I312T, I312V, I312G, I312M, F321H, F321R, F321N, F321Y, F321W, F321D, F321G, F321E, F321M, F321K, F321A, F321Q, V323I, V323L, V323T, V323M, V323A, V324N, V324A, V324C, V324I, V324L, V324T, V324K, V324Y, V324H, V324F, V324S, V324Q, V324M, V324G, A330K, A330V, A330P, A330S, A330C, A330T, A330L, Q333P, Q333T, Q333M, Q333H, Q333S, P337W, P337E, P337H, P337I, P337A, P337M, P337N, P337D, P337K, P337Q, P337G, P337S, P337C, P337L, P337V, F368Y, L373C, L373V, L373I, L373S, L373T, V389I, V389M, V389T, V389L, V389A, R394H, R394K, R394T, R394P, R394M, R394A, Q395P, Q395F, Q395E, Q395C, Q395V, Q395A, Q395H, Q395S, Q395Y, S399N, S399E, S399K, S399H, S399D, S399Y, S399G, S399Q, S399R, S399T, S399A, S399V, S399M, R402Y, R402K, R402D, R402F, R402G, R402N, R402E, R402M, R402S, R402Q, R402T, R402C, R402L, R402V, T403W, T403A, T403V, T403F, T403L, T403Y, T403N, T403G, T403C, T403I, T403S, T403M, T403Q, T403K, T403E, D404I, D404S, D404E, D404N, D404H, D404C, D404M, D404G, D404A, D404Q, D404L, D404P, D404V, D404W, D404F, N408F, N408I, N408A, N408E, N408M, N408S, N408D, N408Y, N408H, N408C, N408Q, N408V, N408W, N408L, N408P, N408K, S409H, S409Y, S409N, S409I, S409D, S409F, S409T, S409C, S409Q, N441F, N441R, N441M, N441G, N441C, N441D, N441L, N441A, N441V, N441W, G448W, G448Y, G448H, G448C, G448T, G448V, G448N, G448Q, E449A, E449P, E449T, E449L, E449H, E449G, E449C, E449I, V469T, V469A, V469H, V469C, V469L, L472K, L472Q, L472M, C473G, C473Q, C473T, C473I, C473M, R484H, R484K, T507R, T507D, T507S, T507G, T507K, T507I, T507M, T507E, T507C, T507L, T507V, G523Q, G523T, G523A, G523M, G523S, G523C, G523I, G523L, I527M, I527V, Y528N, Y528W, Y528M, Y528Q, Y528K, Y528V, Y528I, Y528G, Y528D, Y528A, Y528E, Y528R, Y543C, Y543W, Y543I, Y543M, Y543Q, Y543A, Y543R, Y543H, E549K, E549C, E549I, E549Q, E549A, E549H, E549C, E549M, E549S, E549F, E549L, K550R, K550M, K550Q, S556G, S556V, S556I, P557W, P557T, P557S, P557A, P557Q, P557K, P557D, P557G, P557N, P557L, P557V, H559K, H559S, H559C, H559I, H559W, V560F, V560P, V560I, V560H, V560Y, V560K, N561P, N561Q, N561G, N561A, V562Y, V562I, V562S, V562M, V567I, V567H, V567N, S583M, E601V, E601F, E601Q, E601W, E605R, E605W, E605K, E605M, E605P, E605Y, E605C, E605H, E605A, E605Q, E605S, E605V, E605I, E605G, D607V, D607Y, D607C, D607N, D607W, D607T, D607A, D607H, D607Q, D607E, D607L, D607K, D607G, S609R, S609W, S609H, S609V, S609Q, S609G, S609T, S609K, S609N, S609Y, L610T, L610I, L610K, L610G, L610A, L610W, L610D, L610Q, L610S, L610F, and L610N.

[0080] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Myotis transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 70, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 71. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 72. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 73. This transposon may be transposed by a corresponding Myotis transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 74. The Myotis transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the sequence of SEQ ID NO: 74: A14V, D475G, P491Q, A561T, T546T, T300A, T294A, A520T, G239S, S5P, S8F, S54N, D9N, D9G, I345V, M481V, E11G, K130T, R427H, S8P, S36G, and D10G.

[0081] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Trichoplusia transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 75, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 76. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 77. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 78. This transposon may be transposed by a corresponding Trichoplusia transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 79. The Trichoplusia transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase.
The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the sequence of SEQ ID NO: 79: G2C, Q40R, I30V, G165S, T43A, S61R, S103P, S103T, M194V, R281G, M282V, G316E, I426V, Q497L, N505D, Q573L, S509G, N570S, N538K, Q591P, Q591R, F594L, M194V, I30V, S103P, G165S, M282V, S509G, N538K, N571S, C41T, A1424G, C1472A, G1681A, T150C, A351G, A279G, T1638C, A898G, A880G, G1558A, A687G, G715A, T13C, C23T, G161A, G25A, T1050C, A1356G, A26G, A1033G, A1441G, A32G, A389C, A32G, A389C, A32G, T1572A, G456A, T1641C, T1155C, G1280A, T22C, A106G, A29G, C137T, A14V, D475G, P491Q, A561T, T546T, T300A, T294A, A520T, G239S, S5P, S8F, S54N, D9N, D9G, I345V, M481V, E11G, K130T, G9G, R427H, S8P, S36G, D10G, S36G, A51T, C153A, C277T, G201A, G202A, T236A, A103T, A104C, T140C, G138T, T118A, C74T, A179C, S3N, I30V, A46S, A46T, I82W, S103P, R119P, C125A, C125L, G165S, Y177K, Y177H, F180L, F180I, F180V, M185L, A187G, F200W, V207P, V209F, M226F, L235R, V240K, F241L, P243K, N258S, M282Q, L296W, L296Y, L296F, M298V, M298A, M298L, P311V, P311I, R315K, T319G, Y327R, Y328V, C340G, C340L, D421H, V436I, M456Y, L470F, S486K, M503I, M503L, V552K, A570T, Q591P, Q591R, R65A, R65E, R95A, R95E, R97A, R97E, R135A, R135E, R161A, R161E, R192A, R192E, R208A, R208E, K176A, K176E, K195A, K195E, S171E, M14V, D270N, I30V, G165S, M282L, M282I, M282V, and M282A.

[0082] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is an Amyelois transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 80, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 81. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 82. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 83. This transposon may be transposed by a corresponding Amyelois transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 84. The Amyelois transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. 
The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the sequence of SEQ ID NO: 84: P65E, P65D, R95S, R95T, V100I, V100L, V100M, L115D, L115E, E116P, H121Q, H121N, K139E, K139D, T159N, T159Q, V166F, V166Y, V166W, G179N, G179Q, W187F, W187Y, P198R, P198K, L203R, L203K, I209L, I209V, I209M, N211R, N211K, E238D, L273I, L273V, L273M, D304K, D304R, I323L, I323M, I323V, Q329G, Q329R, Q329K, T345L, T345I, T345V, T345M, K362R, T366R, T366K, T380S, L408M, L408I, L408V, E413S, E413T, S416E, S416D, I426M, I426L, I426V, S435G, L458M, L458I, L458V, A472S, A472T, V475I, V475L, V475M, N483K, N483R, I491M, I491V, I491L, A529P, K540R, S560K, S560R, T562K, T562R, S563K, and S563R.

[0083] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Heliothis transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 85, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 86. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 87. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 88. This transposon may be transposed by a corresponding Heliothis transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 89. The Heliothis transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the sequence of SEQ ID NO: 89: S41V, S41I, S41L, L43S, L43T, V81E, V81D, D83S, D83T, V85L, V85I, V85M, P125S, P125T, Q126S, Q126T, Q131R, Q131K, Q131T, Q131S, S136V, S136I, S136L, S136M, E140C, E140A, N151Q, K169E, K169D, N212S, I239L, I239V, I239M, H241N, H241Q, T268D, T268E, T297C, M300R, M300K, M305N, M305Q, L312I, C316A, C316M, L321V, L321M, N322T, N322S, P351G, H357R, H357K, H357D, H357E, K360Q, K360N, E379P, K397S, K397T, Y421F, Y421W, V450I, V450L, V450M, Y495F, Y495W, A447N, A447D, A449S, A449V, K476L, V492A, I500M, L585K, and T595K.

[0084] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is an Oryzias transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 90 or SEQ ID NO: 92, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 91 or SEQ ID NO: 93. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 94. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 95. This transposon may be transposed by a corresponding Oryzias transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 96. The Oryzias transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase. The hyperactive variant transposase may comprise one or more of the following amino acid changes, relative to the sequence of SEQ ID NO: 96: E22D, A124C, Q131D, Q131E, L138V, L138I, L138M, D160E, Y164F, Y164W, I167L, I167V, I167M, T202R, T202K, I206L, I206V, I206M, I210L, I210V, I210M, N214D, N214E, V253I, V253L, V253M, V258L, V258I, V258M, A284L, A284I, A284M, A284V, V386I, V386M, V386L, M400L, M400I, M400V, S408E, S408D, L409I, L409V, L409M, V458L, V458M, V458I, V467I, V467M, V467L, L468I, L468V, L468M, A514R, A514K, V515I, V515M, V515L, R548K, D549K, D549R, D550R, D550K, S551K, and S551R.
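The amino acid changes listed above use a compact notation: the residue in the reference sequence, its 1-based position, and the replacement residue (for example, E22D replaces glutamate at position 22 with aspartate). The following is a minimal Python sketch of how such a list could be parsed and applied to a reference sequence. The function names and the toy sequence are hypothetical and are not part of this disclosure; the actual reference sequences are those given in the Sequence Listing.

```python
import re

# One-letter codes for the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
# Notation: original residue, 1-based position, replacement residue (e.g. "P65E").
_SUBSTITUTION = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_substitution(text):
    """Return (original, position, replacement), or raise ValueError if malformed."""
    match = _SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid substitution: {text!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitutions(sequence, substitutions):
    """Apply each substitution, checking that the reference residue matches."""
    residues = list(sequence)
    for text in substitutions:
        original, position, replacement = parse_substitution(text)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{text}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"{text}: reference has {residues[position - 1]} at {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_reference = "MEDAK"  # hypothetical 5-residue sequence, for illustration only
    print(apply_substitutions(toy_reference, ["E2D", "A4C"]))
```

Checking the reference residue before substituting catches numbering errors, and the strict pattern rejects strings that do not follow the residue-position-residue form.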

[0085] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is an Agrotis transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 97, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 98. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 99. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 100. This transposon may be transposed by a corresponding Agrotis transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 101. The Agrotis transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase.

[0086] A suitable piggyBac-like transposon for modifying the genome of a mammalian cell is a Helicoverpa transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 102, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 103. The transposon may further be flanked by a copy of the tetranucleotide 5’-TTAA-3’ on each side, immediately adjacent to the ITRs and distal to the heterologous polynucleotide. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 104. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 105. This transposon may be transposed by a corresponding Helicoverpa transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 106. The Helicoverpa transposase may optionally be fused to a heterologous nuclear localization signal. The transposase may be a hyperactive variant of a naturally occurring transposase.

[0087] A suitable Mariner transposon for modifying the genome of a mammalian cell is a Sleeping Beauty transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence of SEQ ID NO: 107, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence of SEQ ID NO: 108. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 109. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 110. This transposon may be transposed by a corresponding Sleeping Beauty transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 111, including hyperactive variants thereof.

[0088] A suitable hAT transposon for modifying the genome of a mammalian cell is a TcBuster transposon, which comprises, from 5’ to 3’, a first ITR with the nucleotide sequence SEQ ID NO: 112, a heterologous polynucleotide to be transposed, and a second ITR with the nucleotide sequence SEQ ID NO: 113. The transposon may further comprise a first additional polynucleotide immediately adjacent to one ITR, e.g., the first ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 114. The transposon may further comprise a second additional polynucleotide immediately adjacent to one ITR, e.g., the second ITR, and proximal to the heterologous polynucleotide, whose nucleotide sequence is at least 95% identical to SEQ ID NO: 115. This transposon may be transposed by a corresponding TcBuster transposase comprising a polypeptide sequence at least 90% identical to the polypeptide sequence of SEQ ID NO: 116, including hyperactive variants thereof.

[0089] A transposase protein can be introduced into a cell as a protein or as a nucleic acid encoding the transposase, for example as a ribonucleic acid, including mRNA or any polynucleotide recognized by the translational machinery of a cell; as DNA, e.g., as extrachromosomal DNA including episomal DNA; as plasmid DNA; or as viral nucleic acid. Furthermore, the nucleic acid encoding the transposase protein can be transfected into a cell as a nucleic acid vector such as a plasmid, or as a gene expression vector, including a viral vector. The nucleic acid can be circular or linear. DNA encoding the transposase protein can be stably inserted into the genome of the cell or into a vector for constitutive or inducible expression. Where the transposase protein is transfected into the cell or inserted into the vector as DNA, the transposase encoding sequence may be operably linked to a heterologous promoter. There are a variety of promoters that could be used, including constitutive promoters, tissue-specific promoters, inducible promoters, species-specific promoters, cell-type specific promoters, and the like. All DNA or RNA sequences encoding transposase proteins are expressly contemplated. Alternatively, the transposase may be introduced into the cell directly as protein, for example using cell-penetrating peptides (e.g., as described in Ramsey and Flynn, 2015. Pharmacol. Ther. 154: 78-86 “Cell-penetrating peptides transport therapeutics into cells”); using small molecules including salt plus propanebetaine (e.g., as described in D’Astolfo et al., 2015. Cell 161: 674-690); or electroporation (e.g., as described in Morgan and Day, 1995. Methods in Molecular Biology 48: 63-71 “The introduction of proteins into mammalian cells by electroporation”).

III. Production of 2-chain antibodies using a DNA construct comprising first and second transcriptional units, each transcriptional unit comprising a CMV enhancer, a promoter, and an ORF, but not including an IRES

[0090] With reference to Figure 1, one DNA construct configuration (100) that usually achieves over-production of the light chain relative to the heavy chain comprises: (i) a first transcriptional unit (110) comprising a first cytomegalovirus (CMV) enhancer (112) operably linked to a first promoter (114) and an ORF (116) encoding an antibody light chain, and (ii) a second transcriptional unit (120) comprising a second CMV enhancer (122) operably linked to a second promoter (124) and an ORF (126) encoding an antibody heavy chain, wherein the first transcriptional unit (110) is placed to the 5’ of the second transcriptional unit (120), and both transcriptional units are transcribed in the same (5’ to 3’) direction. The first (112) and second (122) CMV enhancers may be enhancers from the CMV immediate early gene 1, 2, or 3 from a rodent or a primate virus, such as, for example, from a mouse, human, or chimpanzee CMV. Example CMV enhancers may have nucleotide sequences SEQ ID NOs: 117-126. The first (112) and second (122) CMV enhancers may be the same or different. The first (114) and second (124) promoters may be promoters from the CMV immediate early gene 1, 2, or 3 from a rodent or a primate virus, such as, for example, from a mouse, human, chimpanzee, or macaque CMV. Example CMV promoters may have nucleotide sequences SEQ ID NOs: 127-146. The first (114) and second (124) promoters may also be hybrid promoters comprising sequences from more than one promoter, such as a hybrid between the human and mouse CMV promoters, between the human and chimpanzee promoters, or between the mouse and chimpanzee promoters. Example hybrid CMV promoters may have nucleotide sequences SEQ ID NOs: 147-153. The first (114) and second (124) promoters may be promoters from a mammalian or avian actin gene, such as, for example, from a rodent or a primate actin alpha gene, such as, for example, from mouse, rat, hamster, human, or chicken. 
Example actin promoters may have nucleotide sequences SEQ ID NOs: 154-164. The first (114) and second (124) promoters may be promoters from a mammalian EF1 alpha gene, such as, for example, from a rodent or a primate EF1 alpha gene, such as, for example, from a mouse, rat, hamster, jerboa, beaver, or a human. Example EF1 alpha promoters may have nucleotide sequences SEQ ID NOs: 165-188. The first (114) and second (124) promoters may be the same or different. Optionally, the first (110) and second (120) transcriptional units may also be operably linked to an intron (not shown in Figure 1), including in the 5’ untranslated region (UTR) of the ORF. Introns, particularly the first intron in a gene, often comprise transcription factor binding sites and thereby enhance transcription and consequently expression of operably linked ORFs. The presence of an intron in the 5’ UTR of an ORF also frequently increases mRNA export from the nucleus, thereby enhancing expression of operably linked ORFs. Useful introns to include in the 5’ UTR of an ORF may include those from the CMV immediate early gene 1, 2, or 3 from a rodent or a primate virus (example CMV introns may have nucleotide sequences SEQ ID NOs: 189-206); those from a mammalian or avian actin gene (example actin introns may have nucleotide sequences SEQ ID NOs: 211-225); those from a mammalian EF1 alpha gene (example EF1 alpha introns may have nucleotide sequences SEQ ID NOs: 226-239); and synthetic introns (example synthetic introns may have nucleotide sequences SEQ ID NOs: 207-210). In cases where an intron is included in the 5’ UTR of an operably linked ORF, it should be included in a sequence context such that it is spliceable in the host cell. The first (110) and second (120) transcriptional units may comprise the same or different introns. In one aspect, the first (116) and second (126) ORFs may be linked with an intron. 
In other aspects, either the first (116) or the second (126) ORF may be linked with an intron, while the other ORF need not be so linked. For expression in a mammalian cell, each ORF may also be operably linked to a termination/polyadenylation sequence (118, 128) located to the 3’ of the ORF. Polyadenylation sequences (118, 128) typically comprise signals that cause the RNA polymerase transcribing the gene to stop. Polyadenylation sequences (118, 128) also comprise signals that cause the cell to trim the 3’ end of the RNA and add a stretch of adenine residues, creating a polyadenosyl stretch at the 3’ end of the RNA molecule. Suitable polyadenylation sequences may include those represented by nucleotide sequences SEQ ID NOs: 240-247.

IV. Production of 2-chain antibodies using a DNA construct comprising a single transcriptional unit, the transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0091] With reference to Figure 2, another configuration (200) that favors higher production of the light chain over the heavy chain is when the ORFs encoding the light chain (216) and heavy chain (226) are operably linked in a single transcriptional unit (210) by an IRES (215), such as those found in the 5’ UTRs of many positive strand RNA viruses, for example, picornaviruses, encephalomyocarditis virus (EMCV), enteroviruses, and coxsackieviruses. Example IRES sequences useful for expressing two ORFs from a single mRNA may have nucleotide sequences SEQ ID NOs: 248-281. An IRES sequence placed between two ORFs leads to initiation of translation at two places within the mRNA. Normal eukaryotic translation initiates by a mechanism where a ribosome binds to the 5’ cap structure at the beginning of the mRNA and scans along the mRNA in a 5’ to 3’ direction until it reaches a start codon in good context where translation of the encoded protein begins. 
Good sequence contexts for translational initiation from a 5’ cap often have the sequence pattern 5’-SNSRMVAUG-3’ where S is C or G; N is any base; R is A or G; M is A or C; V is G, C, or A; and AUG is the initiating AUG in the ORF. The sequence 5’-GCCACCAUG-3’ is a particularly good translational initiation context. Viral IRES sequences fold into structures that mimic aspects of the normal 5’ cap structure bound to one or more translation initiation factors. This permits mammalian ribosomes to initiate translation in the internal portion of the mRNA, rather than the beginning, at an ORF that is operably linked to the IRES. Typically, the efficiency of translational initiation at an IRES is less than that of translational initiation when ribosomes scan from the 5’ cap to a good translational initiation context. This efficiency reduction can be exploited to obtain a higher level of expression of a first ORF (e.g., 216, a light chain) relative to a second ORF (e.g., 226, a heavy chain), by placing the first ORF (216) close to the 5’ end of the mRNA in a good translational initiation context, and operably linking the second ORF (226) to an IRES (215) to the 3’ of the first ORF (216). The EMCV IRES is particularly useful for the expression of monoclonal antibodies; when using a configuration such as that shown in Figure 2, the relative expression of a light chain placed in the position of the first ORF (216) is between two and three times the expression level of a heavy chain placed in the position of the second ORF (226). This is approximately the desired relative expression for the chains of a monoclonal antibody. Example EMCV IRES sequences may comprise nucleotide sequences SEQ ID NOs: 248-254. Suitable enhancers, promoters, and polyadenylation sequences include those otherwise disclosed herein.
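The initiation-context pattern 5’-SNSRMVAUG-3’ described above can be checked mechanically by expanding each ambiguity code into the set of bases it stands for. The following Python sketch is one hypothetical way to do so; the function names are illustrative assumptions and not part of this disclosure, and DNA input is accepted by treating T as U.

```python
import re

# Ambiguity codes as defined in the text: S = C/G, N = any base,
# R = A/G, M = A/C, V = G/C/A. Plain bases map to themselves.
_CODES = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "S": "[CG]", "N": "[ACGU]", "R": "[AG]", "M": "[AC]", "V": "[ACG]",
}


def pattern_to_regex(pattern):
    """Expand a pattern written with ambiguity codes into a compiled regex."""
    return re.compile("".join(_CODES[symbol] for symbol in pattern))


_CAP_CONTEXT = pattern_to_regex("SNSRMVAUG")


def has_good_initiation_context(mrna, start_index):
    """Test the six bases upstream of an AUG plus the AUG itself.

    start_index is the 0-based index of the A of the candidate start codon.
    """
    if start_index < 6:
        return False  # not enough upstream sequence to evaluate the context
    window = mrna[start_index - 6:start_index + 3].upper().replace("T", "U")
    return _CAP_CONTEXT.fullmatch(window) is not None


if __name__ == "__main__":
    # The context singled out in the text as particularly good.
    print(has_good_initiation_context("GCCACCAUGG", 6))
```

A check of this kind only tests the sequence pattern; it does not predict the actual translational efficiency in a host cell.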

V. Production of 2-chain antibodies using a DNA construct comprising first and second transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs, wherein the DNA construct comprises a transposon

[0092] As shown in Figure 2, a first (216) and second (226) ORF operably linked by an IRES (215) comprise only a single transcriptional unit (210). The number of copies of both ORFs that are integrated into a host genome may be doubled by placing two transcriptional units within a DNA construct to be integrated into the host genome. In one aspect, the DNA construct may comprise a transposon. With reference to Figure 3, in one aspect, a first transcriptional unit (310) comprises a first ORF (316) and a second ORF (326), the first ORF (316) is close to the 5’ end of the mRNA in a good translational initiation context, the second ORF (326) is operably linked to a first IRES (315), and both the first IRES (315) and the second ORF (326) are placed to the 3’ of the first ORF (316). In one aspect, the second transcriptional unit (320) comprises a third ORF (336) and a fourth ORF (346), the third ORF (336) is close to the 5’ end of the mRNA in a good translational initiation context, the fourth ORF (346) is operably linked to a second IRES (325), and both the second IRES (325) and the fourth ORF (346) are placed to the 3’ of the third ORF (336). For a monoclonal antibody, in one aspect, the first (316) and third (336) ORFs may encode the same mature light chain sequence (though the secretion signal sequences may be different), and the second (326) and fourth (346) ORFs may encode the same mature heavy chain sequence (though the secretion signal sequences may be different). Example light chain signal sequences may have amino acid sequences SEQ ID NOs: 301-306. Example heavy chain signal sequences may have amino acid sequences SEQ ID NOs: 305-317. In one aspect, each of the two transcriptional units (310, 320) is operably linked to a promoter (314, 324) and a polyA sequence (318, 328) so that both transcriptional units (310, 320) are expressible in the host cell. Optionally, the transcriptional units may also be operably linked to enhancer (312, 322) and intron (317, 327) sequences. As shown in Figure 3, the transcriptional units comprise a transposon flanked by transposon ends (301, 302).

[0093] Repeated sequences in a DNA construct can result in recombination either within a bacterial host while the construct is being propagated, or within the mammalian expression host after the construct has been integrated into its genome. Recombination between repeats results in the loss of some sequences and instability of expression. It is therefore desirable to reduce sequence repeats where possible. This can be done by using different regulatory elements such as enhancers, promoters, introns, and polyadenylation signal sequences. However, it is sometimes desirable to use some or all of the same regulatory elements for the first and second transcriptional units in order to properly balance expression of the encoded polypeptides. Even in such a case, sequence identity between the first and second transcriptional units can still be reduced by selecting different codons to encode the first and second polypeptides within the first and second transcriptional units. For example, codons may be selected as described in U.S. Patent No. 7,561,972, which is incorporated herein by reference in its entirety, for each occurrence of each ORF independently. It may be desirable to optimize antibody constant regions to maximize similarity to constant regions with good known expression properties. Example sequences encoding the human kappa constant region with amino acid sequence SEQ ID NO: 284 may include nucleotide sequences SEQ ID NO: 292 and SEQ ID NO: 293. Example sequences encoding the human IgG1 constant region with amino acid sequence SEQ ID NO: 282 may include nucleotide sequences SEQ ID NO: 288 and SEQ ID NO: 289. Example sequences encoding the human IgG4 constant region with amino acid sequence SEQ ID NO: 283 may include nucleotide sequences SEQ ID NO: 290 and SEQ ID NO: 291. Example sequences encoding the human lambda 1, 2, and 7 constant regions with respective amino acid sequences SEQ ID NO: 285, SEQ ID NO: 286, and SEQ ID NO: 287 may include nucleotide sequences SEQ ID NO: 294, SEQ ID NO: 295, and SEQ ID NO: 296, respectively.
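The effect of choosing different codons for two copies of the same ORF can be illustrated by translating both copies and measuring their nucleotide identity. The Python sketch below uses the standard genetic code and two short, made-up coding sequences; it is an illustrative assumption only and does not implement the codon selection method of U.S. Patent No. 7,561,972.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def nucleotide_identity(first, second):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(first.upper(), second.upper()))
    return matches / len(first)


if __name__ == "__main__":
    # Two hypothetical encodings of the same tripeptide (Ala-Leu-Lys).
    copy_1 = "GCTCTGAAA"
    copy_2 = "GCCTTAAAG"
    assert translate(copy_1) == translate(copy_2)
    print(f"identity: {nucleotide_identity(copy_1, copy_2):.2f}")
```

In this toy case the two copies encode the same polypeptide while sharing only five of nine nucleotides, which shows how synonymous codon choice lowers the repeat content between two transcriptional units.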

[0094] Suitable enhancers, promoters, IRES sequences, introns, and polyadenylation sequences include those otherwise disclosed herein. For example, in one aspect, a suitable combination for the “middle” section of the vector configuration shown in Figure 3, that is, first polyA sequence (318), second enhancer (322), second promoter (324), and second intron (327), may be represented by any of SEQ ID NOs: 297-300.

[0095] The vector configuration shown in Figure 3 is widely applicable to antibody (and Fab) expression, regardless of the relative expression of heavy and light chains, as the order of elements within this configuration can provide additional tuning of expression of the different antibody subunits. First ORF 316 may encode a light chain, with second ORF 326 encoding a heavy chain. Alternatively, first ORF 316 may encode a heavy chain, with second ORF 326 encoding a light chain. Similarly, third ORF 336 may encode a second copy of the light chain, with fourth ORF 346 encoding a second copy of the heavy chain, or third ORF 336 may encode a second copy of the heavy chain, with fourth ORF 346 encoding a second copy of the light chain. Alternatively, first ORF 316 may encode a first copy of light chain, with second ORF 326 encoding a second copy of the same light chain, while third ORF 336 may encode a first copy of the heavy chain, with fourth ORF 346 encoding a second copy of the heavy chain. Alternatively, first ORF 316 may encode a first copy of a heavy chain, with second ORF 326 encoding a second copy of the same heavy chain, while third ORF 336 may encode a first copy of the light chain, with fourth ORF 346 encoding a second copy of the light chain.
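The arrangements described in the preceding paragraph can be enumerated systematically. Assuming, for illustration, that the choices for the first and second transcriptional units may be combined freely, every arrangement places two light chain copies and two heavy chain copies into the four ORF positions (316, 326, 336, 346), which gives six distinct layouts. The Python sketch below lists them; the names are hypothetical and not part of this disclosure.

```python
from itertools import permutations

# ORF positions of Figure 3: 316 and 326 are in the first transcriptional
# unit, 336 and 346 in the second. 326 and 346 are the IRES-linked positions.
ORF_POSITIONS = (316, 326, 336, 346)


def figure3_arrangements():
    """Return every distinct placement of two light (L) and two heavy (H) chain ORFs."""
    layouts = sorted(set(permutations("LLHH")))
    return [dict(zip(ORF_POSITIONS, layout)) for layout in layouts]


if __name__ == "__main__":
    for arrangement in figure3_arrangements():
        print(arrangement)
```

Four of the six layouts pair one light chain ORF and one heavy chain ORF in each transcriptional unit, and the remaining two place both copies of one chain in the same unit, matching the alternatives listed in the paragraph.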

VI. Production of multi-specific antibodies

[0096] One general format for a four-chain multi-specific antibody is two half-antibodies, each comprising two chains, which may comprise, for example, a heavy chain and a light chain. In this format, the first light chain associates with the first heavy chain, and the second light chain associates with the second heavy chain. The first and second heavy chains associate with each other. In some aspects, the first and second light chains comprise the same amino acid sequence, a special case of a four-chain multi-specific antibody with a “common light chain.” Incorrect assembly products result when the chains associate inappropriately, for example (other than for common light chain molecules), if the first light chain associates with the second heavy chain or the second light chain associates with the first heavy chain, or where the first heavy chain associates with another first heavy chain or the second heavy chain associates with another second heavy chain. These assembly products are shown schematically in Figure 4.

[0097] ORFs encoding the chains for one half-antibody for a four-chain multi-specific antibody may be incorporated into a first DNA construct, and ORFs encoding the chains for the other half-antibody may be incorporated into a second DNA construct, and the first and second DNA constructs may be introduced into the same cell. Introducing different relative amounts of the first and second DNA constructs introduces different numbers of copies of the ORFs encoding the chains for the first half-antibody relative to numbers of copies of the ORFs encoding the chains for the second half-antibody. This can provide a physically easy way to compensate for potential expression differences between the two half-antibodies. However, when DNA is integrated into a mammalian cell genome by the process of random integration, the DNA is fragmented within the host cell, some fragments are completely degraded while other fragments are integrated into the host cell’s genome, and certain of the integrated fragments may be amplified by the selection process. The random integration process thus frequently results in large stochastic variations in the relative numbers of copies of ORFs, even when those ORFs are originally introduced as part of the same DNA construct. Furthermore, random DNA integration into a mammalian genome is a relatively rare event, so when two different DNA constructs are introduced into a mammalian cell there is a high likelihood that only one or the other will be integrated at all. Random integration will thus confound attempts to control precisely the relative copy numbers of ORFs introduced into a cell on more than one DNA construct. To avoid these limitations, the DNA constructs as described herein may comprise transposons that are introduced with corresponding transposases that transpose the transposons into the host genome. 
The transposition process retains the integrity of the DNA between the transposon ends, and the integration frequency is much higher than with random integration, so the relative number of integrated copies of each ORF and its operably linked regulatory elements is close to the relative number that were introduced into the cell.

VII. Production of a 4-chain multi-specific antibody using two transposons, each transposon comprising two transcriptional units, each transcriptional unit comprising an ORF, and each ORF is operably linked to regulatory elements such that the ORFs are expressible in the host cell, but not including an IRES

[0098] One example configuration (500) of a pair of transposons that may be used to encode a 4-chain multi-specific antibody is shown in Figures 5A and 5B. The first transposon comprises two transcriptional units (510, 520), each comprising an ORF (516, 526), and each ORF (516, 526) is operably linked to regulatory elements (e.g., optional enhancers (512, 522), promoters (514, 524), polyadenylation sequences (518, 528), and optional introns (517, 527)), such that the ORFs (516, 526) are expressible in the host cell. The two ORFs (516, 526) encode a first polypeptide chain and a second polypeptide chain, wherein the first and second polypeptide chains comprise a first half-antibody. The second transposon comprises two transcriptional units (530, 540), each comprising an ORF (536, 546), and each ORF (536, 546) is operably linked to regulatory elements (e.g., optional enhancers (532, 542), promoters (534, 544), polyadenylation sequences (538, 548), and optional introns (537, 547)), such that the ORFs (536, 546) are expressible in the host cell. The two ORFs (536, 546) encode a third polypeptide chain and a fourth polypeptide chain, wherein the third and fourth polypeptide chains comprise a second half-antibody. With further reference to Figures 5A and 5B, the first transposon has ends (501, 502) that are recognized by a first transposase, which can integrate the first transposon into the genome of the host cell. The second transposon has ends (503, 504) that are recognized by a second transposase, which can integrate the second transposon into the genome of the host cell. The first transposase and the second transposase may be the same, or they may be different. The first transposon may further comprise a gene encoding a first selectable marker expressible in the host cell, and the second transposon may further comprise a gene encoding a second selectable marker expressible in the host cell. 
The first and second selectable markers may be the same or they may be different.

[0099] The 4-chain multi-specific antibody comprises the first half-antibody and the second half-antibody (see Figure 4). Each transposon may be configured to express an optimal ratio of the two polypeptide chains comprising the half-antibody, including the order of the ORFs as described with respect to Figure 3.

[0100] Suitable enhancers, promoters, introns, and polyadenylation sequences include those otherwise disclosed herein. For example, in one aspect, a suitable combination for the “middle” section of the DNA constructs shown in Figures 5A and/or 5B, that is, either or both of first and third polyA sequences (518, 538), second and fourth enhancers (522, 542), second and fourth promoters (524, 544), and second and fourth introns (527, 547) may be represented by any of SEQ ID NOs: 297-300.

VIII. Production of a 4-chain multi-specific antibody using two transposons, each transposon comprising a single transcriptional unit, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0101] Another example configuration (600) of a pair of transposons that may be used to encode a 4-chain multi-specific antibody is shown in Figure 6. The first transposon comprises one transcriptional unit (610) comprising first (616) and second (626) ORFs operably linked by a first IRES (615). The first ORF (616) is further operably linked to regulatory elements (e.g., an enhancer (612), a promoter (614), and optionally an intron (617)), and the second ORF (626) is further operably linked to a polyadenylation sequence (618), such that the ORFs (616, 626) are expressible in the host cell. The two ORFs (616, 626) encode a first polypeptide chain and a second polypeptide chain, wherein the first and second polypeptide chains comprise a first half-antibody. The second transposon comprises one transcriptional unit (620) comprising third (636) and fourth (646) ORFs operably linked by a second IRES (625). The third ORF (636) is further operably linked to regulatory elements (e.g., an enhancer (622), a promoter (624), and optionally an intron (627)), and the fourth ORF (646) is further operably linked to a polyadenylation sequence (628), such that the ORFs (636, 646) are expressible in the host cell. The two ORFs (636, 646) encode a third polypeptide chain and a fourth polypeptide chain, wherein the third and fourth polypeptide chains comprise a second half-antibody. The 4-chain multi-specific antibody comprises the first half-antibody and the second half-antibody. Each transposon may be configured to express an optimal ratio of the two polypeptide chains comprising the half-antibody, including the order of the ORFs as described with respect to Figure 3.

[0102] With further reference to Figure 6, the first transposon has ends (601, 602) that are recognized by a first transposase, which can integrate the first transposon into the genome of the host cell. The second transposon has ends (603, 604) that are recognized by a second transposase, which can integrate the second transposon into the genome of the host cell. The first transposase and the second transposase may be the same, or they may be different. The first transposon may further comprise a gene encoding a first selectable marker expressible in the host cell, and the second transposon may further comprise a gene encoding a second selectable marker expressible in the host cell. The first and second selectable markers may be the same or they may be different.

[0103] Suitable enhancers, promoters, IRESes, introns, and polyadenylation sequences include those otherwise disclosed herein.

IX. Production of a 3- or 4-chain multi-specific antibody using a single transposon, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0104] Another example configuration (700), in which a single transposon may be used to encode a 4-chain multi-specific antibody, is shown in Figure 7. The transposon comprises a first (710) and second (720) transcriptional unit expressible in the host cell. The first transcriptional unit (710) comprises first (716) and second (726) ORFs operably linked by a first IRES (715). The first ORF (716) is further operably linked to regulatory elements (e.g., optionally an enhancer (712), a promoter (714), and optionally an intron (717)), and the second ORF (726) is further operably linked to a polyadenylation sequence (718). The two ORFs (716, 726) encode a first polypeptide chain and a second polypeptide chain, wherein the first and second polypeptide chains comprise a first half-antibody. The second transcriptional unit (720) comprises a third (736) and fourth (746) ORF operably linked by a second IRES (725). The third ORF (736) is further operably linked to regulatory elements (e.g., optionally an enhancer (722), a promoter (724), and optionally an intron (727)), and the fourth ORF (746) is further operably linked to a polyadenylation sequence (728). The two ORFs (736, 746) encode a third polypeptide chain and a fourth polypeptide chain, wherein the third and fourth polypeptide chains comprise a second half-antibody. The 4-chain multi-specific antibody comprises the first half-antibody and the second half-antibody. Each transcriptional unit may be configured to express an optimal ratio of the two polypeptide chains comprising the half-antibody, including the order of the ORFs as described with respect to Figure 3.

[0105] With further reference to Figure 7, the transposon has ends (701, 702) that are recognized by a transposase, which can integrate the transposon into the genome of the host cell. The transposon may further comprise a gene encoding a selectable marker expressible in the host cell. Integration of a DNA segment comprising a gene encoding a selectable marker into the genome of a host cell provides a selective advantage to the cell under certain culture conditions. For example, a gene encoding glutamine synthetase may provide a selective advantage when the cells are grown under conditions of limiting glutamine, or in the presence of an inhibitor of glutamine synthetase such as methionine sulfoximine (MSX). A gene encoding dihydrofolate reductase may provide a selective advantage when the cells are grown under conditions of limiting folate, or in the presence of an inhibitor of dihydrofolate reductase such as methotrexate (MTX). A gene encoding an enzyme that detoxifies a cytotoxic drug may provide a selective advantage when cells are grown in the presence of the drug. Example drugs for which selectable markers are available to enable mammalian cells to grow in their presence include neomycin, G418, blasticidin, hygromycin, zeocin, ouabain, and puromycin.

[0106] Suitable enhancers, promoters, IRESes, introns, and polyadenylation sequences include those otherwise disclosed herein. For example, in one aspect, a suitable combination for the “middle” section of the vector configuration shown in Figure 7, that is, first polyA sequence (718), second enhancer (722), second promoter (724), and second intron (727) may be represented by any of SEQ ID NOs: 297-300.

X. Production of a 3- or 4-chain multi-specific antibody using two transposons, each transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0107] Another example configuration (800) of a pair of transposons that may be used to encode a 4-chain multi-specific antibody is shown in Figures 8A and 8B. The first transposon comprises a first (810) and second (820) transcriptional unit expressible in the host cell. The first transcriptional unit (810) comprises first (816) and second (826) ORFs operably linked by a first IRES (815). The first ORF (816) is further operably linked to regulatory elements (e.g., optionally an enhancer (812), a promoter (814), and optionally an intron (817)), and the second ORF (826) is further operably linked to a polyadenylation sequence (818). The two ORFs (816, 826) encode a first polypeptide chain and a second polypeptide chain, wherein the first and second polypeptide chains comprise a first half-antibody. The second transcriptional unit (820) comprises a third (836) and fourth (846) ORF operably linked by a second IRES (825). The third ORF (836) is further operably linked to regulatory elements (e.g., optionally an enhancer (822), a promoter (824), and optionally an intron (827)), and the fourth ORF (846) is further operably linked to a polyadenylation sequence (828). The third (836) and fourth (846) ORFs may also encode the first polypeptide chain and the second polypeptide chain, wherein the first and second polypeptide chains comprise a first half-antibody. The second transposon comprises a third (830) and fourth (840) transcriptional unit expressible in the host cell. The third transcriptional unit (830) comprises fifth (856) and sixth (866) ORFs operably linked by a third IRES (835). The fifth ORF (856) is further operably linked to regulatory elements (e.g., optionally an enhancer (832), a promoter (834), and optionally an intron (837)), and the sixth ORF (866) is further operably linked to a polyadenylation sequence (838). 
The two ORFs (856, 866) encode a third polypeptide chain and a fourth polypeptide chain, wherein the third and fourth polypeptide chains comprise a second half-antibody. The fourth transcriptional unit (840) comprises a seventh (876) and eighth (886) ORF operably linked by a fourth IRES (845). The seventh ORF (876) is further operably linked to regulatory elements (e.g., optionally an enhancer (842), a promoter (844), and optionally an intron (847)), and the eighth ORF (886) is further operably linked to a polyadenylation sequence (848). The seventh (876) and eighth (886) ORFs may also encode the third polypeptide chain and the fourth polypeptide chain, wherein the third and fourth polypeptide chains comprise the second half-antibody.

[0108] With further reference to Figures 8A and 8B, the first transposon has ends (801, 802) that are recognized by a first transposase, which can integrate the first transposon into the genome of the host cell. The second transposon has ends (803, 804) that are recognized by a second transposase, which can integrate the second transposon into the genome of the host cell. The first transposase and the second transposase may be the same, or they may be different. The first transposon may further comprise a gene encoding a first selectable marker expressible in the host cell, and the second transposon may further comprise a gene encoding a second selectable marker expressible in the host cell. The first and second selectable markers may be the same, or they may be different.

[0109] Each transposon may be configured to express an optimal ratio of the two polypeptide chains comprising the half-antibody, including the order of the ORFs as described with respect to Figure 3.

[0110] Suitable enhancers, promoters, IRES sequences, introns, and polyadenylation sequences include those otherwise disclosed herein. For example, in one aspect, a suitable combination for the “middle” section of the DNA constructs shown in Figures 8A and/or 8B, that is, either or both of first and third polyA sequences (818, 838), second and fourth enhancers (822, 842), second and fourth promoters (824, 844), and second and fourth introns (827, 847) may be represented by any of SEQ ID NOs: 297-300.

EXAMPLES

Example 1: Production of 2-chain antibodies using first and second transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs, wherein the DNA construct comprises a transposon

[0111] Three transposons were constructed having configurations similar to that shown in Figure 3 in order to produce monoclonal antibodies. The transposons each comprised a first transcriptional unit with a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) operably linked to a first ORF encoding an antibody light chain with a first secretion signal, a first variable region and a first kappa constant region, the first kappa constant region with amino acid sequence SEQ ID NO: 284 and nucleotide sequence SEQ ID NO: 292, and a second ORF encoding an antibody heavy chain with a second secretion signal, a second variable region, and a first IgG1 constant region, the first IgG1 constant region with amino acid sequence 99% identical to SEQ ID NO: 282 and nucleotide sequence 99% identical to SEQ ID NO: 288. The two ORFs in the first transcriptional unit were operably linked by an IRES with nucleotide sequence SEQ ID NO: 248 to the 3’ of the first ORF and immediately to the 5’ of the second ORF. The first transcriptional unit also comprised a polyA signal with nucleotide sequence SEQ ID NO: 245 to the 3’ of the second ORF. The transposons each further comprised a second transcriptional unit with a mouse CMV enhancer (SEQ ID NO: 119) and a hybrid mouse/human CMV promoter (SEQ ID NO: 151) operably linked to a third ORF, which, in this example, encoded an antibody light chain with a third secretion signal, a third variable region, and a second kappa constant region, the second kappa constant region with amino acid sequence SEQ ID NO: 284 and nucleotide sequence SEQ ID NO: 293, and a fourth ORF encoding an antibody heavy chain with a fourth secretion signal, a fourth variable region, and a second IgG1 constant region, the second IgG1 constant region with amino acid sequence 99% identical to SEQ ID NO: 282 and nucleotide sequence 99% identical to SEQ ID NO: 289. The two ORFs in the second transcriptional unit were operably linked by an IRES with nucleotide sequence SEQ ID NO: 248 to the 3’ of the third ORF and immediately to the 5’ of the fourth ORF. The second transcriptional unit also comprised a polyA signal with nucleotide sequence SEQ ID NO: 243 to the 3’ of the fourth ORF. The two mature light chains in the transposons had identical amino acid sequences: the first variable region and the third variable region had identical amino acid sequences, and the first kappa constant region and the second kappa constant region had identical amino acid sequences. However, the nucleotide sequences of the first and third ORFs were different. The two mature heavy chains had identical amino acid sequences: the second variable region and the fourth variable region had identical amino acid sequences, and the first IgG1 constant region and the second IgG1 constant region had identical amino acid sequences. However, the nucleotide sequences of the second and fourth ORFs were different.

[0112] The transposons each further comprised a glutamine synthetase selectable marker. Each transposon was transfected with a corresponding transposase into its own pool of CHO cells lacking a functional glutamine synthetase gene and grown in media lacking glutamine to select for cells with transposons integrated into their genomes. Once the cells had recovered, the pools of recovered cells were grown in a 14-day fed batch, and the concentration of secreted antibody in the media was measured in the supernatant. The antibody concentrations in the media for cell pools derived from the example transposons are shown in Figure 9, Table 1, rows 2 (mAb1 dual IRES), 4 (mAb2 dual IRES), and 6 (mAb3 dual IRES).

Comparative Example 1: Production of 2-chain antibodies using first and second transcriptional units, each comprising a CMV enhancer, a promoter, and an ORF, but not including an IRES

[0113] Three transposons were constructed having configurations similar to that shown in Figure 1 in order to produce the same monoclonal antibodies as produced in Example 1. The transposons each comprised a first transcriptional unit with a first ORF encoding an antibody light chain with a secretion signal, a variable region, and a kappa constant region, the kappa constant region with amino acid sequence SEQ ID NO: 284 and nucleotide sequence SEQ ID NO: 292. The first ORF was operably linked to a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 245 to its 3’ end. The transposons each further comprised a second transcriptional unit with a second ORF encoding an antibody heavy chain with a secretion signal, a variable region, and an IgG1 constant region, the IgG1 constant region with amino acid sequence 99% identical to SEQ ID NO: 282 and a nucleotide sequence 99% identical to SEQ ID NO: 288. The second ORF was operably linked to a mouse CMV enhancer (nucleotide sequence SEQ ID NO: 119), a hybrid mouse/human CMV promoter (nucleotide sequence SEQ ID NO: 151), and a hybrid intron (SEQ ID NO: 208) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 243 to its 3’ end.

[0114] The transposons each further comprised a glutamine synthetase selectable marker. Each transposon was transfected with a corresponding transposase into its own pool of CHO cells lacking a functional glutamine synthetase gene and grown in media lacking glutamine to select for cells with transposons integrated into their genomes. Once the cells had recovered, the pools of recovered cells were grown in a 14-day fed batch, and the concentration of secreted antibody in the media was measured in the supernatant. The antibody concentrations in the media for cell pools derived from the transposons are shown in Figure 9, Table 1, rows 1 (mAb1), 3 (mAb2), and 5 (mAb3).

[0115] As can be seen in Table 1, for each of the antibodies, at every time point measured, the amount of antibody produced was higher for the cell line whose genome comprised the transposon with a dual IRES configuration (i.e., having a configuration similar to Figure 3) compared to the cell line from Comparative Example 1 whose genome comprised a transposon lacking an IRES element (i.e., having a configuration similar to Figure 1).

Example 2: Production of a 4-chain multi-specific antibody using two transposons, each transposon comprising two transcriptional units, each transcriptional unit comprising an ORF, and each ORF is operably linked to regulatory elements such that the ORFs are expressible in the host cell

[0116] A first transposon with a configuration similar to that shown in Figure 5A (“A: First transposon”) was constructed to produce a first and second polypeptide chain, which together comprise a first half-antibody. The first transposon comprised a first transcriptional unit with a first ORF encoding a first polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and an IgG CH1 domain, the first ORF operably linked to a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 245 to the 3’ of the first ORF. After removal of the secretion signal, the first polypeptide had a mature amino acid sequence SEQ ID NO: 318. The first transposon further comprised a second transcriptional unit with a second ORF encoding a second polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, a kappa constant region, and an IgG1 Fc constant region comprising “hole” mutations Y349C, T366S, L368A and Y407V (EU numbering), the second ORF operably linked to a mouse CMV enhancer (SEQ ID NO: 119), a mouse CMV promoter (SEQ ID NO: 127), and a human CMV intron A (SEQ ID NO: 190) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 243 to the 3’ of the second ORF. After removal of the secretion signal, the second polypeptide had a mature amino acid sequence SEQ ID NO: 320.

[0117] A second transposon with a configuration similar to that shown in Figure 5B (“B: Second transposon”) was constructed to produce a third and fourth polypeptide chain, which together comprise a second half-antibody. The second transposon comprised a third transcriptional unit with a third ORF encoding a third polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and a kappa constant region, the third ORF operably linked to a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 245 to the 3’ of the third ORF. After removal of the secretion signal, the third polypeptide had a mature amino acid sequence SEQ ID NO: 319. The second transposon further comprised a fourth transcriptional unit with a fourth ORF encoding a fourth polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and an IgG1 constant region comprising “knob” mutations S354C and T366W, the fourth ORF operably linked to a mouse CMV enhancer (SEQ ID NO: 119), a mouse CMV promoter (SEQ ID NO: 127), and a human CMV intron A (SEQ ID NO: 190) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 243 to the 3’ of the fourth ORF. After removal of the secretion signal, the fourth polypeptide had a mature amino acid sequence SEQ ID NO: 321.

[0118] The first and second transposons each comprised transposon ends recognized and transposable by the same piggyBac-like transposase. Each transposon comprised a left transposon end comprising, from 5’ to 3’, a tetranucleotide 5’-TTAA-3’, a transposon ITR with nucleotide sequence SEQ ID NO: 3, and an additional transposon end with nucleotide sequence SEQ ID NO: 6. Each transposon comprised a right transposon end comprising, from 5’ to 3’, an additional transposon end with nucleotide sequence SEQ ID NO: 7, a transposon ITR with nucleotide sequence SEQ ID NO: 4, and a tetranucleotide 5’-TTAA-3’. The first and second transposons each also comprised a selectable marker: a glutamine synthetase ORF operably linked to 5’ and 3’ regulatory elements expressible in a mammalian cell. The first and second transposons were co-transfected with mRNA encoding a corresponding transposase (with amino acid sequence SEQ ID NO: 16) into a pool of CHO cells, the CHO cells lacking expression of a functional glutamine synthetase enzyme. Three different pools of CHO cells were prepared, wherein different amounts of first and second transposons were introduced. In the first transfection, 12.5 µg of each transposon were introduced with 3 µg of mRNA encoding the corresponding transposase into a first pool of 5 million CHO cells (Sample 1 in Figure 12, Table 2); in the second transfection, 8.3 µg of the first transposon and 16.7 µg of the second transposon were introduced with 3 µg of mRNA encoding the corresponding transposase into a second pool of 5 million CHO cells (Sample 2 in Figure 12, Table 2); in the third transfection, 16.7 µg of the first transposon and 8.3 µg of the second transposon were introduced with 3 µg of mRNA encoding the corresponding transposase into a third pool of 5 million CHO cells (Sample 3 in Figure 12, Table 2). In addition, a first control pool was prepared by transfecting only with 25 µg of the first transposon and 3 µg of mRNA encoding the corresponding transposase into a pool of 5 million CHO cells, and a second control pool was prepared by transfecting only with 25 µg of the second transposon and 3 µg of mRNA encoding the corresponding transposase into a pool of 5 million CHO cells. 
Each pool was allowed to recover for 48 hours post-transfection and placed in media lacking glutamine until the cells recovered, thereby selecting for a pool of CHO cells whose genomes stably comprise the transfected transposon(s). Each recovered pool was grown in a 14-day fed batch production process. At the end of 14 days, the culture was harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by HIC-HPLC (hydrophobic interaction chromatography by high-pressure liquid chromatography) to quantify the heterodimer and homodimer content in the purified samples. Protein was loaded onto a HIC column (MabPac HIC (4.6x100 mm), Thermo Scientific) and a salt-based gradient was used to elute the samples. Samples were diluted in an equal volume of dilution buffer (20 mM Sodium Phosphate, pH 6.5, 2 M NaCl), and a gradient was run between 100% Buffer A (20 mM Sodium Phosphate, pH 6.5, 1 M NaCl) and 100% Buffer B (20 mM Sodium Phosphate, pH 6.5) to elute the samples. Chromatographic peaks were assigned by comparison with homodimer and heterodimer standards and quantified by calculating the areas under the curve.
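The area-under-the-curve quantification described above can be illustrated with a minimal sketch. It assumes a chromatogram exported as paired retention-time and detector-signal values, peak windows that have already been assigned by comparison with standards, and a flat baseline at zero; the function names, the toy trace, and the window boundaries are all hypothetical and are not taken from the figures.

```python
def trapezoid_area(times, signal):
    """Area under a sampled curve by the trapezoid rule."""
    area = 0.0
    for i in range(1, len(times)):
        area += (times[i] - times[i - 1]) * (signal[i] + signal[i - 1]) / 2.0
    return area


def species_percentages(times, signal, windows):
    """Percent of total integrated area falling in each named peak window.

    windows maps a species name to (start_time, end_time), inclusive.
    No baseline subtraction is performed (an assumption of this sketch).
    """
    areas = {}
    for name, (start, end) in windows.items():
        points = [(t, s) for t, s in zip(times, signal) if start <= t <= end]
        areas[name] = trapezoid_area([p[0] for p in points],
                                     [p[1] for p in points])
    total = sum(areas.values())
    return {name: 100.0 * area / total for name, area in areas.items()}


# Toy trace with two triangular peaks (invented values, for illustration only).
times = [0, 1, 2, 3, 4, 5, 6]
signal = [0, 2, 0, 0, 6, 0, 0]
windows = {"hole-related": (0, 2), "heterodimer": (3, 5)}

print(species_percentages(times, signal, windows))
```

With real data the windows would be set from the retention times of the homodimer and heterodimer standards, and a baseline correction would normally precede integration.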

[0119] Figure 10 shows the major fully-assembled molecular species (that is, those that contain four polypeptide chains) that can be formed from combinations of the first and second polypeptides encoded by the ORFs on the first transposon and the third and fourth polypeptides encoded by the ORFs on the second transposon. Light chain mispairing is minimized/eliminated by swapping the kappa constant region (now present as CL on the second polypeptide) with the CH1 constant domain (now present as CH1 in the first polypeptide) in one half of the molecule (the half encoded on transposon 1). The major mis-assembly products that can be formed are unpaired half-antibodies and self-associated half-antibodies to form 4-chain homodimers. These species can be separated and identified using HIC. Figure 11 shows HIC traces of the protein A-binding material from the control pools transfected with either transposon 1 (encoding only the “hole” half-antibody) or transposon 2 (encoding only the “knob” half-antibody). Since each of these pools lacks one half of the 4-chain multi-specific antibody, they can be used to identify the HIC peaks associated with each half-antibody and its homodimer. These are annotated as “hole-related impurities” and “knob-related impurities” on Figure 11.

[0120] The protein A-purified materials from the first, second, and third pools were analyzed by HIC and are shown as traces A, B, and C respectively on Figure 11. A new major peak appears in the traces from these pools that was not seen in either of the control pools: this is the desired heterodimer peak. The areas under each peak were integrated to quantify the relative amounts of heterodimer compared to the hole-related half-antibody and homodimer (hole-related contaminants) and the knob-related half-antibody and homodimer (knob-related contaminants). Figure 12 (Table 2) shows that for each of the three co-transfections, the pools of cells were highly productive, making between 3,574 and 4,266 mg of protein A-purifiable material per liter of culture. Table 2 also shows that changing the relative proportions of the first and second transposon used in the co-transfection changed which half-antibody was produced in excess. Equal amounts of the first and second transposon resulted in 3,776 mg/L of protein A-purifiable material, nearly 82% of which was heterodimer and 15% knob-related contaminants (Table 2, Sample 1; Figure 11, Trace A). The knob-associated ORFs on transposon 2 were thus more highly expressed. Transfecting twice as much of the second transposon as the first resulted in more material overall (4,266 mg/L), but only about 61% was heterodimer; the remainder, nearly 40%, was knob-related contaminants (Table 2, Sample 2; Figure 11, Trace B). By contrast, transfecting twice as much of the first transposon as the second yielded 3,574 mg/L of protein A-purifiable material, 79% heterodimer with no knob-related contaminants but with 13.5% hole-related contaminants (Table 2, Sample 3; Figure 11, Trace C). 
Thus, co-transfection of two transposons, each comprising ORFs encoding one half-antibody of a 4-chain multi-specific antibody, is an experimentally convenient way to obtain variation in relative expression levels of the two half-antibodies comprising the 4-chain multi-specific antibody. A first transposon and a second transposon may be transfected at relative ratios of 1:1, or this ratio may be varied, for example, with one transposon being present in, e.g., at least a 10% excess over the other in the transfection mix, up to, e.g., a 100-fold excess over the other in the transfection mix. Example transposon ratios include about 10:9, about 10:8, about 10:7, about 10:6, about 10:5, about 10:4, about 10:3, about 10:2, about 10:1, about 20:1, about 30:1, about 40:1, about 50:1, about 60:1, about 70:1, about 80:1, about 90:1, about 100:1, or any other value between any two of those ratios. This manipulation allows the user to select a pool for further work that produces a high amount of the desired molecule, but it also allows the user to select a pool where the contaminants can be chosen to minimize downstream-processing challenges.
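As a worked illustration of how a chosen transposon ratio translates into the amounts used in a co-transfection, the sketch below splits a fixed total mass of transposon DNA between the two transposons. The function name and signature are hypothetical; the only values taken from Example 2 are the total of 25 mass units per transfection and the 1:1, 1:2, and 2:1 splits, and the result is in whatever unit the total is given in.

```python
def split_dna_mass(total_mass, ratio_first, ratio_second):
    """Split total_mass between two transposons at ratio_first:ratio_second.

    Returns (mass_of_first, mass_of_second) in the same unit as total_mass.
    """
    if ratio_first <= 0 or ratio_second <= 0:
        raise ValueError("both ratio terms must be positive")
    mass_first = total_mass * ratio_first / (ratio_first + ratio_second)
    return mass_first, total_mass - mass_first


# The three co-transfections of Example 2, each with a total of 25 mass units.
for label, (a, b) in {"Sample 1": (1, 1),
                      "Sample 2": (1, 2),
                      "Sample 3": (2, 1)}.items():
    first, second = split_dna_mass(25, a, b)
    print(f"{label}: first={first:.1f}, second={second:.1f}")
```

The same helper covers the wider ratios listed above, for example `split_dna_mass(25, 10, 1)` for a 10:1 mix.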

Example 3: Production of a 4-chain multi-specific antibody using a transposon, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0121] The same 4-chain multi-specific antibody described in Example 2 was encoded on a series of nine different transposons (Samples 1-9 of Figure 13, Table 3) configured generally as shown in Figure 7. The first, second, third, and fourth ORFs encoded the first, second, third, and fourth polypeptide chains as described in Example 2 and as shown in Figure 10. Enhancers, promoters, introns, and polyA regions operably linked to the ORFs were varied as shown in Figure 13 (Table 3), with element names assigned as shown in Figure 7. IRES 715 and IRES 725 operably linking ORFs 716 and 726 and ORFs 736 and 746, respectively, were both the EMCV IRES with nucleotide sequence SEQ ID NO: 248.

[0122] Each transposon comprised a left transposon end comprising, from 5’ to 3’, a tetranucleotide 5’-TTAA-3’, a transposon ITR with nucleotide sequence SEQ ID NO: 3, and an additional transposon end with nucleotide sequence SEQ ID NO: 6. Each transposon comprised a right transposon end comprising, from 5’ to 3’, an additional transposon end with nucleotide sequence SEQ ID NO: 7, a transposon ITR with nucleotide sequence SEQ ID NO: 4, and a tetranucleotide 5’-TTAA-3’. Each transposon also comprised a selectable marker: a glutamine synthetase ORF operably linked to 5’ and 3’ regulatory elements expressible in a mammalian cell. Each transposon (25 µg) was co-transfected with 3 µg of mRNA encoding a corresponding transposase (with amino acid sequence SEQ ID NO: 16) into a pool of 5 million CHO cells. Each pool was allowed to recover for 48 hours post-transfection and was placed in media lacking glutamine until the cells recovered, thereby selecting for a pool of CHO cells whose genomes stably comprise the transfected transposon. Each recovered pool was grown in a 14-day fed batch production process. At the end of 14 days, the culture was harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by HIC, as described in Example 2. Traces are shown in Figure 14. Peak areas were integrated to quantify the amounts of different molecular species present in the purified material. The results of quantitation are shown in Figure 13, Table 3.

[0123] By altering the identities of regulatory elements operably linked to the different ORFs, the overall productivity of the cells may be modulated, as well as the relative levels of the first (hole) and second (knob) half antibodies. For example, Samples 1 and 2 have a human CMV enhancer operably linked to the first and second ORFs encoding the hole half-antibody. Cell pools whose genomes comprise these transposons produce an excess of the knob half-antibody, and they therefore produce knob-related contaminants. Samples 3 and 4 are otherwise identical to Samples 1 and 2, except that the human CMV enhancer operably linked to the first and second ORFs has been replaced by a mouse CMV enhancer. The result is an excess of the hole half-antibody and hole-related contaminants.

Example 4: Production of a 4-chain multi-specific antibody with a common light chain using a transposon, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0124] A series of 10 transposons (Samples P1-P10 of Figure 15, Table 4), configured generally as shown in Figure 7, were constructed to express a first and second polypeptide chain, which together comprised a first half-antibody, and a third and fourth polypeptide chain, which together comprised a second half-antibody. The first and second half-antibodies comprised a 4-chain multi-specific antibody. The two mature light chains of this multi-specific antibody were identical, a format referred to as a “common light chain.”

[0125] Each transposon comprised a left transposon end comprising, from 5’ to 3’, a tetranucleotide 5’-TTAA-3’, a transposon ITR with nucleotide sequence SEQ ID NO: 3, and an additional transposon end with nucleotide sequence SEQ ID NO: 6. Each transposon comprised a right transposon end comprising, from 5’ to 3’, an additional transposon end with nucleotide sequence SEQ ID NO: 7, a transposon ITR with nucleotide sequence SEQ ID NO: 4, and a tetranucleotide 5’-TTAA-3’. Each transposon comprised a first transcriptional unit with a first ORF encoding a first polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and a kappa domain. The first transcriptional unit further comprised a second ORF encoding a second polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and an IgG4 constant domain. Each transposon further comprised a second transcriptional unit with a third ORF encoding a third polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and a kappa domain. The second transcriptional unit further comprised a fourth ORF encoding a fourth polypeptide comprising, from N-to-C terminus, a secretion signal, an antibody variable region, and an IgG4 constant domain. After removal of the secretion signals, the first and third polypeptides had identical mature amino acid sequence SEQ ID NO: 322.

[0126] Each transposon comprised two ORFs encoding two heavy chain polypeptides. HC1 had mature amino acid sequence SEQ ID NO: 323, and HC2 had mature amino acid sequence SEQ ID NO: 324. Within the series of 10 transposons, HC1 was sometimes encoded as the second ORF (in the first transcriptional unit, see Figure 7); at other times, HC1 was encoded as the fourth ORF (in the second transcriptional unit, see Figure 7). Conversely, HC2 was sometimes encoded as the fourth ORF (in the second transcriptional unit); at other times, HC2 was encoded as the second ORF (in the first transcriptional unit), so that each transposon had one copy of the ORF encoding HC1 and one copy of the ORF encoding HC2. Enhancers, promoters, introns, and polyA regions operably linked to the ORFs were varied as shown in Figure 15 (Table 4), with element names assigned as shown in Figure 7. IRES 715 and IRES 725 operably linking ORFs 716 and 726 and ORFs 736 and 746, respectively, were both the EMCV IRES with nucleotide sequence SEQ ID NO: 248.

[0127] Each transposon also comprised a selectable marker: a glutamine synthetase ORF operably linked to 5’ and 3’ regulatory elements expressible in a mammalian cell. Each transposon (25 µg) was co-transfected with 3 µg of mRNA encoding a corresponding transposase (with amino acid sequence SEQ ID NO: 16) into a pool of 5 million CHO cells. Each pool was allowed to recover for 48 hours post-transfection and placed in media lacking glutamine until the cells recovered, thereby selecting for a pool of CHO cells whose genomes stably comprise the transfected transposons. Each recovered pool was grown in a 14-day fed batch production process. At the end of 14 days, the culture was harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by cIEX-HPLC to quantify the heterodimer and homodimer content in the purified samples. Protein was loaded onto a cation exchange column (BioLC ProPac WCX-10 (4x250 mm), Thermo Scientific), and a pH-based gradient was used to elute the samples. Samples were diluted in an equal volume of Buffer A (9.6 mM Tris Base, 6 mM PIPES, 11 mM Imidazole, 20 mM NaCl, pH 6.0), and a gradient was run between 100% Buffer A and 100% Buffer B (9.6 mM Tris Base, 6 mM PIPES, 11 mM Imidazole, 20 mM NaCl, pH 10.0) to elute the samples. Chromatographic peaks were assigned by comparison with homodimer and heterodimer standards and quantified by calculating the areas under the curve. Chromatographic traces for each sample are shown in Figure 16.

[0128] By altering the identities of regulatory elements operably linked to the different ORFs, the overall productivity of the cells may be modulated, as well as the relative levels of the first (HC1-containing) and second (HC2-containing) half antibodies. For example, with reference to Figure 15, Table 4, cell pools P5 and P6 are identical, except that in P5, HC2 is in the first transcriptional unit, while in P6, HC1 is in the first transcriptional unit. The overall productivities of the two pools are very comparable: around 2.5 g/L (column L); but when HC1 is in the first transcriptional unit (P6), only 60% of the product is heterodimer, with the rest being LC-HC1 half antibody (column O). When HC2 is in the first transcriptional unit (P5), heterodimer content improves to 69% of the product, and there is an excess of LC-HC2 (column M). Use of different regulatory elements in P8 improves both the productivity (>2.9 g/L) and the heterodimer content (>86%).

[0129] Different regulatory element configurations may be preferable for obtaining optimal results with different molecules. Figure 13, Table 3 shows that for the 4-chain multi-specific antibody described in Examples 2 and 3, the configuration shown in Sample 6 (mouse CMV enhancer and promoter operably linked to transcriptional unit 1, and mouse CMV enhancer, promoter, and human intron A operably linked to transcriptional unit 2) with the second highest productivity and the second highest heterodimer content is superior to the configuration shown in Sample 2 (human CMV enhancer, human EF1 promoter, and human EF1 intron operably linked to transcriptional unit 1, and human CMV enhancer, mouse EF1 promoter, and mouse EF1 intron operably linked to transcriptional unit 2). These same configurations are shown for the common light chain molecule described in this Example 4, as Figure 15, Table 4, Samples P4 and P8, respectively. For the common light chain molecule, the first configuration shown in Figure 15, Table 4, P4 (mouse CMV enhancer and promoter operably linked to transcriptional unit 1, and mouse CMV enhancer and promoter and human intron A operably linked to transcriptional unit 2), with a productivity of 2.6 g/L of which 76% is heterodimer, is inferior to the second configuration (human CMV enhancer, human EF1 promoter, and human EF1 intron operably linked to transcriptional unit 1, and human CMV enhancer, mouse EF1 promoter, and mouse EF1 intron operably linked to transcriptional unit 2) shown in Figure 15, Table 4, P8, with a productivity of 3.0 g/L, of which 87% is heterodimer.
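
One way to compare these pools on a single number is to multiply productivity by heterodimer fraction. The sketch below uses only the rounded values quoted in the text above. The combined "heterodimer titer" metric and the function name `heterodimer_titer` are illustrative assumptions rather than quantities reported in the tables, and the calculation assumes that the heterodimer percentage measured on protein A-purified product applies to the whole reported productivity.

```python
def heterodimer_titer(productivity_g_per_l: float, heterodimer_percent: float) -> float:
    """Estimated heterodimer yield in g/L (illustrative combined metric)."""
    return productivity_g_per_l * heterodimer_percent / 100.0


# (productivity in g/L, heterodimer %) as quoted in the text for each pool.
pools = {
    "P4": (2.6, 76.0),
    "P5": (2.5, 69.0),
    "P6": (2.5, 60.0),
    "P8": (3.0, 87.0),
}

titers = {name: heterodimer_titer(prod, pct) for name, (prod, pct) in pools.items()}

for name, value in sorted(titers.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f} g/L heterodimer (estimated)")
```

On this estimate P8 ranks first, consistent with the comparison drawn in the text.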

[0130] In sum, the general format shown in Figure 7 is a highly adaptable format that can be used generally for expression of 3- and 4-chain multi-specific antibodies. Modification of the identities of regulatory elements can be used to compensate for differences in chain expression levels and assembly characteristics. The regulatory elements can also be varied in order to preferentially produce a combination of products from which the desired heterodimer may be most easily purified.

Example 5: Clonal distribution of productivity and heterodimer content for cells expressing a 4-chain multi-specific antibody with a common light chain encoded on a single transposon, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0131] Sixteen individual monoclonal cell lines were derived from the pool of cells generated as described in Example 4 using the transposon shown in P8 of Figure 15, Table 4. Each monoclonal line was grown in a 14-day fed batch production process. At the end of 14 days, or at an earlier time if cell viability fell below 80%, the cultures were harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by cIEX-HPLC to quantify the heterodimer and homodimer content in the purified samples. Protein was loaded onto a cation exchange column (BioLC ProPac WCX-10 (4x250 mm), Thermo Scientific), and a pH-based gradient was used to elute the samples. Samples were diluted in an equal volume of Buffer A (9.6 mM Tris Base, 6 mM PIPES, 11 mM Imidazole, 20 mM NaCl, pH 6.0), and a gradient was run between 100% Buffer A and 100% Buffer B (9.6 mM Tris Base, 6 mM PIPES, 11 mM Imidazole, 20 mM NaCl, pH 10.0) to elute the samples. Chromatographic peaks were assigned by comparison with homodimer and heterodimer standards and quantified by calculating the areas under the curve. Yield and heterodimer % are shown in Table 5 (Figure 17).

[0132] In this example, only 16 clones were screened and nearly half of them were found to be comparable in yield and heterodimer % to the pool. The use of a single transposon, configured as shown in Figure 7, allows different regulatory elements to be used to achieve optimal expression balancing of the different chains of a multi-specific antibody. Results achieved with the pool are also obtained from monoclonal lines derived from the pool by screening (i.e., measuring productivity and product quality attributes such as heterodimer content) fewer than 1,000 monoclonal lines, fewer than 500 monoclonal lines, fewer than 400 monoclonal lines, fewer than 300 monoclonal lines, fewer than 200 monoclonal lines, or fewer than 100 monoclonal lines.

Example 6: Clonal distribution of productivity and heterodimer content for cells expressing a 4-chain multi-specific antibody encoded on a single transposon, the transposon comprising two transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs

[0133] Ten individual monoclonal cell lines were derived from the pool of cells generated as described in Example 3 using the transposon shown in Figure 13, Table 3, Sample 6. Each monoclonal line was grown in a 14-day fed batch production process. At the end of 14 days, or at an earlier time if cell viability fell below 80%, the cultures were harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by HIC-HPLC to quantify the heterodimer and homodimer content in the purified samples. Protein was loaded onto an HIC column (MabPac HIC (4.6x100 mm), Thermo Scientific), and a salt-based gradient was used to elute the samples. Samples were diluted in an equal volume of dilution buffer (20 mM Sodium Phosphate, pH 6.5, 2 M NaCl), and a gradient was run between 100% Buffer A (20 mM Sodium Phosphate, pH 6.5, 1 M NaCl) and 100% Buffer B (20 mM Sodium Phosphate, pH 6.5) to elute the samples. Chromatographic peaks were assigned by comparison with homodimer and heterodimer standards and quantified by calculating the areas under the curve. Yield and heterodimer % are shown in Table 6 (Figure 18).
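
The salt-based gradient can be made concrete with simple mixing arithmetic. The sketch below is illustrative only. It assumes ideal volumetric mixing of Buffer A (1 M NaCl) and Buffer B (no added NaCl), and it assumes that the undiluted sample contributes negligible NaCl, which the text does not state. The function names are hypothetical.

```python
def nacl_molar_at_percent_b(
    percent_b: float, nacl_a_molar: float = 1.0, nacl_b_molar: float = 0.0
) -> float:
    """NaCl concentration of the mobile phase at a given %B, assuming ideal mixing."""
    frac_b = percent_b / 100.0
    return (1.0 - frac_b) * nacl_a_molar + frac_b * nacl_b_molar


def nacl_after_equal_volume_dilution(
    sample_nacl_molar: float = 0.0, diluent_nacl_molar: float = 2.0
) -> float:
    """NaCl concentration after mixing equal volumes of sample and dilution buffer."""
    return 0.5 * (sample_nacl_molar + diluent_nacl_molar)


# Mobile-phase NaCl at a few points along the A-to-B gradient.
gradient_points = {pct: nacl_molar_at_percent_b(pct) for pct in (0, 25, 50, 75, 100)}
print(gradient_points)

# Under the negligible-salt assumption, the diluted sample is at about 1 M NaCl,
# which matches the starting condition of 100% Buffer A.
loaded_sample_nacl = nacl_after_equal_volume_dilution()
print(loaded_sample_nacl)
```

Under these assumptions, the equal-volume dilution brings the sample to roughly the same salt concentration as the starting mobile phase, and species then elute as the NaCl concentration falls.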

[0134] Only 10 clones were screened, and nearly half of them were found to be comparable in yield and heterodimer % to the pool. The use of a single transposon, configured as shown in Figure 7, allows different regulatory elements to be used to achieve optimal expression balancing of the different chains of a multi-specific antibody.

Example 7: Clonal distribution of productivity and heterodimer content for cells expressing a 4-chain multi-specific antibody encoded on a pair of transposons, each transposon comprising two transcriptional units

[0135] Ten individual monoclonal cell lines were derived from the pool of cells generated as described in Example 2 using the transposon shown in Figure 12, Table 2, Sample 3. Each monoclonal line was grown in a 14-day fed batch production process. At the end of 14 days, or at an earlier time if cell viability fell below 80%, the cultures were harvested, cells and cell debris were removed, and the resulting supernatant was purified by protein A affinity-capture. Purified protein was analyzed by HIC-HPLC to quantify the heterodimer and homodimer content in the purified samples. Protein was loaded onto an HIC column (MabPac HIC (4.6x100 mm), Thermo Scientific), and a salt-based gradient was used to elute the samples. Samples were diluted in an equal volume of dilution buffer (20 mM Sodium Phosphate, pH 6.5, 2 M NaCl), and a gradient was run between 100% Buffer A (20 mM Sodium Phosphate, pH 6.5, 1 M NaCl) and 100% Buffer B (20 mM Sodium Phosphate, pH 6.5) to elute the samples. Chromatographic peaks were assigned by comparison with homodimer and heterodimer standards and quantified by calculating the areas under the curve. Yield and heterodimer % are shown in Table 7 (Figure 19).

[0136] Only 10 clones were screened, and nearly half of them were found to be comparable in yield and heterodimer % to the pool. The use of a pair of transposons, configured as shown in Figures 5A and 5B, allows different regulatory elements and different transfection ratios to be used to achieve optimal expression balancing of the different chains of a multi-specific antibody. The productivity and product quality results seen in pools translate to those seen in monoclonal lines derived from the pools.
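
The observation that a small screen is sufficient can be framed with a simple probability sketch. Assume, hypothetically, that each clone independently has probability p of being comparable to the pool. The chance of finding at least one such clone among n screened is then 1 - (1 - p)^n. The value p = 0.5 used below is an assumption loosely based on the phrase "nearly half"; it is not a measured frequency, and the function names are illustrative.

```python
import math


def prob_at_least_one(p_match: float, n_clones: int) -> float:
    """Probability that at least one of n independent clones matches the pool."""
    return 1.0 - (1.0 - p_match) ** n_clones


def clones_needed(p_match: float, confidence: float) -> int:
    """Smallest n for which prob_at_least_one(p_match, n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_match))


p = 0.5  # assumed fraction of clones comparable to the pool (hypothetical)

p_ten = prob_at_least_one(p, 10)      # screen size used in Examples 6 and 7
p_sixteen = prob_at_least_one(p, 16)  # screen size used in Example 5
n_for_99 = clones_needed(p, 0.99)

print(p_ten, p_sixteen, n_for_99)
```

If the true fraction of comparable clones were much lower than assumed here, the required screen size would grow accordingly, which `clones_needed` makes easy to explore.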

Example 8: Production of 2-chain antibodies using first and second transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs, wherein the DNA construct comprises a transposon

[0137] A transposon was constructed having a configuration similar to that shown in Figure 3 in order to produce a monoclonal antibody. The transposon comprised a first transcriptional unit with a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) operably linked to a first ORF encoding an antibody light chain with a secretion signal, a variable region, and a lambda constant region, the first lambda constant region with amino acid sequence SEQ ID NO: 285 and nucleotide sequence SEQ ID NO: 294, and a second ORF encoding an antibody heavy chain with a secretion signal, a variable region, and an IgG1 constant region, the IgG1 constant region with amino acid sequence SEQ ID NO: 282 and nucleotide sequence 99% identical to SEQ ID NO: 288. The two ORFs in the first transcriptional unit were operably linked by an IRES with nucleotide sequence SEQ ID NO: 248 to the 3’ of the first ORF and immediately to the 5’ of the second ORF. The first transcriptional unit also comprised a polyA signal with nucleotide sequence SEQ ID NO: 245 to the 3’ of the second ORF. The transposon further comprised a second transcriptional unit with a mouse CMV enhancer (SEQ ID NO: 119), a mouse CMV promoter (SEQ ID NO: 127), and a CMV intron (SEQ ID NO: 190) operably linked to a third ORF encoding an antibody light chain with a secretion signal, a variable region, and a lambda constant region, the lambda constant region with amino acid sequence SEQ ID NO: 285 and nucleotide sequence SEQ ID NO: 325, and a fourth ORF encoding an antibody heavy chain with a secretion signal, a variable region, and an IgG1 constant region, the IgG1 constant region with amino acid sequence SEQ ID NO: 282 and nucleotide sequence SEQ ID NO: 326. The two ORFs in the second transcriptional unit were operably linked by an IRES with nucleotide sequence SEQ ID NO: 248 to the 3’ of the third ORF and immediately to the 5’ of the fourth ORF. The second transcriptional unit also comprised a polyA signal with nucleotide sequence SEQ ID NO: 243 to the 3’ of the fourth ORF. The two mature light chains in the transposon had identical amino acid sequences: the first variable region and the third variable region had identical amino acid sequences and the first lambda constant region and the second lambda constant region had identical amino acid sequences. However, the nucleotide sequences of the first and third ORFs were different. The two mature heavy chains had identical amino acid sequences: the second variable region and the fourth variable region had identical amino acid sequences, and the first IgG constant region and the second IgG constant region had identical amino acid sequences. However, the nucleotide sequences of the second and fourth ORFs were different.
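
Two ORFs can encode identical amino acid sequences while differing in nucleotide sequence because the genetic code is degenerate: most amino acids are specified by more than one codon. The sketch below illustrates this with two short, made-up DNA sequences. They are not taken from the Sequence Listing, and the codon table is deliberately partial, covering only the standard-code codons used in this example.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "CTG": "L", "TTG": "L",
    "AGC": "S", "TCT": "S",
    "AAA": "K", "AAG": "K",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence using the partial codon table."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Hypothetical ORF fragments that use different synonymous codons.
orf_a = "ATGGCTCTGAGCAAA"
orf_b = "ATGGCCTTGTCTAAG"

protein_a = translate(orf_a)
protein_b = translate(orf_b)
nucleotide_identity = percent_identity(orf_a, orf_b)

print(protein_a, protein_b, nucleotide_identity)
```

Both fragments translate to the same peptide even though only 60% of their nucleotide positions match, which is the kind of divergence that reduces repeated sequence within a single construct.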

[0138] The transposon further comprised a glutamine synthetase selectable marker. The transposon was transfected with a corresponding transposase into a pool of CHO cells lacking a functional glutamine synthetase gene and grown in media lacking glutamine to select for cells with transposons integrated into their genomes. Once cells had recovered, the pools of recovered cells were grown in a 14 day fed batch, and the concentration of secreted antibody in the media was measured in the supernatant. The antibody concentrations in the media for cell pools derived from the example transposon are shown in Figure 20, Table 8, row 1 (2x LC-IRES-HC).

Comparative Example 8A: Production of 2-chain antibodies using first and second transcriptional units, each comprising a CMV enhancer, a promoter, and an ORF, but not including an IRES

[0139] A transposon was constructed having a configuration similar to that shown in Figure 1 in order to produce the same monoclonal antibody as produced in Example 8. The transposon comprised a first transcriptional unit with a first ORF encoding an antibody light chain with a secretion signal, a variable region, and a lambda constant region, the lambda constant region with amino acid sequence SEQ ID NO: 285 and nucleotide sequence SEQ ID NO: 294. The first ORF was operably linked to a mouse CMV enhancer (SEQ ID NO: 119) and a mouse CMV promoter (SEQ ID NO: 127) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 245 to its 3’ end. The transposon further comprised a second transcriptional unit with a second ORF encoding an antibody heavy chain with a secretion signal, a variable region, and an IgG1 constant region, the IgG1 constant region with amino acid sequence SEQ ID NO: 282 and a nucleotide sequence 99% identical to SEQ ID NO: 288. The second ORF was operably linked to a mouse CMV enhancer (SEQ ID NO: 119), a mouse CMV promoter (SEQ ID NO: 127), and a CMV intron (SEQ ID NO: 190) to its 5’ end, and to a polyA signal with nucleotide sequence SEQ ID NO: 243 to its 3’ end.

[0140] The transposon further comprised a glutamine synthetase selectable marker. The transposon was transfected with a corresponding transposase into a pool of CHO cells lacking a functional glutamine synthetase gene and grown in media lacking glutamine to select for cells with transposons integrated into their genomes. Once the cells had recovered, the pools of recovered cells were grown in a 14 day fed batch, and the concentration of secreted antibody in the media was measured in the supernatant. The antibody concentrations in the media for cell pools derived from the second transposon are shown in Figure 20, Table 8, row 2 (mAb1).

[0141] As can be seen in Table 8, in contrast to the data shown in Example 1 and Comparative Example 1, at every time point measured, the amount of antibody produced was lower for the cell line whose genome comprised the transposon with a dual IRES configuration (i.e., having a configuration similar to Figure 3) compared to the cell line from Comparative Example 8A whose genome comprised a transposon lacking an IRES element (i.e., having a configuration similar to Figure 1).

[0142] A key difference between Example 8/Comparative Example 8A and Example 1/Comparative Example 1 is that in Example 1 and in Comparative Example 1, the antibody light chains expressed relatively poorly. In Comparative Example 1, where the transposon did not comprise any IRES elements (i.e., with a configuration similar to Figure 1), the ratio of expression of light chain to heavy chain was approximately 1.4:1. As explained above, light chain is necessary to help the CH1 domain of the heavy chain to fold, so without adequate light chain, antibody expression may be limited by heavy chain folding. When two ORFs occur in the same transcriptional unit, and the second ORF initiates translation using an IRES, the second ORF is typically expressed at a lower level than the first. For example, the ratio of expression of first and second ORFs may be between 1.5:1 and 5:1 or 10:1 or more, depending on the IRES. In Example 1, where the transposon comprised two transcriptional units, each comprising an IRES (i.e., with a configuration similar to Figure 3), the ratio of expression of light chain to heavy chain was increased to approximately 1.8:1. In addition to the increase in overall expression because of effective increased copy number of the genes encoding heavy and light chains, the increased level of light chain resulting from the IRES configuration helped increase overall expression of the antibody.

[0143] In contrast, in Comparative Example 8A, where the transposon did not comprise any IRES elements (i.e., with a configuration similar to Figure 1), the ratio of expression of light chain to heavy chain was approximately 3:1. Although an excess of light chain expression is required for proper folding and assembly of antibodies, levels of light chain expression that are higher than required for efficient heavy chain folding divert biosynthetic resources toward synthesis of excess light chain and away from production of fully assembled antibody. Thus, in Example 8, where the transposon comprised two transcriptional units, each comprising an IRES (i.e., with a configuration similar to Figure 3), the ratio of light chain:heavy chain was close to 5:1, suggesting total productivity was limited by excess light chain synthesis.

Comparative Example 8B: Production of 2-chain antibodies using first and second transcriptional units, each transcriptional unit comprising two ORFs and an IRES to link the ORFs, wherein the DNA construct comprises a transposon, and wherein the order of the ORFs is different in the first and second transcriptional units

[0144] A transposon was constructed having a configuration similar to that shown in Figure 3 in order to produce the same monoclonal antibody as produced in Example 8 and Comparative Example 8A. The transposon was identical to the transposon described in Example 8, with the exception that the order of the ORFs encoding the heavy and light chains in the second transcriptional unit was reversed, so that third ORF 336 in Figure 3 encoded the heavy chain and fourth ORF 346 in Figure 3 encoded the light chain.

[0145] The transposon further comprised a glutamine synthetase selectable marker. The transposon was transfected with a corresponding transposase into a pool of CHO cells lacking a functional glutamine synthetase gene and grown in media lacking glutamine to select for cells with transposons integrated into their genomes. Once the cells had recovered, the pools of recovered cells were grown in a 14 day fed batch, and the concentration of secreted antibody in the media was measured in the supernatant. The antibody concentrations in the media for cell pools derived from the transposon are shown in Figure 20, Table 8, row 3 (LC-IRES-HC/HC-IRES-LC).

[0146] As can be seen in Table 8, at every time point measured, the amount of antibody produced was higher for the cell line whose genome comprised the transposon comprising a dual IRES configuration (i.e., having a configuration similar to Figure 3) in which the second transcriptional unit comprised an ORF encoding the heavy chain, followed by an IRES, followed by an ORF encoding the light chain.
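
The positional effect of an IRES described in paragraph [0142] can be illustrated with a toy model. This is a deliberately simplified sketch and not a model taken from the source. It assumes that both transcriptional units are transcribed equally, that the ORF downstream of an IRES is expressed at 1/r of the level of the upstream ORF, and that any chain-specific difference in expression can be folded into a single factor (`intrinsic_lc_over_hc`). All names and parameter values are hypothetical, and the measured ratios reported in the Examples reflect additional effects that this sketch ignores.

```python
from typing import List, Tuple


def lc_to_hc_ratio(
    units: List[Tuple[str, str]],
    ires_fold: float,
    intrinsic_lc_over_hc: float = 1.0,
) -> float:
    """Toy LC:HC expression ratio for a set of (first ORF, second ORF) units.

    The first ORF in each unit gets weight 1; the ORF after the IRES gets
    weight 1 / ires_fold. Light chain output is scaled by the intrinsic factor.
    """
    lc = 0.0
    hc = 0.0
    for first, second in units:
        for chain, weight in ((first, 1.0), (second, 1.0 / ires_fold)):
            if chain == "LC":
                lc += weight * intrinsic_lc_over_hc
            elif chain == "HC":
                hc += weight
            else:
                raise ValueError(f"unknown chain label: {chain}")
    return lc / hc


# Same ORF order in both units (the arrangement of Example 8).
same_order = [("LC", "HC"), ("LC", "HC")]

# ORF order reversed in the second unit (the arrangement of Comparative Example 8B).
mixed_order = [("LC", "HC"), ("HC", "LC")]

r = 4.0  # hypothetical first-ORF : second-ORF expression ratio for the IRES

ratio_same = lc_to_hc_ratio(same_order, r)
ratio_mixed = lc_to_hc_ratio(mixed_order, r)
print(ratio_same, ratio_mixed)
```

In this toy model, placing the light chain first in both units multiplies the LC:HC ratio by r, whereas reversing the order in one unit cancels the positional bias. This is one way to picture how ORF order can be used to pull an excessive light chain ratio back down.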

Claims

CLAIMS What is claimed is:

1. A transposon for the production in a mammalian cell of a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the transposon comprising:

(A) a first transcriptional unit, the first transcriptional unit comprising: a first open reading frame (ORF) and a second ORF operably linked by a first internal ribosome entry sequence (IRES), and wherein the first and second ORFs are operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in a mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to the first polypeptide and a second secretion signal fused to the second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and

(B) a second transcriptional unit, the second transcriptional unit comprising: a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third and fourth ORFs are operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in a mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides.

2. The transposon of claim 1, wherein the first and second polypeptides are respectively the first and second chains of a first half antibody.

3. The transposon of claim 1, wherein the third and fourth polypeptides are respectively the first and second chains of a second half antibody.

4. The transposon of claim 1, wherein the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody.

5. The transposon of claim 1, wherein the first IRES and the second IRES are identical.

6. The transposon of claim 1, wherein the first IRES and the second IRES are different.

7. The transposon of claim 1, wherein at least one of the first and second IRESes is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus.

8. The transposon of claim 1, wherein the first IRES and the second IRES are each independently selected from SEQ ID NOs: 248-281.

9. The transposon of claim 1, wherein one of the polypeptides encoded by the first transcriptional unit has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded by the second transcriptional unit.

10. The transposon of claim 1, wherein the transposon further comprises a nucleic acid sequence encoding a selectable marker.

11. A multi-specific antibody produced by expression in a mammalian cell of the transposon according to any one of claims 1-10.

12. An isolated mammalian cell comprising the transposon according to any one of claims 1-10.

13. A monoclonal cell line prepared by isolating individual cells from a pool of cells whose genome comprises the transposon according to any one of claims 1-10.

14. A composition for the production in a mammalian cell of a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the composition comprising:

(A) a first transposon comprising a first open reading frame (ORF) and a second ORF, wherein:

(1) the first ORF is operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in the mammalian cell to express the first ORF, which encodes a first secretion signal fused to the first polypeptide, and wherein expression of the first ORF in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and removal of the first secretion signal to secrete the first polypeptide; and

(2) the second ORF is operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the second ORF, which encodes a second secretion signal fused to the second polypeptide, and wherein expression of the second ORF in the mammalian cell results in secretion of the second secretion signal fused to the second polypeptide and removal of the second secretion signal to secrete the second polypeptide; and

(B) a second transposon comprising a third ORF and a fourth ORF, wherein:

(1) the third ORF is operably linked to third regulatory elements including a third promoter, a third polyadenylation signal sequence, and optionally a third enhancer and a third intron, the third regulatory elements being active in the mammalian cell to express the third ORF, which encodes a third secretion signal fused to the third polypeptide, and wherein expression of the third ORF in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and removal of the third secretion signal to secrete the third polypeptide; and

(2) the fourth ORF is operably linked to fourth regulatory elements including a fourth promoter, a fourth polyadenylation signal sequence, and optionally a fourth enhancer and a fourth intron, the fourth regulatory elements being active in the mammalian cell to express the fourth ORF, which encodes a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the fourth ORF in the mammalian cell results in secretion of the fourth secretion signal fused to the fourth polypeptide and removal of the fourth secretion signal to secrete the fourth polypeptide.

15. The composition of claim 14, wherein the first and second polypeptides are respectively the first and second chains of a first half antibody.

16. The composition of claim 14, wherein the third and fourth polypeptides are respectively the first and second chains of a second half antibody.

17. The composition of claim 14, wherein the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody.

18. The composition of claim 14, wherein one of the polypeptides encoded for by the first transposon has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the second transposon.

19. The composition of claim 14, wherein the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by the same corresponding transposases.

20. The composition of claim 14, wherein the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by different corresponding transposases.

21. The composition of claim 14, wherein the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer a survival advantage under the same restrictive condition.

22. The composition of claim 14, wherein the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer resistance to different selective conditions.

23. A multi-specific antibody produced by expression in a mammalian cell of the first and second transposons according to any one of claims 14-22.

24. An isolated mammalian cell comprising the first and second transposons according to any one of claims 14-22.

25. A monoclonal cell line prepared by isolating single cells from a pool of cells whose genome comprises the first and second transposons according to any one of claims 14-22.

26. A composition for the production in a mammalian cell of an antibody, the composition comprising a transposon, the transposon comprising a first transcriptional unit and a second transcriptional unit, wherein:

(A) the first transcriptional unit comprises a first open reading frame (ORF) and a second ORF operably linked by a first internal ribosome entry sequence (IRES), and wherein the first ORF and the second ORF are operably linked to first regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to a first polypeptide and a second secretion signal fused to a second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and

(B) the second transcriptional unit comprises a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third ORF and the fourth ORF are operably linked to second regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides.

27. The composition of claim 26, wherein the first and second polypeptides are respectively the first and second chains of the antibody.

28. The composition of claim 26, wherein the third and fourth polypeptides are respectively the first and second chains of the antibody.

29. The composition of claim 26, wherein one of the polypeptides encoded for by the first or second ORF has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the third or fourth ORF.

30. The composition of claim 29, wherein the ORFs that encode the same polypeptides do not have identical nucleic acid sequences, thereby reducing sequence repeat regions within the transposon and increasing its stability.

31. The composition of claim 26, wherein introducing the transposon and a corresponding transposase into the mammalian cell results in a light chain:heavy chain production ratio of between about 1.5:1 to about 4:1.

32. The composition of claim 26, wherein at least one of the first and second IRESes is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus.

33. The composition of claim 26, wherein each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281.

34. The composition of claim 26, wherein the transposon further comprises a nucleic acid sequence encoding a selectable marker.

35. An antibody produced by expression in a mammalian cell of the transposon according to any one of claims 26-34.

36. An isolated mammalian cell comprising the transposon according to any one of claims 26-34.

37. A monoclonal cell line prepared by isolating single cells from a pool of cells whose genome comprises the transposon according to any one of claims 26-34.

38. A method for constructing a mammalian cell line to produce a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the method comprising:

(A) providing a first transposon comprising a first open reading frame (ORF) and a second ORF, wherein:

(2) the second ORF is operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the second ORF, which encodes a second secretion signal fused to the second polypeptide, and wherein expression of the second ORF in the mammalian cell results in secretion of the second secretion signal fused to the second polypeptide and removal of the second secretion signal to secrete the second polypeptide; and

(B) providing a second transposon comprising a third ORF and a fourth ORF, wherein:

(2) the fourth ORF is operably linked to fourth regulatory elements including a fourth promoter, a fourth polyadenylation signal sequence, and optionally a fourth enhancer and a fourth intron, the fourth regulatory elements being active in the mammalian cell to express the fourth ORF, which encodes a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the fourth ORF in the mammalian cell results in secretion of the fourth secretion signal fused to the fourth polypeptide and removal of the fourth secretion signal to secrete the fourth polypeptide; and

(C) introducing the first and second transposons and their corresponding transposases into a mammalian cell, so that the first and second transposons are integrated into the genome of the mammalian cell.

39. The method of claim 38, wherein the first and second polypeptides are respectively the first and second chains of a first half antibody.

40. The method of claim 38, wherein the third and fourth polypeptides are respectively the first and second chains of a second half antibody.

41. The method of claim 38, wherein the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody.

42. The method of claim 38, wherein one of the polypeptides encoded for by the first transposon has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the second transposon.

43. The method of claim 38, wherein the first and second transposons are introduced into the mammalian cell in an about 1:1 ratio.

44. The method of claim 38, wherein different amounts of the first and second transposons are introduced into the mammalian cell.

45. The method of claim 38, wherein the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer a survival advantage under the same restrictive condition.

46. The method of claim 38, wherein the first transposon further comprises a nucleic acid sequence that encodes for a first selectable marker, and the second transposon further comprises a nucleic acid sequence that encodes for a second selectable marker, and the first selectable marker and the second selectable marker confer resistance to different selective conditions.

47. The method of any one of claims 45 or 46, the method further comprising selecting cells that have integrated the first and second transposons into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell.

48. The method of claim 38, wherein the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by the same corresponding transposase.

49. The method of claim 38, wherein the first transposon is flanked by a first pair of transposon ends, the second transposon is flanked by a second pair of transposon ends, and the first and second transposons are transposed by different corresponding transposases.

50. The method of claim 38, wherein the method further comprises growing cells that have integrated the first and second transposons into their genomes under conditions where the cells produce and secrete the multi-specific antibody.

51. The method of claim 38, further comprising preparing a plurality of pools of the mammalian cells, wherein each pool of cells has a different ratio of the first and second transposons introduced.

52. The method of claim 51, further comprising comparing the amount of the multi-specific antibody and the amount of one or more incorrect assembly products produced by each pool in the plurality of pools and identifying a preferred pool for production of the multi-specific antibody.

53. The method of claim 52, further comprising preparing monoclonal cell lines from a pool of cells whose genome comprises the first and second transposons.

54. The method of claim 53, further comprising determining the amount of multi-specific antibody and the amount of one or more incorrect assembly products produced by each monoclonal cell line.

55. The method of claim 54, wherein the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, and fewer than 30.

56. Producing the multi-specific antibody using the mammalian cell line of claim 38.

57. A method for constructing a mammalian cell line to produce a multi-specific antibody comprised of a first polypeptide, a second polypeptide, a third polypeptide, and a fourth polypeptide, the method comprising:

(A) providing a transposon comprising a first transcriptional unit and a second transcriptional unit, wherein:

(1) the first transcriptional unit comprises a first open reading frame (ORF) and a second ORF operably linked by a first internal ribosome entry sequence (IRES), and wherein the first and second ORFs are operably linked to first regulatory elements including a first promoter, a first polyadenylation signal sequence, and optionally a first enhancer and a first intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to the first polypeptide and a second secretion signal fused to the second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and

(2) the second transcriptional unit comprises a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third and fourth ORFs are operably linked to second regulatory elements including a second promoter, a second polyadenylation signal sequence, and optionally a second enhancer and a second intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides; and

(B) introducing the transposon and a corresponding transposase into the mammalian cell, so that the transposon is integrated into the genome of the mammalian cell.

58. The method of claim 57, wherein the first and second polypeptides are respectively the first and second chains of a first half antibody.

59. The method of claim 57, wherein the third and fourth polypeptides are respectively the first and second chains of a second half antibody.

60. The method of claim 57, wherein the first polypeptide is a component of a first half antibody, and the second polypeptide is a component of a second half antibody.

61. The method of claim 57, wherein at least one of the first and second IRESes is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus.

62. The method of claim 57, wherein each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281.

63. The method of claim 57, wherein one of the polypeptides encoded by the first transcriptional unit has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded by the second transcriptional unit.

64. The method of claim 57, wherein the transposon further comprises a nucleic acid sequence that encodes for a selectable marker.

65. The method of claim 64, further comprising selecting cells that have integrated the transposon into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell.

66. The method of claim 57, further comprising growing cells that have integrated the transposon into their genomes under conditions where the cells produce and secrete the multi-specific antibody.

67. The method of claim 57, further comprising providing a plurality of transposons, wherein different regulatory elements are operably linked to the first or second transcriptional units, or the relative positions of the ORFs are modified, and different relative expression levels of each polypeptide are thereby obtained.

68. The method of claim 67, further comprising preparing a plurality of pools of mammalian cells, wherein each pool of cells has a different transposon from the plurality of transposons introduced.

69. The method of claim 68, wherein more than one transposon from the plurality of transposons is introduced.

70. The method of claim 69, further comprising comparing the amount of the multi-specific antibody and the amount of one or more incorrect assembly products produced by each pool in the plurality of pools and identifying a preferred pool for production of the multi-specific antibody.

71. The method of claim 70, further comprising preparing monoclonal cell lines from a pool of cells whose genome comprises the transposon.

72. The method of claim 71, further comprising determining the amount of multi-specific antibody and the amount of one or more incorrect assembly products produced by each monoclonal cell line.

73. The method of claim 72, wherein the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, and fewer than 30.

74. Producing the multi-specific antibody using the mammalian cell line of claim 57.

75. A method for constructing a mammalian cell line to produce an antibody, the method comprising:

(1) the first transcriptional unit comprises a first open reading frame (ORF) and a second ORF operably linked by a first internal ribosome entry sequence (IRES), and wherein the first ORF and the second ORF are operably linked to first regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the first regulatory elements being active in the mammalian cell to express the first and second ORFs, which respectively encode a first secretion signal fused to a first polypeptide and a second secretion signal fused to a second polypeptide, and wherein expression of the first transcriptional unit in the mammalian cell results in secretion of the first secretion signal fused to the first polypeptide and the second secretion signal fused to the second polypeptide and removal of the secretion signals to secrete the first and second polypeptides; and

(2) the second transcriptional unit comprises a third ORF and a fourth ORF operably linked by a second IRES, and wherein the third ORF and the fourth ORF are operably linked to second regulatory elements including a promoter, a polyadenylation signal sequence, and optionally an enhancer and an intron, the second regulatory elements being active in the mammalian cell to express the third and fourth ORFs, which respectively encode a third secretion signal fused to the third polypeptide and a fourth secretion signal fused to the fourth polypeptide, and wherein expression of the second transcriptional unit in the mammalian cell results in secretion of the third secretion signal fused to the third polypeptide and the fourth secretion signal fused to the fourth polypeptide and removal of the secretion signals to secrete the third and fourth polypeptides; and

76. The method of claim 75, wherein the first and second polypeptides are respectively the first and second chains of the antibody.

77. The method of claim 75, wherein the third and fourth polypeptides are respectively the first and second chains of the antibody.

78. The method of claim 75, wherein one of the polypeptides encoded for by the first or second ORF has the same amino acid sequence, after cleavage of the secretion signal, as one of the polypeptides encoded for by the third or fourth ORF.

79. The method of claim 78, wherein the ORFs that encode the same polypeptides do not have identical nucleic acid sequences, thereby reducing sequence repeat regions within the transposon and increasing its stability.

80. The method of claim 75, wherein introducing the transposon and a corresponding transposase into the mammalian cell results in a light chain:heavy chain production ratio of between about 1.5:1 to about 4:1.

81. The method of claim 75, wherein at least one of the first and second IRESes is a viral IRES selected from the group consisting of an encephalomyocarditis virus, an echovirus, a foot and mouth disease virus, Theiler’s encephalomyelitis virus, Sikhote-Alin virus, or a coxsackievirus.

82. The method of claim 75, wherein each IRES comprises a nucleotide sequence independently selected from SEQ ID NOs: 248-281.

83. The method of claim 75, wherein the transposon further comprises a nucleic acid sequence that encodes for a selectable marker.

84. The method of claim 83, further comprising selecting cells that have integrated the transposon into their genomes by subjecting each cell to restrictive conditions that require expression of the selectable marker for survival of the cell.

85. The method of claim 75, further comprising growing cells that have integrated the transposon into their genomes under conditions where the cells produce and secrete the antibody.

86. The method of claim 75, further comprising preparing monoclonal cell lines from a pool of cells whose genome comprises the transposon.

87. The method of claim 86, further comprising determining the amount of antibody produced by each monoclonal cell line.

88. The method of claim 87, wherein the number of monoclonal cell lines prepared is at least one of: fewer than 10,000, fewer than 1,000, fewer than 900, fewer than 800, fewer than 700, fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200, fewer than 100, fewer than 90, fewer than 80, fewer than 70, fewer than 60, fewer than 50, fewer than 40, and fewer than 30.

89. Producing the antibody using the mammalian cell line of claim 75.