METHODS AND COMPOSITIONS FOR THE CHROMOSOMAL INTEGRATION OF HETEROLOGOUS SEQUENCES
Related Information
The contents of the patents, patent applications, and references cited throughout this specification are hereby incorporated by reference in their entireties.
Government Sponsored Research
This work was supported, in part, by grants from the Florida Agricultural Experiment Station (publication number R-06853), the U.S. Department of Agriculture, National Research Initiative (98-35504-6177 and 98-35505-6976), and the U.S. Department of Energy, Office of Basic Energy Science (DE-FG02-96ER20222).
Background of the Invention
Plasmid vectors are versatile tools which facilitate the isolation, expression and analysis of genes (Bolivar, F., et al., Gene 2: 95-113 (1977)). Useful characteristics include the facile production of identical DNA for subsequent in vitro and in vivo manipulation, the presence of multiple cloning sites (MCS), selectable markers which allow rapid screening for new or improved traits, and the ease with which they can be established as multiple cellular copies to alter gene expression in recombinant hosts. However, the physiological burden imposed by multiple copies of plasmid genes, the potential for internal rearrangements, and segregational instability are disadvantages for many biotechnological applications (Peredelchuk, M. Y., Gene 187: 231-238 (1997)). Antibiotic-resistance genes are frequently used for plasmid maintenance. Alternative selectable markers based on metabolic deficiencies of the host (Degryse, E., J. Biotech. 18: 29-40 (1991)) pose further complications for improvement cycles in production strains. For applications such as deliberate field release, development of organisms for use in food products, and development of biocatalysts for bulk chemicals, special requirements for plasmid maintenance are undesirable.
Many of the problems associated with plasmids can be eliminated by the chromosomal integration of desired traits. Integration tools based on modified transposons and conditional plasmid replicons have been developed (de Lorenzo, V., et al., J. Bacteriol. 172: 6568-6572 (1990); de Lorenzo, V., et al., Methods Enzymol. 235: 386-405 (1994); Rode, C. K., et al., Gene 166: 1-9 (1995); Hamilton, C. H., et al., J. Bacteriol. 171: 4617-4622 (1989); Kaniga, K., et al., J. Bacteriol. 109: 137-141 (1991); Le Borgne, S., et al., Gene 223: 213-219 (1998); Link, A.J., et al., J. Bacteriol. 179: 6228-6237 (1997)). With these tools, integration can be random or precisely directed by DNA fragments homologous to the host genome. However, complications still remain with most integration systems, such as the persistence of selectable markers, transposons, or replicons. For strains in which multiple alterations or continuing improvements are desired, the accumulation of markers and delivery systems can be troublesome. Selectable events may be limited by the availability of functional markers. Integrated DNA (replicons, transposon genes, and selectable marker genes) can serve as a site for homologous recombination events which interfere with targeting or randomness during subsequent constructions. Also, the persistence of replicons and transposons increases the potential for gene transfer to other organisms in the environment.
Replicons and transposons can be eliminated by transforming with purified DNA fragments which lack replication functions (Hasan, N., et al., Gene 150: 51-56 (1994); Ohta, K., et al., Appl. Environ. Microbiol. 57: 893-900 (1991)). Non-antibiotic markers are available but are often less efficient than antibiotics (de Lorenzo, V., et al., J. Bacteriol. 172: 6568-6572 (1990); de Lorenzo, V., et al., Methods Enzymol. 235: 386-405 (1994); Herrero, M., et al., J. Bacteriol. 172: 6557-6567 (1990)). In a few cases, loss of functions such as tetracycline-sensitivity and absence of the sucrose-sacB system can be selected directly (Bochner, B. R., et al., J. Bacteriol. 143: 926-933 (1980); Kaniga, K., et al., J. Bacteriol. 109: 137-141 (1991); Ried, J.L., et al., Gene 57: 239-246 (1985)). However, loss of function due to a mutation is typically not a precise event and can result from unstable point mutations, partial deletion of the resistance gene, or extended deletions which impair the host.
Summary of the Invention
The foregoing limitations are overcome using the method and vectors of the present invention. In particular, the invention provides a method for integrating nucleic acids into a genome in such a way that any unwanted vector or selectable marker DNA can be removed. This allows for the genome of the recipient host cell to be made substantially free of any unnecessary nucleic acid, e.g., vector sequence or marker sequence, that can lead to genomic instability.
Accordingly, in one aspect, the invention provides a method for integrating a nucleic acid construct into the genome of a host cell by contacting the cell with a nucleic acid construct under conditions such that the nucleic acid construct is integrated by the cell. The method uses a nucleic acid construct that contains a passenger sequence and a marker sequence, where the marker sequence is flanked by a first and second recombining site. In one embodiment, the first and second recombining sites are oriented in the same direction. In another embodiment, the method employs a construct that further contains an origin of replication between the first and second recombining sites, e.g., a conditional origin of replication (i.e., replicon), preferably, e.g., pSC101 ori, R6K-γ ori, colE1, oriEV, or an origin of replication derived therefrom.
In another embodiment, the nucleic acid construct of the above aspect contains a sequence that contains a promoter, a restriction site, an intron, an exon, an IRES element, a polyadenylation site, or a combination thereof.
In yet another related embodiment, the nucleic acid construct of the above aspect contains a guide sequence capable of directing site-specific integration of the nucleic acid construct to a specific site in the sequence of a replicating genome. In even another embodiment, the above method involves exposing the targeted cell to a site-specific recombinase that results in recombination between the first and second recombining sites of the foregoing nucleic acid construct such that the intervening sequence flanked by the recombining sites is excised. The method involves using a recombinase such as Xer, Int, Cre, and preferably, FLP recombinase. Accordingly, in a related embodiment, the corresponding recombining sites are dif, att, loxP, and preferably FRT. In a preferred embodiment, the above method can be repeated such that more than one genetic element may be introduced sequentially.
In still another embodiment, the passenger sequence encodes at least one gene, preferably a gene involved in ethanologenesis, such as, for example, adh or pdc. In a related embodiment, the gene involved in ethanologenesis may be derived from a prokaryote or a eukaryote. In another related embodiment, the passenger sequence may further contain a promoter 5' to the passenger sequence.
In another embodiment, the above method employs a nucleic acid construct containing a marker sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green fluorescent protein.
In a preferred embodiment of the first aspect, the method employs a bacterial cell, preferably a Gram-negative bacterial cell, more preferably a facultatively anaerobic bacterial cell, more preferably a bacterial cell selected from the family Enterobacteriaceae, and most preferably, a bacterial cell of the genus Klebsiella or Escherichia (e.g., E. coli B or E. coli K12). In a related embodiment, the host cell is a recombinant bacterial cell. In another related embodiment, the method uses a nucleic acid construct that is, or is derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403.
In a second aspect, the invention provides a method for producing a recombinant ethanologenic cell by contacting a cell with a nucleic acid construct under conditions in which integration of the nucleic acid construct occurs, resulting in the formation of a recombinant ethanologenic cell. The method uses a nucleic acid construct that contains a passenger sequence that contains an ethanologenic gene, and a marker sequence, flanked by a first and second recombining site. In one embodiment, the first and second recombining sites are oriented in the same direction.
In another embodiment, the passenger sequence encodes an ethanologenic gene such as adh or pdc. In a related embodiment, the passenger sequence encodes pdc, adhB, and cat.
In another embodiment, the nucleic acid further contains a guide sequence, thereby resulting in site-specific integration of the nucleic acid construct. In a related embodiment, the guide sequence is derived from a replicating genome.
In yet another embodiment, the method employs a construct that further contains an origin of replication between the first and second recombining sites, e.g., a conditional origin of replication (i.e., replicon), preferably, e.g., pSC101 ori, R6K-γ ori, colE1, oriEV, or an origin of replication derived therefrom. In even another embodiment, the above method involves exposing the targeted cell to a site-specific recombinase that results in recombination between the first and second recombining sites of the foregoing nucleic acid construct such that the intervening sequence flanked by the recombining sites is excised. The method involves using a recombinase such as Xer, Int, Cre, and preferably, FLP recombinase. Accordingly, in a related embodiment, the corresponding recombining sites are dif, att, loxP, and preferably FRT.
In still another embodiment, the above method employs a nucleic acid construct containing a marker sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green fluorescent protein. In a preferred embodiment of the foregoing aspect, a recombinant ethanologenic cell is produced according to the above method.
In a third aspect, the invention provides a recombinant host cell having a nucleic acid construct that contains a passenger sequence and a marker sequence, flanked by a first and second recombining site. In one embodiment, the first and second recombining site are oriented in the same direction.
In a related embodiment, the host cell contains a nucleic acid construct where the passenger sequence encodes a gene involved in ethanologenesis such as, e.g., adh or pdc. In a related embodiment, the passenger sequence is selected from the group including adh, pdc, and cat.
In one embodiment, the host cell contains a nucleic acid construct that further contains an origin of replication between the first and second recombining sites, e.g., a conditional origin of replication (i.e., replicon), preferably, e.g., pSC101 ori, R6K-γ ori, colE1, oriEV, or an origin of replication derived therefrom. In another embodiment, the host cell contains a nucleic acid construct containing a guide sequence capable of directing site-specific integration of the nucleic acid construct to a specific site in the sequence of a replicating genome.
In yet another embodiment, the host cell contains a nucleic acid construct containing a marker sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green fluorescent protein.
In another embodiment, the host cell is exposed to a site-specific recombinase that results in recombination between the first and second recombining sites of the foregoing nucleic acid construct such that the intervening sequence flanked by the recombining sites is excised. In a preferred embodiment, the recombination involves using a recombinase such as Xer, Int, Cre, and preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att, loxP, and preferably FRT.
In a preferred embodiment, the host cell is a bacterial cell, e.g., a recombinant cell and/or an ethanologenic cell, preferably a Gram-negative bacterial cell, more preferably a facultatively anaerobic bacterial cell, more preferably selected from the family Enterobacteriaceae, and most preferably, of the genus Klebsiella or Escherichia (e.g., E. coli B or E. coli K12). In a related embodiment, the host cell contains a nucleic acid construct that is, or is derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403.
In a fourth aspect, the invention provides a method for producing ethanol by providing a recombinant ethanologenic cell with a nucleic acid construct that contains a passenger sequence and a marker sequence, where the marker sequence is flanked by a first and second recombining site. The method includes contacting the cell with a substrate which can be fermented into ethanol, such that expression of the passenger sequence results in the production of ethanol.
In one embodiment, the first and second recombining site are oriented in the same direction.
In another embodiment, the passenger sequence encodes a gene involved in ethanologenesis, such as, e.g., adh or pdc. In a related embodiment, the passenger sequence encodes adhB, pdc, and cat.
In another embodiment, the method employs a construct that further contains an origin of replication between the first and second recombining sites, e.g., a conditional origin of replication (i.e., replicon), preferably, e.g., pSC101 ori, R6K-γ ori, colE1, oriEV, or an origin of replication derived therefrom.
In yet another embodiment, the nucleic acid construct of the above aspect contains a guide sequence capable of directing site-specific integration of the nucleic acid construct to a specific site in the sequence of a replicating genome.
In yet another embodiment, the above method involves exposing the targeted cell to a site-specific recombinase that results in recombination between the first and second recombining sites of the foregoing nucleic acid construct such that the intervening sequence flanked by the recombining sites is excised. The method involves using a recombinase such as Xer, Int, Cre, and preferably, FLP recombinase.
Accordingly, in a related embodiment, the corresponding recombining sites are dif, att, loxP, and preferably FRT.
In still another embodiment, the above method employs a nucleic acid construct containing a marker sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green fluorescent protein.
In a preferred embodiment, the method employs a host cell that is a bacterial cell, e.g., a recombinant bacterial cell, preferably a Gram-negative bacterial cell, more preferably a facultatively anaerobic bacterial cell, more preferably selected from the family Enterobacteriaceae, and most preferably, either Klebsiella or Escherichia (e.g., E. coli B or E. coli K12).
In one embodiment, the method uses a nucleic acid construct that is, or is derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403.
In a fifth aspect, the invention provides a nucleic acid construct containing a passenger sequence, and a marker sequence, where the marker sequence is flanked by a first and second recombining site. In a related embodiment the first and second recombining site may be oriented in the same direction.
In another embodiment, the construct further contains an origin of replication between the first and second recombining sites, e.g., a conditional origin of replication (i.e., replicon), preferably, e.g., pSC101 ori, R6K-γ ori, colE1, oriEV, or an origin of replication derived therefrom.
In another embodiment, the nucleic acid construct contains a guide sequence derived from a replicating genome. In a related embodiment, the guide sequence is derived from a bacterial cell. In yet another embodiment, the construct further contains at least one unique restriction enzyme site. In yet another related embodiment, the passenger sequence of the construct contains an ethanologenic gene, such as, e.g., adh or pdc, and preferably adh, pdc, cat, or a combination thereof.
In still another related embodiment, the passenger sequence of the construct contains a sequence selected from the group consisting of a heterologous promoter and a prokaryotic termination sequence.
In even another embodiment, the nucleic acid construct contains a marker sequence that encodes a selectable gene, e.g., an antibiotic resistance gene or a non-antibiotic resistance gene. In a preferred embodiment, the antibiotic resistance gene is the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, or chloramphenicol resistance gene. In another preferred embodiment, the non-antibiotic resistance gene is an auxotrophic gene, a metal ion resistance gene, a trp gene, or a gene encoding a green fluorescent protein.
In another embodiment, the construct is exposed to a site-specific recombinase that results in recombination between the first and second recombining sites of the nucleic acid construct such that the intervening sequence flanked by the recombining sites is excised. The excision involves using a recombinase such as Xer, Int, Cre, and preferably, FLP recombinase. Accordingly, in a related embodiment, the construct contains recombining sites such as dif, att, loxP, and preferably FRT.
In a preferred embodiment of the above aspect, the construct is, or is derived from, a plasmid selected from the group consisting of pLOI2223, pLOI2224, pLOI2225, pLOI2226, pLOI2227, pLOI2228, and pLOI2403.
In a related embodiment, the invention provides a kit including at least one of the foregoing nucleic acid constructs and instructions for use.
Other features and advantages of the invention will be apparent from the following detailed description and claims.
Brief Description of the Drawings
Figure 1 shows a schematic of various integration vectors and helper plasmids (see text for details).
Figure 2 shows a schematic illustrating the use of an integration vector for the insertion of heterologous "passenger" genes into a host genome. A "guide" sequence allows for the site specific targeting of the passenger sequence via homologous recombination. A helper plasmid provides a recombinase (FLP) that catalyzes the in vivo excision of any unnecessary sequence (e.g., the replicon and marker sequence) flanked by recombining sites (FRT), leaving the passenger gene resident in the genome.
Figure 3 shows a diagram illustrating the targeting of heterologous "passenger" genes to a site specific region of a genome (i.e., the adhE gene (shaded)) and recombinase-mediated deletion of the replicon and marker used during plasmid construction and initial integration. The alignment of the guide sequence of the targeting vector and the genomic integration site (Panel A), cross-over event (Panel B), and resultant recombinase-mediated (FLP) excision of the replicon and selectable marker (Panel C) are shown.
Figure 4 is a photograph of an agarose gel stained with ethidium bromide showing PCR amplified nucleic acid fragments amplified using different primers (Panel A) and restriction enzyme digested (Panel B) to confirm the correct chromosomal integration of a heterologous nucleic acid (see text for details).
Detailed Description of the Invention
In order for the full scope of the invention to be clearly understood, the following definitions are provided.
I. Definitions
As used herein, the terms "host cell" and "recombinant host cell" are intended to include a cell suitable for genetic manipulation, e.g., which can incorporate heterologous polynucleotide sequences, e.g., which can be transfected. The cell can be a microorganism or a higher eukaryotic cell, such as an animal cell or a plant cell. The term is intended to include progeny of the cell originally transfected. In preferred embodiments, the cell is a bacterial cell, e.g., a Gram-negative bacterial cell, and this term is intended to include all facultatively anaerobic Gram-negative cells of the family Enterobacteriaceae such as Escherichia, Shigella, Citrobacter, Salmonella, Klebsiella, Enterobacter, Erwinia, Kluyvera, Serratia, Cedecea, Morganella, Hafnia, Edwardsiella, Providencia, Proteus, and Yersinia. Particularly preferred recombinant hosts are Escherichia coli or Klebsiella oxytoca cells. Preferably, the term recombinant host cell is intended to include a cell that has already been selected or engineered to have certain desirable properties and is suitable for further modification using the compositions and methods of the invention.
The term "passenger sequence" is intended to include any desired sequence that is intended for integration into the host cell. For example, the passenger sequence may include a restriction site, multiple restriction sites (e.g., a polylinker or multiple cloning site (MCS)), a unique stretch of sequence suitable for marking the cell as distinct from a wild type cell, a regulatory element, e.g., a polyadenylation sequence, a promoter, an intron, a splice signal, or an internal ribosomal entry site (IRES) for regulating the expression of multiple genes or cistrons, and/or a gene, for example a gene encoding a polypeptide. A passenger sequence encoding a gene may further include a promoter if appropriate or use the endogenous promoter found proximal to the site of integration. In addition, the passenger sequence encoding a gene may further include other regulatory elements, if appropriate, to improve expression. A passenger sequence may comprise genetic elements derived from any source, e.g., eukaryotes, prokaryotes, viruses, or synthetic polynucleotide fragments.
The term "guide sequence" is intended to include a sequence that can be located 5', 3', or internal to the sequence intended to be integrated, such that recombination between the introduced vector (or portion thereof) and the recipient host cell genome is accomplished. Typically, a guide sequence has high similarity or identity with a site specific region of the recipient host cell genome such that a targeted integration of the passenger sequence to this site by homologous recombination is accomplished.
The term "site-specific integration" is meant to refer to the integration of an exogenous nucleic acid sequence to a specific area of the genome of a recipient host cell. In general, a guide sequence allows for homologous recombination between a portion of the targeting vector and the host cell genome such that the sequence is predictably targeted to that area of the genome represented in part by the guide sequence.
The term "marker sequence" is intended to include any sequence that can be encoded by a nucleic acid, introduced into the replicating genome of a recipient host cell, and detected, thereby indicating that the cell has been "marked" by such a sequence. Accordingly, the term is intended to include a sequence having, for example, a restriction enzyme sequence that can be detected by a corresponding restriction enzyme. In addition, or alternatively, the sequence may be detected using any of a variety of art recognized techniques, e.g., polymerase chain reaction using appropriate primers, nucleic acid hybridization, etc. Preferably, the marker sequence encodes a gene product that confers on the cell a selectable phenotype, e.g., resistance to an antibiotic or other cytotoxic agent or a conditional growth advantage. Accordingly, a marker sequence can encode, e.g., the gentamycin resistance gene, zeocin resistance gene, kanamycin resistance gene, ampicillin resistance gene, tetracycline resistance gene, chloramphenicol resistance gene, hygromycin resistance gene, thymidine kinase, an auxotrophic gene, a metal ion resistance gene, a trp gene, a gene producing a visual phenotype (e.g., green fluorescent protein), or a gene producing a cell surface antigen (e.g., CD4). Any of the foregoing markers when expressed in cells can be detected using art recognized techniques such as, e.g., appropriate culture conditions or selection techniques (e.g., antibodies, FACS (fluorescence-activated cell sorting), or flow cytometric analysis).
The term "origin of replication" or "replicon" is intended to include any sequence, conditional or otherwise, that can confer on a nucleic acid sequence the ability to be replicated in a cell for the purposes of, e.g., propagating the nucleic acid or maintaining the presence of the nucleic acid in a cell. Such sequences are known in the art and are routinely incorporated into the backbone of many nucleic acid vectors to facilitate their propagation and typically include, e.g., pSC101 ori, R6K-γ ori, colE1 ori, or oriEV.
The term "recombining site" is intended to include a nucleic acid element that represents a site specific binding site for a recombinase. Examples of such elements include, e.g., the FRT sequence, the dif sequence, the att sequence, and the loxP sequence.
The term "recombinase" is intended to include any recombinase having a site specific recombinase activity. Examples of such recombinases are FLP, Xer, Int, and Cre and these enzymes typically have site specific recombinase activity on, respectively, the FRT sequence, the dif sequence, the att sequence and the loxP sequence.
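The pairing of recombinases with recombining sites stated in the two preceding definitions can be written as a small lookup table. The following Python sketch is illustrative only; the names RECOMBINASE_SITES, site_for, and sites_match are hypothetical and do not appear in the specification, and the check deliberately ignores site orientation.

```python
# Pairings as stated in the definitions: FLP/FRT, Xer/dif, Int/att, Cre/loxP.
RECOMBINASE_SITES = {"FLP": "FRT", "Xer": "dif", "Int": "att", "Cre": "loxP"}


def site_for(recombinase: str) -> str:
    """Return the recombining site listed for the given recombinase."""
    try:
        return RECOMBINASE_SITES[recombinase]
    except KeyError:
        raise ValueError(f"no recombining site listed for {recombinase!r}") from None


def sites_match(recombinase: str, first_site: str, second_site: str) -> bool:
    """True when both flanking sites are the type acted on by the recombinase."""
    expected = site_for(recombinase)
    return first_site == expected and second_site == expected


if __name__ == "__main__":
    print(site_for("FLP"))                    # site type paired with FLP
    print(sites_match("Cre", "loxP", "FRT"))  # mismatched flanking sites
```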
The term "gene involved in ethanologenesis" is intended to include any gene capable of conferring on a cell ethanologenic properties or capable of improving any aspect of cellular ethanologenesis, such as, e.g., substrate uptake, substrate processing, ethanol tolerance, etc. Genes involved in ethanologenesis are, e.g., genes encoding alcohol dehydrogenase, pyruvate decarboxylase, secretory proteins, and polysaccharases, and these genes, or their homologs, may be derived from any appropriate organism.
The term "substrate" is intended to include any moiety that can be converted into ethanol. Substrates that are suitable for converting into ethanol include sugar moieties such as, e.g., monosaccharides, disaccharides, trisaccharides, oligosaccharides, and complex carbohydrates, including, e.g., lignocellulose, which comprises cellulose, hemicellulose, and pectin.
The term "derived from" is intended to include the isolation (in whole or in part) of a polynucleotide segment from an indicated source. The term is intended to include, for example, direct cloning, PCR amplification, or artificial synthesis from, or based on, a sequence associated with the indicated polynucleotide source.
The term "ethanologenic" is intended to include the ability of a microorganism to produce ethanol from a carbohydrate as a primary fermentation product. The term is intended to include naturally occurring ethanologenic organisms, ethanologenic organisms with naturally occurring or induced mutations, and ethanologenic organisms which have been genetically modified.
The term "Gram-negative bacteria" is intended to include the art recognized definition of this term. Typically, Gram-negative bacteria include, for example, the family Enterobacteriaceae which comprises, among others, the genera Escherichia and Klebsiella.
II. Nucleic Acid Constructs
The invention provides a number of novel nucleic acid constructs that are suitable for targeting a heterologous nucleic acid sequence to the genome or chromosome of a recipient host cell.
The heterologous sequence or passenger sequence of the targeting vector can be any desirable sequence that is useful when introduced into a cell. Accordingly, the sequence may be a regulatory element (e.g., a promoter, intron, splicing signal, internal ribosome entry site, polyadenylation signal) or a gene encoding a gene product, e.g., a polypeptide, or a combination thereof (e.g., a gene encoding a polypeptide that is under the transcriptional control of a promoter). In addition, the heterologous sequence may be targeted to any replicating genome, e.g., that of a bacterial cell, a yeast, an insect cell, a plant cell, or an animal cell. It is well known in the art which regulatory elements, e.g., promoters, etc., are suitable for use in, e.g., a bacterial cell versus a plant or animal cell.
If site specific integration of the passenger sequence is desired (as opposed to random integration), a guide sequence must be incorporated into the nucleic acid vector. Typically, the guide sequence contains a portion of the genomic locus to be targeted. Considered design of this region allows for the accurate placement of the passenger sequence anywhere in the genome, preferably, e.g., under the control of a desirable endogenous promoter.
To facilitate the construction of the targeting vector, an origin of replication, or replicon, and a marker sequence are included in the vector. Specifically exemplified in the present invention are marker sequences encoding resistance to the antibiotics ampicillin, kanamycin, and chloramphenicol; however, it will be appreciated that other selectable markers may be used as described above. In part, the selection of an appropriate marker will depend on the organism to be targeted, e.g., prokaryotic versus eukaryotic. The characteristics of the foregoing markers and their efficacies in various organisms are well known in the art.
In addition to an appropriate marker, an origin of replication or replicon is also incorporated into the targeting construct to facilitate propagation of the vector during the construction and/or initial integration of the vector. Specifically exemplified are the pSC101 and R6K-γ replicons; however, other replicons, e.g., colE1 and oriEV, as well as replicons suitable for use in eukaryotic cells, are also encompassed by the invention.
Importantly, the constructs of the invention possess site specific recombining sites that flank all of the vector sequence that, after targeting of the passenger sequence to the genome, is no longer necessary or desired. Indeed, one advantage of the invention is that following targeting and integration of the passenger sequence to the genome of the host cell, the in vivo excision of any undesired sequence can be accomplished using a recombinase that binds the site specific recombining sites contained in the integrated vector.
Critical for allowing the correct excision of the undesired vector sequence is the placement and directionality of the recombining sites. The novel vectors of the invention possess a first and a second recombining site that flank all of the replicon and marker sequence and are oriented in the same direction. This allows for the complete excision of all the vector sequence, leaving behind the desired targeted passenger sequence and a minimal remainder of the joined recombining sites (68 bp).
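The role of site placement and orientation can be illustrated with a toy model in which an integrated construct is an ordered list of named elements. The Python sketch below is an assumption-laden simplification, not the method of the invention: the element names, the "+"/"-" orientation labels, and the function recombine are hypothetical, and the inversion branch reflects the generally known behavior of site-specific recombinases acting on oppositely oriented sites rather than anything stated in this specification.

```python
from typing import List, Tuple

# An element is (name, orientation); orientation is "+" or "-".
Element = Tuple[str, str]


def recombine(elements: List[Element], site: str = "FRT") -> List[Element]:
    """Model recombination between exactly two copies of `site`.

    Same orientation: the intervening segment is excised and a single
    joined site remains. Opposite orientation: the intervening segment
    is inverted in place and nothing is removed.
    """
    positions = [i for i, (name, _) in enumerate(elements) if name == site]
    if len(positions) != 2:
        raise ValueError("expected exactly two recombining sites")
    a, b = positions
    if elements[a][1] == elements[b][1]:
        # Directly repeated sites: drop everything after the first site
        # up to and including the second site.
        return elements[:a + 1] + elements[b + 1:]
    # Inverted sites: reverse the order and flip the orientation of the
    # intervening elements.
    flipped = [(n, "-" if o == "+" else "+") for n, o in reversed(elements[a + 1:b])]
    return elements[:a + 1] + flipped + elements[b:]


if __name__ == "__main__":
    integrated = [
        ("passenger", "+"),
        ("FRT", "+"),
        ("replicon", "+"),
        ("marker", "+"),
        ("FRT", "+"),
        ("host", "+"),
    ]
    print(recombine(integrated))
```

With both sites in the same orientation, the replicon and marker are removed and only the passenger, one joined site, and the host sequence remain, which mirrors the outcome described in the paragraph above.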
While the FRT recombining site and corresponding FLP recombinase are exemplified (Senecoff, J., et al., P.N.A.S. 82:7270-7274 (1985); Szybalski, W., Gene 135:279-290 (1993); Hoang, T.T., et al., Gene 212:77-86 (1998)), a number of other site specific recombining sites (e.g., dif, att, loxP) and corresponding recombinases (e.g., Xer, Int, Cre) may be used. Accordingly, any of the above described markers may be used again after the excision step because the recipient host cell has once again been rendered sensitive to the selectable marker. Thus, one can sequentially introduce multiple genes into an organism without having to change markers or exhausting all suitable markers. Accordingly, if a particular selectable marker works efficiently in a particular organism, it may be used exclusively to introduce multiple passenger sequences into a target organism.
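The marker-recycling argument above can be sketched as a small bookkeeping model. The dictionary layout, the function names, and the choice of passenger names are hypothetical; the sketch only tracks which markers a host currently carries and shows that the same marker can be reused once it has been excised.

```python
def integrate(host: dict, passenger: str, marker: str) -> None:
    """Record integration of a passenger selected with the given marker."""
    if marker in host["markers"]:
        # Selection is uninformative if the host is already resistant.
        raise ValueError(f"host already carries marker {marker!r}")
    host["passengers"].append(passenger)
    host["markers"].add(marker)


def excise_marker(host: dict, marker: str) -> None:
    """Record recombinase-mediated removal of the marker (host becomes sensitive)."""
    host["markers"].discard(marker)


host = {"passengers": [], "markers": set()}

# Three rounds of integration, each selected with the same hypothetical marker.
for gene in ["pdc", "adhB", "cellulase"]:
    integrate(host, gene, "kanamycin")
    excise_marker(host, "kanamycin")

print(host["passengers"], host["markers"])
```

Skipping the excision step in this model makes the second call to `integrate` fail, which mirrors the need to change markers when the first marker is left in the genome.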
In a preferred embodiment, the constructs of the invention are modified to contain a passenger sequence encoding ethanol pathway gene products such as, e.g., alcohol dehydrogenase (e.g., adhB) and pyruvate decarboxylase (pdc), that are necessary for ethanol production in, e.g., a microorganism. A guide sequence specific for the endogenous gene in the target organism allows for the targeting of these genes to a specific locus such that the heterologous ethanol pathway genes (i.e., adhB and pdc) are brought under the transcriptional control of an appropriate endogenous promoter (e.g., adhE).
Accordingly, the foregoing novel constructs allow for the genetic engineering of superior host cells suitable, e.g., for various industrial applications, such as use in a fermentation reaction for producing, e.g., ethanol (for a review of other industrial applications, see, e.g., Barrios-Gonzalez et al., Biotechnol. Ann. Rev. 2:85-121 (1996); From Ethnomycology to Fungal Biotechnology: Exploiting from Natural Resources for Novel Products, Singh, J., Ed., Plenum Press, Pub. (1999); Manual of Industrial Microbiology and Biotechnology, Demain, A. Ed., Am. Soc. of Microbiology, Pub., (1999); Biomining: Theory, Microbes, and Industrial Processes, Rawlings, Ed., R.G. Landes Co., Pub. (1997); Biotechnology of Industrial Antibiotics, Vandamme, E., Ed., Marcel Dekker, Pub. (1984); Industrial Biotechnology, Malik, V., Ed., Science, Pub. (1992); Biotechnology and Food Ingredients, Goldberg et al., Ed., Aspen Publishers (1991); Biotechnology and Food Process Engineering, Schwartzberg et al., Ed., Marcel Dekker, Pub. (1990); and Food Biotechnology: Techniques and Applications, Mittal, G., Technomic Pub. Co. (1992)).
III. Methods of Use
In a preferred embodiment, the methods of the invention apply the constructs described above for the integration of one or more desirable gene sequences into a host cell. Accordingly, the methods of the invention have immediate application in the genetic engineering of, e.g., microorganisms (e.g., for various industrial applications), plants (e.g., for pest resistance, hardiness, yields), and animal cells (e.g., for producing cytokines, hormones, etc.).
In one particular application of the methods of the invention, one or more genes necessary for producing ethanol are provided on a plasmid or integrated into a host chromosome using the method of the invention. More preferably, essential genes for fermenting a sugar substrate into ethanol, e.g., pyruvate decarboxylase (e.g., pdc) and/or alcohol dehydrogenase (e.g., adh, preferably adhB), are introduced into the host of the invention using a bicistronic operon or PET operon as described in U.S.P.N. 5,821,093, hereby incorporated by reference. Art recognized techniques for introducing nucleic acids into prokaryotic or eukaryotic cells such as, e.g., bacteria, yeast, insects, plants, and animals, as well as appropriate culturing methods, have been described (see, e.g., Large-Scale Mammalian Cell Culture Technology, Lubiniecki, A., Ed., Marcel Dekker, Pub., (1990); Bacterial Cell Culture: Essential Data, Ball, A., John Wiley & Sons, (1997); Molecular and Cell Biology of Yeasts, Yarranton et al., Ed., Van Nostrand Reinhold, Pub., (1989); Yeast Physiology and Biotechnology, Walker, G., John Wiley & Sons, Pub., (1998); Baculovirus Expression Protocols, Richardson, C., Ed., Humana Press, Pub., (1998); Methods in Plant Molecular Biology: A Laboratory Course Manual, Maliga, P., Ed., C.S.H.L. Press, Pub., (1995); Current Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons (1992); Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989); and Bergey's Manual of Determinative Bacteriology, Kreig et al., Williams and Wilkins (1984)).
Accordingly, using the methods of the invention, a single genetic construct can be designed to encode all of the necessary gene products (e.g., a glucanase, an endoglucanase, an exoglucanase, a secretory protein/s, pyruvate decarboxylase, alcohol dehydrogenase) for imparting to a non-ethanologenic (or weakly ethanologenic) organism all the necessary genes for producing ethanol from a sugar substrate.
Alternatively, the invention allows for more than one gene to be introduced serially without having to change markers. In a preferred embodiment, this allows for the engineering of a host cell suitable for performing simultaneous saccharification and fermentation (SSF). In addition, it will also be appreciated that such a host may be further manipulated, using methods known in the art, to have mutations in any endogenous gene/s (e.g., recombinase genes) that would interfere with the stability, expression, and function of the introduced genes. Further, it will also be appreciated that the invention is intended to encompass any regulatory elements, gene/s, or gene products, i.e., polypeptides, that are sufficiently homologous to the ones described herein.
Methods for screening strains having the introduced genes may be performed using, e.g., PCR amplification using site or gene specific primers, visual screens that can identify cells expressing the desired integrated gene or marker, or lacking the marker and any other unnecessary nucleic acid, or other art recognized techniques.
For example, for screening an ethanologenic microorganism, the ADH gene product produces acetaldehyde that reacts with the leucosulfonic acid derivative of p-roseaniline to produce an intensely red product. Thus, ADH-positive clones can be easily screened and identified as bleeding red colonies.
Rapid evaluations of ethanol producing potential can also be made by testing the speed of red spot development on aldehyde indicator plates (Conway et al., (1987) J. Bacteriol. 169:2591-2597). Typically, strains which prove to be efficient in sugar conversion to ethanol can be recognized by the production of red spots on aldehyde indicator plates within minutes of transfer.
In a most preferred embodiment of the invention, the methods of the invention allow for the production of a single host cell that is ethanologenic, that is, has all the necessary genes, either naturally occurring or artificially introduced or enhanced (e.g., using a surrogate promoter and/or genes from a different species or strain), such that the host cell has the ability to produce and secrete a polysaccharase/s, degrade a complex sugar, and ferment the degraded sugar into ethanol. Accordingly, such a host is suitable for simultaneous saccharification and fermentation. Moreover, this host cell is free of marker sequences, unnecessary vector sequence, and a heterologous replicon; these sequences can lead to genomic instability in the organism or make the organism less suitable for release or use outside, e.g., a controlled laboratory environment.
In addition, it should be readily apparent to one skilled in the art that the ability conferred by the present invention, to transform genes coding for a protein or an entire metabolic pathway into a host cell without leaving unnecessary or undesired nucleic acid resident in the host genome, has wide application. Envisioned in this regard, for example, is the application of the present invention to a variety of situations where regulatory elements and/or genes may be added at will and, moreover, deleterious genes deleted, without unnecessarily littering the genome of the target organism with unwanted nucleic acid.
IV. Host Cells
The invention also relates to host cells that are modified using the foregoing methods and/or nucleic acid constructs of the invention. Preferably, any host cell, e.g., animal, plant, or microbe, can be modified according to the methods of the invention. More preferably, the methods and constructs of the invention allow for the engineering of heterologous DNA into the genome of any microorganism suitable, e.g., for an industrial application. Even more preferably, the invention provides for the engineering of an organism useful for the production of ethanol, e.g., by fermentation.
Accordingly, in one embodiment, a host cell of the invention comprises a heterologous polynucleotide segment encoding a polypeptide under the transcriptional control of a naturally occurring promoter in the genome or, if appropriate, under the control of a heterologous promoter. In another embodiment, the host cell prepared according to the method of the invention is exposed to a recombinase such that the in vivo excision of any unnecessary nucleic acid flanked by recombining sites is accomplished while the targeted passenger sequence encoding a desirable gene is left behind. Typically, the nucleic acid sequence between the recombining sites contains, for example, a marker sequence and may further contain an origin of replication or replicon. The host cell of the invention is rendered free of this sequence by being exposed to a site specific recombinase, preferably encoded by an expressible construct, which is introduced into the cell using standard techniques.
In a preferred embodiment, the resultant recombinant host cell is made substantially free of any nucleic acid flanked by FRT recombining sites using an FLP recombinase. It is understood that the host cell of the invention may also be modified using nucleic acid constructs having recombining sites such as a dif sequence, att sequence, or loxP sequence and made substantially free of any undesired nucleic acid using a corresponding recombinase, such as, e.g., Xer, Int, or Cre.
The resultant host cell as prepared according to the method of the invention has the advantage of no longer having any unnecessary heterologous DNA which can lead to, e.g., genomic instability. In addition, the host cell no longer encodes a marker sequence and thus can be retargeted with the same selectable marker sequence in order to introduce another genetic modification to the cell. Thus, the serial introduction of multiple genes can be achieved using the same selection marker. In theory at least, there appears to be no limit as to the number of gene constructs that can be introduced into a host cell of the invention. In a most preferred embodiment, the host cell of the invention contains at least one passenger sequence containing, e.g., a gene of interest, e.g., a gene encoding a desired polypeptide for use in the bioconversion of a sugar to ethanol, or a step thereof. Preferred ethanologenic genes include those that encode, e.g., any ethanol pathway gene that can facilitate the production of ethanol from the cell or extract thereof, such as alcohol dehydrogenase, pyruvate decarboxylase, a cellulase, or a secretory polypeptide. A desired ethanologenic gene may be derived from another species, e.g., a yeast, an insect, an animal, or a plant. The techniques for introducing and expressing such genes in a recombinant host are presented in the examples.
For example, a non-ethanologenic host cell may be converted into an ethanologenic host by introducing, for example, ethanologenic genes from an efficient ethanol producer like Zymomonas mobilis. This type of genetic engineering, using the constructs and methods of the invention, results in a recombinant host capable of efficiently fermenting sugar into ethanol.
Accordingly, the invention makes use of a non-ethanologenic recombinant host, e.g., E. coli strain B or E. coli strain K-12. These strains may be used to express a desired polypeptide, e.g., an ethanologenic gene introduced using techniques described herein. In a preferred embodiment of the present invention, the host cell, having been modified using the methods and constructs of the invention, can be applied in degrading or depolymerizing a complex saccharide into monosaccharides. Subsequently, the host cell can catabolize the simpler sugar into ethanol by fermentation. This process of concurrent complex saccharide depolymerization into smaller sugar residues followed by fermentation is referred to as simultaneous saccharification and fermentation (SSF). In another embodiment, the invention makes use of a recombinant host that is ethanologenic and is improved using the methods of the invention. It is understood that an improvement of an ethanologenic host may include, e.g., increasing the ability of the organism to produce ethanol, depolymerize a particular substrate, tolerate a higher ethanol level, etc. In a preferred embodiment, the recombinant host is a Gram-negative bacterium. In another embodiment, the recombinant host is from the family Enterobacteriaceae. The ethanologenic hosts of U.S.P.N. 5,821,093, hereby incorporated by reference, for example, are suitable hosts and include, in particular, E. coli strains KO4 (ATCC 55123), KO11 (ATCC 55124), and KO12 (ATCC 55125). In addition, the LY01 strain may be employed as described in U.S.S.N. 08/834,900, and this application is hereby incorporated by reference.
It will be appreciated that host cells prepared according to the method of the invention may be used in conjunction with another recombinant host that expresses yet another desirable polypeptide, e.g., a different passenger gene sequence. In a particular example, a non-ethanologenic host cell may be used in conjunction with an ethanologenic host cell, and either one or both of these host cells may be engineered according to the methods of the invention to accomplish, e.g., one step of a multistep process, e.g., the converting of a substrate into ethanol. For example, a non-ethanologenic host may be engineered for carrying out, e.g., the depolymerization of a complex sugar, followed by the use of an engineered ethanologenic host for fermenting the depolymerized sugar. Accordingly, it will be appreciated that the host cells of the invention may be used serially or contemporaneously for carrying out a particular process.
This invention is further illustrated by the following examples which should not be construed as limiting.
EXEMPLIFICATION
EXAMPLE I
Construction of Targeting Vectors for Introducing Heterologous DNA that is Excisable In Vivo
In this example, selectable targeting vectors for introducing heterologous DNA into a recipient host cell are described. Throughout the example, the following materials and methods were used unless otherwise stated.
Materials and Methods
The bacterial strains and plasmids used in this example are listed in Table 1, below.
Table 1. Bacterial strains and plasmids
For all plasmid constructions, standard methods were employed (Sambrook, J., et al. Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989)). Reagents used in cloning were molecular biology grade and used as directed by the manufacturers. Restriction enzymes, T4 DNA polymerase, and Klenow polymerase were purchased from New England Biolabs (Beverly, MA). Taq PCR MasterKit was purchased from Qiagen (Santa Clarita, CA). Polymerase chain reactions (PCR) were carried out using an Eppendorf Mastercycler (Westbury, NY). Primers were obtained from Genosys Biotechnologies (The Woodlands, TX). PCR products were ligated into pCR2.1-TOPO (Invitrogen, Carlsbad, CA) using topoisomerase. DNA fragments were isolated from gels using a QIAquick Gel Extraction Kit. DNA fragments were assembled using a Rapid DNA Ligation Kit (Boehringer Mannheim Corporation, Indianapolis, IN). Wizard Plus kits (Promega, Madison, WI) were used for plasmid purification. Dideoxy sequencing of plasmids was performed using fluorescent primers and a LI-COR model 4000L DNA Sequencer (Lincoln, NE) as previously described (Lai, X., et al., Appl. Envir. Microbiol. 63:355-363 (1996)). Except where mentioned, all ligation reactions were done with the Rapid DNA Ligation Kit and used to transform E. coli DH5α, followed by selection on LB plates containing ampicillin.
Plasmid Constructions
Two sets of insertion vectors were constructed using, as initial vectors, the suicide plasmids of the pSG76 and pST76 series described by Posfai et al. (1997). These low copy number plasmids contain the conditional replicons R6K-γ and pSC101, respectively. Modifications were executed in several steps so that another recombining site, i.e., an FRT site, was introduced into each of the original vectors, as well as a more suitable multiple cloning site (MCS). Unless mentioned in the text, all new plasmids are shown in brackets, just after their corresponding descriptions. Complete sequences for the following plasmids have been deposited in GenBank with the following accession numbers: pLOI2223, AF172933; pLOI2224, AF172934; pLOI2225, AF172935; pLOI2226, AF172936; pLOI2227, AF172937; pLOI2228, AF172938 (see also, respectively, SEQ ID NOS: 1-6).
To facilitate the manipulation of each construct, most of the plasmid manipulations were done in pUC18 (Messing, J., Methods Enzymol. 101:20-78 (1983)) and derivatives made herein. An 800 bp XmnI-HindIII fragment from pUC18 was exchanged for a fragment comprising part of the bla gene plus the FRT site from pSG76-A to create pLOI2216. Digestion, Klenow treatment and self-ligation [pLOI2217] abolished the HindIII site in pLOI2216. The restriction sites from EcoRI to PstI were removed by double-digestion with these enzymes, Klenow treatment and self-ligation [pLOI2218]. Then a new EcoRI site was created by linker addition into the previously added ClaI site, giving rise to a pUC18 plasmid derivative carrying one FRT site (called FRT1 in Table 1) but no MCS [pLOI2219].
To introduce a new MCS into pLOI2219, the SphI site from pUC18 was replaced with an AscI site by linker addition [pLOI2400], and the KpnI site was replaced with a BglII linker [pLOI2401]. The modified pUC18-based MCS was then introduced into pLOI2219 as a 965 bp ScaI/EcoRI fragment originating from pLOI2220. By linker addition, an AscI site took the place of the EcoRI site [pLOI2221], and a new EcoRI site replaced the HindIII site [pLOI2222].
The resultant plasmid was sequenced in both directions and used as a source of DNA carrying the modified MCS and the FRT1 site.
Construction of pLOI2223
To create the conditional replication vector carrying two FRT sites and the ampicillin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pSG76-A (R6K-γ ori, ApR) digested with the same enzymes. Transformation was done in E. coli S17-1, since this strain provides the λpir protein involved in the replication of the R6K-γ ori, and selection was done in the presence of 50 μg/ml ampicillin. The resultant plasmid was denominated pLOI2223.
Construction of pLOI2224
To create the conditional replication vector carrying two FRT sites and the kanamycin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pSG76-K (R6K-γ ori, KmR) digested with the same enzymes. Transformation was done in E. coli S17-1, since this strain provides the λpir protein involved in the replication of the R6K-γ ori, and selection was done in the presence of 50 μg/ml kanamycin. DNA from several colonies was prepared and analyzed by digestion with AscI and BglII, separately, to confirm the presence of the modified MCS. The resultant plasmid was denominated pLOI2224.
Construction of pLOI2225
To create the conditional replication vector carrying two FRT sites and the chloramphenicol-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pSG76-C (R6K-γ ori, CmR) digested with the same enzymes. Transformation was performed as described above and selection was done in the presence of 50 μg/ml chloramphenicol. The resultant plasmid was termed pLOI2225.
Construction of pLOI2226
To create the thermosensitive replication vector carrying two FRT sites and the ampicillin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pST76-A (pSC101 ori, ApR) digested with the same enzymes. For plasmids carrying the thermosensitive replicon (pSC101), transformed E. coli S17-1 cells were kept at 30°C and selected with 40 μg/ml ampicillin. The resultant plasmid was termed pLOI2226.
Construction of pLOI2227
To create the thermosensitive replication vector carrying two FRT sites and the kanamycin-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pST76-K (pSC101 ori, KmR) digested with the same enzymes. Transformation was performed as described above and selection was done in the presence of 40 μg/ml kanamycin. The resultant plasmid was termed pLOI2227.
Construction of pLOI2228
To create the thermosensitive replication vector carrying two FRT sites and the chloramphenicol-resistance gene, pLOI2222 was double-digested with EcoRI and ScaI and a 157 bp fragment was ligated to pST76-C (pSC101 ori, CmR) digested with the same enzymes. Transformation was performed as described above and selection was done in the presence of 40 μg/ml chloramphenicol. The resultant plasmid was termed pLOI2228.
Construction of pLOI2231
The construct pLOI2231 (SEQ ID NO: 7) is a derivative of pLOI2224 into which adhE (guide) and the PET (passenger) genes have been cloned. The PET passenger genes comprise adhB and pdc. To obtain this construction, the adhE gene was first cloned into the pCR2.1-TOPO vector as directed by the manufacturer. Briefly, the adhE gene was PCR amplified using a pair of Genosys ORFmer primers: 5' TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA 3' (forward primer) and 5' TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT 3' (reverse primer). The PCR product of adhE includes the initiation codon ATG and the termination codon TAA. A fresh PCR sample of adhE was ligated to the pCR2.1-TOPO vector. This plasmid was designated pLOI2408. A subcloning step was performed to transfer adhE from pLOI2408 to pLOI2403. Plasmid pLOI2408 was digested with restriction enzyme EcoRI, which releases the entire previously cloned PCR product. The adhE fragment was separated by gel electrophoresis, visualized by UV light after ethidium bromide staining, and isolated from the gel using the QIAGEN kit. This fragment was ligated to plasmid pLOI2403, which had previously been digested with restriction enzyme EcoRI, using the Rapid DNA Ligation Kit as directed by the manufacturer. The plasmid obtained after the ligation between the EcoRI-digested adhE fragment and EcoRI-digested pLOI2403 was designated pLOI2413. In this plasmid, the 3' end of the adhE gene is oriented to be close to the BamHI site of pLOI2403, as revealed by sequence analysis.
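The statements above about the adhE primers can be checked with a short sketch. The two sequences are copied from the text; the helper names are hypothetical. The sketch reports primer lengths and GC fraction and confirms that the forward primer carries an ATG and that the reverse complement of the reverse primer carries a TAA, consistent with the amplified product including the initiation and termination codons. It does not verify reading frame or annealing behavior.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as given in the text (5' to 3').
FORWARD = "TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA"
REVERSE = "TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT"

for name, primer in [("forward", FORWARD), ("reverse", REVERSE)]:
    print(name, len(primer), "nt", f"GC={gc_fraction(primer):.2f}")

# The start codon appears in the forward primer as written.
print("ATG in forward primer:", "ATG" in FORWARD)
# The stop codon appears on the coding strand, i.e. in the reverse complement.
print("TAA on coding strand:", "TAA" in reverse_complement(REVERSE))
```

Both primers share the same nine 5' bases (TTGCTCTTC), which the sketch can also confirm by comparing `FORWARD[:9]` with `REVERSE[:9]`.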
In order to join the PET genes (passenger) to the adhE gene (guide), a ligation was performed between a PET fragment and the plasmid pLOI2413. The PET genes were isolated from plasmid pLOI510 as a BamHI fragment of about 4.4 Kbp. The PET fragment was isolated with the QIAGEN kit from a gel after electrophoresis of BamHI-digested plasmid pLOI510. The PET DNA fragment was visualized by ethidium bromide staining after gel electrophoresis. The isolated PET fragment was ligated to plasmid pLOI2413 previously digested with restriction enzyme BamHI. The plasmid obtained was designated pLOI2230.
To obtain an integration vector with the guide and passenger fragments as shown in Fig. 3, the plasmid pLOI2224 was ligated to a fragment containing the adhE and PET genes isolated from plasmid pLOI2230. Plasmid pLOI2230 was digested with the restriction enzyme AscI and the digestion product was cloned into pLOI2224 previously digested with AscI. The plasmid obtained from this ligation was designated pLOI2231, and the orientation of the various genetic elements of the plasmid is shown in Fig. 3.
Targeting Strategy
Recombinase-based integration systems offer the opportunity to effect precise DNA deletions in vivo (Cherepanov, P. P., et al., Gene 158: 9-14 (1995); Posfai, G. et al., Nucleic Acids Res. 22: 2392-2398 (1994)) and in vitro (Cox, M. M., Proc. Natl. Acad. Sci. USA 80: 4223-4227 (1983); Hasan, N., et al., Gene 150: 51-56 (1994); Wild, J., Z. et al., Gene 179:181-188 (1996)). The present invention encompasses plasmids as described by Posfai et al. (J. Bacteriol. 179: 4426-4428 (1997)) that were further modified to incorporate homologous DNA as a guide sequence, two FRT sites in the same orientation flanking an antibiotic marker, and a multiple cloning site (MCS) (Fig. 1). A full set of plasmids (ampicillin, chloramphenicol, and kanamycin resistance) was made for each conditional replicon (Fig. 1; pSC101 and R6K-γ). Since both conditional replicons are present at low copy numbers, an additional high copy vector was developed to facilitate constructions by adding an AscI site on either side of the MCS in pLITMUS38 (New England Biolabs, Beverly, MA) to produce pLOI2403 (Fig. 1).
A general procedure for chromosomal integration of DNA using the new vectors with conditional replicons is presented in Fig. 2. The integration of heterologous passenger DNA carrying desired functions can be targeted to any specific chromosomal site by an adjacent fragment of homologous DNA (guide) during growth at 30°C (pSC101), or directly selected at 37-42°C (pSC101 or R6K-γ). With a single cross-over event, the entire plasmid is incorporated into the chromosome. (If needed, pSC101-based integration vectors can be eliminated by overnight growth and plating at elevated temperatures.) After integration, recombinants were transformed with pFT-A containing the yeast FLP gene under control of the tetracycline promoter and grown under permissive conditions (30°C, pSC101). During growth with chlortetracycline, FLP recombinase was induced, which in turn resulted in the excision of the DNA bracketed by the concurrently-facing FRT sites (selectable marker and replicon) from the chromosome as shown in Fig. 3. After growth at 37-42°C to eliminate plasmid pFT-A encoding the recombinase, only the passenger gene(s), a single FRT, and the homologous guide fragment remained in the chromosome, as described in further detail below.
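The selection logic of the conditional replicons can be summarized in a minimal sketch. The function names and the 30°C cutoff are simplifying assumptions made for illustration: the text only states that pSC101-based vectors are maintained at 30°C and lost at 37-42°C, and that R6K-γ vectors replicate in hosts supplying the pir protein (such as S17-1) but not in non-permissive hosts.

```python
def plasmid_replicates(replicon: str, temp_c: float, host_has_pir: bool) -> bool:
    """Return True if a free plasmid with this replicon is expected to replicate.

    Simplified model: the thermosensitive pSC101 replicon is treated as
    permissive at or below 30 C; R6K-gamma depends only on host-supplied pir.
    """
    if replicon == "pSC101":
        return temp_c <= 30.0
    if replicon == "R6K-gamma":
        return host_has_pir
    raise ValueError(f"unknown replicon: {replicon}")


def resistance_implies_integration(replicon: str, temp_c: float, host_has_pir: bool) -> bool:
    """Under non-permissive conditions, resistant colonies must carry the
    marker in the chromosome, because the free plasmid cannot be maintained."""
    return not plasmid_replicates(replicon, temp_c, host_has_pir)


# Construction host (pir supplied) versus a non-permissive target host.
print(plasmid_replicates("R6K-gamma", 37.0, host_has_pir=True))
print(resistance_implies_integration("R6K-gamma", 37.0, host_has_pir=False))
print(resistance_implies_integration("pSC101", 42.0, host_has_pir=False))
```

The same reasoning explains why the helper plasmid pFT-A, which is grown at 30°C, is lost after growth at 37-42°C.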
EXAMPLE II
A Method for Engineering an Ethanologenic Host Cell by Chromosomal Integration of Heterologous DNA
In this example, methods for introducing heterologous DNA to engineer an ethanologenic organism are described.
To illustrate the utility of the foregoing vectors, heterologous genes were targeted into the genome of two different test microorganisms in order to generate a genetically engineered organism with a desired phenotype (i.e., ability to produce ethanol). In particular, derivatives of E. coli B (strain SE2272, Δfrd) and E. coli K-12 (SE2275, Δfrd) were constructed in which three heterologous genes were integrated immediately behind the endogenous adhE gene in the chromosome. The guide and passenger DNA were cloned into pLOI2403, a high copy plasmid vector described above. For this construction, the promoterless adhE coding region (guide) was amplified with Genosys ORFmer primers for the coding region (forward, 5' TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA 3'; reverse, 5' TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT 3') and cloned into pCR2.1-TOPO to produce pLOI2408 (see, for example, Fig. 3). After EcoRI digestion, the 2.6 Kbp adhE region from pLOI2408 was moved into the corresponding site in pLOI2403 to produce pLOI2413. The BamHI site immediately downstream from the 3' end of the adhE coding region was used to insert a 4.6 Kbp BamHI fragment from pLOI510 (Ohta, K. et al., Appl. Environ. Microbiol. 57: 893-900 (1991)) containing three genes (passenger): a promoterless Zymomonas mobilis pdc without transcriptional terminator and a promoterless Z. mobilis adhB with a transcriptional terminator, followed by a complete cat operon with promoter and terminator. In the resulting plasmid (pLOI2230), transcription of the heterologous genes was oriented concurrently with adhE. All constructs containing the Z. mobilis genes were grown in LB supplemented with glucose (20 g/L for plates; 50 g/L for broth). The 7.2 Kbp AscI fragment from pLOI2230 (high copy vector) containing adhE, the artificial operon pdc adhB, and cat was ligated into the low copy integration vector pLOI2224, which contains an R6K-γ replicon (λ pir-dependent), and transformed into the permissive host S17-1 (de Lorenzo, V., et al., J. Bacteriol. 172: 6568-6572 (1990)) with selection for kanamycin and chloramphenicol. The resulting clone containing pLOI2231 was used for large-scale plasmid isolation (500 ml) by the alkaline lysis procedure (Sambrook, J., Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989)).
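The fragment sizes quoted in this example can be cross-checked with simple arithmetic. The sketch below uses only the sizes stated in the paragraph above (a 2.6 Kbp adhE region, a 4.6 Kbp BamHI passenger fragment, and a 7.2 Kbp AscI fragment); the function name and the tolerance are hypothetical, and short flanking MCS sequence is ignored.

```python
def sizes_consistent(parts_kbp, reported_kbp, tolerance_kbp=0.1):
    """Check that component sizes add up to the reported fragment size."""
    return abs(sum(parts_kbp) - reported_kbp) <= tolerance_kbp


adhE_guide_kbp = 2.6       # EcoRI fragment moved from pLOI2408 into pLOI2403
passenger_kbp = 4.6        # BamHI fragment from pLOI510 (pdc, adhB, cat)
asci_fragment_kbp = 7.2    # AscI fragment of pLOI2230 moved into pLOI2224

ok = sizes_consistent([adhE_guide_kbp, passenger_kbp], asci_fragment_kbp)
print("guide + passenger matches AscI fragment:", ok)
```

A check of this kind is a quick sanity test when reading a gel; it does not replace sequencing or restriction mapping.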
Approximately 500 ng of pLOI2231 DNA was used for electroporation of SE2272 and SE2275. Both are non-permissive hosts. Recombinants were readily obtained by selection for either kanamycin (vector) or chloramphenicol (passenger) resistance. Up to 2 h was allowed for expression of the resistance gene prior to spreading on plates for selection. Approximately 1000 recombinants per 1 μg DNA (electroporation) were recovered using E. coli K-12 SE2275, 5-fold higher than that obtained with E. coli B SE2272. Thirty recombinants from each host were screened for the functional expression of alcohol dehydrogenase on indicator plates (Conway, T.C., et al., J. Bacteriol. 169:2591-2597 (1987)). Based on the rate and intensity of color development, these recombinants expressed higher levels of ADH activity than the respective unmodified SE2272 or SE2275, or S17-1(pLOI2231) harboring promoterless pdc and adhB genes. Unlike the control strains, these recombinants also exhibited a colonial phenotype (large raised colonies on LB containing glucose) that is typical for ethanologenic E. coli (Ingram, L.O., et al., Environ. Microbiol. 53: 2420-2425 (1987)). Small-scale DNA preparations (7 recombinants per host) were tested for the presence of pLOI2231. None contained plasmids as tested by gel filtration or based on transformation experiments using S17-1 as the host. These recombinants were presumed to contain chromosomally integrated genes. One clone from each parent, strains FM7 (E. coli B SE2272) and FM19 (E. coli K-12 SE2275), was selected for further study.
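The recovery figures above imply rough colony counts per electroporation, which can be worked out as follows. The function name is hypothetical, the inputs are the approximate values stated in the text, and the E. coli B efficiency is derived from the stated 5-fold difference rather than reported directly.

```python
def expected_recombinants(efficiency_per_ug: float, dna_ng: float) -> float:
    """Scale a per-microgram recovery figure to the amount of DNA used."""
    return efficiency_per_ug * dna_ng / 1000.0


k12_per_ug = 1000.0            # approximate value stated for E. coli K-12 SE2275
b_per_ug = k12_per_ug / 5.0    # derived: K-12 was about 5-fold higher than B
dna_ng = 500.0                 # approximate DNA used per electroporation

k12_colonies = expected_recombinants(k12_per_ug, dna_ng)
b_colonies = expected_recombinants(b_per_ug, dna_ng)
print(f"K-12 SE2275: ~{k12_colonies:.0f} recombinants")
print(f"B SE2272:    ~{b_colonies:.0f} recombinants")
```

Under these assumptions both hosts yield well over the thirty recombinants that were screened on indicator plates.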
Accordingly, strains FM7 and FM19 were transformed with the helper plasmid (pFT-A) carrying the FLP gene (Posfai, G., et al., J. Bacteriol. 179: 4426-4428 (1997)) and incubated at 30°C with selection for ampicillin resistance. A mixture of colonies was used to inoculate a broth culture for induction of FLP with autoclaved chlortetracycline (20 μg/ml). After 6 h incubation at 30°C, the culture was diluted 1:1000 in LB containing glucose and incubated at 42°C for 16 h to eliminate the helper plasmid. After streaking on solid medium, isolated colonies were screened for the absence of antibiotic markers. Approximately 80% of the colonies were ampicillin and kanamycin sensitive and retained only chloramphenicol resistance and the ethanologenic traits (passenger genes inserted into the MCS). Loss of ampicillin resistance indicated that the helper plasmid had been successfully eliminated, while loss of kanamycin resistance confirmed the FLP-recombinase-dependent deletion of the vector. These new derivatives of FM7 and FM19 were designated FM18 and FM20, respectively.
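The marker screen described above amounts to a simple decision rule, sketched below. The function name and the wording of the returned labels are hypothetical; the mapping of each resistance to its source (ampicillin from the helper plasmid pFT-A, kanamycin from the integrated vector, chloramphenicol from the passenger) follows the text.

```python
def interpret_colony(amp_resistant: bool, kan_resistant: bool, cam_resistant: bool) -> str:
    """Interpret the antibiotic phenotype of a colony after FLP induction."""
    if not cam_resistant:
        # The passenger carries cat, so sensitivity means it is absent.
        return "passenger absent"
    if amp_resistant:
        # Ampicillin resistance comes from the helper plasmid.
        return "helper plasmid retained"
    if kan_resistant:
        # Kanamycin resistance comes from the integrated vector backbone.
        return "vector not excised"
    return "vector excised, helper cured"


# The phenotype reported for roughly 80% of the screened colonies.
print(interpret_colony(amp_resistant=False, kan_resistant=False, cam_resistant=True))
```

Colonies in the last category correspond to the derivatives designated FM18 and FM20.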
PCR was used to verify the integration events in both FM18 and FM20. Two new sets of primers were designed to amplify the adhE gene including the unique junctions predicted for pdc (Fig. 3, primers 3 and 4) and cat (Fig. 3, primers 5 and 6) as a result of integration and FLP-mediated deletion. Forward primer 3 hybridizes to the promoter region of adhE while reverse primer 4 hybridizes to the N-terminal coding region of pdc. Forward primer 5 hybridizes to the C-terminal coding region of the cat gene and reverse primer 6 hybridizes to the 3' untranslated portion of adhE. Note that the primers used to clone adhE, forward primer 1 and reverse primer 2, hybridize to the N-terminal and C-terminal coding regions of adhE and lie inside the regions targeted by forward primer 3 and reverse primer 6. All primer sets (SE2272 template: 1+2, 3+6; FM18 template: 3+4, 5+6) generated products of the expected sizes (Fig. 4A). Identical results were also obtained using FM20 DNA as the template. In Fig. 3, the arrows numbered 1, 2, 3, 4, 5, and 6 represent the primers used to amplify the corresponding regions. The sequences of these primers are as follows: Primer 1. 5' TTGCTCTTCCATGGCTGTTACTAATGTCGCTGAA 3'; Primer 2. 5' TTGCTCTTCGTTAAGCGGATTTTTTCGCTTTTTTCT 3'; Primer 3. 5' GTGAGTGTGAGCGCGGAGT 3'; Primer 4. 5' TGGCACGAGCATAACCTTC 3'; Primer 5. 5' CAGTACTGCGATGAGTGGCA 3'; Primer 6. 5' GTTGCCAGACAGCGCTACT 3'. The adhE gene contains a single central BstEII site which does not occur elsewhere in the PCR products. This site was used to verify the identity of the PCR fragments. As shown in Fig. 4B, all PCR products were cut once to produce fragments containing the N-terminal and C-terminal regions of adhE. Fragments from the adhE coding region alone (primers 1+2) were smaller (N-terminal fragment = 1,226 bp; C-terminal fragment = 1,470 bp) than fragments which included parts of the native adhE promoter (primers 3+4 and 3+6; N-terminal fragment = 1,325 bp) or adhE terminator (primers 3+6, 5+6; C-terminal fragment = 1,489 bp). The fragment which included part of pdc (primers 3+4) was the largest C-terminal fragment (1,783 bp). The fragment which included part of cat (primers 5+6) was the largest N-terminal fragment (1,988 bp).
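Because each PCR product is cut exactly once by BstEII, the two fragment sizes reported for a primer pair should add up to the full-length product size given in the description of Fig. 4. The sketch below checks that arithmetic using only sizes quoted in this specification. The data layout and function names are illustrative assumptions, and the check does not replace inspection of the gel.

```python
# Sizes in bp as quoted in the text and in the description of Fig. 4.
# The dictionary layout and function names are illustrative only.

PRODUCTS = {
    "1+2": {"template": "SE2272", "n_term": 1226, "c_term": 1470, "full": 2696},
    "3+6": {"template": "SE2272", "n_term": 1325, "c_term": 1489, "full": 2814},
    "3+4": {"template": "FM18", "n_term": 1325, "c_term": 1783, "full": 3108},
    "5+6": {"template": "FM18", "n_term": 1988, "c_term": 1489, "full": 3477},
}


def digest_is_consistent(product: dict) -> bool:
    """A single cut must yield two fragments that sum to the uncut length."""
    return product["n_term"] + product["c_term"] == product["full"]


def flank_added(outer: int, inner: int) -> int:
    """Length gained when an outer primer replaces the inner cloning primer."""
    return outer - inner


checks = {pair: digest_is_consistent(p) for pair, p in PRODUCTS.items()}
promoter_flank = flank_added(PRODUCTS["3+6"]["n_term"], PRODUCTS["1+2"]["n_term"])
terminator_flank = flank_added(PRODUCTS["3+6"]["c_term"], PRODUCTS["1+2"]["c_term"])

for pair, ok in checks.items():
    print(f"primers {pair}: fragments sum to full length = {ok}")
print(f"promoter-side difference: {promoter_flank} bp")
print(f"terminator-side difference: {terminator_flank} bp")
```

All four primer pairs pass the check. The N-terminal fragment is 99 bp longer when primer 3 is used in place of primer 1, and the C-terminal fragment is 19 bp longer when primer 6 is used in place of primer 2.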
In Fig. 4, panel A shows full-length PCR products of adhE junctions. Positions and sequences of primers are provided in Fig. 3. The lanes in panel A of Fig. 4 represent: lane 1, HindIII digest of phage λ DNA; lane 2, adhE coding region (2,696 bp) of SE2272 amplified using forward primer 1 and reverse primer 2; lane 3, adhE promoter and 3' untranslated sequence (2,814 bp) of SE2272 amplified using forward primer 3 and reverse primer 6; lane 4, the adhE and pdc junction (3,108 bp) of FM18 amplified using forward primer 3 and reverse primer 4; lane 5, the cat junction and adhE (3,477 bp) of FM18 amplified using forward primer 5 and reverse primer 6. The amplification conditions for primer pairs 1 + 2 and 3 + 6 (25 cycles) were: 45 sec at 94°C, 45 sec at 60°C, 60 sec at 72°C. The amplification conditions for primer pairs 3 + 4 and 5 + 6 (30 cycles) were: 45 sec at 94°C, 60 sec at 60°C. DNA was held at 94°C for 3 min prior to the first cycle. The elongation time was increased to 10 min during the final cycle. In Fig. 4, panel B shows a BstEII digestion of the above-described PCR products. A single, central BstEII site was used to cleave PCR products containing adhE into N-terminal and C-terminal fragments. The lanes in panel B of Fig. 4 show the following: lane 1 contains a DNA standard (descending: 3.0, 2.0, 1.5, 1.2, 1.0 Kbp); lanes 2 through 5 contain the BstEII-digested PCR products described for lanes 2 through 5 of panel A, respectively.
The expression of adhE is regulated by a number of factors in E. coli including cra, adhR, and the abundance of NADH (Leonardo, M.R., et al., J. Bacteriol. 175: 870-878 (1993); Leonardo, M.R., et al., J. Bacteriol. 178: 6013-6018 (1996); Mikulskis, A., et al., J. Bacteriol. 179: 7129-7134 (1997)). Both message levels and activity are approximately 10-fold higher during anaerobic growth with glucose than during aerobic growth. Since the Z. mobilis genes are integrated behind the adhE coding region to form an operon fusion, expression of pdc should also increase in response to anaerobiosis. Strains FM18 and FM20 were grown in Luria broth containing 50 g glucose/liter under aerobic and anaerobic conditions. PDC activities were determined in heat-treated preparations to eliminate lactate dehydrogenase and other confounding activities (Conway, T., et al., J. Bacteriol. 169: 949-954 (1987)). Under anaerobic conditions, PDC activities in FM18 and FM20 were 0.254 U/mg protein and 0.185 U/mg protein, respectively, approximately 4-fold higher than the activities observed in cells grown under aerobic conditions.
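The aerobic PDC activities are not reported, but the stated ratio allows a rough back-calculation. The sketch below treats "approximately 4-fold" as exactly 4, which is an assumption, so the resulting values are illustrative estimates and not measurements. All names are hypothetical.

```python
# Anaerobic PDC activities (U/mg protein) are from the text. Aerobic values
# are NOT reported; they are back-calculated from the stated ~4-fold ratio.

ANAEROBIC_PDC = {"FM18": 0.254, "FM20": 0.185}
FOLD_ANAEROBIC_OVER_AEROBIC = 4.0   # "approximately 4-fold", treated as exact


def implied_aerobic(anaerobic: float, fold: float) -> float:
    """Aerobic activity implied by an anaerobic value and a fold increase."""
    return anaerobic / fold


implied = {
    strain: implied_aerobic(activity, FOLD_ANAEROBIC_OVER_AEROBIC)
    for strain, activity in ANAEROBIC_PDC.items()
}

for strain, value in implied.items():
    print(f"{strain}: implied aerobic PDC activity of roughly {value:.3f} U/mg protein")
```

The observed induction of pdc (about 4-fold) is in the same direction as, but smaller than, the approximately 10-fold anaerobic induction cited for adhE itself.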
The above results demonstrate that the new integration vectors can be used to place promoterless genes under the control of a chromosomal promoter in a site-specific fashion. Moreover, after the integration event, unnecessary nucleic acid encoding a replicon and selectable marker was removed in vivo using a recombining site-specific recombinase. This approach avoids potential problems of lethality or mutation due to unregulated expression in plasmids during construction and integration. These vectors can also be used to replace promoters in chromosomal genes. Additional unique restriction sites are available for the insertion of genes which can be temporarily expressed after integration and subsequently deleted along with the replicon and selectable marker. This option, the temporary introduction of new genes, may be useful for testing new traits in an isogenic background.
Although the vectors described must be propagated in E. coli, they are potentially useful with other organisms. The FLP recombinase is extremely efficient (Wild, J., et al., Gene 179:181-188 (1996)) and could be produced intracellularly as a transient expression product after transformation or electroporation of pFT-A.
Equivalents
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims. Moreover, any number of genetic constructs, host cells, and methods described in United States Patent Nos. 5,821,093; 5,482,846; 5,424,202; 5,028,539; 5,000,000; 5,162,516; and U.S. patent application serial No. 60/136,376 may be employed in carrying out the present invention and are hereby incorporated by reference.