WO2010139797A1 - A method for improving gene expression - Google Patents

A method for improving gene expression

Publication number
WO2010139797A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
protein
host cell
fluorescent protein
gfp
Prior art date
Application number
PCT/EP2010/057856
Other languages
French (fr)
Inventor
Charles Dorman
Colin Corcoran
Original Assignee
The Provost, Fellows And Scholars Of The College Of The Holy And Undivided Trinity Of Queen Elizabeth Near Dublin
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Provost, Fellows And Scholars Of The College Of The Holy And Undivided Trinity Of Queen Elizabeth Near Dublin

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67: General methods for enhancing the expression
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/43504: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates
    • C07K14/43595: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from invertebrates from coelenteratae, e.g. medusae

Definitions

  • the present invention is directed to a method for improving and/or monitoring gene expression in a host cell comprising a protein encoding nucleic acid.
  • the invention is also directed to a modified protein encoding nucleic acid, protein, expression system, plasmid vector and host cell.
  • the protein encoding nucleic acid is a fluorescent protein nucleic acid
  • Host cells include bacteria (such as E. coli, B. subtilis etc.), yeast (such as S. cerevisiae) or eukaryotic cell lines.
  • Conventional DNA sources and delivery mechanisms include viruses (such as baculovirus, retrovirus, adenovirus), plasmids, artificial chromosomes and bacteriophage (such as lambda).
  • the expression system used will depend on the gene involved, for example Saccharomyces cerevisiae is often selected for proteins that require significant post-translational modification. Insect or mammalian cell lines may be used when human-like splicing of the mRNA is required. Additionally, bacterial expression has the advantage of easily producing large amounts of protein. E. coli is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector.
  • genes from many organisms can be placed under the control of an inducible promoter in a fast growing organism such as Saccharomyces cerevisiae or E. coli. This facilitates the expression of the gene product to very high levels (often >50% of total cell protein).
  • the gene product can then be purified to homogeneity.
  • An example of this is the expression of restriction endonucleases from multiple organisms in E. coli, which are then purified and used in vitro to digest DNA.
  • the genes encoding the restriction endonucleases PstI, Hpy99I and PsiI from Providencia stuartii 164, Helicobacter pylori J99 and Pseudomonas species SE-G49, respectively, are all cloned into E. coli for over-expression and purification.
  • the present invention is directed to improving gene expression in host cells.
  • the discovery of green fluorescent protein in the early 1960s started a new era in cell biology by enabling investigators to apply molecular cloning methods, fusing the fluorophore moiety to a wide variety of protein and enzyme targets, in order to monitor cellular processes in living systems using optical microscopy and related methodology.
  • the green fluorescent protein and its colour-shifted genetic derivatives have demonstrated invaluable service in many thousands of live-cell imaging experiments.
  • the Green Fluorescent Protein was originally isolated in the 1960s from the jellyfish Aequorea victoria; it is encoded by the gfp gene and fluoresces green when exposed to blue light.
  • GFP is composed of 238 amino acids (26.9 kDa).
  • GFP has a typical beta barrel structure, consisting of one beta-sheet with one or more alpha helices containing the chromophore running through the center. Inward facing sidechains of the barrel induce specific cyclization reactions in the tripeptide Ser65-Tyr66-Gly67 that lead to chromophore formation. The hydrogen bonding network and electron stacking interactions with these sidechains influence the colour of wild type (wt) GFP and its numerous derivatives.
  • the GFP from A. victoria has a major excitation peak at a wavelength of 395 nm and a minor one at 475 nm. Its emission peak is at 509 nm which is in the lower green portion of the visible spectrum.
  • GFP mutants were studied in order to find a protein with better characteristics. Indeed, the first reported crystal structure of a GFP, published in the mid-1990s, was that of the S65T mutant (a single point mutation). This mutation dramatically improved the spectral characteristics of GFP, resulting in increased fluorescence, photostability and a shift of the major excitation peak to 488 nm with the peak emission kept at 509 nm. This matched the spectral characteristics of commonly available FITC filter sets, increasing the practicality of use by the general researcher.
  • In addition to the first single amino acid substitution, S65T, researchers have modified the GFP residues by directed and random mutagenesis to produce the wide variety of GFP derivatives in use today. For example, adding a 37 °C folding efficiency point mutation (F64L) to this scaffold yielded enhanced GFP (EGFP), discovered in 1995 by Ole Thastrup. EGFP allowed for the use of GFPs in mammalian cells. EGFP has an extinction coefficient of 55,000 M⁻¹ cm⁻¹. Another variant, superfolder GFP, carrying a series of mutations that allow GFP to rapidly fold and mature even when fused to poorly folding peptides, was reported in 2006.
  • EBFP - blue fluorescent protein
  • ECFP Cerulean, CyPet
  • BFP derivatives (except mKalama1) contain the Y66H substitution.
  • the critical mutation in cyan derivatives is the Y66W substitution, which causes the chromophore to form with an indole rather than phenol component.
  • Several additional compensatory mutations in the surrounding barrel are required to restore brightness to this modified chromophore due to the increased bulk of the indole group.
  • the red-shifted wavelength of the YFP derivatives is accomplished by the T203Y mutation and is due to π-electron stacking interactions between the substituted tyrosine residue and the chromophore.
  • FRET fluorescence resonance energy transfer
  • pH-sensitive mutants known as pHluorins, and later super-ecliptic pHluorins.
  • pHluorins tagged to synaptobrevin have been used to visualize synaptic activity in neurons.
  • roGFP Redox sensitive versions of GFP
  • V163A from GFPuv (Crameri et al., Nat Biotechnol 14 (1996), 315-9) were combined with mutations in the chromophore of GFPmut1 (F64L and S65T) (Cormack et al., 1996) to create the GFP+ protein which was 320 times more fluorescent than the original GFP (Scholz et al. Eur. J. Biochem. 267 (2000), 1565-70).
  • the GFPmut1 protein is identical with EGFP, which is commercially available and widely used even though only about 20% of EGFP/GFPmut1 is correctly folded at 37 °C (Tsien, Annu Rev Biochem 67 (1998), 509-44).
  • GFP+ was necessary to allow the measurement of gene activity in the native location (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491).
  • DsRed is a recently cloned 28-kDa fluorescent protein isolated from coral of the Discosoma genus. DsRed has an emission maximum of 583 nm, which can be further extended to 602 nm by mutation of Lys-83 to Met. This emission spectrum makes it ideal as a fluorescent partner for GFP, as fluorescence of both proteins could be individually measured in a single cell (Baird et al., Proc. Natl. Acad. Sci. USA 97 (2000), 11984-11989). Similar to the early studies of GFP, a number of the limitations of this protein are presently being addressed through random mutagenesis of the coding sequence and screening for improved variants. The major issues with DsRed include the long maturation time before a red signal is detected and the fact that it forms tetramers, both undesirable characteristics for a transcriptional/translational fusion.
  • Because of its easily detectable green fluorescence, GFP from Aequorea has been used widely to study gene expression and protein localization. Furthermore, GFP, like other fluorescent proteins, does not require a substrate or cofactor to fluoresce; hence, it is possible to directly express GFP and use it as a reporter in numerous species and in a wide variety of cells.
  • fluorescent proteins and mutants or variants thereof can be introduced into organisms and maintained in their genome through breeding, injection with a viral vector, or cell transformation of either linear or circular DNA.
  • the GFP gene has been introduced and expressed in many bacteria, yeast and other fungi, plant, fly, and mammalian cells, including human.
  • GFP and other related fluorescent proteins and related mutants/variants as described above are now used routinely as reporters of gene expression in all types of cells. They are an essential tool for biologists and work is continually being carried out on developing improved variants to overcome inherent limitations of the native proteins.
  • the present invention is directed to improving the efficacy of fluorescent protein genes for use as a molecular tool, in particular when used in gene expression systems.
  • a method for improving gene expression in a host cell comprising a protein encoding nucleic acid comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type protein encoding nucleic acid or mutant thereof; preparing an altered protein encoding nucleic acid by modifying the A and T nucleotide content of the wild type protein encoding nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered protein encoding nucleic acid is reduced compared to the wild type protein encoding nucleic acid or mutant thereof; and using the altered protein encoding nucleic acid in host cell gene expression systems.
  • the modified A and T nucleotide content of the altered protein encoding nucleic acid results in reduced affinity, compared to the wild type protein encoding nucleic acid or mutant thereof, to host cell transcriptional repressor proteins.
  • a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid comprising assessing the A and T nucleotide content of the fluorescent protein nucleic acid; and modifying the A and T nucleotide content of the fluorescent protein nucleic acid to equal or lower the A and T nucleotide content of the host cell.
  • a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid or mutant thereof comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type fluorescent protein nucleic acid or mutant thereof; preparing an altered fluorescent protein nucleic acid by modifying the A and T nucleotide content of the wild type fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered fluorescent protein nucleic acid is reduced compared to the wild type fluorescent protein nucleic acid or mutant thereof; and using the altered fluorescent protein nucleic acid in a host cell gene expression system.
  • a modified fluorescent protein nucleic acid comprising a sequence encoding a wild type fluorescent protein or mutant thereof with a lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the protein has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • a fluorescent protein encoded by the modified nucleic acid of the invention is provided.
  • an expression system comprising the modified nucleic acid sequence or fluorescent protein according to the invention, preferably for use in a host cell such as Escherichia coli.
  • a plasmid vector comprising the modified nucleic acid sequence or fluorescent protein according to the invention, preferably for use in a host cell such as Escherichia coli.
  • a host cell comprising the modified nucleic acid sequence, fluorescent protein, plasmid vector, expression system of the invention.
  • a method of monitoring gene expression in a host cell using the modified nucleic acid, fluorescent protein, expression system or host cell of the invention is provided.
  • the terminology "equal to or lower than the A and T nucleotide residue content of a host cell” covers an A and T residue content equal to or within +/- 15%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, preferably +/- 5%, +/- 4%, +/- 3%, +/- 2%, more preferably +/-1%, +/- 0.5%, +/- 0.1%, of the A and T residue content of the host cell.
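The tolerance bands listed above can be illustrated with a small check. This is only a sketch: it assumes the "+/-" figures are read as percentage points around the host value, which the text does not state explicitly, and the function name `classify_at_content` is hypothetical.

```python
def classify_at_content(gene_at_pct: float, host_at_pct: float,
                        tolerance_pct: float = 5.0) -> str:
    """Compare a gene's A/T content with the host's A/T content.

    Assumption: the tolerance is an absolute band in percentage points
    (e.g. host 50% with tolerance 5 means the band 45% to 55%).
    """
    diff = gene_at_pct - host_at_pct
    if abs(diff) <= tolerance_pct:
        return "within tolerance"
    return "lower" if diff < 0 else "higher"


# Figures quoted in the text: gfp+ is about 59% A/T, E. coli about 50%.
print(classify_at_content(59.0, 50.0, tolerance_pct=5.0))
```

Under this reading, a gene counts as acceptable when the result is "within tolerance" or "lower"; a different reading of the tolerance (for example a relative percentage) would need a different comparison.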
  • DNA is made up of combinations of 4 nucleotides, distinguished by their bases: adenine (A), thymine (T), guanine (G) and cytosine (C). Each species uses each of these nucleotides in differing ratios. A and T are grouped because an A-T base pair forms only 2 hydrogen bonds, while a G-C base pair forms 3. It is known, for example, that approximately 50% of the genome of Escherichia coli (a Gram-negative bacterium), a typical host cell, is an A or a T nucleotide (A/T). In addition, it is known that the A/T nucleotide content of the fluorescent protein gene gfp+, for example, is approximately 59%, i.e. it is "A/T rich".
  • "A/T rich" or "A/T nucleotide rich" is used to convey an A/T nucleotide content higher than the average A/T residue content of any suitable host cell, including E. coli and many other host cells.
  • Other proteins and fluorescent proteins will have different A/T nucleotide contents and may or may not be A/T rich compared to a host cell.
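As an illustration of the assessment step, the A/T content of a coding sequence can be computed directly from its sequence string. The following is a minimal sketch; the function names and the default host value of 50% (the approximate E. coli figure quoted above) are assumptions made for the example, and the test sequence is an arbitrary toy string, not a real gene.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    at = seq.count("A") + seq.count("T")
    return 100.0 * at / len(seq)


def is_at_rich(seq: str, host_at_pct: float = 50.0) -> bool:
    """True if the sequence has a higher A/T content than the host average."""
    return at_content(seq) > host_at_pct


toy = "ATGAAATTTGCGCATTAA"  # arbitrary toy sequence for illustration only
print(f"A/T content: {at_content(toy):.1f}%  A/T rich: {is_at_rich(toy)}")
```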
  • the approximate A and T nucleotide content of the following Gram-positive and negative bacteria is given on the table below:
  • reducing the intrinsic curvature of the altered protein encoding nucleic acid ideally means lowering the curvature amplitude compared to the wild type protein encoding nucleic acid or mutant thereof.
  • intrinsic curvature should be reduced to less than approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn.
  • all regions of intrinsic curvature greater than approximately 7° per helical turn in the fluorescent protein gfp should be reduced to approximately 7° per helical turn or less.
  • all regions of intrinsic curvature greater than approximately 9° per helical turn in the fluorescent protein dsred should be reduced to approximately 9° per helical turn or less.
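Once a curvature profile is available, locating the regions that exceed a threshold is a simple scan. The sketch below assumes the profile has already been computed by some external curvature-prediction tool (the text does not name one) and is given as one value, in degrees per helical turn, per position; the function name and the sample values are hypothetical.

```python
from typing import List, Tuple


def regions_above(profile: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every contiguous
    run of positions whose curvature value is strictly above the threshold."""
    regions = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold:
            if start is None:
                start = i          # a new region begins here
        elif start is not None:
            regions.append((start, i))  # the region ended at the previous position
            start = None
    if start is not None:
        regions.append((start, len(profile)))
    return regions


# Made-up profile values, checked against the 7 degree threshold given for gfp.
sample_profile = [3.0, 8.0, 9.5, 5.0, 10.0]
print(regions_above(sample_profile, threshold=7.0))
```

Each reported region is a candidate for further sequence modification until its curvature falls to the threshold or below.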
  • codon optimization to reduce the A/T content of a gene to match that of a G/C rich genome will make it a poor target for H-NS
  • similar codon optimization for an A/T rich genome may increase the affinity of H-NS for the target gene. Therefore, to avoid H-NS mediated complications when optimizing genes for expression in A/T rich genomes, it may be advantageous to optimize genes using a codon-table derived from highly expressed genes of the organism of interest that have a reduced A/T content compared to the average A/T content of the genome.
  • the optimal level of A/T content required to avoid H-NS mediated complications in an A/T rich genome can then be determined empirically.
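The idea of lowering A/T content without changing the encoded protein can be sketched with synonymous codon substitution. The example below is deliberately naive and is not the optimization procedure of the invention: it assumes the standard genetic code and simply picks, for each amino acid, a synonymous codon with the fewest A/T bases, ignoring host codon usage and intrinsic curvature. A practical version would draw on a codon table from highly expressed host genes, as described above. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def at_count(codon: str) -> int:
    """Number of A and T bases in a codon."""
    return codon.count("A") + codon.count("T")


# For each amino acid (and stop, '*'), keep one synonymous codon with the
# fewest A/T bases. Ties are broken by table order, an arbitrary choice.
LOW_AT_CODON = {}
for codon, aa in CODON_TABLE.items():
    if aa not in LOW_AT_CODON or at_count(codon) < at_count(LOW_AT_CODON[aa]):
        LOW_AT_CODON[aa] = codon


def recode_low_at(cds: str) -> str:
    """Replace every codon with a low-A/T synonymous codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(LOW_AT_CODON[CODON_TABLE[c]] for c in codons)


toy_cds = "ATGAAATTTTAA"  # toy sequence: Met-Lys-Phe-stop
print(recode_low_at(toy_cds))
```

Because this sketch always picks the single lowest-A/T codon, it can overshoot the target and produce an unnaturally G/C rich gene; the text instead calls for matching or slightly undercutting the host A/T content and determining the optimal level empirically.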
  • host cells are not limited to bacteria, and include all suitable prokaryotic host cells, such as bacterial cells, and eukaryotic host cells, such as yeast cells and mammalian cells (including human, primate and rodent cells). Furthermore, the host cells of the invention will be understood to express transcriptional repressor proteins, such as nucleoid-associated transcriptional repressor proteins and typically H-NS or H-NS-like proteins.
  • H-NS transcriptional repressor proteins
  • H-NS-like proteins include but are not limited to the bacterial proteins Sfh, StpA, Hha, YdgT, MvaT, MvaU, Lsr2, BpH3.
  • Such transcriptional repressor proteins and repressor-like proteins are also present in mammalian cells.
  • protein encoding nucleic acid covers, but is not limited to fluorescent proteins.
  • Non-fluorescent proteins which are over-expressed in host cells for purification are also contemplated.
  • Such proteins include all suitable medical protein products (vaccines, hormones (insulin, growth hormone), clotting factors, cytokines etc) and enzymes for example. This is a non-exhaustive list.
  • fluorescent protein covers, but is not limited to, fluorescent proteins such as GFP, YFP, CFP, BFP and DsRed and also covers any potential mutants or variants thereof. There is an extensive range of known variants and mutants of both GFP and DsRed including a wide variety of single or double amino acid substitutions. For convenience, where the term mutant is referred to in the following text, it will be understood that this term also covers fluorescent protein variants.
  • gfpT (or gfp^T) may also be referred to interchangeably as gfpTCD (or gfp^TCD).
  • the protein encoding nucleic acid may be present in an extrachromosomal vector, such as a plasmid.
  • the protein encoding nucleic acid may be integrated into the host cell genome.
  • transcriptional repressor proteins such as H-NS proteins
  • H-NS proteins transcriptional repressor proteins
  • the invention is applicable to any protein for over-expression and subsequent purification in a host cell.
  • proteins include fluorescent proteins and non- fluorescent protein products.
  • the protein encoding nucleic acid may encode many medically useful protein products such as vaccines, hormones (insulin, growth hormone), clotting factors, cytokines etc and enzymes for example.
  • the protein encoding nucleic acid may encode various restriction endonucleases, such as PstI, Hpy99I and PsiI from Providencia stuartii 164, Helicobacter pylori J99 and Pseudomonas species SE-G49, respectively. Many other protein products may be contemplated.
  • Helicobacter pylori J99 has an average genomic A/T content of 61%, which is the same A/T content as the fluorescent protein, gfpmut2 ( Figure 5B).
  • This high average A/T content of H. pylori, for example, means it will have a large number of genes with a high A/T content. Thus, due to their high A/T content, it is to be expected that when these genes are placed in E. coli, they are highly likely to be targeted by H-NS.
  • the present invention provides a method for preventing this H-NS interaction by reducing the A/T nucleotide content to equal to or lower than that of the host cell, such as E. coli.
  • the host cell is a bacterium, preferably a gram-negative bacterium, more preferably Escherichia coli.
  • Other host cells may be contemplated including suitable prokaryotic and eukaryotic cells, such as yeast, insect, mammalian, primate and rodent cells.
  • a modified protein encoding nucleic acid comprising a sequence encoding a wild type protein or mutant thereof with an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type protein encoding nucleic acid or mutant thereof.
  • the resultant nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type protein encoding nucleic acid or mutant thereof.
  • the invention is also directed to a modified protein encoding nucleic acid of the invention and protein, expression system, plasmid vector and host cell comprising the modified protein encoding nucleic acid.
  • a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid comprising assessing the A and T nucleotide content of the fluorescent protein nucleic acid or mutant thereof; and modifying the A and T nucleotide content of the fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell.
  • the A and T residue content of the fluorescent protein nucleic acid may be equal to or within +/- 15%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, preferably +/- 5%, +/- 4%, +/- 3%, +/- 2%, more preferably +/-1%, +/- 0.5%, +/- 0.1%, of the A and T residue content of the host cell.
  • the modified fluorescent protein nucleic acid or mutant thereof can then be used as a molecular tool in gene expression systems and host cells.
  • a method for improving gene expression in a host cell comprising a fluorescent protein nucleic acid or mutant thereof comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type fluorescent protein nucleic acid or mutant thereof; preparing an altered fluorescent protein nucleic acid by modifying the A and T nucleotide content of the wild type fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered fluorescent protein nucleic acid is reduced compared to the wild type fluorescent protein nucleic acid or mutant thereof; and using the altered fluorescent protein nucleic acid in host cell gene expression systems.
  • the crucial aspect of this invention is to modify the fluorescent protein nucleic acid A/T nucleotide content and/or intrinsic curvature to be lower than the A/T nucleotide content of the host cell.
  • the resultant fluorescent protein nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein encoding nucleic acid or mutant thereof. Unexpectedly, this ensures improved fluorescent protein expression of a structurally identical fluorescent protein.
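Since the altered nucleic acid must still encode a structurally identical protein, a useful sanity check is to translate both sequences and compare the results. The sketch below assumes the standard genetic code and in-frame coding sequences; the function names are hypothetical and the sequences are toy examples.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; stop codons become '*'."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encodes_same_protein(original: str, altered: str) -> bool:
    """True if both coding sequences translate to the same amino acid sequence."""
    return translate(original) == translate(altered)


original = "ATGAAATTTTAA"  # toy sequence: Met-Lys-Phe-stop
altered = "ATGAAGTTCTAG"   # synonymous recoding with fewer A/T bases
print(encodes_same_protein(original, altered))
```

A check of this kind only confirms that the amino acid sequence is unchanged; it says nothing about expression level or H-NS binding, which the text indicates must be assessed experimentally.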
  • this method reduces the potential detrimental impact of a fluorescent protein nucleic acid or mutant thereof on gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid.
  • transcriptional repressor proteins such as nucleoid-associated transcriptional repressor proteins and typically H-NS or H-NS like proteins, targets fluorescent protein genes, such as the gfp gene or mutants or variants thereof.
  • the transcriptional repressor proteins namely the nucleoid-associated protein H-NS, are molecules that are abundant in E. coli and related organisms and are powerful repressors of transcription.
  • H-NS can create bridges between different DNA molecules or between different parts of the same DNA molecule and can also nucleate along DNA.
  • H-NS is essentially a protein with the ability to slow or block gene expression. Thus, these H-NS proteins interfere with gene expression at the level of transcription.
  • GFP is commonly used as a transcriptional reporter gene in host cells.
  • H-NS interferes with the expression of the GFP protein in host cells such as bacteria. This discovery is unexpected and we postulate that many of the experiments carried out using GFP previously, in E. coli and related bacteria, are likely to have been complicated by the unsuspected participation of H-NS.
  • the binding of the H-NS protein to the gfp+ gene has the potential to undermine its fidelity as a reporter of gene expression. This can mislead investigators, perhaps causing them to underestimate the levels at which genes of interest are expressed or it may complicate their study by introducing a new and unsuspected layer of gene regulation and artefacts into the system under examination.
  • H-NS binding and oligomerization causes the formation of new topological domains, which can lead to both local transcriptional repression and in addition, genome wide changes in gene expression.
  • gfp transcriptional and translational fusions are located on multicopy plasmids, the possibility exists that global gene expression could be altered by the binding of H-NS in gfp, depleting the amount of free H-NS in the cell and thus leading to altered global gene expression.
  • H-NS also has the potential to modify gene expression in a host cell by upregulating/downregulating various local/distal genes.
  • the gene encoding GFP has a high A and T nucleotide content and DNA with this nucleotide content is preferentially bound by the H-NS protein.
  • This repression mechanism frequently involves cross-linking of DNA molecules by H-NS to either distant DNA molecules or different parts of the same molecule to form DNA-H-NS-DNA bridges.
  • These DNA-H-NS-DNA bridges block the free movement of RNA polymerase and interfere with the process of transcription.
  • the present invention addresses all of the above-mentioned aspects to prevent H-NS binding and the associated H-NS mediated transcriptional silencing, resulting in an improved fluorescent protein for use in known expression systems/host cells.
  • the invention focuses on the optimization of gfp for expression in bacterial cells that express H-NS or related proteins. This modified fluorescent protein gene is thus modified to circumvent undesirable interference by H-NS.
  • the teachings of the invention in relation to altering A and T nucleotide content are widely applicable. These teachings are applicable to any fluorescent proteins and host cell transcriptional repressor proteins. It will also be understood the concept of the invention is equally applicable to a fluorescent (GFP) protein which is not identical in characteristics/properties to the fluorescent (GFP) protein or mutant/variant thereof that it is derived from.
  • GFP fluorescent
  • DNA curvature or intrinsic curvature is related to the nucleotide content of DNA and refers to the curve of the DNA that is caused solely by the nucleotide content. This is influenced by the A/T content (since As and Ts have a higher internal bend angle leading to higher deflection from linear DNA) and also the sequence - there are certain combinations of nucleotides, including Gs and Cs, that can lead to increased bending of the DNA.
  • intrinsic curvature is used to distinguish sequence determined bends from bends that are introduced by DNA binding proteins. Promoter regions tend to be A/T rich to allow easy strand separation of the DNA.
  • Intrinsic curvature is a good measure/indicator of whether the modified nucleic acid has affinity to such transcriptional repressor proteins.
  • An intrinsic curvature of low amplitude (for example below approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn) is desirable.
  • the present invention involves modifying the A/T nucleotide residue content to equal or lower the average A and T nucleotide content of the host cell, carrying out codon optimization as needed and monitoring and modifying as necessary the intrinsic curvature of the nucleic acid to achieve an intrinsic curvature of low amplitude.
  • the present invention is directed to a method for improving gene and/or protein expression in a host cell.
  • a modified fluorescent protein gene which is impervious to interference by the transcriptional repressor proteins, such as H-NS, but which expresses an unaltered fluorescent protein.
  • the fluorescent gene of the invention accurately reports transcriptional activity, due to the removal of the repressive effects of H-NS.
  • the new fluorescent gene of the invention can be used to monitor gene expression in bacteria such as E. coli without interference from H-NS binding to ensure that gene/protein expression in a host cell can now be conducted free from the undesirable complications that arise from transcriptional repressor proteins potentially binding to the fluorescent protein gene, with its associated bridging activity interfering with the faithful expression of that gene.
  • the results should be more physiologically-relevant and free from artefacts caused by H-NS interference.
  • lowering the A and T nucleotide content of a fluorescent protein in regions proximal to the promoter region and/or the ribosome binding site may improve gene/protein expression in a host cell.
  • the high A/T content of the gfp+ gene may have had a direct role on transcription (independent of H-NS) since the location of A/T rich regions close to promoter regions can lead to reduced opening of the promoter region and thus reduced access for RNA polymerase and reduced transcription of the gene.
  • energy usually supplied by underwound or "negatively supercoiled" DNA
  • the region requiring the lowest amount of energy separates first, using up the superhelical energy. Since A and T nucleotides only form 2 hydrogen bonds (G and C form 3), A/T rich regions become single stranded before regions with a lower A/T percentage.
  • the presence of the A/T rich gfp+ gene proximal to a promoter region could influence the amount of superhelical energy (specifically, the amount of superhelical twist in the DNA) in the system and thus affect transcription of the gene of interest.
  • This aspect is relevant to all A/T rich fluorescent proteins even if they are not bound by H-NS. In this situation the A and T nucleotide content should be assessed and modified accordingly.
  • the fluorescent protein nucleic acid is modified so that it is no longer A and T nucleotide rich (AT-rich) compared to the host cell nucleic acid average A and T nucleotide content.
  • the A and T nucleotide content of the fluorescent protein nucleic acid is modified to result in reduced affinity, compared to the wild type fluorescent protein nucleic acid or mutant thereof, to host cell transcriptional repressor proteins.
  • the A and T nucleotide content of the fluorescent protein transcriptional repressor protein binding region nucleic acid is modified to equal or lower the A and T nucleotide content of the host cell transcriptional repressor protein nucleic acid binding region.
  • the fluorescent protein nucleic acid has an intrinsic curvature of lower amplitude (for example below approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn) than the wild type fluorescent protein nucleic acid.
  • the host cell is a bacterium, preferably a gram-negative bacterium, more preferably Escherichia coli.
  • Other host cells may be contemplated as described below.
  • the transcriptional repressor protein is a nucleoid-associated transcriptional repressor protein or repressor like protein, preferably H-NS or H-NS like proteins, such as Sfh and StpA.
  • the H-NS protein is a member of the family of nucleoid-associated proteins of bacteria. These proteins bind to DNA, regulate gene expression and organize the structure of the nucleoid (the part of the bacterial cell that contains the genetic material). H-NS is abundant (20,000 dimers per cell) and it binds to DNA sequences that have specific characteristics, namely regions of intrinsic curvature and a high A and T content. Each H-NS dimer has two DNA binding domains and this facilitates the construction of DNA-H-NS-DNA bridges. These bridges block the process of transcription (in which a gene is read by RNA polymerase) and silence the expression of genetic information.
  • the H-NS protein binds to DNA sequences throughout the chromosomes of bacteria that express it, such as Escherichia coli, Salmonella and Shigella.
  • the genes that it targets become silenced or down-regulated (Dorman, Nat. Rev. Microbiol. 5 (2007), 157-161).
  • H-NS is not the only DNA-protein-DNA bridge builder in biology. Such proteins are found in many cell types, including archaea and eukaryotes (Luijsterburg et al., Crit. Rev. Biochem. Mol. Biol. 43 (2008), 393-418). Importantly, a protein from the mouse has been shown to be able to substitute for H-NS in bacteria (Timchenko et al., EMBO J. 15 (1996), 3986-3992). Thus, the present invention relates to H-NS like proteins also.
  • the A and T nucleotide content of the entire fluorescent protein is modified compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the A and T nucleotide content of the fluorescent protein promoter region and/or ribosome binding site (RBS) nucleic acid is modified compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the A and T nucleotide content of the regions proximal to the fluorescent protein nucleic acid promoter region is modified to be equal to or lower than the A and T nucleotide content of the host cell nucleic acid.
  • the fluorescent protein of the invention may be a green fluorescent protein (GFP), YFP, CFP, BFP or red fluorescent protein (DsRed) or a mutant or variant thereof.
  • GFP green fluorescent protein
  • YFP yellow fluorescent protein
  • CFP cyan fluorescent protein
  • BFP blue fluorescent protein
  • DsRed red fluorescent protein
  • a modified fluorescent protein nucleic acid comprising a sequence encoding a wild type fluorescent protein or mutant thereof with an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the modified nucleic acid has reduced A and T nucleotide content across the entire length compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • the modified nucleic acid has reduced A and T nucleotide content in the regions proximal to the promoter region and/or ribosome binding site (RBS) of the fluorescent protein nucleic acid compared to the same regions of the wild type fluorescent protein nucleic acid or mutant thereof.
  • RBS ribosome binding site
  • the modified nucleic acid has an A and T nucleotide content equal to or lower than a host cell average A and T nucleotide content.
  • the host cell is a bacterium, preferably the Gram-negative bacterium Escherichia coli which has an A and T nucleotide content of approximately 50%.
  • the percentage of A and T nucleotides based on the full length modified nucleic acid sequence is from approximately 25% to 70% (for example the host cell Mycobacterium tuberculosis (a Gram-positive bacterium) has an approximate 35% A and T nucleotide content). Ideally, the A and T content is equal to or lower than that of the host cell.
  • the nucleic acid comprises the nucleic acid sequence of Figures 4B or 6B or a sequence with at least 70%, preferably 80%, more preferably 85%, more preferably 90%, more preferably 95%, even more preferably 99% homology over the entire length to the nucleic acid sequence of Figures 4B or 6B.
  • These are the "gfpT" and "DsRedT" genes of the following examples.
  • the gfpT gene contains 157 nucleotide changes compared to the gfp+ coding sequence, without altering the amino acid sequence of the protein. Both gfp genes are 717 base pairs (bp) and thus vary at the nucleotide level by greater than approximately 20%.
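The figures quoted above (157 changes over 717 bp, "greater than approximately 20%") can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original disclosure; the function names are illustrative, and the full gfp+ and gfpT sequences (Figures 4A and 4B) would have to be supplied by the reader to use `count_differences` on real data.

```python
def at_content(seq: str) -> float:
    """Fraction of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


def percent_changed(n_changes: int, length: int) -> float:
    """Nucleotide changes expressed as a percentage of sequence length."""
    return 100.0 * n_changes / length


# Values stated in the text: 157 changes in a 717 bp coding sequence.
print(round(percent_changed(157, 717), 1))  # about 21.9 percent
```

Under these numbers the two genes differ at roughly 21.9% of positions, which is consistent with the "greater than approximately 20%" stated in the text.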
  • the nucleic acid of the invention has improved transcription compared to the wild type fluorescent protein nucleic acid or mutant thereof.
  • a fluorescent protein encoded by the modified nucleic acid as described before is provided.
  • an expression system comprising the modified nucleic acid sequence or fluorescent protein as described above, preferably for use in a host cell such as Escherichia coli.
  • a plasmid vector comprising the modified nucleic acid sequence or fluorescent protein as described above, preferably for use in a host cell such as Escherichia coli.
  • a host cell comprising the modified nucleic acid sequence, fluorescent protein, plasmid vector, expression system as described above.
  • the A and T nucleotide content of the modified fluorescent protein nucleic acid is equal to or lower than the A and T nucleotide content of the host cell nucleic acid.
  • the fluorescent protein is a green fluorescent protein (GFP) or red fluorescent protein (DsRed).
  • the green fluorescent protein mutant may be selected from the following: a spectral variant, a pHluorin, a variant with an altered Stokes shift, an oligomerization variant, a folding variant, a photoactivatable variant, a photoconversion variant, a photoswitchable variant, a redox sensitive variant and/or gfp+.
  • the transcriptional repressor protein may be any nucleoid-associated repressor protein, preferably H-NS, repressor-like proteins, preferably H-NS like proteins such as Sfh and StpA.
  • a method of monitoring gene expression in a host cell using the modified nucleic acid, fluorescent protein, expression system or host cell of the invention is provided.
  • the fluorescent gene of the invention may be used in many commercial applications. For example, it may be used in GFP-based kits for the study of gene expression.
  • the modified nucleic acid of the invention may be inserted into a recombinant vector which may be any vector which may conveniently be subjected to recombinant DNA procedures.
  • a recombinant vector which may be any vector which may conveniently be subjected to recombinant DNA procedures.
  • the choice of vector will often depend on the host cell into which it is to be introduced.
  • the vector may be an autonomously replicating vector, i.e. a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g. a plasmid.
  • the vector may be one which, when introduced into a host cell, is integrated into the host cell genome and replicated together with the chromosome(s) into which it has been integrated.
  • the vector is preferably an expression vector in which the DNA sequence encoding the fluorescent protein of the invention is operably linked to additional segments required for transcription of the DNA.
  • the expression vector is derived from plasmid or viral DNA, or may contain elements of both.
  • operably linked indicates that the segments are arranged so that they function in concert for their intended purposes, e.g. transcription initiates in a promoter and proceeds through the DNA sequence coding for the fluorescent protein of the invention.
  • the promoter may be any DNA sequence which shows transcriptional activity in the host cell of choice and may be derived from genes encoding proteins either homologous or heterologous to the host cell, including native Aequorea GFP genes.
  • Suitable promoters for directing the transcription of the DNA sequence encoding the fluorescent protein of the invention in mammalian cells are the SV40 promoter (Subramani et al., Mol. Cell Biol. 1 (1981), 854-864), the MT-1 (metallothionein gene) promoter (Palmiter et al., Science 222 (1983), 809-814) or the adenovirus 2 major late promoter.
  • the fluorescent protein gene of the invention may also be placed in plasmid vectors designed for simple cloning of e.g. the promoter of interest (transcriptional fusion) or the gene of interest (translational fusion) or both. While transcriptional fusions report promoter activity, translational fusions are often used to view the movement of the tagged protein by fluorescence microscopy.
  • the fluorescent protein gene of the invention could also be integrated onto the chromosome to allow study of gene expression from its native location.
  • One of the most effective ways to construct these fusions is using the lambda Red mechanism, integrating the gfp gene or modified gfp (gfpT) gene after the end of the coding sequence of the gene of interest (GOI).
  • While colonies can be screened for fluorescence using a FACs, fluorescence microscope, UV lamp or in some cases, by eye, we recommend using a linked selectable marker such as the kanamycin resistance cassette to allow initial selection of integrants.
  • the number of available fluorescent reporter genes has increased in recent years as researchers have isolated genes encoding fluorescent proteins from an increasing variety of organisms and included the genes in cloning cassettes.
  • fluorescent proteins from sea creatures have been used as reporter genes capable of integration into DNA via cloning cassettes. Products of these genes fluoresce under certain wavelengths of light, permitting the tracking of proteins in, e.g., heterologous cells, such as dog and monkey cells.
  • the most commonly used proteins of this nature fluoresce green, and were obtained from the jellyfish, Aequorea victoria, and sea pansy, Renilla reniformis.
  • a red fluorescent protein (RFP) known as drFP583, and a turquoise fluorescent protein, known as dsFP483, have been isolated from the Indo Pacific mushroom corals (Discosoma sp. "red” and Discosoma striata, respectively).
  • Discosoma and Actinodiscus are mushroom corals, soft bodied anthozoans that do not produce an external skeleton. It should be noted that the relationship between the genus Discosoma and the genus Actinodiscus is not well understood. Both Actinodiscus and Discosoma are members of the Actinodiscidae Family, which is a member of the Corallimorpharia (mushroom) Order. The taxonomy of the Corallimorpharia is poorly defined, and therefore, the nature of the relation of Actinodiscus to Discosoma is uncertain. Discosoma and Actinodiscus are believed to be different genera of the same family, but they could be more closely or distantly related.
  • the Discosoma red fluorescent protein (FP583, commercially known as DsRed) isolated by Matz in 1999 (Matz et al. Nat. Biotech. 17 (1999), 969-973), while providing a potential alternative to gfp+, contains the same intrinsic features that make gfp+ a target for H-NS binding.
  • the dsred gene is 55% A/T and is predicted to contain strong intrinsic curvature.
  • teachings of the present invention are also applicable to DsRed which may be optimized as described above to reduce H-NS affinity.
  • Fig. 1 shows that H-NS binds to gfp+ in vivo in accordance with Example 1.
  • Chromatin immunoprecipitation (ChIP) using a H-NS specific monoclonal antibody was followed by quantitative PCR (qPCR).
  • the Y-axis indicates fold enrichment relative to input DNA (DNA before addition of the H-NS specific antibody).
  • Probe 2 showed over 12 fold enrichment over input DNA indicating H-NS binds in the gfp+ gene.
  • Figs. 2A and 2B relate to Example 2 and codon optimization.
  • the new gfpT gene of Example 2 contains reduced A/T content and DNA curvature compared to gfp+.
  • the bend.it program (Vlahovicek et al., Nucleic Acids Res. 31 (2003), 3686-7) was used to determine the predicted curvature of the two gfp genes. Regions of strong intrinsic curvature in gfp+ that are reduced in gfpT are indicated by filled arrows (Figure 2A).
  • the new gfpT gene has reduced A/T content compared to the gfp+ gene (Figure 2B). While for the gfp+ gene the most A/T rich region approaches 80% (average A/T content is 59%), the A/T content of some regions is reduced by over 20% in the new gfpT gene (average A/T content is 50%).
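A regional A/T profile of the kind plotted in Figure 2B can be approximated with a sliding window. The sketch below is only an illustration: the window and step sizes are assumptions (the text does not state which were used), and the short test sequence is invented.

```python
def at_profile(seq: str, window: int, step: int = 1) -> list[tuple[int, float]]:
    """Return (start, A/T fraction) for each window along the sequence.

    `window` and `step` are free parameters chosen by the user; the source
    text does not specify the values used for Figure 2B.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at = chunk.count("A") + chunk.count("T")
        profile.append((start, at / window))
    return profile


def richest_window(seq: str, window: int) -> tuple[int, float]:
    """Start position and A/T fraction of the most A/T rich window."""
    return max(at_profile(seq, window), key=lambda item: item[1])


# Invented toy sequence: an A/T rich stretch flanked by G/C rich DNA.
toy = "GCGCGGCCAATTATAAGCGC"
print(richest_window(toy, 8))
```

Comparing the profiles of two equal-length genes window by window would show where the A/T content has been lowered, which is the comparison the text describes for gfp+ and gfpT.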
  • the entire coding sequence (717bp) of both genes is shown in Figures 4A and 4B.
  • Fig. 3 shows the osmotic induction of proU fusions in accordance with Example 3.
  • Fig. 3A is a diagram of the downstream regulatory region (DRE) of proV containing H-NS binding sites essential for repression of proU in low osmolality media.
  • the transcriptional start site of proU is indicated as +1.
  • the positions of high-affinity H-NS binding sites and the points of insertion of the 3 reporter genes are indicated relative to +1.
  • Fig. 3B shows chromosomal lacZ, gfp+ and new gfpT fusions constructed at +98 bp
  • Fig. 3C shows chromosomal lacZ, gfp+ and new gfpT fusions constructed at
  • Fig. 3(D) is a bar chart showing the effect of removing H-NS on fusion gene expression in 100 mM NaCl. Mean values and ranges are plotted.
  • Fig. 3(E) is a table showing the fold-increases in fusion gene expression caused by changes in H-NS occupancy.
  • Figs. 4A and 4B show the full coding sequences for the gfp+ gene and the new gfpT gene.
  • gfp+ (Scholz et al. Eur. J. Biochem. 267 (2000), 1565-70) contains a number of mutations that improve the folding and emission spectrum of the GFP protein, optimizing it for use with flow assisted cell sorters (FACs).
  • the sequence of gfp+ was obtained from pZep08 (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491).
  • the new gfpT is the gfp made in accordance with Example 2.
  • Figure 5A is the full coding sequence for one of the most commonly used variants of the gfp gene, the gfpmut2 gene.
  • the gfpmut2 gene contains a higher A/T (61%) content than the gfp+ gene (59%) and also contains regions of high intrinsic curvature making it a strong H-NS target (Fig. 5B).
  • Figure 6A is the full coding sequence for DsRed (Bartilson et al. Mol. Microbiol. 39 (2001), 126-135).
  • Figure 6B is the full coding sequence for DsRedT.
  • gfpT has reduced A/T content and intrinsic curvature and is predicted to be a poor target for H-NS.
  • the DsRed coding sequence is 678 nucleotides in length, 55% of which are either an A or a T (Fig 6D). 2 regions of strong intrinsic curvature that are reduced in dsredT are indicated by arrows (Fig 6C).
  • DsRed contains the same key determinants of H-NS binding affinity to a region as the gfp+ (Fig 2) gene and thus, is most likely bound by H-NS.
  • Fig 2 the gfp+ gene was isolated from Aequorea victoria
  • DsRed is the basis for the most intensive research into fluorescent proteins since the gfp gene was isolated from Aequorea victoria
  • a wide range of derivatives of DsRed (Baird et al., Proc. Natl. Acad. Sci. USA 97 (2000), 11984-11989) may also be bound by H-NS.
  • Fig. 7A is a diagram showing that the fimA promoter (P) is located in an invertible element (fimS). Site-specific recombination at the inverted repeats (IR), flanking fimS, results in inversion of the fimA promoter between phase ON and phase OFF.
  • Figs. 7B to D are the results of determining the response of fimS to novobiocin-induced
  • Fig 7B shows an increase in switching towards phase ON in response to increasing amounts of novobiocin.
  • Fig 7C shows an increase in switching towards phase OFF.
  • Fig 7D (fimA-gfpT) behaves in a wild type manner, switching towards phase ON in response to novobiocin.
  • Fig 8 shows that H-NS binds to gfp+ with higher affinity than gfpT in vitro.
  • Electrophoretic mobility shift assay analysis using purified H-NS and biotinylated gfp+ and gfpT probes (Fig 8A). Biotinylated proU and lacZ probes are used as positive and negative controls, respectively (Fig. 8B). H-NS binding to gfp+ and gfpT (Fig. 8C) in the presence of equal amounts of non biotinylated (unlabelled) probes. The concentration of purified H-NS used is indicated above each lane.
  • Fig. 9 shows that gfpT has improved translation efficiency.
  • Fig 9A is a graph showing the expression of gfp+ and gfpT from the prpBCDE promoter on the plasmid pPro in wildtype and hns mutant cells.
  • the prpBCDE promoter is repressed by glucose and induced by propionate. Mean values and ranges are plotted; fold differences between gfp+ and gfpT fluorescence levels are shown above the bars.
  • Fig. 9B is a graph showing the codon adaptation index (CAI) values for gfpT and gfp+ in various organisms. Organisms above the dashed line will translate gfpT more efficiently than gfp+ (and wildtype gfp).
  • CAI codon adaptation index
  • gfp+ (Scholz et al. Eur. J. Biochem. 267 (2000), 1565-70), contains a number of mutations that improve the folding and emission spectrum of the GFP protein, optimizing it for use with flow assisted cell sorters (FACs).
  • FACs flow assisted cell sorters
  • gfpT is the gfp made in accordance with Example 2 of the invention which has been codon optimized. The following examples were carried out in the E. coli K-12 strain CSH50 (The Coli Genetic Stock Center).
  • XL1 Blue was used as a cloning strain (Stratagene).
  • pZep08 (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491) was used as the source of gfp+.
  • All restriction enzymes used were from New England Biolabs (NEB).
  • Phusion polymerase (Finnzymes) was used as per the manufacturer's instructions for all PCRs other than qPCR, where SYBR green qPCR mix (QuantiTect) was used.
  • the plasmid prep kit used was by RBC Biosciences. All other standard reagents were from Sigma-Aldrich. Antibiotics, where needed, were used at the following concentrations: carbenicillin 100 µg/ml, kanamycin 50 µg/ml and chloramphenicol 25 µg/ml.
  • After integration of chromosomal cassettes and single colony purification under antibiotic selection, the cassettes were considered stable and antibiotic selection unnecessary. Strains containing plasmids were maintained under selection to prevent plasmid loss.
  • H-NS binds to the gfp+ gene in vivo.
  • chromatin immunoprecipitation involves crosslinking proteins to DNA in live cells using formaldehyde, purifying the DNA-protein complexes and then using a specific antibody to the protein of interest (in this case H-NS). This antibody-protein-DNA complex can then be purified, the protein removed from the DNA and the DNA quantified using quantitative real-time PCR. This identifies if a piece of DNA was bound by H-NS.
  • the fold enrichment of the DNA is an indication of the affinity of the protein of interest for the DNA.
  • Chromatin immunoprecipitation using a monoclonal H-NS specific antibody was performed as described previously (Lucchini et al., PLoS Pathog. 2 (2006), e81).
  • Quantitative PCR was performed on enriched DNA and unenriched (input) DNA using the Rotor-Gene 3000 real-time PCR machine (Corbett Research) and SYBR green (QuantiTect) as per the manufacturer's instructions. Input DNA was quantified using a Nanodrop 2000 (Thermo Scientific). The primers used were designed using Primer3 and are listed below.
  • Each product was between 100 and 150 bp in length and encoded for only a single specific product (analyzed by agarose gel electrophoresis after a 40 cycle PCR).
  • ChIP DNA samples were diluted 1 in 10 in AnalaR water (BDH). Each reaction was also performed using 20 ng of input DNA (4 ng/µl).
  • the SYBR green PCR was set up in a 25 µl reaction containing 2.5 µl AnalaR water, 5 µl 1.5 µM forward and reverse primer mix, 12.5 µl SYBR green PCR mix and 5 µl of the DNA sample.
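The reaction recipe above can be checked and scaled with simple arithmetic. This is a minimal sketch: the per-reaction volumes come from the text, while the idea of preparing a master mix with pipetting overage, the 10% default and all function names are assumptions added for illustration.

```python
# Per-reaction volumes in microlitres, as listed in the text.
PER_REACTION_UL = {
    "AnalaR water": 2.5,
    "primer mix (1.5 uM)": 5.0,
    "SYBR green PCR mix": 12.5,
    "DNA sample": 5.0,
}


def total_volume(recipe: dict[str, float]) -> float:
    """Sum of all component volumes for one reaction."""
    return sum(recipe.values())


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock * added_ul / total_ul


def master_mix(recipe, n_reactions, overage=0.10, exclude=("DNA sample",)):
    """Scale shared components for n reactions.

    `overage` (extra volume to cover pipetting loss) is an assumed
    convenience, not something stated in the source.
    """
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2)
            for name, vol in recipe.items() if name not in exclude}


print(total_volume(PER_REACTION_UL))             # 25.0 ul, matching the text
print(final_concentration(1.5, 5.0, 25.0))       # primer mix ends up near 0.3 uM
print(5.0 * 4.0)                                 # 5 ul of 4 ng/ul input = 20 ng
```

The components add up to the stated 25 µl, and 5 µl of input DNA at 4 ng/µl gives the stated 20 ng per reaction.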
  • the following thermocycle was used: 1. 95 °C for 15 sec, 2. 52 °C for 60 sec, 3. 72 °C for 15 sec. This was repeated for 40 cycles.
  • the ΔCt of the ChIP probe with the highest Ct value i.e. the lowest amount of amplified DNA
  • probe 2, located in the gfp+ gene, is strongly enriched for H-NS binding while the other probes are not. Probe 2 showed over 12 fold enrichment over input DNA indicating the presence of H-NS. This suggests that there is a strong H-NS binding site in this region (as supported by the DNA curvature and A/T content data, Figs. 2A and 2B).
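Fold enrichment from qPCR threshold cycles is commonly estimated by comparing ChIP and input signals and normalising to the least-enriched probe. The sketch below illustrates that general approach only; it assumes perfect doubling per cycle, the exact normalisation used in the source is not fully recoverable from the text, and the threshold-cycle values are invented for the example.

```python
def relative_amount(ct_input: float, ct_chip: float) -> float:
    """ChIP signal relative to input, assuming the product doubles each cycle."""
    return 2.0 ** (ct_input - ct_chip)


def fold_enrichment(probes: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Normalise each probe to the probe with the lowest relative amount.

    `probes` maps a probe name to (Ct of input DNA, Ct of ChIP DNA).
    Using the least-enriched probe as the baseline is an assumption.
    """
    raw = {name: relative_amount(ct_in, ct_chip)
           for name, (ct_in, ct_chip) in probes.items()}
    baseline = min(raw.values())
    return {name: value / baseline for name, value in raw.items()}


# Invented threshold-cycle values, for illustration only.
example = {"probe 1": (20.0, 25.0), "probe 2": (20.0, 21.0)}
print(fold_enrichment(example))
```

A higher Ct means less amplified DNA, so the probe with the largest Ct difference between ChIP and input serves as the baseline here.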
  • H-NS bound in this region is not nucleating along the DNA but instead may be forming DNA-protein bridges with a second H-NS binding site. This second site could be proximal to the gfp+ gene and lead to the transcriptional repression of a local gene (as described in Example 3) or distal, potentially forming a new topological domain and altering the expression of multiple genes.
  • H-NS titration from native genes was previously shown to occur upon the introduction of A/T rich plasmid DNA (Doyle et al., Science 315 (2007), 251-252).
  • H-NS binding occurs in wild type gfp and its derivatives as the minor modifications in the coding sequence required to alter the characteristics of the protein (often requiring only a single amino acid substitution) are unlikely to alter the determinants of H-NS binding and thus, the affinity of H-NS for the DNA.
  • the 48 nucleotide difference between the gfpmut2 and gfp+ genes is insufficient to significantly change the A/T content or curvature of the DNA. This instead requires the more thorough approach described in Example 2.
  • In Example 1, chromatin immunoprecipitation using a H-NS specific monoclonal antibody showed H-NS binding in the gfp+ gene (Fig. 1). This was supported by bioinformatic analysis of the gfp+ gene that showed high intrinsic curvature and high A/T content, typical features of H-NS bound regions (Fig. 2A).
  • EMSA electrophoretic mobility shift assays
  • the initial basis for their study was the differences in expression between the luxAB and the lacZ reporter fusions inserted in the proU DRE. While the luxAB fusion showed repression of proU in low osmolality media, the lacZ fusion showed derepression of proU under the same conditions.
  • Rod models are the simplest form of DNA models and represent DNA as a cylindrical rod of constant diameter, made up of short cylindrical segments (e.g. the size of a base pair). A given rod parameter (e.g. DNA curvature) is then computed on the basis of segment composition.
  • Dinucleotide models define the segment as two adjacent base pairs while trinucleotide models define the unit around the central base pair of a given trinucleotide.
  • Analysis of DNA curvature was performed using the bend.it server, which calculates the curvature of DNA molecules based on their DNA sequences using dinucleotide and trinucleotide models.
  • the genetic code uses 64 nucleotide triplets (codons) to encode 20 amino acids and stop, meaning each amino acid is encoded by on average 3 codons. The frequencies with which codons are used by different organisms vary significantly, leading to variation in G/C content. The degeneracy of the genetic code enables many alternative nucleotide sequences to encode the same protein and allows for the codon optimization of a protein to a specific organism, without altering the protein at the amino acid level.
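The "on average 3 codons" figure can be reproduced by enumerating the standard genetic code. The following sketch builds the standard table from the conventional T, C, A, G ordering and counts how many codons encode each amino acid; variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Count codons per amino acid, leaving out the three stop codons ("*").
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

sense_codons = sum(degeneracy.values())
print(len(CODON_TABLE), sense_codons, len(degeneracy))   # 64 codons, 61 sense, 20 amino acids
print(round(sense_codons / len(degeneracy), 2))          # mean codons per amino acid
```

With 61 sense codons shared among 20 amino acids, the mean is just over 3, while individual amino acids range from one codon (Met, Trp) to six (Leu, Ser, Arg). This spread is what leaves room for synonymous recoding.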
  • the simplest way to redesign a DNA sequence is to work from the amino acid sequence and use a 'one amino acid - one codon' approach where, for every amino acid, the most abundant codon for the organism of interest is used.
  • a highly expressed gene designed as such would result in depletion of the transfer RNA pool for those codons, potentially allowing the incorrect incorporation of another tRNA leading to translation error.
  • the program for optimization of gfp+ (Gene Designer) (Villalobos et al., BMC Bioinformatics 7 (2006), 285) optimizes genes for expression by using a codon usage table in which each codon is given a probability score based on the frequency distribution of the codons in the desired genome normalized for every amino acid.
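The idea of choosing codons according to per-amino-acid probabilities, rather than always taking the single most abundant codon, can be sketched as weighted random sampling. This is an illustration of the general principle only and does not reproduce the Gene Designer algorithm; the frequency table below is hypothetical and covers just a few amino acids.

```python
import random

# Hypothetical codon frequencies, normalised per amino acid (each row sums to 1).
# Illustrative values only; a real table would be derived from the host genome.
CODON_FREQ = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.75, "AAG": 0.25},
    "G": {"GGC": 0.40, "GGT": 0.35, "GGG": 0.15, "GGA": 0.10},
    "E": {"GAA": 0.70, "GAG": 0.30},
    "*": {"TAA": 0.60, "TGA": 0.30, "TAG": 0.10},
}


def back_translate(protein: str, table: dict, rng: random.Random) -> str:
    """Pick one codon per residue, weighted by its frequency in `table`."""
    codons = []
    for aa in protein:
        options = table[aa]
        codon = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        codons.append(codon)
    return "".join(codons)


def translate(dna: str, table: dict) -> str:
    """Translate back using the same table, to confirm the protein is unchanged."""
    reverse = {codon: aa for aa, opts in table.items() for codon in opts}
    return "".join(reverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dna = back_translate("MKGE*", CODON_FREQ, rng)
print(dna, translate(dna, CODON_FREQ))
```

Because codons are sampled in proportion to their usage, no single tRNA species carries the whole load for a given amino acid, which is the concern raised above about the 'one amino acid - one codon' approach. The encoded protein is the same whichever codons are drawn.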
  • H-NS targets horizontally acquired A/T rich DNA, so giving the gfp+ gene an average A/T content similar to that of the E. coli genome (50%) was intended to allow it to blend in and not be targeted by H-NS as a horizontally acquired gene. This method could be used for all other fluorescent proteins to prevent H-NS binding in the introduced DNA.
  • Codon optimization reduced the A/T richness and intrinsic curvature of the gfp+ gene.
  • Figure 2A shows the intrinsic curvature results for the gfp+ and gfpT genes.
  • Figure 2B shows the A/T content results for the gfp+ and new gfpT gene.
  • Fig 2 shows that the new gfpT gene contains reduced A/T content and DNA curvature compared to gfp+.
  • the new gfpT gene has strongly G/C rich regions approaching 0.8 (80%). This reflects the overall drop in A/T content between the gfp+ gene (59%) and the gfpT gene (50%).
  • New gfpT gene as a transcriptional reporter to a known H-NS regulated gene
  • the gfp+-cat cassette from pZep08 was PCR amplified using primers fimA-gfp-hcat.fw and fimA-gfp-hcat.rv and integrated into the chromosome of CSH50 in the fimA gene, selecting for chloramphenicol resistance (encoded by the cat gene).
  • kan was PCR amplified from pKD4 (Datsenko and Wanner, Proc. Natl. Acad. Sci. USA 97 (2000), 6640-6645) using primers kan.int.fw and kan.int.rv that have 5' extensions allowing for chromosomal replacement of the cat gene in the fimA-gfp+-cat construct.
  • the chromosomally located gfp+-kan cassette was then used as template for integration of gfp+-kan into the proU gene.
  • PCR products amplified using either +98proU-gfp+.fw or +936proU-gfp+.fw along with proU.(stop).kan.rv were integrated onto the chromosome creating gfp+ transcriptional fusions to the proU promoter.
  • the gfpT gene was synthesized by DNA2.0 (San Diego, CA, USA) and provided cloned into a custom vector (pJ204, pUC ori, encodes ampicillin resistance).
  • the kan cassette from pKD4 was PCR amplified using primers kan.XmaI.fw and kan.XmaI.rv, which incorporated XmaI sites to allow for cloning into the XmaI site of pJ204 (located about 30 bp downstream of the gfpT stop codon).
  • the linear and plasmid DNA were digested with XmaI (NEB) and ligated using T4 DNA ligase (Roche) as recommended by the manufacturer's instructions. The ligated DNA was then transformed by heat shock into the E. coli cloning strain XL1 Blue made competent for transformation using calcium chloride (these are standard laboratory techniques). Plasmids were purified (RBC Biosciences) and used as template using primers +98proU-gfpT.fw or . along with proU.(stop).kan.rv. The amplification of a ~2 kb product confirmed which colonies contained the kan gene in the correct orientation (promoter facing away from the gfpT coding sequence). These PCR products were then integrated into the chromosome creating gfpT transcriptional fusions to the proU promoter.
  • lacZY-cat an E. coli strain containing the cat gene inserted in lacA; obtained from D. M. Stoebel, Department of Microbiology, School of Genetics and Microbiology, Trinity College, Dublin 2, Ireland.
  • PCR products amplified using primers +98proU-lacZ.fw or +936proU-lacZ.fw and the proU.(stop).cat.rv primer were then integrated into the chromosome creating lacZ transcriptional fusions to the proU promoter.
  • a stop codon (TAA or TGA) was included in every forward primer so that it integrated in-frame and prevented formation of a translational fusion.
  • Presumptive integrants were screened for an increased size compared to the wild type (WT) gene using either fimA.fw and fimA.rv (that amplify a ~550 bp region of fimA in WT CSH50) or proU.fw and proU.rv (that amplify a ~1.4 kb flanking region of the proU promoter in WT CSH50).
  • Integration in proU causes a ~1.2 kb deletion when integrated at +98 bp from the proU transcriptional start site or a ~330 bp deletion when integrated at +936 bp from the proU transcriptional start site.
  • the 3' integration event is directly before the stop codon in proV (the first gene transcribed by the proU promoter). Fusions of the correct predicted size (as analyzed by agarose gel electrophoresis) were sequenced on both strands to ensure correct integration (GATC Biotech).
  • PD32 is an H-NS deficient strain containing the bla gene (that encodes resistance to the antibiotic ampicillin) inserted in the hns gene (Dersch et al., Mol. Microbiol. 8 (1993), 875-889).
  • the mutant hns allele was transduced using phage P1vir (a standard technique in molecular biology) into the proU fusion strains to analyse expression from proU in the absence of H-NS.
  • β-galactosidase activity of the lacZ fusions was measured as described by Miller (1992) with minor differences. Reactions were performed in 96-well microtiter plates. The kinetics of substrate hydrolysis at 37 °C were measured for at least ten samples, at 30 second intervals after an initial 3 minute lag period, using a Multiskan Ascent plate reader (Thermo Labsystems). The total volume of each reaction was maintained at 200 µl.
  • Beta-galactosidase activity was determined according to the following formula: Slope (OD414/time) / (OD600 x volume (µl) of cells used)
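The formula above divides the rate of substrate hydrolysis by the amount of cells assayed. A minimal sketch of that calculation follows, estimating the slope by ordinary least squares over the kinetic readings. The fitting method, the time unit and the example readings are assumptions made for illustration; the source states only the formula and the 30 second reading interval.

```python
def slope(times: list[float], values: list[float]) -> float:
    """Least-squares slope of `values` against `times`."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def beta_gal_activity(times, od_substrate, od600, cell_volume_ul):
    """Slope (substrate absorbance / time) / (OD600 x volume of cells in ul).

    The result is in arbitrary units that depend on the chosen time unit.
    """
    return slope(times, od_substrate) / (od600 * cell_volume_ul)


# Invented readings taken every 30 seconds, for illustration only.
times_s = [0, 30, 60, 90]
absorbance = [0.10, 0.13, 0.16, 0.19]
print(beta_gal_activity(times_s, absorbance, od600=0.5, cell_volume_ul=20))
```

Fitting a line through all readings, rather than using only the first and last points, makes the estimate less sensitive to a single noisy well.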
  • Figure 3A shows the downstream regulatory region (DRE) of proV containing H-NS binding sites essential for repression of proU in low osmolarity media
  • Chromosomal lacZ, gfp+ and new gfpT fusions were constructed at +98 bp (disrupts DRE) (Fig. 3B) and +936 bp (DRE intact) (Fig. 3C) creating 6 reporter fusion strains.
  • proU-gfp+ accounted for repression of proU-gfp+(+98) in vivo
  • the expression of all proU fusions was tested in an H-NS deficient background.
  • Cells were cultured in the repressive conditions of 100 mM NaCl, and the data expressed as a percentage of maximal derepressed expression to facilitate comparisons between GFP fluorescence and β-galactosidase activity (Fig 3D).
  • proU expression was elevated in the absence of H-NS. This revealed that even in the absence of the DRE, H-NS continued to bind the URE and repress the proU-gfpT (+98) and proU-lacZ (+98) fusions.
  • While these results suggest that fluorescence levels of the GFP protein have not been altered (Fig 3C), it is shown in Example 7 that GFPT is translated more efficiently than GFP+ due to the codon optimization of gfpT (as described in Example 2).
  • Example 4
  • New gfpT gene as a transcriptional reporter to a second known H-NS regulated gene.
  • the site-specific recombinase FimB binds at the inverted repeats catalyzing the inversion of the fimS DNA segment leading to either fimbriate (phase ON) or afimbriate cells (phase OFF).
  • the inversion of fimS is sensitive to varying levels of DNA supercoiling.
  • DNA supercoiling is controlled through the antagonistic actions of DNA gyrase (which tightly winds the DNA) and topoisomerase I (which relaxes the DNA).
  • inversion from phase ON to phase OFF and from phase OFF to phase ON occurs at an equal rate.
  • When the DNA is relaxed due to the addition of the DNA gyrase inhibiting drug novobiocin, switching from phase OFF to phase ON dramatically increases (Fig 7B).
  • Novobiocin sodium salt (Sigma) was prepared fresh in sterile water before use.
  • the construction of the fimA-gfp+-kan fusion was described in Example 3.
  • a gfpT-kan cassette was integrated into fimA using the primers
  • fimA-gfpT-kan fw 5'-GAT TGA TGC GGG TCA TAC CAA CGT TCT GGC TCT GCA GGA TTC ATT AAG AAG GAG AT-3'; fimA-gfpT-kan rv 5'-TCT GCA CAC CAA CGT TTG TTG CGC TAC CCG CAG CTG AAC TCA TAT GAA TAT
  • Novobiocin induction was performed as follows. Briefly, single colonies were resuspended in 100 µl LB broth and used to inoculate 3 ml of LB in a 15 ml test tube, which were then incubated at 37 °C/200 rpm (aerobic conditions). Cultures were allowed to grow until exponential phase (when the optical density at 600 nm (OD600) is between 0.2 and 0.4), at which point they were diluted and used to inoculate 3 ml broths containing varying concentrations of novobiocin. Dilutions were calculated to allow 15 generations before cessation of growth at an OD600 of ~3, assuming each generation resulted in a doubling of the OD600 of the culture.
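The dilution arithmetic described above can be illustrated as follows. This is a sketch under the stated assumption that every generation doubles the OD600; the function names and the example culture density are hypothetical and are not taken from the source.

```python
def dilution_factor(current_od, final_od=3.0, generations=15):
    """Fold dilution needed so that `generations` doublings end at `final_od`."""
    if current_od <= 0 or final_od <= 0:
        raise ValueError("optical densities must be positive")
    start_od = final_od / 2 ** generations  # OD600 immediately after dilution
    return current_od / start_od


def inoculum_volume_ul(current_od, culture_volume_ul, final_od=3.0, generations=15):
    """Volume of the exponential culture to add to a broth of the given volume."""
    return culture_volume_ul / dilution_factor(current_od, final_od, generations)


if __name__ == "__main__":
    # Hypothetical culture harvested at OD600 = 0.3, used to seed a 3 ml broth.
    print(dilution_factor(0.3))
    print(inoculum_volume_ul(0.3, 3000))
```

In practice a dilution this large would normally be made in serial steps; the sketch only computes the overall factor.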
  • the orientation of fimS was determined using a PCR based assay.
  • Primers OL20 (5'-CCG TAA CGC AGA CTC ATC CTC-3') and OL4 (5'-GAC AGA ACA ACG ATT GCC AG-3') were used to amplify from outside of the invertible region.
  • the resulting PCR products were digested at a unique BstUI site, asymmetrically located within fimS, allowing for distinction between cells containing fimS in the phase ON or phase OFF orientation based on the size of the digested fragments (see Figs 7B-D).
  • H-NS binds to gfp+ with higher affinity than gfpT in vitro
  • the regulatory regions flanking the proU promoter are A+T rich, highly intrinsically curved (Owen-Hughes et al., Cell 71 (1992), 255-65) and contain multiple high affinity H-NS binding sites (Bouffartigues et al., Nat. Struct. Mol. Biol. 14 (2007), 441-8). This DNA was therefore used as a positive control for H-NS binding (Figure 8).
  • the lacZ reporter gene is a poor target for H-NS as it is relatively G+C rich and not intrinsically curved (Owen-Hughes et al., Cell 71 (1992), 255-65) and was used as a negative control for H-NS binding (Figure 8).
  • Electrophoretic mobility shift assays were performed to determine the affinity (Kapp) of H-NS for gfp+ and gfpT in vitro. Since H-NS binds with low specificity and affinity and H-NS binding is highly co-operative, a narrow range of protein concentrations was chosen in order to assess H-NS binding affinity accurately for the two gfp genes. H-NS was found to bind gfp+ strongly with a Kapp of 4.9 nM (Figure 8A). A further indication of the high affinity of H-NS for gfp+ is the narrow range of protein (4.5-10.55 nM) required for the transition from initial binding to fully bound probe, resulting in a single high molecular mass complex.
  • H-NS had a lower affinity for gfpT (Figure 8A; Kapp, 7.5 nM).
  • the lower affinity of H-NS for gfpT resulted in smearing of the DNA over a wide range of protein concentrations (7.9-18.75 nM) with the gfpT probe only resolving as a single bound complex at 25 nM H-NS.
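One common way to read an apparent affinity from densitometry data is to take the protein concentration at which half of the probe is shifted. The sketch below assumes that definition and uses linear interpolation between the two flanking titration points; the source does not state how Kapp was derived, and the function name and the titration values are hypothetical.

```python
def apparent_affinity(concentrations_nm, fraction_bound):
    """Protein concentration (nM) at which 50% of the probe is bound.

    Linearly interpolates between the two titration points that flank
    the half-bound mark. Inputs are paired lists of equal length.
    """
    if len(concentrations_nm) != len(fraction_bound):
        raise ValueError("inputs must be paired")
    pairs = sorted(zip(concentrations_nm, fraction_bound))
    for (c0, f0), (c1, f1) in zip(pairs, pairs[1:]):
        if f0 < 0.5 <= f1:
            return c0 + (0.5 - f0) * (c1 - c0) / (f1 - f0)
    raise ValueError("titration does not cross 50% bound")


if __name__ == "__main__":
    # Hypothetical densitometry readings, not data from the source.
    conc = [0, 2.5, 5, 10, 25]
    bound = [0.0, 0.1, 0.4, 0.9, 1.0]
    print(apparent_affinity(conc, bound))
```

Because H-NS binding is described as highly co-operative, a fit to a co-operative binding model may be more appropriate than linear interpolation; the interpolation is used here only to keep the example short.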
  • the proU regulatory region was used as a positive control for H-NS binding (Figure 8B). As expected, the proU probe was strongly bound by H-NS (Kapp 6.2). The proU region contains a number of well characterized H-NS binding sites (Bouffartigues et al., Nat. Struct. Mol. Biol. 14
  • H-NS binding to each probe was carried out in 20 µl reaction mixtures containing increasing concentrations of purified H-NS protein (final concentrations; 0-25 nM) in 20 mM Tris-HCl, 1 mM EDTA, 100 µg/ml BSA, 1 mM DTT, 10% glycerol and 80 mM NaCl. Reactions were incubated at 4 °C for 30 min.
  • the Chemiluminescent Nucleic Acid Detection Module (Pierce) was used as per the manufacturer's instructions followed by signal detection using developer and fixer solutions (Kodak) and Hyperfilm (Amersham Biosciences). Densitometric analysis was performed using ImageJ software.
  • PCR was used to amplify the entire coding sequence of gfp+ and gfpT and equivalently sized regions (717 bp) in proU and lacZ.
  • the primers used are listed 5' to 3' below;
  • Bio indicates primers that contained a 5' biotin tag that allows for visualization of the DNA. Primers with identical sequences to gfp+.bs.bio.rv and gfpT.bs.bio.rv but without 5' biotin tags were used as unlabelled DNA in Figure 8C.
  • dsredT differs from dsred by 155 nucleotide substitutions across the 678 base-pair gene and has reduced A/T content (49%) and reduced predicted DNA curvature (Figure 6A to C).
  • dsredT will have a lower H-NS affinity than dsred.
  • Example 7 gfpT is translated more efficiently than gfp+
  • Blunt-ended PCR amplicons of the gfp+ and gfpT open reading frames were generated using Phusion polymerase (NEB) and primers gfp+.pPro24-blunt.fw, gfp+.pPro24-PstI.rv, gfpT.pPro24-blunt.fw, gfpT.pPro24-PstI.rv (listed 5' to 3' below).
  • PCR amplicons were first digested with PstI, then blunt ends were phosphorylated using T4 polynucleotide kinase in T4 ligase buffer (Roche Diagnostics, Mannheim, Germany) followed by purification with a HiYield gel/PCR DNA fragments extraction kit (RBC Biosciences).
  • pPro24 (Lee & Keasling, 2005) was digested with SmaI and PstI, dephosphorylated using Antarctic phosphatase (NEB), and then ligated to PCR amplicons using a Rapid DNA ligation kit (Roche). Correct clones were confirmed by DNA sequencing.
  • the prpBCDE promoter in pPro24-gfp clones was induced with propionate as follows: streak-isolated colonies were used to inoculate 4 ml LB (86 mM NaCl) broth cultures and these were grown to an OD600 of ~0.5. Cultures were then diluted 1/500 into fresh LB including glucose (to repress the prpBCDE promoter) or propionate (to induce it). Cultures were grown overnight at 37 °C with shaking and samples were fixed and analysed by flow cytometry the following morning.
  • Codon Adaptation Index (CAI)
  • CAI values were calculated using CAIcal (Puigbo et al., Biol Direct. 3, (2008), 38) from codon usage tables provided by the Codon Usage Database (Nakamura et al., Nucl. Acids Res. 28, (2000), 292). All codon usage tables used in this analysis were derived from whole genome sequences.
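For orientation, the index can be sketched directly from its usual definition: each codon is weighted by its usage relative to the most used synonymous codon, and the CAI is the geometric mean of those weights over the open reading frame. The code below is an illustration only and is not the CAIcal implementation. It assumes the standard genetic code, skips Met, Trp and stop codons (a common convention) and skips codons absent from the supplied table. The usage table in the example is a toy table, not data from the Codon Usage Database.

```python
import math
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def relative_adaptiveness(codon_usage):
    """Weight of each codon relative to the most used synonymous codon."""
    best = {}
    for codon, freq in codon_usage.items():
        aa = GENETIC_CODE[codon]
        best[aa] = max(best.get(aa, 0.0), freq)
    return {
        codon: freq / best[GENETIC_CODE[codon]]
        for codon, freq in codon_usage.items()
        if best[GENETIC_CODE[codon]] > 0
    }


def codon_adaptation_index(orf, codon_usage):
    """Geometric mean of the codon weights across an open reading frame."""
    weights = relative_adaptiveness(codon_usage)
    orf = orf.upper()
    logs = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        codon = orf[i:i + 3]
        if GENETIC_CODE.get(codon) in (None, "M", "W", "*"):
            continue  # unknown codons, Met, Trp and stops carry no information
        weight = weights.get(codon)
        if weight:  # codons without usage data are skipped
            logs.append(math.log(weight))
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    toy_usage = {"CTG": 50.0, "CTA": 5.0, "AAA": 30.0, "AAG": 10.0}  # hypothetical
    print(codon_adaptation_index("CTGAAA", toy_usage))
    print(codon_adaptation_index("CTAAAG", toy_usage))
```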
  • Figure 9B shows that gfpT is predicted to have improved translation efficiency in both bacterial and eukaryotic model organisms.
  • GFPT is optimized for high expression in E. coli and thus may have reduced toxicity when highly expressed.
  • Another potential advantage may be increased fluorescence of GFPT vs GFP+ when the genes are weakly transcribed. This could arise due to the optimized coding sequence of GFPT allowing more efficient use of the host transfer RNA pool than GFP+, allowing faster translation and thus, a greater accumulation of fluorescent protein. This would result in GFPT cells having a higher level of fluorescence than the GFP+ cells under identical conditions.
  • GFPT potentially has a wider range of fluorescence than GFP+, without having the same detrimental effect on the host.

Abstract

The present invention is directed to a method for improving gene expression in a host cell comprising a modified protein encoding nucleic acid comprising the steps of assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type protein encoding nucleic acid or mutant thereof, preparing an altered protein encoding nucleic acid with modified A and T nucleotide content and using the altered protein encoding nucleic acid in host cell gene expression systems. The present invention is also directed to the modified nucleic acid sequence, protein, plasmid vector and expression system comprising the altered protein encoding nucleic acid.

Description

"A METHOD FOR IMPROVING GENE EXPRESSION"
Field of the Invention
The present invention is directed to a method for improving and/or monitoring gene expression in a host cell comprising a protein encoding nucleic acid. The invention is also directed to a modified protein encoding nucleic acid, protein, expression system, plasmid vector and host cell. Preferably, the protein encoding nucleic acid is a fluorescent protein nucleic acid.
Background to the Invention
Cell-based protein expression systems are well known and the aim of such expression systems is to overexpress a desired protein. Host cells include bacteria (such as E. coli, B. subtilis etc.), yeast (such as S. cerevisiae) or eukaryotic cell lines. Conventional DNA sources and delivery mechanisms include viruses (such as baculovirus, retrovirus, adenovirus), plasmids, artificial chromosomes and bacteriophage (such as lambda). The expression system used will depend on the gene involved, for example Saccharomyces cerevisiae is often selected for proteins that require significant post-translational modification. Insect or mammalian cell lines may be used when human-like splicing of the mRNA is required. Additionally, bacterial expression has the advantage of easily producing large amounts of protein. E. coli is one of the most widely used expression hosts, and DNA is normally introduced in a plasmid expression vector.
It is common for genes from many organisms to be placed under the control of an inducible promoter in a fast growing organism such as Saccharomyces cerevisiae or E. coli. This facilitates the expression of the gene product to very high levels (often >50% of total cell protein). The gene product can then be purified to homogeneity. An example of this is the expression of restriction endonucleases from multiple organisms in E. coli, which are then purified and used in vitro to digest DNA. For example, the genes encoding the restriction endonucleases PstI, Hpy99I and PsiI from Providencia stuartii 164, Helicobacter pylori J99 and Pseudomonas species SE-G49, respectively, are all cloned into E. coli for over-expression and purification.
This type of process has been used extensively to manufacture medically-useful products since the 1970s when synthetic humanized insulin was developed by joining the insulin gene with a plasmid vector inserted into E. coli. Insulin was the first FDA-licensed drug produced through this recombinant DNA technology. Other bacterial recombinant proteins include recombinant human growth hormone, interleukin-2 lymphocyte growth factor and interferon, a cytokine. Vaccines can be manufactured in yeast for example. The first drug produced commercially by mammalian cell culture was tissue plasminogen activator (tPA), used to dissolve blood clots. Another recombinant protein produced by the mammalian CHO cells is glycoprotein factor VIII, a blood clotting factor. There are many other recombinant protein products manufactured in this way. Accordingly, improvements to such cell-based protein expression systems are always being developed.
Thus, the present invention is directed to improving gene expression in host cells.
The discovery of green fluorescent protein in the early 1960s started a new era in cell biology by enabling investigators to apply molecular cloning methods, fusing the fluorophore moiety to a wide variety of protein and enzyme targets, in order to monitor cellular processes in living systems using optical microscopy and related methodology. When coupled to recent technical advances in fluorescence and microscopy, including ultrafast low light level digital cameras and multitracking laser control systems, the green fluorescent protein and its colour-shifted genetic derivatives have demonstrated invaluable service in many thousands of live-cell imaging experiments.
The Green Fluorescent Protein (GFP) was originally isolated in the 1960s from the jellyfish Aequorea victoria. It is encoded by the gfp gene and fluoresces green when exposed to blue light. GFP is composed of 238 amino acids (26.9 kDa). GFP has a typical beta barrel structure, consisting of one beta-sheet with alpha helix(es) containing the chromophore running through the center. Inward facing sidechains of the barrel induce specific cyclization reactions in the tripeptide Ser65-Tyr66-Gly67 that lead to chromophore formation. The hydrogen bonding network and electron stacking interactions with these sidechains influence the colour of wild type (wt) GFP and its numerous derivatives. The GFP from A. victoria has a major excitation peak at a wavelength of 395 nm and a minor one at 475 nm. Its emission peak is at 509 nm which is in the lower green portion of the visible spectrum.
However, despite being isolated in the 1960s, its utility as a tool for molecular biologists was not realized until the 1990s when the cloning and nucleotide sequencing of the wild type gfp gene took place. At this time, it was determined that the GFP molecule folded and was fluorescent at room temperature, without the need for exogenous cofactors specific to the jellyfish. However, although this near-wild type GFP was fluorescent, it had several drawbacks, including dual peaked excitation spectra, pH sensitivity, chloride sensitivity, poor fluorescence quantum yield, poor photostability and poor folding at 37 °C.
Thus, GFP mutants were studied in order to find a protein with better characteristics. Indeed, the first reported crystal structure of a GFP was that of the S65T mutant (a single point mutation), reported in the mid-1990s. This mutation dramatically improved the spectral characteristics of GFP, resulting in increased fluorescence, photostability and a shift of the major excitation peak to 488 nm with the peak emission kept at 509 nm. This matched the spectral characteristics of commonly available FITC filter sets, increasing the practicality of use by the general researcher.
In addition to the first single amino acid substitution, S65T, researchers have modified the GFP residues by directed and random mutagenesis to produce the wide variety of GFP derivatives in use today. For example, the addition of a 37 °C folding efficiency point mutation (F64L) to this scaffold, yielding enhanced GFP (EGFP), was discovered in 1995 by Ole Thastrup. EGFP allowed for the use of GFPs in mammalian cells. EGFP has an extinction coefficient of 55,000 M⁻¹cm⁻¹. Another variant, superfolder GFP, carrying a series of mutations that allow GFP to rapidly fold and mature even when fused to poorly folding peptides, was reported in 2006.
Many other mutations have been made, including colour mutants; in particular
- blue fluorescent protein (EBFP, EBFP2, Azurite, mKalama1);
- cyan fluorescent protein (ECFP, Cerulean, CyPet); and
- yellow fluorescent protein derivatives (YFP, Citrine, Venus, YPet).
BFP derivatives (except mKalama1) contain the Y66H substitution.
The critical mutation in cyan derivatives is the Y66W substitution, which causes the chromophore to form with an indole rather than phenol component. Several additional compensatory mutations in the surrounding barrel are required to restore brightness to this modified chromophore due to the increased bulk of the indole group. The red-shifted wavelength of the YFP derivatives is accomplished by the T203Y mutation and is due to π-electron stacking interactions between the substituted tyrosine residue and the chromophore. These two classes of spectral variants are often employed for fluorescence resonance energy transfer (FRET) experiments.
There are many other mutants, including but not limited to the following:
- Semi-rational mutagenesis of a number of residues led to pH-sensitive mutants known as pHluorins, and later super-ecliptic pHluorins. By exploiting the rapid change in pH upon synaptic vesicle fusion, pHluorins tagged to synaptobrevin have been used to visualize synaptic activity in neurons.
- Redox sensitive versions of GFP (roGFP) were engineered by introduction of cysteines into the beta barrel structure. The redox state of the cysteines determines the fluorescent properties of roGFP.
- Mutations designed to improve folding at 37 °C (F99S, M153T, V163A) from GFPuv (Crameri et al., Nat Biotechnol 14 (1996), 315-9) were combined with mutations in the chromophore of GFPmut1 (F64L and S65T) (Cormack et al., 1996) to create the GFP+ protein, which was 320 times more fluorescent than the original GFP (Scholz et al. Eur. J. Biochem. 267 (2000), 1565-70). The GFPmut1 protein is identical to EGFP, which is commercially available and widely used even though only about 20% of EGFP/GFPmut1 is correctly folded at 37 °C (Tsien Annu Rev Biochem 67 (1998), 509-44). The development of GFP+ was necessary to allow the measurement of gene activity in the native location (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491).
It is now well known that there are other fluorescent proteins isolated from different sources which may be used in the same manner as GFP. These include DsRed and the protein encoded by the gfp gene from the great star coral, Montastraea cavernosa.
DsRed is a recently cloned 28-kDa fluorescent protein isolated from coral of the Discosoma genus. DsRed has an emission maximum of 583 nm, which can be further extended to 602 nm by mutation of Lys-83 to Met. This emission spectrum makes it ideal as a fluorescent partner for GFP, as fluorescence of both proteins could be individually measured in a single cell (Baird et al., Proc. Natl. Acad. Sci. USA 97 (2000), 11984-11989). Similar to the early studies of GFP, a number of the limitations of this protein are presently being addressed through random mutagenesis of the coding sequence and screening for improved variants. The major issues with DsRed include the long maturation time before a red signal is detected and the fact that it forms tetramers, both undesirable characteristics for a transcriptional/translational fusion.
Because of its easily detectable green fluorescence, GFP from Aequorea has been used widely to study gene expression and protein localization. Furthermore, GFP, like other fluorescent proteins, does not require a substrate or cofactor to fluoresce; hence, it is possible to directly express GFP and use it as a reporter in numerous species and in a wide variety of cells.
These fluorescent proteins and mutants or variants thereof can be introduced into organisms and maintained in their genome through breeding, injection with a viral vector, or cell transformation of either linear or circular DNA. For example, the GFP gene has been introduced and expressed in many bacteria, yeast and other fungi, plant, fly, and mammalian cells, including human.
Thus, GFP and other related fluorescent proteins and related mutants/variants as described above are now used routinely as reporters of gene expression in all types of cells. They are an essential tool for biologists and work is continually being carried out on developing improved variants to overcome inherent limitations of the native proteins.
Thus, the present invention is directed to improving the efficacy of fluorescent protein genes for use as a molecular tool, in particular when used in gene expression systems.
Statements of the Invention
According to a first general aspect of the invention, there is provided a method for improving gene expression in a host cell comprising a protein encoding nucleic acid comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type protein encoding nucleic acid or mutant thereof; preparing an altered protein encoding nucleic acid by modifying the A and T nucleotide content of the wild type protein encoding nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered protein encoding nucleic acid is reduced compared to the wild type protein encoding nucleic acid or mutant thereof; and using the altered protein encoding nucleic acid in host cell gene expression systems. Ideally, the modified A and T nucleotide content of the altered protein encoding nucleic acid results in reduced affinity, compared to the wild type protein encoding nucleic acid or mutant thereof, to host cell transcriptional repressor proteins.
According to a second general aspect of the invention, there is provided a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid comprising assessing the A and T nucleotide content of the fluorescent protein nucleic acid; and modifying the A and T nucleotide content of the fluorescent protein nucleic acid to equal or lower the A and T nucleotide content of the host cell.
According to one embodiment of this second aspect of the invention, there is provided a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid or mutant thereof comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type fluorescent protein nucleic acid or mutant thereof; preparing an altered fluorescent protein nucleic acid by modifying the A and T nucleotide content of the wild type fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered fluorescent protein nucleic acid is reduced compared to the wild type fluorescent protein nucleic acid or mutant thereof; and using the altered fluorescent protein nucleic acid in a host cell gene expression system.
According to a third aspect of the invention, there is provided a modified fluorescent protein nucleic acid comprising a sequence encoding a wild type fluorescent protein or mutant thereof with a lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type fluorescent protein nucleic acid or mutant thereof. Ideally, the protein has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein nucleic acid or mutant thereof.
According to a fourth aspect of the invention, there is provided a fluorescent protein encoded by the modified nucleic acid of the invention.
According to a fifth aspect of the invention, there is provided an expression system comprising the modified nucleic acid sequence or fluorescent protein according to the invention, preferably for use in a host cell such as Escherichia coli.
According to a sixth aspect of the invention, there is provided a plasmid vector comprising the modified nucleic acid sequence or fluorescent protein according to the invention, preferably for use in a host cell such as Escherichia coli.
According to a seventh aspect of the invention, there is provided a host cell comprising the modified nucleic acid sequence, fluorescent protein, plasmid vector or expression system of the invention.
According to an eighth aspect of the invention, there is provided a method of monitoring gene expression in a host cell using the modified nucleic acid, fluorescent protein, expression system or host cell of the invention.
Detailed Description of the Invention
In this specification, it will be understood that any of the percentage identities or homologies referred to in the specification are determined using available conventional methods over the entire/whole length of the sequence.
In this specification, it will be understood that the terminology "equal to or lower than the A and T nucleotide residue content of a host cell" covers an A and T residue content equal to or within +/- 15%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, preferably +/- 5%, +/- 4%, +/- 3%, +/- 2%, more preferably +/-1%, +/- 0.5%, +/- 0.1%, of the A and T residue content of the host cell.
DNA is made up of combinations of 4 nucleotides: Adenosine (A), Thymidine (T), Guanosine (G) and Cytidine (C). Each species uses each of these nucleotides in differing ratios. A and T are grouped because an A-T base pair is held together by only 2 hydrogen bonds while a G-C base pair is held together by 3. It is known, for example, that approximately 50% of the genome of Escherichia coli (a Gram-negative bacterium), a typical host cell, is an A or a T nucleotide (A/T). In addition, it is known that the A/T nucleotide content of the fluorescent protein gene, gfp+, for example is approximately 59%, i.e. it is "A/T rich". In this specification, we have used the term "A/T rich" or "A/T nucleotide rich" to convey an A/T nucleotide content higher than the average A/T residue content of any suitable host cell, including E. coli and many other host cells. Other proteins and fluorescent proteins will have different A/T nucleotide contents and may or may not be A/T rich compared to a host cell. As an example, the approximate A and T nucleotide content of the following Gram-positive and negative bacteria is given in the table below:
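[Table, reproduced as an image in the original publication: approximate A and T nucleotide content of selected Gram-positive and Gram-negative bacteria]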
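The assessment of A and T nucleotide content described above amounts to a simple count. The following sketch shows one way to compute it and to compare a gene with a host cell figure; the function names are hypothetical, and the host percentage is supplied by the caller rather than taken from any table.

```python
def at_content(sequence):
    """Percentage of A and T among the unambiguous bases of a DNA sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * sum(b in "AT" for b in bases) / len(bases)


def is_at_rich(gene_sequence, host_at_percent):
    """True when the gene's A/T content exceeds the host cell average."""
    return at_content(gene_sequence) > host_at_percent


if __name__ == "__main__":
    fragment = "ATGAGTAAAGGAGAAGAACTTTTC"  # hypothetical fragment, for illustration
    print(at_content(fragment))
    print(is_at_rich(fragment, host_at_percent=50.0))
```

Ambiguous characters such as N are ignored rather than counted, which is a design choice of this sketch.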
It will be understood that reducing the intrinsic curvature of the altered protein encoding nucleic acid ideally means lowering the curvature amplitude compared to the wild type protein encoding nucleic acid or mutant thereof.
Ideally, intrinsic curvature should be reduced to less than approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn. For example, we found that all regions of intrinsic curvature greater than approximately 7° per helical turn in the fluorescent protein gfp should be reduced to approximately 7° per helical turn or less. For the fluorescent protein dsred, we found that all regions of intrinsic curvature greater than approximately 9° per helical turn should be reduced to approximately 9° per helical turn or less. We have found that while codon optimization that brings the A/T content of a gene into line with that of a G/C rich genome (<50% A/T) will make it a poor target for H-NS, similar codon optimization for an A/T rich genome may increase the affinity of H-NS for the target gene. Therefore, to avoid H-NS mediated complications when optimizing genes for expression in A/T rich genomes, it may be advantageous to optimize genes using a codon-table derived from highly expressed genes of the organism of interest that have a reduced A/T content compared to the average A/T content of the genome. The optimal level of A/T content required to avoid H-NS mediated complications in an A/T rich genome can then be determined empirically.
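Given a predicted curvature profile, the regions that exceed a chosen threshold can be located as sketched below. The profile is assumed to come from a separate curvature prediction tool, which is not modelled here; the function name and the example values are hypothetical.

```python
def regions_above(curvature_profile, threshold_deg):
    """Half-open (start, end) index ranges where curvature exceeds the threshold.

    `curvature_profile` holds one predicted value per position, in degrees
    per helical turn.
    """
    regions = []
    start = None
    for i, value in enumerate(curvature_profile):
        if value > threshold_deg and start is None:
            start = i  # a region above the threshold begins here
        elif value <= threshold_deg and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(curvature_profile)))
    return regions


if __name__ == "__main__":
    profile = [3, 8, 9, 5, 7, 10]  # hypothetical values
    print(regions_above(profile, threshold_deg=7))
```

Each flagged region would then be a candidate for synonymous substitutions, after which the profile is predicted again and the check repeated.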
It will be understood that host cells are not limited to bacteria, and include all suitable prokaryote host cells, such as bacterial cells, and eukaryote host cells, such as yeast cells and mammalian cells (including human, primate and rodent cells). Furthermore, the host cells of the invention will be understood to express transcriptional repressor proteins, such as nucleoid-associated transcriptional repressor proteins and typically H-NS or H-NS-like proteins.
Furthermore, the comments that follow in relation to H-NS are equally applicable to all transcriptional repressor proteins. The invention is also applicable to H-NS-like proteins or H-NS homologs. It will be understood that H-NS like proteins include but are not limited to the bacterial proteins Sfh, StpA, Hha, YdgT, MvaT, MvaU, Lsr2, BpH3. Such transcriptional repressor proteins and repressor-like proteins are also present in mammalian cells.
In this specification, it will be understood that the term "protein encoding nucleic acid" covers, but is not limited to fluorescent proteins. Non-fluorescent proteins which are over-expressed in host cells for purification are also contemplated. Such proteins include all suitable medical protein products (vaccines, hormones (insulin, growth hormone), clotting factors, cytokines etc) and enzymes for example. This is a non-exhaustive list.
In this specification, it will be understood that the term "fluorescent protein" covers, but is not limited to, fluorescent proteins such as GFP, YFP, CFP, BFP and DsRed and also covers any potential mutants or variants thereof. There is an extensive range of known variants and mutants of both GFP and DsRed including a wide variety of single or double amino acid substitutions. For convenience, where the term mutant is referred to in the following text, it will be understood that this term also covers fluorescent protein variants.
It will be understood that the following comments and teachings, although they relate to a specific gfp variant tested, gfp+, are applicable to all fluorescent proteins and mutants and variants thereof in general. In the following description gfpT may also be referred to interchangeably as gfpTCD.
According to a first general aspect of the invention, there is provided a method for improving gene expression in a host cell comprising a protein encoding nucleic acid comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type protein encoding nucleic acid or mutant thereof; preparing an altered protein encoding nucleic acid by modifying the A and T nucleotide content of the wild type protein encoding nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered protein encoding nucleic acid is reduced compared to the wild type protein encoding nucleic acid or mutant thereof; and using the altered protein encoding nucleic acid in host cell gene expression systems. Ideally, the modified A and T nucleotide content of the altered protein encoding nucleic acid results in reduced affinity, compared to the wild type protein encoding nucleic acid or mutant thereof, to host cell transcriptional repressor proteins.
It will be understood that these teachings are applicable to any protein encoding nucleic acid with a high A and T nucleotide content (AT-rich) compared to a host cell nucleic acid content.
It will be understood that the protein encoding nucleic acid may be present in an extrachromosomal vector, such as a plasmid. Alternatively, the protein encoding nucleic acid may be integrated into the host cell genome.
We have surprisingly found that this method prevents the possible interference of transcriptional repressor proteins, such as H-NS proteins, with gene expression at the level of transcription. By reducing the affinity of transcriptional repressor proteins, such as H-NS and H-NS-like proteins, for target genes, their rates of transcription should increase, leading to an increase in yield of the gene product. In this manner, the invention is applicable to any protein for over-expression and subsequent purification in a host cell. Such proteins include fluorescent proteins and non-fluorescent protein products.
The protein encoding nucleic acid may encode many medically useful protein products, such as vaccines, hormones (insulin, growth hormone), clotting factors, cytokines and enzymes, for example. In addition, the protein encoding nucleic acid may encode various restriction endonucleases PstI, Hpy99I and PsiI from Providencia stuartii 164, Helicobacter pylori J99 and Pseudomonas species SE-G49, respectively. Many other protein products may be contemplated.
Helicobacter pylori J99 has an average genomic A/T content of 61%, which is the same A/T content as the fluorescent protein gene, gfpmut2 (Figure 5B). This high average A/T content of H. pylori, for example, means it will have a large number of genes with a high A/T content. Thus, due to a high A/T content, it is to be expected that when such genes are placed in E. coli, they are highly likely to be targeted by H-NS. The present invention provides a method for preventing this H-NS interaction by reducing the A/T nucleotide content to equal to or lower than that of the host cell, such as E. coli. These teachings are relevant to many different proteins and many different host cells.
Advantageously, the host cell is a bacterium, preferably a gram-negative bacterium, more preferably Escherichia coli. Other host cells may be contemplated including suitable prokaryotic and eukaryotic cells, such as yeast, insect, mammalian, primate and rodent cells.
According to another embodiment of this first aspect of the invention, there is provided a modified protein encoding nucleic acid comprising a sequence encoding a wild type protein or mutant thereof with an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type protein encoding nucleic acid or mutant thereof. The resultant nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type protein encoding nucleic acid or mutant thereof.
The invention is also directed to a modified protein encoding nucleic acid of the invention and protein, expression system, plasmid vector and host cell comprising the modified protein encoding nucleic acid.
It will be understood that the following discussion on the second aspect of the invention relating to fluorescent proteins is equally applicable to this first aspect of the invention.
According to a second general aspect of the invention, there is provided a method for improving gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid comprising assessing the A and T nucleotide content of the fluorescent protein nucleic acid or mutant thereof; and modifying the A and T nucleotide content of the fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell.
In this manner, the A and T residue content of the fluorescent protein nucleic acid may be equal to or within +/- 15%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, preferably +/- 5%, +/- 4%, +/- 3%, +/- 2%, more preferably +/-1%, +/- 0.5%, +/- 0.1%, of the A and T residue content of the host cell. The modified fluorescent protein nucleic acid or mutant thereof can then be used as a molecular tool in gene expression systems and host cells.
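By way of illustration only, the assessment step can be sketched as a small calculation. The function names below are hypothetical, and the sketch assumes that the "+/-" ranges are read as percentage points of A and T content, which the text does not state explicitly. The 59% and 50% figures are the average A/T contents given in this description for gfp+ and for E. coli.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    at = sum(1 for base in seq if base in "AT")
    return 100.0 * at / len(seq)


def within_tolerance(gene_at: float, host_at: float, tolerance: float) -> bool:
    """True if the gene A/T content lies within +/- tolerance
    (assumed here to mean percentage points) of the host A/T content."""
    return abs(gene_at - host_at) <= tolerance


# gfp+ averages 59% A/T and E. coli approximately 50% (values from the text).
print(within_tolerance(59.0, 50.0, 10.0))  # inside a +/- 10 point window
print(within_tolerance(59.0, 50.0, 5.0))   # outside a +/- 5 point window
```

On this reading, an unmodified gfp+ gene would fall inside the broadest ranges but outside the preferred ones.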
According to one embodiment of this second aspect of the invention, there is provided a method for improving gene expression in a host cell comprising a fluorescent protein nucleic acid or mutant thereof comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type fluorescent protein nucleic acid or mutant thereof; preparing an altered fluorescent protein nucleic acid by modifying the A and T nucleotide content of the wild type fluorescent protein nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered fluorescent protein nucleic acid is reduced compared to the wild type fluorescent protein nucleic acid or mutant thereof; and using the altered fluorescent protein nucleic acid in host cell gene expression systems.
Thus, the crucial aspect of this invention is to modify the fluorescent protein nucleic acid A/T nucleotide content and/or intrinsic curvature to be lower than the A/T nucleotide content of the host cell. Surprisingly, we found that the resultant fluorescent protein nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein encoding nucleic acid or mutant thereof. Unexpectedly, this ensures improved fluorescent protein expression of a structurally identical fluorescent protein.
We have advantageously found that this method reduces the potential detrimental impact of a fluorescent protein nucleic acid or mutant thereof on gene and/or protein expression in a host cell comprising a fluorescent protein nucleic acid.
Specifically, the present invention is based on findings that transcriptional repressor proteins, such as nucleoid-associated transcriptional repressor proteins and typically H-NS or H-NS like proteins, target fluorescent protein genes, such as the gfp gene or mutants or variants thereof. The transcriptional repressor proteins, namely the nucleoid-associated protein H-NS, are molecules that are abundant in E. coli and related organisms and are powerful repressors of transcription. H-NS can create bridges between different DNA molecules or between different parts of the same DNA molecule and can also nucleate along DNA. H-NS is essentially a protein with the ability to slow or block gene expression. Thus, these H-NS proteins interfere with gene expression at the level of transcription.
GFP is commonly used as a transcriptional reporter gene in host cells. We have now unexpectedly shown that H-NS interferes with the expression of the GFP protein in host cells such as bacteria. This discovery is unexpected and we postulate that many of the experiments carried out using GFP previously, in E. coli and related bacteria, are likely to have been complicated by the unsuspected participation of H-NS. The binding of the H-NS protein to the gfp+ gene has the potential to undermine its fidelity as a reporter of gene expression. This can mislead investigators, perhaps causing them to underestimate the levels at which genes of interest are expressed or it may complicate their study by introducing a new and unsuspected layer of gene regulation and artefacts into the system under examination.
Importantly, we also postulate that H-NS binding and oligomerization causes the formation of new topological domains, which can lead to both local transcriptional repression and in addition, genome wide changes in gene expression. As most gfp transcriptional and translational fusions are located on multicopy plasmids, the possibility exists that global gene expression could be altered by the binding of H-NS in gfp, depleting the amount of free H-NS in the cell and thus leading to altered global gene expression. Thus, H-NS also has the potential to modify gene expression in a host cell by upregulating/downregulating various local/distal genes.
Considering that all of the GFP derivatives used today are highly related and contain minimal nucleotide substitutions, and given that H-NS binding is affected by general characteristics such as DNA curvature or intrinsic curvature (which typically requires large scale substitutions to alter), it is reasonable to extrapolate that the genes of all of the commonly used GFP variants, such as but not limited to GFPmut1, GFPmut2 and GFPmut3, are also bound by H-NS or other transcriptional repressor proteins. Thus, this problem is widespread for all GFP variants.
These unexpected findings have led to the development of the method for improving gene and/or protein expression in a host cell comprising a modified fluorescent protein nucleic acid.
It is known that the gene encoding GFP has a high A and T nucleotide content and DNA with this nucleotide content is preferentially bound by the H-NS protein. This repression mechanism frequently involves cross-linking of DNA molecules by H-NS to either distant DNA molecules or different parts of the same molecule to form DNA-H-NS-DNA bridges. These DNA-H-NS-DNA bridges block the free movement of RNA polymerase and interfere with the process of transcription. By recognizing that H-NS binding is a problem with the gfp+ gene in particular, we have engineered a modified gfp+ gene that codes for a protein identical to the gfp+ protein, but is no longer bound by H-NS. Thus, advantageously, the present invention addresses all of the above-mentioned aspects to prevent H-NS binding and associated H-NS mediated transcriptional silencing to result in an improved fluorescent protein for use in known expression systems/host cells. Specifically, the invention focuses on the optimization of gfp for expression in bacterial cells that express H-NS or related proteins. This modified fluorescent protein gene is thus modified to circumvent undesirable interference by H-NS.
Due to the degeneracy of the genetic code, we have modified the nucleotide sequence of the fluorescent gene without disturbing the order of the amino acids in the resultant protein product. Thus, the protein product encoded by the gene remains unchanged. Changing the nucleotide sequence in this manner enables the production of a modified fluorescent protein gene with altered A and T nucleotide content. Accordingly, we have been able to redesign the gfp+ gene such that the A and T nucleotide content matches/equals or is lower than the average A and T nucleotide content of E. coli or other related bacterial and non-bacterial expression systems without altering the amino acid sequence or characteristics of the protein product, GFP. Specifically, we have utilized "codon optimization" involving modifying the gfp+ gene to abrogate binding of the H-NS protein to the gfp gene whilst maintaining the coding sequence to ensure that the protein that is expressed is identical at the amino acid sequence level to the GFP+ protein. In this way the modified gene continues to express a GFP protein that is structurally identical to GFP+.
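The principle of synonymous substitution described above can be sketched in code. This is an illustrative simplification only and not the procedure used to design gfpT: it greedily picks, for each codon, the synonymous codon with the fewest A and T nucleotides, and it ignores host codon usage and intrinsic curvature, both of which would need to be considered in practice. The function names and the short example sequence are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a coding sequence (length a multiple of 3) into amino acids."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def at_count(seq: str) -> int:
    """Count A and T nucleotides."""
    return sum(base in "AT" for base in seq)


def recode_low_at(seq: str) -> str:
    """Greedy sketch: replace each codon with the synonymous codon that has
    the fewest A/T bases, keeping the original codon when it is already minimal."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        out.append(min(synonyms, key=lambda c: (at_count(c), c != codon, c)))
    return "".join(out)


original = "ATGAAAGAATTATAA"  # hypothetical toy sequence, not taken from gfp+
recoded = recode_low_at(original)
assert translate(recoded) == translate(original)  # encoded protein unchanged
print(at_count(original), at_count(recoded))
```

The assertion checks the property the text relies on: the nucleotide sequence changes while the encoded amino acid sequence does not.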
The teachings of the invention in relation to altering A and T nucleotide content are widely applicable. These teachings are applicable to any fluorescent proteins and host cell transcriptional repressor proteins. It will also be understood the concept of the invention is equally applicable to a fluorescent (GFP) protein which is not identical in characteristics/properties to the fluorescent (GFP) protein or mutant/variant thereof that it is derived from.
Furthermore, it will be understood that these teachings are applicable to any fluorescent protein with a high A and T nucleotide content (AT-rich) compared to a host cell nucleic acid content.
In addition to the A and T nucleotide content, intrinsic curvature of the DNA is also important. DNA curvature or intrinsic curvature is related to the nucleotide content of DNA and refers to the curve of the DNA that is caused solely by the nucleotide content. This is influenced by the A/T content (since As and Ts have a higher internal bend angle leading to higher deflection from linear DNA) and also the sequence: there are certain combinations of nucleotides, including Gs and Cs, that can lead to increased bending of the DNA. The term "intrinsic curvature" is used to distinguish sequence determined bends from bends that are introduced by DNA binding proteins. Promoter regions tend to be A/T rich to allow easy strand separation of the DNA. This often leads to promoter regions being intrinsically curved and, thus, good targets for the transcriptional repressor, H-NS. Thus, when the A and T nucleotide content of a nucleic acid is modified, the associated intrinsic curvature is similarly affected. Intrinsic curvature is a good measure/indicator of whether the modified nucleic acid has affinity to such transcriptional repressor proteins. An intrinsic curvature of low amplitude (for example below approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn) is desirable. Thus, the present invention involves modifying the A/T nucleotide residue content to equal or lower the average A and T nucleotide content of the host cell, carrying out codon optimization as needed and monitoring and modifying as necessary the intrinsic curvature of the nucleic acid to achieve an intrinsic curvature of low amplitude.
It will also be understood that the implications of these findings extend beyond host cells such as bacteria (E. coli and Salmonella) and yeasts. Thus, these teachings are applicable to a wide variety of host cells, whether prokaryotic or eukaryotic. Indeed, H-NS or H-NS-like proteins are widespread throughout many types of host cells including eukaryotic cells such as mammalian cells.
These findings have led to the present invention, which is directed to a method for improving gene and/or protein expression in a host cell. Specifically, we have developed a modified fluorescent protein gene which is impervious to interference by the transcriptional repressor proteins, such as H-NS, but which expresses an unaltered fluorescent protein. Hence, the fluorescent gene of the invention accurately reports transcriptional activity, due to the removal of the repressive effects of H-NS.
Advantageously, the new fluorescent gene of the invention can be used to monitor gene expression in bacteria such as E. coli without interference from H-NS binding to ensure that gene/protein expression in a host cell can now be conducted free from the undesirable complications that arise from transcriptional repressor proteins potentially binding to the fluorescent protein gene, with its associated bridging activity interfering with the faithful expression of that gene. Hence, the results should be more physiologically-relevant and free from artefacts caused by H-NS interference.
It is also postulated that lowering the A and T nucleotide content of a fluorescent protein in regions proximal to the promoter region and/or the ribosome binding site may improve gene/protein expression in a host cell.
Specifically, it is postulated that the high A/T content of the gfp+ gene may have had a direct role on transcription (independent of H-NS) since the location of A/T rich regions close to promoter regions can lead to reduced opening of the promoter region and thus reduced access for RNA polymerase and reduced transcription of the gene. This is because energy (usually supplied by underwound or "negatively supercoiled" DNA) is required to break the hydrogen bonds between the two DNA strands to allow access to the promoter region. The region requiring the lowest amount of energy separates first, using up the superhelical energy. Since A and T nucleotides only form 2 hydrogen bonds (G and C form 3), A/T rich regions become single stranded before regions with a lower A/T percentage. Therefore the presence of the A/T rich gfp+ gene proximal to a promoter region could influence the amount of superhelical energy (specifically, the amount of superhelical twist in the DNA) in the system and thus affect transcription of the gene of interest. This aspect is relevant to all A/T rich fluorescent proteins even if they were not bound by H-NS. In this situation the A and T nucleotide content should be assessed and modified accordingly.
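The hydrogen bond argument can be made concrete with a simple count. The sketch below is a crude proxy only: it counts Watson-Crick hydrogen bonds (2 per A-T pair, 3 per G-C pair) and ignores base stacking and supercoiling, which also affect strand separation. The function name and the example sequences are hypothetical.

```python
def watson_crick_h_bonds(seq: str) -> int:
    """Count base-pair hydrogen bonds for a duplex with the given strand:
    2 per A-T pair and 3 per G-C pair."""
    bonds = {"A": 2, "T": 2, "G": 3, "C": 3}
    return sum(bonds[base] for base in seq.upper())


at_rich = "ATATATATAT"  # hypothetical A/T rich stretch
gc_rich = "GCGCGCGCGC"  # hypothetical G/C rich stretch of equal length
print(watson_crick_h_bonds(at_rich), watson_crick_h_bonds(gc_rich))
```

For stretches of equal length, the A/T rich one has fewer hydrogen bonds to break, consistent with it becoming single stranded first.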
Further more specific embodiments of the first general aspect of the invention are given below.
According to one embodiment of the second aspect of the invention, the fluorescent protein nucleic acid is modified so that it is no longer A and T nucleotide rich (AT-rich) compared to the host cell nucleic acid average A and T nucleotide content. Ideally, the A and T nucleotide content of the fluorescent protein nucleic acid is modified to result in reduced affinity, compared to the wild type fluorescent protein nucleic acid or mutant thereof, to host cell transcriptional repressor proteins.
Preferably, the A and T nucleotide content of the fluorescent protein transcriptional repressor protein binding region nucleic acid is modified to equal or lower the A and T nucleotide content of the host cell transcriptional repressor protein nucleic acid binding region.
Preferably, the fluorescent protein nucleic acid has an intrinsic curvature of lower amplitude (for example below approximately 15° per helical turn, approximately 10° per helical turn, more preferably approximately 9° per helical turn, even more preferably approximately 7° per helical turn) than the wild type fluorescent protein nucleic acid.
Advantageously, the host cell is a bacterium, preferably a gram-negative bacterium, more preferably Escherichia coli. Other host cells may be contemplated as described below.
According to another embodiment of this second aspect of the invention, the transcriptional repressor protein is a nucleoid-associated transcriptional repressor protein or repressor like protein, preferably H-NS or H-NS like proteins, such as Sfh and StpA.
The H-NS protein is a member of the family of nucleoid-associated proteins of bacteria. These proteins bind to DNA, regulate gene expression and organize the structure of the nucleoid (the part of the bacterial cell that contains the genetic material). H-NS is abundant (20,000 dimers per cell) and it binds to DNA sequences that have specific characteristics, namely regions of intrinsic curvature and a high A and T content. Each H-NS dimer has two DNA binding domains and this facilitates the construction of DNA-H-NS-DNA bridges. These bridges block the process of transcription (in which a gene is read by RNA polymerase) and silence the expression of genetic information. The H-NS protein binds to DNA sequences throughout the chromosomes of bacteria that express it, such as Escherichia coli, Salmonella and Shigella. The genes that it targets become silenced or down-regulated (Dorman, Nat. Rev. Microbiol. 5 (2007), 157-161).
H-NS is not the only DNA-protein-DNA bridge builder in biology. Such proteins are found in many cell types, including archaea and eukaryotes (Luijsterburg et al., Crit. Rev. Biochem. Mol. Biol. 43 (2008), 393-418). Importantly, a protein from the mouse has been shown to be able to substitute for H-NS in bacteria (Timchenko et al., EMBO J. 15 (1996), 3986-3992). Thus, the present invention relates to H-NS like proteins also.
According to another embodiment of this second aspect of the invention, the A and T nucleotide content of the entire fluorescent protein is modified compared to the wild type fluorescent protein nucleic acid or mutant thereof.
According to an alternative embodiment of this second aspect of the invention, the A and T nucleotide content of the fluorescent protein promoter region and/or ribosome binding site (RBS) nucleic acid is modified compared to the wild type fluorescent protein nucleic acid or mutant thereof.
Ideally, the A and T nucleotide content of the regions proximal to the fluorescent protein nucleic acid promoter region is modified to equal or lower the A and T nucleotide content of the host cell nucleic acid.
Advantageously, the fluorescent protein of the invention may be a green fluorescent protein (GFP), YFP, CFP, BFP or red fluorescent protein (DsRed) or a mutant or variant thereof.
According to a third general aspect of the invention, there is provided a modified fluorescent protein nucleic acid comprising a sequence encoding a wild type fluorescent protein or mutant thereof with an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type fluorescent protein nucleic acid or mutant thereof. Ideally, the nucleic acid has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein nucleic acid or mutant thereof.
According to one embodiment, the modified nucleic acid has reduced A and T nucleotide content across the entire length compared to the wild type fluorescent protein nucleic acid or mutant thereof. According to an alternative embodiment, the modified nucleic acid has reduced A and T nucleotide content in the regions proximal to the promoter region and/or ribosome binding site (RBS) of the fluorescent protein nucleic acid compared to the same regions of the wild type fluorescent protein nucleic acid or mutant thereof.
Preferably, the modified nucleic acid has an A and T nucleotide content equal to or lower than a host cell average A and T nucleotide content.
Ideally, the host cell is a bacterium, preferably the Gram-negative bacterium Escherichia coli which has an A and T nucleotide content of approximately 50%.
According to another embodiment of the invention, the percentage of A and T nucleotides based on the full length modified nucleic acid sequence is from approximately 25% to 70% (for example the host cell Mycobacterium tuberculosis (a Gram-positive bacterium) has an approximate 35% A and T nucleotide content). Ideally, the A and T content is equal to or lower than that of the host cell.
According to a preferred embodiment of the invention, the nucleic acid comprises the nucleic acid sequence of Figures 4B or 6B or a sequence with at least 70%, preferably 80%, more preferably 85%, more preferably 90%, more preferably 95%, even more preferably 99% homology over the entire length to the nucleic acid sequence of Figures 4B or 6B. These are the "gfpT" and "DsRedT" genes of the following examples. The gfpT gene contains 157 nucleotide changes compared to the gfp+ coding sequence, without altering the amino acid sequence of the protein. Both gfp genes are 717 base pairs (bp) and thus vary at the nucleotide level by greater than approximately 20%.
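The figure of "greater than approximately 20%" follows directly from the numbers given: 157 changes over 717 bp is about 21.9%. A minimal sketch of this arithmetic, together with a helper for counting differences between two aligned sequences of equal length, is shown below. The function names are hypothetical and the short sequences in the example are toy inputs, not the sequences of Figures 4A and 4B.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_changed(n_changes: int, length: int) -> float:
    """Express a number of nucleotide changes as a percentage of sequence length."""
    return 100.0 * n_changes / length


# Values stated in the text: 157 changes between gfp+ and gfpT, both 717 bp.
print(round(percent_changed(157, 717), 1))  # 21.9

# Toy example of counting differences between two aligned sequences.
print(count_differences("ATGAAA", "ATGAAG"))  # 1
```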
Ideally, the nucleic acid of the invention has improved transcription compared to the wild type fluorescent protein nucleic acid or mutant thereof.
According to a fourth general aspect of the invention, there is provided a fluorescent protein encoded by the modified nucleic acid as described before.
According to a fifth general aspect of the invention, there is an expression system comprising the modified nucleic acid sequence or fluorescent protein as described above, preferably for use in a host cell such as Escherichia coli. According to a sixth general aspect of the invention, there is provided a plasmid vector comprising the modified nucleic acid sequence or fluorescent protein as described above, preferably for use in a host cell such as Escherichia coli.
According to a seventh general aspect of the invention, there is provided a host cell comprising the modified nucleic acid sequence, fluorescent protein, plasmid vector or expression system as described above.
Ideally, the A and T nucleotide content of the modified fluorescent protein nucleic acid is equal to or lower than the A and T nucleotide content of the host cell nucleic acid.
Preferably, the fluorescent protein is a green fluorescent protein (GFP) or red fluorescent protein (DsRed). The green fluorescent protein mutant may be selected from the following: a spectral variant, a pHluorin, a variant with an altered Stokes shift, an oligomerization variant, a folding variant, a photoactivatable variant, a photoconversion variant, a photoswitchable variant, a redox sensitive variant and/or gfp+.
The transcriptional repressor protein may be any nucleoid-associated repressor protein, preferably H-NS, repressor-like proteins, preferably H-NS like proteins such as Sfh and StpA.
According to an eighth aspect of the invention, there is provided a method of monitoring gene expression in a host cell using the modified nucleic acid, fluorescent protein, expression system or host cell of the invention.
The fluorescent gene of the invention may be used in many commercial applications. For example, it may be used in GFP-based kits for the study of gene expression.
Additionally, the modified nucleic acid of the invention may be inserted into a recombinant vector which may be any vector which may conveniently be subjected to recombinant DNA procedures. The choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g. a plasmid. Alternatively, the vector may be one which, when introduced into a host cell, is integrated into the host cell genome and replicated together with the chromosome(s) into which it has been integrated.
The vector is preferably an expression vector in which the DNA sequence encoding the fluorescent protein of the invention is operably linked to additional segments required for transcription of the DNA. In general, the expression vector is derived from plasmid or viral DNA, or may contain elements of both. The term, "operably linked" indicates that the segments are arranged so that they function in concert for their intended purposes, e.g. transcription initiates in a promoter and proceeds through the DNA sequence coding for the fluorescent protein of the invention.
The promoter may be any DNA sequence which shows transcriptional activity in the host cell of choice and may be derived from genes encoding proteins either homologous or heterologous to the host cell, including native Aequorea GFP genes.
Examples of suitable promoters for directing the transcription of the DNA sequence encoding the fluorescent protein of the invention in mammalian cells are the SV40 promoter (Subramani et al., Mol. Cell Biol. 1 (1981), 854-864), the MT-1 (metallothionein gene) promoter (Palmiter et al., Science 222 (1983), 809-814) or the adenovirus 2 major late promoter.
Specifically, the fluorescent protein gene of the invention may also be placed in plasmid vectors designed for simple cloning of e.g. the promoter of interest (transcriptional fusion) or the gene of interest (translational fusion) or both. While transcriptional fusions report promoter activity, translational fusions are often used to view the movement of the tagged protein by fluorescence microscopy.
Additionally, the fluorescent protein gene of the invention could also be integrated onto the chromosome to allow study of gene expression from its native location. One of the most effective ways to construct these fusions is using the lambda Red mechanism, integrating the gfp gene or modified gfp (gfpT) gene after the end of the coding sequence of the gene of interest (GOI). While colonies can be screened for fluorescence using a FACS, fluorescence microscope, UV lamp or, in some cases, by eye, we recommend using a linked selectable marker such as the kanamycin resistance cassette to allow initial selection of integrants.
Still additionally, a further version could also be developed for random insertion in bacterial genomes using transposon technology.
The number of available fluorescent reporter genes has increased in recent years as researchers have isolated genes encoding fluorescent proteins from an increasing variety of organisms and included the genes in cloning cassettes.
For example, fluorescent proteins from sea creatures have been used as reporter genes capable of integration into DNA via cloning cassettes. Products of these genes fluoresce under certain wavelengths of light, permitting the tracking of proteins in, e.g., heterologous cells, such as dog and monkey cells. The most commonly used proteins of this nature fluoresce green, and were obtained from the jellyfish, Aequorea victoria, and sea pansy, Renilla reniformis. Additionally, a red fluorescent protein (RFP), known as drFP583, and a turquoise fluorescent protein, known as dsFP483, have been isolated from the Indo-Pacific mushroom corals (Discosoma sp. "red" and Discosoma striata, respectively). Both Discosoma and Actinodiscus are mushroom corals, soft bodied anthozoans that do not produce an external skeleton. It should be noted that the relationship between the genus Discosoma and the genus Actinodiscus is not well understood. Both Actinodiscus and Discosoma are members of the Actinodiscidae Family, which is a member of the Corallimorpharia (mushroom) Order. The taxonomy of the Corallimorpharia is poorly defined, and therefore, the nature of the relation of Actinodiscus to Discosoma is uncertain. Discosoma and Actinodiscus are believed to be different genera of the same family, but they could be more closely or distantly related.
Finally, it will be understood that the teachings of this invention are equally applicable to other forms and mutations/variants of the gfp gene, such as those encoding proteins that fluoresce at different wavelengths (e.g. yellow), allowing improved versions of those genes also to be produced.
The Discosoma red fluorescent protein (FP583, commercially known as DsRed) isolated by Matz in 1999 (Matz et al. Nat. Biotech. 17 (1999), 969-973), while providing a potential alternative to gfp+, contains the same intrinsic features that make gfp+ a target for H-NS binding. The dsred gene is 55% A/T and is predicted to contain strong intrinsic curvature. Thus, the teachings of the present invention are also applicable to DsRed, which may be optimized as described above to reduce H-NS affinity.
The invention will be more clearly understood by the following description of some embodiments thereof, given by way of example only, with reference to the accompanying drawings, in which:
Fig. 1 shows that H-NS binds to gfp+ in vivo in accordance with Example 1. Chromatin immunoprecipitation (ChIP) using an H-NS specific monoclonal antibody was followed by quantitative PCR (qPCR). The Y-axis indicates fold enrichment relative to input DNA (DNA before addition of the H-NS specific antibody). Probe 2 showed over 12-fold enrichment over input DNA, indicating that H-NS binds in the gfp+ gene.
Figs. 2A and 2B relate to Example 2 and codon optimization. The new gfp gene (gfpT) of Example 2 contains reduced A/T content and DNA curvature compared to gfp+. The Bend.it program (Vlahovicek et al., Nucleic Acids Res. 31 (2003), 3686-7) was used to determine the predicted curvature of the two gfp genes. Regions of strong intrinsic curvature in gfp+ that are reduced in gfpT are indicated by filled arrows (Figure 2A). The new gfpT gene has reduced A/T content compared to the gfp+ gene (Figure 2B). While for the gfp+ gene the most A/T rich region approaches 80% (average A/T content is 59%), the A/T content of some regions is reduced by over 20% in the new gfpT gene (average A/T content is 50%). The entire coding sequence (717 bp) of both genes is shown in Figures 4A and 4B.
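A regional A/T profile of the kind plotted in Figure 2B can be approximated with a sliding window. The sketch below is illustrative only: the function name is hypothetical, and the window and step sizes are assumptions, since the text does not state the parameters used for the figure. It does not compute curvature, which was predicted with a dedicated program.

```python
def at_profile(seq: str, window: int = 100, step: int = 10) -> list:
    """Return (start, percent A/T) for each full window along the sequence.
    Window and step sizes are illustrative assumptions."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at = chunk.count("A") + chunk.count("T")
        profile.append((start, 100.0 * at / window))
    return profile


# Toy input showing an A/T rich region followed by a G/C rich region.
print(at_profile("AAAAGGGG", window=4, step=2))
# [(0, 100.0), (2, 50.0), (4, 0.0)]
```

Applied to two coding sequences of equal length, the two profiles can be compared window by window to locate the regions in which the A/T content has been reduced most.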
Fig. 3 shows the osmotic induction of proil fusions in accordance with Example 3.
Fig. 3A is a diagram of the downstream regulatory region (DRE) of pro V containing H- NS binding sites essential for repression of proU in low osmolality media. The transcriptional start site of proU is indicated as +1. The positions of high-affinity H-NS binding sites and the points of insertion of the 3 reporter genes are indicated relative to +1.
Fig. 3B shows chromosomal lacZ, gfp+ and new gfpl fusions constructed at +98 bp
(disrupts DRE) and Fig. 3C shows chromosomal lacZ, gfp+ and new gfpl fusions constructed at
+936 bp (DRE intact) creating 3 reporter fusion strains each, β-galactosidase activity of the proU-lacZ fusions was monitored kinetically using a multiscan ascent plate reader (Thermo labsystems). The slope of at least ten samples was used along with the OD6oo of the culture and volume of cells used to determine β-galactosidase activity, gfp fusions were measured using a flow cytometer (Beckman Coulter) and plotted by mean fluorescence values. Strains were grown overnight in LB with the indicated concentrations of NaCI as described previously (Lucht et al., J. Biol. Chem. 269 (1994), 6578-6586). Error bars show the standard error of the mean (SEM) of at least duplicate experiments.
Fig. 3(D) is a bar chart showing the effect of removing H-NS on fusion gene expression in 100 mM NaCl. Mean values and ranges are plotted.
Fig. 3(E) is a table showing the fold-increases in fusion gene expression caused by changes in H-NS occupancy.
Figs. 4A and 4B show the full coding sequences for the gfp+ gene and the new gfpT gene. gfp+ (Scholz et al., Eur. J. Biochem. 267 (2000), 1565-70) contains a number of mutations that improve the folding and emission spectrum of the GFP protein, optimizing it for use with flow assisted cell sorters (FACs). The sequence of gfp+ was obtained from pZep08 (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491). The new gfpT is the gfp made in accordance with Example 2.
Figure 5A is the full coding sequence for one of the most commonly used variants of the gfp gene, the gfpmut2 gene. Recently, every potential promoter in E. coli was cloned into a low copy number vector containing the gfpmut2 gene, allowing for real-time monitoring of the transcriptional activity of the genome (Zaslaver et al., Nat. Meth. 3 (2006), 623-628). The gfpmut2 gene differs from the gfp+ gene sequence by 48 nucleotides. It has a higher A/T content (61%) than the gfp+ gene (59%) and also contains regions of high intrinsic curvature, making it a strong H-NS target (Fig. 5B). Three regions of strong intrinsic curvature are indicated by filled arrows.
Figure 6A is the full coding sequence for DsRed (Bartilson et al., Mol. Microbiol. 39 (2001), 126-135). Figure 6B is the full coding sequence for DsRedT. gfpT has reduced A/T content and intrinsic curvature and is predicted to be a poor target for H-NS. The DsRed coding sequence is 678 nucleotides in length, 55% of which are either an A or a T (Fig 6D). Two regions of strong intrinsic curvature that are reduced in dsredT are indicated by arrows (Fig 6C). This high A/T content and high intrinsic curvature show that DsRed contains the same key determinants of H-NS binding affinity as the gfp+ gene (Fig 2) and thus is most likely bound by H-NS. Considering that DsRed is the basis for the most intensive research into fluorescent proteins since the gfp gene was isolated from Aequorea victoria, a wide range of derivatives of DsRed (Baird et al., Proc. Natl. Acad. Sci. USA 97 (2000), 11984-11989) may also be bound by H-NS.
Fig. 7A is a diagram showing that the fimA promoter (P) is located in an invertible element (fimS). Site-specific recombination at the inverted repeats (IR) flanking fimS results in inversion of the fimA promoter between phase ON and phase OFF.
Figs. 7B to D are the results of determining the response of fimS to novobiocin-induced DNA relaxation using a PCR based assay. Bands corresponding to phase ON and phase OFF are labelled. The percentage of phase ON and phase OFF switches in the population is shown below each gel. Fig 7B (Wild type) shows an increase in switching towards phase ON in response to increasing amounts of novobiocin. In contrast, in Fig 7C (fimA-gfp+), novobiocin causes an increase in switching towards phase OFF. Fig 7D (fimA-gfpT) behaves in a wild type manner, switching towards phase ON in response to novobiocin.
Fig 8 shows that H-NS binds to gfp+ with higher affinity than gfpT in vitro.
Electrophoretic mobility shift assay (EMSA) analysis using purified H-NS and biotinylated gfp+ and gfpT probes (Fig 8A). Biotinylated proU and lacZ probes are used as positive and negative controls, respectively (Fig. 8B). H-NS binding to gfp+ and gfpT (Fig. 8C) in the presence of equal amounts of non-biotinylated (unlabelled) probes. The concentration of purified H-NS used is indicated above each lane.
Fig. 9 shows that gfpT has improved translation efficiency.
Fig 9A is a graph showing the expression of gfp+ and gfpT from the prpBCDE promoter on the plasmid pPro in wildtype and hns mutant cells. The prpBCDE promoter is repressed by glucose and induced by propionate. Mean values and ranges are plotted; fold differences between gfp+ and gfpT fluorescence levels are shown above the bars.
Fig. 9B is a graph showing the codon adaptation index (CAI) values for gfpT and gfp+ in various organisms. Organisms above the dashed line will translate gfpT more efficiently than gfp+ (and wildtype gfp). [Organism classification] and abbreviations: [Alphaproteobacteria] Rh.sph., Rhodobacter sphaeroides; [Betaproteobacteria] Ra.sol., Ralstonia solanacearum; [Epsilonproteobacteria] H.p., Helicobacter pylori; [Gammaproteobacteria] E.c., Escherichia coli; H.i., Haemophilus influenzae; P.a., Pseudomonas aeruginosa; S.e., Salmonella enterica; V.c., Vibrio cholerae; X.c., Xanthomonas campestris; Y.p., Yersinia pestis; [Actinobacteria] M.s., Mycobacterium smegmatis; St.coe., Streptomyces coelicolor; [Firmicutes] B.s., Bacillus subtilis; [Eukaryota] D.m., Drosophila melanogaster; H.s., Homo sapiens; M.m., Mus musculus; Sa.cer., Saccharomyces cerevisiae.
EXAMPLES
GENERAL MATERIALS
gfp+ (Scholz et al., Eur. J. Biochem. 267 (2000), 1565-70) contains a number of mutations that improve the folding and emission spectrum of the GFP protein, optimizing it for use with flow assisted cell sorters (FACs). gfpT is the gfp made in accordance with Example 2 of the invention, which has been codon optimized.
The following examples were carried out in the E. coli K-12 strain CSH50 (the Coli Genetic Stock Center).
XL1 Blue was used as a cloning strain (Stratagene). pZep08 (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491) was used as the source of gfp+.
All restriction enzymes used were from New England Biolabs (NEB).
Phusion polymerase (Finnzymes) was used as per the manufacturer's instructions for all PCRs other than qPCR, where SYBR green qPCR mix (QuantiTect) was used.
The plasmid prep kit used was from RBC Biosciences. All other standard reagents were from Sigma-Aldrich.
Antibiotics, where needed, were used at the following concentrations: carbenicillin 100 μg/ml, kanamycin 50 μg/ml and chloramphenicol 25 μg/ml.
After integration of chromosomal cassettes and single colony purification under antibiotic selection, the cassettes were considered stable and antibiotic selection unnecessary. Strains containing plasmids were maintained under selection to prevent plasmid loss.
Example 1
H-NS binds to the gfp+ gene in vivo.
H-NS binding in gfp+ was confirmed in vivo by chromatin immunoprecipitation using an H-NS specific monoclonal antibody and quantitative PCR. Chromatin immunoprecipitation involves crosslinking proteins to DNA in live cells using formaldehyde, purifying the DNA-protein complexes and then using a specific antibody to the protein of interest (in this case H-NS). This antibody-protein-DNA complex can then be purified, the protein removed from the DNA and the DNA quantified using quantitative real-time PCR. This identifies whether a piece of DNA was bound by H-NS. The fold enrichment of the DNA is an indication of the affinity of the protein of interest for the DNA.
Materials and Methods
Chromatin immunoprecipitation (ChIP) using a monoclonal H-NS specific antibody was performed as described previously (Lucchini et al., PLoS Pathog. 2 (2006), e81). Quantitative PCR was performed on enriched DNA and unenriched (input) DNA using the Rotor-Gene 3000 real-time PCR machine (Corbett Research) and SYBR green (QuantiTect) as per the manufacturer's instructions. Input DNA was quantified using a Nanodrop 2000 (Thermo Scientific). The primers used were designed using Primer3 and are listed below.
ChIP primer list:
gfp.chip.1.fw 5'-GGGTCATACCAACGTTCTGG-3'
gfp.chip.1.rv 5'-TTGTGCCCATTAACATCACC-3'
gfp.chip.2.fw 5'-TACAAGACGCGTGCTGAAGT-3'
gfp.chip.2.rv 5'-TGTGTCCGAGAATGTTTCCA-3'
gfp.chip.3.fw 5'-GGCATGGATGAGCTCTACAAA-3'
gfp.chip.3.rv 5'-TTTCCTTACGCGAAATACGG-3'
gfp.chip.4.fw 5'-TCACTACCGGGCGTATTTTT-3'
gfp.chip.4.rv 5'-TTGAGCAACTGACTGAAATGC-3'
gfp.chip.5.fw 5'-TGTCGGCAGAATGCTTAATG-3'
gfp.chip.5.rv 5'-CTGCCATTCATCCGCTTATT-3'
Each product was between 100 and 150 bp in length, and each primer pair yielded only a single specific product (analyzed by agarose gel electrophoresis after a 40 cycle PCR). ChIP DNA samples were diluted 1 in 10 in AnalaR water (BDH). Each reaction was also performed using 20 ng of input DNA (4 ng/μl). The SYBR green PCR was set up in a 25 μl reaction containing 2.5 μl AnalaR water, 5 μl 1.5 μM forward and reverse primer mix, 12.5 μl SYBR green PCR mix and 5 μl of the DNA sample. The following thermocycle was used: 1. 95 °C for 15 sec, 2. 52 °C for 60 sec, 3. 72 °C for 15 sec. This was repeated for 40 cycles. Cycle threshold (CT) values were extracted and the change in cycle threshold (ΔCT) for each probe was determined as follows: ΔCT = CT input sample - CT ChIP sample. The ΔCT of the ChIP probe with the highest CT value (i.e. the lowest amount of amplified DNA) was normalized to 0 by shifting the ΔCT of all of the ChIP probes by that amount. This ensures a positive value for the fold enrichment calculation.
Fold enrichment was calculated as follows: Fold enrichment = (1 + PCR yield)^ΔCT. The PCR yield refers to the efficiency of the amplification. Since PCR amplification is exponential, the amount of DNA is doubled in every cycle. Therefore, when the CT values of 10-fold dilutions of input DNA are plotted against the log10 of the amount of input DNA per reaction, the slope should be -3.32 [2^3.32 = 10]. When the slope approaches this number, the PCR yield = 1.
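The calculation above can be illustrated with a minimal Python sketch. The function names and the CT values in the usage lines are hypothetical and chosen only for illustration; the conversion from standard-curve slope to PCR yield is the standard relation implied by the text (a slope of -3.32 corresponds to a yield of approximately 1).

```python
def pcr_yield_from_slope(slope):
    """PCR yield from the slope of CT plotted against log10(input DNA).

    A slope of about -3.32 gives a yield of about 1 (doubling per cycle).
    """
    return 10 ** (-1.0 / slope) - 1.0


def fold_enrichment(ct_input, ct_chip, pcr_yield=1.0):
    """Fold enrichment per probe.

    ct_input, ct_chip: dicts mapping probe name -> CT value.
    delta CT = CT(input) - CT(ChIP); the probe with the highest ChIP CT
    is used as the reference and its delta CT is shifted to 0.
    """
    delta = {probe: ct_input[probe] - ct_chip[probe] for probe in ct_chip}
    reference = max(ct_chip, key=ct_chip.get)  # least amplified ChIP probe
    offset = delta[reference]
    return {probe: (1.0 + pcr_yield) ** (d - offset) for probe, d in delta.items()}


# Hypothetical CT values, for illustration only.
example = fold_enrichment({"p1": 20.0, "p2": 20.0}, {"p1": 28.0, "p2": 25.0})
print(example)  # p1 is the reference (1.0); p2 is 2**3 = 8.0
```

With a measured standard curve, `pcr_yield_from_slope` would supply the `pcr_yield` argument in place of the ideal value of 1.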
Results and Conclusion
As shown in Figure 1, probe 2, located in the gfp+ gene, is strongly enriched for H-NS binding while the other probes are not. Probe 2 showed over 12 fold enrichment over input DNA, indicating the presence of H-NS. This suggests that there is a strong H-NS binding site in this region (as supported by the DNA curvature and A/T content data, Figs. 2A and 2B). These data also imply that H-NS bound in this region is not nucleating along the DNA but instead may be forming DNA-protein bridges with a second H-NS binding site. This second site could be proximal to the gfp+ gene and lead to the transcriptional repression of a local gene (as described in Example 3) or distal, potentially forming a new topological domain and altering the expression of multiple genes.
Considering that most gfp transcriptional and translational fusions are located on multicopy plasmids, the possibility exists that the binding of H-NS in gfp could deplete the amount of free H-NS in the cell and thus lead to altered global gene expression.
H-NS titration from native genes was previously shown to occur upon the introduction of A/T rich plasmid DNA (Doyle et al., Science 315 (2007), 251-252).
This is further supported by the recent observation that the presence of a promoterless gfp+ gene on a multicopy plasmid caused a dramatic reduction in Salmonella invasiveness (Clark et al., Microbiology 155 (2009), 461-467).
It is also postulated that H-NS binding occurs in wild type gfp and its derivatives, as the minor modifications in the coding sequence required to alter the characteristics of the protein (often requiring only a single amino acid substitution) are unlikely to alter the determinants of H-NS binding and thus the affinity of H-NS for the DNA. We have demonstrated in Example 2 how the 48 nucleotide difference between the gfpmut2 and gfp+ genes is insufficient to significantly change the A/T content or curvature of the DNA. This instead requires a more thorough approach described in Example 2.
Example 2 Codon Optimization
As shown in Example 1, chromatin immunoprecipitation using an H-NS specific monoclonal antibody showed H-NS binding in the gfp+ gene (Fig. 1). This was supported by bioinformatic analysis of the gfp+ gene that showed high intrinsic curvature and high A/T content, typical features of H-NS bound regions (Fig. 2A).
Materials and Methods
To investigate if H-NS binding in gfp+ was affecting gene expression, we used the well- characterized H-NS regulated promoter proU and showed that gfp+ could functionally replace a H-NS binding site in the proU DRE.
We then redesigned the gfp+ gene to be optimized for E. coli codon usage, which resulted in a drop in the A/T content and intrinsic curvature (Fig. 2B). This new gfpT gene was shown to be unable to replace the H-NS site in the proU DRE (described in the previous report), behaving similarly to the lacZ fusion (which has been shown to not bind H-NS). The characteristics of the lacZ gene were previously investigated (Owen-Hughes et al., Cell 71 (1992), 255-65) when differing results were obtained from lacZ and luxAB reporter fusions inserted in the proU DRE.
In vitro experiments exploiting the altered migration of curved DNA in agarose depending on the presence or absence of ethidium bromide showed that both luxAB and the proU DRE contained curved regions, while the migration of lacZ gene was not altered by the presence of ethidium bromide and thus, the lacZ gene was not intrinsically curved.
The affinity of purified H-NS for luxAB, proU DRE and lacZ was assessed using electrophoretic mobility shift assays (EMSAs). EMSA involves incubation of a purified protein with purified DNA followed by electrophoresis through a polyacrylamide gel. The DNA that is bound by the purified protein migrates slower than unbound DNA and thus can be monitored and quantified. The lacZ DNA was reproducibly shown to be bound with less efficiency (required more H-NS) than both the luxAB DNA and the proU DRE region (Owen-Hughes et al., Cell 71 (1992), 255-65). This was consistent with their observation that H-NS preferentially binds to curved DNA.
As mentioned, the initial basis for their study was the differences in expression between the luxAB and the lacZ reporter fusions inserted in the proU DRE. While the luxAB fusion showed repression of proU in low osmolality media, the lacZ fusion showed derepression of proU under the same conditions.
The differences in transcription were due to H-NS binding in luxAB and the lack of H-NS binding in the lacZ reporter fusion (Forsberg et al., J. Bacteriol 176 (1994), 2128-32).
Analysis of DNA curvature was performed using the bend.it server as described in Vlahovicek et al., Nucleic Acids Res. 31 (2003), 3686-7.
The prediction of DNA curvature can be achieved using a number of different models. Rod models are the simplest form of DNA models: they represent DNA as a cylindrical rod of constant diameter, made up of short cylindrical segments (e.g. the size of a base pair), and compute a given rod parameter (e.g. DNA curvature) on the basis of segment composition. Dinucleotide models define the segment as two adjacent base pairs, while trinucleotide models define the unit around the central base pair of a given trinucleotide. The bend.it server calculates the curvature of DNA molecules based on their DNA sequences using dinucleotide and trinucleotide models.
The genetic code uses 64 nucleotide triplets (codons) to encode 20 amino acids and stop, meaning each amino acid is encoded by on average 3 codons. The frequencies with which codons are used by different organisms vary significantly, leading to variation in G/C content. The degeneracy of the genetic code enables many alternative nucleotide sequences to encode the same protein and allows for the codon optimization of a protein for a specific organism, without altering the protein at the amino acid level.
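The degeneracy described above can be checked with a short Python sketch that builds the standard genetic code and counts the codons per amino acid. The variable names are illustrative; the compact amino acid string is the standard code written in T, C, A, G order.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

codons_per_residue = Counter(GENETIC_CODE.values())
sense_codons = sum(n for aa, n in codons_per_residue.items() if aa != "*")

print(len(GENETIC_CODE))   # 64 codons in total
print(sense_codons / 20)   # 3.05 codons per amino acid on average
print(codons_per_residue["L"], codons_per_residue["M"])  # 6 and 1
```

The spread from one codon (methionine, tryptophan) to six (leucine, serine, arginine) is what gives a redesign room to change nucleotide composition without changing the protein.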
The simplest way to redesign a DNA sequence is to work from the amino acid sequence and use a 'one amino acid - one codon' approach where, for every amino acid, the most abundant codon for the organism of interest is used. A highly expressed gene designed in this way would deplete the transfer RNA pool for those codons, potentially allowing the incorrect incorporation of another tRNA and leading to translation error. In contrast to this method, the program used for optimization of gfp+ (Gene Designer) (Villalobos et al., BMC Bioinformatics 7 (2006), 285) optimizes genes for expression by using a codon usage table in which each codon is given a probability score based on the frequency distribution of the codons in the desired genome, normalized for every amino acid. For the redesign of the gfp+ gene, we used the ECoILCII table that is derived from a collection of highly expressed E. coli genes. This approach avoids the use of rare codons, which are strongly associated with low levels of protein expression due to ribosome stalling and abortive translation. While this ensured that the new gfp gene (gfpT) would be highly expressed in E. coli, the most important factor in the codon optimization was the corresponding reduction in A/T content and reduction in curvature.
Since candidate sequences are generated in silico using a Monte Carlo based algorithm where each codon choice is an independent probabilistic event, the software can perform the optimization on multiple occasions finding a new and equally optimal solution each time. This allowed for screening of the potential new gfp genes for reduced curvature. The variant chosen (gfpT) was synthesized and used in Example 3.
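The candidate generation and screening described above can be sketched in Python as follows. This is an illustration under stated assumptions and not the Gene Designer implementation: the codon frequencies are placeholder values (not the table used in the examples), only a few amino acids are included, and the screen ranks candidates by A/T fraction as a simple stand-in for the curvature screen, since a curvature model is outside the scope of the sketch.

```python
import random

# Placeholder codon frequencies per amino acid (illustrative only).
CODON_TABLE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.7, "AAG": 0.3},
    "G": {"GGT": 0.5, "GGC": 0.4, "GGA": 0.05, "GGG": 0.05},
    "E": {"GAA": 0.7, "GAG": 0.3},
    "L": {"CTG": 0.8, "TTA": 0.05, "TTG": 0.05,
          "CTT": 0.04, "CTC": 0.04, "CTA": 0.02},
    "*": {"TAA": 1.0},
}


def sample_gene(protein, rng):
    """Pick each codon independently, weighted by its table frequency."""
    codons = []
    for aa in protein:
        options = CODON_TABLE[aa]
        codon = rng.choices(list(options), weights=list(options.values()))[0]
        codons.append(codon)
    return "".join(codons)


def at_fraction(seq):
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def best_candidate(protein, n_candidates=200, seed=0):
    """Generate candidates and keep the one with the lowest A/T fraction."""
    rng = random.Random(seed)
    candidates = [sample_gene(protein, rng) for _ in range(n_candidates)]
    return min(candidates, key=at_fraction)


gene = best_candidate("MKGEL*")
print(gene, round(at_fraction(gene), 2))
```

Because every codon choice is an independent draw, repeated runs with different seeds give different sequences that all encode the same protein, which is what makes a downstream screen possible.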
We chose to optimize the gfp+ gene for E. coli to give it an A/T content of 50%. We chose this because the majority of bacterial research is conducted in E. coli and Salmonella. As mentioned earlier, H-NS targets horizontally acquired A/T rich DNA, so giving the gfp+ gene an average A/T content similar to that of the E. coli genome (50%) was intended to allow it to blend in and not be targeted by H-NS as a horizontally acquired gene. This method could be used for all other fluorescent proteins to prevent H-NS binding in the introduced DNA.
Results and Conclusions
Codon optimization reduced the A/T richness and intrinsic curvature of the gfp+ gene. Figure 2A shows the intrinsic curvature results for the gfp+ and gfpT genes and Figure 2B shows the A/T content results for the gfp+ and new gfpT genes. Fig 2 shows that the new gfpT gene contains reduced A/T content and DNA curvature compared to gfp+.
The entire coding sequence (717 bp) of the two gfp genes was analysed using the bend.it program, which determines the predicted curvature (Figure 2A) and A/T content (Figure 2B) of DNA. Regions of strong intrinsic curvature in the gfp+ gene that are reduced in the gfpT gene are indicated by filled arrows.
While for the gfp+ gene the most G/C rich region is 0.6 (60%), the new gfpT gene has strongly G/C rich regions approaching 0.8 (80%). This reflects the overall drop in A/T content between the gfp+ gene (59%) and the gfpT gene (50%).
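A regional composition profile of the kind plotted in Figure 2B can be approximated with a sliding window, as in the minimal Python sketch below. The window size of 30 bp is an assumption made for illustration; the window used by the bend.it server is not stated here.

```python
def at_content(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def windowed_at_content(seq, window=30):
    """A/T fraction for every window of the given size, stepping by 1 bp."""
    seq = seq.upper()
    return [
        at_content(seq[start:start + window])
        for start in range(len(seq) - window + 1)
    ]


# Toy sequence, for illustration only.
print(windowed_at_content("AAAAGGGG", window=4))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

The G/C fraction of each window is simply one minus the A/T fraction, so the same profile serves for both ways of stating the result.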
These plots indicate a general reduction in both A/T content and predicted curvature of the gfp+ gene, thus reducing two determinants of H-NS binding affinity. These data are a strong indicator that the intrinsic qualities of the DNA that allow H-NS to bind in the gfp+ gene are not present in the gfpT gene. The data also show that the gfpmut2 gene is A/T rich and contains strong predicted curvature, and thus is most likely also bound by H-NS. It is therefore probable that many of the gfp variants contain similar traits and are bound by H-NS.
Example 3
New gfpT gene as a transcriptional reporter to a known H-NS regulated gene
Previous studies have shown H-NS binding in the downstream regulatory region (DRE) of proU and that H-NS binding in this region was responsible for repression of proU under low osmolarity conditions (Bouffartigues et al., Nat. Struct. Mol. Biol. 14 (2007), 441-8). In this example, we use the new gfpT gene (as developed in Example 2) as a transcriptional reporter to a known H-NS regulated gene.
Materials
Primer list:
fimA-gfp+-cat.fw
5'-GAT TGA TGC GGG TCA TAC CAA CGT TCT GGC TCT GCA GTA ATG AGC GTT CTA GAT TTA AGA-3'
fimA-gfp+-cat.rv
5'-TCT GCA CAC CAA CGT TTG TTG CGC TAC CCG CAG CTG AAC TCT ACG AGA CAG CAC ATT AAC-3'
kan.int.fw
5'-AAT GTC ATG ATA ATA ATG GTT TCT TAG ACG TCA GGT GGC GTG TAG GCT GGA GCT GCT TCG-3'
kan.int.rv
5'-AGT TCA GCT GCG GGT AGC GCA ACA AAC GTT GGT GTG CAG ACA TAT GAA TAT CCT CCT TA-3'
+98proU-gfp+.fw
5'-GGC AAT TAA ATT AGA AAT TAA AAA TCT TTA TAA AAT ATA ATG AGC GTT CTA GAT TTA AGA-3'
+936proU-gfp+.fw
5'-CCG CCG GAC ACC GAA TGG CTT AAT TCG TAA AAC CCC TTA ATG AGC GTT CTA GAT TTA AGA-3'
+98proU-gfpT.fw
5'-GGC AAT TAA ATT AGA AAT TAA AAA TCT TTA TAA AAT ATG ATT GAT TAA GAA GGA GAT-3'
+936proU-gfpT.fw
5'-CCG CCG GAC ACC GAA TGG CTT AAT TCG TAA AAC CCC TTG ATT GAT TAA GAA GGA GAT-3'
proU.(stop).kan.rv
5'-GTC CGC CGC TGG CGT GGT ATC CCA CGG ATT ATT TTG ATC ACA TAT GAA TAT CCT CCT TA-3'
+98proU-lacZ.fw
5'-GCA ATT AAA TTA GAA ATT AAA AAT CTT TAT AAA ATA TAA TGA GCG GAT AAC AAT TTC ACA-3'
+936proU-lacZ.fw
5'-CGC CGG ACA CCG AAT GGC TTA ATT CGT AAA ACC CCT TAA TGA GCG GAT AAC AAT TTC ACA-3'
proU.(stop).cat.rv
5'-GTC CGC CGC TGG CGT GGT ATC CCA CGG ATT ATT TTG ATC ACA TAT GAA TAT CCT CCT TAG-3'
kan.XmaI.fw
5'-CAT AAC GAG CCC GGG TGT AGG CTG GAG CTG CTT C-3'
kan.XmaI.rv
5'-CAT AAC GAG CCC GGG CAT ATG AAT ATC CTC CTT A-3'
fimA.fw
5'-GGA AAG CAG CAT GAA AAT TAA AAC TC-3'
fimA.rv
5'-GGT TAT TGA TAC TGA ACC TTG AAG G-3'
proU.fw
5'-AGG GGT TGC CCT CAG ATT CTC-3'
proU.rv
5'-GTC AGT CGG TGC AGT CGT C-3'
Methods
Integration of linear (PCR-amplified) DNA was performed as described by Datsenko and Wanner (Proc. Natl. Acad. Sci. USA 97 (2000), 6640-6645). The linear DNA was obtained as follows.
The gfp+-cat cassette from pZep08 was PCR amplified using primers fimA-gfp+-cat.fw and fimA-gfp+-cat.rv and integrated into the chromosome of CSH50 in the fimA gene, selecting for chloramphenicol resistance (encoded by the cat gene). The kanamycin resistance cassette (kan) was PCR amplified from pKD4 (Datsenko and Wanner, Proc. Natl. Acad. Sci. USA 97 (2000), 6640-6645) using primers kan.int.fw and kan.int.rv, which have 5' extensions allowing for chromosomal replacement of the cat gene in the fimA-gfp+-cat construct. The chromosomally located gfp+-kan cassette was then used as template for integration of gfp+-kan into the proU gene. PCR products amplified using either +98proU-gfp+.fw or +936proU-gfp+.fw along with proU.(stop).kan.rv were integrated into the chromosome, creating gfp+ transcriptional fusions to the proU promoter.
The gfpT gene was synthesized by DNA2.0 (San Diego, CA, USA) and provided cloned into a custom vector (pJ204, pUC ori, encodes ampicillin resistance). The kan cassette from pKD4 was PCR amplified using primers kan.XmaI.fw and kan.XmaI.rv, which incorporated XmaI sites to allow for cloning into the XmaI site of pJ204 (located about 30 bp downstream of the gfpT stop codon). The linear and plasmid DNA were digested with XmaI (NEB) and ligated using T4 DNA ligase (Roche) according to the manufacturer's instructions. The ligated DNA was then transformed by heat shock into the E. coli cloning strain XL1 Blue made competent for transformation using calcium chloride (these are standard laboratory techniques). Plasmids were purified (RBC Biosciences) and used as template with primers +98proU-gfpT.fw or +936proU-gfpT.fw along with proU.(stop).kan.rv. The amplification of a ~2 kb product confirmed which colonies contained the kan gene in the correct orientation (promoter facing away from the gfpT coding sequence). These PCR products were then integrated into the chromosome, creating gfpT transcriptional fusions to the proU promoter.
The lacZ gene was PCR amplified from MG1655 lacZY-cat (an E. coli strain containing the cat gene inserted in lacA; obtained from D. M. Stoebel, Department of Microbiology, School of Genetics and Microbiology, Trinity College, Dublin 2, Ireland) using the primers +98proU-lacZ.fw or +936proU-lacZ.fw and the proU.(stop).cat.rv primer. These PCR products were then integrated into the chromosome, creating lacZ transcriptional fusions to the proU promoter.
A stop codon (TAA or TGA) was included in every forward primer so that it integrated in-frame and prevented formation of a translational fusion.
Presumptive integrants were screened for an increased size compared to the wild type (WT) gene using either fimA.fw and fimA.rv (which amplify a ~550 bp region of fimA in WT CSH50) or proU.fw and proU.rv (which amplify a ~1.4 kb flanking region of the proU promoter in WT CSH50).
The presence of either gfp+-cat or gfp+-kan in fimA led to an increase in size of ~2 kb compared to WT, since the integrations were made to insert at a single site.
Integration in proU causes a ~1.2 kb deletion when integrated at +98 bp from the proU transcriptional start site or a ~330 bp deletion when integrated at +936 bp from the proU transcriptional start site. The 3' integration event is directly before the stop codon in proV (the first gene transcribed by the proU promoter). Fusions of the correct predicted size (as analyzed by agarose gel electrophoresis) were sequenced on both strands to ensure correct integration (GATC Biotech).
PD32 is an H-NS deficient strain containing the bla gene (which encodes resistance to the antibiotic ampicillin) inserted in the hns gene (Dersch et al., Mol. Microbiol. 8 (1993), 875-889). The mutant hns allele was transduced using phage P1vir (a standard technique in molecular biology) into the proU fusion strains to analyse expression from proU in the absence of H-NS.
Osmotic induction of proU fusions:
Single colonies were resuspended in 100 μl of L broth containing no salt (LO) and used to inoculate 3 ml broths containing increasing amounts of salt (NaCl). Cultures were incubated in aerated conditions overnight at 37 °C and 200 rpm. 20 μl samples for flow cytometry (gfp+ and gfpT fusions) were added to 1 ml of 2% (vol/vol) formaldehyde/phosphate buffered saline (PBS). Samples were left at 4 °C under tin-foil until fluorescence was measured using a flow cytometer (Beckman Coulter).
β-Galactosidase activity of the lacZ fusions was measured as described by Miller (1992) with minor differences. Reactions were performed in 96-well microtiter plates. The kinetics of substrate hydrolysis at 37 °C were measured for at least ten samples, at 30 second intervals after an initial 3 minute lag period, using a Multiskan Ascent plate reader (Thermo Labsystems). The total volume of each reaction was maintained at 200 μl.
β-Galactosidase activity was determined according to the following formula: Slope (OD414/time) / (OD600 × volume (μl) of cells used)
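The formula above can be illustrated with a short Python sketch. The use of an ordinary least-squares fit to obtain the slope is an assumption (the method of slope estimation is not specified), and the readings in the usage lines are invented for illustration.

```python
def least_squares_slope(times, readings):
    """Slope of readings versus time from an ordinary least-squares fit."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(readings) / n
    numerator = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, readings))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


def beta_gal_activity(times, od414, od600, volume_ul):
    """Slope (OD414/time) / (OD600 x volume of cells used in microlitres)."""
    return least_squares_slope(times, od414) / (od600 * volume_ul)


# Ten hypothetical readings taken at 30 second intervals.
times = [30 * i for i in range(10)]
od414 = [0.1 + 0.002 * t for t in times]
print(beta_gal_activity(times, od414, od600=0.5, volume_ul=20))  # about 0.0002
```

Dividing by both the culture density and the sample volume puts readings from cultures of different density on a common per-cell scale.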
Sodium chloride (NaCl) was used to increase the osmolarity of the media as described previously (Lucht et al., J. Biol. Chem. 269 (1994), 6578-6586).
Results and Conclusion
Figure 3A shows the downstream regulatory region (DRE) of proV containing H-NS binding sites essential for repression of proU in low osmolarity media.
Chromosomal lacZ, gfp+ and new gfpT fusions were constructed at +98 bp (disrupts DRE) (Fig. 3B) and +936 bp (DRE intact) (Fig. 3C), creating 6 reporter fusion strains.
We found that, as expected, insertion of a lacZ reporter gene at +98 bp causes derepression of the proU promoter at low salt. This is because H-NS does not bind in lacZ and its insertion at +98 bp from the transcriptional start site deletes the H-NS binding site in the downstream regulatory element (DRE) (Fig 3A).
Insertion of the gfp+ gene at an identical site caused no derepression at low salt, demonstrating that H-NS binding in the gfp+ gene could functionally replace the well characterized, high affinity H-NS binding site in the DRE (Fig. 3B).
Insertion of the new gfpT gene disrupting the DRE leads to derepression of the proU promoter at low salt (similar to lacZ), indicating that H-NS does not bind in the new gfpT gene (Fig 3B). As a control, each reporter fusion was inserted at +936 bp, leaving the DRE intact. As expected, these constructs showed repression of proU at low salt (Fig 3C).
To confirm that H-NS binding to gfp+ accounted for repression of proU-gfp+ (+98) in vivo, the expression of all proU fusions was tested in an H-NS deficient background. Cells were cultured in the repressive conditions of 100 mM NaCl, and the data expressed as a percentage of maximal derepressed expression to facilitate comparisons between GFP fluorescence and β-galactosidase activity (Fig 3D). For all three fusions, proU expression was elevated in the absence of H-NS. This revealed that even in the absence of the DRE, H-NS continued to bind the URE and repress the proU-gfpT (+98) and proU-lacZ (+98) fusions. These data allowed an assessment of the relative effects of replacing only the DRE compared to elimination of H-NS protein from the cell (Fig. 3E). Replacing the DRE with gfpT resulted in an 8-fold increase in expression relative to expression from the proU-gfpT (+936) fusion. A similar comparison between proU-gfp+ at position +98 and +936 showed that gfp+ inserted in the DRE maintained full repression. Consequently, eliminating H-NS genetically resulted in a 7.3 to 9.4-fold increase in expression from the gfp+ fusions, while the expression of proU-gfpT (+98) improved only 2.2-fold upon removal of H-NS (Fig 3E).
These experiments were based on the previously described observation that H-NS binding in the DRE is required to maintain repression of the proU promoter in low osmolarity media. The derepression of the proU promoter upon the insertion of the gfpT gene at +98 bp indicates that the codon optimization process, which reduced the A/T content and predicted curvature of the gfpT gene, was sufficient to prevent H-NS binding. The gfpT gene is shown here to be a faithful reporter of gene activity. Equally, the gfp+ gene is shown to be an unfaithful reporter of gene activity, effectively replacing the native H-NS binding site in the DRE.
While these results suggest that fluorescence levels of the GFP protein have not been altered (Fig 3C), it is shown in Example 7 that GFPT is translated more efficiently than GFP+ due to the codon optimization of GFPT (as described in Example 2).
Example 4
New gfpT gene as a transcriptional reporter to a second known H-NS regulated gene.
Summary
Type 1 fimbrial expression in Escherichia coli is controlled through inversion of a 314-bp invertible element (fimS), which contains the promoter for the major fimbrial subunit fimA (Fig 7A).
The site-specific recombinase FimB binds at the inverted repeats catalyzing the inversion of the fimS DNA segment leading to either fimbriate (phase ON) or afimbriate cells (phase OFF).
The inversion of fimS is sensitive to varying levels of DNA supercoiling. DNA supercoiling is controlled through the antagonistic actions of DNA gyrase (which tightly winds the DNA) and topoisomerase I (which relaxes the DNA). Under normal conditions, inversion from phase ON to phase OFF and from phase OFF to phase ON occurs at an equal rate. When the DNA is relaxed due to the addition of the DNA gyrase inhibiting drug novobiocin, switching from phase OFF to phase ON dramatically increases (Fig 7B).
We found that the introduction of the gfp+ gene led to a reversal in this phase ON switching bias, with the population instead biasing towards phase OFF in response to novobiocin (Fig 7C). Switching in an isogenic fimA-gfpT fusion strain biased towards phase ON in response to novobiocin (Fig 7D), indicating that the repressive effects on inversion of fimS caused by the presence of the gfp+ gene are absent in the new gfpT gene.
The negative effect of gfp+ on fimS inversion was most likely due to H-NS binding in gfp+ and interacting with H-NS bound in fimS, allowing new DNA bridges to form around fimS. This would alter local DNA topology and may affect the interaction of the FimB recombinase with the inverted repeats.
Materials and methods
Novobiocin sodium salt (Sigma) was prepared fresh in sterile water before use. The construction of the fimA-gfp+-kan fusion was described in Example 3. A gfpT-kan cassette was integrated into fimA using the primers
fimA-gfpT-kan.fw
5'-GAT TGA TGC GGG TCA TAC CAA CGT TCT GGC TCT GCA GGA TTC ATT AAG AAG GAG AT-3'
fimA-gfpT-kan.rv
5'-TCT GCA CAC CAA CGT TTG TTG CGC TAC CCG CAG CTG AAC TCA TAT GAA TAT CCT CCT TA-3'
Novobiocin induction was performed as follows. Briefly, single colonies were resuspended in 100 μl LB broth and used to inoculate 3 ml of LB in a 15 ml test tube; cultures were then incubated at 37 °C/200 rpm (aerobic conditions). Cultures were allowed to grow until exponential phase (when the optical density at 600 nm (OD600) is between 0.2 and 0.4), at which point they were diluted and used to inoculate 3 ml broths containing varying concentrations of novobiocin. Dilutions were calculated to allow 15 generations before cessation of growth at an OD600 of ~3, assuming each generation resulted in a doubling of the OD600 of the culture.
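The dilution arithmetic described above can be sketched as follows. This is only an illustration of the stated assumption (one doubling of OD600 per generation); the function names are hypothetical, and the pre-culture OD600 of 0.3 is an example value chosen from within the 0.2 to 0.4 window, not a figure taken from the experiments.

```python
def starting_od(final_od: float, generations: int) -> float:
    """OD600 the diluted culture must start at, assuming one doubling per generation."""
    return final_od / (2 ** generations)


def dilution_factor(culture_od: float, final_od: float, generations: int) -> float:
    """Fold dilution of the exponential-phase culture needed to reach starting_od."""
    return culture_od / starting_od(final_od, generations)


if __name__ == "__main__":
    # Illustrative pre-culture OD600 of 0.3 (an assumed value, not experimental data).
    target = starting_od(final_od=3.0, generations=15)
    fold = dilution_factor(culture_od=0.3, final_od=3.0, generations=15)
    print(f"starting OD600: {target:.2e}")
    print(f"dilute pre-culture about 1 in {fold:.0f}")
```

With 15 generations the culture increases 2^15 = 32768-fold, so the required starting OD600 is the final OD600 divided by 32768.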
The orientation of fimS was determined using a PCR-based assay. Primers OL20 (5' - CCG TAA CGC AGA CTC ATC CTC - 3') and OL4 (5' - GAC AGA ACA ACG ATT GCC AG - 3') were used to amplify from outside of the invertible region. The resulting PCR products were digested at a unique BstUI site, asymmetrically located within fimS, allowing for distinction between cells containing fimS in the phase ON or phase OFF orientation based on the size of the digested fragments (see Figs 7B-D). Digested fragments were electrophoresed through a 2% agarose/TAE gel containing ethidium bromide (1 μg/ml). Gels were visualized under ultraviolet light (AlphaImager 2200, Alpha Innotech). The bands corresponding to phase OFF and phase ON were measured by densitometry using Image J software, and the percentage of phase OFF and phase ON cells was calculated. Each gel shown is representative of at least 3 independent experiments.
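The percentage calculation from the densitometry readings might look like the sketch below. The text does not state how the band intensities were normalised, so this assumes one diagnostic band per orientation and applies no correction for fragment length (ethidium bromide signal scales with DNA mass, so such a correction may be appropriate in practice). The function name and the example intensities are hypothetical.

```python
def phase_percentages(on_intensity: float, off_intensity: float) -> tuple[float, float]:
    """Return (% phase ON, % phase OFF) from two background-corrected band intensities."""
    total = on_intensity + off_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    pct_on = 100.0 * on_intensity / total
    return pct_on, 100.0 - pct_on


if __name__ == "__main__":
    # Made-up intensities in arbitrary densitometry units, for illustration only.
    pct_on, pct_off = phase_percentages(on_intensity=30.0, off_intensity=70.0)
    print(f"phase ON: {pct_on:.1f}%  phase OFF: {pct_off:.1f}%")
```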
Results and Conclusions
The results shown in Figures 7B to D demonstrate that the presence of the gfp+ gene can have a dramatic effect on the inversion of fimS, while the gfpT gene has no such effect. That a gene designed to act as a transcriptional reporter can have such a dramatic effect on the architecture of the DNA has serious implications for the results obtained from previous studies using gfp. If, as we have shown, gfp+ can affect a recombination event, it is possible that the presence of the gfp gene could affect other processes requiring recombination events such as chromosome replication, plasmid partitioning and phage integration. Any alteration in these events could have a dramatic effect on the cell, possibly slowing growth, causing plasmid loss or having other unforeseen consequences, compromising the validity of the results obtained by using the reporter fusion.
Example 5 H-NS binds to gfp+ with higher affinity than gfpT in vitro
Summary
The regulatory regions flanking the proU promoter are A+T rich, highly intrinsically curved (Owen-Hughes et al., Cell 71 (1992), 255-65) and contain multiple high affinity H-NS binding sites (Bouffartigues et al., Nat. Struct. Mol. Biol. 14 (2007), 441-8). This DNA was therefore used as a positive control for H-NS binding (Figure 8). The lacZ reporter gene is a poor target for H-NS as it is relatively G+C rich and not intrinsically curved (Owen-Hughes et al., Cell 71 (1992), 255-65) and was used as a negative control for H-NS binding (Figure 8).
Electrophoretic mobility shift assays (EMSA) were performed to determine the affinity (Kapp) of H-NS for gfp+ and gfpT in vitro. Because H-NS binds with low specificity and affinity, and its binding is highly co-operative, a narrow range of protein concentrations was chosen in order to assess H-NS binding affinity accurately for the two gfp genes. H-NS was found to bind gfp+ strongly with a Kapp of 4.9 nM (Figure 8A). A further indication of the high affinity of H-NS for gfp+ is the narrow range of protein (4.5-10.55 nM) required for the transition from initial binding to fully bound probe, resulting in a single high molecular mass complex. This also illustrates the highly co-operative nature of H-NS binding. H-NS had a lower affinity for gfpT (Figure 8A; Kapp, 7.5 nM). The lower affinity of H-NS for gfpT resulted in smearing of the DNA over a wide range of protein concentrations (7.9-18.75 nM), with the gfpT probe only resolving as a single bound complex at 25 nM H-NS.
The proU regulatory region was used as a positive control for H-NS binding (Figure 8B). As expected, the proU probe was strongly bound by H-NS (Kapp 6.2). The proU region contains a number of well characterized H-NS binding sites (Bouffartigues et al., Nat. Struct. Mol. Biol. 14 (2007), 441-8) and resolved as two separate high affinity complexes (arrowed). lacZ was poorly bound by H-NS (Kapp 16.3) and the resolution of a single protein-DNA complex at high H-NS concentrations (25 nM) simply highlights the low specificity of H-NS, which at saturating concentrations binds independently of sequence to DNA (Tupper et al., EMBO J. 13 (1994), 258-268). Although the change in Kapp between gfp+ and gfpT measured in vitro was relatively small, the difference in binding affinity was highly significant in vivo. H-NS affinity for gfp+ and gfpT was also compared in the same reaction using biotinylated and unlabelled DNA in equal amounts (50 pM) (Figure 8C). These data showed that when both genes were present, H-NS bound specifically to gfp+ and only bound gfpT when all the gfp+ DNA had been bound (14.1 nM).
Materials and Methods
H-NS binding to each probe (0.4 ng DNA per reaction) was carried out in 20 μl reaction mixtures containing increasing concentrations of purified H-NS protein (final concentrations: 0-25 nM) in 20 mM Tris-HCl, 1 mM EDTA, 100 μg/ml BSA, 1 mM DTT, 10% glycerol and 80 mM NaCl. Reactions were incubated at 4 °C for 30 min. 10 μl of each reaction was loaded (without the addition of loading dye) onto a 5% polyacrylamide gel [5% acrylamide/bisacrylamide (30:1) (National Diagnostics), 2% glycerol, 0.5X TBE] and electrophoresed at 90 V for 2 h (4 °C), followed by electrophoretic transfer (30 V for 1 h) to Biodyne B 0.45 μm membrane (Pall). 0.5X TBE (45 mM Tris-borate [pH 8.3] containing 1.25 mM disodium EDTA) was used as both running and transfer buffer. The wet membrane was UV treated twice at 150 mJ in a GS Gene Linker UV chamber (Biorad). The Chemiluminescent Nucleic Acid Detection Module (Pierce) was used as per the manufacturer's instructions, followed by signal detection using developer and fixer solutions (Kodak) and Hyperfilm (Amersham Biosciences). Densitometric analysis was performed using Image J software.
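One way an apparent affinity could be read from such densitometry data is sketched below. The text does not define how Kapp was derived, so this assumes the common convention that Kapp is the protein concentration at which half of the probe is bound, estimated by linear interpolation between the two titration points that bracket 50% binding. The function name and the example titration are hypothetical and are not the measured data.

```python
def apparent_kd(concentrations_nm: list[float], fraction_bound: list[float]) -> float:
    """Interpolate the protein concentration at which fraction bound crosses 0.5."""
    points = list(zip(concentrations_nm, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo < 0.5 <= f_hi:
            # Linear interpolation between the two bracketing titration points.
            return c_lo + (0.5 - f_lo) * (c_hi - c_lo) / (f_hi - f_lo)
    raise ValueError("fraction bound never crosses 0.5 in the supplied titration")


if __name__ == "__main__":
    # Made-up titration (nM protein, fraction of probe shifted), for illustration only.
    conc = [0.0, 5.0, 10.0]
    bound = [0.0, 0.25, 0.75]
    print(f"Kapp is roughly {apparent_kd(conc, bound):.1f} nM")
```

For strongly co-operative binding, fitting a Hill-type curve to the full titration would be an alternative to interpolation; the choice here is only for brevity.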
PCR was used to amplify the entire coding sequence of gfp+ and gfpT and equivalently sized regions (717 bp) in proU and lacZ. The primers used are listed 5' to 3' below:
gfp+.bs.fw ATG AGT AAA GGA GAA GAA CTT TTC
gfp+.bs.bio.rv TTA TTT GTA GAG CTC ATC CAT G
gfpT.bs.fw ATG AGC AAA GGC GAA GAG CT
gfpT.bs.bio.rv TTA CTT ATA CAG TTC ATC CAT ACC G
proV.bs.fw AGG GTG TTA TTT TCA AAA ATA TCA C
proV.bs.bio.rv CAT ATG CGG CAT TAA GGC AA
lacZ.bs.fw ATG ACC ATG ATT ACG GAT TCA CTG G
lacZ.bs.bio.rv AGC GCG GCT GAA ATC ATC AT
Bio indicates primers that contained a 5' biotin tag that allows for visualization of the DNA. Primers with identical sequences to gfp+.bs.bio.rv and gfpT.bs.bio.rv but without 5' biotin tags were used to generate the unlabelled DNA in Figure 8C.
Example 6 Optimization of a second fluorescent protein, DsRed, to reduce H-NS affinity
The same method of optimization as applied to gfp+ was used to alter the dsred gene in silico. The selected optimized gene, dsredT, differs from dsred by 155 nucleotide substitutions across the 678 base-pair gene, and has reduced A/T content (49%) and reduced predicted DNA curvature (Figure 6A to C). We predict that this optimized dsredT will have a lower H-NS affinity than dsred.
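Two of the quantities quoted above, A/T content and the number of nucleotide substitutions, are straightforward to compute; a minimal sketch follows. The function names are hypothetical. Because the dsred sequences appear only in the figures, the example instead compares the first 20 nucleotides of gfp+ and gfpT as given by the gfp+.bs.fw and gfpT.bs.fw primers in Example 5. Predicted DNA curvature is not computed here.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "AT") / len(seq)


def substitutions(original: str, optimized: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(original) != len(optimized):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(original.upper(), optimized.upper()))


if __name__ == "__main__":
    # First 20 nt of each gene, taken from the primer sequences listed in Example 5.
    gfp_plus_5prime = "ATGAGTAAAGGAGAAGAACT"
    gfp_t_5prime = "ATGAGCAAAGGCGAAGAGCT"
    print(f"gfp+ A/T content: {at_content(gfp_plus_5prime):.0f}%")
    print(f"gfpT A/T content: {at_content(gfp_t_5prime):.0f}%")
    print(f"substitutions: {substitutions(gfp_plus_5prime, gfp_t_5prime)}")
```

A whole-gene comparison would use the same two functions on the full open reading frames.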
Example 7 gfpT is translated more efficiently than gfp+
Materials and methods
Cloning gfp variants in plasmid pPro
Blunt-ended PCR amplicons of the gfp+ and gfpT open reading frames were generated using Phusion polymerase (NEB) and primers gfp+.pPro24-blunt.fw, gfp+.pPro24-PstI.rv, gfpT.pPro24-blunt.fw, gfpT.pPro24-PstI.rv (listed 5' to 3' below).
gfp+.pPro24-blunt.fw AGT AAA GGA GAA GAA CTT TTC
gfp+.pPro24-PstI.rv TCT ACT GCA GTT ATT TGT AGA GCT CAT CCA TG
gfpT.pPro24-blunt.fw AGC AAA GGC GAA GAG CTG TTC
gfpT.pPro24-PstI.rv TCT ACT GCA GTT ACT TAT ACA GTT CAT CC
PCR amplicons were first digested with PstI, then blunt ends were phosphorylated using T4 polynucleotide kinase in T4 ligase buffer (Roche Diagnostics, Mannheim, Germany) followed by purification with a HiYield gel/PCR DNA fragments extraction kit (RBC Biosciences). pPro24 (Lee & Keasling, 2005) was digested with SmaI and PstI, dephosphorylated using Antarctic phosphatase (NEB), and then ligated to PCR amplicons using a Rapid DNA ligation kit (Roche). Correct clones were confirmed by DNA sequencing.
The prpBCDE promoter in pPro24-gfp clones was induced with propionate as follows: streak-isolated colonies were used to inoculate 4 ml LB (86 mM NaCl) broth cultures and these were grown to an OD600 of ~0.5. Cultures were then diluted 1/500 into fresh LB including glucose (to repress the prpBCDE promoter) or propionate (to induce it). Cultures were grown overnight at 37 °C with shaking and samples were fixed and analysed by flow cytometry the following morning.
Calculation of Codon Adaptation Index (CAI) values.
CAI values were calculated using CAIcal (Puigbo et al., Biol Direct. 3, (2008), 38) from codon usage tables provided by the Codon Usage Database (Nakamura et al., Nucl. Acids Res. 28, (2000), 292). All codon usage tables used in this analysis were derived from whole genome sequences.
Results
Improved translation efficiency was confirmed by cloning gfp+ and gfpT in the pPro vector under control of the propionate-inducible prpBCDE promoter (Lee et al., Appl. Environ. Microbiol. 71 (2005), 6856-6862), for which there is no evidence of H-NS binding or repression (Dillon et al., Mol. Microbiol. (2010), doi:10.1111/j.1365-2958.2010.07173.x). In pPro, gfpT produced on average 3.5-fold more GFP in a wildtype background and 2.7-fold more GFP in a hns mutant background compared to gfp+ (Figure 9A). The codon adaptation index (CAI) is a standard means to calculate the effects of species-specific codon biases on translation (Sharp et al., Nucl. Acids Res. 15, (1987), 1281-1295). Figure 9B shows that gfpT is predicted to have improved translation efficiency in both bacterial and eukaryotic model organisms.
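For readers unfamiliar with the CAI, the underlying calculation can be sketched as follows: each codon receives a relative adaptiveness w (its usage divided by the usage of the most frequent synonymous codon), and the CAI of a gene is the geometric mean of w over its codons. This is not the CAIcal implementation used for the analysis. The function names are hypothetical, and the small usage table holds invented numbers for three amino acids purely to make the example runnable; a real calculation would use a full, species-specific codon usage table with positive frequencies.

```python
import math
from collections import defaultdict

# Toy codon table and usage values: invented for illustration, not real E. coli data.
TOY_CODE = {"AAA": "K", "AAG": "K", "GGA": "G", "GGC": "G", "GAA": "E", "GAG": "E"}
TOY_USAGE = {"AAA": 30.0, "AAG": 10.0, "GGA": 8.0, "GGC": 32.0, "GAA": 40.0, "GAG": 20.0}


def relative_adaptiveness(usage: dict[str, float], codon_to_aa: dict[str, str]) -> dict[str, float]:
    """w = codon frequency divided by that of the most used synonymous codon."""
    best = defaultdict(float)
    for codon, freq in usage.items():
        aa = codon_to_aa[codon]
        best[aa] = max(best[aa], freq)
    return {codon: freq / best[codon_to_aa[codon]] for codon, freq in usage.items()}


def cai(orf: str, weights: dict[str, float]) -> float:
    """Geometric mean of w over the codons of an ORF that appear in the weight table."""
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3].upper() for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    weights = relative_adaptiveness(TOY_USAGE, TOY_CODE)
    # Same peptide (Lys-Gly-Glu) encoded with a rarer or a preferred glycine codon.
    print(f"AAA GGA GAA: {cai('AAAGGAGAA', weights):.2f}")
    print(f"AAA GGC GAA: {cai('AAAGGCGAA', weights):.2f}")
```

Codons absent from the weight table (for example ATG, which has no synonymous alternatives) are skipped in this sketch.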
High levels of GFP, usually associated with expression from a multicopy plasmid, have previously been shown to be toxic to the host cell, leading to dramatic plasmid loss or mutation in order to reduce the amount of GFP being produced (Hautefort et al., App. Env. Microbiol. 69 (2003), 7480-7491). It is possible that a part of this toxicity is due to the sub-optimal codon usage of gfp+ leading to reduced translation efficiency. This would cause an increased burden on the translational machinery (including the ribosomal components), preventing efficient translation of essential host genes and thus creating toxic effects on the host.
GFPT is optimized for high expression in E. coli and thus may have reduced toxicity when highly expressed.
Another potential advantage may be increased fluorescence of GFPT vs GFP+ when the genes are weakly transcribed. This could arise due to the optimized coding sequence of GFPT allowing more efficient use of the host transfer RNA pool than GFP+, allowing faster translation and thus, a greater accumulation of fluorescent protein. This would result in GFPT cells having a higher level of fluorescence than the GFP+ cells under identical conditions.
Therefore, GFPT potentially has a wider range of fluorescence than GFP+, without having the same detrimental effect on the host.
In the specification the terms "comprise, comprises, comprised and comprising" and the terms "include, includes, included and including" are all deemed totally interchangeable and should be afforded the widest possible interpretation.
The invention is in no way limited to the embodiment hereinbefore described which may be varied in both construction and detail within the scope of the claims.

Claims

1. A method for improving gene expression in a host cell comprising a protein encoding nucleic acid comprising assessing the A and T nucleotide content and/or the intrinsic curvature of a wild type protein encoding nucleic acid or mutant thereof; preparing an altered protein encoding nucleic acid by modifying the A and T nucleotide content of the wild type protein encoding nucleic acid or mutant thereof to equal or lower the A and T nucleotide content of the host cell such that the intrinsic curvature of the altered protein encoding nucleic acid is reduced compared to the wild type protein encoding nucleic acid or mutant thereof and the altered protein encoding nucleic acid has reduced affinity, compared to the wild type protein encoding nucleic acid or mutant thereof, to host cell transcriptional repressor proteins; and using the altered protein encoding nucleic acid in a host cell gene expression system.
2. The method according to claim 1 wherein the host cell transcriptional repressor protein is a nucleoid-associated transcriptional repressor protein, including H-NS.
3. The method according to any of the preceding claims wherein the wild type protein encoding nucleic acid or mutant thereof is modified so that it is no longer A and T nucleotide rich (AT-rich) compared to the host cell nucleic acid average A and T nucleotide content.
4. The method according to any of the preceding claims wherein the modified A and T nucleotide content of the altered protein encoding transcriptional repressor protein binding region nucleic acid is equal to or lower than the A and T nucleotide content of the host cell transcriptional repressor protein nucleic acid binding region.
5. The method according to any of the preceding claims wherein the A and T nucleotide content of the altered protein encoding promoter region and/or ribosome binding site (RBS) nucleic acid is modified compared to the wild type protein encoding nucleic acid or mutant thereof.
6. The method according to claim 5 wherein the A and T nucleotide content of the regions proximal to the altered protein encoding nucleic acid promoter region are modified such that the A and T nucleotide content is equal to or lower than the A and T nucleotide content of the host cell nucleic acid.
7. The method according to any of the preceding claims wherein the host cell is a bacterium, preferably Escherichia coli or Salmonella, or a yeast.
8. The method according to any of the preceding claims wherein the protein encoding nucleic acid is a fluorescent protein nucleic acid.
9. A modified nucleic acid comprising a sequence encoding a wild type protein or mutant thereof wherein the nucleic acid has an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type protein encoding nucleic acid or mutant thereof characterised in that the protein has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type protein encoding nucleic acid or mutant thereof.
10. A modified fluorescent protein nucleic acid according to claim 9 comprising a sequence encoding a wild type fluorescent protein or mutant thereof, wherein the nucleic acid has an equal or lower A and T nucleotide content and/or reduced intrinsic curvature compared to the wild type fluorescent protein nucleic acid or mutant thereof characterised in that the protein has reduced affinity to one or more host cell transcriptional repressor proteins compared to the wild type fluorescent protein nucleic acid or mutant thereof.
11. The modified nucleic acid according to claim 10 with equal or lower A and T nucleotide content in the regions proximal to the promoter region and/or ribosome binding site (RBS) of the fluorescent protein nucleic acid compared to the wild type fluorescent protein nucleic acid or mutant thereof.
12. The modified nucleic acid according to claims 10 or 11 with improved transcription compared to the wild type fluorescent protein nucleic acid or mutant thereof.
13. The modified nucleic acid according to any of claims 10 to 12 comprising the nucleic acid sequence of Figure 4B or 6B or a sequence with at least 70%, preferably 80%, more preferably 90%, more preferably 95%, even more preferably 99% homology over the entire length to the nucleic acid sequence of Figure 4B or 6B.
14. The modified nucleic acid according to any of claims 10 to 13 for use in a host cell expression system wherein the fluorescent protein has an A and T nucleotide content equal to or lower than the host cell average A and T nucleotide content.
15. The modified nucleic acid according to claim 14 wherein the host cell is a bacterium, preferably Escherichia coli or Salmonella, or a yeast.
16. A fluorescent protein encoded by the modified nucleic acid of any of claims 11 to 16.
17. A host cell expression system comprising the modified nucleic acid sequence or fluorescent protein according to any of claims 10 to 16, preferably for use in a host cell such as Escherichia coli or Salmonella.
18. A plasmid vector comprising the modified nucleic acid sequence or fluorescent protein according to any of claims 10 to 16, preferably for use in a host cell such as Escherichia coli.
19. A host cell comprising the modified nucleic acid sequence, fluorescent protein, plasmid vector, expression system according to any of claims 11 to 18.
20. A host cell according to claim 19 wherein the A and T nucleotide content of the modified fluorescent protein nucleic acid is equal to or lower than the A and T nucleotide content of the host cell nucleic acid.
21. The method, modified nucleic acid, fluorescent protein, expression system, plasmid vector or host cell according to any of claims 10 to 20 wherein the fluorescent protein is a green fluorescent protein (GFP), yellow fluorescent protein (YFP), cyan fluorescent protein (CFP), blue fluorescent protein (BFP) or red fluorescent protein (DsRed) or a mutant thereof.
22. The method, modified nucleic acid, fluorescent protein, expression system, plasmid vector or host cell according to any of claims 10 to 20 wherein the fluorescent protein is a green fluorescent protein mutant selected from the following: a spectral variant, a pHluorin, a variant with an altered Stokes shift, an oligomerization variant, a folding variant, a photoactivatable variant, a photoconversion variant, a photoswitchable variant, a redox sensitive variant and/or gfp+.
23. The modified nucleic acid, fluorescent protein, expression system, plasmid vector or host cell according to any of claims 10 to 20 wherein the host cell transcriptional repressor protein is a nucleoid-associated repressor protein, including H-NS.
24. A method of monitoring gene expression in a host cell comprising the use of the modified nucleic acid, fluorescent protein, expression system or host cell according to any of claims 10 to 23.
PCT/EP2010/057856 2009-06-05 2010-06-04 A method for improving gene expression WO2010139797A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0909689A GB0909689D0 (en) 2009-06-05 2009-06-05 Fluorescent proteins
GB0909689.2 2009-06-05

Publications (1)

Publication Number Publication Date
WO2010139797A1 true WO2010139797A1 (en) 2010-12-09

Family

ID=40936955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2010/057856 WO2010139797A1 (en) 2009-06-05 2010-06-04 A method for improving gene expression

Country Status (2)

Country Link
GB (1) GB0909689D0 (en)
WO (1) WO2010139797A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111373033A (en) * 2017-11-23 2020-07-03 豪夫迈·罗氏有限公司 Proline hydroxylase and related uses, methods and products

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050071890A1 (en) * 1997-10-20 2005-03-31 Chen Li How Novel modified nucleic acid sequences and methods for increasing mRNA levels and protein expression in cell systems
WO2005118874A1 (en) * 2004-06-04 2005-12-15 Wyeth Enhancing protein expression

Non-Patent Citations (45)

* Cited by examiner, † Cited by third party
Title
BAIRD ET AL., PROC. NATL. ACAD. SCI. USA, vol. 97, 2000, pages 11984 - 11989
BARTILSON ET AL., MOL. MICROBIOL., vol. 39, 2001, pages 126 - 135
BOUFFARTIGUES EMELINE ET AL: "H-NS cooperative binding to high-affinity sites in a regulatory element results in transcriptional silencing", NATURE STRUCTURAL & MOLECULAR BIOLOGY, vol. 14, no. 5, May 2007 (2007-05-01), pages 441 - 448, XP002595588, ISSN: 1545-9985 *
BOUFFARTIGUES ET AL., NAT. STRUCT. MOL. BIOL., vol. 14, 2007, pages 441 - 8
CHARLES J DORMAN: "H-NS, the genome sentinel", NATURE REVIEWS. MICROBIOLOGY, NATURE PUBLISHING GROUP, GB LNKD- DOI:10.1038/NRMICRO1598, vol. 5, 1 February 2007 (2007-02-01), pages 157 - 161, XP009137170, ISSN: 1740-1526, [retrieved on 20061227] *
CLARK ET AL., MICROBIOLOGY, vol. 155, 2009, pages 461 - 467
COLIN P CORCORAN ET AL: "H-NS silences gfp, the Green Fluorescent Protein gene: gfpTCD is a genetically remastered gfp gene with reduced susceptibility to H-NS-mediated transcription silencing and with enhanced translation", JOURNAL OF BACTERIOLOGY, vol. E-article, 16 July 2010 (2010-07-16), pages 1 - 14, XP009137094, ISSN: 0021-9193, [retrieved on 20100716] *
CRAMERI A ET AL: "IMPROVED GREEN FLUORESCENT PROTEIN BY MOLECULAR EVOLUTION USING DNA SHUFFLING", NATURE BIOTECHNOLOGY, NATURE PUBLISHING GROUP, NEW YORK, NY, US LNKD- DOI:10.1038/NBT0396-315, vol. 14, 14 March 1996 (1996-03-14), pages 315 - 319, XP000791095, ISSN: 1087-0156 *
CRAMERI ET AL., NAT BIOTECHNOL, vol. 14, 1996, pages 315 - 9
DATSENKO; WANNER, PROC. NATL. ACAD. SCI. USA, vol. 97, 2000, pages 6640 - 6645
DENG TILIANG: "Bacterial expression and purification of biologically active mouse c-Fos proteins by selective codon optimization", FEBS LETTERS, ELSEVIER, AMSTERDAM, NL LNKD- DOI:10.1016/S0014-5793(97)00522-X, vol. 409, no. 2, 1 January 1997 (1997-01-01), pages 269 - 272, XP002417744, ISSN: 0014-5793 *
DERSCH ET AL., MOL. MICROBIOL., vol. 8, 1993, pages 875 - 889
DILLON ET AL., MOL. MICROBIOL., 2010
DORMAN, NAT. REV. MICROBIOL, vol. 5, 2007, pages 157 - 161
DOYLE ET AL., SCIENCE, vol. 315, 2007, pages 251 - 252
DOYLE MARIE ET AL: "An H-NS-like stealth protein aids horizontal DNA transmission in bacteria", SCIENCE (WASHINGTON D C), vol. 315, no. 5809, January 2007 (2007-01-01), pages 251 - 252, XP002595563, ISSN: 0036-8075 *
FORSBERG ET AL., J. BACTERIOL, vol. 176, 1994, pages 2128 - 32
HAUTEFORT ET AL., APP. ENV. MICROBIOL., vol. 69, 2003, pages 7480 - 7491
LEE ET AL., APPL. ENVIRON. MICROBIOL., vol. 71, 2005, pages 6856 - 6862
LI A ET AL: "Optimized gene synthesis and high expression of human interleukin-18", PROTEIN EXPRESSION AND PURIFICATION, ACADEMIC PRESS, SAN DIEGO, CA LNKD- DOI:10.1016/J.PEP.2003.08.003, vol. 32, no. 1, 1 November 2003 (2003-11-01), pages 110 - 118, XP004469392, ISSN: 1046-5928 *
LIU TI ET AL: "Improved heterologous gene expression in Trichoderma reesei by cellobiohydrolase I gene (cbh1) promoter optimization.", ACTA BIOCHIMICA ET BIOPHYSICA SINICA FEB 2008 LNKD- PUBMED:18235978, vol. 40, no. 2, February 2008 (2008-02-01), pages 158 - 165, XP002595560, ISSN: 1745-7270 *
LUCHT ET AL., J. BIOL. CHEM., vol. 269, 1994, pages 6578 - 6586
LUIJSTERBURG ET AL., CRIT. REV. BIOCHEM. MOL. BIOL., vol. 43, 2008, pages 393 - 418
MAKOFF A J: "EXPRESSION OF TETANUS TOXIN FRAGMENT C IN E.COLI: HIGH LEVEL EXPRESSION BY REMOVING RARE CODONS", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 17, no. 24, 25 December 1989 (1989-12-25), pages 10191 - 10202, XP000083955, ISSN: 0305-1048 *
MATZ ET AL., NAT. BIOTECH., vol. 17, 1999, pages 969 - 973
NAKAMURA ET AL., NUCL. ACIDS RES., vol. 28, 2000, pages 292
OWEN-HUGHES ET AL., CELL, vol. 71, 1992, pages 255 - 65
PALMITER ET AL., SCIENCE, vol. 222, 1983, pages 809 - 814
PON CYNTHIA L ET AL: "Repression of transcription by curved DNA and nucleoid protein H-NS: A mode of bacterial gene regulation", 2005, LANDES BIOSCIENCE, 810 S CHURCH ST, GEORGETOWN, TX 78626 USA SERIES : MOLECULAR BIOLOGY INTELLIGENCE, pages 1 - 8, XP002595561, Retrieved from the Internet <URL:http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=eurekah&part=A31864> [retrieved on 20100806] *
PUIGBO ET AL., BIOL DIRECT., vol. 3, 2008, pages 38
SCHOLZ ET AL., EUR. J. BIOCHEM., vol. 267, 2000, pages 1565 - 70
SCHOLZ O ET AL: "QUANTITATIVE ANALYSIS OF GENE EXPRESSION WITH AN IMPROVED GREEN FLUORESCENT PROTEIN", EUROPEAN JOURNAL OF BIOCHEMISTRY, BLACKWELL PUBLISHING, BERLIN, DE LNKD- DOI:10.1046/J.1432-1327.2000.01170.X, vol. 267, 1 March 2000 (2000-03-01), pages 1565 - 1570, XP000989981, ISSN: 0014-2956 *
SHARP ET AL., NUCL. ACIDS RES., vol. 15, 1987, pages 1281 - 1295
STOEBEL DANIEL M ET AL: "Anti-silencing: overcoming H-NS-mediated repression of transcription in Gram-negative enteric bacteria", MICROBIOLOGY (READING), vol. 154, no. Part 9, September 2008 (2008-09-01), pages 2533 - 2545, XP002595562, ISSN: 1350-0872 *
SUBRAMANI ET AL., MOL. CELL BIOL, vol. 1, 1981, pages 854 - 864
TIMCHENKO ET AL., EMBO J., vol. 15, 1996, pages 3986 - 3992
TSIEN, ANNU REV BIOCHEM, vol. 67, 1998, pages 509 - 44
TUPPER ET AL., EMBO J., vol. 13, 1994, pages 258 - 268
VILLALOBOS ET AL., BMC BIOINFORMATICS, vol. 7, 2006, pages 285
VLAHOVICEK ET AL., NUCLEIC ACIDS. RES., vol. 31, 2003, pages 3686 - 7
WILLIAMS D P ET AL: "DESIGN, SYNTHESIS AND EXPRESSION OF A HUMAN INTERLEUKIN-2 GENE INCORPORATING THE CODON USAGE BIAS FOUND IN HIGHLY EXPRESSED ESCHERICHIA COLI GENES", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 16, no. 22, 25 November 1988 (1988-11-25), pages 10453 - 10467, XP000007466, ISSN: 0305-1048 *
WRIGHT A ET AL: "Diverse plasmid DNA vectors by directed molecular evolution of cytomegalovirus promoters", HUMAN GENE THERAPY, MARY ANN LIEBERT, NEW YORK ,NY, US LNKD- DOI:10.1089/HUM.2005.16.881, vol. 16, no. 7, 1 July 2005 (2005-07-01), pages 881 - 892, XP002445285, ISSN: 1043-0342 *
ZASLAVER ET AL., NAT. METH., vol. 3, 2006, pages 623 - 628
ZOLOTUKHIN S ET AL: "A HUMANIZED GREEN FLUORESCENT PROTEIN CDNA ADAPTED FOR HIGH-LEVEL EXPRESSION IN MAMMALIAN CELLS", JOURNAL OF VIROLOGY, THE AMERICAN SOCIETY FOR MICROBIOLOGY, US, vol. 70, no. 7, 1 July 1996 (1996-07-01), pages 4646 - 4654, XP002030427, ISSN: 0022-538X *

Also Published As

Publication number Publication date
GB0909689D0 (en) 2009-07-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10724070

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10724070

Country of ref document: EP

Kind code of ref document: A1