WO1989000604A1 - Method for improving translation efficiency - Google Patents

Info

Publication number
WO1989000604A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
dna
stem
secondary structure
free energy
Prior art date
Application number
PCT/US1988/002341
Other languages
French (fr)
Inventor
Nancy Lee
Douglas Testa
Original Assignee
Interferon Sciences, Inc.
Priority date
Filing date
Publication date
Application filed by Interferon Sciences, Inc.
Publication of WO1989000604A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67: General methods for enhancing the expression
    • C12N15/70: Vectors or expression systems specially adapted for E. coli
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/52: Cytokines; Lymphokines; Interferons
    • C07K14/555: Interferons [IFN]
    • C07K14/56: IFN-alpha

Definitions

  • Figure 8 shows the construction of pNL016 and pNL017. Abbreviations as in Figure 1. Oligonucleotides were synthesized using an Applied Biosystems DNA Synthesizer and purified by gel elution. Following phosphorylation by polynucleotide kinase, appropriate pairs of oligonucleotides were mixed and annealed at
  • the present invention relates to the discovery that the sequestering of one or both of the AUG initiation codon and the Shine-Dalgarno sequence in a double stranded portion of even a weakly bound (relatively unstable) mRNA secondary structure, i.e., a secondary structure having a calculated free energy more positive than -10.0 kcal/mole, can have significant effects on protein production.
  • a weakly bound (relatively unstable) mRNA secondary structure i.e., a secondary structure having a calculated free energy more positive than -10.0 kcal/mole
  • the first step of the process of the invention involves determining if all or part of either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure for the mRNA (i.e., the secondary structure predicted from pairing of the bases of the mRNA's primary structure as opposed to an experimentally observed secondary structure).
  • This determination is most conveniently performed by conducting computer analyses on the base sequence of the mRNA to identify one or, in some cases, a group of possible secondary structures for the mRNA.
  • a most probable secondary structure can normally be selected following techniques known in the art. See Akiyoshi Wada and
  • secondary structures can be determined visually, i.e., by examining the primary structure and manually aligning A-U and C-G pairs, and the free energies of such structures can be manually calculated using, for example, the Tinoco techniques, supra.
  • parameters which can be considered in selecting most probable secondary structures include percent base match, probability of finding as good a match in a random sequence of bases of the same length, and secondary structure free energy.
  • free energy it is important not to dismiss secondary structures having calculated free energies more positive than -10.0 kcal/mole as prior workers have done (see above), since, as shown below, such weak secondary structures can significantly affect protein production.
  • a suitable computer program for determining mRNA secondary structures is the SEQ - DNA Sequence Analysis
  • SEQ program California (hereinafter the "SEQ program”).
  • SEQ program uses the Tinoco et al. techniques, supra, to calculate free energies. See also Zuker and Sankoff, Bull. of Math. Biol., 46:591-621 (1984).
  • the predicted secondary structure or structures are analyzed to determine: 1) if all or part of either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the secondary structure or structures; and 2) if the free energy of such a stem-loop region is between zero and about -10.0 kcal/mole.
  • the free energy can be conveniently calculated using a computer program such as the SEQ program.
  • the mRNA sequence is then analyzed to identify
  • AUG initiation codon and the Shine-Dalgarno sequence are not contained in a double stranded portion of the predicted secondary structure.
  • the sequence is analyzed for modifications which will place the AUG initiation codon and/or the
  • Shine-Dalgarno sequence in a double stranded portion of a predicted secondary structure which is even less stable than the predicted secondary structure of the original mRNA sequence, i.e., in a secondary structure whose calculated free energy is more positive than the calculated free energy of the original mRNA sequence.
  • Various constraints must be kept in mind in considering possible modifications to the mRNA sequence. For example, if the protein which is to be produced is to remain unchanged, modifications which are to be made to the portion of the mRNA sequence which codes for the protein are limited to those which degenerately code for the same amino acids. See, for example, Nussinov, R.
  • RNA Folding Is Unaffected by the Nonrandom Degenerate Codon Choice Biochimica et Biophysica Acta, 698:111-115 (1982).
  • other constraints come into play. For example, it is in general preferred to avoid changes in the Shine-Dalgarno sequence and the spacing between the AUG initiation codon and the Shine-Dalgarno sequence that may adversely affect the translation initiation process. Also, as reported by De Boer et al.
  • a modified mRNA sequence has been selected, its predicted secondary structure is determined following the same procedures as those used for the original mRNA sequence. Again, the structure or, in some cases, structures are examined to determine the locations of the AUG initiation codon and the Shine-Dalgarno sequence, and, if necessary, the free energy of the region of the secondary structure which contains these elements is calculated. If necessary, different or further modifications of the original mRNA sequence are then analyzed until a modified mRNA sequence is selected which achieves the goal of minimizing the likelihood that secondary structure will interfere with the functioning of the Shine-Dalgarno sequence and the AUG initiation codon. Production of the modified mRNA sequence is achieved by altering the DNA or RNA sequence which codes for the original mRNA. Various techniques can be used to produce the modified DNA or RNA sequence. For example, site-specific mutagenesis can be used to achieve the modifications. See, for example, Messing,
  • LacY Trp was used as the transformation host.
  • pIN series vectors, in particular pIN-I-Ap and pIN-I-A 3, were used as the cloning vehicles. See Nakamura, K., and Inouye, M., "Construction of versatile expression cloning vehicles using the lipoprotein gene of Escherichia coli," The EMBO Journal, 1:771-775 (1982). These vectors use the promoter and the 5' untranslated region of the E. coli outer membrane lipoprotein gene for transcriptional and translational initiation of the cloned gene.
  • Plasmid pCGS282, which was obtained from Collaborative Research, Inc., Lexington, Massachusetts, was used as a source of a leukocyte interferon α1 gene (see Figure 1).
  • This plasmid is a hybrid plasmid in which a mature human interferon α1 gene has been inserted between the S. cerevisiae galactose promoter (GAL-P) and the S. cerevisiae invertase transcription terminator (SUC).
  • GAL-P S. cerevisiae galactose promoter
  • SUC S. cerevisiae invertase transcription terminator
  • Tanaka and Weisblum. Tanaka, T. and Weisblum, B., "Construction of a colicin E1-R factor composite plasmid in vitro: Means of amplification of deoxyribonucleic acid," J. Bacteriol., 121:345-362 (1975).
  • SDS - PAGE Sodium dodecylsulfate-polyacrylamide gel electrophoresis
  • the sheet was washed with TBS (20 mM Tris-HCl, pH 7.5; 500 mM NaCl) and blocked with TBS containing 3% BSA at room temperature for 30 min.
  • the blocked sheet was incubated overnight in T-TBS (TBS + 0.05% Tween 20) at 4°C containing a 100-fold dilution of rabbit polyclonal antibody against alpha interferon which was obtained from Interferon Sciences, Inc., New Brunswick, New Jersey.
  • the sheet was washed three times with T-TBS, incubated in a 3000-fold dilution of BioRad peroxidase conjugated goat-anti-rabbit IgG at room temperature for 2 hours, and washed three times with TBS.
  • The protein bands were visualized by developing the sheet in a freshly prepared solution of 0.05% 4-chloro-1-naphthol/0.00015% H2O2 at room temperature for 30 min. The developed sheet was washed four to five times with distilled water to stop the reaction and then air dried.
  • Example 1: Plasmid pNL015. This example relates to the construction of a plasmid (pNL015) which produces a mRNA sequence in which the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the sequence's predicted secondary structure.
  • the calculated free-energy of the stem-loop region is -3.9 kcal/mole, i.e., the calculated free-energy is in the range of free energies which prior workers in the art thought could not significantly affect protein production.
  • pNL015 was constructed by first constructing pNL014 by ligating a HindIII - SalI DNA fragment containing the IFN α1 gene from pCGS282 to the large HindIII - SalI fragment of pIN-I-A ⁇ (see Figure 1).
  • the promoter, the 5' untranslated region of the lipoprotein gene, a sequence coding for the first two amino acid residues of the prolipoprotein, and a linker sequence coding for seven amino acid residues are situated 5' to the coding region of the methionyl IFN α1 gene.
  • the IFN α1 3' untranslated region is followed by an invertase transcription terminator. Accordingly, there are two transcriptional termination sequences, both eukaryotic in nature, following the α1 interferon coding sequence.
  • the biological activity of interferon isolated from E. coli cells JA221 harboring pNL014 was measured using Vesicular Stomatitis virus (Indiana Strain) on HEp-2 cells in a cytopathic effect assay. See Lee, N., Cozzitorto, J., Wainwright, N. and Testa, D., "Cloning with tandem gene systems for high level gene expression," Nucleic Acids Res., 12 (1984) 6797-6812. The quantity of IFN isolated from these cells was in the range of 1.2 x 10 units/ml/OD.
  • pNL015 was constructed from pNL014 and pIN-I-A ⁇ following the procedures shown in Figure 2.
  • Various combinations of the Percentmatch, MaxLoop, MinLoop, and AfterMismatch parameters used in the SEQ program were found to predict the same secondary structure as the default parameters.
  • the predicted secondary structure in the region of the AUG initiation codon and the Shine-Dalgarno sequence (AGAGGGU) obtained by this analysis is shown in Figure 4.
  • the secondary structure shown has a free energy of -3.9 kcal/mole, a percentage match of 75%, and an "E" factor, i.e., a probability of finding as good a match in a random sequence of bases of the same length, of 2.387.
  • the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure.
  • the stem-loop region has a calculated free energy in the range of 0 to -10.0 kcal/mole, i.e., a calculated free energy of -3.9 kcal/mole.
  • This example illustrates the modification of pNL015 to produce a plasmid (pNL008) which produces a mRNA having a predicted secondary structure in which neither the AUG initiation codon nor the Shine-Dalgarno sequence is included in a double stranded portion of a stem-loop region of the secondary structure.
  • pNL015 was linearized with
  • pNL008 differs from pNL015 in that the mRNA transcript produced by pNL008 has 11 bases deleted starting from base 17 downstream of the AUG initiation codon, and a 2 base (A-A) insertion between bases 9 and
  • Figure 4 shows a predicted secondary structure for pNL008 where the deleted and inserted sequences in pNL015 and pNL008 are indicated by the light bars.
  • the structure for pNL008 shown in Figure 4 was constructed manually from the primary structure for the plasmid's first 74 bases. Using the Tinoco technique, the free energy for this structure was calculated manually and found to be -3.2 kcal/mole.
  • Computer analysis using the SEQ program of the same primary sequence predicted a less stable structure having an energy of -1.7 kcal/mole in which the SD sequence was partially contained in a double stranded portion of a stem-loop region.
  • The in vivo stability of the interferon protein was examined in a pulse-chase experiment.
  • the interferon proteins encoded by pNL015 and pNL008 showed no noticeable degradation within 60 minutes (see Figure 5). It was therefore concluded that the differential rate of degradation by proteolytic enzymes was not a contributing factor to the observed difference in IFN expression.
  • the stability of the IFN mRNAs was studied by measuring levels of labeled IFN at various time intervals after the addition of rifampicin, an inhibitor of RNA synthesis, to cultures of JA221/pNL015 and JA221/pNL008. As can be seen in Figure 7, functional half-lives of the transcripts produced by both pNL015 and pNL008 are approximately 4 to 5 minutes.
  • plasmids pNL016 and pNL017 were constructed to further confirm that the source of the difference in expression between plasmids pNL008 and pNL015 was mRNA secondary structure.
  • These plasmids were formed by synthesizing two EcoRI - ClaI DNA fragments (fragments A and B) to replace the corresponding sequence in pNL015 (bases 50 to 63 in Figure 4).
  • DNA fragment B contained a single base substitution whereby base 60 of the pNL015 transcript was changed from U to C.
  • DNA fragment A contained an additional base substitution whereby both base 60 (U) and base 62 (A) were changed to C.

Abstract

A method for increasing the translation efficiency of a mRNA sequence is provided. In certain of its preferred embodiments, the method comprises the steps of: (a) constructing a predicted secondary structure for the mRNA; (b) analyzing the predicted secondary structure to determine if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure; (c) calculating a free energy value for the stem-loop region; and (d) if the calculated free energy value is in the range of from zero to about -10.0 kcal/mole, modifying the DNA sequence for the mRNA so that when the modified sequence is transcribed it produces a modified mRNA sequence which has a predicted secondary structure wherein the AUG initiation codon and the Shine-Dalgarno sequence are not included in a double stranded portion of a stem-loop region of the predicted secondary structure. By means of this method, ten-fold increases in protein production have been achieved.

Description

METHOD FOR IMPROVING TRANSLATION EFFICIENCY
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a method for increasing the production of proteins by biological cells and, in particular, to a method for improving the efficiency with which messenger RNA (mRNA) is translated.
2. Description of the Prior Art
Extensive research has been performed to attempt to understand why some messenger RNAs are translated more efficiently than others. For example, studies using E. coli have been performed in which nucleotide sequences of varying lengths have been placed between the AUG initiation codon and the Shine-Dalgarno (SD) sequence. See Shepard et al., "Increased Synthesis in E. coli of Fibroblast and Leukocyte Interferons Through Alterations in Ribosome Binding Sites," DNA, 1:125-131 (1982). These studies identified an optimal spacing of 9 nucleotides. (As known in the art, the Shine-Dalgarno sequence comprises 3-12 nucleotides which are found in prokaryotic mRNAs and which are complementary to the 3' end of 16S rRNA.)
Similarly, the effects of base composition in the region between the SD sequence and the initiation codon have been studied. See De Boer et al., "Portable Shine-Dalgarno Regions: A System for a Systematic Study of Defined Alterations of Nucleotide Sequences within E. coli Ribosome Binding Sites," DNA, 2:231-235 (1983). In this case, it was found that the presence of A or T residues resulted in higher levels of mRNA translation, while C or G residues led to lower levels of translation.
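The two sequence features discussed above, the spacing between the SD sequence and the AUG initiation codon and the base composition of the intervening region, can be measured with a short Python sketch. The function name, the toy sequence, and the SD motif used here are hypothetical, and the spacing is assumed to be counted from the base after the SD motif up to (but not including) the A of the AUG; the cited studies may count it differently.

```python
def spacer_stats(mrna: str, sd_motif: str, start_codon: str = "AUG"):
    """Return (spacer length, fraction of A/U bases) for the region
    between the end of the SD motif and the first downstream start codon."""
    sd_start = mrna.find(sd_motif)
    if sd_start < 0:
        raise ValueError("SD motif not found")
    sd_end = sd_start + len(sd_motif)
    aug_start = mrna.find(start_codon, sd_end)
    if aug_start < 0:
        raise ValueError("no start codon downstream of the SD motif")
    spacer = mrna[sd_end:aug_start]
    # An empty spacer has no composition; report 0.0 by convention.
    au_fraction = sum(base in "AU" for base in spacer) / len(spacer) if spacer else 0.0
    return len(spacer), au_fraction


# Toy transcript assembled from labeled parts (all hypothetical).
toy_mrna = "GGCU" + "AGGAGG" + "UAAUAAUUU" + "AUG" + "AAACGC"
length, au = spacer_stats(toy_mrna, "AGGAGG")
print(length, au)
```

A spacer that is rich in A and U residues, as in this toy example, corresponds to the composition that De Boer et al. associated with higher translation levels.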
In addition to the foregoing, various studies have been performed to determine the effects of mRNA secondary structure on translation efficiency. See, for example, Boyen et al., "Enhancement of Translation Efficiency in Escherichia coli by Mutations in a Proximal Domain of Messenger RNA," J. Mol. Biol., 162:715-720 (1982); Coleman et al., "Mutations Upstream of the Ribosome-binding Site Affect Translational Efficiency," J. Mol. Biol., 181:139-143 (1985); Hall et al., "A Role for mRNA Secondary Structure in the Control of Translation Initiation," Nature, 295:616-618 (1982); Iserentant et al., "Secondary Structure of mRNA and Efficiency of Translation Initiation," Gene, 9:1-12 (1980); Kastelein et al., "Effect of the sequences upstream from the ribosome-binding site on the yield of protein from the cloned gene for phage MS2 coat protein," Gene, 23:245-254 (1983); Queen et al., "Differential Translation Efficiency Explains Discoordinate Expression of the Galactose Operon," Cell, 25:241-249 (1981); Schoner et al., "Role of mRNA translational efficiency in bovine growth hormone expression in E. coli," Proc. Natl. Acad. Sci., USA, 81:5403-5407 (1984); Shepard et al., supra; and Steege, D., "5'-Terminal nucleotide sequence of Escherichia coli lactose repressor mRNA: Features of translational initiation and reinitiation sites," Proc. Natl. Acad. Sci., USA, 74:4163-4167 (1977).
In these studies, A-U and G-C base pairing has been used to construct mRNA secondary structures from mRNA primary structures. The secondary structures have then been examined to determine if either or both of the translation initiation determinants, e.g., the SD sequence and the initiation codon, appear in a double stranded portion of the folded mRNA molecule, as opposed to a single stranded portion. In addition, the thermodynamic stabilities of the secondary structures have been calculated using the methods of Tinoco et al. See Tinoco et al., "Improved Estimation of Secondary Structure in Ribonucleic Acids," Nature New Biology, 246:40-41 (1973); Tinoco et al., "Estimation of Secondary Structure in Ribonucleic Acids", Nature, 230:362-367 (1971); and Borer et al., "Stability of Ribonucleic Acid Double-Stranded Helices," J. Mol. Biol., 86:843-853 (1974). See also Salser, W., Cold Spring Harbor Symp. Quant. Biol., 42:985-1002 (1977). Lower levels of protein production have generally been found to correlate with the sequestering of translation initiation determinants in double stranded regions of thermodynamically stable secondary structures, i.e., secondary structures having a calculated free energy (ΔG) more negative than about -10.0 kcal/mole, i.e., free energies more negative than the free energy of hydrolysis of ATP (ΔG = -7.3 kcal/mole).
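The structural check described in these studies, whether a translation initiation determinant lies in a double stranded portion of the folded mRNA, can be sketched in Python as follows. The sketch assumes that a predicted structure is already available in dot-bracket notation (a later convention that does not appear in this specification); the function names and the toy structure are hypothetical, and no free energy is computed here.

```python
def paired_positions(dot_bracket: str) -> set:
    """Return the 0-based indices of all bases that are paired
    in a dot-bracket structure string such as '((((....))))'."""
    stack = []
    paired = set()
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            paired.add(stack.pop())
            paired.add(i)
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return paired


def sequestered_fraction(dot_bracket: str, start: int, end: int) -> float:
    """Fraction of the bases in [start, end) that are base paired."""
    if not 0 <= start < end <= len(dot_bracket):
        raise ValueError("element lies outside the structure")
    paired = paired_positions(dot_bracket)
    return sum(i in paired for i in range(start, end)) / (end - start)


# Toy 16-base structure: a 4-base-pair stem closing a 4-base loop,
# followed by a 4-base single stranded tail.
toy_structure = "((((....))))...."
print(sequestered_fraction(toy_structure, 0, 4))    # element inside the stem
print(sequestered_fraction(toy_structure, 12, 16))  # element in the tail
```

A fraction between 0 and 1 corresponds to an element that is only partly contained in a double stranded portion, a case the specification treats as relevant.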
Significantly, with regard to the present invention, the sequestering of translation initiation determinants in double stranded regions of secondary structures having calculated free energies more positive than -10.0 kcal/mole has not been considered important with regard to changes in translation efficiency. For example, in Hall et al., supra, a SD sequence sequestered in a secondary structure having a free energy of -2.0 kcal was considered accessible for ribosome binding due to the instability of the secondary structure. Similarly, in Iserentant et al., supra, an SD sequence was considered largely single stranded where release of the sequence from a double strand configuration only required 3.2 kcal. Along these same lines, in De Boer et al., supra, it was stated that the observed differences in translation efficiency which these workers had found could not be explained on the basis of secondary structure because significantly stable stem-loops having ΔG values less than -10 kcal could not be found. Again, in Coleman et al., supra, a secondary structure having a free energy of -3.05 kcal was considered unlikely to significantly affect ribosome binding to mRNA.
In contrast to these views of prior workers, as discussed and demonstrated in detail below, in accordance with the present invention it has been surprisingly found that translation efficiency can be significantly affected, e.g., by a factor of 10, by the sequestering of translation initiation determinants in secondary structures having calculated free energies more positive than -10.0 kcal/mole, e.g., on the order of -3.9 kcal/mole. As shown below, based on this discovery, even greater increases in protein production can be achieved than would have been achieved based on the prior understanding of the effects of mRNA secondary structure on translation efficiency.
SUMMARY OF THE INVENTION
In view of the foregoing state of the art, it is an object of this invention to increase the production of protein by biological cells. More particularly, it is an object of the invention to increase the efficiency with which mRNA is translated to produce protein.
To achieve the foregoing and other objects, the invention provides a method for increasing the translation efficiency of a mRNA sequence which is produced from a DNA or RNA sequence comprising the steps of:
(a) determining and analyzing a predicted secondary structure for the mRNA to determine if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure;
(b) calculating a free energy value for the stem-loop region; and
(c) modifying the DNA or RNA sequence for the mRNA, if the calculated free energy value is in the range of from zero to about -10.0 kcal/mole, so that the modified DNA or RNA sequence produces a modified mRNA sequence which has a predicted secondary structure wherein either:
(i) the AUG initiation codon and the Shine-Dalgarno sequence are not included in a double stranded portion of a stem-loop region of the predicted secondary structure; or
(ii) if either or both of the AUG initiation codon and the Shine-Dalgarno sequence are included in a double stranded portion of a stem-loop region of the predicted secondary structure, the calculated free energy value for such stem-loop region is more positive than the free energy value calculated in step (b).
As described in detail below, by means of the foregoing procedure, ten-fold increases in protein production have been achieved.
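The decision logic of steps (a) to (c) can be summarized in a minimal Python sketch. It assumes the structure prediction and free energy calculation have already been done by some other tool; the class and function names are hypothetical, the "about -10.0 kcal/mole" limit is treated as a hard threshold for simplicity, and the AUG flags in the usage lines are illustrative assumptions rather than values stated in the specification.

```python
from dataclasses import dataclass

WEAK_STRUCTURE_LIMIT = -10.0  # kcal/mole; "about -10.0" taken as a hard cutoff here


@dataclass
class Prediction:
    """Summary of a predicted secondary structure near the initiation site."""
    sd_paired: bool   # SD sequence wholly or partly in a double stranded stem
    aug_paired: bool  # AUG codon wholly or partly in a double stranded stem
    delta_g: float    # calculated free energy of that stem-loop, kcal/mole


def should_modify(original: Prediction) -> bool:
    """Steps (a)-(c): modify when a determinant is sequestered in a
    stem-loop whose free energy lies between zero and the weak limit."""
    sequestered = original.sd_paired or original.aug_paired
    return sequestered and WEAK_STRUCTURE_LIMIT <= original.delta_g <= 0.0


def modification_acceptable(original: Prediction, modified: Prediction) -> bool:
    """Outcome (i): neither determinant is paired; or outcome (ii): a
    determinant is still paired but the stem-loop is less stable."""
    if not (modified.sd_paired or modified.aug_paired):
        return True
    return modified.delta_g > original.delta_g


# Free energies taken from the Figure 4 description; paired flags are assumed.
pnl015 = Prediction(sd_paired=True, aug_paired=False, delta_g=-3.9)
pnl008 = Prediction(sd_paired=False, aug_paired=False, delta_g=-3.2)
print(should_modify(pnl015), modification_acceptable(pnl015, pnl008))
```

Keeping the two checks separate mirrors the claim structure: one function decides whether a sequence qualifies for modification, the other judges a candidate replacement.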
In certain preferred embodiments, the free energy value calculated in step (b) is in the range of from zero to about -7.0 kcal/mole, i.e., the calculated free energy is more positive than the free energy of hydrolysis of ATP. This range is particularly likely to have been ignored by prior art workers since it represents structures whose energy is less than the energy of hydrolysis of only one ATP.
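To give a rough sense of scale for free energies in this range, the following Python sketch converts a calculated free energy into an equilibrium folded fraction. It assumes a simple two-state (folded versus unfolded) equilibrium at 37°C, which is an illustrative simplification and not a model stated in this specification; the function name and the default temperature are likewise assumptions, and calculated free energies may refer to a different reference temperature.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mole*K)


def folded_fraction(delta_g: float, temp_celsius: float = 37.0) -> float:
    """Equilibrium fraction of molecules in the folded state for a
    two-state model, given the folding free energy in kcal/mole."""
    temp_kelvin = temp_celsius + 273.15
    # Equilibrium constant K = [folded]/[unfolded] = exp(-dG / RT)
    equilibrium_constant = math.exp(-delta_g / (GAS_CONSTANT * temp_kelvin))
    return equilibrium_constant / (1.0 + equilibrium_constant)


for dg in (0.0, -3.9, -7.3, -10.0):
    print(f"{dg:6.1f} kcal/mole -> folded fraction {folded_fraction(dg):.4f}")
```

Under this simplified model, a stem-loop of -3.9 kcal/mole would still be folded more than 99% of the time at equilibrium, which is consistent with the idea that such "weak" structures need not be negligible.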
The accompanying drawings, which are incorporated in and constitute part of the specification, illustrate the application of the invention to the production of human α1 interferon, and together with the description, serve to explain the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the construction of hybrid plasmid pNL014. Abbreviations used: lpp p = E. coli lipoprotein promoter, lpp t = E. coli lipoprotein transcription terminator, A = Ampicillin resistance gene, Gal P = Yeast galactose promoter, α1 = mature interferon α1 gene, Met-α1 = methionyl-α1, ol = Interferon α1 transcription terminator, SUC = Yeast invertase transcription terminator, Xb = XbaI, E = EcoRI, H = HindIII, S = SalI, C = ClaI, 2μ = Yeast 2μ replication origin, URA3 = Yeast URA3 gene, and a.a. = amino acid.
Figure 2 shows the construction of hybrid plasmid pNL015. Abbreviations: as in Figure 1. pIN-I = pIN-I-A vector. Marked bases are those which do not appear in pNL008.
Figure 3 shows the construction of hybrid plasmid pNL008. Abbreviations: as in Figure 1. S1 = S1 nuclease. Marked bases are those which do not appear in pNL015. EcoRI linker (GAATTC) was obtained from New England Biolabs.
Figure 4 shows the predicted secondary structures for mRNA produced from plasmids pNL015 and pNL008. The sequences start with the first base (1) of the transcripts. Sequences under the broken lines are the Shine-Dalgarno region. The initiation codon is indicated by a heavy bar and the deleted or inserted sequences are marked by a light bar. The AUG of the methionyl interferon is marked by dots. Calculated free energies of the secondary structures are -3.9 kcal/mole and -3.2 kcal/mole for pNL015 and pNL008 transcripts, respectively. Arrows indicate base substitutions as occurred in pNL016 and pNL017.
Figure 5 shows the results of a pulse-chase analysis of the in vivo stability of interferon fusion protein produced by JA221/pNL008 (lanes 1-4) and JA221/pNL015 (lanes 5-8). After E. coli cells were incubated with [35S]-methionine for 40 minutes, a 40,000-fold excess of unlabeled methionine was added. At various chase times, samples were withdrawn and cell lysates were prepared for PAGE and immunostaining. Lane 0 contains molecular weight standards. Chase times were: lanes 1 and 5, 0 min.; lanes 2 and 6, 15 min.; lanes 3 and 7, 30 min.; lanes 4 and 8, 60 min. Alpha 1 indicates the position of IFN α1 fusion protein. The M.W. of IFN fusion protein from pNL015 was slightly higher than that from pNL008 due to 3 extra amino acid residues.
Figure 6 shows the results of RNA dot blot hybridization experiments for JA221/pNL015 and JA221/pNL008. Columns 1 and 2 represent serial dilution (1:5) of total cellular RNA (top row 40 μg) of JA221/pNL015 and JA221/pNL008, respectively. The total RNA was isolated following the procedure of Young and Furano and applied to nitrocellulose paper. See Young, F. S. and Furano, A. V., "Regulation of the synthesis of E. coli elongation factor Tu," Cell, 24 (1981) 695-706. Nick-translated EcoRI fragment of IFN α1 gene from pNL015 was used for hybridization according to Maniatis et al. See Maniatis, T., Fritsch, E. F., and Sambrook, J., Molecular Cloning, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, (1982).
Figure 7 shows the results of stability analyses of IFN α1 mRNA transcribed from pNL015 (solid triangles) and pNL008 (open triangles). Rifampicin (Calbiochem) was added (200 μg/ml) to five 15 ml cell cultures of JA221/pNL008 and JA221/pNL015, respectively. The cultures were incubated at 37°C for 5, 10, 15, and 20 min. After each incubation, 50 μCi of [35S]-methionine was added to each culture, and the mixtures were incubated for another 1.5 min. The control cultures (zero time) were labeled with [35S]-methionine in the absence of rifampicin. Cell extracts were prepared and analyzed by SDS-PAGE. The IFN α1 peak positions were identified by immunostaining. The peaks were cut out from the gel blot and the radioactivities of these fractions were measured in ACS scintillation fluid after being solubilized in 0.5 ml of a NCS solubilizer (Amersham Corp.) - water mixture (9:1) at 50°C for 2 hrs. The incorporation rates were calculated as percentage of incorporation at zero time.
Figure 8 shows the construction of pNL016 and pNL017. Abbreviations as in Figure 1. Oligonucleotides were synthesized by using an Applied Biosystems DNA Synthesizer and purified by gel elution. Following phosphorylation by polynucleotide kinase, appropriate pairs of oligonucleotides were mixed and annealed at 70°C in the presence of 66 mM Tris.HCl, pH 7.5 / 6 mM MgCl2 / 500 mM ATP in a volume of 270 μl. The mixtures were cooled to 15°C during one hour to form DNA fragment A or B. Arrows indicate positions of base substitution as compared to pNL015.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
As described above, the present invention relates to the discovery that the sequestering of one or both of the AUG initiation codon and the Shine-Dalgarno sequence in a double stranded portion of even a weakly bound (relatively unstable) mRNA secondary structure, i.e., a secondary structure having a calculated free energy more positive than -10.0 kcal/mole, can have significant effects on protein production. In particular, it has been found that the elimination of such secondary structures can result in over a five-fold increase, e.g., a ten-fold increase, in protein production.
The first step of the process of the invention involves determining if all or part of either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure for the mRNA (i.e., the secondary structure predicted from pairing of the bases of the mRNA's primary structure as opposed to an experimentally observed secondary structure).
This determination is most conveniently performed by conducting computer analyses on the base sequence of the mRNA to identify one or, in some cases, a group of possible secondary structures for the mRNA. When a group of structures is identified, a most probable secondary structure can normally be selected following techniques known in the art. See Akiyoshi Wada and Akira Suyama, "Local Stability of DNA and RNA Secondary Structure and its Relation to Biological Functions," Prog. Biophys. Molec. Biol., Vol. 47, pp 113-157 (1986); Nicholls, N., The Regulation of Bacillus Licheniformis Penicillinase: Locating and Sequencing the Repressor Gene, Ph.D. Thesis, Rutgers University, Dissertation Abstracts International, Molecular Biology, Vol. 47/06-B, 1986. In some cases, however, it may be advantageous to further analyze a group of possible secondary structures, as opposed to a single most probable structure. For example, depending on the details of the sequences involved, it may be possible to produce a modified mRNA which eliminates binding of the AUG initiation codon and/or the Shine-Dalgarno sequence for a whole family of secondary structures. As an alternative to using a computer program, secondary structures can be determined visually, i.e., by examining the primary structure and manually aligning A-U and C-G pairs, and the free energies of such structures can be manually calculated using, for example, the Tinoco techniques, supra.
Examples of parameters which can be considered in selecting most probable secondary structures include percent base match, probability of finding as good a match in a random sequence of bases of the same length, and secondary structure free energy. With regard to free energy, it is important not to dismiss secondary structures having calculated free energies more positive than -10.0 kcal/mole as prior workers have done (see above), since as shown below, such weak secondary structures can significantly affect protein production.
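As an illustration of the first of these parameters, percent base match for a candidate stem can be computed by aligning the 5' strand of the stem against the 3' strand read in reverse. The following is a minimal sketch only. It assumes that only A-U and C-G pairs count as matches (as in the manual alignment described above), and the function name and the example strands are hypothetical; it is not the calculation performed by any particular program.

```python
# Only A-U and C-G pairs are counted; G-U wobble pairs are deliberately ignored (assumption).
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def percent_match(five_prime: str, three_prime: str) -> float:
    """Percent of positions that pair when two stem strands are aligned antiparallel.

    Both strands are written 5'->3' and must have the same, non-zero length.
    """
    if not five_prime or len(five_prime) != len(three_prime):
        raise ValueError("stem strands must be non-empty and of equal length")
    # Reverse the 3' strand so that position i of one strand faces position i of the other.
    facing = three_prime.upper()[::-1]
    matched = sum((a, b) in PAIRS for a, b in zip(five_prime.upper(), facing))
    return 100.0 * matched / len(five_prime)


# Hypothetical 8-base stem in which 6 of the 8 facing positions pair.
print(percent_match("GAGGGUAU", "AAAUCCUC"))
```

A bulge or loop-out within a stem would need a gapped alignment, which this sketch does not attempt.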
A suitable computer program for determining mRNA secondary structures is the SEQ - DNA Sequence Analysis System marketed by IntelliGenetics, Inc., of Palo Alto, California (hereinafter the "SEQ program"). A description of this program can be found in Brutlag, et al., "SEQ: A Nucleotide Sequence Analysis and Recombination System," Nucleic Acids Research, 10:279-294 (1982). As discussed therein, the SEQ program uses the Tinoco et al. techniques, supra, to calculate free energies. See also Zuker and Sankoff, Bull. of Math. Biol., 46:591-621 (1984).
Once the predicted secondary structure or structures have been constructed, they are analyzed to determine: 1) if all or part of either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the secondary structure or structures; and 2) if the free energy of such a stem-loop region is between zero and about -10.0 kcal/mole. The free energy can be conveniently calculated using a computer program, such as, the SEQ program.
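The two-part test just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of the SEQ program: the predicted structure is assumed to be available in dot-bracket notation (a convention not used in this specification), the function names are hypothetical, the 28-base sequence is a toy constructed for the example, and the free energy is supplied by hand rather than calculated.

```python
def paired_positions(structure: str) -> set:
    """Return the 0-based indices of all base-paired positions in a dot-bracket string."""
    stack, paired = [], set()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            paired.update((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return paired


def is_sequestered(structure: str, start: int, length: int) -> bool:
    """True if all or part of the element at [start, start + length) is base-paired."""
    paired = paired_positions(structure)
    return any(i in paired for i in range(start, start + length))


def needs_modification(sd_paired: bool, aug_paired: bool, free_energy: float,
                       lower_bound: float = -10.0) -> bool:
    """Condition 1 (an element is sequestered) AND condition 2 (lower_bound <= dG <= 0)."""
    return (sd_paired or aug_paired) and lower_bound <= free_energy <= 0.0


# Toy sequence (not a real transcript): the SD element AGAGGGU sits in a 7-pair stem.
seq = "AAAGAGGGUAUUAACCCUCUAAAUGAAA"
structure = "..(((((((....)))))))........"
sd_paired = is_sequestered(structure, seq.find("AGAGGGU"), 7)
aug_paired = is_sequestered(structure, seq.find("AUG"), 3)
# -3.9 kcal/mole is used purely as an illustrative hand-supplied value.
print(sd_paired, aug_paired, needs_modification(sd_paired, aug_paired, -3.9))
```

The lower bound is a parameter so that the narrower range of about -7.0 kcal/mole mentioned earlier can be applied with the same function.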
If both of the foregoing conditions are satisfied, the mRNA sequence is then analyzed to identify modifications to the mRNA sequence, i.e., base additions, deletions, and/or substitutions, which will result in predicted secondary structures wherein the AUG initiation codon and the Shine-Dalgarno sequence are not contained in a double stranded portion of the predicted secondary structure. Alternatively, but less preferably, the sequence is analyzed for modifications which will place the AUG initiation codon and/or the Shine-Dalgarno sequence in a double stranded portion of a predicted secondary structure which is even less stable than the predicted secondary structure of the original mRNA sequence, i.e., in a secondary structure whose calculated free energy is more positive than the calculated free energy of the original mRNA sequence.
Various constraints must be kept in mind in considering possible modifications to the mRNA sequence. For example, if the protein which is to be produced is to remain unchanged, modifications which are to be made to the portion of the mRNA sequence which codes for the protein are limited to those which degenerately code for the same amino acids. See, for example, Nussinov, R., "RNA Folding Is Unaffected by the Nonrandom Degenerate Codon Choice", Biochimica et Biophysica Acta, 698:111-115 (1982). Also, when modifying the protein coding region, it is preferred to select codons which correspond to those tRNAs which are most abundant in the host cell which will be used for protein production. When modifying non-coding regions of the mRNA sequence, other constraints come into play. For example, it is in general preferred to avoid changes in the Shine-Dalgarno sequence and the spacing between the AUG initiation codon and the Shine-Dalgarno sequence that may adversely affect the translation initiation process. Also, as reported by De Boer et al., the addition of C or G residues to the region between the initiation codon and the Shine-Dalgarno sequence is not preferred. Of course, although changes of the foregoing types are not preferred, they can be used if necessary to achieve the desired predicted secondary structure.
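The coding-region constraint above (only substitutions that degenerately code for the same amino acid) can be illustrated by enumerating the synonymous alternatives for a single codon. The sketch below assumes the standard genetic code; the function names and the 9-base example sequence are hypothetical. It does not rank the alternatives by tRNA abundance in the host, which would require a host-specific codon usage table.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position ('*' = stop).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AMINO_ACIDS)}


def translate(coding: str) -> str:
    """Translate an in-frame mRNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3))


def synonymous_codons(codon: str) -> list:
    """All other codons that code for the same amino acid as `codon`."""
    target = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target and c != codon)


def synonymous_variants(coding: str, codon_index: int) -> list:
    """Variants of `coding` in which one codon (0-based index) is replaced by a synonym."""
    start = 3 * codon_index
    original = coding[start:start + 3]
    return [coding[:start] + alt + coding[start + 3:] for alt in synonymous_codons(original)]


# Hypothetical 3-codon sequence Met-Leu-Ala; vary the leucine codon only.
for variant in synonymous_variants("AUGUUAGCU", 1):
    print(variant, translate(variant))
```

Each variant produced this way would then be passed back through the secondary structure prediction, since a synonymous change can strengthen as well as weaken a stem.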
Once a modified mRNA sequence has been selected, its predicted secondary structure is determined following the same procedures as those used for the original mRNA sequence. Again, the structure, or in some cases, structures are examined to determine the locations of the AUG initiation codon and the Shine-Dalgarno sequence, and, if necessary, the free energy of the region of the secondary structure which contains these elements is calculated. If necessary, different or further modifications of the original mRNA sequence are then analyzed until a modified mRNA sequence is selected which achieves the goal of minimizing the likelihood that secondary structure will interfere with the functioning of the Shine-Dalgarno sequence and the AUG initiation codon.
Production of the modified mRNA sequence is achieved by altering the DNA or RNA sequence which codes for the original mRNA. Various techniques can be used to produce the modified DNA or RNA sequence. For example, site-specific mutagenesis can be used to achieve the modifications. See, for example, Messing, J., "New M13 Vectors for Cloning," Methods in Enzymology: Recombinant DNA, Wu et al. (eds.), Academic Press, New York, Volume 101, Part C, pages 20-79 (1983). Similarly, the modifications can be produced through the use of synthetic gene fragments and nucleases which fragment the DNA or RNA sequence at predetermined locations. As will be evident to persons of ordinary skill in the art, other techniques which may be developed in the future for preparing or altering DNA or RNA sequences can be used in the practice of the present invention.
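The cycle of selecting a modified sequence, predicting its structure, and checking the result, which precedes the laboratory modification described above, can be summarized as a short selection routine. This is a sketch under assumptions: the structure prediction is abstracted as a caller-supplied function returning a (sequestered, free energy) pair, and the names, placeholder sequence labels, and energy values are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# A predictor maps an mRNA sequence to (sequestered, free energy in kcal/mole).
Predictor = Callable[[str], Tuple[bool, float]]


def select_modified_sequence(original: str, candidates: Iterable[str],
                             predict: Predictor) -> Optional[str]:
    """Pick a modified sequence according to the preferences described in the text.

    Preferred: the first candidate in which neither element is sequestered.
    Less preferred: the sequestered candidate whose stem-loop is least stable,
    provided its free energy is more positive than that of the original.
    Returns None if no candidate improves on the original.
    """
    _, original_dg = predict(original)
    fallback, fallback_dg = None, original_dg
    for candidate in candidates:
        sequestered, dg = predict(candidate)
        if not sequestered:
            return candidate
        if dg > fallback_dg:
            fallback, fallback_dg = candidate, dg
    return fallback


# Stub predictions keyed by placeholder labels (illustrative values, not measured data).
table = {"orig": (True, -3.9), "m1": (True, -5.0), "m2": (True, -2.0), "m3": (False, -3.2)}
print(select_modified_sequence("orig", ["m1", "m2", "m3"], lambda s: table[s]))
print(select_modified_sequence("orig", ["m1", "m2"], lambda s: table[s]))
```

In practice the predictor would wrap whatever structure prediction and free energy calculation is being used, and the candidates would be restricted to modifications satisfying the constraints discussed earlier.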
Production of protein using the modified DNA or RNA sequence is achieved following standard recombinant DNA/RNA techniques. A discussion of these techniques can be found in, for example, Molecular Cloning: A Laboratory Manual, by T. Maniatis et al., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1982). Again, as with the altering of DNA or RNA sequences, the present invention can be used in combination with techniques developed in the future for producing proteins from genetically altered cells.
Without intending to limit it in any manner, the invention will be more fully described by the following examples. The materials and methods which are common to the examples are as follows.
MATERIALS AND METHODS
Strains, plasmids, and plasmid isolation
E. coli K-12 strain JA221 (hsdM+ hsdR- recA Leu LacY Trp) was used as the transformation host. pIN series vectors, in particular, pIN-I-Ap and pIN-I-A3, were used as the cloning vehicles. See Nakamura, K., and Inouye, M., "Construction of versatile expression cloning vehicles using the lipoprotein gene of Escherichia coli," The EMBO Journal, 1:771-775 (1982). These vectors use the promoter and the 5' untranslated region of the E. coli outer membrane lipoprotein gene for transcriptional and translational initiation of the cloned gene.
Plasmid pCGS282, which was obtained from Collaborative Research, Inc., Lexington, Massachusetts, was used as a source of a leukocyte interferon α1 gene (see Figure 1). This plasmid is a hybrid plasmid in which a mature human interferon α1 gene has been inserted between the S. cerevisiae galactose promoter (GAL-P) and the S. cerevisiae invertase transcription terminator (SUC). In this construction, 69 base pairs coding for the signal sequence of the prointerferon protein have been removed and replaced by an ATG translation initiation codon and Cla I and Hind III linkers.
Plasmids were isolated by the method described by Tanaka and Weisblum. Tanaka, T. and Weisblum, B., "Construction of a colicin El-R factor composite plasmid in vitro: Means of amplification of deoxyribonucleic acid," J. Bacteriol., 121:345-362 (1975).
Preparation of cell extracts, gel electrophoresis and immunostaining
Cell pellets from 20 ml of log phase E. coli cell culture were suspended in 2 ml of 0.1 M phosphate buffer (pH 7.5). Cells were broken by sonication for 5 min. Cell extracts were obtained after 1 hour ultracentrifugation at 40,000 rpm at 4°C.
Sodium dodecylsulfate-polyacrylamide gel electrophoresis (SDS-PAGE) was performed using a 15% slab gel by the method of Anderson et al. Anderson, C. W., Baum, P. R. and Gesteland, R. F., "Processing of adenovirus 2-induced proteins," J. Virol., 12 (1973) 241-252. A mixture of 20 μl of cell extract and 20 μl of loading buffer was loaded on the 15% SDS-PAGE gel and the samples were electrophoresed at 140 V until the bromophenol blue dye moved out of the gel. The protein bands on the gel were transferred to a nitrocellulose sheet by electroblotting at 70 V for 3 hours at 4°C. The sheet was washed with TBS (20 mM Tris-HCl, pH 7.5; 500 mM NaCl) and blocked with TBS containing 3% BSA at room temperature for 30 min. The blocked sheet was incubated overnight in T-TBS (TBS + 0.05% Tween 20) at 4°C containing a 100-fold dilution of rabbit polyclonal antibody against alpha interferon which was obtained from Interferon Sciences, Inc., New Brunswick, New Jersey. The sheet was washed three times with T-TBS, incubated in a 3000-fold dilution of BioRad peroxidase conjugated goat-anti-rabbit IgG at room temperature for 2 hours, and washed three times with TBS. The protein bands were visualized by developing the sheet in a freshly prepared solution of 0.05% 4-chloro-1-naphthol/0.00015% H2O2 at room temperature for 30 min. The developed sheet was washed four to five times with distilled water to stop the reaction and then air dried.
Example 1
Construction of Plasmid pNL015
This example relates to the construction of a plasmid (pNL015) which produces a mRNA sequence in which the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the sequence's predicted secondary structure. The calculated free energy of the stem-loop region is -3.9 kcal/mole, i.e., the calculated free energy is in the range of free energies which prior workers in the art thought could not significantly affect protein production.
pNL015 was constructed by first constructing pNL014 by ligating a HindIII - SalI DNA fragment containing the IFN α1 gene from pCGS282 to the large HindIII - SalI fragment of pIN-I-A^ (see Figure 1). In this construction, the promoter, the 5' untranslated region of the lipoprotein gene, a sequence coding for the first two amino acid residues of the prolipoprotein, and a linker sequence coding for seven amino acid residues are situated 5' to the coding region of the methionyl IFN α1 gene. At the 3' end, the IFN α1 3' untranslated region is followed by an invertase transcription terminator. Accordingly, there are two transcriptional termination sequences, both eukaryotic in nature, following the α1 interferon coding sequence.
The biological activity of interferon isolated from E. coli cells JA221 harboring pNL014 was measured using Vesicular Stomatitis virus (Indiana Strain) on HEp-2 cells in a cytopathic effect assay. See Lee, N., Cozzitorto, J., Wainwright, N. and Testa, D., "Cloning with tandem gene systems for high level gene expression," Nucleic Acids Res., 12 (1984) 6797-6812. The quantity of IFN isolated from these cells was in the range of 1.2 x 10 units/ml/OD.
pNL015 was constructed from pNL014 and pIN-I-A^ following the procedures shown in Figure 2. In this plasmid, the two eukaryotic terminators of plasmid pNL014 are replaced by a single E. coli lipoprotein terminator. This change was found to result in a two-fold increase in interferon activity (i.e., to the range of 2.8 x 10 units/ml/OD).
Example 2
Construction of a Predicted Secondary Structure for mRNA Produced from pNL015
A predicted secondary structure for mRNA produced from pNL015 was constructed using the IntelliGenetics SEQ program, supra. The parameter settings employed in performing the computer analysis of the primary mRNA sequence were the default parameters for the SEQ program, i.e., Percentmatch = 70%, AfterMismatch = 2, LoopOut = 3, MinLoop = 3, and MaxLoop = 50. Various combinations of the Percentmatch, MaxLoop, MinLoop, and AfterMismatch parameters used in the SEQ program were found to predict the same secondary structure as the default parameters.
The predicted secondary structure in the region of the AUG initiation codon and the Shine-Dalgarno sequence (AGAGGGU) obtained by this analysis is shown in Figure 4. The secondary structure shown has a free energy of -3.9 kcal/mole, a percentage match of 75%, and an "E" factor, i.e., a probability of finding as good a match in a random sequence of bases of the same length, of 2.387. As is evident from Figure 4, the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure. Moreover, as indicated above, the stem-loop region has a calculated free energy in the range of 0 to -10.0 kcal/mole, i.e., a calculated free energy of -3.9 kcal/mole.
Example 3
Construction of Plasmid pNL008
This example illustrates the modification of pNL015 to produce a plasmid (pNL008) which produces a mRNA having a predicted secondary structure in which neither the AUG initiation codon nor the Shine-Dalgarno sequence is included in a double stranded portion of a stem-loop region of the secondary structure.
To construct pNL008, pNL015 was linearized with ClaI, blunt-ended with S1 nuclease, ligated with EcoRI linkers and digested with EcoRI. The small EcoRI fragment was isolated and inserted into the EcoRI site of the pIN-I-A2 cloning vehicle as shown in Figure 3. pNL008 differs from pNL015 in that the mRNA transcript produced by pNL008 has 11 bases deleted starting from base 17 downstream of the AUG initiation codon, and a 2 base (A-A) insertion between bases 9 and 10. Other than these differences, the two plasmids are identical.
Figure 4 shows a predicted secondary structure for pNL008 where the deleted and inserted sequences in pNL015 and pNL008 are indicated by the light bars. The structure for pNL008 shown in Figure 4 was constructed manually from the primary structure for the plasmid's first 74 bases. Using the Tinoco technique, the free energy for this structure was calculated manually and found to be -3.2 kcal/mole. Computer analysis using the SEQ program of the same primary sequence predicted a less stable structure having an energy of -1.7 kcal/mole in which the SD sequence was partially contained in a double stranded portion of a stem-loop region.
As is evident from Figure 4, the modifications to pNL015 have freed both the initiation codon and the Shine-Dalgarno sequence from double stranded regions of the predicted secondary structure. Significantly, this freeing was found to result in a ten-fold increase in the production of α1 interferon. Specifically, E. coli cells transformed with pNL008 were found to produce 3 x 10 units/ml/OD as compared to only 2.8 x 10 units/ml/OD for cells transformed with pNL015.
Example 4
Experiments to Confirm that the Difference in Protein Production Between Plasmids pNL008 and pNL015 Is Due To the Predicted Secondary Structures of Their mRNAs
A series of experiments was performed to confirm that the observed difference in protein production between plasmids pNL008 and pNL015 was due to their predicted secondary structures as opposed to other factors.
First, it was determined that the specific activities of interferon encoded by pNL015 and pNL008 were the same. In addition, it was determined that cells containing pNL015 had a proportionally lower quantity of protein detectable by Western blot analysis than cells containing pNL008. Accordingly, it was concluded that the difference in N-terminus amino acid residues was not a factor affecting the antiviral activity of the interferon protein.
The in vivo stability of the interferon protein was examined in a pulse-chase experiment. The interferon proteins encoded by pNL015 and pNL008 showed no noticeable degradation within 60 minutes (see Figure 5). It was therefore concluded that the differential rate of degradation by proteolytic enzymes was not a contributing factor to the observed difference in IFN expression.
Experiments were carried out to quantitate the interferon mRNA synthesized. Dot blot hybridization experiments showed that pNL008 produced slightly more interferon mRNA than pNL015 (see Figure 6). This minimal change in the efficiency of transcription, however, was not large enough to account for the observed ten-fold increase in expression level.
The stability of the IFN mRNAs was studied by measuring levels of labeled IFN at various time intervals after the addition of rifampicin, an inhibitor of RNA synthesis, to cultures of JA221/pNL015 and JA221/pNL008. As can be seen in Figure 7, functional half-lives of the transcripts produced by both pNL015 and pNL008 are approximately 4 to 5 minutes.
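One way a functional half-life can be estimated from incorporation measurements of this kind is sketched below. The specification does not state how the half-lives were derived from Figure 7, so the approach here (first-order decay, with a least-squares line fitted through the logarithm of the percentage of zero-time incorporation) is an assumption, and the counts are made-up numbers chosen to halve every 5 minutes, not data from the experiments.

```python
import math


def percent_of_zero_time(counts):
    """Express each count as a percentage of the first (zero-time) count."""
    zero = counts[0]
    return [100.0 * c / zero for c in counts]


def functional_half_life(times_min, percents):
    """Half-life from a least-squares line through ln(percent) versus time.

    Assumes first-order decay, i.e. percent = 100 * exp(-k * t), half-life = ln(2) / k.
    """
    logs = [math.log(p) for p in percents]
    t_mean = sum(times_min) / len(times_min)
    y_mean = sum(logs) / len(logs)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, logs))
             / sum((t - t_mean) ** 2 for t in times_min))
    return -math.log(2) / slope


# Hypothetical counts (cpm) at 0, 5, 10, 15 and 20 min after rifampicin addition.
times = [0, 5, 10, 15, 20]
counts = [8000, 4000, 2000, 1000, 500]
percents = percent_of_zero_time(counts)
print(percents)
print(round(functional_half_life(times, percents), 2))
```

With real measurements the points would scatter about the fitted line, and the fit would only be meaningful over the interval in which decay is approximately first-order.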
Based on the foregoing experiments, it was concluded that none of specific activity, rate of transcription, protein degradation or mRNA degradation could account for the observed difference in expression between pNL008 and pNL015.
In addition to the foregoing, two additional plasmids (pNL016 and pNL017) were constructed to further confirm that the source of the difference in expression between plasmids pNL008 and pNL015 was mRNA secondary structure. These plasmids were formed by synthesizing two EcoRI - Cla I DNA fragments (fragments A and B) to replace the corresponding sequence in pNL015 (bases 50 to 63 in Figure 4). DNA fragment B contained a single base substitution whereby base 60 of the pNL015 transcript was changed from U to C. DNA fragment A contained an additional base substitution whereby both base 60 (U) and base 62 (A) were changed to C.
These changes do not change the amino acid sequence defined by plasmid pNL015 since each of CUA, CUC, and UUA code for leucine. Moreover, using the SEQ program, it was found that the predicted secondary structures for the mRNA sequences corresponding to pNL016 and pNL017, like the predicted secondary structure corresponding to pNL015, had the Shine-Dalgarno sequence in a double stranded portion of a stem-loop region of the secondary structure.
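The codon changes can be checked directly. In the sketch below, bases 60 to 62 of the pNL015 transcript are taken to form the codon UUA, as implied by the substitutions described above; the helper function is hypothetical, and the set of leucine codons is that of the standard genetic code.

```python
# The six leucine codons of the standard genetic code.
LEUCINE_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}


def substitute(codon: str, changes: dict) -> str:
    """Return `codon` with the given 0-based positions replaced by new bases."""
    bases = list(codon)
    for position, base in changes.items():
        bases[position] = base
    return "".join(bases)


original = "UUA"                                      # bases 60-62 in pNL015 (assumed frame)
fragment_b = substitute(original, {0: "C"})           # base 60: U -> C
fragment_a = substitute(original, {0: "C", 2: "C"})   # base 60: U -> C and base 62: A -> C
for codon in (original, fragment_b, fragment_a):
    print(codon, codon in LEUCINE_CODONS)
```

All three codons fall within the leucine set, so the encoded amino acid sequence is unchanged by either fragment.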
The calculated free energy values for the mRNA stem-loop regions for the three plasmids, however, were different. Specifically, whereas the stem-loop region for pNL015 had a free energy of -3.9 kcal/mole, for pNL017, the free energy was -10.8 kcal/mole, and for pNL016, it was -17 kcal/mole. Significantly, E. coli cells transformed with either pNL016 or pNL017 produced no detectable interferon activity.
Since all three plasmids code for the same amino acid sequence, post-translational factors such as specific activity and protein stability cannot contribute to the change in interferon titer. Moreover, substitution of one or two bases in the coding region is unlikely to alter the overall rate of transcription. Having ruled out these factors, the only possibility left is translational inhibition.
Translational efficiency is unlikely to be affected by codon usage, since only one synonymous codon substitution is involved. Moreover, in E. coli, the codon CUC used in pNL016 is a more frequently used codon and the codon CUA used in pNL017 is a less frequently used codon in comparison to the UUA codon used in pNL015. See Maruyama et al., "Codon usage tabulated from the GenBank genetic sequence data," Nucl. Acids Res., Vol. 14 sup., pages R151-R197 (1986). Yet, both synonymous substitutions resulted in complete inhibition of expression.
These additional results for plasmids pNL016 and pNL017 add further support to the conclusion that the efficiency with which pNL015's mRNA is translated is a function of its secondary structure notwithstanding the relative instability of that secondary structure.

Claims

What is claimed is:
1. A method for increasing the translation efficiency of a mRNA sequence which (i) codes for a protein, (ii) is produced from a RNA or DNA sequence, and (iii) includes an AUG initiation codon and a Shine-Dalgarno sequence, comprising the steps of:
(a) determining and analyzing a predicted secondary structure for the mRNA sequence to determine if one or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted structure;
(b) calculating a free energy value for the stem-loop region;
(c) if the calculated free energy value is in the range of from zero to about -10.0 kcal/mole:
(i) selecting a modified mRNA sequence;
(ii) determining and analyzing a predicted secondary structure for the modified mRNA sequence to determine if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted structure;
(iii) repeating steps (c)(i) and (c)(ii), if necessary, until a modified mRNA sequence is identified in which neither the AUG initiation codon nor the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure; and
(d) modifying the DNA or RNA sequence so that it produces the modified mRNA sequence.
2. The method of Claim 1 wherein the free energy calculated in step (b) is in the range of from zero to about -7.0 kcal/mole.
3. The method of Claim 1 wherein the modification of the DNA or RNA sequence results in the production of at least five times more protein than produced by the unmodified DNA or RNA sequence.
4. A DNA or RNA sequence which has been modified by the method of Claim 1.
5. A transformed host cell which includes the modified DNA or RNA sequence of Claim 4.
6. A method for increasing the translation efficiency of a mRNA sequence which (i) codes for a protein, (ii) is produced from a DNA or RNA sequence, and (iii) includes an AUG initiation codon and a Shine-Dalgarno sequence, comprising the steps of:
(a) determining and analyzing a predicted secondary structure for the mRNA sequence to determine if one or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted structure;
(b) calculating a free energy value for the stem-loop region;
(c) if the calculated free energy value is in the range of from zero to about -10.0 kcal/mole:
(i) selecting a modified mRNA sequence;
(ii) determining and analyzing a predicted secondary structure for the modified mRNA sequence to determine if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted structure;
(iii) if neither the AUG initiation codon nor the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure for the modified mRNA, modifying the DNA or RNA sequence so that it produces the modified mRNA sequence;
(iv) if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted structure for the modified mRNA, calculating a free energy value for the stem-loop region;
(v) if the free energy value calculated in step (c)(iv) is more positive than the free energy value calculated in step (b), modifying the DNA or RNA sequence so that it produces the modified mRNA sequence;
(vi) if the free energy value calculated in step (c)(iv) is equal to or more negative than the free energy value calculated in step (b), repeating steps (c)(i) and (c)(ii), and, as appropriate, steps (c)(iii), (c)(iv), and (c)(v), until the condition of (c)(iii) or the condition of (c)(v) is satisfied.
7. The method of Claim 6 wherein the free energy calculated in step (b) is in the range of from zero to about -7.0 kcal/mole.
8. The method of Claim 6 wherein the modification of the DNA or RNA sequence results in the production of at least five times more protein than produced by the unmodified DNA or RNA sequence.
9. A DNA or RNA sequence which has been modified by the method of Claim 6.
10. A transformed host cell which includes the modified DNA or RNA sequence of Claim 9.
11. A method for increasing the translation efficiency of a mRNA sequence which (i) codes for a protein, (ii) is produced from a DNA or RNA sequence, and (iii) includes an AUG initiation codon and a Shine-Dalgarno sequence, comprising the steps of:
(a) determining and analyzing a predicted secondary structure for the mRNA sequence to determine if either or both of the AUG initiation codon and the Shine-Dalgarno sequence is contained in a double stranded portion of a stem-loop region of the predicted secondary structure;
(b) calculating a free energy value for the stem-loop region; and
(c) modifying the DNA or RNA sequence for the mRNA, if the calculated free energy value is in the range of from zero to about -10.0 kcal/mole, so that the modified DNA or RNA sequence produces a modified mRNA sequence which has a predicted secondary structure wherein either:
(i) the AUG initiation codon and the Shine-Dalgarno sequence are not included in a double stranded portion of a stem-loop region of the predicted secondary structure; or
(ii) if either or both of the AUG initiation codon and the Shine-Dalgarno sequence are included in a double stranded portion of a stem-loop region of the predicted secondary structure, the calculated free energy value for such stem-loop region is more positive than the free energy value calculated in step (b).
12. The method of Claim 11 wherein the free energy calculated in step (b) is in the range of from zero to about -7.0 kcal/mole.
13. The method of Claim 11 wherein the AUG initiation codon and the Shine-Dalgarno sequence are not included in a double stranded portion of a stem-loop region of the predicted secondary structure of the modified mRNA.
14. The method of Claim 11 wherein the modification of the DNA or RNA sequence results in the production of at least five times more protein than produced by the unmodified DNA or RNA sequence.
15. A DNA or RNA sequence which has been modified by the method of Claim 11.
16. A transformed host cell which includes the modified DNA or RNA sequence of Claim 15.
PCT/US1988/002341 1987-07-13 1988-07-12 Method for improving translation efficiency WO1989000604A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US7272787A 1987-07-13 1987-07-13
US072,727 1987-07-13

Publications (1)

Publication Number Publication Date
WO1989000604A1 true WO1989000604A1 (en) 1989-01-26

Family

ID=22109393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1988/002341 WO1989000604A1 (en) 1987-07-13 1988-07-12 Method for improving translation efficiency

Country Status (2)

Country Link
EP (1) EP0328584A1 (en)
WO (1) WO1989000604A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0568641A1 (en) * 1991-01-25 1993-11-10 United States Biochemical Corporation Regulation of nucleic acid translation
US5338853A (en) * 1989-12-22 1994-08-16 Elf Atochem North America, Inc. Derivatives of N-HALS-substituted amic acid hydrazides
WO1995011980A2 (en) * 1993-10-25 1995-05-04 The Government Of The United States Of America, Represented By The Secretary, Department Of Health And Human Services PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI
WO2003066864A2 (en) * 2002-02-07 2003-08-14 Biomax Informatics Ag Method for predicting the expression efficiency in cell-free expression systems
WO2004053053A2 (en) * 2002-12-09 2004-06-24 F. Hoffmann La Roche Ag Optimised protein synthesis
WO2013071295A3 (en) * 2011-11-10 2013-07-18 Rutgers, The State University Of New Jersey Transcript optimized expression enhancement for high-level production of proteins and protein domains
WO2016086988A1 (en) * 2014-12-03 2016-06-09 Wageningen Universiteit Optimisation of coding sequence for functional protein expression
CN107075525A (en) * 2014-05-30 2017-08-18 纽约市哥伦比亚大学理事会 Change the method for expression of polypeptides
WO2019241684A1 (en) * 2018-06-15 2019-12-19 Massachusetts Institute Of Technology Synthetic 5' utr sequences, and high-throughput engineering and screening thereof
US10724040B2 (en) 2015-07-15 2020-07-28 The Penn State Research Foundation mRNA sequences to control co-translational folding of proteins

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4716217A (en) * 1984-08-31 1987-12-29 University Patents, Inc. Hybrid lymphoblastoid-leukocyte human interferons

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Gene, Volume 58, issued July 1987 (Amsterdam, Netherlands), (LEE et al), "Modification of mRNA Secondary Structure and Alteration of the Expression of Human Interferon Alpha 1 in Escherichia Coli", see pages 77-86. *
Journal of Bacteriology, Volume 167, issued 2 September 1986, (Washington, D.C., USA), (AMBULOS et al), "Analysis of the Regulatory Sequences needed for Induction of the Chloramphenicol Acetyltransferase Gene Cat-86 by Chloramphenicol and Amicetin", see pages 842-49. *
MGG, Volume 182, issued July 1981 (Berlin, FRG), (HORINOUCHI et al), "The Control Region for Erythromycin Resistance: Free Energy Changes Related to Induction and Mutation to Constitutive Expression", see pages 341-348. *
Nucleic Acids Research, Volume 13, issued 25 March 1985 (Oxford, UK), (HALLEWELL et al), "Human Cu/Zn Superoxide Dismutase cDNA: Isolation of Clones Synthesising High Levels of Active or Inactive Enzyme from an Expression Library", see pages 2017-34. *
The EMBO Journal, Volume 4, issued September 1985, (Oxford, UK), (BRUCKNER et al), "Regulation of the Inducible Chloramphenicol Acetyltransferase Gene of the Staphylococcus Aureus Plasmid pUB112", see pages 2295-2300. *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5338853A (en) * 1989-12-22 1994-08-16 Elf Atochem North America, Inc. Derivatives of N-HALS-substituted amic acid hydrazides
US5397821A (en) * 1989-12-22 1995-03-14 Elfatochem North America, Inc. Derivatives of N-hals-substituted amic acid hydrazides
EP0568641A1 (en) * 1991-01-25 1993-11-10 United States Biochemical Corporation Regulation of nucleic acid translation
EP0568641A4 (en) * 1991-01-25 1994-09-14 Us Biochemical Corp Regulation of nucleic acid translation
WO1995011980A2 (en) * 1993-10-25 1995-05-04 The Government Of The United States Of America, Represented By The Secretary, Department Of Health And Human Services PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI
WO1995011980A3 (en) * 1993-10-25 1995-06-15 Us Health PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI
DE10205091B4 (en) * 2002-02-07 2007-04-19 Biomax Informatics Ag A method for predicting expression efficiency in cell-free expression systems
WO2003066864A3 (en) * 2002-02-07 2003-12-11 Biomax Informatics Ag Method for predicting the expression efficiency in cell-free expression systems
WO2003066864A2 (en) * 2002-02-07 2003-08-14 Biomax Informatics Ag Method for predicting the expression efficiency in cell-free expression systems
WO2004053053A2 (en) * 2002-12-09 2004-06-24 F. Hoffmann La Roche Ag Optimised protein synthesis
WO2004053053A3 (en) * 2002-12-09 2004-09-30 Hoffmann La Roche Optimised protein synthesis
WO2013071295A3 (en) * 2011-11-10 2013-07-18 Rutgers, The State University Of New Jersey Transcript optimized expression enhancement for high-level production of proteins and protein domains
US10385350B2 (en) 2011-11-10 2019-08-20 Rutgers, The State University Of New Jersey Transcript optimized expression enhancement for high-level production of proteins and protein domains
CN107075525A (en) * 2014-05-30 2017-08-18 纽约市哥伦比亚大学理事会 Change the method for expression of polypeptides
EP3149176A4 (en) * 2014-05-30 2017-11-08 The Trustees of Columbia University in the City of New York Methods for altering polypeptide expression
CN107075525B (en) * 2014-05-30 2021-06-25 纽约市哥伦比亚大学理事会 Methods for altering expression of polypeptides
WO2016086988A1 (en) * 2014-12-03 2016-06-09 Wageningen Universiteit Optimisation of coding sequence for functional protein expression
US10724040B2 (en) 2015-07-15 2020-07-28 The Penn State Research Foundation mRNA sequences to control co-translational folding of proteins
WO2019241684A1 (en) * 2018-06-15 2019-12-19 Massachusetts Institute Of Technology Synthetic 5' utr sequences, and high-throughput engineering and screening thereof
US11875876B2 (en) 2018-06-15 2024-01-16 Massachusetts Institute Of Technology Synthetic 5' UTR sequences, and high-throughput engineering and screening thereof

Also Published As

Publication number Publication date
EP0328584A1 (en) 1989-08-23

Similar Documents

Publication Publication Date Title
Sacerdot et al. Sequence of a 1.26‐kb DNA fragment containing the structural gene for E. coli initiation factor IF3: presence of an AUU initiator codon.
Pinkham et al. The nucleotide sequence of the rho gene of E. coli K-12
Ehretsmann et al. Specificity of Escherichia coli endoribonuclease RNase E: in vivo and in vitro analysis of mutants in a bacteriophage T4 mRNA processing site.
Dasgupta et al. Multiple mechanisms for initiation of ColE1 DNA replication: DNA synthesis in the presence and absence of ribonuclease H
Watson A new revision of the sequence of plasmid pBR322
Duke et al. Sequence and structural elements that contribute to efficient encephalomyocarditis virus RNA translation
Masui et al. Novel high-level expression cloning vehicles: 10⁴-fold amplification of Escherichia coli minor protein
Cormack et al. Structural requirements for the processing of Escherichia coli 5 S ribosomal RNA by RNase E in vitro
Aiba et al. The complete nucleotide sequence of the adenylate cyclase gene of Escherichia coli
McNeil et al. Saccharomyces cerevisiae CYC1 mRNA 5′-end positioning: analysis by in vitro mutagenesis, using synthetic duplexes with random mismatch base pairs
Murphy et al. A sequence upstream from the coding region is required for the transcription of the 7SK RNA genes
JPH11507521A (en) Chromosomal expression of heterologous genes in bacterial cells
WO1989000604A1 (en) Method for improving translation efficiency
Murotsu et al. Identification of the minimal essential region for the replication origin of miniF plasmid
US5362646A (en) Expression control sequences
JP2511948B2 (en) Enhanced protein production using a novel ribosome binding site in bacteria
Wu et al. Control of gene expression in bacteriophage P22 by a small antisense RNA. II. Characterization of mutants defective in repression.
HU205384B (en) Process for increased expressin of human interleukin-2 in mammal cells
Lee et al. Modification of mRNA secondary structure and alteration of the expression of human interferon α1 in Escherichia coli
US5017488A (en) Highly efficient dual T7/T3 promoter vector PJKF16 and dual SP6/T3 promoter vector PJFK15
Chen et al. The influence of adenine-rich motifs in the 3′ portion of the ribosome binding site on human IFN-γ gene expression in Escherichia coli
de Smit et al. Translational initiation at the coat‐protein gene of phage MS2: native upstream RNA relieves inhibition by local secondary structure
Sollazzo et al. High-level expression of RNAs and proteins: the use of oligonucleotides for the precise fusion of coding-to-regulatory sequences
Guidi-Rontani et al. Transcriptional control of polarity in Escherichia coli by cAMP
Schulz et al. Increased expression in Escherichia coli of a synthetic gene encoding human somatomedin C after gene duplication and fusion

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE FR GB IT LU NL SE

WWE Wipo information: entry into national phase

Ref document number: 1988906588

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1988906588

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1988906588

Country of ref document: EP