WO1995011980A2 - PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI

PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI

Info

Publication number
WO1995011980A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
vector
protein
dna
seq
Prior art date
Application number
PCT/US1994/012166
Other languages
French (fr)
Other versions
WO1995011980A3 (en)
Inventor
Yury Khudyakov
Howard A. Fields
Original Assignee
The Government Of The United States Of America, Represented By The Secretary, Department Of Health And Human Services
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Government Of The United States Of America, Represented By The Secretary, Department Of Health And Human Services
Priority to AU80887/94A (AU8088794A)
Publication of WO1995011980A2
Publication of WO1995011980A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/67 General methods for enhancing the expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/24011 Flaviviridae
    • C12N2770/24211 Hepacivirus, e.g. hepatitis C virus, hepatitis G virus
    • C12N2770/24222 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes

Definitions

  • the present invention relates to a vector for efficient expression of a protein encoded by the nucleotides in a synthetic coding sequence inserted therein.
  • a vector comprising a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
  • Hepatitis C virus (HCV) is a recently identified agent responsible for most cases of post-transfusion non-A, non-B hepatitis (NANBH) worldwide (Plagemann, 1991).
  • the virus contains a positive strand RNA genome comprising about 9,400 nucleotides (nt) that encodes a polyprotein of more than 3,000 amino acids.
  • nt nucleotides
  • The N-terminal region of the HCV polyprotein is processed into structural proteins C (a nucleocapsid protein) and E1 and E2/NS1 (envelope proteins). Although the precise points for processing of the polyprotein have not been confirmed, the polyprotein contains four nonstructural proteins: NS2, NS3, NS4, and NS5.
  • the HCV nucleocapsid protein C has been cloned and expressed in bacteria (Muraiso et al., 1990;
  • HCV nucleocapsid protein is known, there is no abundant source of the protein for, for example, immunodiagnostic purposes. Additionally, there is currently no useful means for manipulating the amino acid sequence of the protein for such uses as adding additional antigenic epitopes for immunodiagnostic purposes. While it is known to obtain expression of eukaryotic genes in prokaryotes, especially E. coli, current vectors for the expression of proteins do not provide for efficient expression of natural proteins. Some provide for the production of hybrid fusion proteins, wherein the vectors provide both 5' untranslated regions and a translation start codon followed by some coding sequences.
  • the DNA encoding protein of desire would then be placed, without its translation start codon, after the vector's coding sequences such that a hybrid protein is produced from the construct containing amino acids coded by both the vector sequences and the DNA encoding the protein of desire.
  • Other vectors currently available, for example Pharmacia vectors pKK223-3 and pDR540, have strong promoters and are therefore efficient initiators of transcription, but do not provide for efficient translation of coding sequences other than the sequence native to the specific promoter in the vector.
  • a vector that provided efficient expression in prokaryotes, such as E. coli, of a protein encoded therein would be very helpful in providing a source of any given protein or peptide sequence.
  • PCR polymerase chain reaction
  • oligonucleotides covering the entire sequence to be synthesized are first allowed to anneal, and then the nicks are repaired with DNA ligase. The fragment is then cloned directly or cloned after amplification by the PCR. The DNA is subsequently used for in vitro assembly into longer sequences.
  • This approach is very sensitive to the secondary structure of oligonucleotides, which interferes with the synthesis. Therefore, the approach has low efficiency and is not reliable for assembly of long DNA fragments.
  • the second general method for gene synthesis utilizes polymerase to fill in single-stranded gaps in the annealed pairs of oligonucleotides.
  • After the polymerase reaction, single-stranded regions of oligonucleotides become double-stranded and, after digestion with restriction endonuclease, can be cloned directly or used for further assembly of longer sequences by ligating different double-stranded fragments.
  • This approach is relatively independent of the secondary structure of oligonucleotides; however, after the polymerase reaction, each segment must be cloned.
  • the cloning step significantly delays the synthesis of long DNA fragments and greatly decreases the efficiency of the approach. Additionally, this approach can be used for only relatively small DNA fragments and requires restriction endonuclease recognition sites to be introduced into the sequence.
  • the major essential disadvantages of existing approaches for the synthesis of DNA are low efficacy, and the requirement that synthesized DNA must be amplified by cloning procedures, or by the PCR, before use.
  • the main problem with existing approaches is that the long polynucleotide must be assembled from relatively short oligonucleotides utilizing either inefficient chemical or enzymatic synthesis.
  • the use of short oligonucleotides for the synthesis of long polynucleotides can cause many problems due to multiple interactions of complementary bases, as well as problems related to adverse secondary structure of oligonucleotides. These problems lower the efficiency and widespread use of existing synthetic approaches. Therefore, there exists a great need for an efficient means to make synthetic DNA of any desired sequence.
  • Such a method could be universally applied.
  • the method could be used to efficiently make an array of DNA having specific substitutions in a known sequence which are expressed and screened for improved function.
  • the parent application (Serial No. 07/849,294) provides an efficient and powerful method for the synthesis of DNA.
  • the method is generally referred to as the Exchangeable Template Reaction (ETR).
  • prokaryotic expression vectors that enable the efficient production of a protein encoded by the coding DNA therein.
  • Such vectors can be constructed, for example, by the present ETR method for making synthetic DNA and can allow for efficient production of any desirable protein or peptide product.
  • the present invention satisfies this need by providing a vector that allows efficient expression of coding DNA therein by providing a construct that, after the DNA is transcribed into RNA, provides an mRNA that is efficiently translated by prokaryotic cells into a protein product.
  • HCV nucleocapsid protein for such uses as immunodiagnostics.
  • the present invention well satisfies this need by providing not only a vector that expresses HCV nucleocapsid protein at high levels but also a vector wherein the HCV coding sequences, and thus antigenic epitopes, can be easily manipulated.
  • the present invention provides a vector comprising, in sequential order 5' to 3': a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site and a synthetically produced protein coding nucleotide sequence having a translation start codon; a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop. Additionally, the invention provides the above vector further comprising, after the adenosine-containing region, a restriction endonuclease site not found in the DNA encoding the native protein.
  • the instant invention also provides a vector comprising a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
  • the present invention further provides a method of producing a protein comprising placing a cell having a vector as described above in protein expressing conditions and collecting the protein produced by the cell, and in particular when the vector comprises a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
  • Fig. 1 is a schematic showing the general mechanism for the cyclic
  • Fig. 2 shows a schematic representation of the ETR.
  • Deoxyoligonucleotides are shown as solid lines. Points represent DNA polymerase synthesized regions of the double-stranded fragment.
  • the upper strand consists of Oligo1 (SEQ ID NO:1) and newly-synthesized sequences.
  • the lower strand is composed of oligonucleotide sequences that remain after BstXI digestion and after synthesis of new sequences at the very 3' terminus of the strand. The order of the deoxyoligonucleotides involved in the reaction is indicated.
  • Fig. 3 shows a schematic representation of the ETR corresponding to the middle part of the HCV nucleocapsid gene.
  • Fig. 4 shows a schematic representation of the ETR corresponding to the 3' terminal region of the HCV nucleocapsid gene.
  • the present invention provides a vector comprising, in sequential order 5' to 3': a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site; and a synthetically produced protein coding nucleotide sequence having a translation start codon; a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop.
  • Such a vector provides efficient expression of the protein encoded by the coding nucleotide sequence.
  • the vector provides efficient expression of the protein encoded by the coding nucleotide sequence by allowing the manipulation of 5' untranslated region sequences, such that the mRNA produced will form a hairpin loop and thus a very efficient ribosome binding site such that sufficient amounts of protein can be produced for a desired use.
  • sufficient protein is obtained by the present method to immunoreact with sera from HCV-positive donors and be useful as a diagnostic.
  • the secondary structure of the mRNA around the initiator codon of the synthetic foreign protein coding sequence is designed as a hairpin with an ATG codon exposed in the loop and the Shine-Dalgarno (SD) sequence located in the double-stranded region.
  • This hairpin structure can be easily destroyed by the interaction of the Shine-Dalgarno sequence (SD) with the 3'-end of the ribosomal 16S rRNA, resulting in the ATG start codon becoming exposed in the center of a large single-stranded region.
  • Another feature of this translation initiation region is an adenosine-containing (A-containing) sequence downstream from the hairpin.
  • By "adenosine-containing sequence" is meant a sequence with enough adenosine, and sufficiently poor in C and G, to prevent secondary structure and allow for efficient translation.
  • a sequence would contain greater than 50% adenosine residues to be considered "adenosine-containing."
  • Adenosine-containing regions can include adenosine-rich regions having sufficient adenosine residues to prevent secondary structure and allow for efficient translation.
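The "greater than 50% adenosine" criterion can be checked mechanically. The following is a minimal sketch, assuming the region is supplied as a plain nucleotide string; the function names `adenosine_fraction` and `is_adenosine_containing` are illustrative and do not appear in the specification.

```python
def adenosine_fraction(seq: str) -> float:
    """Return the fraction of residues in `seq` that are adenosine (A)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq.count("A") / len(seq)


def is_adenosine_containing(seq: str, threshold: float = 0.5) -> bool:
    """True when strictly more than `threshold` of the residues are adenosine.

    The 0.5 default mirrors the "greater than 50%" wording; it is a
    parameter here only so that other cut-offs can be explored.
    """
    return adenosine_fraction(seq) > threshold
```

A region such as `AAAT` (75% A) passes, while `AACC` (exactly 50% A) does not, because the criterion is "greater than" rather than "at least".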
  • By "Shine-Dalgarno sequence" is meant any of many variations of a sequence usually found in natural prokaryotic genes about 2 to about 15 nucleotides 5' of the ATG translation start codon, termed in the art the "Shine-Dalgarno sequence" and generally accepted as a binding site on the mRNA molecule for the ribosome, thereby facilitating translation of the mRNA.
  • the Shine Dalgarno sequence (SD) in the vector of the present invention is preferably about 2 to about 15 nucleotides 5' of the ATG, and even more preferably about 5 to about 7 nucleotides upstream of the ATG.
  • the SD can be any variant of the consensus sequence AGGAGGU, listed herein as SEQ ID NO: 13, that retains the characteristic of facilitating translation of the vector.
  • the SD is preferably from about 3 nucleotides to about 9 nucleotides in length. Many such variants are known in the art; others can be tested by known methods in combination with the teachings herein.
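The spacing rule for the SD and the ATG can be illustrated with a short script. This is a sketch under stated assumptions: the spacing is counted as the number of nucleotides between the last base of the SD motif and the A of the ATG, the motif is matched exactly (the consensus AGGAGGU written as DNA), and the names `sd_spacing` and `spacing_is_preferred` are hypothetical.

```python
from typing import Optional


def sd_spacing(seq: str, motif: str = "AGGAGGT",
               start_codon: str = "ATG") -> Optional[int]:
    """Nucleotides between the end of `motif` and the next start codon.

    Returns None if the motif or a downstream start codon is not found.
    RNA input is accepted: U is converted to T before searching.
    """
    seq = seq.upper().replace("U", "T")
    sd_pos = seq.find(motif)
    if sd_pos == -1:
        return None
    sd_end = sd_pos + len(motif)
    atg_pos = seq.find(start_codon, sd_end)
    if atg_pos == -1:
        return None
    return atg_pos - sd_end


def spacing_is_preferred(spacing: int) -> bool:
    """True for the 'even more preferred' window of about 5 to 7 nucleotides."""
    return 5 <= spacing <= 7
```

For `TTAGGAGGTAACATATGAAA` the motif ends at index 9 and the ATG begins at index 14, so the spacing is 5, inside the preferred window. SD variants shorter than the full consensus would need a different `motif` argument.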
  • the SD of the invention has at least one, and preferably more than one, unique restriction endonuclease site within or around the SD, such that the SD sequences, or a portion thereof, can be replaced, using recombinant techniques, with more desirable SD sequences, depending upon the coding sequences to which complementarity is desired for formation of a hairpin loop.
  • By "unique" is meant that the restriction endonuclease site is not found elsewhere in the vector.
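Whether a restriction site is "unique" in this sense can be tested by counting its occurrences in the vector sequence. The sketch below assumes a linear string and palindromic recognition sites, so that scanning one strand is sufficient; EcoRV (GATATC) and NdeI (CATATG), the enzymes the description names for replacing the SD, both qualify. The helper names and the example sequence are illustrative only.

```python
def count_sites(seq: str, site: str) -> int:
    """Count occurrences (overlapping included) of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    return sum(1 for i in range(len(seq) - len(site) + 1)
               if seq.startswith(site, i))


def is_unique_site(seq: str, site: str) -> bool:
    """True when the recognition site occurs exactly once in the sequence."""
    return count_sites(seq, site) == 1


# Hypothetical fragment with one EcoRV site and one NdeI site.
example = "AAGATATCCCCATATGTT"
ecorv_unique = is_unique_site(example, "GATATC")
ndei_unique = is_unique_site(example, "CATATG")
```

A circular plasmid would also need a check across the origin junction, and non-palindromic sites would need the reverse-complement strand scanned as well; both are omitted here.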
  • a "synthetically produced protein coding nucleotide sequence” can be produced, for example, by the exchangeable template reaction (ETR) taught herein or by any other method for assembling synthetic genes.
  • Such coding nucleotide sequence includes first a translation start codon.
  • the coding nucleotide sequence includes a region about equidistant 3' of the translation start codon as the SD is 5' of the translation codon which selectively hybridizes with the SD to form a hairpin loop, such that the start codon is exposed in the center of the loop and translation is thereby efficiently initiated.
  • Either the SD or the region within the coding sequence, or both, can be modified to create sequences that can selectively hybridize, as taught and exemplified herein.
  • By "selectively hybridizes" is meant that one nucleic acid specifically hybridizes with its target (i.e., complementary) nucleic acid based upon complementarity between the two sequences, rather than random, non-specific hybridization.
  • target i.e., complementary
  • the sequence of one nucleic acid is uniquely complementary to its target hybridizing sequence, such that the two sequences, under standard hybridizing conditions or cellular translation conditions, can form a hairpin loop.
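The complementarity requirement for the hairpin stem can be sketched as a reverse-complement comparison. This is a simplified illustration, assuming perfect Watson-Crick pairing over the whole segment and ignoring G·U wobble pairs and thermodynamic stability; the names `reverse_complement` and `forms_stem` are not from the specification.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forms_stem(upstream: str, downstream: str) -> bool:
    """True when `downstream` can base-pair with `upstream` along its full length.

    `upstream` stands for the SD-containing segment 5' of the start codon,
    `downstream` for the segment about equidistant 3' of it.
    """
    return reverse_complement(upstream) == downstream.upper()
```

With an SD core of `AGGAGG`, the downstream partner that closes the stem is `CCTCCT`; the start codon would then sit in the unpaired loop between the two segments.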
  • the Shine-Dalgarno sequence can be modified by substituting nucleotides or by adding or deleting nucleotides, so long as it selectively hybridizes with the complementary coding region to form a hairpin loop and maintains its ribosome binding capabilities, as discussed above.
  • the complementary coding sequence can be modified utilizing the redundancy of the genetic code. Any nucleotide substitution can be made to this coding sequence that will not change the amino acid sequence encoded therein, as taught and exemplified herein.
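Whether a nucleotide substitution leaves the encoded amino acids unchanged can be verified by translating both versions. The sketch below builds the standard genetic code table and compares translations; the helper names and the example sequences are illustrative, not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(original: str, modified: str) -> bool:
    """True when both sequences encode the same amino acid sequence."""
    return translate(original) == translate(modified)
```

For instance, `ATGAGCACG` and `ATGTCAACA` both translate to Met-Ser-Thr, so the second is an acceptable substitute for the first under this rule.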
  • the vector also includes, after the coding region that selectively hybridizes with the SD, an adenosine-rich region of a length and number of adenosines sufficient to prevent or decrease secondary structure of the mRNA of the vector in the region immediately downstream of the hairpin loop. This can separate the translation initiation region from the rest of the mRNA, thus decreasing the influence of the downstream sequences on the secondary structure within the translation initiation zone.
  • By "immediately downstream" is meant at a distance 3' of the hairpin loop such that the influence of the sequences further downstream is decreased.
  • the amino acids coded by the region of the vector are examined, and using the degeneracy of the genetic code, substitutions that can be made to replace non-adenosine nucleotides with adenosines are made, as taught and exemplified herein.
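One way to picture the adenosine substitution step is a greedy pass that swaps each codon for the synonymous codon with the most adenosines. This is only a sketch: the specification does not state an algorithm, and this version ignores codon usage, restriction sites, and secondary structure. The name `enrich_adenosine` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def enrich_adenosine(dna: str) -> str:
    """Replace each codon with the synonymous codon richest in adenosine.

    When several codons tie, the original codon is kept if it is among them.
    """
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino = CODON_TABLE[codon]
        synonyms = [c for c, a in CODON_TABLE.items() if a == amino]
        # Rank by adenosine count first, then prefer the original codon.
        out.append(max(synonyms, key=lambda c: (c.count("A"), c == codon)))
    return "".join(out)
```

Applied to `CCGAAGCGT` (Pro-Lys-Arg, one adenosine in nine bases), the pass yields `CCAAAAAGA`, which encodes the same three residues with six adenosines.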
  • the vector further can comprise, after the adenosine-rich region, a unique restriction endonuclease site, i.e., a site not found in the DNA encoding the native protein. It is preferable that more than one restriction endonuclease site be utilized. Such sites can be especially useful for such uses as adding, by known recombinant methods, nucleotide sequences encoding additional antigenic epitopes. Such constructs would be useful in producing antigen having multiple epitopes.
  • the protein encoded by the vector can then act as a carrier of additional antigenic epitopes.
  • the hepatitis C virus nucleocapsid protein described herein contains a set of unique restriction endonuclease recognition sites not present in the native sequence.
  • this sequence can be modified by standard molecular cloning techniques utilizing the additional restriction endonuclease sites to carry additional antigenic epitopes for, for example, hepatitis B virus, other hepatitis C virus proteins, or human immunodeficiency virus.
  • proteins are produced by the method herein, they can be used, for example, as immunodiagnostic reagents for the detection of antibodies to the epitopes they carry, as taught herein.
  • Such proteins can also be used, for example, as vaccines by administering the proteins to subjects by standard methods.
  • the invention provides the vector in a cell which can express the protein encoded by the vector.
  • Preferable cells are prokaryotic cells, as such cells particularly utilize the SD and the hairpin structure in translation initiation. Even more preferable are E. coli cells, many examples of which are known in the art and publicly available.
  • the coding sequence of the vector can comprise hepatitis C virus nucleocapsid protein.
  • a vector is exemplified by a vector having a coding sequence consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 14.
  • This vector provides a sequence encoding hepatitis C virus nucleocapsid protein in which the native nucleotide coding sequence has been modified without altering the amino acid sequence.
  • the modifications include a region just 3' to the ATG start codon that, in the RNA molecule transcribed from the vector, can hybridize to the SD portion of the 5' untranslated region to form a hairpin loop.
  • the modifications also include adding additional adenosine residues just 3' of the above hairpin loop-forming region, and further include the addition of unique restriction endonuclease recognition sites to the coding region that are not present in the native gene.
  • the vector can comprise a unique portion of SEQ ID NO: 14.
  • a unique portion of a vector includes any portion having a nucleotide sequence that substantially does not occur in known nucleic acids.
  • Another such vector is exemplified by a vector consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 15, which provides the HCV coding sequence of SEQ ID NO: 14 that has been inserted 3' to sequences containing the T7 gene 10 promoter, a transcription start region, and a Shine-Dalgarno ribosome binding region, in order.
  • Unique restriction endonuclease sites have been placed on either side of the SD such that the nucleotide sequence can be altered if desired.
  • the SD can be removed and replaced using standard techniques with the restriction enzymes EcoRV and NdeI.
  • restriction endonuclease recognition site is located 5' of the promoter, and allows for alteration in the promoter of the construct, also by standard cloning techniques with restriction enzymes. Additionally, the vector can comprise a unique portion of SEQ ID NO: 15.
  • the invention also provides the vector consisting essentially of the nucleotides set forth in SEQ ID NO: 15 in a cell which can express the protein encoded by the vector.
  • Preferable cells are prokaryotic cells; even more preferable cells are E. coli cells.
  • nucleic acid which selectively hybridizes to a vector comprising the nucleotides in the sequence set forth in SEQ ID NO: 15.
  • a nucleic acid which "selectively hybridizes" to a vector means that hybridization is based upon complementarity of sequences rather than random non-specific hybridization; thus, it means that the sequences of the nucleic acid utilized for hybridization are unique to the target vector sequence, such that the target sequence can be detected.
  • a nucleic acid can comprise, for example, a DNA or RNA probe, a primer, or an RNA molecule, e.g., as encoded by SEQ ID NO: 15.
  • Hybridization conditions may vary, depending upon its purpose; however, the invention includes a nucleic acid that hybridizes under any conditions such that hybridization is specific.
  • the invention further provides a method of producing a protein comprising placing a cell containing the vector of the invention in protein expressing conditions and collecting the protein produced by the cell.
  • the vector can be that consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 15.
  • Protein expressing conditions include any conditions that allow a cell to express proteins encoded by nucleotides contained therein and are exemplified herein.
  • the protein can be collected by any chosen method, such as harvesting the cells and preparing a lysate.
  • the lysate can be analyzed by, for example, gel electrophoresis.
  • the protein can then be manipulated as desired, including additional purification, cleavage into smaller subunits, etc., as known in the art.
  • Oligonucleotides for the synthesis of the HCV protein C gene were designed especially for ETR, a method for the synthesis of long polynucleotide DNA fragments using short synthetic oligonucleotides as templates for the DNA polymerase reaction.
  • the ETR is a method for the synthesis of long polynucleotide DNA fragments using short synthetic oligonucleotides as templates for DNA polymerase.
  • the method is based on a cyclic mechanism involving three main components: (1) polymerase activity to synthesize double-stranded DNA, (2) enzymatic activity to create 3' terminal single-stranded regions, and (3) specifically designed synthetic deoxyoligonucleotides used as templates for the polymerase reaction.
  • the critical step is the enzymatic creation of a 3' terminal single-stranded region at the "growing point" of the synthesizing polynucleotide chain, which is used for the complementary binding of the next oligonucleotide as a template to continue the polymerase reaction.
  • oligonucleotide additions for each cycle is encoded in each 3' terminal sequence.
  • a specific sequence of nucleotides can anneal with a complementary sequence of nucleotides from the synthetic oligonucleotide.
  • Each cycle begins with the complementary binding of the 3' terminal region of a synthetic oligonucleotide with the 3' protruding region of double-stranded DNA (step 1 in Fig. 1). After annealing, a DNA polymerase reaction occurs to create a second strand of DNA using the short synthetic oligonucleotide as a template for DNA polymerase (step 2 in Fig. 1). After polymerization is complete, the double-stranded DNA has been extended by the length of the synthetic oligonucleotide.
  • ETR is a method for the synthesis of DNA based on a cyclic mechanism of combining the following deoxyoligonucleotides in any order:
  • Cyclic as used herein means a sequential hybridization in a regularly repeated order.
  • hybridization of deoxypolynucleotides occurs only in a specified controlled order.
  • DPNTs deoxypolynucleotides
  • a series of DPNTs, two or more, each of which encodes a unique segment of a desired long DPNT, are synthesized.
  • the sequence of each DPNT is selected to produce, when later cleaved by an enzyme, a unique 3' protrusion which will hybridize with only one other member of the DPNT series.
  • the DPNTs are combined, only two of the DPNTs initially hybridize.
  • a polymerase utilizes the two hybridized DPNTs to form double strands.
  • the appropriate enzyme then acts on the double-stranded DPNTs to form the unique 3' single-stranded protrusion.
  • the next DPNT which hybridizes only with this unique 3' protrusion then hybridizes.
  • the polymerase again directs the synthesis of double strands.
  • the enzyme again produces a unique 3' single protrusion which was previously synthesized to hybridize only with the next unique DPNT. The sequence is then repeated the desired number of times.
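The controlled order of hybridization can be pictured as following a chain of unique protrusions. The toy model below is an abstraction, not the laboratory procedure: each DPNT is reduced to the protrusion it binds and the protrusion left exposed after polymerase extension and enzyme cleavage. The function name, the tuple layout, and the four-letter protrusion strings are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def assembly_order(initial_protrusion: str,
                   dpnts: Dict[str, Tuple[str, Optional[str]]]) -> List[str]:
    """Return DPNT names in the order they would be incorporated.

    `dpnts` maps a name to (protrusion it binds, protrusion it exposes next).
    A ValueError is raised if two DPNTs bind the same protrusion, because
    the order would then no longer be controlled.
    """
    by_target = {}
    for name, (binds_to, exposes) in dpnts.items():
        if binds_to in by_target:
            raise ValueError(f"protrusion {binds_to!r} is not unique")
        by_target[binds_to] = (name, exposes)

    order, current = [], initial_protrusion
    while current in by_target:
        # Each entry is consumed once, so a circular design cannot loop forever.
        name, current = by_target.pop(current)
        order.append(name)
    return order


series = {
    "oligo3": ("GGCA", "TTAC"),
    "oligo2": ("ACGT", "GGCA"),
    "oligo4": ("TTAC", None),  # last member exposes no further protrusion
}
order = assembly_order("ACGT", series)  # ['oligo2', 'oligo3', 'oligo4']
```

Although the members are listed out of order, the unique protrusions alone fix the sequence in which they are used, which is the point of the cyclic design.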
  • ETR can also utilize hybridization and cleavage which proceeds in both directions, e.g., first hybridize DPNTs in the middle of the desired sequence with cleavage sites on both subsequently-formed ends.
  • the selection of DPNTs and enzymes follows the procedure of unidirectional synthesis, but enzyme sites on both ends of the double-stranded DNA are created.
  • a new series of DPNTs can be P led, each having a 5' sequence which, when in double-stranded form, can be enzymatically treated to form a unique 3' single-stranded protrusion for selective cyclic hybridization with another unique single-stranded DPNT of the series.
  • This procedure can be repeated many times.
  • the number of DPNTs in the reaction is only limited by undesired interference of hybridization. This can be avoided by creating unique 3' protrusions and hybridizing DPNTs which have minimal sequence similarity. Very long DPNTs, including genes and entire genomes, can thereby be synthesized by this method.
  • the method works so long as a unique 3' single-stranded protrusion is formed by an enzymatically-treated hybridized unique DPNT.
  • By "unique" is meant a nucleotide sequence on one DPNT which is absent on another DPNT so that selective hybridization can occur.
  • the number of unique nucleotides necessary for selective hybridization depends on hybridization conditions. For example, for a 3' protrusion of four nucleotides, the optimal temperature of the reaction is about 37°C. This optimal temperature may be different if a different polymerase is utilized in the synthesis, because different polymerases have different affinities to complementary complexes.
  • Thermostable enzymes also have a rather high affinity to such complexes.
  • a longer 3' protrusion should be more reactive and more specific in hybridization and utilize a higher annealing temperature.
  • the single-stranded region must be of a size to avoid being involved in secondary structure formation.
  • This region, to be effective in hybridization should be represented in a single-stranded form at the reaction temperature.
  • thermostable enzymes can be more effective in ETR because a higher reaction temperature can be utilized.
  • very effective single-stranded terminal regions can be about 7-9 nucleotides long. For such lengths it is routine to find conditions to maintain single-stranded form.
  • Specific complementary complexes between DPNTs can be effectively organized at higher temperatures, which decreases the possibility of improper complex formation.
  • the optimal temperature for the 7-9 nucleotide 3' protrusion may be around 55-65°C, the optimal temperature for the activity of thermostable polymerases.
  • a preferred range of 3' protrusion length is about 3-12 nucleotides. Longer protrusions can be made and routinely tested by the methods described in the Experimental section to optimize length and conditions for a particular system.
  • the precise 5' sequence of a member of the series will depend on the desired sequence for the ultimate DNA and the type of enzyme utilized to form the protrusion. Thus, once an ultimate desired sequence is selected, a 5' sequence is synthesized which corresponds to the desired sequence and which will either be cleaved or exposed such that the desired sequences remain and the undesired sequences, if any, are removed prior to hybridization of the next member of the series. For example, if a restriction endonuclease is utilized, it must cleave in such a way that unique sequences for each member of the series to be hybridized are produced.
  • BstXI as described in detail in the Experimental section, provides one example of such a restriction endonuclease because the endonuclease allows for four unique nucleotides to be synthesized in each member of the series which remains after cleavage.
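The four-nucleotide protrusion left by BstXI can be read directly from a sequence. The sketch below relies on the enzyme's published recognition pattern, CCANNNNN^NTGG, which leaves a 4-nt 3' overhang; that cut position comes from general knowledge of the enzyme rather than from this text, and the function name and example sequence are illustrative.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping sites are also reported.
_BSTXI = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def bstxi_overhangs(seq: str) -> List[Tuple[int, str]]:
    """Return (site start index, 4-nt 3' overhang) for each BstXI site.

    With the top strand cut after the eighth base of the site and the
    bottom strand cut after the fourth, bases 5 to 8 of the site are left
    single-stranded; these are the four freely chosen nucleotides.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)[4:8]) for m in _BSTXI.finditer(seq)]


sites = bstxi_overhangs("TTCCAGACGTCTGGAA")  # [(2, 'ACGT')]
```

Because the fixed part of the site (CCA...TGG) is its own reverse complement, scanning one strand finds every site; the overhang reported is the top-strand reading.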
  • the members of the series of DPNTs must be synthesized if a restriction endonuclease is utilized, for example with a DNA synthesizer. Since the DPNT which starts the hybridization can hybridize directly with the second DPNT, it is not affected by the enzymatic treatment. Therefore, the first unique DPNT can be obtained, if desired, by means other than synthesis and can be single- or double-stranded.
  • the DPNT can be a fragment excised from natural DNA, e.g. plasmid, phage genome, or viral genome by restriction endonucleases.
  • the fragment can be obtained by specific amplification using PCR.
  • PCR fragments are more suitable because terminal sequences of the amplified fragment can be easily modified with primers used for amplification with the introduction of desirable nucleotide modifications, including artificially synthesized non-natural derivatives of nucleotides. Any suitable number of nucleotides sufficient for efficient hybridization under the selected conditions can be utilized for this initial hybridization.
  • This unique synthesis-initiating DPNT, which begins synthesis by providing a template for hybridization of the second DPNT from the series, can be bound to a solid support for improved efficiency.
  • the solid phase allows for the efficient separation of the synthesized DNA from other components of the reaction.
  • Different supports can be applied in the method.
  • supports can be magnetic latex beads or magnetic controlled pore glass beads. Being attached to the first DPNT, these beads allow the desirable product to be magnetically separated from the reaction mixture. Binding the DPNT to the beads can be accomplished by a variety of known methods, for example carbodiimide treatment (Gilham, Biochemistry 7:2809-2813 (1968); Mizutani and Tachibana, J.
  • the DPNT attached to the solid phase is the primer for synthesis of the whole DNA molecule. Synthesis can be accomplished by addition of sets of compatible oligonucleotides together with enzymes. After the appropriate incubation time, unbound components of the method can be washed out and the reaction can be repeated again to improve the efficiency of each oligonucleotide to be utilized as a template.
  • To be used efficiently for the synthesis, the solid phase can contain pores with sufficient room for synthesis of the long DNA molecules.
  • the solid phase can be composed of material that cannot non-specifically bind any undesired components of the reaction.
  • One way to solve the problem is to use controlled pore glass beads appropriate for long DNA molecules.
  • the initial primer can be attached to the beads through a long connector. The role of the connector is to position the primer from the surface of the solid support at a desirable distance.
  • Suitable polymerases may include Taq polymerase, the large fragment of E. coli DNA polymerase I, and the DNA polymerase of T7 phage.
  • the optimal conditions of the polymerization vary with the type of polymerase used. Likewise, the optimal polymerase can vary with the conditions necessary for the synthesis (Bej et al., Crit. Rev. Biochem. Mol. Biol. 26(3-4):301-334 (1991); Tabor and Richardson, Proc. Natl. Acad. Sci. USA, 86:4076-4080 (1989); Petruska et al., Proc. Natl. Acad.
  • restriction endonuclease BstXI, an enzyme capable of removing several nucleotides from the 5' terminus.
  • This restriction endonuclease is compatible with ETR for the following reasons: (1) a 3' protrusion is produced, (2) the single-stranded 3' protrusion does not have any sequence restrictions, and (3) after cleavage the restriction site cannot be restored by the interaction of the next synthetic oligonucleotides.
  • Any enzyme can be utilized which can form a unique 3' protrusion from double-stranded DNA.
  • Presently known enzymes useful in the method include not only BstXI, as used in the examples, but also 5' exonucleases specific for double-stranded DNA, such as the exonuclease of T7 and lambda phage, and an enzyme of DNA recombination, such as recA.
  • oligonucleotides to be used in the reaction as templates for polymerase reaction are chemically modified at a defined point to prevent T7 exonucleases from jumping over the modified nucleotides.
  • oligonucleotide phosphorodithioates can be utilized using methods described in Caruthers, Nucl. Acids Symp. Ser., 21:119-120 (1989).
  • polymerase first fills gaps in hybridized DPNTs.
  • the exonuclease of T7 starts cutting double-stranded DNA beginning from the 5' end (the opposite 5' end should be modified or attached to solid phase to prevent cleavage from that end). This reaction proceeds until it reaches the modified position, where it stops. The 3' protrusion created by the exonuclease activity can then be used for hybridization with the next oligonucleotide in the cycle reaction.
  • T7 is well known to have a relatively strong preference for double-stranded DNA (Kerr and Sadowski, J. Biol. Chem., 247:311-318 (1972); Thomas and Olivera, J. Biol. Chem., 253:424-429 (1978); Shon et al., J.
  • The main advantage of these exonucleases is the possibility of creating a single-stranded 3' protrusion of any necessary size to allow the use of higher temperatures in the reaction. Additionally, because the exonuclease recognizes any blunt end, its use eliminates the need to synthesize DPNT having a restriction site when polymerized to double-stranded form.
  • ETR can also be performed utilizing an enzyme of DNA recombination. It is known that recA can replace one strand of double-stranded DNA, in a strong sequence-specific manner, with a single-stranded DNA from solution, creating D-loop structures (Cox and Lehman, Ann. Rev. Biochem., 56:229-262 (1987); Tadi-Laskowski et al., Nucleic Acids Res., 16:8157-8169 (1988); Hahn et al., J. Biol. Chem., 263:7431-7436 (1988)). In this modification of the method, DPNTs are combined in one reaction with polymerase and recA.
  • Polymerase fills single-stranded gaps and recA replaces the terminal region of one of the strands of double-stranded DNA with a single-stranded DPNT from solution which provides the polymerase with a new template.
  • An advantage of the reaction is strong specificity of the hybridization which is due to enzymatic support.
  • the hybridization is the only step without enzymatic support. While restriction endonucleases and exonucleases can only create a 3' protrusion, recA can create a single-stranded region at the ends of double-stranded DNA and anneal oligonucleotides to the 3' protrusion.
  • the nucleocapsid protein of HCV (also referred to as "core protein" or "protein C") was expressed in E. coli as an authentic non-hybrid protein.
  • a set of unique restriction endonuclease recognition sites was introduced, including SmaI, AvrII, Eco47III, AccI, DdeI, BstEII, SacII, ClaI, BbeI, NcoI, XbaI, Tth111I, and AsuI.
  • the recognition and cleavage sites of all restriction enzymes listed herein are known in the art and published. Since this gene was to be expressed in bacterial and mammalian cells, codon usage was of no concern.
  • sequence for the synthesis of oligonucleotides was taken from published data (positions 330-917 nt; Kato, N. et al., Proc. Natl. Acad. Sci. USA, 87:9524-9528 (1990)), and modified for insertion of restriction endonuclease sites or modified to avoid adverse complementary interactions.
  • sequence for the synthesis of the gene encoding the nucleocapsid protein of hepatitis C virus was obtained from published data (Kato et al., 1990).
  • the sequence of the mRNA 5'-untranslated region was designed utilizing published recommendations (Khudyakov, 1985; Khudyakov et al., 1987).
  • the secondary structures of the oligonucleotides were predicted, as described (Freier et al., 1986; Jaeger et al., 1989), and complementary interactions between oligonucleotides for each set of oligonucleotides were taken into consideration. Whenever the 3'-terminal region of the oligonucleotides was potentially involved in undesirable complementary interactions, the primary structure of the oligonucleotides was changed without influencing the protein sequence.
  • a synthetic promoter having the sequence of the strong promoter of the bacteriophage T7 gene 10 (Tabor, S. et al., Proc. Natl. Acad. Sci. USA, 82:1074-1078 (1985)) was utilized to provide efficient transcription.
  • HCV nucleocapsid protein was divided into 3 fragments. Each fragment was synthesized separately by ETR. The first fragment was synthesized from 5 deoxyoligonucleotides (SEQ ID NOS: 1-5) (Fig. 2), the second fragment from 3 deoxyoligonucleotides (SEQ ID NOS: 6-8) (Fig. 3), and the third from 4 deoxyoligonucleotides (SEQ ID NOS: 9-12) (Fig. 4). All reactions were carried out as described below.
  • Deoxyoligonucleotides were synthesized using an automatic synthesizer (Applied Biosystems Model 480A, Foster City, California) and purified by 10% polyacrylamide gel electrophoresis (PAGE) containing 7 M urea in TBE buffer (0.045 M Tris-borate, 0.001 M EDTA, pH 8.3). Oligonucleotides were recovered from the gel by electroelution.
  • ETR: Exchangeable Template Reaction
  • one of the oligodeoxynucleotides without a BstXI site was radiolabeled with [γ-32P]ATP in 50 mM Tris-HCl, pH 7.6, containing 10 mM MgCl2 and 5 mM DTT, with the addition of 10 µCi [γ-32P]ATP (5000 Ci/mmole, New England Nuclear) and 10-20 pmol of oligonucleotide.
  • the products of the ETR were analyzed by electrophoresis in 8% PAGE containing 8M urea, and analyzed by autoradiography.
  • ETR fragments were recovered from PAGE by incubating pieces of the gel in 0.15 M NaCl and 1 mM EDTA for 30 min at 65°C. After incubation, the DNA was precipitated with ethanol. The pellet was dissolved in 20 µl of TE buffer and 1 µl of the DNA solution was used to amplify the ETR fragments by the PCR. The terminal oligonucleotides used for synthesis of the fragments were used as primers for the PCR. Briefly, 20-50 pmol of each primer was added to the reaction mixture and the PCR consisted of 30 cycles as follows: 94°C for 45 sec, 65°C for 20 sec, and 72°C for 1 min.
  • the PCR-amplified ETR fragments were treated with the appropriate restriction endonuclease, with the recognition sites located at the termini of each fragment, and then ligated in 10 µl of a solution containing all three fragments, 50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 1 mM DTT, 1 mM ATP, and 10 units of DNA ligase (Pharmacia, Piscataway, NJ) for 6 h.
  • One µl of the ligase reaction was used to amplify the fragment by the PCR to provide the full length DNA, using the PCR conditions described above and the two terminal oligonucleotides as primers.
  • Amplified full length DNA was recovered from agarose gel using a DEAE procedure (Dretzen et al.) and treated with restriction endonucleases to confirm the structure of the synthesized gene.
  • the nucleotide sequence of this construct is set forth in SEQ ID NO: 14.
  • Plasmid Construction. A vector, designated pTS7, was constructed to prepare a plasmid for expression of the HCV nucleocapsid gene. Two oligonucleotides containing the gene 10 promoter from T7 bacteriophage and encoding the 5'-untranslated region of the mRNA were annealed at 20°C and inserted into pBR322 between the EcoRI and BamHI sites. A BamHI-ScaI fragment of the plasmid containing this synthetic sequence and the N-terminal part of the Amp gene was introduced into the plasmid pSP65 between the BamHI site localized in the multiple cloning region and the ScaI site of the Amp gene.
  • A plasmid containing the synthetic core gene under the control of a T7 phage promoter was designated pTSC6178-7.
  • the sequence of the cloned HCV synthetic gene was verified using the polymerase chain terminator method (Sanger et al., 1977). This sequence is set forth in SEQ ID NO: 15.
  • For a hybrid protein composed of beta-galactosidase and a portion of the HCV nucleocapsid protein, two plasmids were used.
  • a DNA fragment containing the T7 promoter and a piece of the synthetic gene encoding amino acids 1-158 of the nucleocapsid protein was extracted from the plasmid pTSC6178-7 and inserted by standard techniques into the plasmid pCL121 at the BamHI site (Khudyakov et al., 1987).
  • the resulting plasmid pFC105-301 encodes a hybrid protein composed of amino acids 1-158 of the HCV nucleocapsid and beta-galactosidase under the control of the T7 promoter.
  • E. coli BL21(DE3) (Studier & Moffatt, 1986) competent cells were prepared (Hanahan, 1983) and transformed with pTSC6178-7. Cells were grown in LB medium until the optical density at 600 nm reached 0.6, and then the T7 promoter was activated by the addition of IPTG at a final concentration of 1 mM. After 4-6 h fermentation at 37°C, the cells were harvested and a lysate was prepared (Sambrook et al., 1989). Aliquots of the lysate were analyzed by Western blot (Harlow & Lane, 1988).
  • Nitrocellulose filters containing immobilized proteins were incubated at 20°C for 2 h with human sera diluted 50 times in 50 mM Tris-HCl, pH 7.5, containing 0.5% Triton X-100, 1% gelatin, and 1% bovine serum albumin (NET). After washing with NET three times, the filters were incubated for 1 h with affinity-purified anti-human IgG antibodies coupled to horseradish peroxidase (TAGO, Burlingame, California) diluted 1:5000 in NET. After washing, diaminobenzidine (Sigma, St. Louis, Missouri) and hydrogen peroxide were used to develop the reaction.
  • Sera. Sera were obtained from the collection reposited at the D. I.
  • Buffer NEB3 (New England Biolabs, Beverly, MA; 50 mM Tris-HCl, pH 7.9, 10 mM MgCl2, 100 mM NaCl, 1 mM DTT) was optimal for the BstXI reaction and Taq buffer (10 mM Tris-HCl, pH 7.8, 50 mM KCl, 1 mM DTT, and 1.5 mM, 3 mM, or 5 mM MgCl2) was optimal for Taq DNA polymerase.
  • the optimal relative concentrations of the oligodeoxynucleotides were 1:4:20:40:60.
  • the rate of ETR changed as well.
  • At a 1:4 relative concentration of oligos 1 and 2, a full-size fragment was observed after a 3 h incubation period at 37°C in NEB2.
  • a fragment was synthesized in detectable amounts after only 30 min (data not shown).
  • Plasmid pTSC6178-7 was unstable in transformed bacteria growing with or without the IPTG inducer. Modified derivatives of pTSC6178-7 were often found in bacteria after 2-3 passages on solid LB agar or in liquid medium. Because of this problem, the cells were transformed each time the HCV protein was to be expressed, or transformed cells were stored frozen and used as aliquots. In cells transformed with plasmid pTSC6178-7, a 27 kDa protein identified by Western blot analysis effectively bound an antibody from sera containing serologic markers specific for HCV infection. Sera without markers of HCV infection did not immunoreact with this protein.
  • a protein of 135 kDa was expected to be reactive with anti-HCV sera by Western Blot.
  • the major protein interacting with anti-HCV antibody had a m.w. of 23 kDa. Only a very weak band could be identified as reactive with HCV-specific sera in the position corresponding to a protein with the expected m.w. of 135 kDa.
  • the 23 kDa protein was very immunoreactive and represented 5-7% of the total E. coli proteins.
  • this 135 kDa hybrid protein undergoes cleavage by intracellular proteases, after which the 23 kDa protein fragment of the HCV nucleocapsid polypeptide was very stable and immunoreactive.
  • This protein represents a good source of immunoreactive recombinant HCV core protein for the development of a diagnostic test system for the detection of anti-HCV activity in human sera.
  • Gold, L., et al. "Translation initiation." In Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology, 2:1302-1307 (Neidhardt, F.C., et al., eds.) American Society for Microbiology, Washington, DC (1987).
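The BstXI cleavage geometry referred to above (a 3' protrusion of four freely chosen nucleotides) can be illustrated with a short sketch. It assumes the published recognition pattern for BstXI, CCA followed by six arbitrary bases and TGG, with the top strand cut after the eighth base; that pattern comes from standard enzyme data, not from this text, and the function name `bstxi_overhangs` and the demo sequence are hypothetical.

```python
import re

# Assumed BstXI recognition pattern: CCA, six arbitrary bases, then TGG.
# The lookahead lets overlapping sites be reported as well.
BSTXI = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def bstxi_overhangs(seq):
    """Return (site_start, overhang) for each BstXI site on the given strand.

    The top strand is cut after the eighth base of the 12-base site and the
    bottom strand four bases earlier, so bases 5-8 of the site are left as a
    single-stranded 3' protrusion on the upstream fragment.
    """
    seq = seq.upper()
    hits = []
    for match in BSTXI.finditer(seq):
        start = match.start(1)
        hits.append((start, seq[start + 4:start + 8]))
    return hits
```

For the hypothetical sequence `GGATCCAGCTAACTGGAATT`, the single site starts at index 4 and the reported protrusion is `CTAA`. Because those four positions are unconstrained by the enzyme, each member of a series can carry its own protrusion sequence.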

Abstract

The invention provides a vector for efficient expression of a synthetically produced protein coding nucleotide sequence comprising a vector comprising in sequential order 5' to 3': a) a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site and b) a synthetically produced protein coding nucleotide sequence having i) a translation start codon; ii) a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and iii) an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop. In particular is provided a vector providing a synthetic gene encoding the hepatitis C virus nucleocapsid protein that can be efficiently expressed in prokaryotic cells.

Description

PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI
This application is a continuation-in-part of Serial No. 07/849,294 filed on March 10, 1992. The contents of Serial No. 07/849,294 are hereby incorporated in their entirety by reference.
Various references are cited herein. These references are hereby incorporated by reference into the application to more fully describe the state of the art to which the invention pertains.
BACKGROUND OF THE INVENTION
FIELD OF INVENTION
The present invention relates to a vector for efficient expression of a protein encoded by the nucleotides in a synthetic coding sequence inserted therein. In particular is provided a vector comprising a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
BACKGROUND ART
Hepatitis C virus (HCV) is a recently identified agent responsible for most cases of post-transfusion non-A, non-B hepatitis (NANBH) worldwide (Plagemann, 1991). The virus contains a positive strand RNA genome comprising about 9,400 nucleotides (nt) that encodes a polyprotein of more than 3,000 amino acids. The genetic organization of the HCV genome was recently elucidated (Miller & Purcell, 1990; Takeuchi et al., 1990; Choo et al., 1991; Hijikata et al., 1991; Takamizawa et al., 1991). The N-terminal region of the HCV polyprotein is processed into structural proteins C (a nucleocapsid protein) and E1 and E2/NS1 (envelope proteins). Although the precise points for processing of the polyprotein have not been confirmed, the polyprotein contains four nonstructural proteins: NS2, NS3, NS4, and NS5. The HCV nucleocapsid protein C has been cloned and expressed in bacteria (Muraiso et al., 1990; ... et al., 1990; Muraiso et al., 1991; Takahashi et al., 1992), and in eukaryotic cells (Harada et al., 1991). One of the main problems for the efficient expression of non-hybrid proteins in bacteria is a poor understanding of the mechanism of how the mRNA structure may influence the efficiency of translational initiation (Gold, L. et al., American Society for Microbiology, Washington, D.C. (Neidhardt, F.C. et al., eds), 2:1302-1307 (1987)). Because HCV circulates in low titers, a source of DNA using chemical-enzymatic synthesis would be of significant benefit.
Furthermore, though the sequence of HCV nucleocapsid protein is known, there is no abundant source of the protein for, for example, immunodiagnostic purposes. Additionally, there is currently no useful means for manipulating the amino acid sequence of the protein for such uses as adding additional antigenic epitopes for immunodiagnostic purposes. While it is known to obtain expression of eukaryotic genes in prokaryotes, especially E. coli, current vectors for the expression of proteins do not provide for efficient expression of natural proteins. Some provide for the production of hybrid fusion proteins, wherein the vectors provide both 5' untranslated regions and a translation start codon followed by some coding sequences. The DNA encoding the desired protein would then be placed, without its translation start codon, after the vector's coding sequences such that a hybrid protein is produced from the construct containing amino acids coded by both the vector sequences and the DNA encoding the desired protein. Other vectors currently available (for example, Pharmacia vectors pKK223-3 and pDR540) have strong promoters and are therefore efficient initiators of transcription, but do not provide for efficient translation of coding sequences other than the sequence native to the specific promoter in the vector. Thus, a vector that provided efficient expression in prokaryotes, such as E. coli, of a protein encoded therein would be very helpful in providing a source of any given protein or peptide sequence. The technology for the functional expression of DNA fragments in heterologous genetic systems depends to a great extent on an accessible source of DNA.
There are two ways to obtain genetic material for genetic engineering manipulations: (1) isolation and purification of DNA in an appropriate form from natural sources (this technique is well-elaborated and constitutes the backbone of genetic engineering and molecular biology), or (2) the synthesis of DNA using various chemical-enzymatic approaches, a discipline that has been intensively researched over the last 15 years. The former approach is limited to naturally-occurring sequences which do not easily lend themselves to specific modification. The latter approach is much more complicated and labor intensive. However, the chemical-enzymatic approach has many attractive features including the possibility of preparing, without any significant limitations, any desirable DNA sequence.
The use of natural sources of DNA for expression of proteins has been greatly facilitated by polymerase chain reaction (PCR). Compared with conventional synthetic approaches for preparing DNA, PCR is much less complicated and labor intensive. Conventional DNA synthesis from oligonucleotides is fundamentally different from DNA synthesis that occurs within organisms. The natural process uses a pre-existing DNA template as a substrate for multiple enzymatic activities involved in DNA replication. However, to synthesize a gene in vitro of a desired sequence, a pre-existing template does not exist, and one must rely on the use of relatively short single-stranded oligonucleotides (20 to 100 nucleotides). The use of several single-stranded oligonucleotides for the assembly of polynucleotides may result in many problems associated with complementary adverse interactions with each other. These factors are major contributors in restricting the efficiency and use of synthetic approaches for the assembly of DNA.
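One way to make the "complementary adverse interactions" between oligonucleotides concrete is to measure the longest stretch over which two oligonucleotides could pair perfectly. The sketch below is only an illustration under simplifying assumptions (Watson-Crick pairs only, no thermodynamic model); the function names are hypothetical and this is not the prediction method cited in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(oligo_a, oligo_b):
    """Length of the longest stretch of oligo_a that can pair perfectly,
    in antiparallel orientation, with a stretch of oligo_b.

    This equals the longest common substring of oligo_a and the reverse
    complement of oligo_b, computed by dynamic programming.
    """
    a = oligo_a.upper()
    b = reverse_complement(oligo_b)
    best = 0
    prev = [0] * (len(b) + 1)
    for base_a in a:
        cur = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

Running such a check over every pair in a set of oligonucleotides flags pairs whose unintended overlap is long enough to compete with the intended hybridization; the threshold to apply is a design decision not specified here.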
Two general methods currently exist for the synthetic assembly of oligonucleotides into long DNA fragments. First, oligonucleotides covering the entire sequence to be synthesized are first allowed to anneal, and then the nicks are repaired with DNA ligase. The fragment is then cloned directly or cloned after amplification by the PCR. The DNA is subsequently used for in vitro assembly into longer sequences. This approach is very sensitive to the secondary structure of oligonucleotides, which interferes with the synthesis. Therefore, the approach has low efficiency and is not reliable for assembly of long DNA fragments.
The second general method for gene synthesis utilizes polymerase to fill in single-stranded gaps in the annealed pairs of oligonucleotides. After the polymerase reaction, single-stranded regions of oligonucleotides become double-stranded and after digestion with restriction endonuclease can be cloned directly or used for further assembly of longer sequences by ligating different double-stranded fragments. This approach is relatively independent of the secondary structure of oligonucleotides; however, after the polymerase reaction, each segment must be cloned. The cloning step significantly delays the synthesis of long DNA fragments and greatly decreases the efficiency of the approach. Additionally, this approach can be used for only relatively small DNA fragments and requires restriction endonuclease recognition sites to be introduced into the sequence.
Thus, the major essential disadvantages of existing approaches for the synthesis of DNA are low efficacy, and the requirement that synthesized DNA must be amplified by cloning procedures, or by the PCR, before use. The main problem with existing approaches is that the long polynucleotide must be assembled from relatively short oligonucleotides utilizing either inefficient chemical or enzymatic synthesis. The use of short oligonucleotides for the synthesis of long polynucleotides can cause many problems due to multiple interactions of complementary bases, as well as problems related to adverse secondary structure of oligonucleotides. These problems lower the efficiency and widespread use of existing synthetic approaches. Therefore, there exists a great need for an efficient means to make synthetic DNA of any desired sequence. Such a method could be universally applied. For example, the method could be used to efficiently make an array of DNA having specific substitutions in a known sequence which are expressed and screened for improved function. The parent application (Serial No. 07/849,294) provides an efficient and powerful method for the synthesis of DNA. The method is generally referred to as the Exchangeable Template Reaction (ETR).
Furthermore, there exists a need for prokaryotic expression vectors that enable the efficient production of a protein encoded by the coding DNA therein. Such vectors can be constructed, for example, by the present ETR method for making synthetic DNA and can allow for efficient production of any desirable protein or peptide product. The present invention satisfies this need by providing a vector that allows efficient expression of coding DNA therein by providing a construct that, after the DNA is transcribed into RNA, provides an mRNA that is efficiently translated by prokaryotic cells into a protein product.
Finally, given the prevalence of HCV infection discussed above, there exists an urgent need for an abundant source of HCV nucleocapsid protein for such uses as immunodiagnostics. The present invention well satisfies this need by providing not only a vector that expresses HCV nucleocapsid protein at high levels but also a vector wherein the HCV coding sequences, and thus antigenic epitopes, can be easily manipulated.
SUMMARY OF THE INVENTION
The present invention provides a vector comprising, in sequential order 5' to 3': a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site and a synthetically produced protein coding nucleotide sequence having a translation start codon; a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop. Additionally, the invention provides the above vector further comprising, after the adenosine-containing region, a restriction endonuclease site not found in the DNA encoding the native protein.
The instant invention also provides a vector comprising a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
The present invention further provides a method of producing a protein comprising placing a cell having a vector as described above in protein expressing conditions and collecting the protein produced by the cell, and in particular when the vector comprises a synthetic gene encoding the hepatitis C virus nucleocapsid protein.
BRIEF DESCRIPTION OF THE FIGURES
Fig. 1 is a schematic showing the general mechanism for the cyclic ETR.
Fig. 2 shows a schematic representation of the ETR. Deoxyoligonucleotides are shown as solid lines. Points represent DNA polymerase synthesized regions of the double-stranded fragment. The upper strand consists of Oligo1 (SEQ ID NO:1) and newly-synthesized sequences. The lower strand is composed of oligonucleotide sequences that remain after BstXI digestion and after synthesis of new sequences at the very 3' terminus of the strand. The order of the deoxyoligonucleotides involved in the reaction is indicated.
Fig. 3 shows a schematic representation of the ETR corresponding to the middle part of the HCV nucleocapsid gene.
Fig. 4 shows a schematic representation of the ETR corresponding to the 3' terminal region of the HCV nucleocapsid gene.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a vector comprising, in sequential order 5' to 3': a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site; and a synthetically produced protein coding nucleotide sequence having a translation start codon; a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop. Such a vector provides efficient expression of the protein encoded by the coding nucleotide sequence. The vector provides efficient expression of the protein encoded by the coding nucleotide sequence by allowing the manipulation of 5' untranslated region sequences, such that the mRNA produced will form a hairpin loop and thus a very efficient ribosome binding site such that sufficient amounts of protein can be produced for a desired use. For example, it is herein exemplified that sufficient protein is obtained by the present method to immunoreact with sera from HCV-positive donors and be useful as a diagnostic.
The secondary structure of the mRNA around the initiator codon of the synthetic foreign protein coding sequence is designed as a hairpin with an ATG codon exposed in the loop and the Shine-Dalgarno (SD) sequence located in the double-stranded region. This hairpin structure can be easily destroyed by the interaction of the SD sequence with the 3'-end of the ribosomal 16S rRNA, resulting in the ATG start codon becoming exposed in the center of a large single-stranded region. Another feature of this translation initiation region is an adenosine-containing (A-containing) sequence downstream from the hairpin. This sequence has a low potential for secondary structure formation and separates the initiation region from the rest of the mRNA, thus decreasing the influence of the downstream sequence on the secondary structure within the translation initiation zone. Presence of the A-containing sequence makes the prediction of the hairpin more probable. By "adenosine-containing" is meant enough adenosine, and sufficiently C and G-poor, to sufficiently prevent secondary structure and allow for efficient translation. Generally, a sequence would contain greater than 50% adenosine residues to be considered "adenosine-containing." Adenosine-containing regions can include adenosine-rich regions having sufficient adenosine residues to sufficiently prevent secondary structure and allow for efficient translation.
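The "greater than 50% adenosine" guideline above lends itself to a simple check. The following is a minimal sketch; the function names and the default threshold parameter are illustrative assumptions, and the check ignores the additional requirement that the region be C- and G-poor.

```python
def adenosine_fraction(region):
    """Fraction of residues in the region that are adenosine."""
    region = region.upper()
    if not region:
        raise ValueError("region must not be empty")
    return region.count("A") / len(region)


def is_adenosine_containing(region, min_fraction=0.5):
    """True when more than min_fraction of the residues are adenosine."""
    return adenosine_fraction(region) > min_fraction
```

For example, the hypothetical region `AAATAAGA` has six adenosines in eight residues (0.75) and passes, whereas `AACC` sits exactly at 0.5 and does not, because the guideline asks for more than half.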
By "Shine-Dalgarno sequence" is meant any of many variations of a sequence usually found in natural prokaryotic genes about 2 to about 15 nucleotides 5' of the ATG translation start codon, termed in the art the "Shine-Dalgarno sequence" and generally accepted as a binding site on the mRNA molecule for the ribosome, thereby facilitating translation of the mRNA. The Shine-Dalgarno sequence (SD) in the vector of the present invention is preferably about 2 to about 15 nucleotides 5' of the ATG, and even more preferably about 5 to about 7 nucleotides upstream of the ATG. The SD can be any variant of the consensus sequence AGGAGGU, listed herein as SEQ ID NO: 13, that retains the characteristic of facilitating translation of the vector. The SD is preferably from about 3 nucleotides to about 9 nucleotides in length. Many such variants are known in the art; others can be tested by known methods in combination with the teachings herein. The SD of the invention has at least one, and preferably more than one, unique restriction endonuclease site within or around the SD, such that the SD sequences, or a portion thereof, can be replaced, using recombinant techniques, with more desirable SD sequences, depending upon the coding sequences to which complementarity is desired for formation of a hairpin loop. By "unique" is meant that the restriction endonuclease site is not found elsewhere in the vector.
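The spacing rule above (an SD variant about 2 to about 15 nucleotides upstream of the ATG) can be sketched as a search for the longest fragment of the consensus in that window. This is an illustration only: treating "variants" as contiguous fragments of AGGAGGU at least three nucleotides long, and measuring the gap from the 3' end of the SD to the A of the ATG, are assumptions made here, and the function names are hypothetical.

```python
SD_CONSENSUS = "AGGAGGT"  # DNA form of the AGGAGGU consensus (SEQ ID NO: 13)


def consensus_fragments(min_len=3):
    """All contiguous fragments of the consensus, longest first."""
    n = len(SD_CONSENSUS)
    frags = {SD_CONSENSUS[i:j]
             for i in range(n)
             for j in range(i + min_len, n + 1)}
    return sorted(frags, key=lambda f: (-len(f), f))


def find_sd(seq, atg_index, min_gap=2, max_gap=15):
    """Return (sd_start, sd_text, gap) for the longest consensus fragment
    whose 3' end lies min_gap..max_gap nucleotides upstream of the ATG,
    or None when no such fragment exists."""
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    for frag in consensus_fragments():
        pos = seq.rfind(frag, 0, atg_index)
        while pos != -1:
            gap = atg_index - (pos + len(frag))
            if min_gap <= gap <= max_gap:
                return pos, frag, gap
            # Continue with occurrences that start further upstream.
            pos = seq.rfind(frag, 0, pos + len(frag) - 1)
    return None
```

With the hypothetical sequence `TTAAGGAGGTAAACATATGGCT` and the ATG at index 16, the full consensus is found at index 3 with a gap of 6 nucleotides, which falls in the preferred 5 to 7 range stated above.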
A "synthetically produced protein coding nucleotide sequence" can be produced, for example, by the exchangeable template reaction (ETR) taught herein or by any other method for assembling synthetic genes. Such coding nucleotide sequence includes first a translation start codon. Secondly, the coding nucleotide sequence includes a region about equidistant 3' of the translation start codon as the SD is 5' of the translation codon which selectively hybridizes with the SD to form a hairpin loop, such that the start codon is exposed in the center of the loop and translation is thereby efficiently initiated. Either the SD or the region within the coding sequence, or both, can be modified to create sequences that can selectively hybridize, as taught and exemplified herein.
By "selectively" hybridize is meant that one nucleic acid specifically hybridizes with its target (i.e., complementary) nucleic acid based upon complementarity between the two sequences, rather than random, non-specific hybridization. In other words, the sequence of one nucleic acid is uniquely complementary to its target hybridizing sequence, such that the two sequences, under standard hybridizing conditions or cellular translation conditions, can form a hairpin loop.
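Whether a region of the coding sequence can pair with the SD to form the stem of a hairpin can be estimated by counting Watson-Crick pairs between the two arms in antiparallel alignment. This is a minimal sketch under simplifying assumptions: both arms have the same length, the DNA alphabet is used, and only exact A-T and G-C pairs are counted (G-U wobble pairs and loop energetics are ignored). The arm sequences shown are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def stem_pairs(upstream: str, downstream: str) -> int:
    """Count Watson-Crick pairs between two equal-length arms.

    Both arms are written 5' to 3'. In a hairpin, the first base of the
    upstream arm pairs with the last base of the downstream arm.
    """
    if len(upstream) != len(downstream):
        raise ValueError("arms must have equal length in this sketch")
    return sum(
        PAIR[u] == d
        for u, d in zip(upstream.upper(), reversed(downstream.upper()))
    )


# A perfectly complementary downstream arm for a hypothetical SD arm.
sd_arm = "AGGAG"
coding_arm = reverse_complement(sd_arm)
print(coding_arm, stem_pairs(sd_arm, coding_arm))
```

A full design would instead use a free-energy based folding method, as the specification does; the pair count is only a quick screen.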
The Shine-Dalgarno sequence can be modified by substituting nucleotides or by adding or deleting nucleotides, so long as it selectively hybridizes with the complementary coding region to form a hairpin loop and maintains its ribosome binding capabilities, as discussed above. The complementary coding sequence can be modified utilizing the redundancy of the genetic code. Any nucleotide substitution can be made to this coding sequence that will not change the amino acid sequence encoded therein, as taught and exemplified herein. The vector also includes, after the coding region that selectively hybridizes with the SD, an adenosine-rich region of a length and number of adenosines sufficient to prevent or decrease secondary structure of the mRNA of the vector in the region immediately downstream of the hairpin loop. This can separate the translation initiation region from the rest of the mRNA, thus decreasing the influence of the downstream sequences on the secondary structure within the translation initiation zone.
By "immediately downstream" is meant at a distance 3' of the hairpin loop such that the influence of the sequences further downstream is decreased. To create this adenosine-rich region, the amino acids coded by the region of the vector are examined, and using the degeneracy of the genetic code, substitutions that can be made to replace non-adenosine nucleotides with adenosines are made, as taught and exemplified herein.
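The substitution step described above, replacing codons with synonymous codons that contain more adenosines, can be sketched as follows. The sketch assumes the standard genetic code and picks, for each codon, the synonym with the most A residues (keeping the original codon on ties). It ignores every other design constraint in the specification, such as hairpin complementarity and restriction sites; the input sequence is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def enrich_adenosine(cds: str) -> str:
    """Swap each codon for the synonymous codon with the most adenosines."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("length must be a multiple of three")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        # Prefer more A residues; on ties, prefer the original codon.
        out.append(max(synonyms, key=lambda c: (c.count("A"), c == codon)))
    return "".join(out)


original = "CCGAAGCGT"  # hypothetical stretch encoding Pro-Lys-Arg
enriched = enrich_adenosine(original)
print(enriched, translate(original) == translate(enriched))
```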
The vector further can comprise, after the adenosine-rich region, a unique restriction endonuclease site, i.e., a site not found in the DNA encoding the native protein. It is preferable that more than one restriction endonuclease site be utilized. Such sites can be especially useful for such uses as adding, by known recombinant methods, nucleotide sequences encoding additional antigenic epitopes. Such constructs would be useful in producing antigen having multiple epitopes. The protein encoded by the vector can then act as a carrier of additional antigenic epitopes. For example, the hepatitis C virus nucleocapsid protein described herein (SEQ ID NO: 14) contains a set of unique restriction endonuclease recognition sites not present in the native sequence. With these modifications, this sequence can be modified by standard molecular cloning techniques utilizing the additional restriction endonuclease sites to carry additional antigenic epitopes from, for example, hepatitis B virus, other hepatitis C virus proteins, or human immunodeficiency virus. When such proteins are produced by the method herein, they can be used, for example, as immunodiagnostic reagents for the detection of antibodies to the epitopes they carry, as taught herein. Such proteins can also be used, for example, as vaccines by administering the proteins to subjects by standard methods.
The invention provides the vector in a cell which can express the protein encoded by the vector. Preferable cells are prokaryotic cells, as such cells particularly utilize the SD and the hairpin structure in translation initiation. Even more preferable are E. coli cells, many examples of which are known in the art and publicly available.
The coding sequence of the vector can comprise hepatitis C virus nucleocapsid protein. Such a vector is exemplified by a vector having a coding sequence consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 14. This vector provides a sequence encoding hepatitis C virus nucleocapsid protein in which the native nucleotide coding sequence has been modified without altering the amino acid sequence. The modifications include a region just 3' to the ATG start codon that, in the RNA molecule transcribed from the vector, can hybridize to the SD portion of the 5' untranslated region to form a hairpin loop. The modifications also include adding additional adenosine residues just 3' of the above hairpin loop-forming region, and further include the addition of unique restriction endonuclease recognition sites to the coding region that are not present in the native gene. Additionally, the vector can comprise a unique portion of SEQ ID NO: 14. As used herein, "a unique portion" of a vector includes any portion having a nucleotide sequence that substantially does not occur in known nucleic acids.
Another such vector is exemplified by a vector consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 15, which provides the HCV coding sequence of SEQ ID NO: 14 that has been inserted 3' to sequences containing the T7 gene 10 promoter, a transcription start region, and a Shine-Dalgarno ribosome binding region, in order. Unique restriction endonuclease sites have been placed on either side of the SD such that the nucleotide sequence can be altered if desired. For example, the SD can be removed and replaced using standard techniques with the restriction enzymes EcoRV and NdeI. An additional restriction endonuclease recognition site is located 5' of the promoter, and allows for alteration in the promoter of the construct, also by standard cloning techniques with restriction enzymes. Additionally, the vector can comprise a unique portion of SEQ ID NO: 15.
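The requirement that a site be "unique" (not found elsewhere in the vector) amounts to the recognition sequence occurring exactly once. The sketch below counts occurrences of the EcoRV and NdeI recognition sequences in a sequence string. The recognition sequences GATATC and CATATG are the commonly published ones for these enzymes and are not stated in this specification; because both are palindromic, scanning one strand is sufficient. The test sequence is hypothetical and is not SEQ ID NO: 15.

```python
# Commonly published recognition sequences (assumed, not from the text).
SITES = {"EcoRV": "GATATC", "NdeI": "CATATG"}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site` in `seq`, including overlapping ones."""
    seq, site = seq.upper(), site.upper()
    count, pos = 0, seq.find(site)
    while pos != -1:
        count += 1
        pos = seq.find(site, pos + 1)
    return count


def unique_sites(seq: str, sites=SITES) -> dict:
    """Map each enzyme name to True when its site occurs exactly once."""
    return {name: count_sites(seq, s) == 1 for name, s in sites.items()}


# Hypothetical region: EcoRV site, SD core, spacer, NdeI site spanning the ATG.
region = "TTGATATCAGGAGGTATACATATGAAA"
print(unique_sites(region))
```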
The invention also provides the vector consisting essentially of the nucleotides set forth in SEQ ID NO: 15 in a cell which can express the protein encoded by the vector. Preferable cells are prokaryotic cells; even more preferable cells are E. coli cells.
Also provided by the invention is a nucleic acid which selectively hybridizes to the vector comprising the nucleotides in the sequence set forth in SEQ ID NO: 15. A nucleic acid which "selectively hybridizes" to a vector means that hybridization is based upon complementarity of sequences rather than random non-specific hybridization; thus, it means that the sequences of the nucleic acid utilized for hybridization are unique to the target vector sequence, such that the target sequence can be detected. Such a nucleic acid can comprise, for example, a DNA or RNA probe, a primer, or an RNA molecule, e.g., as encoded by SEQ ID NO: 15. Hybridization conditions may vary, depending upon its purpose; however, the invention includes a nucleic acid that hybridizes under any conditions such that hybridization is specific.
The invention further provides a method of producing a protein comprising placing a cell containing the vector of the invention in protein expressing conditions and collecting the protein produced by the cell. The vector can be that consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 15. Protein expressing conditions include any conditions that allow a cell to express proteins encoded by nucleotides contained therein and are exemplified herein. The protein can be collected by any chosen method, such as harvesting the cells and preparing a lysate. The lysate can be analyzed by, for example, gel electrophoresis. The protein can then be manipulated as desired, including additional purification, cleavage into smaller subunits, etc., as known in the art.
Description of the Exchangeable Template Reaction (ETR) mechanism.
Oligonucleotides for the synthesis of the HCV protein C gene were designed especially for ETR, a method for the synthesis of long polynucleotide DNA fragments using short synthetic oligonucleotides as templates for the DNA polymerase reaction. The method is based on a cyclic mechanism involving three main components: (1) polymerase activity to synthesize double-stranded DNA, (2) enzymatic activity to create 3' terminal single-stranded regions, and (3) specifically designed synthetic deoxyoligonucleotides used as templates for the polymerase reaction. The critical step is the enzymatic creation of a 3' terminal single-stranded region at the "growing point" of the synthesizing polynucleotide chain, which is used for the complementary binding of the next oligonucleotide as a template to continue the polymerase reaction.
The order of oligonucleotide additions for each cycle is encoded in each 3' terminal sequence. At the 3' terminus of the growing DNA molecule, a specific sequence of nucleotides can anneal with a complementary sequence of nucleotides from the synthetic oligonucleotide. Thus, it is possible to synthesize a long DNA fragment in one step by simply combining the entire set of deoxyoligonucleotides in one reaction tube containing all the required enzymatic activities and incubating the mixture at the optimal temperature and optimal buffer.
Each cycle begins with the complementary binding of the 3' terminal region of a synthetic oligonucleotide with the 3' protruding region of double-stranded DNA (step 1 in Fig. 1). After annealing, a DNA polymerase reaction occurs to create a second strand of DNA using the short synthetic oligonucleotide as a template for DNA polymerase (step 2 in Fig. 1). After polymerization is complete, the double-stranded DNA has been extended by the length of the synthetic oligonucleotide. To initiate the second round in the cycle of DNA synthesis, another enzymatic reaction occurs that creates a 3' protruding single-stranded region by removing several nucleotides from the 5' terminus leaving a 3' protrusion. This protrusion is used to anneal another short synthetic oligonucleotide (step 3 in Fig. 1).
Thus, ETR is a method for the synthesis of DNA based on a cyclic mechanism of combining the following deoxyoligonucleotides in any order:
(a) a series of unique single-stranded deoxypolynucleotides, each having a 5' sequence which, when polymerized to double-stranded form, can be enzymatically treated to form a unique 3' single-stranded protrusion for selective cyclic hybridization with another unique single-stranded deoxypolynucleotide of the series;
(b) a unique deoxypolynucleotide having a 3' sequence which can selectively hybridize with one of the unique single-stranded deoxypolynucleotides of (a);
(c) a polymerase which can direct the formation of double-stranded deoxypolynucleotides from the single-stranded deoxypolynucleotides; and
(d) an enzyme which can form a unique single-stranded 3' protrusion from the double-stranded deoxypolynucleotides; under conditions which hybridize the unique deoxypolynucleotides in a cyclic manner and polymerize the hybridized deoxypolynucleotides to form the DNA.
"Cyclic" as used herein means a sequential hybridization in a regularly repeated order. Thus, as noted above, hybridization of deoxypolynucleotides (hereinafter "DPNTs") occurs only in a specified controlled order. For example, a series of DPNTs (two or more), each of which encodes a unique segment of a desired long DPNT, are synthesized. During the synthesis, the sequence of each DPNT is selected to produce, when later cleaved by an enzyme, a unique 3' protrusion which will hybridize with only one other member of the DPNT series. When the DPNTs are combined, only two of the DPNTs initially hybridize. Once this hybridization occurs, the sequence of the remaining synthesis is set. A polymerase utilizes the two hybridized DPNTs to form double strands. The appropriate enzyme then acts on the double-stranded DPNTs to form the unique 3' single-stranded protrusion. The next DPNT which hybridizes only with this unique 3' protrusion then hybridizes. Once this hybridization occurs, the polymerase again directs the synthesis of double strands. After the double strands are completed, the enzyme again produces a unique 3' single protrusion which was previously synthesized to hybridize only with the next unique DPNT. The sequence is then repeated the desired number of times.
ETR can also utilize hybridization and cleavage which proceeds in both directions, e.g., first hybridize DPNTs in the middle of the desired sequence with cleavage sites on both subsequently-formed ends. The selection of DPNTs and enzymes follows the procedure of unidirectional synthesis, but enzyme sites on both ends of the double-stranded DNA are created.
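The way unique 3' protrusions fix the order of template addition in the unidirectional case can be illustrated with a small abstract model. In this sketch each template DPNT is reduced to two labels: the protrusion it anneals to, and the protrusion exposed after polymerization and enzymatic treatment. Real annealing depends on sequence complementarity and reaction conditions; the labels, names, and data structure here are hypothetical simplifications rather than part of the described method.

```python
def assembly_order(first_protrusion, templates):
    """Return template names in the order they would be used.

    `templates` maps a name to (anneals_to, exposes). `exposes` is None
    for the final template, which leaves no further protrusion.
    """
    by_target = {}
    for name, (anneals_to, exposes) in templates.items():
        if anneals_to in by_target:
            # Two templates competing for one protrusion breaks uniqueness.
            raise ValueError(f"protrusion {anneals_to!r} is not unique")
        by_target[anneals_to] = (name, exposes)

    order, current = [], first_protrusion
    while current in by_target:
        name, current = by_target.pop(current)  # pop guards against loops
        order.append(name)
    return order


# Templates listed in arbitrary order, as when all are mixed in one tube.
templates = {
    "oligo3": ("GATC", "TTAG"),
    "oligo2": ("ACGT", "GATC"),
    "oligo4": ("TTAG", None),
}
print(assembly_order("ACGT", templates))
```

The order recovered does not depend on the order in which the templates are listed, which mirrors the point that the full set can be combined in one reaction.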
Once a long DPNT is made by the above method, a new series of DPNTs can be pooled, each having a 5' sequence which, when in double-stranded form, can be enzymatically treated to form a unique 3' single-stranded protrusion for selective cyclic hybridization with another unique single-stranded DPNT of the series. This procedure can be repeated many times. The number of DPNTs in the reaction is only limited by undesired interference of hybridization. This can be avoided by creating unique 3' protrusions and hybridizing DPNTs which have minimal sequence similarity. Very long DPNTs, including genes and entire genomes, can thereby be synthesized by this method. As can be appreciated from the above, the method works so long as a unique 3' single-stranded protrusion is formed by an enzymatically-treated hybridized unique DPNT. By "unique" is meant a nucleotide sequence on one DPNT which is absent on another DPNT so that selective hybridization can occur. The number of unique nucleotides necessary for selective hybridization depends on hybridization conditions. For example, for a 3' protrusion of four nucleotides, the optimal temperature of the reaction is about 37°C. This optimal temperature may be different if a different polymerase is utilized in the synthesis. This is true because different polymerases have different affinities to complementary complexes. Thermostable enzymes also have a rather high affinity to such complexes. A longer 3' protrusion should be more reactive and more specific in hybridization and utilize a higher annealing temperature. However, the single-stranded region must be of a size to avoid being involved in secondary structure formation. This region, to be effective in hybridization, should be represented in a single-stranded form at the reaction temperature. From this point of view, thermostable enzymes can be more effective in ETR because a higher reaction temperature can be utilized.
Thus, very effective single-stranded terminal regions can be about 7-9 nucleotides long. For such lengths it is routine to find conditions to maintain single-stranded form. Specific complementary complexes between DPNTs can be effectively organized at higher temperatures, which decreases the possibility of improper complex formation. The optimal temperature for the 7-9 nucleotide 3' protrusion may be around 55-65°C, the optimal temperature for the activity of thermostable polymerases. Thus, a preferred range of 3' protrusion length is about 3-12 nucleotides. Longer protrusions can be made and routinely tested by the methods described in the Experimental section to optimize length and conditions for a particular system.
The precise 5' sequence of a member of the series will depend on the desired sequence for the ultimate DNA and the type of enzyme utilized to form the protrusion. Thus, once an ultimate desired sequence is selected, a 5' sequence is synthesized which corresponds to the desired sequence and which will either be cleaved or exposed such that the desired sequences remain and the undesired sequences, if any, are removed prior to hybridization of the next member of the series. For example, if a restriction endonuclease is utilized, it must cleave in such a way that unique sequences for each member of the series to be hybridized are produced. BstXI, as described in detail in the Experimental section, provides one example of such a restriction endonuclease because the endonuclease allows for four unique nucleotides to be synthesized in each member of the series which remains after cleavage.
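The four unique nucleotides left by BstXI can be read off a designed sequence with a short pattern search. The sketch below assumes the commonly published BstXI specificity, CCANNNNN^NTGG on the top strand with a four-nucleotide 3' protrusion covering positions 5 to 8 of the site; this cut geometry is not spelled out in the present text and should be confirmed against the enzyme supplier's documentation. The example sequence is hypothetical.

```python
import re

# Lookahead so that overlapping sites are also reported.
BSTXI = re.compile(r"(?=(CCA[ACGT]{6}TGG))")


def bstxi_protrusions(seq: str):
    """Return (site_start, protrusion) for each BstXI site on the top strand.

    The protrusion is the assumed 4-nt 3' single-stranded region,
    taken as positions 5 to 8 (1-based) of the 12-nt site.
    """
    return [
        (m.start(), m.group(1)[4:8]) for m in BSTXI.finditer(seq.upper())
    ]


def protrusions_are_unique(seq: str) -> bool:
    """True when no two sites in `seq` would leave the same protrusion."""
    tags = [tag for _, tag in bstxi_protrusions(seq)]
    return len(tags) == len(set(tags))


example = "AACCATGCAACTGGTT"  # hypothetical template with one BstXI site
print(bstxi_protrusions(example))
```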
Because of the unique nature of the 5' sequence which is treated to produce the unique 3' protrusion, the members of the series of DPNTs must be synthesized if a restriction endonuclease is utilized, for example with a DNA synthesizer. Since the DPNT which starts the hybridization can hybridize directly with the second DPNT, it is not affected by the enzymatic treatment. Therefore, the first unique DPNT can be obtained, if desired, by means other than synthesis and can be single- or double-stranded. For example, the DPNT can be a fragment excised from natural DNA, e.g. plasmid, phage genome, or viral genome by restriction endonucleases. Likewise, the fragment can be obtained by specific amplification using PCR. PCR fragments are more suitable because terminal sequences of the amplified fragment can be easily modified with primers used for amplification with the introduction of desirable nucleotide modifications, including artificially synthesized non-natural derivatives of nucleotides. Any suitable number of nucleotides sufficient for efficient hybridization under the selected conditions can be utilized for this initial hybridization.
This unique synthesis-initiating DPNT, which begins synthesis by providing a template for hybridization of the second DPNT from the series, can be bound to a solid support for improved efficiency. The solid phase allows for the efficient separation of the synthesized DNA from other components of the reaction. Different supports can be applied in the method. For example, supports can be magnetic latex beads or magnetic control pore glass beads. Being attached to the first DPNT, these beads allow the desirable product from the reaction mixture to be magnetically separated. Binding the DPNT to the beads can be accomplished by a variety of known methods, for example carbodiimide treatment (Gilham, Biochemistry 7:2809-2813 (1968); Mizutani and Tachbana, J. Chromatography 356:202-205 (1986); Wolf et al., Nucleic Acids Res. 15:2911-2926 (1987); Musso, Nucleic Acids Res., 15:5353-5372 (1987); Lund et al., Nucleic Acids Res., 16:10861-10880 (1988)). The DPNT attached to the solid phase is the primer for synthesis of the whole DNA molecule. Synthesis can be accomplished by addition of sets of compatible oligonucleotides together with enzymes. After the appropriate incubation time, unbound components of the method can be washed out and the reaction can be repeated again to improve the efficiency of each oligonucleotide to be utilized as a template. Alternatively, another set of oligonucleotides can be added to continue the synthesis. This "set principle," barely applicable to solution synthesis, turns the method into a very powerful method for the synthesis of a long DNA molecule that is not possible with any other methods.
Solid phase, to be efficiently used for the synthesis, can contain pores with sufficient room for synthesis of the long DNA molecules. The solid phase can be composed of material that cannot non-specifically bind any undesired components of the reaction. One way to solve the problem is to use control pore glass beads appropriate for long DNA molecules. The initial primer can be attached to the beads through a long connector. The role of the connector is to position the primer from the surface of the solid support at a desirable distance.
Any polymerase which can direct the synthesis of double strands from partially hybridized single strands is appropriate. Suitable polymerases include, for example, Taq polymerase, the large fragment of E. coli DNA polymerase I, and the DNA polymerase of T7 phage. The optimal conditions of the polymerization vary with the type of polymerase used. Likewise, the optimal polymerase can vary with the conditions necessary for the synthesis (Bej et al., Crit. Rev. Biochem. Mol. Biol. 26(3-4):301-334 (1991); Tabor and Richardson, Proc. Natl. Acad. Sci. USA, 86:4076-4080 (1989); Petruska et al., Proc. Natl. Acad. Sci. USA, 85:6252-6256 (1988)). One example of an enzyme capable of removing several nucleotides from the 5' terminus is the restriction endonuclease BstXI. This restriction endonuclease is compatible with ETR for the following reasons: (1) a 3' protrusion is produced, (2) the single-stranded 3' protrusion does not have any sequence restrictions, and (3) after cleavage the restriction site cannot be restored by the interaction of the next synthetic oligonucleotides.
Any enzyme can be utilized which can form a unique 3' protrusion from double-stranded DNA. Presently known enzymes useful in the method include not only BstXI, as used in the examples, but also 5' exonucleases specific for double-stranded DNA, such as the exonuclease of T7 and lambda phage, and an enzyme of DNA recombination, such as recA.
The method utilizing a 5' exonuclease specific for double-stranded DNA can be performed as follows: oligonucleotides to be used in the reaction as templates for polymerase reaction are chemically modified at a defined point to prevent T7 exonucleases from jumping over the modified nucleotides. For example, oligonucleotide phosphorodithioates can be utilized using methods described in Caruthers, Nucl. Acids Symp. Ser., 21:119-120 (1989). As described above, polymerase first fills gaps in hybridized DPNTs. When the reaction is finished, the T7 exonuclease starts cutting double-stranded DNA beginning from the 5' end (the opposite 5' end should be modified or attached to solid phase to prevent cleavage from that end). This reaction proceeds until it reaches the modified position, where it stops. The 3' protrusion created by the exonuclease activity can then be used for hybridization with the next oligonucleotide in the cycle reaction. The T7 exonuclease is well known to have a relatively strong preference for double-stranded DNA (Kerr and Sadowski, J. Biol. Chem., 247:311-318 (1972); Thomas and Olivera, J. Biol. Chem., 253:424-429 (1978); Shon et al., J. Biol. Chem., 25:13823-13827 (1982)). Another double-stranded specific exonuclease is encoded by lambda phage (Sayers et al., Nucleic Acids Res., 16:791-802 (1988)). This enzyme can also be utilized in ETR.
The main advantage of these exonucleases is the possibility of creating a single-stranded 3' protrusion of any necessary size to allow the use of higher temperatures in the reaction. Additionally, because the exonuclease recognizes any blunt end, its use eliminates the need to synthesize DPNT having a restriction site when polymerized to double-stranded form.
ETR can also be performed utilizing an enzyme of DNA recombination. It is known that recA can replace one strand of double-stranded DNA, in a strong sequence-specific manner, with a single-stranded DNA from solution creating D-loop structures (Cox and Lehman, Ann. Rev. Biochem., 56:229-262 (1987); Tadi-Laskowski et al., Nucleic Acids Res., 16:8157-8169 (1988); Hahn et al., J. Biol. Chem., 263:7431-7436 (1988)). In this modification of the method, DPNTs are combined in one reaction with polymerase and recA. Polymerase fills single-stranded gaps and recA replaces the terminal region of one of the strands of double-stranded DNA with a single-stranded DPNT from solution which provides the polymerase with a new template. An advantage of the reaction is strong specificity of the hybridization which is due to enzymatic support. In any other variations of the method, for example with restriction endonucleases and exonucleases, the hybridization is the only step without enzymatic support. While restriction endonucleases and exonucleases can only create a 3' protrusion, recA can both create a single-stranded region at the ends of double-stranded DNA and anneal oligonucleotides to the 3' protrusion.
The present invention is more particularly described in the following examples which are intended as illustrative only since numerous modifications and variations therein will be apparent to those skilled in the art.
EXAMPLES
Sequence Design. The nucleocapsid protein of HCV (also referred to as "core protein" or "protein C") was expressed in E. coli as an authentic non-hybrid protein. Within the sequence of the coding region for the HCV protein C, a set of unique restriction endonuclease recognition sites was introduced, including SmaI, AvrII, Eco47III, AccI, DdeI, BstEII, SacII, ClaI, BbeI, NcoI, XbaI, Tth111I, and AsuI. The recognition and cleavage sites of all restriction enzymes listed herein are known in the art and published. Since this gene was to be expressed in bacterial and mammalian cells, codon usage was of no concern. The sequence for the synthesis of oligonucleotides was taken from published data (position 330-917 nt, Kato, N. et al., Proc. Natl. Acad. Sci. USA, 87:9524-9528 (1990)), and modified for insertion of restriction endonuclease sites or modified to avoid adverse complementary interactions. The sequence for the synthesis of the gene encoding the nucleocapsid protein of hepatitis C virus was obtained from published data (Kato et al., 1990). The secondary structure of mRNA around the initiator codon was calculated according to published methods (Khudyakov, 1985; Frier et al., 1986; Jaeger et al., 1989) to create a hairpin structure with a ΔG = -2.8 kcal/mol that can be easily destroyed by the interaction of the SD with the 3' end of the ribosomal 16S rRNA to expose the ATG in the center of a large single-stranded region. The sequence of the mRNA 5'-untranslated region was designed utilizing published recommendations (Khudyakov, 1985; Khudyakov et al., 1987). Unique restriction endonuclease recognition sites were introduced, including SnaBI, EcoRV, and NdeI. Such sites allow for easy manipulation of the nucleotide sequence of the region, such as by removing and replacing nucleotide fragments by routine cloning methods utilizing restriction endonucleases. The primary structure of oligonucleotides was taken from published data (position 330-917 nt, Kato et al., 1990) and modified as necessary to introduce BstXI sites or other sites for restriction endonucleases. In addition, the secondary structures of the oligonucleotides were predicted, as described (Frier et al., 1986; Jaeger et al., 1989), and complementary interactions between oligonucleotides for each set of oligonucleotides were taken into consideration. Whenever the 3'-terminal region of the oligonucleotides was potentially involved in undesirable complementary interactions, the primary structure of the oligonucleotides was changed without influencing the protein sequence.
A synthetic promoter having the sequence of the strong promoter of the bacteriophage T7 gene 10 (Tabor, S. et al., Proc. Natl. Acad. Sci. USA, 82:1074-1078 (1985)) was utilized to provide efficient transcription.
Oligodeoxynucleotide Synthesis. The DNA sequence encoding the HCV nucleocapsid protein was divided into 3 fragments. Each fragment was synthesized separately by ETR. The first fragment was synthesized from 5 deoxyoligonucleotides (SEQ ID NOS: 1-5) (Fig. 2), the second fragment from 3 deoxyoligonucleotides (SEQ ID NOS: 6-8) (Fig. 3), and the third from 4 deoxyoligonucleotides (SEQ ID NOS: 9-12) (Fig. 4). All reactions were carried out as described below. Deoxyoligonucleotides were synthesized using an automatic synthesizer (Applied Biosystems Model 480A, Foster City, California) and purified by 10% polyacrylamide gel electrophoresis (PAGE) containing 7M urea with TBE buffer (0.045M Tris-borate, 0.001 M EDTA, pH 8.3). Oligonucleotides were recovered from the gel by electroelution.
Exchangeable Template Reaction (ETR) Conditions. The ETR was carried out in a volume of 50 μl of 10 mM Tris-HCl, pH 7.9, containing 10 mM MgCl2, 50 mM NaCl, 1 mM DTT; 0.25 mM each dATP, dGTP, dTTP, dCTP (Pharmacia, Piscataway, NJ); 5 units of native Taq DNA polymerase (Perkin Elmer Cetus); 30 units of BstXI (New England BioLabs, Beverly, MA); and 0.5-100 pmol of each deoxyoligonucleotide, at 37°C for 0.5-5 h or overnight. To provide analysis of the reaction course, one of the oligodeoxynucleotides without a BstXI site was radiolabeled with [gamma-P-32] ATP in 50 mM Tris-HCl, pH 7.6, containing 10 mM MgCl2 and 5 mM DTT, with the addition of 10 μCi [gamma-P-32] ATP (5000 Ci/mmole, New England Nuclear) and 10-20 pmol of oligonucleotide. After completion, the products of the ETR were separated by electrophoresis in 8% PAGE containing 8M urea and analyzed by autoradiography.
Assembly of the Gene. ETR fragments were recovered from PAGE by incubating pieces of the gel in 0.15 M NaCl and 1 mM EDTA for 30 min at 65°C. After incubation, the DNA was precipitated with ethanol. The pellet was dissolved in 20 μl of TE-buffer and 1 μl of the DNA solution was used to amplify the ETR fragments by the PCR. The terminal oligonucleotides used for synthesis of the fragments were used as primers for the PCR. Briefly, 20-50 pmol of each primer was added to the reaction mixture and the PCR consisted of 30 cycles as follows: 94°C for 45 sec, 65°C for 20 sec, and 72°C for 1 min. The PCR amplified ETR fragments were treated with the appropriate restriction endonuclease with the recognition sites located at the termini of each fragment, and then ligated in 10 μl of a solution containing all three fragments, 50 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 1 mM DTT, 1 mM ATP, and 10 units of DNA ligase (Pharmacia, Piscataway, NJ) for 6 h. One μl of the ligase reaction was used to amplify the fragment by the PCR to provide the full length DNA using PCR conditions described above and using the two terminal oligonucleotides as primers. Amplified full length DNA was recovered from agarose gel using a DEAE procedure (Dretzen et al., 1…) and treated with restriction endonucleases to confirm the structure of the synthesized gene. The nucleotide sequence of this construct is set forth in SEQ ID NO: 14.
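As a quick check on the cycling program stated above (30 cycles of 45 sec, 20 sec, and 1 min), the total programmed hold time can be computed as follows. This is only arithmetic on the stated step times; it ignores temperature ramping and any initial or final steps, which the text does not specify.

```python
# Step durations in seconds: 94 C denaturation, 65 C annealing, 72 C extension.
STEP_SECONDS = (45, 20, 60)
CYCLES = 30


def total_hold_minutes(step_seconds=STEP_SECONDS, cycles=CYCLES) -> float:
    """Total programmed hold time in minutes, excluding ramp times."""
    return cycles * sum(step_seconds) / 60


print(total_hold_minutes())
```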
Plasmid Construction. A vector, designated pTS7, was constructed to prepare a plasmid for expression of the HCV nucleocapsid gene. Two oligonucleotides containing the gene 10 promoter from T7 bacteriophage and encoding the 5'-untranslated region of the mRNA were annealed at 20°C and inserted into pBR322 between the EcoRI and BamHI sites. A BamHI-ScaI fragment of the plasmid containing this synthetic sequence and the N-terminal part of the Amp gene was introduced into the plasmid pSP65 between the BamHI site located in the multiple cloning region and the ScaI site of the Amp gene. The resulting plasmid pTS7 and the synthetic gene assembled from ETR fragments were treated with NdeI and HindIII, combined together, and ligated. E. coli HB101 competent cells (Invitrogen, San Diego, California) were transformed with the ligation mixture. A plasmid containing the synthetic core gene under the control of a T7 phage promoter was designated pTSC6178-7. The sequence of the cloned HCV synthetic gene was verified using the polymerase chain terminator method (Sanger et al., 1977). This sequence is set forth in SEQ ID NO: 15.
To prepare a hybrid protein composed of beta-galactosidase and a portion of the HCV nucleocapsid protein, two plasmids were used. A DNA fragment containing the T7 promoter and a piece of the synthetic gene encoding amino acids 1-158 of the nucleocapsid protein (see SEQ ID NOS: 14 and 15) was extracted from the plasmid pTSC6178-7 and inserted by standard techniques into the plasmid pCL121 at the BamHI site (Khudyakov et al., 1987). The resulting plasmid pFC105-301 encodes a hybrid protein composed of amino acids 1-158 of the HCV nucleocapsid and beta-galactosidase under the control of the T7 promoter.
Analysis of Expression. To express the protein encoded by the synthetic gene, E. coli BL21(DE3) (Studier & Moffatt, 1986) competent cells were prepared (Hanahan, 1983) and transformed with pTSC6178-7. Cells were grown in LB medium to an optical density at 600 nm of 0.6, and expression from the T7 promoter was then induced by the addition of IPTG to a final concentration of 1 mM. After 4-6 h of fermentation at 37°C, the cells were harvested and a lysate was prepared (Sambrook et al., 1989). Aliquots of the lysate were analyzed by Western blot (Harlow & Lane, 1988). Nitrocellulose filters containing immobilized proteins were incubated at 20°C for 2 h with human sera diluted 50-fold in 50 mM Tris-HCl, pH 7.5, containing 0.5% Triton X-100, 1% gelatin, and 1% bovine serum albumin (NET). After three washes with NET, the filters were incubated for 1 h with affinity-purified anti-human IgG antibodies coupled to horseradish peroxidase (TAGO, Burlingame, California) diluted 1:5000 in NET. After washing, diaminobenzidine (Sigma, St. Louis, Missouri) and hydrogen peroxide were used to develop the reaction.
Sera. Sera were obtained from the collection held at the D. I. Ivanovsky Institute of Virology, Moscow, Russia. All sera were initially tested for markers of HBV and HDV infection and for the presence of anti-C100-3 antibody using commercially available kits (Abbott Laboratories, Abbott Park, Illinois).
Synthesis Of A DNA Fragment Encoding The HCV Nucleocapsid Protein. The sequence encoding the HCV nucleocapsid protein was divided into three fragments. Each fragment was synthesized by ETR. The first fragment was synthesized using five oligodeoxynucleotides, the second fragment using three, and the third fragment using four. All reactions were carried out as described above. Fragments obtained using four and five oligonucleotides consisted of 228 and 216 bp, respectively. The yield of full-size fragments was approximately 5-10% after a 14 h incubation period at 37°C. Different buffers were tested for the synthesis. Buffer NEB3 (New England Biolabs, Beverly, MA; 50 mM Tris-HCl, pH 7.9, 10 mM MgCl2, 100 mM NaCl, 1 mM DTT) was optimal for the BstXI reaction, and Taq buffer (10 mM Tris-HCl, pH 7.8, 50 mM KCl, 1 mM DTT, and 1.5 mM, 3 mM, or 5 mM MgCl2) was optimal for Taq DNA polymerase. The best results for ETR, however, were obtained with buffer NEB4 (New England Biolabs, Beverly, MA; 20 mM Tris-acetate, pH 7.9, 10 mM magnesium acetate, 50 mM potassium acetate, 1 mM DTT). Both enzymes have high optimal temperatures, but because of the very short single-stranded protrusion formed by BstXI, ETR fails to work at elevated temperatures. As shown previously for the HBS DNA fragment, 37°C was optimal in buffers NEB2 and NEB4 for both enzymatic activities. For the synthesis of the first segment of the HCV nucleocapsid gene, the optimal relative concentrations of the oligodeoxynucleotides were 1:4:20:40:60. When the relative concentrations were changed to 1:1:20:40:60, the rate of ETR changed as well. At a 1:4 relative concentration of oligos 1 and 2, a full-size fragment was observed after a 3 h incubation period at 37°C in NEB2. At a 1:1 relative concentration of oligos 1 and 2, a fragment was synthesized in detectable amounts after only 30 min (data not shown).
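The terminal complementarity between neighbouring oligonucleotides can be checked directly from the sequence listing. The sketch below assumes that oligos 1 and 2 of the first fragment correspond to SEQ ID NOS: 1 and 2 (the text does not state this mapping explicitly); the helper names are illustrative. It reports how many 3'-terminal nucleotides of the two oligonucleotides can base-pair with each other.

```python
# Sequences copied from SEQ ID NOS: 1 and 2 of the sequence listing
# (assumed here to be oligos 1 and 2 of the first fragment).
SEQ_ID_1 = "".join("""
CCCCATATGA GCACGATTCC TAAACCACAA AGAAAAACCA AACGTAACAC CAATCGACGA
CCACAAGATG TAAAGT
""".split())
SEQ_ID_2 = "".join("""
CCCCCACCTC CGTGGAAGCA AATAGACTCC ACCAACGATC TGACCGCCAC CCGGGAACTT
TACATCTTG
""".split())

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_overlap(a, b):
    """Length of the longest 3' tail of a that can pair with the 3' tail of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == reverse_complement(b[-n:]):
            return n
    return 0


if __name__ == "__main__":
    n = three_prime_overlap(SEQ_ID_1, SEQ_ID_2)
    print(f"3' overlap: {n} nt ({SEQ_ID_1[-n:]})")
```

For these two sequences the complementary 3' tails are 13 nucleotides long (CAAGATGTAAAGT on SEQ ID NO: 1).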
Assembly And Cloning Of The HCV Gene. Following synthesis, all three fragments were purified by recovery from PAGE and subsequently amplified by PCR. Amplified products were digested with the appropriate restriction endonucleases and joined by DNA ligase. The whole gene was then amplified by PCR and analyzed by restriction site mapping. Following verification by restriction analysis, the fragment containing the synthetic gene was inserted into the expression vector pTS7 under the control of the T7 promoter. This vector was specifically designed for the expression of the nucleocapsid synthetic gene. Among the first five clones chosen for analysis, one contained a plasmid, designated pTSC6178-7, with an insert of the appropriate size. Sequencing of this insert using standard techniques demonstrated that all three fragments of the ETR-synthesized DNA had the correct sequence. The sequence is set forth in SEQ ID NO: 15. Thus, the gene for the nucleocapsid protein of the hepatitis C virus was assembled in the correct form by the new method and cloned.
Expression Of The Synthetic Gene In E. coli. Plasmid pTSC6178-7 was unstable in transformed bacteria growing with or without the IPTG inducer. Modified derivatives of pTSC6178-7 were often found in bacteria after 2-3 passages on solid LB agar or in liquid medium. Because of this problem, cells were freshly transformed each time the HCV protein was to be expressed, or transformed cells were stored frozen and used as aliquots. In cells transformed with plasmid pTSC6178-7, a 27 kDa protein identified by Western blot analysis effectively bound antibody from sera containing serologic markers specific for HCV infection. Sera without markers of HCV infection did not immunoreact with this protein. To confirm the specific reactivity of the protein with anti-HCV antibody, we collected two sets of 24 sera. Both sets were from donors in a high-risk group for HCV infection, and all had elevated alanine aminotransferase (ALT) activity. Each serum specimen from the first set was positive for antibody reactive with the C100-3 protein, a marker specific for HCV infection (van der Poel et al., Lancet, 337:317-319 (1991)). The second set was composed of sera without markers of HCV infection. Both sets were analyzed by Western blot. Among sera from HCV-positive donors, 20 contained antibodies reactive with the 27 kDa protein expressed in cells transformed with pTSC6178-7. Two sera from the second set were also reactive with this protein. Although anti-C100-3 has been identified in most cases of HCV infection, this marker may appear late in infection or not at all (van der Poel et al., Lancet, 335:558-560 (1990); Ebeling et al., Lancet, 335:982-983 (1990)). Because both sets of sera came from high-risk donors, it is likely that some HCV-positive sera would be negative for anti-C100-3 activity. Thus, the synthetic gene assembled from three fragments synthesized by ETR is functionally active, and the product of its expression is specific for HCV.
The data suggest that this protein can be used to develop a diagnostic test for the detection of antibody specific for the HCV nucleocapsid protein, and that such a test could help identify more HCV-positive sera than test systems that detect anti-C100-3 alone.
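The Western blot counts reported above can be summarized as simple proportions. The sketch below uses only the numbers given in the text (two sets of 24 sera, with 20 and 2 reactive sera, respectively); the function name is illustrative. Because the C100-3 marker is not a definitive indicator of HCV infection, as noted above, these figures are agreement rates between the two markers rather than true sensitivity or specificity.

```python
def agreement(pos_reactive, pos_total, neg_reactive, neg_total):
    """Return (fraction reactive among marker-positive sera,
    fraction reactive among marker-negative sera,
    overall agreement between the two markers)."""
    reactive_in_pos = pos_reactive / pos_total
    reactive_in_neg = neg_reactive / neg_total
    concordant = pos_reactive + (neg_total - neg_reactive)
    overall = concordant / (pos_total + neg_total)
    return reactive_in_pos, reactive_in_neg, overall


if __name__ == "__main__":
    # Counts taken from the text: 20 of 24 and 2 of 24 sera were reactive.
    in_pos, in_neg, overall = agreement(20, 24, 2, 24)
    print(f"reactive among C100-3-positive sera: {in_pos:.1%}")
    print(f"reactive among C100-3-negative sera: {in_neg:.1%}")
    print(f"overall agreement with C100-3:       {overall:.1%}")
```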
In cells transformed with pFC105-301, a protein of 135 kDa was expected to be reactive with anti-HCV sera by Western blot. However, the major protein interacting with anti-HCV antibody had a molecular weight of 23 kDa. Only a very weak band reactive with HCV-specific sera could be identified at the position corresponding to the expected molecular weight of 135 kDa. The 23 kDa protein was highly immunoreactive and represented 5-7% of the total E. coli proteins. It was concluded that the 135 kDa hybrid protein undergoes cleavage by intracellular proteases, and that the resulting 23 kDa fragment of the HCV nucleocapsid polypeptide is very stable and immunoreactive. This protein represents a good source of immunoreactive recombinant HCV core protein for the development of a diagnostic test system for the detection of anti-HCV activity in human sera.
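The expected size of the hybrid protein can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da and a beta-galactosidase portion of roughly 116 kDa (the approximate size of the full-length E. coli monomer). Neither value is taken from this application, and the exact beta-galactosidase portion encoded by pCL121 is not specified here, so the result is an order-of-magnitude check only.

```python
AVG_RESIDUE_DA = 110.0  # assumption: rule-of-thumb average mass per residue
BETA_GAL_KDA = 116.0    # assumption: approximate full-length E. coli beta-galactosidase monomer


def estimate_kda(n_residues):
    """Rough polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


if __name__ == "__main__":
    core_part = estimate_kda(158)  # amino acids 1-158 of the nucleocapsid protein
    hybrid = core_part + BETA_GAL_KDA
    print(f"nucleocapsid portion: ~{core_part:.1f} kDa")
    print(f"hybrid protein:       ~{hybrid:.1f} kDa")
```

Under these assumptions the hybrid comes out near 133 kDa, in line with the expected 135 kDa band. Apparent masses on SDS-PAGE can deviate from such estimates, so the comparison with the observed 23 kDa product should be read loosely.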
Throughout this application, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.
Although the present process has been described with reference to specific details of certain embodiments thereof, it is not intended that such details should be regarded as limitations upon the scope of the invention except as and to the extent that they are included in the accompanying claims.
REFERENCES
1. Choo, Q.-L., et al., "Genetic organization and diversity of the hepatitis C virus," Proc. Natl. Acad. Sci. USA, 88:2451-2455 (1991).
2. Dretzen, G., et al., "A reliable method for the recovery of DNA fragments from agarose and acrylamide gels," Anal. Biochem., 112:295-298 (1981).
3. Ebeling, F., et al., "Recombinant immunoblot assay for hepatitis C virus antibody as predictor of infectivity," Lancet, 335:982-983 (1990).
4. Frier, S.H., et al., "Improved free-energy parameters for predictions of RNA duplex stability," Proc. Natl. Acad. Sci. USA, 83:9373-9377 (1986).
5. Gold, L., et al., "Translation initiation," in Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology, 2:1302-1307 (Neidhardt, F.C., et al., eds.) American Society for Microbiology, Washington, DC (1987).
6. Hanahan, D., "Studies on transformation of Escherichia coli with plasmids," J. Mol. Biol., 166:557-580 (1983).
7. Harada, S., et al., "Expression of processed core protein of hepatitis C virus in mammalian cells," J. Virol., 65:3015-3021 (1991).
8. Harlow, E., et al., Antibodies: A Laboratory Manual, Cold Spring Harbor Lab., pp. 471-510 (1988).
9. Hijikata, M., et al., "Gene mapping of the putative structural region of the hepatitis C virus genome by in vitro processing analysis," Proc. Natl. Acad. Sci. USA, 88:5547-5551 (1991).
10. Jaeger, J.A., et al., "Improved predictions of secondary structures for RNA," Proc. Natl. Acad. Sci. USA, 86:7706-7710 (1989).
11. Kato, N., et al., "Molecular cloning of the human hepatitis C virus genome from Japanese patients with non-A, non-B hepatitis," Proc. Natl. Acad. Sci. USA, 87:9524-9528 (1990).
12. Khudyakov, Yu.E., et al., "The Shine-Dalgarno sequence and the effectiveness of translation initiation," Mol. Biol. (Russia), 19:702-716 (1985).
13. Khudyakov, Yu.E., et al., "Correlation between the effectiveness of translation initiation and secondary structure of mRNA in the hybrid gene cro-lacIZ," Mol. Biol. (Russia), 21:1504-1512 (1987).
14. Miller, R.H., et al., "Hepatitis C virus shares amino acid sequence similarity with pestiviruses and flaviviruses as well as members of two plant virus supergroups," Proc. Natl. Acad. Sci. USA, 87:2057-2061 (1990).
15. Muraiso, K., et al., "A structural protein of hepatitis C virus expressed in E. coli facilitates accurate detection of hepatitis C virus," Biochem. Biophys. Res. Commun., 172:511-516 (1990).
16. Muraiso, K., et al., "Detection of hepatitis C virus infection by enzyme-linked immunosorbent assay system using core protein expressed in Escherichia coli," Japan J. Cancer Res., 82:879-882 (1991).
17. Plagemann, P.G.W., et al., "Hepatitis C virus," Arch. Virol., 120:165-180 (1991).
18. Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 3:18.40-18.41 (1989).
19. Sanger, F., et al., "DNA sequencing with chain-terminating inhibitors," Proc. Natl. Acad. Sci. USA, 74:5463-5467 (1977).
20. Studier, F.W., et al., "Use of bacteriophage T7 RNA polymerase to direct selective high-level expression of cloned genes," J. Mol. Biol., 189:113-130 (1986).
21. Tabor, S., et al., "A bacteriophage T7 RNA polymerase/promoter system for controlled exclusive expression of specific genes," Proc. Natl. Acad. Sci. USA, 82:1074-1078 (1985).
22. Takahashi, K., et al., "Demonstration of a hepatitis C virus-specific antigen predicted from the putative core gene in the circulation of infected hosts," J. Gen. Virol., 73:667-672 (1992).
23. Takamizawa, A., et al., "Structure and organization of the hepatitis C virus genome isolated from human carriers," J. Virol., 65:1105-1113 (1991).
24. Takeuchi, K., et al., "The putative nucleocapsid and envelope protein genes of hepatitis C virus determined by comparison of the nucleotide sequences of two isolates from an experimentally infected chimpanzee and healthy human carriers," J. Gen. Virol., 71:3027-3033 (1990).
25. van der Poel, C.L., et al., "Infectivity of blood seropositive for hepatitis C virus antibodies," Lancet, 335:558-560 (1990).
26. van der Poel, C.L., et al., "Confirmation of hepatitis C virus infection by new four-antigen recombinant immunoblot assay," Lancet, 337:317-319 (1991).
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Khudyakov, Yury; Fields, Howard A.
(ii) TITLE OF INVENTION: PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN E. COLI
(iii) NUMBER OF SEQUENCES: 17
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C.
(B) STREET: 127 Peachtree Street, NE, Suite 1200
(C) CITY: Atlanta
(D) STATE: Georgia
(E) COUNTRY: USA
(F) ZIP: 30303-1811
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 07/849,294
(B) FILING DATE: 10-MAR-1993
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Perryman, David G.
(B) REGISTRATION NUMBER: 33,438
(C) REFERENCE/DOCKET NUMBER: 1414.065
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (404) 688-0770
(B) TELEFAX: (404) 688-9880
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 76 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CCCCATATGA GCACGATTCC TAAACCACAA AGAAAAACCA AACGTAACAC CAATCGACGA 60
CCACAAGATG TAAAGT 76
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
CCCCCACCTC CGTGGAAGCA AATAGACTCC ACCAACGATC TGACCGCCAC CCGGGAACTT 60
TACATCTTG 69
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCCCCATCTT CCTGGTCGCG CGCACACCCA ACCTAGGTCC CCTCC 45
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
CCCCCAACCT CGTGGTTGCG AGCGCTCGGA AGTCTTC 37
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCCCCTCAGG CCGACGCACT TTAGGGATAG GCTGTCGTCT ACCTC 45
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CCCCCTGAGG GCAGGACCTG GGCTCAACCC GGTTACCCCT GGCCCCTCTA TGGCAATGAG 60
GGCTGCGGGT GGGCG 75
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 71 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CCCCCAGATC AGTGGGTCCC CAACTCGGTC GAGAGCCGCG GGGAGACAGG AGCCATCCCG 60
CCCACCCGCA G 71
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
CCCATCGATG ACCTTACCCA AATTTCGCGA CCTACGTCGC GGATCA 46
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 57 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CCCATCGATA CCCTCACGTG CGGCTTCGCC GACCTCATGG GGTACATACC GCTCGTC 57
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
CCCCCAACTC CATGGGCAAG GGCTCTGGCG GCACCTCCAA GAGGGGCGCC GACGAGCGGT 60
AT 62
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 79 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CCCCCAGGAA GATGGAGAAA GAGCAACCAG GAAGGTTTCC TGTTGCATAA TTGACGCCGT 60
CTTCTAGAAC CCGTACTCC 79
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CCCAAGCTTT TAGTTTCGAA CTTGGTAGGC TGAAGCGGGC ACAGTCAGGC AAGAGAGCAG 60
GGCCAGAAGG AAG 73
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Shine-Dalgarno consensus sequence
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AGGAGGU 7
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 606 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(A) DESCRIPTION: Synthetic hepatitis C virus nucleocapsid protein coding sequence
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 7..594
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
CCCCAT ATG AGC ACG ATT CCT AAA CCA CAA AGA AAA ACC AAA CGT AAC 48
Met Ser Thr Ile Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn 1 5 10
ACC AAT CGA CGA CCA CAA GAT GTA AAG TTC CCG GGT GGC GGT CAG ATC 96
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile 15 20 25 30
GTT GGT GGA GTC TAT TTG CTT CCA CGG AGG GGA CCT AGG TTG GGT GTG 144
Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val 35 40 45
CGC GCG ACC AGG AAG ACT TCC GAG CGC TCG CAA CCA CGA GGT AGA CGA 192
Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg 50 55 60
CAG CCT ATC CCT AAA GTG CGT CGG CCT GAG GGC AGG ACC TGG GCT CAA 240
Gln Pro Ile Pro Lys Val Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln 65 70 75
CCC GGT TAC CCC TGG CCC CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG 288
Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala 80 85 90
GGA TGG CTC CTG TCT CCC CGC GGC TCT CGA CCG AGT TGG GGA CCC ACT 336
Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr 95 100 105 110
GAT CCG CGA CGT AGG TCG CGA AAT TTG GGT AAG GTC ATC GAT ACC CTC 384
Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu 115 120 125
ACG TGC GGC TTC GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC 432
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala 130 135 140
CCT CTT GGA GGT GCC GCC AGA GCC CTT GCC CAT GGA GTA CGG GTT CTA 480
Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu 145 150 155
GAA GAC GGC GTC AAT TAT GCA ACA GGA AAC CTT CCT GGT TGC TCT TTC 528
Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe 160 165 170
TCC ATC TTC CTT CTG GCC CTG CTC TCT TGC CTG ACT GTG CCC GCT TCA 576
Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser 175 180 185 190
GCC TAC CAA GTT CGA AAC TAAAAGCTTG GG 606
Ala Tyr Gln Val Arg Asn 195
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 643 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid
(vi) ORIGINAL SOURCE:
(A) DESCRIPTION: synthetic hepatitis C virus nucleocapsid protein gene
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 44..631
(D) OTHER INFORMATION: /function= "other nucleic acid" /product= "synthetic hepatitis C virus nucleocapsid protein gene" /note= "T7 promoter: nt [7-21]; transcription start site: nt [25]; A-rich region: approximately nt 59-111; restriction endonuclease recognition sites, nt approximately: SnaBI: 1, EcoRV: 27, NdeI: 41, HgiAI: 46, SmaI: 115, AvrII: 167, Eco47III: 203, AccI: 222, DdeI: 254, BstEII: 280, SacII: 343, ClaI: 410, BbeI: 464, NcoI: 499, XbaI: 513, Tth111I: 521, AsuII: 623"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
TACGTAATAA TACGACTCAC TATAGGGATA TCAAGGAGGT CAT ATG AGC ACG ATT 55
Met Ser Thr Ile 1
CCT AAA CCA CAA AGA AAA ACC AAA CGT AAC ACC AAT CGA CGA CCA CAA 103
Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn Arg Arg Pro Gln 5 10 15 20
GAT GTA AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT GGA GTC TAT TTG 151
Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly Gly Val Tyr Leu 25 30 35
CTT CCA CGG AGG GGA CCT AGG TTG GGT GTG CGC GCG ACC AGG AAG ACT 199
Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala Thr Arg Lys Thr 40 45 50
TCC GAG CGC TCG CAA CCA CGA GGT AGA CGA CAG CCT ATC CCT AAA GTG 247
Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Val 55 60 65
CGT CGG CCT GAG GGC AGG ACC TGG GCT CAA CCC GGT TAC CCC TGG CCC 295
Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro 70 75 80
CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG CTC CTG TCT CCC 343
Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro 85 90 95 100
CGC GGC TCT CGA CCG AGT TGG GGA CCC ACT GAT CCG CGA CGT AGG TCG 391
Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser 105 110 115
CGA AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC GCC GAC 439
Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp 120 125 130
CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT CTT GGA GGT GCC GCC 487
Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala 135 140 145
AGA GCC CTT GCC CAT GGA GTA CGG GTT CTA GAA GAC GGC GTC AAT TAT 535
Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp Gly Val Asn Tyr 150 155 160
GCA ACA GGA AAC CTT CCT GGT TGC TCT TTC TCC ATC TTC CTT CTG GCC 583
Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala 165 170 175 180
CTG CTC TCT TGC CTG ACT GTG CCC GCT TCA GCC TAC CAA GTT CGA AAC 631
Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr Gln Val Arg Asn 185 190 195
TAAAAGCTTG GG 643
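The annotations of SEQ ID NO: 15 can be cross-checked programmatically. The sketch below is a minimal check and not part of the original disclosure: it translates the CDS at positions 44..631 with the standard genetic code and looks for several of the unambiguous restriction sites near the approximate positions given in the feature note. The recognition sequences are the standard ones for each enzyme; the function names and the tolerance of 2 nt are illustrative choices.

```python
# SEQ ID NO: 15, copied from the listing (spacing is removed by split/join).
SEQ_ID_15 = "".join("""
TACGTAATAA TACGACTCAC TATAGGGATA TCAAGGAGGT CAT ATG AGC ACG ATT
CCT AAA CCA CAA AGA AAA ACC AAA CGT AAC ACC AAT CGA CGA CCA CAA
GAT GTA AAG TTC CCG GGT GGC GGT CAG ATC GTT GGT GGA GTC TAT TTG
CTT CCA CGG AGG GGA CCT AGG TTG GGT GTG CGC GCG ACC AGG AAG ACT
TCC GAG CGC TCG CAA CCA CGA GGT AGA CGA CAG CCT ATC CCT AAA GTG
CGT CGG CCT GAG GGC AGG ACC TGG GCT CAA CCC GGT TAC CCC TGG CCC
CTC TAT GGC AAT GAG GGC TGC GGG TGG GCG GGA TGG CTC CTG TCT CCC
CGC GGC TCT CGA CCG AGT TGG GGA CCC ACT GAT CCG CGA CGT AGG TCG
CGA AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC GCC GAC
CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT CTT GGA GGT GCC GCC
AGA GCC CTT GCC CAT GGA GTA CGG GTT CTA GAA GAC GGC GTC AAT TAT
GCA ACA GGA AAC CTT CCT GGT TGC TCT TTC TCC ATC TTC CTT CTG GCC
CTG CTC TCT TGC CTG ACT GTG CCC GCT TCA GCC TAC CAA GTT CGA AAC
TAAAAGCTTG GG
""".split())

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Enzyme: (recognition sequence, approximate position from the feature note).
SITES = {
    "SnaBI": ("TACGTA", 1),
    "EcoRV": ("GATATC", 27),
    "NdeI": ("CATATG", 41),
    "SmaI": ("CCCGGG", 115),
    "AvrII": ("CCTAGG", 167),
    "Eco47III": ("AGCGCT", 203),
    "SacII": ("CCGCGG", 343),
    "ClaI": ("ATCGAT", 410),
    "NcoI": ("CCATGG", 499),
    "XbaI": ("TCTAGA", 513),
}


def translate(seq, start, end):
    """Translate the 1-based, inclusive region start..end of seq."""
    cds = seq[start - 1:end]
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def site_positions(seq, site):
    """Return all 1-based start positions of site in seq."""
    return [i + 1 for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def check_sites(seq, sites, tolerance=2):
    """Map each enzyme to True if a site lies within tolerance of its listed position."""
    return {
        name: any(abs(p - listed) <= tolerance for p in site_positions(seq, site))
        for name, (site, listed) in sites.items()
    }


if __name__ == "__main__":
    protein = translate(SEQ_ID_15, 44, 631)
    print(len(protein), protein)
    print(check_sites(SEQ_ID_15, SITES))
```

The translation should reproduce the 196-residue protein shown in the listing, followed by the TAA stop codon at positions 632-634.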

Claims

What is claimed is:
1. A vector comprising, in sequential order 5' to 3': a) a Shine-Dalgarno nucleotide sequence having a unique restriction endonuclease site and b) a synthetically produced protein coding nucleotide sequence having
i) a translation start codon; ii) a sequence about equidistant 3' of the translation start codon as the Shine-Dalgarno sequence is 5', which selectively hybridizes with the Shine-Dalgarno sequence to form a hairpin loop wherein the start codon is exposed in the loop such that translation is efficiently initiated; and iii) an adenosine-containing region of a length sufficient to substantially prevent secondary structure of the vector in the region immediately downstream of the hairpin loop.
2. The vector of Claim 1, further comprising, after the adenosine-containing region, a restriction endonuclease site not found in the DNA encoding the native protein.
3. The vector of Claim 2, wherein the coding nucleotide sequence encodes hepatitis C virus nucleocapsid protein.
4. A vector consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 14, or a unique portion thereof.
5. A vector consisting essentially of the nucleotides in the sequence set forth in SEQ ID NO: 15, or a unique portion thereof.
6. A nucleic acid which selectively hybridizes to the vector of Claim 5.
7. The vector of Claim 1 in a cell which can express a protein encoded by the vector.
8. A method of producing a protein comprising placing the cell of Claim 7 in protein expressing conditions and collecting the protein produced by the cell.
9. The vector of Claim 5 in a cell which can express a protein encoded by the vector.
10. A method of producing a protein comprising placing the cell of Claim 9 in protein expressing conditions and collecting the protein produced by the cell.
PCT/US1994/012166 1993-10-25 1994-10-25 PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN $i(E. COLI) WO1995011980A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU80887/94A AU8088794A (en) 1993-10-25 1994-10-25 Plasmids for efficient expression of synthetic genes in (e. coli)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14191793A 1993-10-25 1993-10-25
US08/141,917 1993-10-25

Publications (2)

Publication Number Publication Date
WO1995011980A2 true WO1995011980A2 (en) 1995-05-04
WO1995011980A3 WO1995011980A3 (en) 1995-06-15

Family

ID=22497805

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1994/012166 WO1995011980A2 (en) 1993-10-25 1994-10-25 PLASMIDS FOR EFFICIENT EXPRESSION OF SYNTHETIC GENES IN $i(E. COLI)

Country Status (2)

Country Link
AU (1) AU8088794A (en)
WO (1) WO1995011980A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1989000604A1 (en) * 1987-07-13 1989-01-26 Interferon Sciences, Inc. Method for improving translation efficiency
WO1992013070A1 (en) * 1991-01-25 1992-08-06 United States Biochemical Corporation Regulation of nucleic acid translation
WO1993019202A2 (en) * 1992-03-10 1993-09-30 The United States Of America, As Represented By The Secretary, Department Of Health & Human Services Exchangeable template reaction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEMICAL ABSTRACTS, vol. 108, no. 7, 15 February 1988 Columbus, Ohio, US; abstract no. 50202m, Y.E. KHUDYAKOV ET AL. 'Correlation between the efficiency of translation initiation and hybrid gene cro-lacIZ mRNA secondary structure' page 172; column l; & MOL. BIOL. (MOSCOW), vol. 21,no. 6, 1987 pages 1504-1512, *
NUCLEIC ACIDS RESEARCH, vol. 21,no. 11, June 1993 IRL PRESS LIMITED,OXFORD,ENGLAND, pages 2747-2754, Y.E. KHUDYAKOV ET AL. 'Synthetic gene for the hepatitis C virus nucleocapsid protein' *

Also Published As

Publication number Publication date
WO1995011980A3 (en) 1995-06-15
AU8088794A (en) 1995-05-22

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW NL NO NZ PL PT RO RU SD SE SI SK TJ TT UA UZ VN

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): KE MW SD SZ AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

AK Designated states

Kind code of ref document: A3

Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW NL NO NZ PL PT RO RU SD SE SI SK TJ TT UA UZ VN

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): KE MW SD SZ AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: CA