EP1012314A1 - A vector and method for preparation of DNA libraries - Google Patents

A vector and method for preparation of DNA libraries

Info

Publication number
EP1012314A1
EP1012314A1 EP98930330A EP98930330A EP1012314A1 EP 1012314 A1 EP1012314 A1 EP 1012314A1 EP 98930330 A EP98930330 A EP 98930330A EP 98930330 A EP98930330 A EP 98930330A EP 1012314 A1 EP1012314 A1 EP 1012314A1
Authority
EP
European Patent Office
Prior art keywords
dna
vector
primer
complementary
cdna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98930330A
Other languages
German (de)
French (fr)
Inventor
John Seed
Brian Seed
Su Wen Qian
Reyes Candau
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Edge Biosystems Inc
Original Assignee
Edge Biosystems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Edge Biosystems Inc

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096: Processes for the isolation, preparation or purification of DNA or RNA; cDNA Synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/66: General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85: Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/10: Plasmid DNA
    • C12N2800/108: Plasmid DNA episomal vectors
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2840/00: Vectors comprising a special translation-regulating system
    • C12N2840/20: Vectors comprising a special translation-regulating system translation of more than one cistron

Definitions

  • the invention relates to the field of recombinant DNA. In particular, it relates to the field of DNA libraries.
  • The generation of high quality cDNA libraries is essential for gene identification strategies based on high throughput sequencing, on phenotypic expression in bacteria, yeast, or mammalian cells, on identification of interaction partners through the yeast two-hybrid system, or on recovery of cDNAs cognate to rare mRNAs. All of the varieties of expression cloning, for example, depend on the creation of, or access to, high quality cDNA libraries. Some of the key features of a high quality library include a large number of independent clones (preferably greater than 10^7), a high percentage of inserts, oriented insertion of the genes, a high percentage of full length gene sequences and a population of clones which is representative of the starting population of RNAs.
  • a typical strategy is to reverse transcribe the target RNA using an oligo-dT primer which comprises at its 5' end a restriction endonuclease recognition site. After preparing the complementary strand, adaptors are ligated onto the double stranded cDNA. These adaptors generate ends which are not complementary to the ends resulting from restriction of the site incorporated in the 5' end of the primer.
  • the cDNA is then cut with this enzyme, purified to remove fragments and ligated into a suitably cleaved vector.
  • the enzymes used in this strategy (most commonly Not I) restrict DNA with low frequency.
  • this enzyme is known to restrict DNA within coding elements and several important genes have been identified (from non-oriented libraries) that contain Not I restriction sites.
  • the vector is cut with one restriction enzyme, dephosphorylated, then cut with two additional restriction enzymes to produce a phosphorylated vector with non-complementary ends which can then be used in high-efficiency ligations with phosphorylated inserts.
  • this technique leads to ends which can still ligate to one another to produce vector dimers or insert dimers.
  • the most efficient incorporation of cDNA is by methods that produce non self-complementary ends on both cDNA and vector. However, methods described to date do not allow directional insertion of cDNA. In yet another method to reduce background, toxic elements have been incorporated into vectors.
  • vectors are designed so that the cDNA insertion site resides within the promoter or coding elements of inducible genes which express products toxic to the host cell in the presence of an inducing agent. If cDNA is inserted in the vector, then the toxic gene will be interrupted and the host containing this plasmid will survive in the presence of inducer whereas host cells containing plasmid vector without insert will die. However, the toxicity may be incomplete, resulting in a number of slow growing clones without insert. In addition, traces of nuclease activity can contribute substantially to background by changing the reading frame of restricted plasmid prior to ligation. Thus there is a need in the art for better methods of preparing oriented, representational cDNA libraries with very low backgrounds.
  • One aspect of this invention is a nucleotide polymer comprising two elements, one that binds to the nucleotide elements of another nucleic acid polymer and an element that comprises the recognition and cleavage site for an endonuclease.
  • the binding elements include by way of example, but without limitation, random hexamers, random nonamers, homopolymers (deoxythymidine homopolymers in particular), sequence-specific nucleotides which bind to specific nucleic acid polymers, sequence-specific nucleotides which bind to known types or classes of nucleic acid polymers, and combinations of the above.
  • the polymer of the invention is prepared by standard chemical or biological methods (Itakura, et al. 53 Ann. Rev. Biochem.
  • this nucleotide polymer element is variable and is dependent on conditions affecting the specificity of the binding and is typically between 4 and 200 nucleotides in length, with a preferred length between 6 and 30 nucleotides.
  • the element of the nucleotide polymer of the invention that comprises the recognition and cleavage site for an endonuclease is comprised of a nucleotide sequence that forms one complementary strand of a double stranded DNA sequence that is bound by an endonuclease and is cleaved by it.
  • This nucleotide sequence is recognized only by endonucleases that cleave the coding elements of target genomes with a frequency less than 100 times per genome or by endonucleases which do not cleave cDNA.
  • the target genome is the DNA which comprises the source of the nucleic acid polymer to which the nucleotide polymer of the invention binds.
  • Examples of endonucleases that cleave the coding elements of target genomes with a frequency of less than 100 times per genome are the intron endonucleases. These endonucleases are intron-encoded enzymes that, under optimal conditions, recognize and cleave asymmetric DNA sequences of unusual length (14 - 31 bp). Intron endonucleases include by way of example, but without limitation, PI-Sce I (VDE), I-Ceu I, I-Tli I and I-Ppo I. Optimum conditions are those which are known to provide maximum specificity and enzyme activity. The preferred intron endonuclease of the invention is VDE.
  • This enzyme has an unusually long asymmetric recognition and cleavage site, with only 1 known recognition and cleavage site in S. cerevisiae, and none in E. coli.
  • the wildtype recognition and cleavage sequence for this enzyme is 5'-TATGTCGGGTGCGGAGAAAGAGGTAATGAAA-3' (Gimble and Wang, 263 J. Mol. Biol. 163-180 (1996)). It is known that this sequence is somewhat degenerate, i.e. base changes at certain locations enhance, decrease or have no impact on binding or cleavage. For example, substitutions of T for C at position -1 and G for C at position 6 generate a site which is more readily cleaved by VDE than the wild type.
  • substitutions can be introduced at sites with known degeneracy to facilitate cleavage of the sequence, increase the GC content of the 3' overhang or introduce other changes as may be deemed valuable by those of skill in the art.
  • substitution of Mn for Mg decreases the specificity of the enzyme. Consequently, changes to this sequence or to enzyme reaction conditions that preserve the functionality of the endonuclease (binding and cleavage) and that do not increase the frequency of cleavage to greater than 100 times per genome are within the scope of this invention.
  • Methylation-specific endonucleases: these enzymes recognize only methylated DNA, and because cDNA or in vitro amplified DNA is not normally methylated (unless methyl nucleotides are deliberately introduced), they will not cleave these DNAs.
  • the nucleotide sequence of the invention is methylated by chemical or enzymatic methods to produce a site which is cleaved by the methylation-specific endonuclease to produce a unique end.
  • a nucleotide sequence comprising overlapping Cla I sites can be methylated with Cla I methylase to produce a Dpn I recognition and cleavage site. Because this enzyme recognizes only methylated DNA, it will not cleave cDNA or in vitro amplified DNA.
  • This example is provided only to illustrate the means by which those of skill in the art may identify and modify nucleotide sequences which serve as unique binding sites for endonucleases which recognize and cleave only modified nucleotides.
  • Other elements may be added to the nucleotide polymer of the invention as may be considered useful by those of skill in the art. These include by way of example, but without limitation, recognition and cleavage sites for other restriction endonucleases, sequences complementary to nucleotide polymers commonly used for sequencing, and recognition sites for DNA binding proteins.
  • restriction and cleavage site for an endonuclease can be added to an existing DNA or RNA by techniques such as ligation of a double stranded oligonucleotide comprising the sequence for the restriction site, or by priming with a single stranded oligonucleotide comprising the sequence for the restriction site followed by extension of the first strand and synthesis of second strand, to create a duplex DNA which can be recognized and cleaved by the enzyme.
  • the principal advantage of this invention is that it allows insertion of DNA into vectors without risk of cleaving the DNA which is being inserted. It is particularly useful in orienting
  • adaptors are ligated to the inserted DNA before ligating the inserted DNA to the DNA sequence within which the DNA is to be inserted (vector).
  • the vector comprises within the DNA insertion site one or more recognition and cleavage sites for an endonuclease which has less than 100 recognition and cleavage sites within the genome of the insert (rare endonuclease).
  • the adaptors that are ligated to the insert DNA comprise either the full sequence of the recognition and cleavage site for the rare endonuclease of the invention, which must then be cleaved prior to insertion into the vector, or part of the recognition and cleavage site of the rare endonuclease which when ligated with the vector reconstitutes the full recognition and cleavage site for the rare endonuclease of the invention.
  • the recognition and cleavage site for the rare endonuclease of the invention if not already present, may be added to the vector using the same strategies as described for the insert DNA. When the insert of the method is ligated to the vector of the method, one or more recognition and cleavage sites for the rare endonuclease of the invention are regenerated.
  • This vector has an element located between two polylinker sites which comprises a conditionally lethal gene. The region between the two polylinker sites also constitutes the cDNA insertion site. If the lethal gene is not removed during the process of vector preparation, host cells which are susceptible to the lethal gene product will be killed when transformed with this construct.
  • the conditionally lethal gene is the kilA gene from the broad host range plasmid RK2. In host strains lacking the repressor genes korA and korB, the kilA gene is expressed and kills the host (Kornacki et al.,
  • vector DNA containing the kilA gene segment is prepared in good yield by standard methods from E. coli strains which constitutively express korA and korB.
  • a normal transformation frequency on the order of 10^9 colonies per ug DNA using electrocompetent cells
  • the transformation frequency of kor⁻ bacteria is between 1 and 10 colonies per ug DNA. This provides extremely powerful selection against vector.
  • the CMV promoter of commonly used expression vectors is replaced with the more powerful and stable broad host range promoter from EF1α.
  • the polylinker region of the vector is flanked by endonuclease sites which when cleaved, yield non-complementary ends.
  • One of these ends is complementary to the cleavage product of an endonuclease that does not cleave cDNA.
  • the ends are complementary to the cleavage product of an intron endonuclease.
  • the preferred endonuclease is VDE.
  • Other sites downstream of the insertion site are provided to confer properties of improved stability and expression in transfected mammalian cells. These include the EBNA-1 transcription unit, SV40 and human growth hormone polyadenylation signal sequences, human IgG1 H/CH2 splice site, the Epstein-Barr virus OriP, puromycin acetyl transferase gene and transcription terminators.
  • These preferred transcription termination sites are bidirectional, e.g. those from circular viral genome, such as those of the papova virus family, or synthetic bidirectional polyadenylation and termination signals (Figure 1)
  • Such libraries are prepared without the use of bacteriophage or bacteriophage vectors.
  • the vector of this method is a plasmid with two or more endonuclease recognition and cleavage sites which when cleaved by one or more endonucleases, give noncomplementary ends.
  • these sites are recognized by a single endonuclease with a degenerate cleavage site which can provide an overhang of 4 or more bases with two or more deoxyguanosines and/or deoxycytidines.
  • the preferred site in this method is recognized and cleaved by Bst XI.
  • first strand synthesis is initiated with a nucleotide primer which binds to the target nucleic acid.
  • these include, by way of example, but without limitation, random hexamers, random nonamers, homopolymers (deoxythymidine homopolymers in particular), sequence-specific nucleotides which bind to specific nucleic acid polymers, sequence-specific nucleotides which bind to known types or classes of nucleic acid polymers, and combinations of the above.
  • Another element of the primer is the recognition and cleavage site for an endonuclease which is either not found in cDNA or found in very low frequency in target genomes.
  • endonuclease recognition and cleavage sites that are not found in cDNA include by way of example but without limitation, the sites for intron endonucleases VDE, I-Ceu I, I-Tli I and I-Ppo I as well as for the methylation specific enzyme Dpn I.
  • the preferred site is recognized and cleaved by the intron endonuclease VDE (Gimble and Stephens, 1995; Gimble and Thorner, 1992; Gimble and Thorner, 1993; Gimble and Wang, 1996). Cleavage of this site leaves a 4 base, 3' overhang that is rich in GC content but not palindromic (GTGC).
  • First strand cDNA is synthesized by these or related enzymes according to standard procedures. This step is followed by second strand synthesis and ligation of phosphorylated, non-selfcomplementary adaptors.
  • Second strand synthesis is performed using a suitable DNA polymerase such as T4 DNA polymerase or E. coli DNA polymerase I, and priming strategies such as RNase H treatment.
  • Other enzymes such as thermostable polymerases, and/or other priming strategies may be used which are known to those of skill in the art.
  • Phosphorylated adaptors are selected, annealed and ligated to the cDNA essentially as described previously (Seed and Aruffo, 1987; Aruffo and Seed, 1987a).
  • the key feature of this method is that the adaptors are non-selfcomplementary, i.e. annealing of the two strands of the adaptor generates an overhang which is not complementary to itself.
  • the endonuclease site (introduced through the primer) is then cleaved, leaving nonidentical, noncomplementary ends. It is preferred that the endonuclease not cleave the cDNA or that it cleave with very low frequency.
  • the cDNA is fractionated and DNA greater than 0.5 - 1 kb in length is ligated to the vector and transformed into a suitable host.
  • the preferred method of fractionation is potassium acetate gradients, although size exclusion chromatography, other density gradients or other techniques known to those of skill in the art may be used.
  • the preferred vector of the method has a cDNA insertion site which comprises a toxic stuffer gene, preferably the kilA gene of the invention, and two flanking restriction sites, cleavage of which leaves non-selfcomplementary overhangs that are complementary to those of the adaptor and the cleaved intron endonuclease site on the cDNA.
  • the overhang which is complementary to the adaptor will be located at the 5' end of the cDNA in the preferred version.
  • vector-based strategies for producing low background may be coupled with the intron endonuclease/non-selfcomplementary adaptor strategy for producing large unbiased libraries.
  • such vectors must have phosphorylated, non-selfcomplementary ends which are complementary to the ends of the cDNA.
  • a preferred enzyme for restriction of the preferred vector is Bst XI. This enzyme has a degenerate cleavage site which facilitates the selection of overhangs which are complementary to more than one restriction site and which may be manipulated to have other useful features such as high GC content.
  • a single restriction digest can generate ends which are complementary to both the adaptor and the cleaved VDE site.
  • Other enzymes with degenerate cleavage sites known to those of skill in the art may be used to leave overhangs on the vector that are capable of annealing and ligating to the non-selfcomplementary adaptors described above and to the overhang generated by the intron endonuclease.
  • more than one enzyme may be used to generate the appropriate vector configuration. It is generally advantageous, but not essential, that the overhang created by the adaptor is complementary to the overhang generated by the enzyme used to cleave the insertion site of the vector.
  • the vector must be treated in such a way that non-selfcomplementary overhangs are created on the vector which are complementary to the non-selfcomplementary overhangs on the cDNA. This can be achieved by ligating adaptors to appropriate restriction cleavage site of the vector or by similar techniques known to those of skill in the art.
  • Alternative strategies which require the use of two enzymes or multiple manipulations of the vector are not preferred because they increase cost and the extra manipulations tend to reduce efficiency of library preparation.
  • the plasmid vector containing the kilA gene is amplified in a korA, korB strain such as MC1061/p3 and isolated by standard methods.
  • the purified plasmid is then digested with restriction enzymes that flank the kilA gene fragment.
  • the kilA gene fragment is then separated from the vector by methods such as gel electrophoresis, size exclusion chromatography, density gradient centrifugation and other techniques known to those of skill in the art.
  • the vector is then ready for further manipulations or immediate ligation of cDNA.
  • In this method, the same approach used to prepare oriented cDNA inserts is used generically to orient DNA in plasmids and other DNA vectors used in the cloning and manipulation of DNA, with little or no risk of cleaving the inserted DNA.
  • a cDNA library with 1.1 x 10^8 recombinant clones was prepared from human small intestine.
  • the cDNA was prepared from 2.5 ug of poly(A)+ RNA using an oligo-dT-VDE primer comprising 18 bases of dT and 66 bases including 31 bases of the original VDE recognition site (underlined).
  • the sequence of the primer is: 5'-CGACGTTGTAAAACGACGGCCAGTGAATTCTC TATGTCGGGTGCGGAGAAAGAGGTAATGAAA TACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT-3'
  • the poly(A)+ RNA was diluted with DEPC-H2O to 22 ul and mixed with 2 ul of the primer (1 ug/ul). The mixture was incubated at 70°C for 10 minutes and transferred to ice for 5 minutes. The following components were added to the sample for the first strand cDNA synthesis: 8 ul of 5x 1st strand buffer [250 mM Tris-HCl (pH 8.3), 375 mM KCl, 15 mM MgCl2], 4 ul of 10 mM dNTP, 2 ul of 0.1 M DTT, 1 ul (40 U) of RNase inhibitor (Life Technologies, Inc., Gaithersburg, MD) and 1 ul (36 U) of AMV RT-XL (Life Science Research Products, Orlando,
  • the sample was incubated at 42°C for 1 hr and 70°C for 10 minutes.
  • the second strand cDNA was prepared by adding the following reagents to the first strand cDNA sample: 70 ul of H2O, 30 ul of 5x 2nd strand buffer [100 mM Tris-HCl (pH 6.9), 450 mM KCl, 23 mM MgCl2, 0.75 mM β-NAD+, 50 mM (NH4)2SO4], 2 ul of 0.1 M DTT, 3 ul of 10 mM dNTP, 0.5 ul (2 U) of RNase H (Life
  • E. coli DNA polymerase I (Life Technologies, Inc., Gaithersburg, MD)
  • E. coli DNA ligase (Life Technologies, Inc., Gaithersburg, MD)
  • the sample was incubated for 2 hours at 16°C, then 5 ul (5 U) of T4 DNA polymerase (Boehringer Mannheim, Indianapolis, IN) was added and the sample was incubated for 5 minutes at 16°C.
  • the reaction was terminated by adding 10 ul of 0.5 M EDTA.
  • the cDNA was purified by phenol extraction and ethanol precipitation.
  • the ligation of the Bst XI adaptor to the cDNA was performed by adding the following reagents to 20 ul of the cDNA sample in H2O: 10 ul of phosphorylated Bst XI adaptors (5'-CTGGCTCA-3'; 5'-TGAGCCAGCCCC-3') (10 ug), 10 ul of ligation buffer [330 mM Tris-HCl, 50 mM MgCl2, 5 mM ATP], 7 ul of 0.1 M DTT and 5 ul (5 U) of T4 DNA ligase (Life Technologies, Inc., Gaithersburg, MD) and incubating overnight at 16°C.
  • the cDNA was purified by phenol extraction and ethanol precipitation, then digested with 20 units of VDE (New England Biolabs, Beverly, MA) at 37°C for 6 hours.
  • the cDNA was fractionated on a potassium acetate gradient (5 - 20%) for 3 hours at 50,000 rpm in a Beckman L5-50 centrifuge using an SW-50 rotor.
  • the cDNA fragments greater than 700 bases were collected, concentrated by ethanol precipitation and dissolved in 50 ul of TE (10 mM Tris-HCl, pH 8.0, 1 mM EDTA).
  • the pEAK8 vector was digested with Bst XI using manufacturer's recommended procedures (New England Biolabs, Beverly, MA) and purified on a potassium acetate gradient as above to remove the kilA stuffer. After ethanol precipitation, the vector was dissolved in TE at 50 ng/ul. Mix 47 ul of the vector with 47 ul of the fractionated cDNA, 258 ul of H2O, 94 ul of 5x T4 DNA ligase buffer (Life Technologies, Inc., Gaithersburg, MD) and 24 ul of ligase (1 U/ul) (Life
  • the average size of the cDNA inserts was analyzed by extracting the plasmid DNA from 24 randomly selected colonies and digesting it with the restriction enzymes flanking the cDNA insert.
  • the primary cDNA library for human small intestine contains 1.1 x 10^8 primary transformants with the average size of the cDNA inserts at 2.3 kb.
  • the vector (without insert) background in this library is less than 1%.
  • a cDNA library with 2.7 x 10^8 recombinant clones was constructed from human fetal kidney RNA.
  • the cDNA was prepared from 5.0 ug of poly(A)+ RNA using another oligo-dT-VDE primer comprising 18 bases of dT and 60 bases including 28 bases of the modified VDE recognition site (underlined).
  • a variant with modification of two bases C to T at position -1 and C to G at position 6 of the original version
  • deletion of three deoxyadenines at the 3' of the original version leads to a site that can also be cleaved by VDE enzyme.
  • the sequence of the primer is :
  • the cDNA library for human fetal kidney was prepared with a protocol similar to the one above, except for the following changes: (1) use of the modified oligo-dT-VDE primer; (2) 2 hours of VDE digestion.
  • This library contains 2.7 x 10^8 primary clones with average size of the cDNA inserts at 1.3 kb and less than 1% of background.
  • pEAK10 was prepared from pEAK8 by deleting an inhibitory regulatory sequence present in the EF1α promoter of pEAK8 and removing the BspLU11 I site in the EF1α promoter.
  • the protein expression levels from pEAK10 are 50% higher than those for pEAK8.
  • the LacZ gene from E. coli was cloned in pEAK10 and in three other different commercial vectors and the resulting plasmids were transfected into 293 HEK cells expressing the EBNA-1 protein and the large T-antigen (293 EBNA-T).
  • the amount of recombinant protein expressed in 293 EBNA-T cells transfected with pEAK10 was at least three fold higher than when the same cells were transfected with any of the other plasmids.
  • LacZ was cloned by standard methods (Current Protocols in Molecular Biology, Vol 1, Ausubel, et al., Eds, John Wiley & Sons, New York (1997)) into the Hind III-Not I sites of pEAK10 or pCDNA3.1 Hygro (+) [(Invitrogen, cat# V870-20) (this vector has the CMV promoter and the SV40 origin of replication)], pREP4 [(Invitrogen, cat# V004-50) (this vector has the RSV promoter, the EBNA-1 expression cassette and an Epstein-Barr virus origin of replication)] or pCEP4 [(Invitrogen, cat# V044-50) (this vector has the CMV promoter, the EBNA-1 expression cassette and an Epstein-Barr virus origin of replication)] to generate pEAK10-βgal, pCDNA3-βgal, pREP4-βgal or pCEP4-βgal.
  • 5 x 10^5 293 EBNA-T cells were plated in 60 mm Petri dishes containing 5 ml DMEM medium supplemented with 10% calf serum and incubated at standard conditions (37°C and 5% CO2) for 24 hours. The medium was then changed and the plates were incubated in the same conditions for two additional hours.
  • the cells were washed once with PBS and collected in 1 ml PBS, spun at 250 g for 5 minutes and resuspended in 100 ul 0.25 M Tris-HCl, pH 8. The cells were then lysed by three freeze-thaw (liquid nitrogen/37°C water bath) cycles and the insoluble material was pelleted at 12000 g for 5 minutes at 4°C.
  • VDE a site-specific endonuclease from the yeast Saccharomyces cerevisiae. Journal of Biological Chemistry 268, 21844-53.
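
The adaptor and overhang relationships described in the list above can be checked with a short script. This is a sketch only: it takes the two Bst XI adaptor oligonucleotides and the GTGC overhang exactly as printed above, assumes the short oligonucleotide anneals to the 5' end of the long one, and uses hypothetical function names that are not part of the original text.

```python
# Sequences are written 5' to 3', as printed in the text above.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overhang(short_oligo: str, long_oligo: str) -> str:
    """Return the unpaired 3' tail of long_oligo once short_oligo has annealed.

    Assumption: the short oligo pairs with the 5' end of the long oligo,
    leaving one blunt end and one single-stranded 3' tail.
    """
    paired = reverse_complement(short_oligo)
    if not long_oligo.startswith(paired):
        raise ValueError("short oligo does not pair with the 5' end of the long oligo")
    return long_oligo[len(paired):]


ADAPTOR_SHORT = "CTGGCTCA"      # from the adaptor ligation step above
ADAPTOR_LONG = "TGAGCCAGCCCC"   # from the adaptor ligation step above
VDE_OVERHANG = "GTGC"           # 3' overhang quoted for the cleaved VDE site

adaptor_overhang = three_prime_overhang(ADAPTOR_SHORT, ADAPTOR_LONG)
print("adaptor overhang:", adaptor_overhang)  # CCCC

for name, overhang in (("adaptor", adaptor_overhang), ("VDE", VDE_OVERHANG)):
    palindromic = overhang == reverse_complement(overhang)
    print(name, overhang, "self-complementary:", palindromic)

# The two cDNA ends should not be able to pair with each other either.
print("ends pair with each other:", adaptor_overhang == reverse_complement(VDE_OVERHANG))
```

Under these assumptions the adaptor leaves a four-base overhang that pairs neither with itself nor with the VDE overhang, which is consistent with the nonidentical, noncomplementary ends the method relies on for orientation.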
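
The ligation mix quoted in the list above can be checked with simple arithmetic. The sketch below assumes that the five volumes listed are the complete reaction (the source sentence is cut off after the ligase) and uses hypothetical names; it computes the total volume, the final buffer strength, and the vector concentration.

```python
# Volumes in microlitres, as listed for the vector/cDNA ligation above.
# Assumption: these five components make up the whole reaction.
LIGATION_UL = {
    "vector": 47,            # vector stock at 50 ng/ul
    "fractionated cDNA": 47,
    "water": 258,
    "5x ligase buffer": 94,
    "ligase": 24,            # ligase stock at 1 U/ul
}

VECTOR_STOCK_NG_PER_UL = 50


def final_strength(stock_fold: float, stock_ul: float, total_ul: float) -> float:
    """Final buffer strength (in 'x') after diluting a concentrated stock."""
    return stock_fold * stock_ul / total_ul


total_ul = sum(LIGATION_UL.values())
buffer_fold = final_strength(5, LIGATION_UL["5x ligase buffer"], total_ul)
vector_ng = VECTOR_STOCK_NG_PER_UL * LIGATION_UL["vector"]
vector_ng_per_ul = vector_ng / total_ul

print("total volume (ul):", total_ul)            # 470
print("final buffer strength (x):", buffer_fold)  # 1.0
print("vector in reaction (ng):", vector_ng)      # 2350
print("vector concentration (ng/ul):", vector_ng_per_ul)  # 5.0
```

That the 5x buffer comes out at exactly 1x is at least consistent with the listed volumes being the full mix, although the truncated sentence means this cannot be confirmed from the text alone.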

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Biomedical Technology (AREA)
  • Organic Chemistry (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

A method is presented for construction of a cDNA library in which the insert cDNA is flanked by linkers comprising recognition sites for an intron endonuclease. The insert may additionally comprise linkers comprising recognition sites for an endonuclease that generates non-self-complementary termini.

Description

A VECTOR AND METHOD FOR PREPARATION OF DNA LIBRARIES
Field of the Invention
The invention relates to the field of recombinant DNA. In particular, it relates to the field of DNA libraries.
Background to the Invention
The generation of high quality cDNA libraries is essential for gene identification strategies based on high throughput sequencing, on phenotypic expression in bacteria, yeast, or mammalian cells, on identification of interaction partners through the yeast two-hybrid system, or on recovery of cDNAs cognate to rare mRNAs. All of the varieties of expression cloning, for example, depend on the creation of, or access to, high quality cDNA libraries. Some of the key features of a high quality library include a large number of independent clones (preferably greater than 10^7), a high percentage of inserts, oriented insertion of the genes, a high percentage of full length gene sequences and a population of clones which is representative of the starting population of RNAs. Many of the vectors and systems currently used for preparation of cDNA libraries have features which inherently limit the capacity to produce high quality libraries. To achieve the desired goal of orientation in a cDNA library, it is necessary to prepare cDNA with non-complementary ends. A typical strategy is to reverse transcribe the target RNA using an oligo-dT primer which comprises at its 5' end a restriction endonuclease recognition site. After preparing the complementary strand, adaptors are ligated onto the double stranded cDNA. These adaptors generate ends which are not complementary to the ends resulting from restriction of the site incorporated in the 5' end of the primer. The cDNA is then cut with this enzyme, purified to remove fragments and ligated into a suitably cleaved vector. To minimize loss of gene sequences, the enzymes used in this strategy (most commonly Not I) restrict DNA with low frequency. However, this enzyme is known to restrict DNA within coding elements and several important genes have been identified (from non-oriented libraries) that contain Not I restriction sites.
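
As a rough illustration of why a longer recognition site restricts DNA less often, the sketch below estimates how frequently one specific site of a given length is expected in random sequence. It assumes equal base frequencies and independent positions, which real genomes do not satisfy, and the genome size is a hypothetical round number; the function names are illustrative only. Not I recognizes an 8 bp site, so this simple model gives one site per 4^8 = 65,536 bp on average, while a 31 bp site is not expected to occur by chance at all.

```python
def mean_spacing_bp(site_len: int) -> int:
    """Average distance between occurrences of one specific site of
    site_len bases in random sequence with equal base frequencies."""
    return 4 ** site_len


def expected_sites(genome_bp: int, site_len: int) -> float:
    """Expected number of occurrences of that site on one strand of a
    random sequence genome_bp bases long (same simplifying model)."""
    return genome_bp / mean_spacing_bp(site_len)


# Hypothetical genome size used only for illustration.
HYPOTHETICAL_GENOME_BP = 3_000_000_000

print(mean_spacing_bp(8))  # 65536
print(expected_sites(HYPOTHETICAL_GENOME_BP, 8))
print(expected_sites(HYPOTHETICAL_GENOME_BP, 31))
```

The model ignores base composition and sequence bias, so it indicates only the order of magnitude; it does not predict which particular genes contain a site.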
Another difficulty in the preparation of high quality libraries is background. For a variety of reasons, typical libraries have backgrounds (vector without insert) of 2 - 10%. Even though these backgrounds are sufficiently low for purposes such as expression cloning, they are unacceptable in high cost operations such as automated sequencing. Several strategies have been developed which reduce the incidence of background in cDNA libraries. The most common technique is to dephosphorylate the vector. A dephosphorylated vector cannot ligate to itself. By ligating a phosphorylated insert to a dephosphorylated vector, significant reductions in background can be obtained. However, vector dephosphorylation reduces ligation efficiency and thus library size. Furthermore, if the restriction enzyme used to cut the vector does not cut to completion, high backgrounds will result. In a variation on this strategy, the vector is cut with one restriction enzyme, dephosphorylated, then cut with two additional restriction enzymes to produce a phosphorylated vector with non-complementary ends which can then be used in high-efficiency ligations with phosphorylated inserts. However, this technique leads to ends which can still ligate to one another to produce vector dimers or insert dimers. The most efficient incorporation of cDNA is by methods that produce non self-complementary ends on both cDNA and vector. However, methods described to date do not allow directional insertion of cDNA. In yet another method to reduce background, toxic elements have been incorporated into vectors. These vectors are designed so that the cDNA insertion site resides within the promoter or coding elements of inducible genes which express products toxic to the host cell in the presence of an inducing agent. 
If cDNA is inserted in the vector, then the toxic gene will be interrupted and the host containing this plasmid will survive in the presence of inducer whereas host cells containing plasmid vector without insert will die. However, the toxicity may be incomplete, resulting in a number of slow growing clones without insert. In addition, traces of nuclease activity can contribute substantially to background by changing the reading frame of restricted plasmid prior to ligation. Thus there is a need in the art for better methods of preparing oriented, representational cDNA libraries with very low backgrounds.
Detailed Description of the Invention
One aspect of this invention is a nucleotide polymer comprising two elements: one that binds to the nucleotide elements of another nucleic acid polymer, and one that comprises the recognition and cleavage site for an endonuclease. The binding elements include by way of example, but without limitation, random hexamers, random nonamers, homopolymers (deoxythymidine homopolymers in particular), sequence-specific nucleotides which bind to specific nucleic acid polymers, sequence-specific nucleotides which bind to known types or classes of nucleic acid polymers, and combinations of the above. The polymer of the invention is prepared by standard chemical or biological methods (Itakura, et al. 53 Ann. Rev. Biochem. 323-356 (1984); Current Protocols in Molecular Biology, Vol 1, Ausubel, et al., Eds, John Wiley & Sons, New York (1997)) known to those of skill in the art. The length of the binding element is variable and is dependent on conditions affecting the specificity of the binding; it is typically between 4 and 200 nucleotides in length, with a preferred length between 6 and 30 nucleotides.
The element of the nucleotide polymer of the invention that comprises the recognition and cleavage site for an endonuclease is comprised of a nucleotide sequence that forms one complementary strand of a double stranded DNA sequence that is bound by an endonuclease and is cleaved by it. This nucleotide sequence is recognized only by endonucleases that cleave the coding elements of target genomes with a frequency less than 100 times per genome or by endonucleases which do not cleave cDNA. The target genome is the DNA which comprises the source of the nucleic acid polymer to which the nucleotide polymer of the invention binds. Examples of endonucleases that cleave the coding elements of target genomes with a frequency of less than 100 times per genome are the intron endonucleases. These endonucleases are intron-encoded enzymes that, under optimal conditions, recognize and cleave asymmetric DNA sequences of unusual length (14 - 31 bp). Intron endonucleases include by way of example, but without limitation, PI-Sce I (VDE), I-Ceu I, I-Tli I and I-Ppo I. Optimum conditions are those which are known to provide maximum specificity and enzyme activity. The preferred intron endonuclease of the invention is VDE. This enzyme has an unusually long asymmetric recognition and cleavage site, with only 1 known recognition and cleavage site in S. cerevisiae, and none in E. coli. By way of example, but without limitation, the wildtype recognition and cleavage sequence for this enzyme is 5'-TATGTCGGGTGCGGAGAAAGAGGTAATGAAA-3' (Gimble and Wang, 263 J. Mol. Biol. 163-180 (1996)). It is known that this sequence is somewhat degenerate, i.e. base changes at certain locations enhance, decrease or have no impact on binding or cleavage. For example, substitutions of T for C at position -1 and G for C at position 6 generate a site which is more readily cleaved by VDE than the wild type. 
Other substitutions can be introduced at sites with known degeneracy to facilitate cleavage of the sequence, increase the GC content of the 3' overhang or introduce other changes as may be deemed valuable by those of skill in the art. Furthermore, substitution of Mn for Mg decreases the specificity of the enzyme. Consequently, changes to this sequence or to enzyme reaction conditions that preserve the functionality of the endonuclease (binding and cleavage) and that do not increase the frequency of cleavage to greater than 100 times per genome are within the scope of this invention.
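By way of illustration only, the rarity of a long recognition site may be estimated with a simple calculation. The sketch below assumes a random sequence with equal base frequencies and an approximate genome size of 3 x 10⁹ bp; both are assumptions introduced here, not values from this specification, and the degeneracy of the VDE site noted above means the effective site length may be shorter than 31 bp. The function name is hypothetical.

```python
# Illustrative estimate only. Assumes a random genome with equal base
# frequencies, which real genomes do not satisfy exactly.

def expected_sites(genome_bp: float, site_bp: int) -> float:
    """Expected number of occurrences of one exact site of length
    site_bp when scanning one strand of a genome of genome_bp bases."""
    return genome_bp / 4 ** site_bp

GENOME_BP = 3.0e9  # assumed approximate genome size, not from the source

not_i_sites = expected_sites(GENOME_BP, 8)   # Not I recognizes 8 bp
vde_sites = expected_sites(GENOME_BP, 31)    # full 31 bp VDE site

print(f"8 bp site:  about {not_i_sites:,.0f} expected occurrences")
print(f"31 bp site: about {vde_sites:.1e} expected occurrences")
```

Under these assumptions an 8 bp site is expected tens of thousands of times, whereas an exact 31 bp site is expected far less than once, which is consistent with the threshold of fewer than 100 sites per genome stated above.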
Other enzymes whose recognition and cleavage sites are useful elements of the invention are those that recognize only modified nucleotides not normally found in cDNA or DNA amplified by in vitro technologies. One class of enzymes which meet this criterion are methylation-specific endonucleases. These enzymes recognize only methylated DNA, and because cDNA or in vitro amplified DNA is not normally methylated (unless methyl nucleotides are deliberately introduced), it will not cleave these DNAs. The nucleotide sequence of the invention is methylated by chemical or enzymatic methods to produce a site which is cleaved by the methylation-specific endonuclease to produce a unique end. By way of example, but without limitation, a nucleotide sequence comprising overlapping Cla I sites can be methylated with Cla I methylase to produce a Dpn I recognition and cleavage site. Because this enzyme recognizes only methylated DNA, it will not cleave cDNA or in vitro amplified DNA. This example is provided only to illustrate the means by which those of skill in the art may identify and modify nucleotide sequences which serve as unique binding sites for endonucleases which recognize and cleave only modified nucleotides.
Other elements may be added to the nucleotide polymer of the invention as may be considered useful by those of skill in the art. These include by way of example, but without limitation, recognition and cleavage sites for other restriction endonucleases, sequences complementary to nucleotide polymers commonly used for sequencing, and recognition sites for DNA binding proteins. It is known in the art that the restriction and cleavage site for an endonuclease can be added to an existing DNA or RNA by techniques such as ligation of a double stranded oligonucleotide comprising the sequence for the restriction site, or by priming with a single stranded oligonucleotide comprising the sequence for the restriction site followed by extension of the first strand and synthesis of second strand, to create a duplex DNA which can be recognized and cleaved by the enzyme. The principal advantage of this invention is that it allows insertion of DNA into vectors without risk of cleaving the DNA which is being inserted. It is particularly useful in orienting DNA within vectors with little or no risk of cleaving the DNA which is being oriented.
It is an object of this invention to provide a method for inserting one DNA sequence within another DNA sequence wherein there is little or no risk of cleaving the inserted DNA during subsequent manipulation of the DNA. In the method of the invention, adaptors are ligated to the inserted DNA before ligating the inserted DNA to the DNA sequence within which the DNA is to be inserted (vector). The vector comprises within the DNA insertion site one or more recognition and cleavage sites for an endonuclease which has less than 100 recognition and cleavage sites within the genome of the insert (rare endonuclease). The adaptors that are ligated to the insert DNA comprise either the full sequence of the recognition and cleavage site for the rare endonuclease of the invention, which must then be cleaved prior to insertion into the vector, or part of the recognition and cleavage site of the rare endonuclease which when ligated with the vector reconstitutes the full recognition and cleavage site for the rare endonuclease of the invention. The recognition and cleavage site for the rare endonuclease of the invention, if not already present, may be added to the vector using the same strategies as described for the insert DNA. When the insert of the method is ligated to the vector of the method, one or more recognition and cleavage sites for the rare endonuclease of the invention are regenerated.
It is another object of the invention to provide a vector with improved selection against clones lacking an insert. This vector has an element located between two polylinker sites which comprises a conditionally lethal gene. The region between the two polylinker sites also constitutes the cDNA insertion site. If the lethal gene is not removed during the process of vector preparation, host cells which are susceptible to the lethal gene product will be killed when transformed with this construct. In a preferred vector, the conditionally lethal gene is the kilA gene from the broad host range plasmid RK2. In host strains lacking the repressor genes korA and korB, the kilA gene is expressed and kills the host (Kornacki et al., 1993; Larsen and Figurski, 1994; Thomas et al., 1995; Thomson et al., 1993). On the other hand, vector DNA containing the kilA gene segment is prepared in good yield by standard methods from E. coli strains which constitutively express korA and korB. When the vector is transformed into a korA⁺, korB⁺ E. coli strain, a normal transformation frequency (~10⁹ colonies per ug DNA using electrocompetent cells) is observed. The transformation frequency of kor⁻ bacteria, on the other hand, is between 1 and 10 colonies per ug DNA. This provides extremely powerful selection against vector.
It is yet another object of this invention to provide an improved vector for cloning of mammalian genes, expression of proteins in mammalian cells and expression cloning. In the vector of the invention, the CMV promoter of commonly used expression vectors is replaced with the more powerful and stable broad host range promoter from EF1α. The polylinker region of the vector is flanked by endonuclease sites which, when cleaved, yield non-complementary ends. One of these ends is complementary to the cleavage product of an endonuclease that does not cleave cDNA. In a preferred form of the vector, the ends are complementary to the cleavage product of an intron endonuclease. The preferred endonuclease is VDE. Other sites downstream of the insertion site are provided to confer properties of improved stability and expression in transfected mammalian cells. These include the EBNA-1 transcription unit, SV40 and human growth hormone polyadenylation signal sequences, human IgG1 H/CH2 splice site, the Epstein-Barr virus OriP, puromycin acetyl transferase gene and transcription terminators. The preferred transcription termination sites are bidirectional, e.g. those from circular viral genomes, such as those of the papova virus family, or synthetic bidirectional polyadenylation and termination signals (Figure 1).
It is an object of this invention to provide a method for the preparation of oriented plasmid cDNA libraries with greater than 10⁸ primary transformants per ug of poly(A)+ RNA. Such libraries are prepared without the use of bacteriophage or bacteriophage vectors. The vector of this method is a plasmid with two or more endonuclease recognition and cleavage sites which, when cleaved by one or more endonucleases, give noncomplementary ends. In a preferred vector, these sites are recognized by a single endonuclease with a degenerate cleavage site which can provide an overhang of 4 or more bases with two or more deoxyguanosines and/or deoxycytidines. The preferred site in this method is recognized and cleaved by Bst XI.
In the method of the invention, first strand synthesis is initiated with a nucleotide primer which binds to the target nucleic acid. Suitable binding elements include, by way of example, but without limitation, random hexamers, random nonamers, homopolymers (deoxythymidine homopolymers in particular), sequence-specific nucleotides which bind to specific nucleic acid polymers, sequence-specific nucleotides which bind to known types or classes of nucleic acid polymers, and combinations of the above. Another element of the primer is the recognition and cleavage site for an endonuclease which is either not found in cDNA or found in very low frequency in target genomes. Examples of endonuclease recognition and cleavage sites that are not found in cDNA include by way of example, but without limitation, the sites for intron endonucleases VDE, I-Ceu I, I-Tli I and I-Ppo I as well as for the methylation specific enzyme Dpn I. The preferred site is recognized and cleaved by the intron endonuclease VDE (Gimble and Stephens, 1995; Gimble and Thorner, 1992; Gimble and Thorner, 1993; Gimble and Wang, 1996). Cleavage of this site leaves a 4 base, 3' overhang that is rich in GC content but not palindromic (GTGC). This is an important feature in the preparation of large size cDNA libraries as it prevents end-to-end ligation of cDNA during ligation with vector. The high GC content of the overhang increases the stability of overlapping complementary ends. This in turn enhances ligation efficiency, which in turn enhances library size. Substitutions of T for C at position -1 and G for C at position 6 generate a site which is more readily cleaved by VDE than the wild type. Other substitutions can be introduced at sites with known degeneracy to facilitate desired changes in the sequence. The primer of the invention is used to initiate reverse transcription. 
Many reverse transcriptases and some thermostable polymerases with reverse transcriptase activity have been described as being useful in the synthesis of first strand cDNA. First strand cDNA is synthesized by these or related enzymes according to standard procedures. This step is followed by second strand synthesis and ligation of phosphorylated, non-selfcomplementary adaptors. Second strand synthesis is performed using a suitable DNA polymerase such as T4 DNA polymerase or E. coli DNA polymerase I, and priming strategies such as RNase H treatment. Other enzymes, such as thermostable polymerases, and/or other priming strategies may be used which are known to those of skill in the art. Phosphorylated adaptors are selected, annealed and ligated to the cDNA essentially as described previously (Seed and Aruffo, 1987; Aruffo and Seed, 1987a). The key feature of this method is that the adaptors are non-selfcomplementary, i.e. annealing of the two strands of the adaptor generates an overhang which is not complementary to itself. By ligating the adaptors in large molar excess over the cDNA, end-to-end ligation of the cDNA is minimized. Although large amounts of end-to-end ligations of the adaptors occur, this is more than offset by efficient ligation of adaptors to the cDNA.
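By way of illustration only, the non-selfcomplementary property of an adaptor can be checked computationally. The sketch below uses the two Bst XI adaptor strands given in Example 1 (5'-CTGGCTCA-3' and 5'-TGAGCCAGCCCC-3'); the function names are hypothetical, and the annealing model (the short strand pairs with the 5' portion of the long strand, leaving one blunt end) is a simplifying assumption.

```python
# Illustrative sketch; function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]

def adaptor_overhang(short: str, long: str) -> str:
    """Anneal the short strand to the 5' portion of the long strand and
    return the unpaired 3' extension of the long strand."""
    paired = reverse_complement(short)
    if not long.startswith(paired):
        raise ValueError("strands do not anneal as assumed")
    return long[len(paired):]

def is_self_complementary(overhang: str) -> bool:
    """True if two copies of the overhang could anneal to each other."""
    return overhang == reverse_complement(overhang)

overhang = adaptor_overhang("CTGGCTCA", "TGAGCCAGCCCC")
print(overhang, is_self_complementary(overhang))
```

Under this model the adaptor of Example 1 presents a four base 3' extension that cannot pair with a second copy of itself, so adaptor-tagged cDNA ends cannot ligate to one another through that end.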
The endonuclease site (introduced through the primer) is then cleaved, leaving nonidentical, noncomplementary ends. It is preferred that the endonuclease not cleave the cDNA or that it cleave with very low frequency. The cDNA is fractionated and DNA greater than 0.5 - 1 kb in length is ligated to the vector and transformed into a suitable host. The preferred method of fractionation is potassium acetate gradients, although size exclusion chromatography, other density gradients or other techniques known to those of skill in the art may be used. The preferred vector of the method has a cDNA insertion site which comprises a toxic stuffer gene, preferably the kilA gene of the invention, and two flanking restriction sites, cleavage of which leaves non-selfcomplementary overhangs that are complementary to those of the adaptor and the cleaved intron endonuclease site on the cDNA. The overhang which is complementary to the adaptor will be located at the 5' end of the cDNA in the preferred version. This strategy has the benefit of providing oriented inserts, and cDNA and vector which are completely non-selfcomplementary. The cDNA cannot ligate to itself, nor can the vector, in any orientation. As a result the cDNA is assimilated into the vector with maximal efficiency, in the correct orientation, with low background.
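By way of illustration only, the compatibility rules described above can be expressed as a short check. The sketch takes the VDE overhang (GTGC) from the description and the adaptor overhang (CCCC) as read from the adaptor strands of Example 1; the vector overhangs are not stated in this specification and are computed here, as an assumption, as the reverse complements of the cDNA overhangs. All names are hypothetical.

```python
# Illustrative sketch; names are hypothetical, and the vector overhangs
# are an assumption (computed as complements of the cDNA overhangs).
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def can_anneal(a: str, b: str) -> bool:
    """Two 3' overhangs (written 5'->3') anneal if one is the
    reverse complement of the other."""
    return a == reverse_complement(b)

def gc_fraction(seq: str) -> float:
    return sum(base in "GC" for base in seq) / len(seq)

cdna_ends = ["CCCC", "GTGC"]  # adaptor end, VDE-cleaved end
vector_ends = [reverse_complement(e) for e in cdna_ends]

# No cDNA end pairs with any cDNA end, and likewise for the vector.
cdna_self = any(can_anneal(a, b)
                for a, b in combinations_with_replacement(cdna_ends, 2))
vector_self = any(can_anneal(a, b)
                  for a, b in combinations_with_replacement(vector_ends, 2))

# Each cDNA end pairs with exactly one vector end, fixing orientation.
partners = {c: [v for v in vector_ends if can_anneal(c, v)]
            for c in cdna_ends}

print(cdna_self, vector_self, partners, gc_fraction("GTGC"))
```

Because each cDNA end has exactly one possible partner on the vector and none on another cDNA molecule, the insert can enter in only one orientation under these assumptions.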
Other vector-based strategies for producing low background may be coupled with the intron endonuclease/non-selfcomplementary adaptor strategy for producing large unbiased libraries. To achieve the large library sizes afforded by the method of the invention, such vectors must have phosphorylated, non-selfcomplementary ends which are complementary to the ends of the cDNA. A preferred enzyme for restriction of the preferred vector is Bst XI. This enzyme has a degenerate cleavage site which facilitates the selection of overhangs which are complementary to more than one restriction site and which may be manipulated to have other useful features such as high GC content. Thus, by engineering the correct sequences into different Bst XI restriction sites flanking the insertion site, a single restriction digest can generate ends which are complementary to both the adaptor and the cleaved VDE site. Other enzymes with degenerate cleavage sites known to those of skill in the art may be used to leave overhangs on the vector that are capable of annealing and ligating to the non-selfcomplementary adaptors described above and to the overhang generated by the intron endonuclease.
Alternatively, more than one enzyme may be used to generate the appropriate vector configuration. It is generally advantageous, but not essential, that the overhang created by the adaptor is complementary to the overhang generated by the enzyme used to cleave the insertion site of the vector. However, the vector must be treated in such a way that non-selfcomplementary overhangs are created on the vector which are complementary to the non-selfcomplementary overhangs on the cDNA. This can be achieved by ligating adaptors to appropriate restriction cleavage site of the vector or by similar techniques known to those of skill in the art. Alternative strategies which require the use of two enzymes or multiple manipulations of the vector are not preferred because they increase cost and the extra manipulations tend to reduce efficiency of library preparation.
For purposes of preparing a cDNA library, the plasmid vector containing the kilA gene is amplified in a korA, korB strain such as MC1061/p3 and isolated by standard methods. The purified plasmid is then digested with restriction enzymes that flank the kilA gene fragment. The kilA gene fragment is then separated from the vector by methods such as gel electrophoresis, size exclusion chromatography, density gradient centrifugation and other techniques known to those of skill in the art. The vector is then ready for further manipulations or immediate ligation of cDNA.
It is an object of this invention to provide a method for orienting DNA inserts within vectors with little or no risk of cleaving the inserted DNA. In this method, the same approach used to prepare oriented cDNA inserts is used generically to orient DNA in plasmids and other DNA vectors used in the cloning and manipulation of DNA, with little or no risk of cleaving the inserted DNA.
Example 1. Preparation of unamplified cDNA library from human small intestine
A cDNA library with 1.1 x 10⁸ recombinant clones was prepared from human small intestine. The cDNA was prepared from 2.5 ug of poly(A)+ RNA using an oligo-dT-VDE primer comprising 18 bases of dT and 66 bases including 31 bases of the original VDE recognition site (underlined). The sequence of the primer is: 5'-CGACGTTGTAAAACGACGGCCAGTGAATTCTC TATGTCGGGTGCGGAGAAAGAGGTAATGAAA TACTTTTTTTTTTTTTTTTTT-3'
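By way of illustration only, the stated composition of this primer can be checked by assembling it from its parts. The segment names below (leader, spacer) are hypothetical labels introduced for this sketch; the dT tail is built from the stated count of 18 rather than read from the printed sequence.

```python
# Illustrative sketch; segment names are hypothetical labels.
LEADER = "CGACGTTGTAAAACGACGGCCAGTGAATTCTC"      # bases 5' of the VDE site
VDE_SITE = "TATGTCGGGTGCGGAGAAAGAGGTAATGAAA"     # wild type site, as given
SPACER = "TAC"                                   # bases between site and dT
DT_TAIL = "T" * 18                               # stated as 18 bases of dT

primer = LEADER + VDE_SITE + SPACER + DT_TAIL
non_dt_length = len(primer) - len(DT_TAIL)

print(len(VDE_SITE), non_dt_length, len(primer))
```

The lengths agree with the description: a 31 base recognition site within 66 non-dT bases, followed by 18 bases of dT.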
Detailed protocol: The poly(A)+ RNA was diluted with DEPC-H2O to 22 ul and mixed with 2 ul of the primer (1 ug/ul). The mixture was incubated at 70°C for 10 minutes and transferred to ice for 5 minutes. The following components were added to the sample for the first strand cDNA synthesis: 8 ul of 5x 1st strand buffer [250 mM Tris-HCl (pH 8.3), 375 mM KCl, 15 mM MgCl2], 4 ul of 10 mM dNTP, 2 ul of 0.1 M DTT, 1 ul (40 U) of RNase inhibitor (Life Technologies, Inc., Gaithersburg, MD) and 1 ul (36 U) of AMV RT-XL (Life Science Research Products, Orlando, FL). The sample was incubated at 42°C for 1 hr and 70°C for 10 minutes.
The second strand cDNA was prepared by adding the following reagents to the first strand cDNA sample: 70 ul of H2O, 30 ul of 5x 2nd strand buffer [100 mM Tris-HCl (pH 6.9), 450 mM KCl, 23 mM MgCl2, 0.75 mM β-NAD+, 50 mM (NH4)2SO4], 2 ul of 0.1 M DTT, 3 ul of 10 mM dNTP, 0.5 ul (2 U) of RNase H (Life Technologies, Inc., Gaithersburg, MD), 4 ul (40 U) of E. coli DNA polymerase I (Life Technologies, Inc., Gaithersburg, MD) and 1 ul (10 U) of E. coli DNA ligase (Life Technologies, Inc., Gaithersburg, MD). The sample was incubated for 2 hours at 16°C, then 5 ul (5 U) of T4 DNA polymerase (Boehringer Mannheim, Indianapolis, IN) was added and the sample was incubated for 5 minutes at 16°C. The reaction was terminated by adding 10 ul of 0.5 M EDTA. The cDNA was purified by phenol extraction and ethanol precipitation.
The ligation of the Bst XI adaptor to the cDNA was performed by adding the following reagents to 20 ul of the cDNA sample in H2O: 10 ul of phosphorylated Bst XI adaptors (5'-CTGGCTCA-3'; 5'-TGAGCCAGCCCC-3') (10 ug), 10 ul of ligation buffer [330 mM Tris-HCl, 50 mM MgCl2, 5 mM ATP], 7 ul of 0.1 M DTT and 5 ul (5 U) of T4 DNA ligase (Life Technologies, Inc., Gaithersburg, MD) and incubating overnight at 16°C. The cDNA was purified by phenol extraction and ethanol precipitation, then digested with 20 units of VDE (New England Biolabs, Beverly, MA) at 37°C for 6 hours. The cDNA was fractionated on a potassium acetate gradient (5 - 20%) for 3 hours at 50,000 rpm in a Beckman L5-50 centrifuge using an SW-50 rotor. The cDNA fragments greater than 700 bases were collected, concentrated by ethanol precipitation and dissolved in 50 ul of TE (10 mM Tris-HCl, pH 8.0, 1 mM EDTA).
The pEAK8 vector was digested with Bst XI using the manufacturer's recommended procedures (New England Biolabs, Beverly, MA) and purified on a potassium acetate gradient as above to remove the kilA stuffer. After ethanol precipitation, the vector was dissolved in TE at 50 ng/ul. 47 ul of the vector was mixed with 47 ul of the fractionated cDNA, 258 ul of H2O, 94 ul of 5x T4 DNA ligase buffer (Life Technologies, Inc., Gaithersburg, MD) and 24 ul of ligase (1 U/ul) (Life Technologies, Inc., Gaithersburg, MD). The ligation mix was incubated for 2 hours at room temperature. The DNA sample was desalted by ethanol precipitation and dissolved in 30 ul of H2O.
The DNA was electroporated into electrocompetent E. coli DH10B cells in 10 equal fractions, 0.3 ml per fraction. The transformed cells were incubated in SOC medium for 45 minutes before adding glycerol to 15% to make the frozen stock (-70°C) of the cDNA library. To check the library titer, 10 ul of the library stock was diluted 200 times and plated on ampicillin LB-agar plates. The average size of the cDNA inserts was analyzed by extracting the plasmid DNA from 24 randomly selected colonies and digesting it with the restriction enzymes flanking the cDNA insert. In summary, the primary cDNA library for human small intestine contains 1.1 x 10⁸ primary transformants with the average size of the cDNA inserts at 2.3 kb. The vector (without insert) background in this library is less than 1%.
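By way of illustration only, the reaction volumes given in this example can be checked for internal consistency. The sketch below sums the stated component volumes of the first strand reaction and of the vector ligation and computes the resulting buffer strength; the dictionary keys and function names are hypothetical labels, and no value beyond those stated in the protocol is assumed.

```python
# Illustrative arithmetic check; labels and function names are hypothetical.

def total_volume(components_ul: dict) -> float:
    """Sum of component volumes in microliters."""
    return sum(components_ul.values())

def final_strength(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final strength after dilution (same unit as the stock)."""
    return stock * volume_ul / total_ul

first_strand_ul = {
    "RNA in DEPC-H2O": 22, "primer": 2, "5x buffer": 8,
    "10 mM dNTP": 4, "0.1 M DTT": 2, "RNase inhibitor": 1, "AMV RT-XL": 1,
}
ligation_ul = {
    "vector": 47, "cDNA": 47, "H2O": 258, "5x ligase buffer": 94, "ligase": 24,
}

fs_total = total_volume(first_strand_ul)
fs_buffer_x = final_strength(5, first_strand_ul["5x buffer"], fs_total)
fs_dntp_mM = final_strength(10, first_strand_ul["10 mM dNTP"], fs_total)

lig_total = total_volume(ligation_ul)
lig_buffer_x = final_strength(5, ligation_ul["5x ligase buffer"], lig_total)
vector_ng = ligation_ul["vector"] * 50  # vector stock is 50 ng/ul

print(fs_total, fs_buffer_x, fs_dntp_mM, lig_total, lig_buffer_x, vector_ng)
```

Both reactions come out at 1x buffer strength (40 ul and 470 ul total, respectively), and the ligation contains 2350 ng of vector.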
Example 2. Preparation of unamplified cDNA library from human fetal kidney
A cDNA library with 2.7 x 10⁸ recombinant clones was constructed from human fetal kidney RNA. The cDNA was prepared from 5.0 ug of poly(A)+ RNA using another oligo-dT-VDE primer comprising 18 bases of dT and 60 bases including 28 bases of the modified VDE recognition site (underlined). For example, a variant with modification of two bases (C to T at position -1 and C to G at position 6 of the original version) and deletion of three deoxyadenines at the 3' end of the original version leads to a site that can also be cleaved by the VDE enzyme. The sequence of the primer is:
5'-CGACGTTGTAAAACGACGGCCAGTGAATTCTR TATGTGGGGTGCGGAGAAAGAGGTAATG TTTTTTTTTTTTTTTTTT-3'
The cDNA library for human fetal kidney was prepared with a protocol similar to that above, with the following changes: 1) use of the modified oligo-dT-VDE primer; 2) 2 hours of VDE digestion. This library contains 2.7 x 10⁸ primary clones with an average cDNA insert size of 1.3 kb and less than 1% background.
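By way of illustration only, the 28 base modified site of this example can be derived from the 31 base wild type site by applying the changes described. The sketch assumes that "position 6" is counted from the first base of the site as written (1-based); that numbering is an assumption made here. The change at position -1 falls in the flanking base outside the 28 base site and is not modeled. The function name is hypothetical.

```python
# Illustrative sketch; the position numbering is an assumption.
WILD_TYPE_SITE = "TATGTCGGGTGCGGAGAAAGAGGTAATGAAA"

def modify_site(site: str) -> str:
    """Apply C->G at position 6 (1-based, from the first base as written)
    and delete the three deoxyadenines at the 3' end."""
    bases = list(site)
    if bases[5] != "C":
        raise ValueError("expected C at position 6")
    bases[5] = "G"
    modified = "".join(bases)
    if not modified.endswith("AAA"):
        raise ValueError("expected three A residues at the 3' end")
    return modified[:-3]

modified_site = modify_site(WILD_TYPE_SITE)
print(modified_site, len(modified_site))
```

The derived sequence matches the 28 base site printed in the primer of this example.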
Example 3. High level protein expression in mammalian cells using pEAK10
pEAK10 was prepared from pEAK8 by deleting an inhibitory regulatory sequence present in the EF1α promoter of pEAK8 and removing the BspLU11 I site in the EF1α promoter. The protein expression levels from pEAK10 are 50% higher than those for pEAK8.
The LacZ gene from E. coli was cloned in pEAK10 and in three other commercial vectors, and the resulting plasmids were transfected into 293 HEK cells expressing the EBNA-1 protein and the large T-antigen (293 EBNA-T). The amount of recombinant protein expressed in 293 EBNA-T cells transfected with pEAK10 was at least three-fold higher than when the same cells were transfected with any of the other plasmids.
LacZ was cloned by standard methods (Current Protocols in Molecular Biology, Vol 1, Ausubel, et al., Eds, John Wiley & Sons, New York (1997)) into the Hind III-Not I sites of pEAK10 or pCDNA3.1 Hygro (+) [(Invitrogen, cat# V870-20) (this vector has the CMV promoter and the SV40 origin of replication)], pREP4 [(Invitrogen, cat# V004-50) (this vector has the RSV promoter, the EBNA-1 expression cassette and an Epstein-Barr virus origin of replication)] or pCEP4 [(Invitrogen, cat# V044-50) (this vector has the CMV promoter, the EBNA-1 expression cassette and an Epstein-Barr virus origin of replication)] to generate pEAK10-βgal, pCDNA3-βgal, pREP4-βgal or pCEP4-βgal. 5 x 10⁵ 293 EBNA-T cells were plated in 60 mm Petri dishes containing 5 ml DMEM medium supplemented with 10% calf serum and incubated at standard conditions (37°C and 5% CO2) for 24 hours. The medium was then changed and the plates were incubated in the same conditions for two additional hours. 3 ug of pEAK10-βgal (or pCDNA3-βgal, or pREP4-βgal, or pCEP4-βgal, three samples of each plasmid) were pipetted into a microtube and the following components were added in the following order: up to 225 ul water, 25 ul 2.5 M CaCl2, 250 ul [50 mM HEPES, pH 7.05, 1.26 mM^HPO^, 140 mM NaCl]. The mix was vortexed briefly, incubated at room temperature for one minute and then added dropwise to the cell cultures. After three hours of incubation at standard conditions, the medium was changed and the transfected cells were incubated for 1, 2 or 3 days. (A total of twelve experiments were done, resulting from the combination of four different plasmids at three expression times.)
To harvest the recombinant protein (β-galactosidase), the cells were washed once with PBS and collected in 1 ml PBS, spun at 250 g for 5 minutes and resuspended in 100 ul 0.25 M Tris-HCl pH 8. The cells were then lysed by three freeze-thaw (liquid nitrogen/37°C water bath) cycles and the insoluble material was pelleted at 12000 g for 5 minutes at 4°C.
β-galactosidase liquid assays were done by standard protocols (Current Protocols in Molecular Biology, Vol 1, Ausubel, et al., Eds, John Wiley & Sons, New York (1997)), and samples were quantified against a standard curve using purified E. coli β-galactosidase (Sigma, St. Louis, MO). The levels of β-galactosidase (expressed as % of β-galactosidase per total amount of protein) showed that pEAK10 was superior in protein expression to any of the other plasmids at any given time (Table I).
TABLE I: Comparison of expression levels of β-galactosidase (shown as % of β-galactosidase relative to total protein) in pEAK10 versus three other commercial plasmids at different time points.
Literature Cited
Aruffo, A., and Seed, B. (1987a). Molecular cloning of a CD28 cDNA by a high-efficiency COS cell expression system. Proceedings of the National Academy of Sciences of the United States of America 84, 8573-7.
Aruffo, A., and Seed, B. (1987). Molecular cloning of two CD7 (T-cell leukemia antigen) cDNAs by a COS cell expression system. EMBO Journal 6, 3313-6.
Brenneman, M., Gimble, F. S., and Wilson, J. H. (1996). Stimulation of intrachromosomal homologous recombination in human cells by electroporation with site-specific endonucleases. Proceedings of the National Academy of Sciences of the United States of America 93, 3608-12.
Gimble, F. S., and Stephens, B. W. (1995). Substitutions in conserved dodecapeptide motifs that uncouple the DNA binding and DNA cleavage activities of PI-SceI endonuclease. Journal of Biological Chemistry 270, 5849-56.
Gimble, F. S., and Thorner, J. (1992). Homing of a DNA endonuclease gene by meiotic gene conversion in Saccharomyces cerevisiae. Nature 357, 301-6.
Gimble, F. S., and Thorner, J. (1993). Purification and characterization of VDE, a site-specific endonuclease from the yeast Saccharomyces cerevisiae. Journal of Biological Chemistry 268, 21844-53.
Gimble, F. S., and Wang, J. (1996). Substrate recognition and induced DNA distortion by the PI-SceI endonuclease, an enzyme generated by protein splicing. Journal of Molecular Biology 263, 163-80.
Kornacki, J. A., Chang, C. H., and Figurski, D. H. (1993). kil-kor regulon of promiscuous plasmid RK2: structure, products, and regulation of two operons that constitute the kilE locus. Journal of Bacteriology 175, 5078-90.
Larsen, M. H., and Figurski, D. H. (1994). Structure, expression, and regulation of the kilC operon of promiscuous IncP alpha plasmids. Journal of Bacteriology 176, 5022-32.
Seed, B., and Aruffo, A. (1987). Molecular cloning of the CD2 antigen, the T-cell erythrocyte receptor, by a rapid immunoselection procedure. Proceedings of the National Academy of Sciences of the United States of America 84, 3365-9.
Thomas, C. M., Smith, C. A., Ibbotson, J. P., Johnston, L., and Wang, N. (1995). Evolution of the korA-oriV segment of promiscuous IncP plasmids. Microbiology 141, 1201-10.
Thomson, V. J., Jovanovic, O. S., Pohlman, R. F., Chang, C. H., and Figurski, D. H. (1993). Structure, function, and regulation of the kilB locus of promiscuous plasmid RK2. Journal of Bacteriology 175, 2423-35.

Claims

What is claimed is:
1. A method for inserting one DNA sequence within another DNA sequence with little or no risk of cleaving the inserted DNA, comprising the steps of: a. preparing a double stranded DNA sequence to be inserted (insert DNA) by ligation of a double stranded oligonucleotide which either comprises a sequence that is recognized and cleaved by an endonuclease for which less than 100 recognition and cleavage sites exist within the genome of the target or which when ligated with a vector reconstitutes a sequence that is recognized and cleaved by an endonuclease for which less than 100 recognition and cleavage sites exist within the genome of the target; b. cleaving the insert DNA with an enzyme which recognizes and cleaves the insert DNA at the rare endonuclease recognition and cleavage site provided with the primer; c. ligating the cleaved insert DNA to the DNA sequence within which it is to be inserted (vector DNA), wherein the ends of the vector DNA are complementary to the ends of the cleaved insert DNA.
2. A method for inserting one DNA sequence within another DNA sequence with little or no risk of cleaving the inserted DNA, comprising the steps of: a. preparing a double stranded DNA sequence to be inserted (insert DNA) from RNA or DNA (target) using a nucleotide polymer (primer) comprising two elements, an element at one end of the primer that is the site for initiation of polymerization of the complementary DNA strand and another element at the other end of the primer that comprises one strand of a double stranded DNA sequence that is recognized and cleaved by an endonuclease for which less than 100 recognition and cleavage sites exist within the genome of the target; b. preparing a second strand of DNA using the first strand as a template; c. cleaving the insert DNA with an enzyme which recognizes and cleaves the insert DNA at the rare endonuclease recognition and cleavage site provided with the primer; d. ligating the cleaved insert DNA to the DNA sequence within which it is to be inserted (vector DNA) wherein the ends of the vector DNA are complementary to the ends of the cleaved insert DNA.
3. A method for orienting one DNA sequence within another DNA sequence with little or no risk of cleaving the inserted DNA, comprising the steps of: a. preparing a double stranded DNA sequence to be inserted (insert DNA) from RNA or DNA (target) using a nucleotide polymer (primer) comprising two elements, an element at one end of the primer that is the site for initiation of polymerization of the complementary DNA strand and another element at the other end of the primer that comprises one strand of a double stranded DNA sequence that is recognized and cleaved by an endonuclease for which less than 100 recognition and cleavage sites exist within the genome of the target; b. preparing a second strand of DNA using the first strand as a template; c. cleaving the insert DNA with an enzyme which recognizes and cleaves the insert DNA at the rare endonuclease recognition and cleavage site provided with the primer, wherein the ends of the cleaved insert DNA are distinct and not self-complementary; d. ligating the cleaved insert DNA to the DNA sequence within which it is to be inserted (vector DNA), wherein the ends of the vector DNA are distinct and not self-complementary and are complementary to the ends of the cleaved insert DNA.
4. The method of claim 1 in which one end of the insert DNA is prepared by ligating non self-complementary adaptors to the DNA before cleaving with an enzyme which recognizes and cleaves the rare endonuclease recognition and cleavage site provided with the primer.
5. The method of claim 2 in which one end of the insert DNA is prepared by ligating non self-complementary adaptors to the DNA before cleaving with an enzyme which recognizes and cleaves the rare endonuclease recognition and cleavage site provided with the primer.
6. The method of claim 3 in which one end of the insert DNA is prepared by ligating non self-complementary adaptors to the DNA before cleaving with an enzyme which recognizes and cleaves the rare endonuclease recognition and cleavage site provided with the primer.
7. A synthetic nucleotide sequence comprising the recognition and cleavage site for an intron endonuclease.
8. The nucleotide sequence of claim 7 which comprises less than a whole genome.
9. A plasmid vector which contains an insertion site for DNA that when cleaved gives two distinct and non self-complementary single stranded DNA sequences.
10. The vector of claim 9 which contains a bacterial origin of replication and a gene conferring a drug resistance in bacteria.
11. The vector of claim 9 which contains a gene located in the DNA insertion site that confers a conditionally lethal phenotype.
12. The vector of claim 11 in which the conditionally lethal gene is KilA.
13. The vector of claim 10 in which an EF1α promoter is upstream of the insertion site for the cDNA.
14. The vector of claim 13 in which the insertion site for the DNA is followed by a human IgG1 H/CH2 splice sequence and a human polyadenylation signal sequence.
15. The vector of claim 12 which is able to express the puromycin acetyl transferase gene.
16. The vector of claim 12 which contains the EBNA-1 transcription unit, and an Epstein-Barr virus origin of replication.
17. The vector of claim 15 which contains an SV40 origin of replication.
18. The vector of claim 8 which contains a mammalian expression unit.
19. The vector of claim 18 that contains an origin of replication for mammalian cells.
20. A plasmid vector which contains the EF1α promoter, the EBNA-1 transcription unit, and an Epstein-Barr virus origin of replication.
21. A method for the preparation of oriented cDNA libraries comprising the steps of: a. preparing DNA from RNA using a nucleotide polymer (primer) comprising two elements, an element at one end of the primer that is the site for initiation of polymerization of the complementary DNA strand and another element at the other end of the primer that comprises one strand of a double stranded DNA sequence that is recognized and cleaved by an endonuclease for which less than 100 recognition and cleavage sites exist within the genome of the target; b. preparing a second strand of DNA using the first strand as a template; c. cleaving the cDNA with an enzyme which recognizes and cleaves the cDNA at the rare endonuclease recognition and cleavage site provided with the primer, wherein the ends of the cDNA are distinct and non self-complementary; d. ligating the cleaved cDNA to a plasmid vector, wherein the ends of the vector DNA are distinct and not self-complementary and are complementary to the ends of the cDNA.
22. The method of claim 21 in which one end of the cDNA is prepared by ligating non self-complementary adaptors to the cDNA before cleaving with an enzyme which recognizes and cleaves the rare endonuclease recognition and cleavage site provided with the primer.
23. The method of claim 22 in which the adaptors are phosphorylated.
24. A method for production of proteins comprising the step of: transfecting mammalian cells with a vector which comprises an EF1α promoter, an Epstein-Barr origin of replication, and an EBNA-1 transcription element.
25. The method of claim 24 wherein the mammalian cells express the T antigen of SV40 and the vector comprises the SV40 origin of replication.
26. The vector of claim 20 which comprises the SV40 origin of replication.
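Two conditions recur throughout the claims above: the endonuclease must have fewer than 100 recognition and cleavage sites in the genome of the target, and the cleaved ends must be distinct and not self-complementary. The following is a minimal Python sketch of how those two conditions could be checked on sequence data. It is an illustration only and not part of the claimed method; all function names are hypothetical, overhangs are assumed to be written 5' to 3', and the sequences in the usage note are made-up toy strings.

```python
# Illustrative sketch only: function names and the toy inputs are assumptions,
# not anything defined by the patent.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(target: str, site: str) -> int:
    """Count occurrences of a recognition site on either strand of the target.

    Overlapping matches are counted. A palindromic site equals its own
    reverse complement, so it is only searched for once.
    """
    target = target.upper()
    patterns = {site.upper(), reverse_complement(site)}
    total = 0
    for pattern in patterns:
        start = target.find(pattern)
        while start != -1:
            total += 1
            start = target.find(pattern, start + 1)
    return total


def is_rare_cutter(target: str, site: str, max_sites: int = 100) -> bool:
    """True if the site occurs fewer than max_sites times in the target."""
    return count_sites(target, site) < max_sites


def is_self_complementary(overhang: str) -> bool:
    """True if a single-stranded overhang can anneal to a copy of itself."""
    return overhang.upper() == reverse_complement(overhang)


def ends_are_distinct_and_not_self_complementary(end_a: str, end_b: str) -> bool:
    """Check only the two end conditions named in claims 3 and 21."""
    return (
        end_a.upper() != end_b.upper()
        and not is_self_complementary(end_a)
        and not is_self_complementary(end_b)
    )
```

For example, `is_self_complementary("AATT")` is true, because AATT is its own reverse complement, whereas a pair of overhangs such as `"CACA"` and `"GTGA"` passes `ends_are_distinct_and_not_self_complementary`. The sketch deliberately does not check whether the two ends are complementary to each other, since the claims do not state that condition.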
EP98930330A 1997-06-18 1998-06-18 A vector and method for preparation of dna libraries Withdrawn EP1012314A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US5014297P 1997-06-18 1997-06-18
PCT/US1998/012620 WO1998058067A1 (en) 1997-06-18 1998-06-18 A vector and method for preparation of dna libraries
US50142P 2008-05-02

Publications (1)

Publication Number Publication Date
EP1012314A1 (en) 2000-06-28

Family

ID=21963583

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98930330A Withdrawn EP1012314A1 (en) 1997-06-18 1998-06-18 A vector and method for preparation of dna libraries

Country Status (3)

Country Link
EP (1) EP1012314A1 (en)
AU (1) AU7974498A (en)
WO (1) WO1998058067A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2825103B1 (en) * 2001-05-28 2003-09-19 Genoway CLONING VECTORS FOR APPROVED RECOMBINATION AND METHOD OF USING THE SAME
US8293503B2 (en) 2003-10-03 2012-10-23 Promega Corporation Vectors for directional cloning
DK1865316T3 (en) * 2006-06-07 2010-05-25 Nutrinova Gmbh Method for screening compounds that modulate the activity of G protein-coupled receptors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9858067A1 *

Also Published As

Publication number Publication date
WO1998058067A1 (en) 1998-12-23
AU7974498A (en) 1999-01-04

Similar Documents

Publication Publication Date Title
US5595895A (en) Efficient directional genetic cloning system
CA2802167C (en) Direct cloning
EP1068305B1 (en) High throughput dna sequencing vector
CA2308608C (en) Conditionally amplifiable bacterial artificial chromosome (bac) vector
CA3111432A1 (en) Novel crispr enzymes and systems
JP4264703B2 (en) Synthetic genes and bacterial plasmids lacking CpG
EP1141239B1 (en) Improved methods for insertion of nucleic acids into circular vectors
Carninci et al. Balanced-size and long-size cloning of full-length, cap-trapped cDNAs into vectors of the novel λ-FLC family allows enhanced gene discovery rate and functional analysis
JP7305553B2 (en) DNA assembly
JP2001500740A (en) Method for stably cloning large repetitive DNA sequences
CN111088275B (en) Cloning method of DNA large fragment
US7863222B2 (en) shRNA library
US10385334B2 (en) Molecular identity tags and uses thereof in identifying intermolecular ligation products
US7592161B2 (en) Methods for analyzing the insertion capabilities of modified group II introns
JP2002509733A (en) Directional antisense library
WO1998058067A1 (en) A vector and method for preparation of dna libraries
WO2002044415A1 (en) Method for screening of dna libraries and generation of recombinant dna constructs
JP2024509048A (en) CRISPR-related transposon system and its usage
JP2024509047A (en) CRISPR-related transposon system and its usage
US20220380784A1 (en) Universal dna assembly
EP1156114A1 (en) Vectors for use in transponson-based DNA sequencing methods
KR20020011139A (en) Novel vectors for improving cloning and expression in low copy number plasmids
KR20050009118A (en) Plasmid having a function of T-vector and expression vector, and expression of the target gene using the same
TW201600607A (en) Mate-pair sequences from large inserts
CN114555803A (en) Prespacer sequence adjacent motif sequence and method for modifying target nucleic acid in cell genome by using the same

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000113

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): CH DE FR GB LI

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Withdrawal date: 20020628