EP1402020A2 - Methods, vectors, cell lines and kits for selecting nucleic acids having a desired feature

Publication number
EP1402020A2
Authority
EP
European Patent Office
Prior art keywords
site
nucleic acid
recombinase
vector
vectors
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP02744992A
Other languages
German (de)
French (fr)
Inventor
Christian Lanctot
Rock Gingras
Marie-Hélène GAUMOND
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Phenogene Therapeutiques Inc
Original Assignee
Phenogene Therapeutiques Inc

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2770/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssRNA viruses positive-sense
    • C12N2770/00011 Details
    • C12N2770/36011 Togaviridae
    • C12N2770/36111 Alphavirus, e.g. Sindbis virus, VEE, EEE, WEE, Semliki
    • C12N2770/36141 Use of virus, viral particle or viral elements as a vector
    • C12N2770/36143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/30 Vector systems comprising sequences for excision in presence of a recombinase, e.g. loxP or FRT
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/001 Vector systems having a special element relevant for transcription controllable enhancer/promoter combination
    • C12N2830/002 Vector systems having a special element relevant for transcription controllable enhancer/promoter combination inducible enhancer/promoter combination, e.g. hypoxia, iron, transcription factor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/38 Vector systems having a special element relevant for transcription being a stuffer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/55 Vector systems having a special element relevant for transcription from bacteria

Definitions

  • the present invention relates to screening of nucleic acids. More particularly, the present invention is concerned with the identification of nucleic acids having a desired feature, such as nucleic acids encoding signaling molecules, transcription factors or other proteins involved in changes of cell metabolism.
  • Cell-based screening technology can be viewed as a tool that sends out a "signal" if, and only if, a particular "target" nucleic acid possessing the activity being screened for has been incorporated into a cell.
  • This technology is based on a reporter system that is kept inactive (no signal) in the absence of the target gene.
  • Reporter systems are based on the conditional expression of marker proteins. Appearance of this marker in a cell transfected with an expression vector containing a nucleic acid having a desired feature allows the selection of this cell from the rest of the transfected cell population.
  • The need to select cells greatly limits the throughput of the method since techniques to do so are complex, cumbersome or lengthy. In some cases, for example, selection is achieved with fluorescent markers coupled to very sophisticated sorting equipment.
  • cells having the desired phenotype are selected either after limiting dilution or by clonal growth followed by colony picking, long and laborious techniques.
  • the cell selection step may increase the occurrence of false-positives.
  • the cell selection step is merely a pre-requisite to recover the expression vector whose transfection has triggered the appearance of a desired phenotype and to identify the nucleic acid contained therein.
  • Cre recombinase from bacteriophage P1 and Flp from S. cerevisiae have previously been used for recombining exogenous molecules in a heterologous system (Sauer and Henderson, 1988; O'Gorman et al., 1991).
  • the Cre and Flp recombinases bind well-characterized DNA elements ("recombination target sequences") and mediate excision or inversion of the intervening sequence, depending on the orientation of the recombination target sequences relative to one another (Sauer, 1994).
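The orientation dependence described above can be sketched in a few lines of Python. This is a toy model only: the 7-letter `SITE` string is a placeholder rather than a real loxP or FRT sequence, and `revcomp` and `recombine` are hypothetical helper names introduced for illustration. Two sites in the same orientation lead to excision of the intervening sequence (one site is left behind), whereas two sites in opposite orientation lead to its inversion.

```python
# Toy model of site-specific recombination between two target sequences.
# SITE is a placeholder, NOT a real loxP/FRT sequence (assumption).
SITE = "GATTACA"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def recombine(dna: str, site: str = SITE) -> str:
    """Recombine the first site with the next site found downstream.

    Same orientation      -> the intervening sequence is excised.
    Opposite orientation  -> the intervening sequence is inverted.
    Fewer than two sites  -> the molecule is returned unchanged.
    """
    first = dna.find(site)
    if first == -1:
        return dna
    after_first = first + len(site)
    direct = dna.find(site, after_first)
    inverted = dna.find(revcomp(site), after_first)

    if direct != -1 and (inverted == -1 or direct < inverted):
        # Excision: keep one copy of the site, drop what lay between the two.
        return dna[:after_first] + dna[direct + len(site):]
    if inverted != -1:
        # Inversion: reverse-complement the segment between the two sites.
        middle = dna[after_first:inverted]
        return dna[:after_first] + revcomp(middle) + dna[inverted:]
    return dna


direct_repeat = "AAA" + SITE + "CCGG" + SITE + "TTT"
inverted_repeat = "AAA" + SITE + "CCGT" + revcomp(SITE) + "TTT"
excised = recombine(direct_repeat)
flipped = recombine(inverted_repeat)
```

In this sketch the excised circle carrying the second site is simply discarded; a real reaction is reversible and the model does not attempt to capture that.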
  • the Cre recombinase is extensively used to create so-called conditional mouse knock-out mutants, i.e.
  • a nuclear receptor e.g. estrogen receptor
  • nuclear localization of the fusion protein is dependent on the presence of the nuclear receptor ligand molecule (e.g. estradiol in the case of the estrogen receptor) (Logie and Stewart, 1995; Angrand et al., 1998).
  • nucleic acid expression screening method that bypasses the time-consuming cell selection step and that does not require marking of proteins.
  • new methods taking advantage of the recombinase activity for the selective retrieval of nucleic acids encoding a desired function.
  • The present invention aims to overcome the limits and obviate the problems known in the art for screening genes and nucleic acids by providing a molecule, a method and a kit for rapidly and efficiently identifying and retrieving nucleic acids encoding a desired function in eukaryotic cells.
  • the purpose of the invention is also to fulfill other needs that will be apparent to those skilled in the art upon reading the following specification.
  • nucleic acids having a desired feature includes nucleic acids having transcriptional activity, nucleic acids encoding proteins involved in signal transduction pathways, and nucleic acids encoding proteins involved in cell metabolism or differentiation state.
  • the invention relates to expression vectors or viral-based expression vectors containing one or more recombination target sequences, these vectors being modified after the action of a site-specific recombinase in such a way as to be differentiable from non recombined vectors.
  • the invention also relates to libraries of expression vectors or viral-based expression vectors containing exogenous nucleic acids.
  • Most preferred vectors according to the invention are those whose sequences are set forth in SEQ ID NOS: 1, 2 and 3.
  • the vector is useful for expressing an exogenous nucleic acid in eukaryotic cells and it comprises a nucleic acid sequence excisable by site-specific recombination.
  • The vector is an expression vector which comprises a nucleic acid sequence, the vector comprising a recombinase substrate and a transcription unit.
  • the vector comprises i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a recombinase encoded by the site-specific recombinase coding sequence.
  • the recombinase substrate in the vectors of the invention comprises a stuffer region flanked by recombination target sequences.
  • the stuffer region is removable by site-specific recombination. More preferably, the stuffer region comprises a restriction site.
  • the vector's transcription unit comprises an enhancer sequence, a promoter sequence and a termination sequence operatively linked together.
  • the nucleic acid sequence of the vector comprises at least two fragments of a viral genome for packaging the vector or a fragment thereof into infectious viral particles. These fragments may derive from a retrovirus or an adenovirus.
  • the recombinase substrate and the transcription unit are preferably located between these two viral fragments.
  • the nucleic acid sequence of the vector comprises a nucleic acid sequence encoding an inactive gene conferring resistance to an antibiotic in bacteria.
  • The activity of the inactive gene is restorable by site-specific recombination of the vector nucleic acid sequence.
  • the vector nucleic acid sequence comprises a recombinase substrate and a transcription unit incorporated into a viral genome.
  • the recombinase substrate comprises a stuffer region advantageously excisable by site-specific recombination, and formation of viral particles is dependent upon excision of the stuffer region.
  • the viral genome consists of a cDNA copy of an alphaviral genome, and presence of the stuffer region blocks translation of viral proteins encoded by the cDNA copy of the alphaviral genome.
  • the cDNA copy of the alphaviral genome may derive from Sindbis virus genome or from Semliki Forest virus genome.
  • the recombinase substrate is present in a 5' untranslated region of the cDNA copy of the alphaviral genome.
  • the invention relates to modified cell lines and transgenic animals having incorporated in their genome a DNA segment comprising a site-specific recombinase operatively linked to regulatory elements.
  • A related aspect concerns the use of such cell line(s) and animal(s) for screening, among libraries of expression vectors, those vectors containing a nucleic acid that activates, directly or indirectly, transcription of regulatory element(s).
  • A eukaryotic cell line which comprises an expressible site-specific recombinase coding sequence.
  • This expressible site-specific recombinase coding sequence is operatively linked to a minimal promoter and to at least one cis-acting regulatory element.
  • the site-specific recombinase is expressed upon activation of the at least one cis-acting regulatory element.
  • the cis-acting regulatory element(s) may be activated by elevation of intracellular cAMP or cGMP levels, elevation of intracellular calcium concentration, and/or change in the phosphorylation state of specific proteins (e.g.
  • The cis-acting regulatory element(s) may also be activated during differentiation of mesenchymal stem cells into bone, cartilage, adipocytes or myoblasts. More preferably, the site-specific recombinase coding sequence is optimized for enhanced synthesis, stability or translation in eukaryotic cells.
  • The expressible site-specific recombinase coding sequence may be chosen from Flp coding sequence from Saccharomyces cerevisiae, Cre coding sequence from bacteriophage P1 and β-recombinase coding sequence from Bacillus subtilis.
  • Preferred Flp coding sequences are those comprising SEQ ID NO:4, SEQ ID NO:6, or a functional homologue thereof, particularly those homologues coding for proteins having substantially the same biological activity as SEQ ID NO:5 or SEQ ID NO:7.
  • the invention relates to nucleic acids and amino acid sequences comprising an optimized Flp recombinase.
  • Preferred optimized Flp coding sequences are those encoding the amino acid sequence set forth in SEQ ID NO:7 (including SEQ ID NO:6) and functional homologues thereof.
  • the invention relates to a method for identifying nucleic acids encoding a desired feature from a library of exogenous nucleic acids.
  • a plurality of nucleic acids from the library are inserted into a plurality of expression vectors comprising a nucleic acid sequence excisable by site-specific recombination.
  • the vectors are then inserted into a eukaryotic cell line (as defined previously) or into a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable.
  • This site-specific recombinase may be inactive for instance due to a lack of sufficient expression or due to sequestration outside of the cell nucleus.
  • the activity of the inactive site-specific recombinase is restored upon expression by said vector of an exogenous nucleic acid having the desired feature.
  • the active site-specific recombinase preferably excises a fragment from the expression vectors, thereby forming recombined expression vectors that comprise a nucleic acid having the desired feature and that can be differentiated from unrecombined expression vectors.
  • the method of the invention is used for screening exogenous nucleic acids having a desired feature within eukaryotic cells.
  • The method comprises the steps of: a) providing a plurality of expression vectors each capable, when present in a suitable host, of expressing an exogenous nucleic acid inserted therein, these vectors comprising a nucleic acid sequence excisable by site-specific recombination; b) providing a cell line or a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable; c) inserting at least one exogenous nucleic acid from a library of nucleic acids into a plurality of the expression vectors, in order to provide a library of recombinant expression vectors; d) introducing, into cells of the cell line or of the transgenic animal of step (b), a plurality of recombinant expression vectors from the library obtained at step (c); e) allowing
  • the method of the invention is used for screening exogenous nucleic acids having a transcriptional activity within eukaryotic cells (e.g. regulatory elements such as enhancer, promoter, etc).
  • the method comprises the steps of: a) providing a vector comprising: i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a site-specific recombinase encoded by the site-specific recombinase coding sequence; b) inserting into a plurality of vectors as defined at step (a) at least one exogenous nucleic acid taken from a library of exogenous nucleic acids in order to provide a library of recombinant vectors; c) inserting a plurality of recombinant vectors from the library obtained at step (b) into a suitable eukaryotic host; d) allowing the exogenous nucle
  • the nucleic acid sequence of the vector comprises an inactive gene conferring resistance to an antibiotic in bacteria, the activity of the inactive gene being restored by site-specific recombination of said nucleic acid sequence. Therefore, recombined vectors may be isolated from unrecombined vectors by: i) extracting DNA from cells into which the expression vectors have been introduced; ii) transforming bacteria with DNA extracted at step (i); iii) growing bacteria transformed at step (ii) in presence of the antibiotic; and iv) selecting bacterial colonies resistant to the antibiotic.
  • The resistant bacterial colonies comprise expression vectors having undergone site-specific recombination.
  • This method may further comprise the steps of: v) extracting expression vectors from colonies selected at step (iv); and vi) identifying an exogenous nucleic acid found in said extracted vectors.
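Steps (i) to (vi) above amount to a filter followed by a read-out, which the following minimal Python sketch makes explicit. It is an illustration under stated assumptions, not the procedure of the patent: the `Plasmid` class, the function names and the insert labels (`cDNA_A`, etc.) are all hypothetical, and the resistance gene is modeled simply as "active if and only if the vector has recombined".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    insert_name: str   # label of the exogenous nucleic acid (hypothetical)
    recombined: bool   # True once the recombinase substrate has been excised

    @property
    def confers_resistance(self) -> bool:
        # Modeling assumption: the disrupted resistance gene becomes
        # functional only after site-specific recombination.
        return self.recombined


def select_resistant(extracted):
    """Steps (ii)-(iv): transform bacteria, grow on antibiotic, keep survivors."""
    return [p for p in extracted if p.confers_resistance]


def identify_inserts(colonies):
    """Steps (v)-(vi): recover vectors from colonies and list their inserts."""
    return sorted({p.insert_name for p in colonies})


# Step (i): DNA extracted from the transfected eukaryotic cells (toy data).
extracted_dna = [
    Plasmid("cDNA_A", recombined=False),
    Plasmid("cDNA_B", recombined=True),
    Plasmid("cDNA_C", recombined=False),
]
hits = identify_inserts(select_resistant(extracted_dna))
```

The point of the design is that selection happens in bacteria, on the recovered DNA, so no eukaryotic cell ever has to be sorted or cloned.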
  • the nucleic acid sequence of the vector comprises a recombinase substrate having a stuffer region flanked by recombination target sequences.
  • the stuffer region also comprises a cleavable restriction site.
  • Recombined vectors may be isolated from unrecombined vectors by: i) extracting DNA from cells into which the expression vectors have been introduced; ii) contacting DNA extracted at step (i) with a restriction enzyme recognizing said cleavable restriction site; iii) optionally degrading DNA fragments cleaved by the restriction enzyme with an exonuclease; and iv) optionally amplifying a DNA fragment from the expression vectors, the fragment comprising the exogenous nucleic acid.
  • recombined expression vectors are not cleaved by the restriction enzyme, but unrecombined expression vectors are cleaved by the restriction enzyme.
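The digestion-based discrimination above can be illustrated with a short Python sketch. It treats vectors as linear strings for simplicity and assumes a blunt cut in the middle of an 8-bp site (ATTTAAAT, the recognition sequence of SwaI, chosen here for illustration). The placeholder `TARGET` sequence, the toy backbone and insert, and the function names are assumptions, not sequences from the patent. An amplicon spanning the recombinase substrate can only be produced when both primer regions remain on the same fragment after digestion.

```python
TARGET = "GATTACA"             # placeholder recombination target (assumption)
RESTRICTION_SITE = "ATTTAAAT"  # 8-bp site assumed to lie inside the stuffer
BACKBONE = "GGCCAGCT"          # toy upstream primer region (assumption)
INSERT = "ATGAAACCC"           # toy exogenous nucleic acid (assumption)

# Unrecombined: stuffer (carrying the restriction site) flanked by two targets.
unrecombined = BACKBONE + TARGET + "CC" + RESTRICTION_SITE + "GG" + TARGET + INSERT
# Recombined: stuffer excised, a single target sequence remains.
recombined = BACKBONE + TARGET + INSERT


def digest(seq: str, site: str = RESTRICTION_SITE, cut_offset: int = 4):
    """Cut seq at every occurrence of site, cut_offset bases into the site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def amplifiable(fragments, upstream: str, downstream: str) -> bool:
    """A spanning PCR product needs both primer regions on one fragment."""
    return any(upstream in f and downstream in f for f in fragments)


unrec_ok = amplifiable(digest(unrecombined), BACKBONE, INSERT)
rec_ok = amplifiable(digest(recombined), BACKBONE, INSERT)
```

The optional exonuclease step is not modeled; in this sketch cleavage alone is enough to prevent amplification across the cut.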
  • the invention concerns a screening kit comprising 1) a vector as defined herein; and/or 2) a cell line as defined herein; and at least one further element selected from the group consisting of instructions for using the kit, reaction buffer(s), enzyme(s), probe(s) and pool(s) of nucleotide molecules to be screened.
  • An advantage of the present invention is that it obviates the expensive and time-consuming task of selecting cells that express a gene of interest.
  • the invention is also much more rapid, efficient and accurate for selecting a particular nucleic acid having a desired feature, characteristic or function.
  • the invention can also selectively retrieve, from a library of nucleic acids, a nucleic acid having a desired feature, such as a nucleic acid encoding a signaling molecule, a transcription factor or a protein involved somehow in promoting changes in cell metabolism or differentiation state (for instance a kinase, a phosphatase, or a transcription factor).
  • Figure 1 is a schema illustrating how site-specific recombination can be used to screen for nucleic acids encoding a specific biological function.
  • Figures 2A and 2B are schemas showing preferred embodiments of a transcription unit of an expression vector according to the invention.
  • Figures 3A, 3B, and 3C are schemas showing preferred embodiments of a recombinase substrate of an expression vector according to the invention, and also preferred methods for specifically retrieving whole or parts of recombined expression vectors or viral-based expression vectors according to the invention.
  • Figure 4A shows an alignment of the first 116 codons and corresponding amino acids of wild type Flp recombinase (Flp; SEQ ID NOS: 4 and 5) and of an optimized recombinase coding sequence (oFlp; SEQ ID NOS: 6 and 7) according to the present invention. Amino acid substitutions to enhance thermostability are shown in bold. Putative internal polyadenylation signal is underlined.
  • Figure 4B is a picture of a Northern analysis comparing the expression of wild type Flp (411) and optimized recombinase (oFlp, 412) after transfection of appropriate constructs in HEK293 cells.
  • Flp signal (arrowhead) is not detected in mock-transfected cells (410).
  • Figure 4C is a picture of Western analysis comparing the amount of Flp protein produced after transfection of HEK293 cells with vectors expressing either the wild type coding sequence (421) or optimized coding sequence (422) according to the invention.
  • Flp signal (arrowhead) is not detected in mock-transfected cells (420).
  • Figures 5A, 5B and 5C schematize construction of an expression vector (RC43) according to a preferred embodiment of the invention, the expression vector comprising a transcription unit and a recombinase substrate disrupting a gene conferring resistance to kanamycin.
  • Nucleic acid sequence of RC43 is set forth in SEQ ID NO:1.
  • Figure 6 schematizes the construction of a plasmid containing cis-acting regulatory elements operatively linked to an optimized recombinase coding sequence (oFlp).
  • Figure 7 is a picture showing the results of a Northern analysis performed to detect expression of oFlp mRNA in subclones of HEK293 cells obtained after stable transfection of a plasmid comprising a coding sequence for oFlp operatively linked to regulatory elements activated by the Gal4VP16 protein (RE-oFlp).
  • Lane 701, wild type HEK293 cells; lane 702, subclone 6 transfected with a control vector expressing green fluorescent protein; lane 703, subclone 6 transfected with an expression vector for Gal4VP16; lane 704, subclone 10 transfected with a control vector expressing green fluorescent protein; lane 705, subclone 10 transfected with an expression vector for Gal4VP16.
  • oFlp signal is indicated by an arrowhead. The signal indicated by an asterisk is an artefact arising from transfection of the control vector.
  • Figure 8A schematizes fragments from non recombined (801) and recombined (802) expression vectors. Arrows indicate the approximate positions of primers used in the PCR analysis presented in Figure 8B.
  • Figure 8B is a picture showing results of a PCR analysis performed to selectively amplify fragments from recombined expression vectors according to a preferred embodiment of the invention.
  • Expression vectors were recovered from HEK293/RE-oFlp subclone 6 transfected with a vector expressing green fluorescent protein (lane 813), Gal4VP16 (lane 814) or from wild type HEK293 cells transfected with a vector expressing Gal4VP16 (lane 815). DNA was subjected to PCR after digestion with SwaI. A control fragment was amplified from a non recombined vector expressing Gal4VP16 (lane 811).
  • Figures 9A and 9B schematize the construction of an expression vector (plasmid RC49-2) according to a preferred embodiment of the invention.
  • the plasmid may generate an adenovirus-based expression vector, and comprises a transcription unit and a recombinase substrate with a restriction site. The approximate position of primers 18-64V and 18-106V used in subsequent PCR is indicated.
  • Nucleic acid sequence of RC49-2 is set forth in SEQ ID NO:2.
  • Figure 10 is a picture of a Northern analysis showing the expression of optimized Flp mRNA in distinct subclones of Hela cells obtained after stable transfection of a vector comprising an optimized Flp coding sequence linked operatively to a cytomegalovirus enhancer and promoter according to a preferred embodiment of the invention.
  • Lane 1001, wild type Hela cells; lane 1002, subclone Hela/oFlp2-3; lane 1003, subclone Hela/oFlp3-2; lane 1004, subclone Hela/oFlp6-2; lane 1005, subclone Hela/oFlp6-3.
  • Figure 11A is a picture showing results of a PCR analysis performed to determine the amount of recombined adenovirus-based expression vector according to a preferred embodiment of the invention using DNA extracted from wild type Hela cells (1101); infected Hela/oFlp6-2 cells (1102); infected Hela/oFlp6-2 cells and digested by SwaI (1103); infected Hela/oFlp2-3 cells (1104).
  • Lane 1100 shows the migration of a molecular marker. Fragments amplified from non recombined adenovirus-based expression vector (1105). Fragments amplified from recombined adenovirus-based expression vector (1106).
  • Figure 11B is a picture showing results of a PCR analysis performed to detect fragments of recombined adenovirus-based expression vectors according to a preferred embodiment of the invention after infection of populations of Hela cells containing 10% of Hela/oFlp6-2 cells (lanes 1110, 1111) or 0.1% of Hela/oFlp6-2 cells (lanes 1112, 1113). DNA extracted from infected cells was subjected to PCR after digestion with SwaI (lanes 1111, 1113).
  • Figure 11C is a picture showing results of a semi-nested PCR analysis performed on amplicons obtained from SwaI-digested DNA extracted from populations of Hela cells containing 10% of Hela/oFlp6-2 cells (lanes 1120) or 0.1% of Hela/oFlp6-2 cells (lanes 1121). Lane 1122 shows the migration of a molecular marker. Fragments amplified from non recombined adenovirus-based expression vector (1123). Fragments amplified from recombined adenovirus-based expression vector (1124).
  • Figures 12A and 12B schematize the construction of an expression vector (RC77) according to a preferred embodiment of the invention, the expression vector comprising a transcription unit embedded in a viral genome whose translation is disrupted by a recombinase substrate.
  • Nucleic acid sequence of RC77 is set forth in SEQ ID NO:3.
  • Figure 13 shows the properties of an expression vector containing a transcription unit embedded in a cDNA copy of the Sindbis virus genome whose translation is disrupted by a recombinase substrate according to a preferred embodiment of the invention.
  • Figure 13A: Image 1301, expression of GFP inserted in such an expression vector (RC77).
  • Image 1302, immunofluorescence against the C viral protein in HEK293A cells transfected with RC77 and a control expression vector (VB35).
  • Image 1303, immunofluorescence against the C viral protein in BHK-21 cells infected with culture medium from HEK293A cells transfected with RC77 and VB35.
  • Image 1304, immunofluorescence against the C viral protein in HEK293A cells transfected with RC77 and a vector expressing oFlp (RC59).
  • Image 1305, immunofluorescence against the C viral protein in BHK-21 cells infected with culture medium from HEK293A cells transfected with RC77 and RC59.
  • Figure 13B is a picture showing results of a RT-PCR analysis performed to detect fragments of engineered Sindbis virus genomes after co-transfection in HEK293A cells of RC77 with either VB35 (lane 1313) or RC59 (lane 1314). Fragment amplified from RC77 plasmid DNA (lane 1312). Control reaction on RNA extracted from untransfected cells (lane 1311). Lane 1310 shows migration of a 100bp ladder.
  • the word “kilobase” is generally abbreviated as “kb”, the words “deoxyribonucleic acid” as “DNA”, the words “ribonucleic acid” as “RNA”, the words “complementary DNA” as “cDNA”, the words “polymerase chain reaction” as “PCR”, and the words “reverse transcription” as “RT”. Nucleotide sequences are written in the 5' to 3' orientation unless stated otherwise.
  • Desired feature: refers to a nucleic acid encoding a peptide or a protein having a desired property or function.
  • A non-limitative list of examples of nucleic acids having a “desired property” or “desired function” includes nucleic acids encoding a specific signal transduction activity (e.g. a kinase or a phosphatase), a specific gene regulation activity (e.g. a transcription factor), or a specific cellular function (e.g. a protein promoting changes in cell metabolism or differentiation state), etc.
  • Exogenous nucleic acid: A nucleic acid (such as cDNA, cDNA fragments, genomic DNA fragments, antisense RNA, oligonucleotide) which is not naturally part of another nucleic acid molecule.
  • The “exogenous nucleic acid” may be from any organism or purely synthetic.
  • Expression: The process whereby an exogenous nucleic acid is transcribed.
  • the transcribed exogenous nucleic acid can be subsequently translated into a peptide or a protein in order to carry out its function if any.
  • Expression vector: A vector capable of mediating the expression of an exogenous nucleic acid once introduced into a host.
  • expression vectors according to the present invention are capable of expressing an exogenous nucleic acid inserted therein in eukaryotic cells and comprise a recombinase substrate and a transcription unit.
  • the expression vectors of the invention preferably contain a signal for the termination of transcription and the polyadenylation of transcripts generated from enhancer and promoter sequences (see transcription unit definition).
  • the expression vectors also preferably comprise unique restriction sites between the promoter sequences and the termination sequence for inserting the exogenous nucleic acid to be expressed.
  • Functional homologue: refers to a non native polypeptide or nucleic acid molecule that possesses a functional biological activity that is substantially similar to the biological activity of a native polypeptide or a nucleic acid molecule.
  • a functional homologue typically refers to a polypeptide or a nucleic acid molecule having at least 50%, more preferably at least 55%, even more preferably at least 60%, still more preferably at least 65-70%, and yet even more preferably greater than 85%, 90%, 95% or
  • the functional homologue may exist naturally or may be obtained following a single or multiple amino acid substitutions, deletions and/or additions relative to the naturally occurring enzyme(s) using methods and principles well known in the art.
  • a functional homologue of a protein may or may not contain post-translational modifications such as covalently linked carbohydrate, if such modification is not necessary for the performance of a specific function.
  • nucleotide or amino acid sequences may have similarities below the above given percentages and still encode a proteinic molecule having a desired activity, and such proteinic molecules may still be considered within the scope of the present invention where they have regions of sequence conservation.
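As a rough illustration of how a percentage threshold of this kind might be checked, the sketch below computes percent identity over two pre-aligned sequences. It assumes that the percentages refer to sequence identity over an alignment (the text above does not spell out the measure), that gaps are written as `-`, and that the alignment itself has been produced elsewhere. The function names and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored; a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True when identity reaches the chosen threshold (default 50%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up 5-residue peptides differing at one position.
identity = percent_identity("MKTAY", "MKSAY")
```

Sequence identity alone says nothing about biological activity, which is why the definition also requires a substantially similar function.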
  • the term "functional homologue" is intended to include the "fragments", "segments", "variants", "analogs" or "chemical derivatives" of a polypeptide or a nucleic acid molecule. Fragment: refers to a section of a molecule, such as protein/polypeptide or nucleic acid, and is meant to refer to any portion of the amino acid or nucleotide sequence.
  • Host A cell, tissue, organ or organism capable of providing cellular components for allowing expression of an exogenous nucleic acid inserted into an expression vector. This term is intended to also include hosts which have been modified in order to accomplish these functions. Bacteria, fungi, animals (cells, tissues, or organisms) and plants (cells, tissues, or organisms) are examples of a host. Preferred hosts according to the present invention are eukaryotic cells and animals.
  • Insertion The process by which a nucleic acid is introduced into another nucleic acid.
  • a typical example includes insertion of an exogenous nucleic acid into an expression vector to create a "recombinant" or "genetically modified" expression vector.
  • Methods for inserting a nucleic acid into another normally require the use of restriction enzymes, and such methods of insertion are well known in the art.
  • Knock-in refers to the process by which a specific region of the genome of a host is replaced by an exogenous nucleic acid through a reaction involving homologous recombination. According to a preferred embodiment of the present invention, this process is used to replace the first coding exon of a host gene by the coding sequence of a site-specific recombinase.
  • Library A collection or a pool of nucleic acid molecules. This includes genomic libraries, RNA libraries, cDNA libraries, expressed sequence tag libraries, and artificial sequence libraries, including randomized artificial sequence libraries.
  • Minimal promoter A short DNA sequence harboring minimal requirements for initiating transcription of a genetic sequence. The minimal promoter is not sufficient to activate transcription of a linked gene. A sequence harboring a so called "TATA" box at about 30 nucleotides upstream of the site of initiation of transcription is an example of a minimal promoter.
  • Nucleic acid Any DNA, RNA sequence or molecule having one nucleotide or more, including nucleotide sequences encoding a complete gene. The term is intended to encompass all nucleic acids whether occurring naturally or non-naturally in a particular cell, tissue or organism. This includes DNA and fragments thereof, RNA and fragments thereof, cDNAs and fragments thereof, expressed sequence tags, artificial sequences including randomized artificial sequences.
  • Optimized coding sequence refers to a wild type nucleic acid sequence which has been modified to give higher levels of transcripts and/or products when expressed in a given host which is different from the host of the wild type nucleic acid.
  • a typical example is the replacement of codons not efficiently translated in a given host by codons preferred in this host.
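The codon replacement described above can be sketched as follows. This is a minimal illustration, not a method taken from the source: the preference table `PREFERRED` is hypothetical, the codon table is deliberately partial, and the input sequence is invented for the example.

```python
# Partial standard genetic code: only the codons this example needs.
CODON_TO_AA = {
    "ATG": "M",
    "CTA": "L", "CTG": "L", "TTA": "L",
    "CGA": "R", "CGT": "R", "AGA": "R",
    "GGG": "G", "GGC": "G",
    "TAA": "*", "TGA": "*",
}

# HYPOTHETICAL preferred codon per amino acid for an assumed host
# (illustrative only, not measured codon usage).
PREFERRED = {"M": "ATG", "L": "CTG", "R": "AGA", "G": "GGC", "*": "TGA"}


def codons(seq):
    """Split a coding sequence into consecutive codons."""
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate a coding sequence using the partial table above."""
    return "".join(CODON_TO_AA[c] for c in codons(seq))


def optimize(seq):
    """Replace each codon by the host-preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TO_AA[c]] for c in codons(seq))


wild_type = "ATGCTACGAGGGTAA"   # invented example sequence
optimized = optimize(wild_type)
print(optimized)
print(translate(wild_type) == translate(optimized))
```

The nucleotide sequence changes while the encoded product does not, which is the point of the optimization.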
  • Recombinant in association with "expression vector" refers to an expression vector which has been modified to contain a non-native exogenous nucleic acid.
  • Recombinase substrate A nucleic acid molecule comprising a stuffer region flanked by recombination target sequences in direct or reverse orientation relative to one another.
  • the stuffer region is a nucleic acid sequence which is excisable by site-specific recombination.
  • Recombination target sequence A short DNA segment acted upon by a site-specific recombinase. Generally, it is composed of two inverted sequences (such as SEQ ID NO:8) that are bound by a site-specific recombinase and that are separated by a spacer sequence of defined length. According to a preferred embodiment of the invention, an additional binding sequence is typically present at the 5' end of the recombination target sequence.
  • regulatory element refers to a DNA sequence that can, under specific cellular conditions, mediate the activation or repression of the transcription of nucleic acid sequences that are operatively linked thereto.
  • regulatory elements comprise one or more fragments of sequences naturally occurring in the enhancer or promoter regions of cellular genes. Purely synthetic regulatory elements can also be made by assembling one or more oligonucleotides corresponding to binding sites of specific transcription factors.
  • Site-specific recombinase A protein capable of mediating site-specific recombination.
  • Site-specific recombination The process by which a recombinase substrate is acted upon by a site-specific recombinase. Typically, this activity results in the excision of the stuffer region and of one recombination target sequence if the recombination target sequences are in the direct orientation relative to one another, or, if the recombination target sequences are in the reverse orientation relative to one another, in the inversion of the stuffer region.
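The two outcomes defined above (excision for direct repeats, inversion for inverted repeats) can be sketched on plain strings. This is a simplified model under stated assumptions: the target sequence `SITE`, the flanks and the stuffer are hypothetical placeholders rather than SEQ ID NO:8, and only the first pair of targets is considered.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recombine(substrate, site):
    """Return (outcome, product) for the first pair of target sequences.

    Direct repeats: the stuffer and one target copy are excised.
    Inverted repeats: the stuffer between the targets is inverted.
    """
    first = substrate.find(site)
    if first == -1:
        return "unchanged", substrate
    after = first + len(site)
    direct = substrate.find(site, after)
    inverted = substrate.find(revcomp(site), after)
    if direct != -1 and (inverted == -1 or direct < inverted):
        return "excision", substrate[:after] + substrate[direct + len(site):]
    if inverted != -1:
        stuffer = substrate[after:inverted]
        return "inversion", substrate[:after] + revcomp(stuffer) + substrate[inverted:]
    return "unchanged", substrate


# HYPOTHETICAL sequences, chosen only so the example is unambiguous.
SITE = "AACCGGTTAC"
LEFT, STUFFER, RIGHT = "GGGG", "ACTGAC", "CCCC"

direct_substrate = LEFT + SITE + STUFFER + SITE + RIGHT
inverted_substrate = LEFT + SITE + STUFFER + revcomp(SITE) + RIGHT

print(recombine(direct_substrate, SITE))
print(recombine(inverted_substrate, SITE))
```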
  • Transcription unit refers to a region of a vector which comprises an enhancer sequence, a promoter sequence and a termination sequence, all operatively linked together.
  • the enhancer and promoter sequences are constitutively active and are operatively linked to an exogenous nucleic acid inserted into the vector.
  • Enhancer and promoter sequences can be derived for example from the cytomegalovirus (CMV) immediate-early genes or from the Rous sarcoma virus (RSV) long terminal repeat.
  • Transfection the process of introducing nucleic acids into eukaryotic cells by any means such as electroporation, lipofection, precipitate uptake or micro-injection.
  • a cell having incorporated an exogenous nucleic acid e.g. an expression vector or a recombinant expression vector
  • Viral-based expression vector refers to an expression vector or parts thereof embedded in a viral genome that can be packaged into infectious viral particles. Typically, the parts of an expression vector embedded in a viral genome consist of the transcription unit and the recombinase substrate. Viral-based expression vectors provide a way to better control the delivery of exogenous nucleic acids to host cells via infectious viral particles.
  • the invention is based on the use of a site-specific recombinase to modify an expression vector containing an exogenous nucleic acid having a desired feature.
  • a nucleic acid having a desired feature such as nucleic acids capable of changing the expression of cellular genes or the state of cellular metabolism or signaling pathways, triggers the synthesis and/or activity of a site specific recombinase, the action of the recombinase allowing an easy selection of the expression vector containing the exogenous nucleic acid having a desired feature.
  • the present invention relates to methods for screening and/or identifying exogenous nucleic acids having a desired feature within eukaryotic cells.
  • Figure 1 depicts a preferred specific embodiment of a screening method according to the invention.
  • a vector (101) which is capable, when present in a suitable host (102), of expressing an exogenous nucleic acid inserted therein (103), is provided.
  • the vector comprises a nucleic acid sequence (104) which is excisable by site-specific recombination.
  • a cell line or a transgenic animal is also provided.
  • the cell line or transgenic animal comprises a nucleic acid minimally encoding an inactive site-specific recombinase (105) whose activity is restorable.
  • a library of recombinant expression vectors is then prepared. This is achieved by inserting into a plurality of expression vectors as the one defined previously, at least one exogenous nucleic acid from a library of exogenous nucleic acids. Next, a plurality of recombinant expression vectors from this library are inserted into the cell line or transgenic animal provided previously. Thereafter, these recombinant expression vectors are allowed to express the exogenous nucleic acid inserted therein. According to the invention, only exogenous nucleic acids encoding the desired feature will be capable of restoring (106) the activity of the site-specific recombinase of the host.
  • a site-specific recombinase (107) whose activity is restored may then excise the excisable nucleic acid sequence from recombinant expression vector(s) which have expressed an exogenous nucleic acid having restored such site-specific recombinase activity.
  • Recombinant expression vectors are then recovered (108) from the transfected cells or transgenic animal and recombinant expression vectors having undergone site-specific recombination are selected (109). According to the invention, most of these vectors contain an exogenous nucleic acid encoding the desired feature.
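The screening steps of Figure 1 can be sketched as a toy simulation. This is an idealized model under loud assumptions: one vector per host cell, a hypothetical predicate `has_desired_feature` standing in for the biology, and no background recombination (the source only claims that "most" selected vectors carry the desired feature).

```python
from dataclasses import dataclass


@dataclass
class Vector:
    insert: str               # exogenous nucleic acid (103)
    has_stuffer: bool = True  # excisable sequence (104) still present


def screen(library, has_desired_feature):
    """Idealized pass through steps (106) to (109) of Figure 1."""
    recovered = []
    for insert in library:
        vector = Vector(insert)
        # (106) only an insert with the desired feature restores the recombinase
        recombinase_active = has_desired_feature(insert)
        # (107) the restored recombinase excises the stuffer from that vector
        if recombinase_active:
            vector.has_stuffer = False
        # (108) every vector is recovered from the host
        recovered.append(vector)
    # (109) keep only vectors that underwent site-specific recombination
    return [v.insert for v in recovered if not v.has_stuffer]


# HYPOTHETICAL library and feature test, for illustration only.
library = ["cDNA_A", "cDNA_B", "cDNA_C"]
hits = screen(library, lambda insert: insert == "cDNA_B")
print(hits)
```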
  • the invention is used for screening nucleic acids having a transcriptional activity (e.g. regulatory elements such as enhancers, promoters and the like).
  • a preferred screening method comprises the steps of: a) providing a vector comprising: i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a site-specific recombinase encoded by the site-specific recombinase coding sequence; b) inserting into a plurality of vectors as defined at step (a) at least one exogenous nucleic acid taken from a library of exogenous nucleic acids in order to provide a library of recombinant vectors; c) inserting a plurality of recombinant vectors from the library obtained at step (b) into a suitable eukaryotic host; d) allowing the exogenous nucleic acid inserted at step (b) to activate
  • Site-specific recombinases are part of the larger integrase family of recombinases that are mainly involved in the insertion, deletion or inversion of genetic material.
  • Site-specific recombinases have been used to recombine DNA molecules transfected into eukaryotic cells, particularly Cre from bacteriophage P1 and Flp from Saccharomyces cerevisiae (Sauer and Henderson, 1988; O'Gorman et al., 1991).
  • Recombination between recombination target sequences (e.g. SEQ ID NO:8) results in the deletion of the intervening sequence and of one target sequence if the recombination target sequences are in the same orientation relative to one another.
  • any site-specific recombinase can be used according to the present invention. Examples include prokaryotic β-recombinase (Diaz et al., 1999) in addition to Cre and Flp mentioned above.
  • the expression vector according to the present invention is minimally composed of a transcription unit (200; Fig 2A) and a selectable recombinase substrate unit (309; Figs 3A to 3C).
  • the transcription unit (200) comprises cloning sites (204), a promoter (202), enhancer elements (201), transcription termination and polyadenylation signals (203).
  • An exogenous nucleic acid (205) is inserted into the cloning sites (204) of the transcription unit (200).
  • the transcription unit comprises a recombinase coding sequence (211) operatively linked to a minimal promoter (210) and an exogenous nucleic acid (205) is placed upstream of the promoter (210).
  • the transcription unit (200) as schematized on Figure 2A serves to express exogenous nucleic acids and comprises enhancer (201) and promoter (202) sequences, followed by signals (203) for the termination of transcription and the polyadenylation of transcripts generated from the enhancer and promoter sequences.
  • Enhancer and promoter sequences driving robust expression in a wide variety of cells are generally preferred. These include but are not limited to sequences derived from cytomegalovirus immediate-early genes (CMV; GenBank™ acc. No. AF477200) and Rous sarcoma virus long terminal repeat (RSV; GenBank™ acc. No. M83236.1) as well as sequences derived from widely expressed cellular genes such as chicken β-actin and human elongation factor 1α.
  • enhancer and promoter sequences driving expression in specific cells, tissues or organs can be used.
  • expression of the exogenous nucleic acid will be limited to the cells, tissues or organs in which the enhancer and promoter sequences can activate transcription.
  • This can be desirable when constructing libraries of viral-based expression vectors. Indeed, as will be described in more detail below, such construction requires the introduction of expression vectors into cells at some point, a step that can lead to the loss of vectors comprising exogenous nucleic acids whose expression is deleterious or toxic to the cell type used in the library construction procedure.
  • Such losses can be avoided by ensuring that exogenous nucleic acids are expressed from cell-specific or tissue-specific enhancer and promoter sequences that are not active in the cell type used for library construction.
  • the following DNA fragments are just a few examples of enhancer and promoter sequences that can activate the expression of exogenous nucleic acid only in specific cell populations. Nucleotides are numbered relative to the site of initiation of transcription (+1). A fragment encompassing nucleotides -1700 to +1 of the rat osteocalcin gene can be used to achieve osteoblast-specific expression (Baker et al., 1992).
  • a fragment encompassing nucleotides -1542 to -1 of the kidney androgen-regulated protein can be used to achieve kidney-specific expression (Ding et al., 1997).
  • Signals for the termination and polyadenylation of transcripts are well known in the art. Examples include part of the 3' untranslated region of the bovine growth hormone gene or of the SV40 virus.
  • Unique cloning sites (204) are introduced between the promoter sequences and the termination signal and are used to insert exogenous nucleic acids (205). To decrease the probability of cleaving the exogenous nucleic acids during their insertion process, these sites are generally recognized by or compatible with sites recognized by enzymes which infrequently cut DNA molecules (e.g. NotI, SalI).
  • the transcription unit as schematized on Figure 2B, comprises a recombinase coding sequence (211) linked at its 5' end to a minimal promoter sequence (210) and at its 3' end to a transcription termination and polyadenylation sequence (203).
  • the minimal promoter is typically approximately 30-40 nucleotides in length. Its only functional element is a "TATA box" about 30 nt upstream of the site of initiation of transcription.
  • the minimal promoter sequence can be derived from naturally occurring genes (e.g. pro-opiomelanocortin (Therrien and Drouin, 1991)) or be entirely synthetic.
  • the minimal promoter is chosen such that the level of expression of the recombinase in the screening host is insufficient to mediate efficient recombination of substrate.
  • Unique cloning sites (204) are present, generally immediately upstream of the minimal promoter, to insert an exogenous nucleic acid (205) to be tested for its transcriptional properties. It is understood that the minimal promoter can be omitted to screen for sequences containing complete transcriptional activity.
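The placement of the "TATA" box in a minimal promoter, as described above, can be checked with a short sketch. The promoter sequence below is hypothetical, and the convention that the sequence ends at position -1 (immediately upstream of the transcription start site) is an assumption made for the example.

```python
def motif_positions(promoter, motif="TATA"):
    """Positions of `motif` relative to the transcription start site.

    `promoter` is assumed to end at position -1, so a motif starting at
    string index i is reported at position i - len(promoter).
    """
    positions = []
    index = promoter.find(motif)
    while index != -1:
        positions.append(index - len(promoter))
        index = promoter.find(motif, index + 1)
    return positions


# HYPOTHETICAL 40-nt minimal promoter with a TATA box starting at -30.
promoter = "G" * 10 + "TATAAA" + "GC" * 12
print(len(promoter), motif_positions(promoter))
```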
  • the expression vector also comprises a recombinase substrate (309).
  • the recombinase substrate (309) is composed of a stuffer region (302) containing a restriction site (R) flanked by recombination target sequences in the same orientation (301).
  • Site-specific recombination (303) leads to removal of the stuffer and one recombination target sequence as well as disappearance of the restriction site.
  • Recombined expression vectors (305) can be distinguished from non recombined expression vectors (304) after digestion with restriction enzyme R and PCR amplification using primers located upstream and downstream (306, 307) of the site of recombination.
  • the recombination target sequences (301) are in direct orientation relative to one another and are separated by a stuffer region (302) which comprises one or more rare restriction sites (R; e.g. NotI, PacI, SwaI).
  • Site-specific recombination (303) leads to removal of the stuffer and one recombination target sequence. Consequently, the rare restriction site is also deleted from the recombined molecule.
  • unrecombined expression vectors (304) can be distinguished from recombined expression vectors (305) by their size and by the restriction patterns obtained after digestion by the enzyme cleaving at said rare restriction site.
  • the restriction site present in the stuffer region should be unique in the expression vector such that unrecombined expression vectors can be distinguished from recombined expression vectors by their sensitivity to the enzyme cleaving at the rare restriction site. Furthermore, the restriction site should be rarely found in DNA molecules to decrease the probability of cleaving the exogenous nucleic acid, thereby allowing a region comprising the exogenous nucleic acid to be amplified by PCR from recombined vectors using primers located upstream and downstream (306,307) of the recombination site.
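The digestion-then-PCR selection described above can be sketched as follows. This is a simplified model: NotI (recognition sequence GCGGCCGC) stands in for enzyme R, all other sequences are hypothetical, both primers are written as their top-strand binding sites, and a template cut between the primers is assumed to give no product.

```python
NOTI = "GCGGCCGC"  # NotI recognition sequence, used here as enzyme R


def pcr_after_digest(vector, upstream, downstream, site=NOTI):
    """Return the amplicon, or None if the template is cut between primers.

    `upstream` and `downstream` are the top-strand binding sites of the
    primers (306, 307); a real reverse primer would be the reverse
    complement of `downstream`.
    """
    start = vector.find(upstream)
    if start == -1:
        return None
    end = vector.find(downstream, start + len(upstream))
    if end == -1:
        return None
    amplicon = vector[start:end + len(downstream)]
    return None if site in amplicon else amplicon


# HYPOTHETICAL sequences for illustration.
UP, DOWN = "ACGTACGTAC", "TGCATGCATG"
TARGET = "AACCGGTTAC"
INSERT = "ATGAAACCCGGGTAA"
STUFFER = "TTT" + NOTI + "TTT"

unrecombined = UP + INSERT + TARGET + STUFFER + TARGET + DOWN  # (304)
recombined = UP + INSERT + TARGET + DOWN                       # (305)

print(pcr_after_digest(unrecombined, UP, DOWN))
print(pcr_after_digest(recombined, UP, DOWN))
```

Only the recombined vector yields a product, and that product still spans the exogenous nucleic acid.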
  • the recombinase substrate (309) is composed of a stuffer region (302) which is flanked by recombination target sequences in the same orientation (301), which disrupts a coding sequence conferring resistance to a given antibiotic (310), and which is expressed from a prokaryotic promoter (311).
  • the vector is designed such that the remaining recombination target sequence (312) after site-specific recombination (303) no longer interferes with the production of protein conferring resistance to the antibiotic. Therefore, recombined expression vectors (314) give rise to colonies (316) when transformed into bacteria whereas non recombined expression vectors (313) do not (315).
  • the DNA segment which comprises recombination target sequences (301) in direct orientation relative to one another and separated by a stuffer region (302) is inserted in the expression vector within a gene conferring resistance to a given antibiotic (310) (e.g. aminoglycoside phosphotransferase conferring resistance to kanamycin) such that it disrupts its proper function.
  • Disruption can be achieved by interrupting the coding sequence of said gene or by abolishing the expression of said gene through insertion of recombination target sequences and stuffer in essential promoter sequences or between promoter (311) and coding sequences.
  • Site-specific recombination (303) will lead to removal of one recombination target sequence and the stuffer segment.
  • the expression vector is designed such that the remaining recombination target sequence (312) no longer interferes with the proper function of the gene conferring resistance to a given antibiotic.
  • bacteria transformed with recombined expression vectors (314) will be resistant to a given antibiotic (316) whereas those transformed with unrecombined expression vectors (313) will not (315).
  • a rare restriction site can be introduced in the stuffer segment, as described above, to distinguish recombined and unrecombined expression vectors after digestion with an enzyme recognizing such a rare restriction site.
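The antibiotic-resistance selection can be sketched for the case in which the recombinase substrate interrupts the coding sequence. All sequences are hypothetical, and the design rule shown (a remaining target sequence whose length is a multiple of three and that carries no in-frame stop codon) is one assumed way of meeting the requirement that the remaining target "no longer interferes"; the source does not specify how this is achieved.

```python
STOPS = {"TAA", "TAG", "TGA"}


def encodes_full_length(cds):
    """True if `cds` is an intact open reading frame (ATG ... stop)."""
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOPS and not any(c in STOPS for c in codons[:-1])


# HYPOTHETICAL miniature "resistance gene" split into two halves.
CDS_5, CDS_3 = "ATGGCT", "GGTTAA"
TARGET = "GCTAGCGGATCC"   # 12 nt, no in-frame stop codon (assumed design)
STUFFER = "TAATAGTGA"     # carries in-frame stop codons

unrecombined = CDS_5 + TARGET + STUFFER + TARGET + CDS_3  # (313)
recombined = CDS_5 + TARGET + CDS_3                       # (314)

vectors = {"unrecombined": unrecombined, "recombined": recombined}
colonies = [name for name, cds in vectors.items() if encodes_full_length(cds)]
print(colonies)
```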
  • the recombinase substrate (309) is composed of a stuffer region (302) flanked by recombination target sequences in the same orientation (301) which disrupts translation of a viral genome (321) in which a transcription unit (320) comprising an exogenous nucleic acid (205) has been embedded.
  • the defective viral genome is expressed from eukaryotic promoter and enhancer elements (322). Signals for transcription termination and polyadenylation of transcripts are provided (203).
  • the vector is designed such that the remaining recombination target sequence (312) after site-specific recombination (303) no longer interferes with the translation of the viral genome. Therefore, recombined expression vectors (325) give rise to viral particles whereas non recombined expression vectors (324) do not.
  • the transcription unit and the associated exogenous nucleic acid (320) are embedded in a viral genome (321) whose translation and replication have been disrupted by insertion of a DNA segment comprising recombination target sequences (301) in direct orientation relative to one another and separated by a stuffer region (302).
  • the viral genome is a cDNA copy of a Sindbis virus replicon (see GenBank™ acc. No. NC_001547; and WO 02/16572 incorporated herein by reference) cloned into a DNA-based plasmid downstream of constitutively active enhancer and promoter sequences (322) and whose translation is disrupted by insertion of said DNA segment in the 5' untranslated region of the viral genome.
  • the enhancer and promoter sequences driving expression of the disrupted viral genome are different from those comprised in the transcription unit and driving expression of the exogenous nucleic acid.
  • the viral genome contains the transcription unit and associated exogenous nucleic acid between the viral coding sequence and the 3' untranslated region (323). Once transfected, such a DNA plasmid will lead to expression of the exogenous nucleic acid (205). Site-specific recombination (303) will lead to removal of one recombination target sequence and the stuffer segment.
  • the expression vector is designed such that the remaining recombination target sequence (312) no longer interferes with the translation and replication of the Sindbis virus replicon.
  • recombined expression vectors (325) produce self-replicating and self-packaging viral genomes that contain the exogenous nucleic acid whereas unrecombined expression vectors (324) do not.
  • the expression vector or part thereof is embedded in a viral genome to generate a viral-based expression vector that minimally contains 1) a transcription unit and 2) a recombinase substrate having at least one of the properties described above.
  • a viral-based expression vector may then be packaged within infectious viral particles.
  • These are particularly useful to deliver the expression vector into a whole organism or into cells that are difficult to transfect by conventional methods (e.g. primary cells, immortalized cell lines of hematopoietic origin).
  • Engineered retroviruses and adenoviruses are commonly used to introduce nucleic acid into cells (Ragot et al., 1998; Pear et al., 1993).
  • Standard methodology may be used according to the invention to insert an expression vector in a viral genome and package the resulting viral genome within infectious viral particles.
  • the transcription unit and the recombinase substrate of the expression vector should be flanked by viral sequences that are either essential for replication and packaging or that are sufficient to insert components of the expression vector in a viral genome via homologous recombination.
  • the present invention relies on the conditional activity of a site-specific recombinase to select expression vectors containing nucleic acids having a desired feature such as those encoding a peptide/protein with a specific cellular function. It is understood that the recombinase activity should somehow be dependent on the occurrence of the specific cellular function. Before choosing a screening host (cell line or transgenic animal or plant), it is important to ascertain that 1) the specific cellular function does not occur in the absence of an "activating" exogenous nucleic acid, i.e. that the recombinase is not active under basal conditions; and 2) that the cellular function can occur if the right conditions are met, for example transfection of an expression vector containing an "activating" exogenous nucleic acid.
  • the specific cellular function being screened for is the activation of a particular gene or set of genes.
  • the recombinase coding sequence is placed under the control of regulatory elements known to be responsible for the activation of this particular gene or set of genes.
  • the recombinase will be expressed solely if an expression vector contains an exogenous nucleic acid that can activate transcription from the regulatory elements.
  • regulatory elements have been described in the prior art. They are generally composed of repeats of synthetic oligonucleotides or relatively small gene fragments. They activate transcription under known conditions.
  • For example, transcription from cyclic AMP response elements (CRE) is activated by increased intracellular cyclic AMP levels, a well-known second messenger to many hormones (Tamai et al., 1997). As another example, transcription from a 1.7 kb fragment of the osteocalcin gene is activated upon osteoblast terminal differentiation (Baker et al., 1992).
  • regulatory elements can be operatively linked to the recombinase coding sequence to obtain a conditionally active form of the recombinase. Transcription termination and polyadenylation signals are also added to the 3' of the recombinase coding sequence.
  • Another approach can be used to place the expression of a recombinase coding sequence under the control of specific regulatory elements.
  • This is the so-called "knock-in” technique, whereby, according to a preferred embodiment of the present invention, the whole or part of the expressed sequence of a specific cellular gene is replaced by a recombinase coding sequence, or whereby a recombinase coding sequence is inserted into a specific cellular gene.
  • the result of such replacement or insertion is that the expression of a recombinase coding sequence mimics the expression of a cellular gene.
  • Methods to insert into or replace specific cellular genomic sequences are known in the prior art.
  • a recombinase coding sequence can be expressed solely under the desired conditions, thereby creating a conditionally- active form of the recombinase.
  • the specific cellular function being screened for is the translocation of a signaling molecule from the cytosol to the nucleus.
  • the recombinase is fused to the signaling molecule and the mRNA encoding the fusion protein is constitutively expressed from enhancer and promoter sequences.
  • a number of signaling molecules are known to shuttle between the cytosol and the nucleus depending on the activation state of certain cellular pathways. For example, part of the NF-κB complex translocates to the nucleus upon activation of lymphocytes.
  • Smad4 is another example of a signaling molecule that translocates to the nucleus as a result of TGFβ binding to its cognate receptor at the cell surface (Wrana and Attisano, 2000). Furthermore, it is well known that many nuclear receptors (e.g. glucocorticoid receptor) translocate to the nucleus upon ligand binding. In the absence of activation, the signaling molecule is retained in the cytosol, thereby leading to retention of the fused recombinase in the cytosol. Since the recombinase must be located in the nucleus to perform site-specific recombination, the recombinase is inactive when retained in the cytosol.
  • Upon activation of the signaling molecule (i.e. a specific cellular pathway), the fusion protein translocates to the nucleus, where the recombinase moiety acts on the expression vector containing an exogenous nucleic acid whose expression triggered activation of the specific pathway.
  • the specific cellular function being screened for is the stabilization of a particular messenger RNA.
  • Some messenger RNAs are unstable because they contain one or more "destabilizing" sequences, usually located in their 3' untranslated region.
  • Various mRNA-destabilizing sequences have been reported.
  • the recombinase coding sequence is fused to a specific "destabilizing" sequence.
  • the chimeric mRNA is expressed from constitutively active enhancer and promoter sequences.
  • the recombinase is not produced because of the instability of the chimeric mRNA.
  • a conditionally-active recombinase can be obtained by inserting in its mRNA a chosen destabilizing sequence.
  • the conditionally active recombinase sequence is inserted into a plasmid containing a gene conferring resistance to a selective agent (e.g. puromycin-N-acetyltransferase conferring resistance to puromycin).
  • the resulting construct is transfected into cells using standard protocols (e.g. electroporation) and selection is applied. Surviving and growing cells are thought to have incorporated the plasmid and are cloned. Individual clones are analyzed by Southern blotting to confirm the presence of the construct within the genomic DNA.
  • Clones are tested to determine whether site-specific recombination can be activated when the conditions initially set forth are met, for example when transcription from regulatory elements linked to the recombinase coding sequence is activated or when the fusion protein containing a recombinase moiety is translocated to the nucleus.
  • Cellular clones containing a conditionally-active form of a recombinase are used as screening hosts.
  • the conditionally active recombinase sequence is inserted into a fertilized egg (e.g. of a mouse), which is re-implanted into a pseudo-pregnant mother.
  • DNA extracted from resulting organisms (e.g. embryos, pups or adults) is analyzed by Southern blotting to determine whether the organism is transgenic. Positive animals are bred and used as screening hosts.
  • the conditionally-active form of a recombinase can be incorporated in the genome of embryonic stem (“ES”) cells (e.g. of mouse origin).
  • the transgenic ES cells can be aggregated with morulae or injected into blastocysts to obtain chimeric animals. If the transgenic ES cells have populated the germline, then the resulting chimeric animal can be bred to obtain a line of transgenic animals, which can be used as screening hosts.
  • exogenous nucleic acid may be derived from any source, i.e. any organism, tissue or cell type, disease state, etc.
  • a plurality of different nucleic acids is inserted into a plurality of copies of an expression vector to provide a plurality of recombinant expression vectors each expressing a unique exogenous nucleic acid and/or encoding a unique protein or peptide.
  • a nucleic acid encoding one particular exogenous protein or peptide may be inserted into the expression vector.
  • the exogenous nucleic acid is derived from a nucleic acid library and a plurality of exogenous nucleic acids are inserted into multiple expression vector copies to yield a pool of recombinant expression vectors.
  • the library may be obtained from a tissue or a cell type of interest or synthesized artificially.
  • This library may be a cDNA library, a genomic library, an RNA library, an expressed sequence tag library, a library made of randomized artificial sequences, or any other kind of library comprising nucleic acids from any kind of organism, tissue, or cell type known to the skilled artisan.
  • the library is derived from a mammalian source.
  • the library may also be derived from reptilian, amphibian, avian, insect, plant, fungal or bacterial cells, etc.
  • the exogenous nucleic acid may be derived from mRNA isolated from a tissue or cell type of interest. In this case, the mRNA would be purified and reverse transcribed into cDNA using methods well known in the art.
  • the nucleic acid library will be derived from a subtractive library, for example a library which comprises cDNAs differentially expressed in a disease state when compared to the corresponding healthy tissue. Suitable nucleic acid libraries may be generated using standard methods (see for example Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor (1989)).
  • exogenous nucleic acids of any type can be screened and selected using the present invention, examples given below rely on cDNA, fragments of cDNA or fragments of genomic DNA as a source of exogenous nucleic acids.
  • initiation and termination codons may be provided by the expression vector upstream and downstream of the cloning site(s) for the fragments of cDNAs, respectively.
  • for fragments of genomic DNA, a library is made starting either with whole genomic DNA or with DNA insert(s) from a λ bacteriophage, cosmid or bacterial artificial chromosome containing genomic DNA.
  • Exogenous nucleic acids are generated by partial digestion of the DNA with a restriction enzyme cutting DNA frequently (e.g. Sau3A, RsaI) and can be size-selected by sieve chromatography (e.g. Sepharose™ CL2B column).
  • the exogenous nucleic acids are cloned into the expression vectors to produce recombinant expression vectors.
  • the resulting population of recombinant expression vectors is transformed in Escherichia coli by electroporation according to standard procedures.
  • a typical yield is 5x10⁵ to 5x10⁷ transformants/µg of cDNA depending on the expression vector.
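As a rough illustration of the yield quoted above, the following sketch converts a mass of cDNA into an expected range of transformants. The function name and the example mass are hypothetical; only the two yield bounds come from the text.

```python
def expected_transformants(cdna_ug, yield_low=5e5, yield_high=5e7):
    """Return (low, high) expected transformant counts for a cDNA mass in µg.

    The default bounds are the typical yields quoted in the text
    (5x10^5 to 5x10^7 transformants per µg of cDNA).
    """
    if cdna_ug < 0:
        raise ValueError("cDNA mass must be non-negative")
    return cdna_ug * yield_low, cdna_ug * yield_high


# Hypothetical example: 2 µg of cDNA electroporated into E. coli.
low, high = expected_transformants(2.0)
print(f"expected transformants: {low:.1e} to {high:.1e}")
```

The actual yield depends on the expression vector and on the competence of the cells, so the range is only an order-of-magnitude guide.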
  • Plasmids may be prepared and purified according to standard procedures. Additional steps may be needed to obtain a population of viral-based expression vectors.
  • plasmids comprise viral sequences essential for replication and packaging as well as components of an expression vector.
  • the population of plasmids is transfected into a cell line expressing the viral proteins necessary for replication and packaging to generate a plurality of recombinant viral genomes that are subsequently packaged.
  • plasmids comprise parts of the viral genome separated by the components of an expression vector.
  • the population of plasmids is transfected, along with a replication-defective viral genome, in a cell line that can complement the replication defect, usually HEK293 cells.
  • Homologous recombination between a plasmid and a viral genome generates a recombinant viral genome having inserted the components of the expression vector.
  • This recombinant viral genome is subsequently packaged and can be propagated in HEK293 cells.
  • a plurality of recombinant viral genomes is thus produced and packaged to obtain a plurality of adenoviral-based expression vectors.
  • a suitable host should be able to perform the cellular function being screened for, but it should not exhibit this function in the absence of an "activating" condition (e.g. expression of an appropriate exogenous nucleic acid). If screening is performed using viral-based expression vectors, it is necessary that the host be infected with the recombinant viral particles. In preferred embodiments of the present invention, the genome of the host should also harbor a conditionally-active form of a recombinase. Introduction of a recombinant expression vector into a eukaryotic host cell can be carried out using a number of different well known procedures.
  • Transfection by electroporation, lipofection, calcium phosphate, and micro-injection are only a few of the available techniques to introduce nucleic acids into eukaryotic cells.
  • Introduction of recombinant viral-based expression vectors into eukaryotic host cells is simply carried out by incubating the host with the viral particles and allowing infection to proceed.
  • infection is usually performed at a multiplicity of infection (m.o.i.) greater than 1 (e.g. 10 plaque-forming units/cell).
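The reason for using a multiplicity of infection above 1 can be illustrated with a short calculation. The sketch below assumes that viral particles are distributed among cells according to a Poisson distribution; this is a common modelling assumption and is not stated in the text. The function name is hypothetical.

```python
import math


def fraction_infected(moi):
    """Fraction of cells receiving at least one viral particle.

    Assumes particles are Poisson-distributed over cells, so the
    fraction of cells receiving none is exp(-moi).
    """
    if moi < 0:
        raise ValueError("m.o.i. must be non-negative")
    return 1.0 - math.exp(-moi)


for moi in (0.1, 1, 10):
    print(f"m.o.i. {moi}: {fraction_infected(moi):.4f} of cells infected")
```

Under this assumption, a substantial fraction of cells remains uninfected at an m.o.i. of 1, whereas nearly all cells receive at least one particle at an m.o.i. of 10.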
  • Introduction of a recombinant expression vector into a transgenic animal or plant or part thereof can be carried out by electroporation or by injection of complexes comprising lipid derivatives and DNA (e.g. intravenous or peritoneal injections).
  • Introduction of a recombinant viral-based expression vector into a transgenic organism is performed by injection of viral particles, e.g. in the case of recombinant adenoviruses, 10⁸ plaque forming units in 0.05 ml of saline intraperitoneally (Mittal et al., 1993).
  • the transfected or infected cell should provide most of the molecular machinery for the proper expression and/or function of the exogenous nucleic acid contained therein.
  • the biological function encoded by the exogenous nucleic acid is carried out, if any, directly or through the corresponding protein or peptide if it contains an open reading frame. If this biological function somehow triggers the activity of the conditionally-active recombinase present in the host (e.g. by activating its expression, by inducing its translocation into the nucleus, by stabilizing its mRNA), then the expression vector or viral-based expression vector containing said exogenous nucleic acid will be recombined.
  • genomic and extrachromosomal DNA are extracted using standard techniques to recover a pool of expression vectors. Typically, this step is done 24 to 72 hours after introduction into the host of the expression vector having incorporated an appropriate exogenous nucleic acid.
  • This step generally involves lysis of cells using a buffer containing ionic detergent (e.g. 1% SDS) followed by digestion of proteins using proteinase K and purification of DNA by ion-exchange chromatography or phenol extraction and ethanol precipitation.
  • activation of a site-specific recombinase results in the modification of the expression vector, this modification leading to the removal of a unique restriction site contained in the expression vector.
  • unrecombined expression vectors can be cut by the enzyme recognizing the restriction site whereas recombined expression vectors can not. Since it is known that cleaved vectors are much less efficiently transformed in bacteria than uncleaved vectors, this property may be used to identify expression vectors that have been recombined.
  • the method to identify recombined expression vectors comprises the steps of: a) extracting DNA from cells into which the expression vector(s) with an exogenous nucleic acid sequence has been introduced; b) digesting the DNA of step a) with a restriction enzyme recognizing a restriction site present in the stuffer region of the expression vector, between recombination target sequences; c) transforming bacteria with the digested DNA molecules of step b); d) culturing the transformed bacteria in a selection media (e.g.
  • step b it may be preferable to degrade cleaved unrecombined expression vectors (step b) with a nuclease that acts on extremities of double strand DNA (e.g. lambda exonuclease). Circular uncleaved recombined molecules are protected from the action of such nucleases.
  • the method to identify recombined expression vectors comprises the steps of: a) extracting DNA from host(s) into which the expression vector with an exogenous nucleic acid sequence has been introduced; b) transforming bacteria with the DNA molecules extracted at step a); c) selecting for bacterial colonies resistant to a selection media (e.g.
  • PCR polymerase chain reaction
  • cleaved unrecombined expression vectors can be degraded by a nuclease that acts specifically on free 5' extremities of double strand DNA (e.g. bacteriophage lambda exonuclease VII). Since the 5' extremities of adenovirus derivatives are covalently linked to a protein moiety, uncleaved linear recombined molecules are protected from the action of such nucleases.
  • the amplicon can be cloned in a general purpose bacterial plasmid (e.g. pBLUESCRIPT™ KS II). If recombination has resulted in the reconstitution of a sequence encoding resistance to a given antibiotic (e.g. neomycin phosphotransferase conferring resistance to kanamycin) and if the amplicon contains the whole antibiotic resistance coding sequence, then bacteria transformed with a plasmid harboring the amplicon may be preferably selected on medium containing an appropriate antibiotic, thereby ensuring that only amplicons derived from recombined molecules are cloned.
  • activation of a site-specific recombinase results in the modification of the expression vector, this modification leading to the removal from the expression vector of a stuffer region preventing the replication and translation of a cDNA copy of the Sindbis virus genome that comprises the exogenous nucleic acid.
  • the exogenous nucleic acid is inserted immediately upstream of the 3' untranslated region of the cDNA copy of the Sindbis virus genome and it is expressed from enhancer and promoter sequences preferably also embedded in the cDNA copy of the Sindbis virus genome.
  • recombined expression vectors produce self-replicating and self-packaging viral genomes that contain the exogenous nucleic acid whereas unrecombined expression vectors do not.
  • Viral particles are therefore produced only in cells in which an exogenous nucleic acid encoding a desired function has been expressed. Because the exogenous nucleic acid and, generally, the promoter and enhancer sequences, are embedded in the cDNA copy of the viral genome, the viral particles that are produced after recombination contain a copy of the desired exogenous nucleic acid.
  • Viral particles are collected i) from the culture medium of host cells transfected with a library of expression vectors, or ii) from extracellular fluids (e.g.
  • Viral particles can be infectious to allow the propagation of the viral genome comprising desired exogenous nucleic acids.
  • Viral genomes can be recovered from infectious viral particles by infecting a susceptible cell line (e.g. BHK-21 in the case of recombinant Sindbis viral particles), extracting nucleic acids from infected cells (e.g. RNA in the case of recombinant Sindbis virus).
  • a DNA fragment containing the exogenous nucleic acid can be obtained by PCR (after reverse transcription of RNA in the case of Sindbis virus) using primers located upstream and downstream of the site of insertion of the exogenous nucleic acid.
  • viral particles produced after recombination of expression vectors can be conditionally infectious to prevent unwanted effects of a viral infection on the screening host, particularly in the case of transgenic animal screening hosts.
  • conditionally-infectious Sindbis virus particles can be obtained by preventing the cleavage of the p62 envelope precursor protein, usually by introducing a deleterious mutation in the sequence coding for the cleavage site (Berglund et al., 1993).
  • Viral particles produced under these conditions or from such a modified Sindbis virus genome can be recovered and partially purified (e.g. by centrifugation on density gradient or by heparin-agarose affinity chromatography).
  • viral particles produced from p62 cleavage deficient mutants can be rendered infectious after recovery from screening host by controlled digestion with chymotrypsin and used to infect a susceptible cell line (e.g. BHK-21 fibroblasts).
  • the identity of the exogenous nucleic acid inserted into a recombined expression vector can be determined by sequencing appropriate region(s) of the plasmids recovered from bacteria or by sequencing appropriate region(s) of the DNA fragment comprising the exogenous nucleic acid and obtained, for example, by PCR amplification. Sequence comparisons with known polynucleotide sequences in databases may confirm the function of the isolated exogenous nucleic acid and/or reveal homologies with nucleic acids encoding known functions.
  • exogenous nucleic acids inserted into recombined expression vectors can also be i) analyzed by digestion with restriction enzymes followed by gel electrophoresis; ii) used as hybridization probe(s) in expression profiling or microarray analysis; iii) otherwise characterized.
  • exogenous nucleic acids selected and identified according to the methods of the invention, as well as the peptides and proteins encoded by the same, may have many uses. They may be useful for research applications and laboratory use. For instance, they may be used for further screening procedures (e.g. as a library), they may serve as probes for the discovery and isolation of various genes and/or diseases, be used for the production of antibodies, or be used for the development and the use of oligonucleotide or oligoribonucleotide sequences, antisense DNA or RNA molecules, or ribozymes. Some of the genes and gene products identified and isolated by the method of the present invention may directly be used as therapeutic agents or, alternatively, as therapeutic targets. These applications and others are known in the art as well as the manner in which they can be reduced to practice.
  • Example 1 shows the properties of a Flp recombinase whose coding sequence has been partially optimized (oFlp).
  • Example 2 gives an example of a plasmid-based expression vector, a cell line in which the activity of oFlp is regulated in a specific manner, and methods to selectively recover recombined forms of plasmid-based expression vectors.
  • Example 3 gives an example of a viral-based expression vector and a method to selectively recover its recombined form after infection of cells constitutively expressing an active form of oFlp.
  • Example 4 gives an example of a vector carrying a cDNA copy of the Sindbis virus genome inactivated through insertion of a recombinase substrate and a method to recover viral particles after recombination of this vector.
  • Example 5 gives a hypothetical example of a screening performed in a transgenic animal using a virus-based expression vector.
  • Example 6 gives a hypothetical example of an in vivo screening performed to identify tissue-specific regulatory elements.
  • Enzymes and reagents Restriction enzymes and DNA-modifying enzymes were purchased from
  • TITAN™ one-tube RT-PCR system was purchased from Roche Molecular (Laval, Quebec, Canada).
  • Taq DNA polymerase was purchased from Amersham Pharmacia Biotech (Baie d'Urfe, Quebec, Canada).
  • Synthetic oligonucleotides were obtained either from Hu Stamm Ltd. (Montreal, Quebec, Canada), Life Technologies (Burlington, Ontario, Canada) or MWG Biotech Inc. (High Point, North Carolina). Cell culture reagents were from Life Technologies unless otherwise stated.
  • pQuantox™, pQBI25fc3™ and pQBIAdBN™ were purchased from
  • pREP4 GenBank™ accession number A25856
  • pQE30™ were from Qiagen Inc.
  • pBluescript II™ SK (+) was from Stratagene (California).
  • DH-BB, pSinRep5 and pcDNA1.1/Amp were from Invitrogen (Carlsbad, Ca.).
  • the sequence of Flp recombinase was amplified by 30 cycles of PCR from approximately 200 ng of yeast DNA using 25 pmoles of forward primer 20-5 (SEQ ID NO:29), 25 pmoles of reverse primer 23-1 (SEQ ID NO:30) and 1 U of high-fidelity Vent DNA polymerase (New England Biolabs, Ma.) in 50 µl of 1x Vent reaction buffer supplemented with 3% DMSO and 200 µM dNTP.
  • Comparison of the sequence of our clone with published Flp sequence (GenBank™ accession number J01347) revealed no difference.
  • pCMVneo was derived from pQBI25fc3™ (Quantum Biotechnologies, Montreal, Canada) by deletion of a 758 bp SacII-ApaI fragment.
  • the Flp PCR fragment was cloned at the NruI site of expression vector pCMVneo in order to achieve production of the Flp recombinase in transfected cells.
  • the resulting plasmid is designated RC6.
  • a 1654 bp BsmI-DraIII fragment was deleted from RC6 to generate RC26.
  • oligonucleotides 89-1 (SEQ ID NO:31), 86-1 (SEQ ID NO:32), 83-1 (SEQ ID NO:33), 82-1 (SEQ ID NO:34), 80-1 (SEQ ID NO:35) and 74-1 (SEQ ID NO:36) were phosphorylated and annealed in 70 µl of 10mM Tris-HCl pH 7.5/100mM NaCl/1mM EDTA by heating at 85°C for 10 minutes and decreasing the temperature at a rate of 1°C/minute. Gaps were filled using 3 U of T4 DNA polymerase and extremities were ligated using 1 Weiss U of T4 DNA ligase.
  • the resulting 391 bp fragment was isolated by electrophoresis on a 2% agarose gel, purified using the QiaQuick™ kit (Qiagen) and amplified by 25 cycles of PCR using 1 U Vent DNA polymerase and 25 pmoles of primers 24-6 (SEQ ID NO:37) and 18-88 (SEQ ID NO:38) in 90 µl of Thermopol™ 1x buffer containing 6% dimethylsulfoxide and 200 µM dNTP.
  • the PCR product was digested by BamHI and EcoRV and inserted into a BamHI-EcoRV digested RC33.
  • the resulting plasmid is designated RC59.
  • HEK293A cells (ATCC no. CRL-1575) are grown in Dulbecco's minimal essential medium supplemented with 10% (v/v) fetal bovine serum, 100 U/ml penicillin and 100 mg/ml streptomycin. Cells are passaged when reaching 80-95% confluence by incubating with 0.05% (v/v) trypsin/0.5mM EDTA (Wisent Inc.). Lipofection is performed as follows. Lipid:DNA complexes are formed in 100 µl of culture medium without serum using 3 µl of 1 mg/ml PEI (Sigma, St-Louis) per µg of DNA. Cells are transfected the day after plating (typically 10,000 cells/cm²) by adding the lipid:DNA complex to the culture medium. After a 3 hour incubation, the medium is changed and cells are usually processed after 48 hours.
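The transfection quantities given above scale linearly and can be sketched as two small helpers. The function names and the example dish area are hypothetical; the ratios (3 µl of 1 mg/ml PEI per µg of DNA, typically 10,000 cells/cm²) are those stated in the protocol.

```python
def pei_volume_ul(dna_ug, ul_per_ug=3.0):
    """Volume of 1 mg/ml PEI stock (µl) for a given mass of DNA (µg)."""
    if dna_ug < 0:
        raise ValueError("DNA mass must be non-negative")
    return dna_ug * ul_per_ug


def cells_to_plate(area_cm2, density_per_cm2=10_000):
    """Number of cells to plate on a surface of the given area (cm²)."""
    if area_cm2 < 0:
        raise ValueError("area must be non-negative")
    return round(area_cm2 * density_per_cm2)


# Hypothetical example: 2 µg of DNA on a 10 cm² well.
print(pei_volume_ul(2.0), "µl of PEI stock")
print(cells_to_plate(10), "cells plated the day before")
```

At a stock concentration of 1 mg/ml, 3 µl of PEI per µg of DNA corresponds to a PEI:DNA mass ratio of 3:1.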
  • pQE30™ (Qiagen, Mississauga, Ontario, Canada) contains an origin of replication, the β-lactamase coding sequence, and the taq promoter controlling the expression of a given fusion protein containing 6 histidines at its N-terminus.
  • an 865 bp HindIII fragment from plasmid RC6 was subcloned in HindIII-digested pQE30™.
  • the hexahistidine tag coordinates a nickel atom, thereby allowing purification of the fusion protein by metal affinity chromatography.
  • pQE30™ containing Flp 1-286 described above is transformed in strain M15[pREP4].
  • the fusion protein is produced and purified under denaturing conditions (6M guanidine hydrochloride, 200mM NaCl, 100mM sodium phosphate, 10mM Tris pH 8.0, 2mM imidazole, 5mM β-mercaptoethanol) according to the manufacturer's instructions (QIAEXPRESSIONIST™ kit, Qiagen, Mississauga, Ontario, Canada).
  • the protein solution is dialyzed at 4°C against 4 liters of PBS.
  • Approximately 200 µg of recombinant protein mixed with complete Freund's adjuvant (VWR Canlab, Montreal, Quebec, Canada) is injected subcutaneously into New Zealand White rabbits on day 1. On days 15 and 28, another 100 µg of recombinant protein mixed with incomplete Freund's adjuvant is similarly injected. Rabbits are bled 7 days after the last injection.
  • cells are rinsed twice with PBS and fixed with 2% (w/v) paraformaldehyde in PBS. Cells are washed with PBS and fixative is quenched by incubating 10 minutes in PBS supplemented with 50mM NH4Cl. Cells are then incubated overnight in PBS supplemented with 1% (w/v) bovine serum albumin fraction V (BSA), 0.1% (w/v) dried low fat milk and 0.05% (v/v) Triton X-100™.
  • Cells are incubated in a 1/25 dilution of antiserum in PBS/BSA 0.1%/milk 0.1%, washed and incubated with anti-rabbit IgG coupled to TRITC.
  • 50 µl of a 10% (v/v) solution of Igepal CA-630™ is added, the solution is briefly vortexed and centrifuged.
  • the nuclear pellet is incubated for 15 minutes on ice in 0.05 ml of buffer B (20mM HEPES pH 7.9; 400mM NaCl; 1mM EDTA; 1mM DTT; 10 µg/ml aprotinin).
  • the solution is centrifuged and the insoluble pellet is resuspended in Laemmli buffer (50mM Tris-HCl, pH 6.8, 100mM dithiothreitol, 2% sodium dodecyl sulfate (w/v), 0.1% bromophenol blue (w/v), 10% glycerol (v/v)) and boiled for 5 minutes.
  • Proteins are electrophoresed on denaturing polyacrylamide gel and transferred to 0.22 µm nitrocellulose according to standard protocols.
  • the nitrocellulose membrane is incubated overnight in Tris-buffered saline (TBS; 25mM Tris-HCl, pH 7.4, 137mM NaCl, 2.7mM KCl) supplemented with 5% (w/v) dried milk and 0.1% (v/v) TWEEN-20™ (Sigma, St.Louis, Mo.). It is then incubated for 1.5 hours at room temperature with affinity-purified antibody to Flp at a concentration of approximately 3 µg/ml in TBS supplemented with 0.1% (w/v) dried milk and 0.1% (v/v) TWEEN-20™.
  • the membrane is washed twice with TBS supplemented with 0.1% (v/v) TWEEN-20™. It is then incubated for 1 hour at room temperature with goat anti-rabbit coupled to horseradish peroxidase (Sigma, St.Louis, Mo.) diluted 1/30,000 in TBS supplemented with 0.1% (v/v) TWEEN-20™. The membrane is washed twice with TBS supplemented with 0.1% (v/v) TWEEN-20™. Detection of the protein bound to the antibody complex is performed with the ECL™ reagent according to the manufacturer's instructions (Amersham Pharmacia Biotech, Baie d'Urfe, Canada).
  • RNA extraction and Northern analysis: Total RNA is purified either by the guanidinium isothiocyanate/acid phenol method or using the RNEASY™ kit according to the manufacturer's instructions (Qiagen).
  • RNA was electrophoresed on a 1.2% agarose/1.2% formaldehyde gel and transferred onto a nylon membrane by capillarity. After UV crosslinking, the blot was probed with a radioactively-labeled full length wild type Flp fragment. After hybridization, the membrane was rinsed with 2xSSC/0.1% SDS and washed at 65°C once with 2xSSC/0.1% SDS and twice with 0.2xSSC/0.1% SDS. Signal was revealed by autoradiography.
  • DNA was extracted and purified using the Qiagen DNeasy™ tissue kit.
  • Extrachromosomal DNA was extracted using a modified Hirt procedure as follows
  • Example 1: Partially optimized coding sequence for the Flp recombinase. The Flp recombinase was chosen for this and subsequent experiments.
  • This example illustrates the properties of a partially optimized coding sequence of the Flp recombinase.
  • Analysis of the Flp coding sequence indicated that the 5' third of the sequence contained a number of codons rarely found in mammalian genes and presumably poorly translated in cells of mammalian origin. Most notably, 3 ATA and 5 TTA or CTA codons, encoding isoleucine (Ile) and leucine (Leu) respectively, are present in the yeast sequence but are the least preferred codons in mammals.
  • the 5' third of the yeast Flp coding sequence is AT-rich (64%) and contains a putative site of transcription termination (AATAAA, position 220).
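The kind of sequence analysis described above (counting disfavoured codons, measuring AT content and locating an AATAAA motif) can be sketched as follows. This is a minimal illustration and not the analysis actually performed; the function names and the toy sequence are hypothetical, and the real Flp coding sequence (SEQ ID NO:4) is not reproduced here.

```python
from collections import Counter

# Codons named in the text as least preferred in mammals.
RARE_CODONS = ("ATA", "TTA", "CTA")


def rare_codon_counts(cds):
    """Count in-frame occurrences of each rare codon in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    return {codon: counts.get(codon, 0) for codon in RARE_CODONS}


def at_fraction(seq):
    """Fraction of A and T residues in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)


def motif_positions(seq, motif="AATAAA"):
    """1-based start positions of every occurrence of a motif."""
    seq = seq.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


# Toy sequence for illustration only (not the Flp sequence).
toy = "ATGATATTACTAAATAAAGGC"
print(rare_codon_counts(toy))
print(f"AT content: {at_fraction(toy):.0%}")
print("AATAAA at:", motif_positions(toy))
```

Applied to the 5' third of a coding sequence, these three measures give a quick indication of whether codon optimization is likely to be worthwhile.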
  • the Flp sequence was optimized by an oligonucleotide-mediated gene synthesis technique (see Materials and methods). Alignment of wild-type Flp (SEQ ID NOS:4 and 5) and optimized Flp sequences (referred to hereinafter as oFlp; SEQ ID NOS:6 and 7) is shown in Figure 4A.
  • the increased amount of Flp produced from an optimized coding sequence can be useful in a screening experiment. Indeed, if a regulatory element linked to a Flp coding sequence is activated by a given stimulus, then more Flp transcript and protein shall be produced if the regulatory element is linked to an optimized Flp coding sequence rather than to a wild type Flp coding sequence, as was shown for the CMV enhancer/promoter elements. This may help to achieve higher sensitivity, particularly if transcription from the regulatory element is weakly activated by the stimulus of interest.
  • This example illustrates the various functionalities of a plasmid-based expression vector designed according to the present invention. It also shows that an expression vector can be selected after transfection in an engineered cell line, provided the vector expresses an exogenous nucleic acid capable of triggering Flp activity.
  • Plasmid construction is schematized on Figure 5. Oligonucleotides 62-2 (SEQ ID NO:9) and 62-3 (SEQ ID NO:10) (500) were annealed and the protruding extremities were blunted by the Klenow fragment of DNA polymerase I (501). The resulting fragment was cloned in a SspI site of plasmid pQuantox™ (502).
  • The resulting plasmid, RC1, was partially digested with HincII and completely with KpnI, the extremities were blunted and the plasmid was recircularized (503) to generate RC20a.
  • a 953 bp fragment was amplified from plasmid pREP4 (504) using forward primer 72-2 (SEQ ID NO:11) and reverse primer 18-73 (SEQ ID NO: 13) (505). This fragment was cloned in the unique EcoRV site of pBluescript II SK(+) to generate RC17 (506).
  • a 358 bp BamHI fragment from RC20a (507) was cloned into the BamHI site of RC17 to generate RC22 (508).
  • The unique NotI site of the latter plasmid was removed by digestion, fill-in and recircularization (509) to create RC24.
  • the resulting vector is RC32.
  • the transcription unit of RC32 can be said to comprise the cytomegalovirus immediate early gene enhancer/promoter regions followed by the coding sequence for green fluorescent protein (GFP) and a bovine growth hormone polyadenylation signal.
  • the recombinase substrate, composed of a recombination target sequence followed by a stuffer region and by another recombination target sequence, is inserted between the lacI promoter derived from plasmid pBluescript II™ SK (+) and nt 419 to 1318 of pREP4, encoding residues 26 to 267 of neomycin phosphotransferase (GenBank™ accession number AAK28133).
  • the XhoI site upstream of the CMV enhancer/promoter elements in RC32 was removed by partial digestion, fill-in and recircularization (512).
  • annealed oligonucleotides 24-7 (SEQ ID NO:17) and 24-8 (SEQ ID NO:18) (513) were cloned in a ScaI site of the resulting plasmid to generate RC43 (SEQ ID NO:1) (514), which contains a unique SwaI site in the stuffer region.
  • a map of RC43 is given in Figure 5C and Table 1 hereinafter.
  • Gal4VP16 is composed of the DNA-binding domain of Gal4 and the transcriptional activation domain of VP16 (Sadowski et al., 1988). Binding of Gal4VP16 to its cognate element in the context of a minimal promoter activates transcription of a sequence operatively linked to the minimal promoter (Webster et al., 1988).
  • a plasmid comprising oFlp operatively linked to a minimal promoter downstream of two copies of a consensus UAS site (Webster et al., 1988). This was done essentially as schematized on Figure 6.
  • the minimal promoter of CMV was amplified from pcDNAH (Invitrogen) using forward primer 31-6 (SEQ ID NO:24) and reverse primer 72-3 (SEQ ID NO:23) (601). This fragment encompasses the first 36 nt upstream of the site of initiation of transcription followed by 55 nt of the 5' untranslated region of the Sindbis virus genome.
  • HEK293 cells were electroporated at 600V/cm with a mixture of 9 µg of ScaI-linearized RC71 and 10 µg of denatured salmon sperm DNA. Cells were plated and selection (2.5 µg/ml puromycin) was applied 24 hours later. The concentration of puromycin was reduced to 0.5 µg/ml on day 4 and colonies were picked on day 11. Induction levels of oFlp expression were determined after transfection with RC74.
  • expression vectors (RC43 or RC74) were transfected in these cells, recovered after 48 hours and subjected to a selection procedure relying on reconstitution in the expression vector of a gene conferring resistance to kanamycin when transformed in E.coli only if the nucleic acid expressed by the vector is able to trigger the activity of oFlp.
  • DNA (0.4 µg) was transfected in approximately 200,000 cells by lipofection using the Effectene™ reagent (Qiagen Inc.) according to the manufacturer's recommendations. Extrachromosomal DNA was extracted from cells by a modified Hirt technique. 20ng of recovered DNA was transformed in E.
  • site-specific recombination of an expression vector expressing an exogenous nucleic acid (205) results in the removal of a unique SwaI restriction site (R) located in the stuffer region (309).
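The selection principle (excision of the stuffer removes the restriction site, so only unrecombined vectors remain cleavable) can be modelled on plain strings. The sketch below is a simplified illustration: the recombination target is a hypothetical placeholder rather than the actual Flp target sequence, the vector is a toy linear string rather than a circular plasmid, and the function names are invented. The SwaI recognition sequence ATTTAAAT is the only real sequence used.

```python
SWAI_SITE = "ATTTAAAT"  # SwaI recognition sequence


def recombine(vector, target):
    """Model excision between two directly repeated recombination targets.

    The stuffer and one copy of the target are removed; one copy remains.
    A vector with fewer than two targets is returned unchanged.
    """
    first = vector.find(target)
    if first == -1:
        return vector
    second = vector.find(target, first + len(target))
    if second == -1:
        return vector
    return vector[:first] + vector[second:]


def is_cut_by_swai(vector):
    """True if the sequence still contains a SwaI recognition site."""
    return SWAI_SITE in vector.upper()


# Toy vector: target, stuffer carrying a SwaI site, target.
TARGET = "GGTTCC"  # hypothetical placeholder, not a real recombinase target
unrecombined = "AAAC" + TARGET + "CG" + SWAI_SITE + "GC" + TARGET + "TTTG"
recombined = recombine(unrecombined, TARGET)

print("unrecombined cut by SwaI:", is_cut_by_swai(unrecombined))
print("recombined cut by SwaI:  ", is_cut_by_swai(recombined))
```

In this model, digestion with SwaI linearizes only the unrecombined molecules, which is what allows recombined vectors to be enriched by bacterial transformation or by PCR.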
  • a fragment can be specifically amplified from recombined expression vectors (802) using primers (803,804) flanking the recombination region and the exogenous nucleic acid.
  • Expression vectors recovered after transfection in 293-UASoFlp/6 cells were therefore also analyzed by PCR. In this case, half of the DNA recovered by the modified Hirt technique was subjected to SwaI digestion in order to cut unrecombined molecules (801).
  • No fragment is amplified from DNA extracted from untransfected 293-UASoFlp/6 cells (lane 812), from 293-UASoFlp/6 cells transfected with a control expression vector (RC43, lane 813) or from 293 WT cells transfected with RC74 (lane 815).
  • Example 3 Viral-based expression vector and method to selectively recover its recombined form after infection of cells constitutively expressing an active form of oFlp.
  • RNA-based expression vector comprises a transcription unit and recombinase substrate flanked by nucleotides 1-102 and nucleotides 3334-5779 of the Adenovirus serotype 5 genome (GenBank accession number 9626187).
  • a map of RC49-2 is given in Figure 9B and Table 2 hereinafter.
  • the transcription unit and recombinase substrate of RC49-2 were incorporated in an adenoviral genome by in vivo homologous recombination. This was done by co-transfecting 5 µg of PmeI-linearized RC49-2 with 5 µg of AdCMVlacZΔE1/ΔE3, a replication-defective genome obtained commercially (Quantum Biotechnologies, Montreal, Canada). Co-transfection of DNA molecules in HEK293 cells was carried out by means of a calcium phosphate precipitate using standard protocols. Two days post-transfection, cells were overlaid with medium containing 1.25% (w/v) low melting agarose.
  • Recombinant viral genome resulting from homologous recombination between RC49-2 and the replication-defective adenoviral genome can be propagated in HEK293 cells, as indicated by the presence of viral plaques composed of GFP-expressing cells.
  • a stock of viral particles was obtained after 2 successive rounds of plaque-purification according to standard protocols.
  • a cell line constitutively expressing an active oFlp was obtained in order to test the adenoviral-based expression vectors. This was done by integrating a plasmid, designated RC59, in the genome of Hela cells.
  • RC59 contains two distinct transcription units: one expressing the optimized Flp sequence from the CMV enhancer/promoter elements and the other conferring resistance to puromycin to stably transfected cells.
  • RC59 was linearized at a unique ScaI restriction site to facilitate integration in the host genome.
  • the linearized vector (10 µg) was electroporated in 5 million Hela cells. Electroporated cells were plated in five 100mm petri dishes and allowed to recover for 24 hours, at which time puromycin (1 µg/ml) was added to the medium. After 4 days of selective pressure, the concentration of puromycin was decreased 10-fold to allow growth of cell colonies. Isolated colonies (20-100 cells) were picked 10 days later. Clones were tested for expression of optimized Flp coding sequence by Northern analysis.
  • FIG. 10 shows the Flp transcript levels in 4 subclones, Hela/oFlp2-3 (lane 1002), Hela/oFlp3-2 (lane 1003), Hela/oFlp6-1 (lane 1004) and Hela/oFlp6-2 (lane 1005). Wild type Hela cells do not express Flp (lane 1001). It is well known that expression of a transgene can vary from one subclone to another, depending for example on the number of copies of the transgene and/or its site of integration.
  • Figure 11A: Analysis of the PCR products by agarose gel electrophoresis revealed that about 50% of the viral-based expression vectors were recombined in Hela/oFlp6-2 cells (lane 1102).
  • the expected sizes of the amplicons are 3544 bp from the non recombined viral-based expression vector (1105) and 3182 bp from the recombined viral-based expression vector (1106). Because recombination leads to removal of a unique SwaI site from the viral-based expression vector, the signal arising from non recombined viral-based expression vectors can be eliminated by SwaI digestion.
  • Genomic DNA (1.5 µg) was therefore digested with 20 U of SwaI for 6 hours at 22°C, purified by phenol extraction and ethanol precipitation and subjected to PCR as described above. Consistent with the fact that a DNA molecule cleaved between primers can no longer serve as template in PCR, amplicons from non recombined viral-based vectors (1105) are no longer detected after digestion of genomic DNA from infected cells with SwaI (lane 1103). However, amplicons from recombined viral-based vectors (1106) are still readily detected after SwaI digestion of genomic DNA extracted from Hela/oFlp6-2 cells infected with viral-based expression vectors (lane 1103).
  • Shorter amplicons from recombined viral-based expression vectors are readily detected when the infected cell population is composed of 15,000 Hela/oFlp6-2 cells mixed with 135,000 WT cells (lane 1111). No amplicon is detected when the infected cell population consists of 150 Hela/oFlp6-2 cells mixed with 149,850 WT cells (lane 1113).
  • PCR was performed on DNA that had not been digested by SwaI. As expected, longer fragments derived from non recombined expression vectors are easily detected in both cases (lanes 1110 and 1112).
  • Sindbis virus genome is a positive-strand RNA molecule.
  • Flp acts on DNA substrate
  • CMV-based vector expressing a cDNA derived from the Sindbis virus genome as follows.
  • a 4415 bp BamHI-XhoI fragment from DH-BB comprising the Sindbis virus structural proteins coding sequence (SP) was subcloned into a 9241 bp BamHI-XhoI fragment of plasmid pSinRep5 to generate VB220 (1200).
  • a 1230 bp fragment was amplified from pcDNA1.1/Amp by 25 cycles of PCR using high-fidelity Vent DNA polymerase and primers 18-100 (SEQ ID NO: 1200).
  • This fragment comprises the first nucleotide of the Sindbis virus genome positioned at the putative site of initiation of transcription of the cytomegalovirus immediate early enhancer/promoter elements (CMV).
  • CMV cytomegalovirus immediate early enhancer/promoter elements
  • the PCR product was digested by HincII and MunI (1202) and a 639 bp fragment was inserted into a 13529 bp fragment of VB220 resulting from digestion by SacI, blunting of the extremities by T4 DNA polymerase followed by partial MunI digestion (1203).
  • the resulting plasmid is called VB233b.
  • a 714 bp fragment comprising signals for transcription stop and polyadenylation of transcripts was obtained by digestion of pcDNA1.1/Amp with SphI and NcoI (1204). After blunting its extremities with T4 DNA polymerase, the fragment was cloned in the blunted XhoI site of VB233b to generate VB250b (1205).
  • a 449 bp fragment comprising a recombinase substrate was amplified from RC24 (see Example 2) using primers 18-112 (SEQ ID NO:21) and 20-22 (SEQ ID NO:27) and high fidelity Vent DNA polymerase (1206).
  • This fragment was cloned in a partially-digested VB250b at a MunI site located in the 5' untranslated region of the cDNA copy of the Sindbis viral genome (1207).
  • the resulting vector is VB271b.
  • a GFP coding sequence linked to promoter and enhancer sequences derived from the Rous sarcoma virus long terminal repeat (RSV LTR) was introduced between the structural proteins coding sequence and the 3' untranslated region of the viral genome. Sequences from RSV LTR were chosen because they are strongly active in a wide variety of cell types (Gorman et al., 1982). Cloning was performed as follows.
  • the GFP coding sequence was amplified (1208) from pQBIfc3TM (Quantum Biotechnologies, Montreal, Canada) and cloned (1209) into a 4005 bp EcoRV fragment of pRcRSV (Invitrogen, Carlsbad, Ca.) to generate VB288b.
  • a fragment comprising the RSV LTR and the GFP coding sequence was excised from VB288b (1210) and inserted at the blunted ApaI site of VB271b.
  • the resulting expression vector is RC77 (SEQ ID NO:3). A map of RC77 is given in Figure 12C and Table 3 hereinafter.
  • this vector could express an exogenous nucleic acid (i.e. GFP in this case)
  • 500 ng of RC77 was transfected in HEK293A cells by lipofection using the Effectene™ reagent (Qiagen Inc.). Forty-eight hours after transfection, cells were fixed with 4% paraformaldehyde and observed by fluorescence microscopy. As shown in Figure 13A, expression of GFP is detectable in approximately 2-5% of cells (image 1301). This result indicates that the transcription unit embedded in the cDNA copy of a disrupted Sindbis virus genome is active.
  • RC77 could lead to production of viral particles when recombined
  • 350 ng of expression vector RC77 was transfected in HEK293A cells by lipofection with either 650 ng of a vector expressing an optimized Flp coding sequence (RC59) or 650 ng of a control vector expressing luciferase (VB35). Forty-eight hours after transfection, cells were fixed with 4% (v/v) paraformaldehyde and processed for anti-C protein immunofluorescence. Only cells co-transfected with RC77 and RC59 showed expression of this viral protein (Figure 13A, image 1304), presumably due to recombination of the expression vector, excision of the disruptive recombinase substrate and subsequent production of viral particles.
  • an additional 108 bp fragment is specifically amplified from cells co-transfected with RC77 and RC59 (Figure 13B, arrow 1316).
  • the difference in the size of the amplicon is due to excision of the disruptive recombinase substrate.
  • Example 5 Screening in a transgenic animal using virus-based expression vector.
  • the gene delivery vector that is used is an adenovirus particle containing a genome engineered as described in Example 3.
  • a library of adenoviral particles is constructed as outlined in section iv) starting ⁇ 8
  • the screening host is a transgenic mouse obtained as follows.
  • the oFlp coding sequence and the 3' untranslated region of the bovine growth hormone cDNA are operatively linked to a fragment encompassing 1.7 kb found immediately upstream of the site of transcription initiation of the mouse osteocalcin gene.
  • Osteocalcin is a well-known marker of osteoblast differentiation whose expression is controlled by this 1.7 kb cell-specific regulatory fragment.
  • the resulting construct is injected into a pseudo-fertilized egg to obtain lines of transgenic mice according to standard protocols.
  • RNA extracted from various tissues of heterozygous animals is tested by Northern analysis to ensure that oFlp expression is restricted to the differentiated osteoblast, as is the endogenous osteocalcin gene.
  • Transgenic animals are then injected intraperitoneally with 10^8 plaque-forming units of the adenovirus-based expression vector library.
  • animals are sacrificed and recombined expression vectors are detected, if any, as in Example 3, from muscle and adipose tissue. These tissues are selected because it is believed that they harbor cells that can differentiate into osteoblasts given the proper stimulus.
  • exogenous nucleic acids comprised in recombined expression vectors so produced in the mouse have the capacity to activate oFlp transcription from the osteocalcin regulatory fragment and may therefore be hypothesized to be dominant activators of osteoblast differentiation.
  • Example 6 In vivo screening for regulatory elements. This hypothetical example illustrates the design of a screening conducted in a mouse to identify fragments of mouse genomic DNA that can confer tissue-specific expression to the linked oFlp coding sequence.
  • the expression vector that is used comprises a transcription unit devoid of regulatory elements, as depicted on Figure 2B and a recombinase substrate constructed as depicted on Figure 3A. Fragments of mouse genomic DNA are obtained by partial digestion with Sau3A and cloned in the vector upstream of the oFlp coding sequence.
  • the library of vectors is transfected in vivo by injection of lipid:DNA complexes formed using commercially available transfection reagents. At various times after transfection in vivo, animals are sacrificed, extrachromosomal DNA is extracted from various tissues and digested with the enzyme cutting in the stuffer region of the recombinase substrate.
  • Exogenous nucleic acids from uncleaved recombined vectors are amplified by PCR using a primer located upstream of their site of insertion and another downstream of the recombinase substrate. Exogenous nucleic acids are reinserted in the vector and subjected to another round of in vivo screening. Exogenous nucleic acids that are finally retrieved correspond to genomic DNA fragments capable of activating the expression of oFlp in the tissue from which it was retrieved. By comparing sequences of fragments retrieved from different tissues, it is therefore possible to identify genomic fragments whose transcriptional activity is tissue-restricted.
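The comparison of fragments retrieved from different tissues can be pictured as simple set arithmetic. The following Python sketch is illustrative only: the tissue names, the fragment identifiers and the helper `tissue_restricted` are hypothetical and are not part of the described method.

```python
from collections import defaultdict

def tissue_restricted(retrieved):
    """Map each tissue to the fragments retrieved from that tissue only.

    `retrieved` maps a tissue name to the set of fragment identifiers
    recovered from recombined vectors in that tissue (hypothetical data).
    """
    tissues_by_fragment = defaultdict(set)
    for tissue, fragments in retrieved.items():
        for fragment in fragments:
            tissues_by_fragment[fragment].add(tissue)

    restricted = {tissue: set() for tissue in retrieved}
    for fragment, tissues in tissues_by_fragment.items():
        if len(tissues) == 1:  # seen in exactly one tissue
            (only_tissue,) = tissues
            restricted[only_tissue].add(fragment)
    return restricted

# Hypothetical outcome of a final screening round.
example = {
    "liver": {"frag_A", "frag_B"},
    "muscle": {"frag_B", "frag_C"},
    "bone": {"frag_B"},
}
result = tissue_restricted(example)
```

In this toy data set, a fragment recovered from every tissue (`frag_B`) would be read as broadly active, whereas fragments recovered from a single tissue are candidates for tissue-restricted activity.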

Abstract

Methods, vectors, cell lines and kits for the screening and identification of nucleic acids are described. The invention is based on the use of a site-specific recombinase to excise, from an expression vector into which an exogenous nucleic acid having a desired feature has been inserted, a region of the vector which is excisable by site-specific recombination. Insertion into the vector of a nucleic acid having a desired feature, such as nucleic acids capable of changing the expression of cellular genes or the state of cellular metabolism or signaling pathways, triggers the synthesis and/or activity of a site specific recombinase, the action of the recombinase allowing an easy selection of the expression vector containing the exogenous nucleic acid having a desired feature.

Description

METHODS, VECTORS, CELL LINES AND KITS FOR SELECTING NUCLEIC ACIDS HAVING A DESIRED FEATURE
RELATED APPLICATION
This application claims priority of United States Provisional Application 60/301,149 filed June 28, 2001, the disclosure of which is incorporated by reference herein in its entirety.
BACKGROUND OF THE INVENTION
a) Field of the invention
The present invention relates to screening of nucleic acids. More particularly, the present invention is concerned with the identification of nucleic acids having a desired feature, such as nucleic acids encoding signaling molecules, transcription factors or other proteins involved in changes of cell metabolism.
b) Brief description of the prior art
Large-scale gene sequencing projects are currently generating huge amounts of genetic information. In silico analysis and transcriptional profiling are important tools to help decipher this information. Ultimately however, the function encoded by novel sequences will have to be determined within the proper biological context. A powerful approach to do so is the so-called expression screening strategy, i.e. transfection of a library of vectors harboring different nucleic acids ("expression vectors") into host cells and selective retrieval of those nucleic acids encoding specific function.
Cell-based screening technology can be viewed as a tool that sends out a "signal" if, and only if, a particular "target" nucleic acid possessing the activity being screened for has been incorporated into a cell. This technology is based on a reporter system that is kept inactive (no signal) in the absence of the target gene. Typically, reporter systems are based on the conditional expression of marker proteins. Appearance of this marker in a cell transfected with an expression vector containing a nucleic acid having a desired feature allows the selection of this cell from the rest of the transfected cell population. When performed in vertebrate cells, the need to select cells greatly limits the throughput of the method since techniques to do so are complex, cumbersome or lengthy. In some cases for example, selection is achieved with fluorescent markers coupled to very sophisticated sorting equipment. In others, cells having the desired phenotype are selected either after limiting dilution or by clonal growth followed by colony picking, long and laborious techniques. Furthermore, the cell selection step may increase the occurrence of false-positives. In fact, the cell selection step is merely a pre-requisite to recover the expression vector whose transfection has triggered the appearance of a desired phenotype and to identify the nucleic acid contained therein.
Site-specific recombinases such as Cre recombinase from bacteriophage P1 and Flp from S. cerevisiae have previously been used for recombining exogenous molecules in a heterologous system (Sauer and Henderson, 1988; O'Gorman et al., 1991). The Cre and Flp recombinases bind well-characterized DNA elements ("recombination target sequences") and mediate excision or inversion of the intervening sequence, depending on the orientation of the recombination target sequences relative to one another (Sauer, 1994). The Cre recombinase is extensively used to create so-called conditional mouse knock-out mutants, i.e. to control the spatial and temporal inactivation of engineered genetic loci in vivo. One way to achieve this spatial and temporal selectivity is to control the transcription of Cre via cis regulatory elements having the desired properties (Metzger and Feil, 1999). An alternative is to exploit the fact that the recombinase must be localized in the nucleus to carry out recombination of target sequences. In this technique, the recombinase is fused to a ligand binding domain of a nuclear receptor (e.g. estrogen receptor), such that nuclear localization of the fusion protein is dependent on the presence of the nuclear receptor ligand molecule (e.g. estradiol in the case of the estrogen receptor) (Logie and Stewart, 1995; Angrand et al., 1998). Although the prior art describes some methods to render the activity of a site-specific recombinase dependent on the occurrence of specific cellular events, the use of recombinases for screening nucleic acids has never been suggested. Similarly, no one has ever produced expression vectors comprising a nucleic acid sequence excisable by site-specific recombinase for the screening of exogenous nucleic acids in eukaryotic cells.
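The dependence of the outcome on the relative orientation of the recombination target sequences can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not a description of the enzymology: DNA is a list of labelled segments, the two target sequences are written `>` or `<` according to their orientation, and the function name `recombine` is hypothetical. Inversion is modelled by reversing segment order only (strand complementarity is ignored), and the target sequence carried by the excised circle is not represented.

```python
def recombine(dna):
    """Recombine a toy DNA molecule carrying exactly two target sequences.

    Returns (product, excised): `excised` is the intervening sequence when
    the targets are in the same orientation, or None after an inversion.
    """
    sites = [i for i, token in enumerate(dna) if token in (">", "<")]
    if len(sites) != 2:
        raise ValueError("expected exactly two recombination target sequences")
    i, j = sites
    between = dna[i + 1:j]
    if dna[i] == dna[j]:
        # Same orientation: the intervening sequence is excised and a
        # single target sequence remains in the product.
        return dna[:i + 1] + dna[j + 1:], between
    # Opposite orientation: the intervening sequence is inverted in place.
    return dna[:i + 1] + between[::-1] + dna[j:], None

excision = recombine(["promoter", ">", "stuffer", ">", "gene"])
inversion = recombine(["promoter", ">", "a", "b", "<", "gene"])
```

The excision case is the one exploited throughout this document: a stuffer flanked by two target sequences in the same orientation is lost from the vector once the recombinase is active.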
In view of the above, it is clear that there is a need for a nucleic acid expression screening method that bypasses the time-consuming cell selection step and that does not require marking of proteins. There is more particularly a need for new methods taking advantage of the recombinase activity for the selective retrieval of nucleic acids encoding a desired function.
The present invention aims to overcome the limits and obviate the problems known in the art for screening genes and nucleic acids by providing a molecule, a method and a kit for rapidly and efficiently identifying and retrieving nucleic acids encoding a desired function in eukaryotic cells. The purpose of the invention is also to fulfill other needs that will be apparent to those skilled in the art upon reading the following specification.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide methods, vectors, cell lines and kits for identifying and/or selecting a nucleic acid having a desired feature. A non limitative list of nucleic acids having a desired feature includes nucleic acids having transcriptional activity, nucleic acids encoding proteins involved in signal transduction pathways, and nucleic acids encoding proteins involved in cell metabolism or differentiation state.
According to a first aspect, the invention relates to expression vectors or viral-based expression vectors containing one or more recombination target sequences, these vectors being modified after the action of a site-specific recombinase in such a way as to be differentiable from non recombined vectors. The invention also relates to libraries of expression vectors or viral-based expression vectors containing exogenous nucleic acids. Most preferred vectors according to the invention are those whose sequences are set forth in SEQ ID NOS: 1, 2 and 3. According to a preferred embodiment, the vector is useful for expressing an exogenous nucleic acid in eukaryotic cells and it comprises a nucleic acid sequence excisable by site-specific recombination.
According to another preferred embodiment, the vector is an expression vector whose nucleic acid sequence comprises a recombinase substrate and a transcription unit.
According to a further preferred embodiment, the vector comprises i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a recombinase encoded by the site-specific recombinase coding sequence.
Preferably, the recombinase substrate in the vectors of the invention comprises a stuffer region flanked by recombination target sequences. Preferably, the stuffer region is removable by site-specific recombination. More preferably, the stuffer region comprises a restriction site. Preferably also, the vector's transcription unit comprises an enhancer sequence, a promoter sequence and a termination sequence operatively linked together.
In one embodiment, the nucleic acid sequence of the vector comprises at least two fragments of a viral genome for packaging the vector or a fragment thereof into infectious viral particles. These fragments may derive from a retrovirus or an adenovirus. The recombinase substrate and the transcription unit are preferably located between these two viral fragments.
Preferably also, the nucleic acid sequence of the vector comprises a nucleic acid sequence encoding an inactive gene conferring resistance to an antibiotic in bacteria. According to the invention, the activity of the inactive gene is restorable by site-specific recombination of the vector nucleic acid sequence.
According to another preferred embodiment, the vector nucleic acid sequence comprises a recombinase substrate and a transcription unit incorporated into a viral genome. According to the invention, the recombinase substrate comprises a stuffer region advantageously excisable by site-specific recombination, and formation of viral particles is dependent upon excision of the stuffer region. Preferably, the viral genome consists of a cDNA copy of an alphaviral genome, and presence of the stuffer region blocks translation of viral proteins encoded by the cDNA copy of the alphaviral genome. The cDNA copy of the alphaviral genome may derive from Sindbis virus genome or from Semliki Forest virus genome. More preferably, the recombinase substrate is present in a 5' untranslated region of the cDNA copy of the alphaviral genome.
According to a second aspect, the invention relates to modified cell lines and transgenic animals having incorporated in their genome a DNA segment comprising a site-specific recombinase operatively linked to regulatory elements. A related aspect concerns the use of such cell line(s) and animal(s) for screening, among libraries of expression vectors, those vectors containing a nucleic acid that activates, directly or indirectly, transcription of regulatory element(s).
According to a preferred embodiment, there is provided a eukaryotic cell line which comprises an expressible site-specific recombinase coding sequence. This expressible site-specific recombinase coding sequence is operatively linked to a minimal promoter and to at least one cis-acting regulatory element. In one embodiment, the site-specific recombinase is expressed upon activation of the at least one cis-acting regulatory element. The cis-acting regulatory element(s) may be activated by elevation of intracellular cAMP or cGMP levels, elevation of intracellular calcium concentration, and/or change in the phosphorylation state of specific proteins (e.g. mitogen-activated protein kinase (MAPK), c-jun N-terminal protein kinase (JNK) and phosphatidyl inositol-3 kinase (PI-3 kinase)). The cis-acting regulatory element(s) may also be activated during differentiation of mesenchymal stem cells into bone, cartilage, adipocytes or myoblasts. More preferably, the site-specific recombinase coding sequence is optimized for enhanced synthesis, stability or translation in eukaryotic cells. The expressible site-specific recombinase coding sequence may be chosen from Flp coding sequence from Saccharomyces cerevisiae, Cre coding sequence from bacteriophage P1 and β-recombinase coding sequence from Bacillus subtilis. Preferred Flp coding sequences are those comprising SEQ ID NO:4, SEQ ID NO:6, or a functional homologue thereof, particularly those homologues coding for proteins having substantially the same biological activity as SEQ ID NO:5 or SEQ ID NO:7.
In a related aspect, the invention relates to nucleic acids and amino acid sequences comprising an optimized Flp recombinase. Preferred optimized Flp coding sequences are those coding for the amino acid sequence set forth in SEQ ID NO:7 (including SEQ ID NO:6) and functional homologues thereof.
According to another aspect, the invention relates to a method for identifying nucleic acids encoding a desired feature from a library of exogenous nucleic acids. In a preferred embodiment, a plurality of nucleic acids from the library are inserted into a plurality of expression vectors comprising a nucleic acid sequence excisable by site-specific recombination. Preferably, the vectors are then inserted into a eukaryotic cell line (as defined previously) or into a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable. This site-specific recombinase may be inactive for instance due to a lack of sufficient expression or due to sequestration outside of the cell nucleus. In one embodiment, the activity of the inactive site-specific recombinase is restored upon expression by said vector of an exogenous nucleic acid having the desired feature. Thereafter, the active site-specific recombinase preferably excises a fragment from the expression vectors, thereby forming recombined expression vectors that comprise a nucleic acid having the desired feature and that can be differentiated from unrecombined expression vectors.
In a more specific embodiment, the method of the invention is used for screening exogenous nucleic acids having a desired feature within eukaryotic cells. The method comprises the steps of: a) providing a plurality of expression vectors each capable, when present into a suitable host, of expressing an exogenous nucleic acid inserted therein, these vectors comprising a nucleic acid sequence excisable by site-specific recombination; b) providing a cell line or a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable; c) inserting at least one exogenous nucleic acid from a library of nucleic acids into a plurality of the expression vectors, in order to provide a library of recombinant expression vectors; d) introducing, into cells of the cell line or of the transgenic animal of step (b), a plurality of recombinant expression vectors from the library obtained at step (c); e) allowing the recombinant expression vectors introduced at step (d) to express the exogenous nucleic acid inserted therein, wherein only exogenous nucleic acids encoding the desired feature are capable of restoring the activity of the site-specific recombinase of step (b); f) allowing the site-specific recombinase whose activity has been restored in step (e) to excise the excisable nucleic acid sequence from recombinant expression vectors which have expressed an exogenous nucleic acid having restored the activity of the site-specific recombinase; g) recovering recombinant expression vectors from cells of the cell line or transgenic animal; and h) selecting recombined expression vectors having undergone site-specific recombination at step (f), said recombined vectors containing an exogenous nucleic acid encoding the desired feature.
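The logic of steps (c) to (h) can be summarized in a few lines of code. This Python sketch is a conceptual illustration only, under strong simplifying assumptions: each vector stands for one transfected cell, recombination is treated as all-or-nothing, and the names `Vector`, `screen` and `has_desired_feature` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Vector:
    insert: str               # exogenous nucleic acid carried by the vector
    has_stuffer: bool = True  # excisable sequence still present

def screen(library, has_desired_feature):
    """Return the inserts found in recombined vectors after one screening round."""
    vectors = [Vector(insert) for insert in library]   # step (c)
    for vector in vectors:                             # steps (d) and (e)
        recombinase_active = has_desired_feature(vector.insert)
        if recombinase_active:                         # step (f)
            vector.has_stuffer = False
    recovered = vectors                                # step (g)
    # Step (h): keep only vectors that lost the excisable sequence.
    return [v.insert for v in recovered if not v.has_stuffer]

# Hypothetical library in which a single cDNA restores recombinase activity.
hits = screen(["cDNA_1", "cDNA_2", "cDNA_3"], lambda insert: insert == "cDNA_2")
```

The point of the design is visible in step (h): selection operates on the recovered vectors themselves, so no cell has to be sorted or cloned.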
In an even more specific embodiment, the method of the invention is used for screening exogenous nucleic acids having a transcriptional activity within eukaryotic cells (e.g. regulatory elements such as enhancer, promoter, etc). The method comprises the steps of: a) providing a vector comprising: i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a site-specific recombinase encoded by the site-specific recombinase coding sequence; b) inserting into a plurality of vectors as defined at step (a) at least one exogenous nucleic acid taken from a library of exogenous nucleic acids in order to provide a library of recombinant vectors; c) inserting a plurality of recombinant vectors from the library obtained at step (b) into a suitable eukaryotic host; d) allowing the exogenous nucleic acid inserted at step (b) to activate transcription of the site-specific recombinase coding sequence which is comprised in the vector, thereby producing the site-specific recombinase; e) allowing the site-specific recombinase so produced to excise the recombinase substrate in the recombinant vector harboring the exogenous nucleic acid having activated the transcription of the site-specific recombinase; f) following step (e), recovering a plurality of recombinant vectors from the eukaryotic host; and g) selecting recombinant vectors having undergone site-specific recombination, most of these vectors containing an exogenous nucleic acid having transcriptional activity.
It is further an object of the invention to provide methods for specifically recovering recombined vectors after screening of libraries of expression vectors or viral-based expression vectors.
According to one embodiment, the nucleic acid sequence of the vector comprises an inactive gene conferring resistance to an antibiotic in bacteria, the activity of the inactive gene being restored by site-specific recombination of said nucleic acid sequence. Therefore, recombined vectors may be isolated from unrecombined vectors by: i) extracting DNA from cells into which the expression vectors have been introduced; ii) transforming bacteria with DNA extracted at step (i); iii) growing bacteria transformed at step (ii) in presence of the antibiotic; and iv) selecting bacterial colonies resistant to the antibiotic.
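Steps (i) to (iv) amount to a filter in which only vectors carrying a restored resistance gene give rise to colonies. The Python sketch below is a toy model under stated assumptions: a resistance gene is a list of labelled parts, the label `stuffer` marks the disruptive recombinase substrate, and the helper names and part labels are hypothetical.

```python
def resistance_gene_active(vector):
    """The resistance gene is functional only once the stuffer has been excised."""
    return "stuffer" not in vector["resistance_gene"]

def resistant_colonies(extracted_vectors):
    # Steps (ii) to (iv): transform bacteria, grow them in presence of the
    # antibiotic, and keep only the colonies that survive.
    return [v["insert"] for v in extracted_vectors if resistance_gene_active(v)]

# Step (i): DNA extracted from the screened cells (hypothetical content).
extracted = [
    {"insert": "cDNA_1", "resistance_gene": ["kan_5prime", "stuffer", "kan_3prime"]},
    {"insert": "cDNA_2", "resistance_gene": ["kan_5prime", "kan_3prime"]},
]
survivors = resistant_colonies(extracted)
```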
The resistant bacterial colonies comprise expression vectors having undergone site-specific recombination. This method may further comprise the steps of: v) extracting expression vectors from colonies selected at step (iv); and vi) identifying an exogenous nucleic acid found in said extracted vectors.
According to another embodiment, the nucleic acid sequence of the vector comprises a recombinase substrate having a stuffer region flanked by recombination target sequences. The stuffer region also comprises a cleavable restriction site. Therefore, recombined vectors may be isolated from unrecombined vectors by: i) extracting DNA from cells into which the expression vectors have been introduced; and ii) contacting DNA extracted at step (i) with a restriction enzyme recognizing said cleavable restriction site; iii) optionally degrading DNA fragments cleaved by the restriction enzyme with an exonuclease; and iv) optionally amplifying a DNA fragment from the expression vectors, the fragment comprising the exogenous nucleic acid.
Accordingly, recombined expression vectors are not cleaved by the restriction enzyme, but unrecombined expression vectors are cleaved by the restriction enzyme.
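The restriction-based selection can be sketched as a string test: a template that still carries the restriction site in its stuffer is cut between the primers and cannot be amplified. In the Python sketch below, `ATTTAAAT` is the recognition sequence of SwaI, the enzyme used in the examples; everything else (the toy flanks, the placeholder `TARGET` that merely stands in for a recombination target sequence, and the function names) is hypothetical.

```python
SWAI_SITE = "ATTTAAAT"  # SwaI recognition sequence
TARGET = "GAAGTTCC"     # placeholder for a recombination target sequence

def amplifiable_after_digest(template, site=SWAI_SITE):
    """A template cleaved between the primers can no longer serve in PCR."""
    return site not in template

def select_recombined(templates):
    """Names of the templates that survive digestion and can be amplified."""
    return [name for name, seq in templates.items() if amplifiable_after_digest(seq)]

# Toy amplicon regions: the stuffer (with its restriction site) is present
# before recombination and absent afterwards.
unrecombined = "CCGG" + TARGET + "CAC" + SWAI_SITE + "GTG" + TARGET + "GGCC"
recombined = "CCGG" + TARGET + "GGCC"

selected = select_recombined({"unrecombined": unrecombined, "recombined": recombined})
```

The optional exonuclease step would further reduce background by degrading the cleaved, linear molecules before amplification; it is not modelled here.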
In another aspect, the invention concerns a screening kit comprising 1) a vector as defined herein; and/or 2) a cell line as defined herein; and at least one further element selected from the group consisting of instructions for using the kit, reaction buffer(s), enzyme(s), probe(s) and pool(s) of nucleotide molecules to be screened.
An advantage of the present invention is that it obviates the expensive and time-consuming task of selecting cells that express a gene of interest. The invention is also much more rapid, efficient and accurate for selecting a particular nucleic acid having a desired feature, characteristic or function. The invention can also selectively retrieve, from a library of nucleic acids, a nucleic acid having a desired feature, such as a nucleic acid encoding a signaling molecule, a transcription factor or a protein involved somehow in promoting changes in cell metabolism or differentiation state (for instance a kinase, a phosphatase, or a transcription factor).
Other objects and advantages of the present invention will be apparent upon reading the following non-restrictive description of several preferred embodiments, made with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schema illustrating how site-specific recombination can be used to screen for nucleic acids encoding a specific biological function.
Figures 2A and 2B are schemas showing preferred embodiments of a transcription unit of an expression vector according to the invention.
Figures 3A, 3B, and 3C are schemas showing preferred embodiments of a recombinase substrate of an expression vector according to the invention, and also preferred methods for specifically retrieving whole or parts of recombined expression vectors or viral-based expression vectors according to the invention.
Figure 4A shows an alignment of the first 116 codons and corresponding amino acids of wild type Flp recombinase (Flp; SEQ ID NOS: 4 and 5) and of an optimized recombinase coding sequence (oFlp; SEQ ID NOS: 6 and 7) according to the present invention. Amino acid substitutions to enhance thermostability are shown in bold. Putative internal polyadenylation signal is underlined.
Figure 4B is a picture of a Northern analysis comparing the expression of wild type Flp (411) and optimized recombinase (oFlp, 412) after transfection of appropriate constructs in HEK293 cells. Flp signal (arrowhead) is not detected in mock-transfected cells (410).
Figure 4C is a picture of a Western analysis comparing the amount of Flp protein produced after transfection of HEK293 cells with vectors expressing either the wild type coding sequence (421) or optimized coding sequence (422) according to the invention. Flp signal (arrowhead) is not detected in mock-transfected cells (420).
Figures 5A, 5B and 5C schematize construction of an expression vector (RC43) according to a preferred embodiment of the invention, the expression vector comprising a transcription unit and a recombinase substrate disrupting a gene conferring resistance to kanamycin. Nucleic acid sequence of RC43 is set forth in SEQ ID NO:1.
Figure 6 schematizes the construction of a plasmid containing cis-acting regulatory elements operatively linked to an optimized recombinase coding sequence (oFlp).
Figure 7 is a picture showing the results of a Northern analysis performed to detect expression of oFlp mRNA in subclones of HEK293 cells obtained after stable transfection of a plasmid comprising a coding sequence for oFlp operatively linked to regulatory elements activated by the Gal4VP16 protein (RE-oFlp). Lane 701, wild type HEK293 cells; lane 702, subclone 6 transfected with a control vector expressing green fluorescent protein; lane 703, subclone 6 transfected with an expression vector for Gal4VP16; lane 704, subclone 10 transfected with a control vector expressing green fluorescent protein; lane 705, subclone 10 transfected with an expression vector for Gal4VP16. oFlp signal is indicated by an arrowhead. The signal indicated by an asterisk is an artefact arising from transfection of the control vector.
Figure 8A schematizes fragments from non recombined (801) and recombined (802) expression vectors. Arrows indicate the approximate positions of primers used in the PCR analysis presented on figure 8B.
Figure 8B is a picture showing results of a PCR analysis performed to selectively amplify fragments from recombined expression vectors according to a preferred embodiment of the invention. Expression vectors were recovered from HEK293/RE-oFlp subclone 6 transfected with a vector expressing green fluorescent protein (lane 813), Gal4VP16 (lane 814) or from wild type HEK293 cells transfected with a vector expressing Gal4VP16 (lane 815). DNA was subjected to PCR after digestion with Swal. A control fragment was amplified from a non recombined vector expressing Gal4VP16 (lane 811). Figures 9A and 9B schematizes the construction of an expression vector (plasmid RC49-2) according to a preferred embodiment of the invention. The plasmid may generate an adenovirus-based expression vector, and comprises a transcription unit and a recombinase substrate with a restriction site. The approximate position of primers 18-64V and 18-106V used in subsequent PCR is indicated. Nucleic acid sequence of RC49-2 is set forth in SEQ ID NO:2.
Figure 10 is a picture of a Northern analysis showing the expression of optimized Flp mRNA in distinct subclones of Hela cells obtained after stable transfection of a vector comprising an optimized Flp coding sequence linked operatively to a cytomegalovirus enhancer and promoter according to a preferred embodiment of the invention. Lane 1001, wild type Hela cells; lane 1002, subclone Hela/oFlp2-3; lane 1003, subclone Hela/oFlp3-2; lane 1004, subclone Hela/oFlp6-2; lane 1005, subclone Hela/oFlp6-3.
Figure 11A is a picture showing results of a PCR analysis performed to determine the amount of recombined adenovirus-based expression vector according to a preferred embodiment of the invention using DNA extracted from wild type Hela cells (1101); infected Hela/oFlp6-2 cells (1102); infected Hela/oFlp6-2 cells and digested by SwaI (1103); infected Hela/oFlp2-3 cells (1104). Lane 1100 shows the migration of a molecular marker. Fragments amplified from non recombined adenovirus-based expression vector (1105). Fragments amplified from recombined adenovirus-based expression vector (1106).
Figure 11B is a picture showing results of a PCR analysis performed to detect fragments of recombined adenovirus-based expression vectors according to a preferred embodiment of the invention after infection of populations of Hela cells containing 10% of Hela/oFlp6-2 cells (lanes 1110, 1111) or 0.1% of Hela/oFlp6-2 cells (lanes 1112, 1113). DNA extracted from infected cells was subjected to PCR after digestion with SwaI (lanes 1111, 1113).
Figure 11C is a picture showing results of a semi-nested PCR analysis performed on amplicons obtained from SwaI-digested DNA extracted from populations of Hela cells containing 10% of Hela/oFlp6-2 cells (lane 1120) or 0.1% of Hela/oFlp6-2 cells (lane 1121). Lane 1122 shows the migration of a molecular marker. Fragments amplified from non recombined adenovirus-based expression vector (1123). Fragments amplified from recombined adenovirus-based expression vector (1124).
Figures 12A and 12B schematize the construction of an expression vector (RC77) according to a preferred embodiment of the invention, the expression vector comprising a transcription unit embedded in a viral genome whose translation is disrupted by a recombinase substrate. Nucleic acid sequence of RC77 is set forth in SEQ ID NO:3.
Figure 13 shows the properties of an expression vector containing a transcription unit embedded in a cDNA copy of the Sindbis virus genome whose translation is disrupted by a recombinase substrate according to a preferred embodiment of the invention. Figure 13A. Image 1301, expression of GFP inserted in such an expression vector (RC77). Image 1302, immunofluorescence against the C viral protein in HEK293A cells transfected with RC77 and a control expression vector (VB35). Image 1303, immunofluorescence against the C viral protein in BHK-21 cells infected with culture medium from HEK293A cells transfected with RC77 and VB35. Image 1304, immunofluorescence against the C viral protein in HEK293A cells transfected with RC77 and a vector expressing oFlp (RC59). Image 1305, immunofluorescence against the C viral protein in BHK-21 cells infected with culture medium from HEK293A cells transfected with RC77 and RC59. Figure 13B is a picture showing results of an RT-PCR analysis performed to detect fragments of engineered Sindbis virus genomes after co-transfection in HEK293A cells of RC77 with either VB35 (lane 1313) or RC59 (lane 1314). Fragment amplified from RC77 plasmid DNA (lane 1312). Control reaction on RNA extracted from untransfected cells (lane 1311). Lane 1310 shows migration of a 100 bp ladder.
Similar reference numerals are used in different figures to denote similar components.
DETAILED DESCRIPTION OF THE INVENTION
A) Definitions
Throughout the text, the word "kilobase" is generally abbreviated as "kb", the words "deoxyribonucleic acid" as "DNA", the words "ribonucleic acid" as "RNA", the words "complementary DNA" as "cDNA", the words "polymerase chain reaction" as "PCR", and the words "reverse transcription" as "RT". Nucleotide sequences are written in the 5' to 3' orientation unless stated otherwise.
In order to provide an even clearer and more consistent understanding of the specification and the claims, including the scope given herein to such terms, the following definitions are provided:
Desired feature: Refers to a nucleic acid encoding a peptide or a protein having a desired property or function. Non-limiting examples of nucleic acids having a "desired property" or "desired function" include nucleic acids encoding a specific signal transduction activity (e.g. a kinase or a phosphatase), a specific gene regulation activity (e.g. a transcription factor), or a specific cellular function (e.g. a protein promoting changes in cell metabolism or differentiation state).
Exogenous nucleic acid: A nucleic acid (such as cDNA, cDNA fragments, genomic DNA fragments, antisense RNA, oligonucleotide) which is not naturally part of another nucleic acid molecule. The "exogenous nucleic acid" may be from any organism or purely synthetic.
Expression: The process whereby an exogenous nucleic acid is transcribed. In the case of cDNAs, cDNA fragments, genomic DNA fragments and oligonucleotides, the transcribed exogenous nucleic acid can be subsequently translated into a peptide or a protein in order to carry out its function if any.
Expression vector: A vector capable of mediating the expression of an exogenous nucleic acid once introduced into a host. Preferably, expression vectors according to the present invention are capable of expressing an exogenous nucleic acid inserted therein in eukaryotic cells and comprise a recombinase substrate and a transcription unit. In addition, the expression vectors of the invention preferably contain a signal for the termination of transcription and the polyadenylation of transcripts generated from enhancer and promoter sequences (see transcription unit definition). The expression vectors also preferably comprise unique restriction sites between the promoter sequences and the termination sequence for inserting the exogenous nucleic acid to be expressed.
Functional homologue: As is generally understood and used herein, refers to a non native polypeptide or nucleic acid molecule that possesses a functional biological activity that is substantially similar to the biological activity of a native polypeptide or a nucleic acid molecule. A functional homologue typically refers to a polypeptide or a nucleic acid molecule having at least 50%, more preferably at least 55%, even more preferably at least 60%, still more preferably at least 65-70%, and yet even more preferably greater than 85%, 90%, 95% or
' 95% similarity or identity at the level of nucleotide or amino acid sequence to at least one or more regions of a given nucleotide or amino acid sequence. The functional homologue may exist naturally or may be obtained following a single or multiple amino acid substitutions, deletions and/or additions relative to the naturally occurring enzyme(s) using methods and principles well known in the art. A functional homologue of a protein may or may not contain post-translational modifications such as covalently linked carbohydrate, if such modification is not necessary for the performance of a specific function. It should be noted, however, that nucleotide or amino acid sequences may have similarities below the above given percentages and still encode a proteinic molecule having a desired activity, and such proteinic molecules may still be considered within the scope of the present invention where they have regions of sequence conservation. The term "functional homologue" is intended to include the "fragments", "segments", "variants", "analogs" or "chemical derivatives" of a polypeptide or a nucleic acid molecule.
Fragment: Refers to a section of a molecule, such as protein/polypeptide or nucleic acid, and is meant to refer to any portion of the amino acid or nucleotide sequence.
Host: A cell, tissue, organ or organism capable of providing cellular components for allowing expression of an exogenous nucleic acid inserted into an expression vector. This term is intended to also include hosts which have been modified in order to accomplish these functions. Bacteria, fungi, animals (cells, tissues, or organisms) and plants (cells, tissues, or organisms) are examples of a host. Preferred hosts according to the present invention are eukaryotic cells and animals.
Insertion: The process by which a nucleic acid is introduced into another nucleic acid. A typical example includes insertion of an exogenous nucleic acid into an expression vector to create a "recombinant" or "genetically modified" expression vector. Methods for inserting a nucleic acid into another normally require the use of restriction enzymes, and such methods of insertion are well known in the art.
Knock-in: Refers to the process by which a specific region of the genome of a host is replaced by an exogenous nucleic acid through a reaction involving homologous recombination. According to a preferred embodiment of the present invention, this process is used to replace the first coding exon of a host gene by the coding sequence of a site-specific recombinase.
Library: A collection or a pool of nucleic acid molecules. This includes genomic libraries, RNA libraries, cDNA libraries, expressed sequence tag libraries, artificial sequences libraries including randomized artificial sequence libraries.
Minimal promoter: A short DNA sequence harboring minimal requirements for initiating transcription of a genetic sequence. The minimal promoter is not sufficient to activate transcription of a linked gene. A sequence harboring a so called "TATA" box at about 30 nucleotides upstream of the site of initiation of transcription is an example of a minimal promoter.
Nucleic acid: Any DNA, RNA sequence or molecule having one nucleotide or more, including nucleotide sequences encoding a complete gene. The term is intended to encompass all nucleic acids whether occurring naturally or non-naturally in a particular cell, tissue or organism. This includes DNA and fragments thereof, RNA and fragments thereof, cDNAs and fragments thereof, expressed sequence tags, artificial sequences including randomized artificial sequences.
Optimized coding sequence: refers to a wild type nucleic acid sequence which has been modified to give higher levels of transcripts and/or products when expressed in a given host which is different from the host of the wild type nucleic acid. A typical example is the replacement of codons not efficiently translated in a given host by codons preferred in this host.
Recombinant: The term "recombinant" in association with "expression vector" refers to an expression vector which has been modified to contain a non-native exogenous nucleic acid.
Recombinase substrate: A nucleic acid molecule comprising a stuffer region flanked by recombination target sequences in direct or reverse orientation relative to one another. Typically the stuffer region is a nucleic acid sequence which is excisable by site-specific recombination.
Recombination target sequence: A short DNA segment acted upon by a site-specific recombinase. Generally, it is composed of two inverted sequences (such as SEQ ID NO:8) that are bound by a site-specific recombinase and that are separated by a spacer sequence of defined length. According to a preferred embodiment of the invention, an additional binding sequence is typically present at the 5' end of the recombination target sequence.
Regulatory element: Refers to a DNA sequence that can, under specific cellular conditions, mediate the activation or repression of the transcription of nucleic acid sequences that are operatively linked thereto. Typically, regulatory elements comprise one or more fragments of sequences naturally occurring in the enhancer or promoter regions of cellular genes. Purely synthetic regulatory elements can also be made by assembling one or more oligonucleotides corresponding to binding sites of specific transcription factors.
Site-specific recombinase: A protein capable of mediating site-specific recombination.
Site-specific recombination: The process by which a recombinase substrate is acted upon by a site-specific recombinase. Typically, this activity results in the excision of the stuffer region and of one recombination target sequence if the recombination target sequences are in the direct orientation relative to one another, or, if the recombination target sequences are in the reverse orientation relative to one another, in the inversion of the stuffer region.
Transcription unit: As used herein, refers to a region of a vector which comprises an enhancer sequence, a promoter sequence and a termination sequence, all operatively linked together. Preferably, the enhancer and promoter sequences are constitutively active and are operatively linked to an exogenous nucleic acid inserted into the vector. Enhancer and promoter sequences can be derived for example from the cytomegalovirus (CMV) immediate-early genes or from the Rous sarcoma virus (RSV) long terminal repeat.
Transfection: The process of introducing nucleic acids into eukaryotic cells by any means such as electroporation, lipofection, precipitate uptake or microinjection. A cell having incorporated an exogenous nucleic acid (e.g. an expression vector or a recombinant expression vector) is said to be transfected.
Vector: An RNA or DNA molecule which can be used to transfer an RNA or DNA segment from one organism to another.
Viral-based expression vector: Refers to an expression vector or parts thereof embedded in a viral genome that can be packaged into infectious viral particles. Typically, the parts of an expression vector embedded in a viral genome consist of the transcription unit and the recombinase substrate. Viral-based expression vectors provide a way to better control the delivery of exogenous nucleic acids to host cells via infectious viral particles.
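The two outcomes named in the definition of site-specific recombination (excision when the recombination target sequences are in direct orientation, inversion when they are in reverse orientation) can be illustrated with a minimal Python sketch acting on DNA strings. The target sequence, flanks and stuffer below are made up for illustration and are not the sequences of SEQ ID NO:8 or of any real recombination site; the function `recombine` is likewise hypothetical.

```python
# Toy model of site-specific recombination on a DNA string.
# TARGET is a made-up asymmetric sequence, NOT a real recombination target.
TARGET = "GAAGTTCCAAACTTGG"

def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def recombine(dna, target=TARGET):
    """Apply one recombination event between the first two target sites."""
    inverted = revcomp(target)
    first = dna.find(target)
    if first < 0:
        return dna  # no recombinase substrate present
    stuffer_start = first + len(target)
    direct = dna.find(target, stuffer_start)
    reverse = dna.find(inverted, stuffer_start)
    if direct >= 0 and (reverse < 0 or direct < reverse):
        # Direct orientation: the stuffer and one target sequence are excised.
        return dna[:stuffer_start] + dna[direct + len(target):]
    if reverse >= 0:
        # Reverse orientation: the stuffer is inverted, both targets remain.
        stuffer = dna[stuffer_start:reverse]
        return dna[:stuffer_start] + revcomp(stuffer) + dna[reverse:]
    return dna  # a single target sequence: nothing to recombine

FLANK5, STUFFER, FLANK3 = "ACGTAC", "CATCAT", "TGCATG"  # hypothetical
direct_substrate = FLANK5 + TARGET + STUFFER + TARGET + FLANK3
inverted_substrate = FLANK5 + TARGET + STUFFER + revcomp(TARGET) + FLANK3

excised = recombine(direct_substrate)      # stuffer and one target removed
flipped = recombine(inverted_substrate)    # stuffer reverse-complemented
```

In this sketch the excised product keeps exactly one copy of the target sequence, which is the property the embodiments described later rely on.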
B) General overview of the invention
The invention is based on the use of a site-specific recombinase to modify an expression vector containing an exogenous nucleic acid having a desired feature. As will be outlined in greater detail below, insertion of a nucleic acid having a desired feature, such as nucleic acids capable of changing the expression of cellular genes or the state of cellular metabolism or signaling pathways, triggers the synthesis and/or activity of a site-specific recombinase, the action of the recombinase allowing an easy selection of the expression vector containing the exogenous nucleic acid having a desired feature.
C) Methods for selecting a nucleic acid having a desired feature
According to a first aspect, the present invention relates to methods for screening and/or identifying exogenous nucleic acids having a desired feature within eukaryotic cells.
In its most basic version, the invention is used to screen for nucleic acids encoding a specific gene regulatory activity (e.g. a kinase, a phosphatase, or a transcription factor). Figure 1 depicts a preferred specific embodiment of a screening method according to the invention. As shown, a vector (101) which is capable, when present in a suitable host (102), of expressing an exogenous nucleic acid inserted therein (103), is provided. The vector comprises a nucleic acid sequence (104) which is excisable by site-specific recombination.
A cell line or a transgenic animal is also provided. The cell line or transgenic animal comprises a nucleic acid minimally encoding an inactive site-specific recombinase (105) whose activity is restorable.
A library of recombinant expression vectors is then prepared. This is achieved by inserting into a plurality of expression vectors such as the one defined previously, at least one exogenous nucleic acid from a library of exogenous nucleic acids. Next, a plurality of recombinant expression vectors from this library are inserted into the cell line or transgenic animal provided previously. Thereafter, these recombinant expression vectors are allowed to express the exogenous nucleic acid inserted therein. According to the invention, only exogenous nucleic acids encoding the desired feature will be capable of restoring (106) the activity of the site-specific recombinase of the host. A site-specific recombinase (107) whose activity is restored may then excise the excisable nucleic acid sequence from recombinant expression vector(s) which have expressed an exogenous nucleic acid having restored such site-specific recombinase activity.
Recombinant expression vectors are then recovered (108) from the transfected cells or transgenic animal and recombinant expression vectors having undergone site-specific recombination are selected (109). According to the invention, most of these vectors contain an exogenous nucleic acid encoding the desired feature.
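The logic of the screening method of Figure 1 can be summarized by the following idealized Python sketch. It assumes that each host cell receives a single recombinant vector and that recombination never occurs in the absence of an activating insert, which is a simplification (the text above states only that most selected vectors contain the desired nucleic acid). The class `Vector`, the function `screen` and the insert names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Vector:
    insert: str                # exogenous nucleic acid (103)
    has_stuffer: bool = True   # excisable sequence (104) still present

def screen(library, restores_recombinase):
    """Idealized screen: one vector per cell, no background recombination."""
    recovered = []
    for insert in library:
        vector = Vector(insert)
        # The insert is expressed; the host recombinase (105) becomes active
        # only if the expressed insert provides the desired feature (106).
        if restores_recombinase(insert):
            vector.has_stuffer = False   # excision by the recombinase (107)
        recovered.append(vector)         # recovery of vectors (108)
    # Selection of vectors having undergone recombination (109).
    return [v.insert for v in recovered if not v.has_stuffer]

# Hypothetical library in which only one insert restores recombinase activity.
library = ["cDNA_A", "cDNA_B", "cDNA_C"]
hits = screen(library, lambda insert: insert == "cDNA_B")
```

The predicate passed to `screen` stands in for the biology of the host; in practice it is not known in advance and is exactly what the screen reveals.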
In another embodiment, the invention is used for screening nucleic acids having a transcriptional activity (e.g. regulatory elements such as enhancers, promoters and the like). A preferred screening method comprises the steps of:
a) providing a vector comprising:
i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and
ii) a recombinase substrate excisable specifically by a site-specific recombinase encoded by the site-specific recombinase coding sequence;
b) inserting into a plurality of vectors as defined at step (a) at least one exogenous nucleic acid taken from a library of exogenous nucleic acids in order to provide a library of recombinant vectors;
c) inserting a plurality of recombinant vectors from the library obtained at step (b) into a suitable eukaryotic host;
d) allowing the exogenous nucleic acid inserted at step (b) to activate transcription of the site-specific recombinase coding sequence which is comprised in the vector, thereby producing the site-specific recombinase;
e) allowing the site-specific recombinase so produced to excise the recombinase substrate in the recombinant vector harboring the exogenous nucleic acid having activated the transcription of the site-specific recombinase;
f) following step (e), recovering a plurality of recombinant vectors from the eukaryotic host; and
g) selecting recombinant vectors having undergone site-specific recombination, most of these vectors containing an exogenous nucleic acid having transcriptional activity.
i) Site-specific recombinase
As it will now be explained in more detail, the present invention uses a site-specific recombinase as a tool to screen for nucleic acids encoding a specific function or having a desired feature. Site-specific recombinases are part of the larger integrase family of recombinases that are mainly involved in the insertion, deletion or inversion of genetic material. Site-specific recombinases have been used to recombine DNA molecules transfected into eukaryotic cells, particularly Cre from bacteriophage P1 and Flp from Saccharomyces cerevisiae (Sauer and Henderson, 1988; O'Gorman et al., 1991). These proteins cooperatively bind to specific DNA sequences arranged as palindromes ("recombination target sequences"; e.g. SEQ ID NO:8) (Jayaram, 1985). Recombination between target sequences results in the deletion of the intervening sequence and of one target sequence if the recombination target sequences are in the same orientation relative to one another. Theoretically, any site-specific recombinase can be used according to the present invention. Examples include prokaryotic β-recombinase (Diaz et al., 1999) in addition to Cre and Flp mentioned above. However, it may be necessary to optimize the coding sequence of the recombinase as the preferred codon usage in the organism from which it originates may differ greatly from the preferred codon usage in the screening host. This can impair either stability or efficient translation of the recombinase mRNA. In one of the enclosed examples (Example 1), codons in the first 345 nt of the Flp coding sequence from Saccharomyces cerevisiae (SEQ ID NO:4) have been changed to optimal codons for translation in mammalian cells. In preferred embodiments of the present invention, an optimized coding sequence of Flp (SEQ ID NO:4) is used as a tool to screen nucleic acids.
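The kind of codon replacement described above, limited to the first part of a coding sequence, can be sketched as follows. This is a minimal illustration and not the procedure of Example 1: the table `PREFERRED` lists codons commonly reported as frequent in human genes and is an assumption, the function names are hypothetical, and the short test sequence is made up rather than taken from the Flp gene. The genetic code used is the standard one.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

# Illustrative preferred codons for a mammalian host (assumption, partial).
# Amino acids absent from this table keep their original codon.
PREFERRED = {"L": "CTG", "S": "AGC", "A": "GCC", "G": "GGC", "P": "CCC",
             "T": "ACC", "V": "GTG", "I": "ATC", "K": "AAG", "N": "AAC",
             "Q": "CAG", "H": "CAC", "E": "GAG", "D": "GAC", "Y": "TAC",
             "C": "TGC", "F": "TTC"}

def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODE[cds[i:i + 3]] for i in range(0, usable, 3))

def optimize_prefix(cds, n_nt=345):
    """Replace codons in the first n_nt nucleotides by preferred synonyms."""
    n = min(n_nt, len(cds)) // 3 * 3   # whole codons only
    optimized = []
    for i in range(0, n, 3):
        codon = cds[i:i + 3]
        optimized.append(PREFERRED.get(CODE[codon], codon))
    return "".join(optimized) + cds[n:]

original = "ATGCCATTAAGTAAATAA"        # made-up sequence, not Flp
optimized = optimize_prefix(original, n_nt=9)
```

Because only synonymous codons are substituted, `translate(original)` and `translate(optimized)` give the same protein; the sketch ignores other factors (mRNA structure, cryptic splice sites) that a real optimization might consider.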
ii) The expression vector
The expression vector according to the present invention is minimally composed of a transcription unit (200; Fig 2A) and a selectable recombinase substrate unit (309; Figs 3A to 3C). According to an embodiment of the invention shown in Figure 2A, the transcription unit (200) comprises cloning sites (204), a promoter (202), enhancer elements (201), transcription termination and polyadenylation signals (203). An exogenous nucleic acid (205) is inserted into the cloning sites (204) of the transcription unit (200). According to another embodiment of the invention shown in Figure 2B, the transcription unit comprises a recombinase coding sequence (211) operatively linked to a minimal promoter (210) and an exogenous nucleic acid (205) is placed upstream of the promoter (210).
The transcription unit (200) as schematized on Figure 2A serves to express exogenous nucleic acids and comprises enhancer (201) and promoter (202) sequences, followed by signals (203) for the termination of transcription and the polyadenylation of transcripts generated from the enhancer and promoter sequences. Enhancer and promoter sequences driving robust expression in a wide variety of cells are generally preferred. These include but are not limited to sequences derived from cytomegalovirus immediate-early genes (CMV; GenBank™ acc. No. AF477200) and Rous sarcoma virus long terminal repeat (RSV; GenBank™ acc. No. M83236.1) as well as sequences derived from widely expressed cellular genes such as chicken β-actin and human elongation factor 1α. Alternatively, enhancer and promoter sequences driving expression in specific cells, tissues or organs can be used. In this case, expression of the exogenous nucleic acid will be limited to the cells, tissues or organs in which the enhancer and promoter sequences can activate transcription. This can be desirable when constructing libraries of viral-based expression vectors. Indeed, as will be described in more detail below, such construction requires the introduction of expression vectors into cells at some point, a step that can lead to the loss of vectors comprising exogenous nucleic acids whose expression is deleterious or toxic to the cell type used in the library construction procedure. Such losses can be avoided by ensuring that exogenous nucleic acids are expressed from cell-specific or tissue-specific enhancer and promoter sequences that are not active in the cell type used for library construction. The following DNA fragments are just a few examples of enhancer and promoter sequences that can activate the expression of exogenous nucleic acid only in specific cell populations. Nucleotides are numbered relative to the site of initiation of transcription (+1).
A fragment encompassing nucleotides -1700 to +1 of the rat osteocalcin gene can be used to achieve osteoblast-specific expression (Baker et al., 1992). A fragment encompassing nucleotides -1542 to -1 of the kidney androgen-regulated protein can be used to achieve kidney-specific expression (Ding et al., 1997). Signals for the termination and polyadenylation of transcripts are well known in the art. Examples include part of the 3' untranslated region of the bovine growth hormone gene or of the SV40 virus. Unique cloning sites (204) are introduced between the promoter sequences and the termination signal and are used to insert exogenous nucleic acids (205). To decrease the probability of cleaving the exogenous nucleic acids during their insertion process, these sites are generally recognized by or compatible with sites recognized by enzymes which infrequently cut DNA molecules (e.g. NotI, SalI).
Alternatively, the transcription unit, as schematized on Figure 2B, comprises a recombinase coding sequence (211) linked at its 5' end to a minimal promoter sequence (210) and at its 3' end to a transcription termination and polyadenylation sequence (203). The minimal promoter is typically approximately 30-40 nucleotides in length. Its only functional element is a "TATA box" about 30 nt upstream of the site of initiation of transcription. The minimal promoter sequence can be derived from naturally occurring genes (e.g. pro-opiomelanocortin (Therrien and Drouin, 1991)) or be entirely synthetic. The minimal promoter is chosen such that the level of expression of the recombinase in the screening host is insufficient to mediate efficient recombination of substrate. Unique cloning sites (204) are present, generally immediately upstream of the minimal promoter, to insert an exogenous nucleic acid (205) to be tested for its transcriptional properties. It is understood that the minimal promoter can be omitted to screen for sequences containing complete transcriptional activity.
According to preferred embodiments of the present invention, the expression vector also comprises a recombinase substrate (309). In one embodiment, depicted on Figure 3A, the recombinase substrate (309) is composed of a stuffer region (302) containing a restriction site (R) flanked by recombination target sequences in the same orientation (301). Site-specific recombination (303) leads to removal of the stuffer and one recombination target sequence as well as disappearance of the restriction site. Recombined expression vectors (305) can be distinguished from non recombined expression vectors (304) after digestion with restriction enzyme R and PCR amplification using primers located upstream and downstream (306, 307) of the site of recombination.
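The discrimination between recombined and non recombined vectors by restriction digestion followed by PCR can be sketched in Python as follows. The sketch uses SwaI as the enzyme R (recognition sequence ATTTAAAT, cut in the middle to leave blunt ends) and treats the relevant part of the vector as a linear string rather than a circular plasmid. The recombination target, primer-binding regions and exogenous insert are made-up sequences, and the functions `digest` and `pcr` are hypothetical simplifications (primers are represented by the top-strand sequence they anneal to).

```python
SWAI_SITE = "ATTTAAAT"           # SwaI recognition sequence, cut ATTT^AAAT

def digest(dna, site=SWAI_SITE):
    """Cut a linear molecule at every occurrence of the site."""
    cut_offset = len(site) // 2  # blunt cut in the middle of the site
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

def pcr(fragments, upstream, downstream):
    """Amplify only from fragments carrying both primer-binding regions."""
    products = []
    for frag in fragments:
        i = frag.find(upstream)
        j = frag.find(downstream, i + len(upstream)) if i != -1 else -1
        if j != -1:
            products.append(frag[i:j + len(downstream)])
    return products

# All sequences below are hypothetical.
TARGET = "GAAGTTCCAAACTTGG"      # stand-in for a recombination target (301)
UP, DOWN = "GGATCCGTCA", "CTGCAGTTAC"   # primer regions (306, 307)
INSERT = "ATGGCTAGCTAA"          # stand-in for the exogenous nucleic acid
STUFFER = "CC" + SWAI_SITE + "GG"       # stuffer (302) with restriction site R

non_recombined = UP + INSERT + TARGET + STUFFER + TARGET + DOWN   # (304)
recombined = UP + INSERT + TARGET + DOWN                          # (305)

from_cut_304 = pcr(digest(non_recombined), UP, DOWN)   # template is severed
from_cut_305 = pcr(digest(recombined), UP, DOWN)       # template is intact
```

Without digestion, both vectors would yield a product and differ only in size; digestion removes the non recombined template so that only recombined vectors are amplified.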
More preferably, the recombination target sequences (301) are in direct orientation relative to one another and are separated by a stuffer region (302) which comprises one or more rare restriction sites (R; e.g. NotI, PacI, SwaI). Site-specific recombination (303) leads to removal of the stuffer and one recombination target sequence. Consequently, the rare restriction site is also deleted from the recombined molecule. Thus, unrecombined expression vectors (304) can be distinguished from recombined expression vectors (305) by their size and by the restriction patterns obtained after digestion by the enzyme cleaving at said rare restriction site. Ideally, the restriction site present in the stuffer region should be unique in the expression vector such that unrecombined expression vectors can be distinguished from recombined expression vectors by their sensitivity to the enzyme cleaving at the rare restriction site. Furthermore, the restriction site should be rarely found in DNA molecules to decrease the probability of cleaving the exogenous nucleic acid, thereby allowing a region comprising the exogenous nucleic acid to be amplified by PCR from recombined vectors using primers located upstream and downstream (306, 307) of the recombination site.
In another embodiment of the invention, depicted on Figure 3B, the recombinase substrate (309) is composed of a stuffer region (302) which is flanked by recombination target sequences in the same orientation (301), which disrupts a coding sequence conferring resistance to a given antibiotic (310), and which is expressed from a prokaryotic promoter (311). According to this embodiment, the vector is designed such that the remaining recombination target sequence (312) after site-specific recombination (303) no longer interferes with the production of protein conferring resistance to the antibiotic. Therefore, recombined expression vectors (314) give rise to colonies (316) when transformed into bacteria whereas non recombined expression vectors (313) do not (315).
More preferably, the DNA segment which comprises recombination target sequences (301) in direct orientation relative to one another and separated by a stuffer region (302), is inserted in the expression vector within a gene conferring resistance to a given antibiotic (310) (e.g. aminoglycoside phosphotransferase conferring resistance to kanamycin) such that it disrupts its proper function. Disruption can be achieved by interrupting the coding sequence of said gene or by abolishing the expression of said gene through insertion of recombination target sequences and stuffer in essential promoter sequences or between promoter (311) and coding sequences. Site-specific recombination (303) will lead to removal of one recombination target sequence and the stuffer segment. According to this embodiment, the expression vector is designed such that the remaining recombination target sequence (312) no longer interferes with the proper function of the gene conferring resistance to a given antibiotic. Thus, bacteria transformed with recombined expression vectors (314) will be resistant to a given antibiotic (316) whereas those transformed with unrecombined expression vectors (313) will not (315). It is understood that a rare restriction site can be introduced in the stuffer segment, as described above, to distinguish recombined and unrecombined expression vectors after digestion with an enzyme recognizing such a rare restriction site.
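For the variant in which the recombinase substrate interrupts the coding sequence of the resistance gene, one way to picture the design constraint is sketched below. It rests on assumptions that are not stated in the text: that the remaining recombination target sequence has a length that is a multiple of three and contains no in-frame stop codon, and that the few extra amino acids it encodes are tolerated by the resistance protein. All sequences and function names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}

def first_stop(orf):
    """Index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOPS:
            return i
    return None

def is_intact(orf):
    """True if the reading frame runs uninterrupted to a terminal stop codon."""
    return len(orf) % 3 == 0 and first_stop(orf) == len(orf) - 3

# Made-up sequences; TARGET is 18 nt long and has no in-frame stop codon.
TARGET = "GAAGTTCCAAACTTGGCA"
STUFFER = "TGACC"                 # in-frame stop codon plus a frameshift
ORF_5P, ORF_3P = "ATGGCT", "AAAGGTTAA"   # toy resistance coding sequence

disrupted = ORF_5P + TARGET + STUFFER + TARGET + ORF_3P   # unrecombined (313)
restored = ORF_5P + TARGET + ORF_3P                       # recombined (314)

disrupted_ok = is_intact(disrupted)   # False: premature stop in the stuffer
restored_ok = is_intact(restored)     # True: frame restored after excision
```

The alternative mentioned above, placing the substrate in the promoter or between promoter and coding sequence, does not depend on the reading frame and is not modeled here.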
In yet another embodiment according to the present invention, depicted on Figure 3C, the recombinase substrate (309) is composed of a stuffer region (302) flanked by recombination target sequences in the same orientation (301) which disrupts translation of a viral genome (321) in which a transcription unit (320) comprising an exogenous nucleic acid (205) has been embedded. The defective viral genome is expressed from eukaryotic promoter and enhancer elements (322). Signals for transcription termination and polyadenylation of transcripts are provided (203). According to this embodiment, the vector is designed such that the remaining recombination target sequence (312) after site-specific recombination (303) no longer interferes with the translation of the viral genome. Therefore, recombined expression vectors (325) give rise to viral particles whereas non recombined expression vectors (324) do not.
According to this preferred embodiment, the transcription unit and the associated exogenous nucleic acid (320) are embedded in a viral genome (321) whose translation and replication have been disrupted by insertion of a DNA segment comprising recombination target sequences (301) in direct orientation relative to one another and separated by a stuffer region (302). More preferably, the viral genome is a cDNA copy of a Sindbis virus replicon (see GenBank™ acc. No. NC_001547; and WO 02/16572 incorporated herein by reference) cloned into a DNA-based plasmid downstream of constitutively active enhancer and promoter sequences (322) and whose translation is disrupted by insertion of said DNA segment in the 5' untranslated region of the viral genome. Preferably, the enhancer and promoter sequences driving expression of the disrupted viral genome are different from those comprised in the transcription unit and driving expression of the exogenous nucleic acid. The viral genome contains the transcription unit and associated exogenous nucleic acid between the viral coding sequence and the 3' untranslated region (323). Once transfected, such a DNA plasmid will lead to expression of the exogenous nucleic acid (205). Site-specific recombination (303) will lead to removal of one recombination target sequence and the stuffer segment. According to this embodiment, the expression vector is designed such that the remaining recombination target sequence (312) no longer interferes with the translation and replication of the Sindbis virus replicon. Thus, only recombined expression vectors (325) produce self-replicating and self-packaging viral genomes that contain the exogenous nucleic acid whereas unrecombined expression vectors (324) do not.
Advantageously, the expression vector or part thereof is embedded in a viral genome to generate a viral-based expression vector that minimally contains 1) a transcription unit and 2) a recombinase substrate having at least one of the properties described above. Such a viral-based expression vector may then be packaged within infectious viral particles. These are particularly useful to deliver the expression vector into a whole organism or into cells that are difficult to transfect by conventional methods (e.g. primary cells, immortalized cell lines of hematopoietic origin). Engineered retroviruses and adenoviruses are commonly used to introduce nucleic acid into cells (Ragot et al., 1998; Pear et al., 1993). Standard methodology may be used according to the invention to insert an expression vector in a viral genome and package the resulting viral genome within infectious viral particles. Preferably, the transcription unit and the recombinase substrate of the expression vector should be flanked by viral sequences that are either essential for replication and packaging or that are sufficient to insert components of the expression vector in a viral genome via homologous recombination.
iii) Production of a cell line or a transgenic animal in which the activity of the site-specific recombinase is regulated in a specific manner
The present invention relies on the conditional activity of a site-specific recombinase to select expression vectors containing nucleic acids having a desired feature such as those encoding a peptide/protein with a specific cellular function. It is understood that the recombinase activity should somehow be dependent on the occurrence of the specific cellular function. Before choosing a screening host (cell line or transgenic animal or plant), it is important to ascertain that 1) the specific cellular function does not occur in the absence of an "activating" exogenous nucleic acid, i.e. that the recombinase is not active under basal conditions; and 2) that the cellular function can occur if the right conditions are met, for example transfection of an expression vector containing an "activating" exogenous nucleic acid.
In one embodiment of the present invention, the specific cellular function being screened for is the activation of a particular gene or set of genes. In this case, the recombinase coding sequence is placed under the control of regulatory elements known to be responsible for the activation of this particular gene or set of genes. Thus, according to this embodiment, the recombinase will be expressed solely if an expression vector contains an exogenous nucleic acid that can activate transcription from the regulatory elements. Various regulatory elements have been described in the prior art. They are generally composed of repeats of synthetic oligonucleotides or relatively small gene fragments. They activate transcription under known conditions. For example, transcription from cyclic AMP response elements (CRE) is activated by increased intracellular cyclic AMP levels, a well-known second messenger to many hormones (Tamai et al., 1997). As another example, transcription from a 1.7 kb fragment of the osteocalcin gene is activated upon osteoblast terminal differentiation (Baker et al., 1992). Thus, regulatory elements can be operatively linked to the recombinase coding sequence to obtain a conditionally active form of the recombinase. Transcription termination and polyadenylation signals are also added to the 3' of the recombinase coding sequence. Another approach can be used to place the expression of a recombinase coding sequence under the control of specific regulatory elements. This is the so-called "knock-in" technique, whereby, according to a preferred embodiment of the present invention, the whole or part of the expressed sequence of a specific cellular gene is replaced by a recombinase coding sequence, or whereby a recombinase coding sequence is inserted into a specific cellular gene. The result of such replacement or insertion is that the expression of a recombinase coding sequence mimics the expression of a cellular gene. 
Methods to insert into or replace specific cellular genomic sequences are known in the prior art. By targeting a cellular gene known to be activated under the desired conditions (e.g. activation of a specific cellular pathway or cell differentiation), a recombinase coding sequence can be expressed solely under the desired conditions, thereby creating a conditionally-active form of the recombinase.
In another embodiment, the specific cellular function being screened for is the translocation of a signaling molecule from the cytosol to the nucleus. In this case, the recombinase is fused to the signaling molecule and the mRNA encoding the fusion protein is constitutively expressed from enhancer and promoter sequences. A number of signaling molecules are known to shuttle between the cytosol and the nucleus depending on the activation state of certain cellular pathways. For example, part of the NF-κB complex translocates to the nucleus upon activation of lymphocytes. Smad4 is another example of a signaling molecule that translocates to the nucleus as a result of TGFβ binding to its cognate receptor at the cell surface (Wrana and Attisano, 2000). Furthermore, it is well known that many nuclear receptors (e.g. glucocorticoid receptor) translocate to the nucleus upon ligand binding. In the absence of activation, the signaling molecule is retained in the cytosol, thereby leading to retention of the fused recombinase in the cytosol. Since the recombinase must be located in the nucleus to perform site-specific recombination, the recombinase is inactive when retained in the cytosol. Upon activation of the signaling molecule (i.e. a specific cellular pathway), the fusion protein translocates to the nucleus, where the recombinase moiety acts on the expression vector containing an exogenous nucleic acid whose expression triggered activation of the specific pathway. Thus, a conditionally-active form of the recombinase can be obtained by fusion with certain signaling molecules.
In a further embodiment of the present invention, the specific cellular function being screened for is the stabilization of a particular messenger RNA. It is known in the prior art that certain messenger RNAs are unstable due to the fact that they contain one or more "destabilizing" sequences, usually located in their 3' untranslated region. Various mRNA-destabilizing sequences have been reported. To screen for nucleic acids whose expression leads to stabilization and therefore increased translation of specific mRNA, the recombinase coding sequence is fused to a specific "destabilizing" sequence. The chimeric mRNA is expressed from constitutively active enhancer and promoter sequences. However, the recombinase is not produced because of the instability of the chimeric mRNA. Thus, a conditionally-active recombinase can be obtained by inserting in its mRNA a chosen destabilizing sequence.
Methods to incorporate DNA segments into the genome of a cell are well known in the art. According to a preferred embodiment of the invention, the conditionally active recombinase sequence is inserted into a plasmid containing a gene conferring resistance to a selective agent (e.g. puromycin-N-acetyltransferase conferring resistance to puromycin). The resulting construct is transfected into cells using standard protocols (e.g. electroporation) and selection is applied. Surviving and growing cells are thought to have incorporated the plasmid and are cloned. Individual clones are analyzed by Southern blotting to confirm the presence of the construct within the genomic DNA. Ultimately, positive cellular clones are tested to determine whether site-specific recombination can be activated when the conditions initially set forth are met, for example when transcription from regulatory elements linked to the recombinase coding sequence is activated or when the fusion protein containing a recombinase moiety is translocated to the nucleus. Cellular clones containing a conditionally-active form of a recombinase are used as screening hosts.
Methods to incorporate DNA segments into the genome of an organism are also well known in the art. According to a preferred embodiment of the invention, the conditionally active recombinase sequence is inserted into a fertilized egg (e.g. of a mouse), which is re-implanted into a pseudo-pregnant mother. DNA extracted from resulting organisms (e.g. embryos, pups or adults) is analyzed by Southern blotting to determine whether the organism is transgenic. Positive animals are bred and used as screening hosts. Alternatively, the conditionally-active form of a recombinase can be incorporated in the genome of embryonic stem ("ES") cells (e.g. of mouse origin). The transgenic ES cells can be aggregated with morulae or injected into blastocysts to obtain chimeric animals. If the transgenic ES cells have populated the germline, then the resulting chimeric animal can be bred to obtain a line of transgenic animals, which can be used as screening hosts.
iv) Insertion of exogenous nucleic acids in the expression vector and production of libraries of recombinant expression vectors containing exogenous nucleic acids
The exogenous nucleic acid may be derived from any source, i.e. any organism, tissue or cell type, disease state, etc. In one embodiment of the invention, a plurality of different nucleic acids is inserted into a plurality of copies of an expression vector to provide a plurality of recombinant expression vectors each expressing a unique exogenous nucleic acid and/or encoding a unique protein or peptide. Alternatively, a nucleic acid encoding one particular exogenous protein or peptide may be inserted into the expression vector.
Preferably, the exogenous nucleic acid is derived from a nucleic acid library and a plurality of exogenous nucleic acids are inserted into multiple expression vector copies to yield a pool of recombinant expression vectors. The library may be obtained from a tissue or a cell type of interest or synthesized artificially. This library may be a cDNA library, a genomic library, an RNA library, an expressed sequence tag library, a library made of randomized artificial sequences, or any other kind of library comprising nucleic acids from any kind of organism, tissue, or cell type known to the skilled artisan. Preferably the library is derived from a mammalian source. However, the library may also be derived from reptilian, amphibian, avian, insect, plant, fungal, bacterial cells, etc. The exogenous nucleic acid may be derived from mRNA isolated from a tissue or cell type of interest. In this case, the mRNA would be purified and reverse transcribed into cDNA using methods well known in the art. In some instances, the nucleic acid library will be derived from a subtractive library, for example a library which comprises cDNAs differentially expressed in a disease state when compared to the corresponding healthy tissue. Suitable nucleic acid libraries may be generated using standard methods (see for example Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring Harbor (1989)).
Although exogenous nucleic acids of any type can be screened and selected using the present invention, examples given below rely on cDNA, fragments of cDNA or fragments of genomic DNA as a source of exogenous nucleic acids. In the case of fragments of cDNAs, initiation and termination codons may be provided by the expression vector upstream and downstream of the cloning site(s) for the fragments of cDNAs, respectively. In the case of fragments of genomic DNA, a library is made starting either with whole genomic DNA or DNA insert(s) from λ bacteriophage, cosmid or bacterial artificial chromosome containing genomic DNA. Exogenous nucleic acids are generated by partial digestion of the DNA with a restriction enzyme cutting DNA frequently (e.g. Sau3A, RsaI) and can be size-selected by sieve chromatography (e.g. Sepharose™ CL2B column).
Preferably, the exogenous nucleic acids are cloned into the expression vectors to produce recombinant expression vectors. The resulting population of recombinant expression vectors is transformed in Escherichia coli by electroporation according to standard procedures. A typical yield is 5x10⁵ to 5x10⁷ transformants/μg of cDNA depending on the expression vector. A person skilled in the art will understand that the required number of individual transformants depends on the predicted abundance of exogenous nucleic acids having the desired feature within the starting population of exogenous nucleic acids. Plasmids may be prepared and purified according to standard procedures. Additional steps may be needed to obtain a population of viral-based expression vectors. In the case of retroviruses, plasmids comprise viral sequences essential for replication and packaging as well as components of an expression vector. The population of plasmids is transfected into a cell line expressing the viral proteins necessary for replication and packaging to generate a plurality of recombinant viral genomes that are subsequently packaged. Thus, a plurality of retroviral-based expression vectors is obtained. In the case of adenoviruses, plasmids comprise parts of the viral genome separated by the components of an expression vector. The population of plasmids is transfected, along with a replication-defective viral genome, in a cell line that can complement the replication defect, usually HEK293 cells. Homologous recombination between a plasmid and a viral genome generates a recombinant viral genome having inserted the components of the expression vector. This recombinant viral genome is subsequently packaged and can be propagated in HEK293 cells. A plurality of recombinant viral genomes is thus produced and packaged to obtain a plurality of adenoviral-based expression vectors.
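The dependence of the required number of transformants on the abundance of the sought nucleic acid can be estimated with the following sketch. It is not part of the described method: it assumes that clones are sampled independently and uses the commonly cited library-coverage relation N = ln(1 - P) / ln(1 - f), where f is the assumed fractional abundance of the desired nucleic acid and P the desired probability of recovering it at least once. The function name and the example values are hypothetical.

```python
import math


def transformants_needed(abundance: float, probability: float = 0.99) -> int:
    """Estimate how many independent transformants are needed so that a
    nucleic acid present at fractional `abundance` is represented at least
    once with the given `probability` (independent-sampling assumption)."""
    if not 0.0 < abundance < 1.0:
        raise ValueError("abundance must be strictly between 0 and 1")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    # N = ln(1 - P) / ln(1 - f); log1p keeps precision for very small f.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-abundance))


# Hypothetical example: a nucleic acid assumed to represent 1 in 100,000 clones.
n_rare = transformants_needed(1e-5, probability=0.99)
```

Under these assumptions, a nucleic acid present at about 1 in 100,000 requires a library of several hundred thousand independent transformants, which lies within the typical yield per μg stated above.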
v) Insertion of the expression vector into a suitable host and recombination of expression vector comprising a heterologous nucleic acid encoding the desired function
As outlined above, a suitable host should be able to perform the cellular function being screened for, but it should not exhibit this function in the absence of an "activating" condition (e.g. expression of an appropriate exogenous nucleic acid). If screening is performed using viral-based expression vectors, it is necessary that the host be infected with the recombinant viral particles. In preferred embodiments of the present invention, the genome of the host should also harbor a conditionally-active form of a recombinase. Introduction of a recombinant expression vector into a eukaryotic host cell can be carried out using a number of different well known procedures. Transfection by electroporation, lipofection, calcium phosphate, and micro-injection are only a few of the available techniques to introduce nucleic acids into eukaryotic cells. Introduction of recombinant viral-based expression vectors into eukaryotic host cells is simply carried out by incubating the host with the viral particles and allowing infection to proceed. In order to ensure that a recombinant viral-based expression vector is introduced in almost every cell, infection is usually performed at a multiplicity of infection (m.o.i.) greater than 1 (e.g. 10 plaque-forming units/cell). Introduction of a recombinant expression vector into a transgenic animal or plant or part thereof can be carried out by electroporation or by injection of complexes comprising lipid derivatives and DNA (e.g. intravenous or peritoneal injections). Introduction of a recombinant viral-based expression vector into a transgenic organism is performed by injection of viral particles, e.g. in the case of recombinant adenoviruses, 10⁸ plaque forming units in 0.05 ml of saline intraperitoneally (Mittal et al., 1993).
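The reason for infecting at an m.o.i. greater than 1 can be illustrated numerically. The sketch below assumes that the number of infectious particles entering each cell follows a Poisson distribution with mean equal to the m.o.i.; this is a standard simplifying model and is not stated in the present description. The function names are hypothetical.

```python
import math


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one infectious particle,
    assuming a Poisson distribution with mean `moi`."""
    if moi < 0:
        raise ValueError("moi must be non-negative")
    # P(at least one particle) = 1 - P(zero particles) = 1 - exp(-moi)
    return 1.0 - math.exp(-moi)


def pfu_required(cell_count: int, moi: float) -> float:
    """Plaque-forming units to add to `cell_count` cells to reach `moi`."""
    return cell_count * moi


low = fraction_infected(1.0)    # roughly 63% of cells under this model
high = fraction_infected(10.0)  # more than 99.99% of cells under this model
```

Under this model, an m.o.i. of 1 leaves more than a third of the cells uninfected, whereas an m.o.i. of 10 reaches almost every cell, consistent with the example given above.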
Once the expression vector or viral-based expression vector has been introduced into the host, the transfected or infected cell should provide most of the molecular machinery for the proper expression and/or function of the exogenous nucleic acid contained therein. The biological function encoded by the exogenous nucleic acid is carried out, if any, directly or through the corresponding protein or peptide if it contains an open reading frame. If this biological function somehow triggers the activity of the conditionally-active recombinase present in the host (e.g. by activating its expression, by inducing its translocation into the nucleus, by stabilizing its mRNA), then the expression vector or viral-based expression vector containing said exogenous nucleic acid will be recombined.
vi) Recovery and identification of recombined expression vector
After recombination has taken place, genomic and extrachromosomal DNA are extracted using standard techniques to recover a pool of expression vectors. Typically, this step is done 24 to 72 hours after introduction into the host of the expression vector having incorporated an appropriate exogenous nucleic acid. This step generally involves lysis of cells using a buffer containing ionic detergent (e.g. 1% SDS) followed by digestion of proteins using proteinase K and purification of DNA by ion-exchange chromatography or phenol extraction and ethanol precipitation. In the case of expression vectors smaller than approximately 12 kb, it may be preferable to extract specifically extrachromosomal DNA using a modified Hirt procedure involving lysis of cells by ionic detergents (e.g. 1.2% SDS) followed by precipitation of the genomic DNA (e.g. using KOAc 3M, pH 5) and purification of extrachromosomal DNA by ion-exchange chromatography (e.g. QIAQUICK™ column from Qiagen Inc.).
In one embodiment of the present invention, activation of a site-specific recombinase results in the modification of the expression vector, this modification leading to the removal of a unique restriction site contained in the expression vector. Thus, unrecombined expression vectors can be cut by the enzyme recognizing the restriction site whereas recombined expression vectors cannot. Since it is known that cleaved vectors are much less efficiently transformed in bacteria than uncleaved vectors, this property may be used to identify expression vectors that have been recombined. According to a preferred embodiment of the invention the method to identify recombined expression vectors comprises the steps of: a) extracting DNA from cells into which the expression vector(s) with an exogenous nucleic acid sequence has been introduced; b) digesting the DNA of step a) with a restriction enzyme recognizing a restriction site present in the stuffer region of the expression vector, between recombination target sequences; c) transforming bacteria with the digested DNA molecules of step b); d) culturing the transformed bacteria in a selection medium (e.g. medium with an antibiotic such as ampicillin) so as to select for bacterial colonies resistant to the selection medium by virtue of having been transformed by an expression vector; e) extracting the expression vector molecules from bacterial colonies selected in step d); and f) identifying exogenous nucleic acid(s) found in the expression vectors extracted at step e).
To further increase the specificity of recovery of expression vectors at step e) herein above, it may be preferable to degrade cleaved unrecombined expression vectors (step b) with a nuclease that acts on extremities of double strand DNA (e.g. lambda exonuclease). Circular uncleaved recombined molecules are protected from the action of such nucleases.
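The logic of steps b) and c), together with the optional nuclease treatment, can be sketched as a simple filter over a pool of vectors. This is an illustrative model only: it assumes that digestion is complete, that each vector is a circular molecule represented by a single strand written as a string, and that the nuclease removes every linearized molecule. The class `Vector`, the function names and all sequences are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vector:
    name: str
    sequence: str
    circular: bool = True


def digest(vectors: List[Vector], site: str) -> List[Vector]:
    """Linearize every circular vector that still contains the restriction site."""
    return [
        Vector(v.name, v.sequence, circular=v.circular and site not in v.sequence)
        for v in vectors
    ]


def remove_linear(vectors: List[Vector]) -> List[Vector]:
    """Model a nuclease acting on free DNA ends: only circular molecules survive."""
    return [v for v in vectors if v.circular]


# Hypothetical toy sequences: RTS stands for a recombination target sequence,
# SITE for the unique restriction site located in the stuffer.
RTS, SITE = "ACGTACGTTGCA", "GGATCC"
pool = [
    Vector("unrecombined", "AAAA" + RTS + "TT" + SITE + "TT" + RTS + "CCCC"),
    Vector("recombined", "AAAA" + RTS + "CCCC"),
]
recovered = remove_linear(digest(pool, SITE))
```

In this model only the recombined vector, which has lost the stuffer and its restriction site, remains available to transform bacteria.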
According to another embodiment of the invention, activation of a site-specific recombinase results in the modification of the expression vector, this modification leading to the reconstitution of a sequence encoding a peptide/protein conferring a resistance to a suppressive condition, such as resistance to a given antibiotic (e.g. aminoglycoside phosphotransferase conferring resistance to kanamycin). Hence, according to an embodiment of the invention, the method to identify recombined expression vectors comprises the steps of: a) extracting DNA from host(s) into which the expression vector with an exogenous nucleic acid sequence has been introduced; b) transforming bacteria with the DNA molecules extracted at step a); c) selecting for bacterial colonies resistant to a selection medium (e.g. culture medium with an antibiotic such as kanamycin) by virtue of having been transformed by a recombined expression vector; d) extracting the expression vector molecules from bacterial colonies selected at step c); and e) identifying the exogenous nucleic acid(s) found in the expression vectors from step d).
It is also possible to use the polymerase chain reaction (PCR) to identify exogenous nucleic acids inserted into expression vectors which have been recombined. This is particularly useful when screening is performed with viral-based expression vectors. According to an embodiment of the invention, identification of exogenous nucleic acids inserted into expression vectors is achieved using a PCR-based method which comprises the steps of: a) extracting DNA from host(s) into which the expression vector with an exogenous nucleic acid sequence has been introduced; b) digesting the DNA of step a) with a restriction enzyme recognizing a site present in the stuffer region of the expression vector, between recombination target sequences; c) amplifying a DNA fragment ("amplicon") from the expression vector using a forward primer located upstream of the site of insertion of the exogenous nucleic acid(s) and a reverse primer located downstream of the recombinase substrate; and d) identifying exogenous nucleic acid(s) found in the amplicon obtained from step c).
To further increase the specificity of the PCR reaction, it may be preferable to degrade cleaved unrecombined expression vectors (step b) with a nuclease that acts on extremities of double strand DNA (e.g. lambda exonuclease). Uncleaved circular recombined molecules are protected from the action of such nucleases. In the case of viral-based expression vector derived from some linear DNA viruses (e.g. adenoviruses), cleaved unrecombined expression vectors can be degraded by a nuclease that acts specifically on free 5' extremities of double strand DNA (e.g. bacteriophage lambda exonuclease VII). Since the 5' extremities of adenovirus derivatives are covalently linked to a protein moiety, uncleaved linear recombined molecules are protected from the action of such nucleases.
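The selectivity of the PCR-based method can be illustrated with a small in silico sketch. It assumes exact primer matches, complete digestion and a linear single-strand string representation of the template; all sequences, primer names and function names are hypothetical placeholders. A product is obtained only when both primer binding sites lie on the same uncut fragment.

```python
from typing import List, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def digest(template: str, site: str) -> List[str]:
    """Cut immediately before each occurrence of `site` (a simplification)."""
    fragments, start = [], 0
    hit = template.find(site)
    while hit != -1:
        fragments.append(template[start:hit])
        start = hit
        hit = template.find(site, hit + 1)
    fragments.append(template[start:])
    return fragments


def amplicon(fragment: str, fwd: str, rev: str) -> Optional[str]:
    """Return the amplified region, or None if either primer site is absent."""
    start = fragment.find(fwd)
    if start == -1:
        return None
    rev_site = revcomp(rev)
    end = fragment.find(rev_site, start + len(fwd))
    return None if end == -1 else fragment[start:end + len(rev_site)]


def pcr_after_digest(template: str, site: str, fwd: str, rev: str) -> List[str]:
    products = (amplicon(f, fwd, rev) for f in digest(template, site))
    return [p for p in products if p is not None]


# Hypothetical toy sequences.
FWD, INSERT, RTS = "ATGCCGTA", "CCCCAAAA", "ACGTACGTTGCA"
STUFFER, SITE, DOWNSTREAM = "TTGGATCCTT", "GGATCC", "GATTACAG"
REV = revcomp(DOWNSTREAM)  # reverse primer anneals downstream of the substrate
unrecombined = FWD + INSERT + RTS + STUFFER + RTS + DOWNSTREAM
recombined = FWD + INSERT + RTS + DOWNSTREAM
from_unrecombined = pcr_after_digest(unrecombined, SITE, FWD, REV)
from_recombined = pcr_after_digest(recombined, SITE, FWD, REV)
```

In this model the digested unrecombined template yields no amplicon because the two primer sites end up on separate fragments, whereas the recombined template yields an amplicon that contains the exogenous insert.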
The amplicon can be cloned in a general purpose bacterial plasmid (e.g. pBLUESCRIPT™ KS II). If recombination has resulted in the reconstitution of a sequence encoding resistance to a given antibiotic (e.g. neomycin phosphotransferase conferring resistance to kanamycin) and if the amplicon contains the whole antibiotic resistance coding sequence, then bacteria transformed with a plasmid harboring the amplicon may be preferably selected on medium containing an appropriate antibiotic, thereby ensuring that only amplicons derived from recombined molecules are cloned.
According to a third embodiment of the present invention, activation of a site-specific recombinase results in the modification of the expression vector, this modification leading to the removal from the expression vector of a stuffer region preventing the replication and translation of a cDNA copy of the Sindbis virus genome that comprises the exogenous nucleic acid. Preferably the exogenous nucleic acid is inserted immediately upstream of the 3' untranslated region of the cDNA copy of the Sindbis virus genome and it is expressed from enhancer and promoter sequences preferably also embedded in the cDNA copy of the Sindbis virus genome. Thus, recombined expression vectors produce self-replicating and self-packaging viral genomes that contain the exogenous nucleic acid whereas unrecombined expression vectors do not. Viral particles are therefore produced only in cells in which an exogenous nucleic acid encoding a desired function has been expressed. Because the exogenous nucleic acid and, generally, the promoter and enhancer sequences, are embedded in the cDNA copy of the viral genome, the viral particles that are produced after recombination contain a copy of the desired exogenous nucleic acid. Viral particles are collected i) from the culture medium of host cells transfected with a library of expression vectors, or ii) from extracellular fluids (e.g. blood, lymph) or whole tissues if the screening host is a transgenic organism. Viral particles can be infectious to allow the propagation of the viral genome comprising desired exogenous nucleic acids. Viral genomes can be recovered from infectious viral particles by infecting a susceptible cell line (e.g. BHK-21 in the case of recombinant Sindbis viral particles) and extracting nucleic acids from infected cells (e.g. RNA in the case of recombinant Sindbis virus).
A DNA fragment containing the exogenous nucleic acid can be obtained by PCR (after reverse transcription of RNA in the case of Sindbis virus) using primers located upstream and downstream of the site of insertion of the exogenous nucleic acid. U.S. application No 09/641,931 of the present inventors (incorporated herein by reference) contains numerous details and explanations on the construction and use of recombinant Sindbis viral particles and genomes.
Alternatively, viral particles produced after recombination of expression vectors can be conditionally infectious to prevent unwanted effects of a viral infection on the screening host, particularly in the case of transgenic animal screening hosts. It is known that conditionally-infectious Sindbis virus particles can be obtained by preventing the cleavage of the p62 envelope precursor protein, usually by introducing a deleterious mutation in the sequence coding for the cleavage site (Berglund et al., 1993). Viral particles produced under these conditions or from such a modified Sindbis virus genome can be recovered and partially purified (e.g. by centrifugation on density gradient or by heparin-agarose affinity chromatography). Alternatively, viral particles produced from p62 cleavage deficient mutants can be rendered infectious after recovery from screening host by controlled digestion with chymotrypsin and used to infect a susceptible cell line (e.g. BHK-21 fibroblasts).
The identity of the exogenous nucleic acid inserted into a recombined expression vector can be determined by sequencing appropriate region(s) of the plasmids recovered from bacteria or by sequencing appropriate region(s) of the DNA fragment comprising the exogenous nucleic acid and obtained, for example, by PCR amplification. Sequence comparisons with known polynucleotide sequences in databases may confirm the function of the isolated exogenous nucleic acid and/or reveal homologies with nucleic acids encoding known functions. The exogenous nucleic acids inserted into recombined expression vectors can also be i) analyzed by digestion with restriction enzymes followed by gel electrophoresis; ii) used as hybridization probe(s) in expression profiling or microarray analysis; iii) otherwise characterized.
vii) Applications of the identified exogenous nucleic acids having a desired property
The exogenous nucleic acids selected and identified according to the methods of the invention, as well as the peptides and proteins encoded by the same, may have many uses. They may be useful for research applications and laboratory use. For instance, they may be used for further screening procedures (e.g. as a library), they may serve as probes for the discovery and isolation of various genes and/or diseases, be used for the production of antibodies, or be used for the development and the use of oligonucleotide or oligoribonucleotide sequences, antisense DNA or RNA molecules or ribozymes. Some of the genes and gene products identified and isolated by the method of the present invention may directly be used as therapeutic agents or, alternatively, as therapeutic targets. These applications and others are known in the art as well as the manner in which they can be reduced to practice.
EXAMPLES
As will now be demonstrated by way of examples hereinafter, the invention provides a very rapid, efficient and accurate method to select a particular nucleic acid having a desired feature. Example 1 shows the properties of a Flp recombinase whose coding sequence has been partially optimized (oFlp). Example 2 gives an example of a plasmid-based expression vector, a cell line in which the activity of oFlp is regulated in a specific manner, and methods to selectively recover recombined forms of plasmid-based expression vectors. Example 3 gives an example of a viral-based expression vector and a method to selectively recover its recombined form after infection of cells constitutively expressing an active form of oFlp. Example 4 gives an example of a vector carrying a cDNA copy of the Sindbis virus genome inactivated through insertion of a recombinase substrate and a method to recover viral particles after recombination of this vector. Example 5 gives a hypothetical example of a screening performed in a transgenic animal using a virus-based expression vector. Example 6 gives a hypothetical example of an in vivo screening performed to identify tissue-specific regulatory elements.
Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.
A) Materials and methods
The following are experimental procedures and materials that were used for the examples set forth below.
Enzymes and reagents
Restriction enzymes and DNA-modifying enzymes were purchased from New England Biolabs (Cambridge, Ma.) unless otherwise stated. TITAN™ one-tube RT-PCR system was purchased from Roche Molecular (Laval, Quebec, Canada). Taq DNA polymerase was purchased from Amersham Pharmacia Biotech (Baie d'Urfe, Quebec, Canada). Synthetic oligonucleotides were obtained either from Hukabel Ltd. (Montreal, Quebec, Canada), Life Technologies (Burlington, Ontario, Canada) or MWG Biotech Inc. (High Point, North Carolina). Cell culture reagents were from Life Technologies unless otherwise stated.
Plasmids
PQuantox™, pQBI25fc3™ and pQBIAdBN™ were purchased from Quantum Biotechnologies, Montreal, Canada. pREP4 (GenBank™ accession number A25856) and pQE30™ were from Qiagen Inc. pBluescript II™ SK (+) was from Stratagene (California). DH-BB, pSinRep5 and pcDNA1.1/Amp were from Invitrogen (Carlsbad, Ca.).
Cloning of Flp recombinase and synthesis of a partially optimized coding sequence
The sequence of Flp recombinase (SEQ ID NO:4) was amplified by 30 cycles of PCR from approximately 200 ng of yeast DNA using 25 pmoles of forward primer 20-5 (SEQ ID NO:29), 25 pmoles of reverse primer 23-1 (SEQ ID NO:30) and 1 U of high-fidelity Vent DNA polymerase (New England Biolabs, Ma.) in 50 μl of 1x Vent reaction buffer supplemented with 3% DMSO and 200 μM dNTP. Comparison of the sequence of our clone with published Flp sequence (GenBank™ accession number J01347) revealed no difference. pCMVneo was derived from pQBI25fc3™ (Quantum Biotechnologies, Montreal, Canada) by deletion of a 758 bp SacII-ApaI fragment. The Flp PCR fragment was cloned at the NruI site of expression vector pCMVneo in order to achieve production of the Flp recombinase in transfected cells. The resulting plasmid is designated RC6. A 1654 bp BsmI-DraIII fragment was deleted from RC6 to generate RC26. Finally, a filled-in BamHI/PvuII 1392 bp fragment from pPur™ (BD Clontech, Ca.) was cloned at a NaeI site of RC26 to generate RC33. Oligonucleotide-mediated gene synthesis was used to optimize the 5' third of Flp coding sequence. Briefly, 35 pmoles of oligonucleotides 89-1 (SEQ ID NO:31), 86-1 (SEQ ID NO:32), 83-1 (SEQ ID NO:33), 82-1 (SEQ ID NO:34), 80-1 (SEQ ID NO:35) and 74-1 (SEQ ID NO:36) were phosphorylated and annealed in 70 μl of 10 mM Tris-HCl pH 7.5/100 mM NaCl/1 mM EDTA by heating at 85°C for 10 minutes and decreasing the temperature at a rate of 1°C/minute. Gaps were filled using 3 U of T4 DNA polymerase and extremities were ligated using 1 Weiss U of T4 DNA ligase. The resulting 391 bp fragment was isolated by electrophoresis on a 2% agarose gel, purified using the QiaQuick™ kit (Qiagen) and amplified by 25 cycles of PCR using 1 U Vent DNA polymerase and 25 pmoles of primers 24-6 (SEQ ID NO:37) and 18-88 (SEQ ID NO:38) in 90 μl of Thermopol™ 1x buffer containing 6% dimethylsulfoxide and 200 μM dNTP.
The PCR product was digested by BamHI and EcoRV and inserted into a BamHI-EcoRV digested RC33. The resulting plasmid is designated RC59.
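As a quick check on the reaction set-ups above, primer amounts stated in pmoles can be converted to molar concentrations, since 1 pmol/μl equals 1 μM. The short Python sketch below is illustrative only; the function name is hypothetical and is not part of the described method.

```python
def primer_conc_um(pmoles: float, volume_ul: float) -> float:
    """Convert a primer amount (pmol) in a reaction volume (ul) to uM (1 pmol/ul = 1 uM)."""
    return pmoles / volume_ul

# Values taken from the two PCR set-ups described above.
flp_pcr = primer_conc_um(25, 50)        # amplification of Flp from yeast DNA
synthetic_pcr = primer_conc_um(25, 90)  # amplification of the 391 bp synthetic fragment

print(f"{flp_pcr:.2f} uM, {synthetic_pcr:.2f} uM")
```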
Cell culture and transfection
HEK293A cells (ATCC no. CRL-1575) are grown in Dulbecco's minimal essential medium supplemented with 10% (v/v) fetal bovine serum, 100 U/ml penicillin and 100 mg/ml streptomycin. Cells are passaged when reaching 80-95% confluence by incubating with 0.05% (v/v) trypsin/0.5mM EDTA (Wisent Inc.). Lipofection is performed as follows. Lipid:DNA complexes are formed in 100 μl of culture medium without serum using 3 μl of 1 mg/ml PEI (Sigma, St. Louis) per μg of DNA. Cells are transfected the day after plating (typically 10,000 cells/cm²) by adding the lipid:DNA complex to the culture medium. After a 3 hour incubation, the medium is changed and cells are usually processed after 48 hours.
Production of recombinant proteins and polyclonal antibodies
pQE30™ (Qiagen, Mississauga, Ontario, Canada) contains an origin of replication, the β-lactamase coding sequence, and the taq promoter controlling the expression of a given fusion protein containing 6 histidines at its N-terminus. An 865 bp HindIII fragment from plasmid RC6 was subcloned in HindIII-digested pQE30™. The hexahistidine tag coordinates a nickel atom, thereby allowing purification of the fusion protein by metal affinity chromatography. pQE30™ containing Flp1-286 described above is transformed in strain M15[pREP4]. The fusion protein is produced and purified under denaturing conditions (6M guanidine hydrochloride, 200mM NaCl, 100mM sodium phosphate, 10mM Tris pH 8.0, 2mM imidazole, 5mM β-mercaptoethanol) according to the manufacturer's instructions (QIAEXPRESSIONIST™ kit, Qiagen, Mississauga, Ontario, Canada). The protein solution is dialyzed at 4°C against 4 liters of PBS. Approximately 200 μg of recombinant protein mixed with complete Freund's adjuvant (VWR Canlab, Montreal, Quebec, Canada) is injected subcutaneously into a New Zealand White rabbit on day 1. On days 15 and 28, another 100 μg of recombinant protein mixed with incomplete Freund's adjuvant is similarly injected. Rabbits are bled 7 days after the last injection.
Nuclear extracts, Western blotting, immunofluorescence
For immunofluorescence, cells are rinsed twice with PBS and fixed with 2% (w/v) paraformaldehyde in PBS. Cells are washed with PBS and fixative is quenched by incubating 10 minutes in PBS supplemented with 50mM NH4Cl. Cells are then incubated overnight in PBS supplemented with 1% (w/v) bovine serum albumin fraction V (BSA), 0.1% (w/v) dried low fat milk and 0.05% (v/v) Triton X-100™. Cells are incubated in a 1/25 dilution of antiserum in PBS/BSA 0.1%/milk 0.1%, washed and incubated with anti-rabbit IgG coupled to TRITC. To prepare crude nuclear extracts, cells are rinsed with PBS and collected in PBS/1 mM EDTA. Cells are incubated for 15 minutes on ice in 0.8ml of buffer A (10mM HEPES pH 7.9; 10mM KCl; 0.1mM EDTA; 1mM DTT; 10μg/ml aprotinin). Igepal CA-630™ (50μl of a 10% (v/v) solution) is added, the solution is briefly vortexed and centrifuged. The nuclear pellet is incubated for 15 minutes on ice in 0.05 ml of buffer B (20mM HEPES pH 7.9; 400mM NaCl; 1mM EDTA; 1mM DTT; 10μg/ml aprotinin). The solution is centrifuged and the insoluble pellet is resuspended in Laemmli buffer (50mM Tris-HCl, pH 6.8, 100mM dithiothreitol, 2% sodium dodecyl sulfate (w/v), 0.1% bromophenol blue (w/v), 10% glycerol (v/v)) and boiled for 5 minutes. Proteins are electrophoresed on denaturing polyacrylamide gel and transferred to 0.22 μm nitrocellulose according to standard protocols. The nitrocellulose membrane is incubated overnight in Tris-buffered saline (TBS; 25mM Tris-HCl, pH 7.4, 137mM NaCl, 2.7mM KCl) supplemented with 5% (w/v) dried milk and 0.1% (v/v) TWEEN-20™ (Sigma, St. Louis, Mo.). It is then incubated for 1.5 hours at room temperature with affinity-purified antibody to Flp at a concentration of approximately 3μg/ml in TBS supplemented with 0.1% (w/v) dried milk and 0.1% (v/v) TWEEN-20™. The membrane is washed twice with TBS supplemented with 0.1% (v/v) TWEEN-20™.
It is then incubated for 1 hour at room temperature with goat anti-rabbit coupled to horseradish peroxidase (Sigma, St.Louis, Mo.) diluted 1/30,000 in TBS supplemented with 0.1% (v/v) TWEEN-20™. The membrane is washed twice with TBS supplemented with 0.1% (v/v) TWEEN-20™. Detection of the protein bound to the antibody complex is performed with the ECL™ reagent according to the manufacturer's instructions (Amersham Pharmacia Biotech, Baie d'Urfe, Canada).
RNA extraction and Northern analysis
Total RNA is purified either by the guanidinium isothiocyanate/acid phenol method or using the RNEASY™ kit according to the manufacturer's instructions (Qiagen). For Northern analysis, 2.5μg of total RNA was electrophoresed on 1.2% agarose/1.2% formaldehyde gel and transferred onto nylon membrane by capillarity. After UV crosslinking, the blot was probed with a radioactively-labeled full length wild type Flp fragment. After hybridization, the membrane was rinsed with 2xSSC/0.1% SDS and washed at 65°C once with 2xSSC/0.1% SDS and twice with 0.2xSSC/0.1% SDS. Signal was revealed by autoradiography.
Extraction of genomic and extrachromosomal DNA. Polymerase chain reaction.
Cells were washed twice with PBS and collected using PBS supplemented with 1 mM EDTA. Cells were centrifuged for 5 minutes at 3000g. Total nuclear DNA was extracted and purified using the Qiagen DNeasy™ tissue kit. Extrachromosomal DNA was extracted using a modified Hirt procedure as follows (Arad, 1998). The cell pellet was resuspended in 250 μl of buffer A (50 mM Tris-HCl pH 7.5/10mM EDTA/100 μg/ml RNAse A). Cells were incubated for 5 minutes at room temperature in 250 μl lysis buffer (1.2% w/v sodium dodecyl sulfate). Cellular debris and chromosomal DNA were precipitated 15 minutes on ice using 350 μl of buffer B (3M cesium chloride/1 M potassium acetate/0.67M glacial acetic acid). After centrifugation, DNA was purified from the supernatant by using a QIAquick™ kit (Qiagen). Concentration of DNA solutions was determined by fluorimetry. PCR was typically performed for 30-40 cycles on 100ng of DNA using the Expand™ enzyme mix (Roche Molecular, Laval, Canada), 25 pmoles of each primer, 2% (v/v) DMSO, 200μM dNTP in a 1x buffer supplied by the manufacturer.
B) Example 1: Partially optimized coding sequence for the Flp recombinase
The Flp recombinase was chosen for this and subsequent experiments.
This example illustrates the properties of a partially optimized coding sequence of the Flp recombinase. Analysis of the Flp coding sequence (SEQ ID NO:4) indicated that the 5' third of the sequence contained a number of codons rarely found in mammalian genes and presumably poorly translated in cells of mammalian origin. Most notably, 3 ATA and 5 TTA or CTA codons, encoding isoleucine (Ile) and leucine (Leu) respectively, are present in the yeast sequence but are the least preferred codons in mammals. Furthermore, the 5' third of the yeast Flp coding sequence is AT-rich (64%) and contains a putative site of transcription termination (AATAAA, position 220). Changes were therefore introduced in the first 345 base pairs of the Flp coding sequence to remove the putative sites of transcription termination and to replace ATA (Ile) and TTA or CTA (Leu) codons by ATC (most frequent Ile codon in mammalian cells) and CTG or CTC (most frequent Leu codons in mammalian cells), respectively. Furthermore, mutations were introduced to substitute a serine for proline at position 2, a serine for a leucine at position 33 and a serine for a leucine at position 108. It has been reported in prior art that these mutations enhance the thermostability of Flp (Buchholz et al., 1998). The Flp sequence was optimized by an oligonucleotide-mediated gene synthesis technique (see Materials and methods). Alignment of wild-type Flp (SEQ ID NOS:4 and 5) and optimized Flp sequences (referred to hereinafter as oFlp; SEQ ID NOS:6 and 7) is shown in Figure 4A.
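The codon substitutions described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used (which relied on oligonucleotide-mediated gene synthesis): the function names are hypothetical, CTG is arbitrarily chosen over CTC for leucine, the input is assumed to be an in-frame coding sequence, and the toy sequence is not the Flp sequence.

```python
# Rare yeast codons and the mammalian-preferred replacements named in the text.
# Choosing CTG (rather than CTC) for leucine is an assumption made for this sketch.
RARE_TO_PREFERRED = {"ATA": "ATC", "TTA": "CTG", "CTA": "CTG"}

def recode(cds: str, n_bases: int = 345) -> str:
    """Swap rare Ile/Leu codons within the first n_bases of an in-frame coding sequence."""
    cds = cds.upper()
    limit = min(n_bases, len(cds)) // 3 * 3  # stay on codon boundaries
    codons = [cds[i:i + 3] for i in range(0, limit, 3)]
    return "".join(RARE_TO_PREFERRED.get(c, c) for c in codons) + cds[limit:]

def at_content(seq: str) -> float:
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

def has_polya_signal(seq: str) -> bool:
    """Report whether the AATAAA motif is present anywhere in the sequence."""
    return "AATAAA" in seq.upper()

toy = "ATGATATTACTAGGC"  # Met-Ile-Leu-Leu-Gly, a toy example only
recoded = recode(toy)
print(recoded, round(at_content(toy), 2), round(at_content(recoded), 2))
```

Note that synonymous recoding of this kind lowers AT content as a side effect but does not by itself guarantee removal of an AATAAA motif, which is why a separate check such as `has_polya_signal` is useful.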
To assess the combined effect of codon optimization and thermostability-increasing mutations on the level of Flp that is produced, wild type Flp (SEQ ID NO:4) and oFlp (SEQ ID NO:6) sequences were inserted into CMV-based expression vectors to obtain RC33 and RC59, respectively (see Materials and Methods). Recombinant expression vectors (2 μg) were transfected into HEK293A cells by lipofection. Total RNA was extracted from cells 24 hours after transfection and nuclear protein extracts were prepared 48 hours after transfection to evaluate both the transcript and protein levels of the two forms of Flp. As shown by Northern analysis (Figure 4B), transcript levels of optimized Flp (lane 412) were approximately 20-fold higher than those of wild type Flp (lane 411), even though both were expressed from the same promoter/enhancer elements (derived from CMV) and transfection efficiencies were considered similar in both cases, as judged by the equal number of fluorescent cells after co-transfection of Flp-expressing vectors with a vector expressing green fluorescent protein as a marker. No Flp transcript is detected in cells transfected only with a vector expressing green fluorescent protein (lane 410). Flp protein levels were determined by Western analysis (Figure 4C). Using this technique, Flp is undetectable when expressed as a wild type coding sequence (lane 421) but a robust signal is detected when Flp is expressed as an optimized coding sequence (lane 422). As expected, no Flp is detected in cells transfected only with a vector expressing green fluorescent protein (lane 420). Taken together, these results indicate that both transcript and protein levels of Flp are greatly increased by expressing an optimized coding sequence.
The increased amount of Flp produced from an optimized coding sequence can be useful in a screening experiment. Indeed, if a regulatory element linked to a Flp coding sequence is activated by a given stimulus, then more Flp transcript and protein shall be produced if the regulatory element is linked to an optimized Flp coding sequence rather than to a wild type Flp coding sequence, as was shown for the CMV enhancer/promoter elements. This may help to achieve higher sensitivity, particularly if transcription from the regulatory element is weakly activated by the stimulus of interest.
C) Example 2: Plasmid-based expression vector, cell line in which the activity of oFlp is regulated in a specific manner, and methods to selectively recover recombined forms of plasmid-based expression vectors.
This example illustrates the various functionalities of a plasmid-based expression vector designed according to the present invention. It also shows that an expression vector can be selected after transfection in an engineered cell line, provided the vector expresses an exogenous nucleic acid capable of triggering Flp activity. Plasmid construction is schematized on Figure 5. Oligonucleotides 62-2 (SEQ ID NO:9) and 62-3 (SEQ ID NO:10) (500) were annealed and the protruding extremities were blunted by the Klenow fragment of DNA polymerase I (501). The resulting fragment was cloned in a SspI site of plasmid pQuantox™ (502). The resulting plasmid, RC1, was partially digested with HincII and completely with KpnI, the extremities were blunted and the plasmid was recircularized (503) to generate RC20a. A 953 bp fragment was amplified from plasmid pREP4 (504) using forward primer 72-2 (SEQ ID NO:11) and reverse primer 18-73 (SEQ ID NO:13) (505). This fragment was cloned in the unique EcoRV site of pBluescript II SK(+) to generate RC17 (506). A 358 bp BamHI fragment from RC20a (507) was cloned into the BamHI site of RC17 to generate RC22 (508). The unique NotI site of the latter plasmid was removed by digestion, fill-in and recircularization (509) to create RC24. A 1607 bp PvuII-HincII fragment (510) obtained by partial digestion of RC24 was ligated to a 4602 bp DraIII-BsmI fragment of pQBI25fc3 that had been blunted (511). The resulting vector is RC32. The transcription unit of RC32 can be said to comprise the cytomegalovirus immediate early gene enhancer/promoter regions followed by the coding sequence for green fluorescent protein (GFP) and a bovine growth hormone polyadenylation signal.
The recombinase substrate, composed of a recombination target sequence followed by a stuffer region and by another recombination target sequence, is inserted between the laci promoter derived from plasmid pBluescript II™ SK (+) and nt 419 to 1318 of pREP4, encoding residues 26 to 267 of neomycin phosphotransferase (GenBank™ accession number AAK28133). The XhoI site upstream of the CMV enhancer/promoter elements in RC32 was removed by partial digestion, fill-in and recircularization (512). Finally, annealed oligonucleotides 24-7 (SEQ ID NO:17) and 24-8 (SEQ ID NO:18) (513) were cloned in a ScaI site of the resulting plasmid to generate RC43 (SEQ ID NO:1) (514), which contains a unique SwaI site in the stuffer region. A map of RC43 is given in Figure 5C and Table 1 hereinafter.
Table 1: Map of RC43 (Fig 5C; SEQ ID NO:1)
We next established a system in which the activity of an oFlp recombinase was conditional on the transfection of an appropriate expression vector. We chose to place the expression of oFlp under the control of cis-acting regulatory elements recognized by the Gal4VP16 chimeric transcription factor ('UAS' sites). Gal4VP16 is composed of the DNA-binding domain of Gal4 and the transcriptional activation domain of VP16 (Sadowski et al., 1988). Binding of Gal4VP16 to its cognate element in the context of a minimal promoter activates transcription of a sequence operatively linked to the minimal promoter (Webster et al., 1988). We constructed a plasmid comprising oFlp operatively linked to a minimal promoter downstream of two copies of a consensus UAS site (Webster et al., 1988). This was done essentially as schematized on Figure 6. The minimal promoter of CMV was amplified from pcDNAH (Invitrogen) using forward primer 31-6 (SEQ ID NO:24) and reverse primer 72-3 (SEQ ID NO:23) (601). This fragment encompasses the first 36 nt upstream of the site of initiation of transcription followed by 55 nt of the 5' untranslated region of the Sindbis virus genome. It was cloned in an EcoRV site in pBluescript II KS(+) (602) to generate plasmid pKS-PCRCMV. Oligonucleotides 21-1 (SEQ ID NO:25) and 21-8 (SEQ ID NO:26) were phosphorylated, annealed and ligated using standard protocols. The multimers of double-stranded oligonucleotide were cloned in a BglII site of pKS-PCRCMV (603). A 131 bp EcoRI-MunI fragment was excised from the resulting plasmid and substituted for a 785 bp XhoI-NotI fragment in RC59 (604). The resulting plasmid is RC71. To obtain an expression vector for Gal4VP16, we replaced the GFP coding sequence by the one for Gal4VP16 in RC43 and obtained RC74.
To obtain a cell line having integrated one or more copies of RC71 in its genome, 3.5 million HEK293 cells were electroporated at 600V/cm with a mixture of 9μg of ScaI-linearized RC71 and 10μg of denatured salmon sperm DNA. Cells were plated and selection (2.5 μg/ml puromycin) was applied 24 hours later. The concentration of puromycin was reduced to 0.5μg/ml on day 4 and colonies were picked on day 11. Induction levels of oFlp expression were determined after transfection with RC74. Northern analysis revealed that oFlp mRNA levels were strongly induced in one subclone after transfection with RC74 (subclone 293-UASoFlp/6; Figure 7, lane 703). No induction was observed in another subclone (subclone 293-UASoFlp/10; Figure 7, lane 705). No Flp signal was detected in wild type HEK293 cells (Figure 7, lane 701) or after transfection of a control vector in either subclone (Figure 7, lanes 702 and 704).
To assess the activity of oFlp in the 293-UASoFlp/6 subclone, expression vectors (RC43 or RC74) were transfected in these cells, recovered after 48 hours and subjected to a selection procedure relying on reconstitution in the expression vector of a gene conferring resistance to kanamycin when transformed in E. coli only if the nucleic acid expressed by the vector is able to trigger the activity of oFlp. DNA (0.4μg) was transfected in approximately 200,000 cells by lipofection using the Effectene™ reagent (Qiagen Inc.) according to the manufacturer's recommendations. Extrachromosomal DNA was extracted from cells by a modified Hirt technique. 20ng of recovered DNA was transformed in E. coli DH10B cells by electroporation. A small aliquot of the transformation (1/1,000) was plated on plates containing 100μg/ml ampicillin to control for the efficiency of DNA recovery and electroporation. The remainder was plated on plates containing 50 μg/ml ampicillin and 20 μg/ml kanamycin. The number of colonies on plates containing kanamycin is a measure of the amount of expression vector that was recombined. In this example, transformation of DNA extracted from 293-UASoFlp/6 cells transfected with RC43 gave rise to 25 kanamycin-resistant colonies per 100 ampicillin-resistant colonies. Transformation of DNA extracted from 293-UASoFlp/6 cells transfected with RC74 gave rise to 1,000 kanamycin-resistant colonies per 100 ampicillin-resistant colonies.
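The colony counts above can be compared directly because both are already normalized to 100 ampicillin-resistant colonies. The following Python sketch simply computes the resulting enrichment; the function and variable names are hypothetical, and the only inputs are the two values reported in this example.

```python
def fold_enrichment(induced_kan_per_100_amp: float, control_kan_per_100_amp: float) -> float:
    """Ratio of normalized kanamycin-resistant colony counts (activator vs control vector)."""
    return induced_kan_per_100_amp / control_kan_per_100_amp

rc43_control = 25     # kanamycin-resistant colonies per 100 ampicillin-resistant colonies (RC43)
rc74_gal4vp16 = 1000  # same measure for the Gal4VP16 expression vector (RC74)

enrichment = fold_enrichment(rc74_gal4vp16, rc43_control)
print(f"RC74 yields {enrichment:.0f}-fold more kanamycin-resistant colonies than RC43")
```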
According to a preferred embodiment of the invention and as schematized on Figure 8A, site-specific recombination of an expression vector expressing an exogenous nucleic acid (205) results in the removal of a unique SwaI restriction site (R) located in the stuffer region (309). As will be shown below, a fragment can be specifically amplified from recombined expression vectors (802) using primers (803,804) flanking the recombination region and the exogenous nucleic acid. Expression vectors recovered after transfection in 293-UASoFlp/6 cells were therefore also analyzed by PCR. In this case, half of the DNA recovered by the modified Hirt technique was subjected to SwaI digestion in order to cut unrecombined molecules (801). Digestion was carried out at room temperature for 4 hours. The reaction was then purified by phenol extraction and DNA was precipitated with ethanol. One fifth of the precipitate was used in a PCR performed with 0.7U of Expand Mix DNA polymerase (Roche Molecular, Laval, Canada), 12.5 pmoles of 18-64 (SEQ ID NO:19) and 12.5 pmoles of 18-106 (SEQ ID NO:20) in 25 μl of 1x Expand buffer (Roche Molecular) supplemented with 2% (v/v) dimethyl sulfoxide and 200μM deoxynucleotides. Cycling conditions were 94°C/30 s, 52°C/30 s, 68°C/4 min for 25 cycles. One fifth of the reaction was analyzed by electrophoresis on a 1% (w/v) agarose gel. Results are shown on Figure 8B. A 3076 bp fragment (indicated by arrow 816) is obtained only when an expression vector for Gal4VP16 (RC74) is transfected (lane 814). By comparison, a 3462 bp fragment (indicated by arrow 817) is obtained after 15 cycles of PCR on RC74 before transfection (lane 811). No fragment is amplified from DNA extracted from untransfected 293-UASoFlp/6 cells (lane 812), from 293-UASoFlp/6 cells transfected with a control expression vector (RC43, lane 813) or from 293 WT cells transfected with RC74 (lane 815).
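The logic of this selection (excision of the stuffer removes the restriction site, so only recombined templates survive digestion and give the shorter amplicon) can be illustrated with a toy string model in Python. This is a sketch under stated assumptions: the 34 bp sequence is the commonly cited minimal FRT site and may differ from the target sequences actually present in RC43 or RC74, ATTTAAAT is the SwaI recognition sequence, and the flanks and stuffer are invented placeholders.

```python
FRT = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"  # commonly cited minimal FRT site (assumed here)
SWAI = "ATTTAAAT"                            # SwaI recognition sequence

def excise(template: str, site: str = FRT) -> str:
    """Model recombination between two direct repeats: drop one repeat plus the stuffer."""
    first = template.find(site)
    second = template.find(site, first + len(site)) if first != -1 else -1
    if second == -1:
        return template  # fewer than two sites: nothing to excise
    return template[:first] + template[second:]

def survives_swai(template: str) -> bool:
    """A template remains amplifiable only if it carries no SwaI site between the primers."""
    return SWAI not in template

stuffer = "GC" * 10 + SWAI + "CG" * 10              # placeholder stuffer with a unique SwaI site
unrecombined = "ACGTAC" + FRT + stuffer + FRT + "TGCATG"
recombined = excise(unrecombined)

print(len(unrecombined) - len(recombined))          # one FRT plus the stuffer is lost
print(survives_swai(unrecombined), survives_swai(recombined))
print(3462 - 3076)                                  # size difference reported for RC74 amplicons
```

In the experiment above, the same reasoning accounts for the 386 bp difference between the 3462 bp amplicon from untransfected RC74 and the 3076 bp amplicon from recombined vector.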
D) Example 3: Viral-based expression vector and method to selectively recover its recombined form after infection of cells constitutively expressing an active form of oFlp.
This example illustrates the various functionalities of an adenovirus-based expression vector designed according to the present invention. Plasmid construction is schematized on Figure 9. Annealed oligonucleotides 19-6 (SEQ ID NO:13) and 19-7 (SEQ ID NO:14) were cloned in the 5417 bp fragment of HindIII/NotI-digested vector pQBIAdBN after extremities were blunted using the Klenow fragment of DNA polymerase I (900). Annealed oligonucleotides 22-5 (SEQ ID NO:15) and 22-6 (SEQ ID NO:16) were cloned in the unique ClaI site of RC44 to introduce a unique PmeI site in the resulting plasmid RC46 (901). Finally, a 4112 bp SalI fragment from RC43 (see Example 2) was cloned in the XhoI site of RC46 (902). The resulting viral-based expression vector (RC49-2; SEQ ID NO:2) comprises a transcription unit and recombinase substrate flanked by nucleotides 1-102 and nucleotides 3334-5779 of the Adenovirus serotype 5 genome (GenBank accession number 9626187). A map of RC49-2 is given in Figure 9B and Table 2 hereinafter.
Table 2: Map of RC49-2 (Fig 9B; SEQ ID NO:2)
The transcription unit and recombinase substrate of RC49-2 were incorporated in an adenoviral genome by in vivo homologous recombination. This was done by co-transfecting 5 μg of PmeI-linearized RC49-2 with 5 μg of AdCMVlacZΔE1/ΔE3, a replication-defective genome obtained commercially (Quantum Biotechnologies, Montreal, Canada). Co-transfection of DNA molecules in HEK293 cells was carried out by means of a calcium phosphate precipitate using standard protocols. Two days post-transfection, cells were overlaid with medium containing 1.25% (w/v) low melting agarose. Recombinant viral genome resulting from homologous recombination between RC49-2 and the replication-defective adenoviral genome can be propagated in HEK293 cells, as indicated by the presence of viral plaques composed of GFP-expressing cells. A stock of viral particles was obtained after 2 successive rounds of plaque-purification according to standard protocols. In order to test the adenoviral-based expression vectors, we obtained a cell line constitutively expressing an active oFlp. This was done by integrating a plasmid, designated RC59, in the genome of Hela cells. RC59 contains two distinct transcription units: one expressing the optimized Flp sequence from the CMV enhancer/promoter elements and the other conferring resistance to puromycin to stably transfected cells. RC59 was linearized at a unique ScaI restriction site to facilitate integration in the host genome. The linearized vector (10 μg) was electroporated in 5 million Hela cells. Electroporated cells were plated in five 100mm petri dishes and allowed to recover for 24 hours, at which time puromycin (1 μg/ml) was added to the medium. After 4 days of selective pressure, the concentration of puromycin was decreased 10-fold to allow growth of cell colonies. Isolated colonies (20-100 cells) were picked 10 days later. Clones were tested for expression of optimized Flp coding sequence by Northern analysis.
Expressing clones were cloned again by limiting dilution to ensure that the population of Flp-expressing cells is monoclonal. Figure 10 shows the Flp transcript levels in 4 subclones, Hela/oFlp2-3 (lane 1002), Hela/oFlp3-2 (lane 1003), Hela/oFlp6-1 (lane 1004) and Hela/oFlp6-2 (lane 1005). Wild type Hela cells do not express Flp (lane 1001). It is well known that expression of a transgene can vary from one subclone to another, depending for example on the number of copies of the transgene and/or its site of integration.
To verify that the viral-based expression vector contained within these viral particles could be recombined by the Flp recombinase, Hela/oFlp6-2, Hela/oFlp2-3 and wild type Hela cells were infected at a multiplicity of infection (m.o.i.) of 50. At this m.o.i., expression of GFP was seen in over 95% of cells. 48 hours after infection, genomic DNA was extracted and a fragment was amplified using primers located on either side of the site of recombination (18-64V (SEQ ID NO:19) and 18-106V (SEQ ID NO:20)) as described in the Materials and Methods section. Analysis of the PCR products by agarose gel electrophoresis (Figure 11A) revealed that about 50% of the viral-based expression vectors were recombined in Hela/oFlp6-2 cells (lane 1102). The expected sizes of the amplicons are 3544 bp from the non recombined viral-based expression vector (1105) and 3182 bp from the recombined viral-based expression vector (1106). Because recombination leads to removal of a unique SwaI site from the viral-based expression vector, the signal arising from non recombined viral-based expression vectors can be eliminated by SwaI digestion. Genomic DNA (1.5 μg) was therefore digested with 20 U of SwaI for 6 hours at 22°C, purified by phenol extraction and ethanol precipitation and subjected to PCR as described above. Consistent with the fact that a DNA molecule cleaved between primers can no longer serve as template in PCR, amplicons from non recombined viral-based vectors (1105) are no longer detected after digestion of genomic DNA from infected cells with SwaI (lane 1103). However, amplicons from recombined viral-based vectors (1106) are still readily detected after SwaI digestion of genomic DNA extracted from Hela/oFlp6-2 cells infected with viral-based expression vectors (lane 1103).
Interestingly, no recombined viral-based expression vector was detected after infection of Hela/oFlp2-3 cells (lane 1104), showing that the level of Flp expression in these cells is not sufficient to mediate site-specific recombination. We can therefore hypothesize that some leakage of Flp expression from regulatory elements, i.e. low levels of expression in the absence of activators, will not give rise to high background in the context of a screening experiment.
The following was performed in order to mimic a screening experiment. Decreasing numbers of optimized Flp-expressing Hela cells (Hela/oFlp6-2: 15,000 cells; 150 cells) were mixed with wild type non-expressing cells (WT: 135,000 cells; 149,850 cells). The resulting cell population was infected with the adenoviral-based vector expressing GFP described above at an m.o.i. of 50. Genomic and viral DNA were extracted 48 hours after infection. Two micrograms of DNA was digested with 20U of SwaI for 6 hours at 37°C before 1/10 of the digested DNA was subjected to 40 cycles of PCR as described above. Figure 11B shows the analysis of the PCR products by agarose gel electrophoresis. Shorter amplicons from recombined viral-based expression vectors are readily detected when the infected cell population is composed of 15,000 Hela/oFlp6-2 cells mixed with 135,000 WT cells (lane 1111). No amplicon is detected when the infected cell population consists of 150 Hela/oFlp6-2 cells mixed with 149,850 WT cells (lane 1113). To ascertain the presence of viral DNA in the tested cell populations, PCR was performed on DNA that had not been digested by SwaI. As expected, longer fragments derived from non recombined expression vectors are easily detected in both cases (lanes 1110 and 1112). To increase the sensitivity of the PCR detection method when performed on DNA digested by SwaI, a semi-nested PCR was performed on 1/25 of the initial PCR using primers 18-112V (SEQ ID NO:21) and 18-106V (SEQ ID NO:20). As can be seen on Figure 11C, a fragment amplified from recombined viral-based expression vectors (expected size 838 bp, label 1124) was detected after semi-nested PCR on DNA extracted from both cell populations (lanes 1120, 1121).
Semi-nested PCR on DNA extracted from an infected cell population consisting of 150 Hela/oFlp6-2 cells mixed with 149,850 WT cells (lane 1121) reveals a fragment from non recombined DNA molecules (expected size 1201 bp, label 1123), presumably due to the incomplete digestion of the viral DNA by SwaI. Taken together, these results indicate that our method is sensitive enough to detect recombined viral-based expression vector when only 0.1% of an infected cell population expresses the oFlp recombinase.
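The 0.1% figure follows directly from the cell numbers used in the two mixing experiments. A minimal Python check is given below; the function name is hypothetical and the inputs are the cell counts stated in this example.

```python
def expressing_fraction(expressing: int, wild_type: int) -> float:
    """Fraction of oFlp-expressing cells in a mixed population."""
    return expressing / (expressing + wild_type)

# (Hela/oFlp6-2 cells, wild type cells) for the two mixtures described above.
mixes = [(15_000, 135_000), (150, 149_850)]

for expressing, wild_type in mixes:
    fraction = expressing_fraction(expressing, wild_type)
    print(f"{expressing} of {expressing + wild_type} cells: {fraction:.1%}")
```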
E) Example 4: Production of Sindbis viral particles dependent on site-specific recombination
This example illustrates how a cDNA copy of the Sindbis virus genome can be engineered such that replication and packaging is dependent on the activity of the Flp recombinase. Plasmid construction is schematized on Figure 12. Sindbis virus genome is a positive-strand RNA molecule. Because Flp acts on DNA substrate, we constructed a CMV-based vector expressing a cDNA derived from the Sindbis virus genome as follows. A 4415 bp BamHI-XhoI fragment from DH-BB comprising the Sindbis virus structural proteins coding sequence (SP) was subcloned into a 9241 bp BamHI-XhoI fragment of plasmid pSinRep5 to generate VB220 (1200). A 1230 bp fragment was amplified from pcDNA1.1/Amp by 25 cycles of PCR using high-fidelity Vent DNA polymerase and primers 18-100 (SEQ ID NO:22) and 72-3 (SEQ ID NO:23) (1201). This fragment comprises the first nucleotide of the Sindbis virus genome positioned at the putative site of initiation of transcription of the cytomegalovirus immediate early enhancer/promoter elements (CMV). The PCR product was digested by HincII and MunI (1202) and a 639 bp fragment was inserted into a 13529 bp fragment of VB220 resulting from digestion by SacI, blunting of the extremities by T4 DNA polymerase followed by partial MunI digestion (1203). The resulting plasmid is called VB233b. A 714 bp fragment comprising signals for transcription stop and polyadenylation of transcripts was obtained by digestion of pcDNA1.1/Amp with SphI and NcoI (1204). After blunting its extremities with T4 DNA polymerase, the fragment was cloned in the blunted XhoI site of VB233b to generate VB250b (1205). A 449 bp fragment comprising a recombinase substrate was amplified from RC24 (see Example 2) using primers 18-112 (SEQ ID NO:21) and 20-22 (SEQ ID NO:27) and high fidelity Vent DNA polymerase (1206). This fragment was cloned in a partially-digested VB250b at a MunI site located in the 5' untranslated region of the cDNA copy of the Sindbis viral genome (1207). The resulting vector is VB271b. Finally, a GFP coding sequence linked to promoter and enhancer sequences derived from the Rous sarcoma virus long terminal repeat (RSV LTR) was introduced between the structural proteins coding sequence and the 3' untranslated region of the viral genome. Sequences from RSV LTR were chosen because they are strongly active in a wide variety of cell types (Gorman et al., 1982). Cloning was performed as follows. The GFP coding sequence was amplified (1208) from pQBIfc3™ (Quantum Biotechnologies, Montreal, Canada) and cloned (1209) into a 4005 bp EcoRV fragment of pRcRSV (Invitrogen, Carlsbad, Ca.) to generate VB288b.
A fragment comprising the RSV LTR and the GFP coding sequence was excised from VB288b (1210) and inserted at the blunted ApaI site of VB271b. The resulting expression vector is RC77 (SEQ ID NO:3). A map of RC77 is given in Figure 12C and Table 3 hereinafter.
Table 3: Map of RC77 (Fig 12C; SEQ ID NO:3)
To verify that this vector could express an exogenous nucleic acid (i.e. GFP in this case), 500ng of RC77 was transfected in HEK293A cells by lipofection using the Effectene™ reagent (Qiagen Inc.). Forty-eight hours after transfection, cells were fixed with 4% paraformaldehyde and observed by fluorescence microscopy. As shown on Figure 13A, expression of GFP is detectable in approximately 2-5% of cells (image 1301). This result indicates that the transcription unit embedded in the cDNA copy of a disrupted Sindbis virus genome is active. To further verify that RC77 could lead to production of viral particles when recombined, 350ng of expression vector RC77 was transfected in HEK293A cells by lipofection with either 650ng of a vector expressing an optimized Flp coding sequence (RC59) or 650ng of a control vector expressing luciferase (VB35). Forty-eight hours after transfection, cells were fixed with 4% (v/v) paraformaldehyde and processed for anti-C protein immunofluorescence. Only cells co-transfected with RC77 and RC59 showed expression of this viral protein (Figure 13A, image 1304), presumably due to recombination of the expression vector, excision of the disruptive recombinase substrate and subsequent production of viral particles. By assaying for C protein immunoreactivity in monolayers of BHK-21 fibroblasts incubated with culture medium from co-transfected cells, we found that the expression vector RC77 gives rise to infectious viral particles when co-transfected with RC59 (Figure 13A, image 1305) but not when co-transfected with VB35 (Figure 13A, image 1303). Our interpretation was further confirmed by RT-PCR analysis of total RNA (500ng) extracted from co-transfected cells and using primers flanking the disruptive recombinase substrate (20-22V (SEQ ID NO:27) and 18-101V (SEQ ID NO:28)). Reaction was performed using the Titan™ one-step RT-PCR (Roche Molecular) as follows: 50°C, 30 min; 94°C, 2 min; 25 cycles of 94°C, 30 s, 54°C, 30 s and 68°C, 30 s.
No product is amplified from RNA extracted from untransfected BHK-21 cells (Figure 13B, lane 1311). A 470 bp fragment (Figure 13B, arrow 1315) is amplified from RC77 plasmid DNA (Figure 13B, lane 1312) and from total RNA extracted from cells co-transfected with RC77 and either VB35 (Figure 13B, lane 1313) or RC59 (Figure 13B, lane 1314). However, an additional 108 bp fragment is specifically amplified from cells co-transfected with RC77 and RC59 (Figure 13B, arrow 1316). The difference in the size of the amplicon is due to excision of the disruptive recombinase substrate.
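The two band sizes imply that excision removes 470 − 108 = 362 bp between the primers. The toy model below illustrates that arithmetic with placeholder sequences. Only the 470 bp and 108 bp amplicon sizes come from the text; the 34-letter target site, the flank lengths and the stuffer length are hypothetical values chosen so that the totals match, and the model assumes that excision between two directly repeated target sites leaves a single site behind.

```python
def excise(template: str, site: str) -> str:
    """Delete the segment between two directly repeated copies of `site`,
    keeping one copy of the site in the product."""
    first = template.find(site)
    if first == -1:
        return template  # no target site: nothing to excise
    second = template.find(site, first + len(site))
    if second == -1:
        return template  # only one target site: nothing to excise
    return template[:first] + template[second:]


# Placeholder letters stand in for real sequence; all lengths are illustrative.
SITE = "F" * 34          # recombination target sequence (hypothetical length)
LEFT_FLANK = "a" * 37    # primer-to-site region upstream (hypothetical length)
STUFFER = "s" * 328      # stuffer region (hypothetical length)
RIGHT_FLANK = "b" * 37   # site-to-primer region downstream (hypothetical length)

unrecombined = LEFT_FLANK + SITE + STUFFER + SITE + RIGHT_FLANK
recombined = excise(unrecombined, SITE)

if __name__ == "__main__":
    print(len(unrecombined), len(recombined), len(unrecombined) - len(recombined))
```

In this model the excised segment is one target site plus the stuffer, so its length equals the difference between the two amplicon sizes.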
Taken together, these results indicate that RC77 can be specifically recombined by oFlp and that removal of a stuffer region in the 5' untranslated portion of a cDNA copy of the Sindbis virus genome can restore production of infectious viral particles.
Example 5: Screening in a transgenic animal using virus-based expression vector.
This hypothetical example illustrates the design of a screening conducted in a transgenic animal to identify cDNAs encoding dominant activators of osteoblast differentiation. Of course, this example could easily be adapted and used for identifying and selecting cDNAs encoding other types of genes. According to this example, the gene delivery vector that is used is an adenovirus particle containing a genome engineered as described in Example 3. A library of adenoviral particles is constructed as outlined in section iv) starting from RNAs extracted from osteoblasts undergoing differentiation. The screening host is a transgenic mouse obtained as follows. The oFlp coding sequence and the 3' untranslated region of the bovine growth hormone cDNA are operatively linked to a fragment encompassing 1.7 kb found immediately upstream of the site of transcription initiation of the mouse osteocalcin gene. Osteocalcin is a well-known marker of osteoblast differentiation whose expression is controlled by this 1.7 kb cell-specific regulatory fragment. The resulting construct is injected into a pseudo-fertilized egg to obtain lines of transgenic mice according to standard protocols. Preparations of total RNA extracted from various tissues of heterozygous animals are tested by Northern analysis to ensure that oFlp expression is restricted to the differentiated osteoblast, as is the endogenous osteocalcin gene. Transgenic animals are then injected intraperitoneally with 10^8 plaque-forming units of the adenovirus-based expression vector library. At various times after injection (typically 2 to 10 days), animals are sacrificed and recombined expression vectors are detected, if any, as in Example 3, from muscle and adipose tissue. These tissues are selected because it is believed that they harbor cells that can differentiate into osteoblasts given the proper stimulus. According to the design of the screen, exogenous nucleic acids (cDNAs) comprised in recombined expression vectors so produced in the mouse have the capacity to activate oFlp transcription from the osteocalcin regulatory fragment and may therefore be hypothesized to be dominant activators of osteoblast differentiation.
Example 6: In vivo screening for regulatory elements
This hypothetical example illustrates the design of a screening conducted in a mouse to identify fragments of mouse genomic DNA that can confer tissue-specific expression to the linked oFlp coding sequence.
According to this example, the expression vector that is used comprises a transcription unit devoid of regulatory elements, as depicted on Figure 2B and a recombinase substrate constructed as depicted on Figure 3A. Fragments of mouse genomic DNA are obtained by partial digestion with Sau3A and cloned in the vector upstream of the oFlp coding sequence. The library of vectors is transfected in vivo by injection of lipid:DNA complexes formed using commercially available transfection reagents. At various times after transfection in vivo, animals are sacrificed, extrachromosomal DNA is extracted from various tissues and digested with the enzyme cutting in the stuffer region of the recombinase substrate. Exogenous nucleic acids from uncleaved recombined vectors are amplified by PCR using a primer located upstream of their site of insertion and another downstream of the recombinase substrate. Exogenous nucleic acids are reinserted in the vector and subjected to another round of in vivo screening. Exogenous nucleic acids that are finally retrieved correspond to genomic DNA fragments capable of activating the expression of oFlp in the tissue from which it was retrieved. By comparing sequences of fragments retrieved from different tissues, it is therefore possible to identify genomic fragments whose transcriptional activity is tissue-restricted.
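The selection logic of this example can be sketched as a small bookkeeping model. It is a simplified illustration and not a description of the actual protocol: the `Vector` record, the fragment labels and the tissue names are hypothetical, digestion is treated as all-or-nothing, and a fragment is called tissue-restricted here when it is recovered from exactly one tissue, which is only one possible working definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    insert_id: str     # hypothetical label for a cloned genomic fragment
    has_stuffer: bool  # False once the recombinase substrate has been excised


def digest_survivors(pool):
    """Vectors left uncut by the enzyme whose site lies in the stuffer region."""
    return [v for v in pool if not v.has_stuffer]


def recovered_inserts(pool):
    """Inserts that PCR across the insertion site and substrate could amplify."""
    return {v.insert_id for v in digest_survivors(pool)}


def tissue_restricted(pools_by_tissue):
    """Map each insert recovered from exactly one tissue to that tissue."""
    tissues_by_insert = {}
    for tissue, pool in pools_by_tissue.items():
        for insert_id in recovered_inserts(pool):
            tissues_by_insert.setdefault(insert_id, set()).add(tissue)
    return {
        insert_id: next(iter(tissues))
        for insert_id, tissues in tissues_by_insert.items()
        if len(tissues) == 1
    }


# Made-up pools of extrachromosomal DNA recovered from two tissues.
pools = {
    "liver": [Vector("frag_A", False), Vector("frag_B", True), Vector("frag_C", False)],
    "muscle": [Vector("frag_A", True), Vector("frag_B", True), Vector("frag_C", False)],
}

if __name__ == "__main__":
    print(tissue_restricted(pools))
```

In this made-up data set, frag_A is recovered only from liver, frag_C from both tissues and frag_B from neither, so only frag_A is reported as tissue-restricted.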
REFERENCES
1- Angrand, P.-O., et al. (1998), Nucl. Acid. Res., 26, 3263-3269.
2- Arad, U. (1998), Biotechniques, 24, 761-762.
3- Baker, A.R., et al. (1992), Mol. Cell Biol., 12, 5541-5547.
4- Berglund, P., et al. (1993), Bio/Technology, 11, 916-920.
5- Buchholz, F., et al. (1998), Nat. Biotech., 16, 657-662.
6- Diaz, V., et al. (1999), J. Biol. Chem., 274, 6634-6640.
7- Ding, Y., et al. (1997), J. Biol. Chem., 272, 28142-28148.
8- Gorman, C.M., et al. (1982), Proc. Natl. Acad. Sci. U.S.A., 79, 6777-6781.
9- Jayaram, M. (1986), Proc. Natl. Acad. Sci. U.S.A., 82, 5875-5879.
10- Logie, C. and Stewart, A.F. (1995), Proc. Natl. Acad. Sci. U.S.A., 92, 5940-5944.
11- Metzger, D. and Feil, R. (1999), Curr. Opin. Biotech., 10, 470-476.
12- Mittal, S., et al. (1993), Virus Res., 28, 67-90.
13- O'Gorman, S., et al. (1991), Science, 251, 1351-1355.
14- Pear, W.S., et al. (1993), Proc. Natl. Acad. Sci. U.S.A., 90, 8392-8396.
15- Ragot, T., et al. (1998), Meth. Cell Biol., 52, 229-260.
16- Sadowski, I., et al. (1988), Nature, 335, 563-4.
17- Sauer, B. and Henderson, N. (1988), Proc. Natl. Acad. Sci. U.S.A., 85, 5166-5170.
18- Tamai, K.T., et al. (1997), Recent Prog. Horm. Res., 52, 121-139.
19- Therrien, M. and Drouin, J. (1991), Mol. Cell Biol., 11, 3492-3503.
20- Webster, N., et al. (1988), Cell, 52, 169.
21- Wrana, J.L. and Attisano, L. (2000), Cytokine Growth Factor Rev., 11, 5-13.
While several embodiments of the invention have been described, it will be understood that the present invention is capable of further modifications, and this application is intended to cover any variations, uses, or adaptations of the invention, following in general the principles of the invention and including such departures from the present disclosure as to come within knowledge or customary practice in the art to which the invention pertains, and as may be applied to the essential features hereinbefore set forth and falling within the scope of the invention or the limits of the appended claims.

Claims

1. A vector for expressing an exogenous nucleic acid in eukaryotic cells, said vector comprising a nucleic acid sequence excisable by site-specific recombination.
2. An expression vector comprising a nucleic acid sequence, said sequence comprising a recombinase substrate and a transcription unit.
3. The expression vector of claim 2, wherein said recombinase substrate comprises a stuffer region flanked by recombination target sequences.
4. The expression vector of claim 3, wherein said stuffer region is removable by site-specific recombination.
5. The vector of claim 4, wherein the stuffer region comprises a restriction site.
6. The expression vector of any one of claims 2 to 5, wherein said transcription unit comprises an enhancer sequence, a promoter sequence and a termination sequence operatively linked together.
7. The expression vector of any one of claims 2 to 6, wherein said nucleic acid sequence comprises at least two fragments of a viral genome for packaging said vector or a fragment thereof into infectious viral particles.
8. The expression vector of claim 7, wherein the recombinase substrate and the transcription unit are located between said at least two fragments.
9. The vector of claim 7 or 8, wherein said at least two fragments derive from a retrovirus or an adenovirus.
10. The expression vector of any one of claims 2 to 9, comprising a nucleic acid sequence encoding an inactive gene conferring resistance to an antibiotic in bacteria, and wherein activity of said inactive gene is restorable by site-specific recombination of the recombinase substrate.
11. A vector comprising: i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a recombinase encoded by said site-specific recombinase coding sequence.
12. The expression vector of claim 11, wherein said recombinase substrate comprises a stuffer region flanked by recombination target sequences.
13. The expression vector of claim 12, wherein said stuffer region is removable by site-specific recombination.
14. The vector of claim 13, wherein the stuffer region comprises a restriction site.
15. The expression vector of any one of claims 11 to 14, wherein said nucleic acid sequence comprises at least two fragments of a viral genome for packaging said vector or a fragment thereof into infectious viral particles.
16. The expression vector of claim 15, wherein the recombinase substrate is located between said at least two fragments.
17. The vector of claim 15 or 16, wherein said at least two fragments derive from a retrovirus or an adenovirus.
18. The expression vector of any one of claims 11 to 17, comprising a nucleic acid sequence encoding an inactive gene conferring resistance to an antibiotic in bacteria, and wherein activity of said inactive gene is restorable by site-specific recombination of the recombinase substrate.
19. An expression vector comprising a nucleic acid sequence, said nucleic acid sequence comprising a recombinase substrate and a transcription unit incorporated into a viral genome.
20. The expression vector of claim 19, wherein said recombinase substrate comprises a stuffer region excisable by site-specific recombination, and wherein formation of viral particles is dependent upon excision of said stuffer region.
21. The vector of claim 19 or 20, wherein said viral genome consists of a cDNA copy of an alphaviral genome, and presence of the stuffer region blocks translation of viral proteins encoded by said cDNA copy of the alphaviral genome.
22. The vector of claim 21, wherein said recombinase substrate is present in a 5' untranslated region of the cDNA copy of the alphaviral genome.
23. The vector of claim 21 or 22, wherein said cDNA copy of the alphaviral genome derives from Sindbis virus genome or from Semliki Forest virus genome.
24. An expression vector comprising a sequence selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 2, and SEQ ID NO: 3.
25. A eukaryotic cell line comprising an expressible site-specific recombinase coding sequence, wherein said expressible site-specific recombinase coding sequence is operatively linked to a minimal promoter and to at least one cis-acting regulatory element.
26. The cell line of claim 25, wherein said site-specific recombinase is expressed upon activation of said at least one cis-acting regulatory element.
27. The cell line of claim 25 or 26, wherein said cis-acting regulatory element is activatable by elevation of intracellular cAMP or cGMP levels, elevation of intracellular calcium concentration, and/or change in phosphorylation state of proteins.
28. The cell line of claim 25 or 26, wherein said cis-acting regulatory element is activated during differentiation of mesenchymal stem cells into bone, cartilage, adipocytes or myoblasts.
29. The cell line of any one of claims 26 to 28, wherein said site-specific recombinase coding sequence is optimized for enhanced synthesis, stability or translation in eukaryotic cells.
30. The cell line of any one of claims 25 to 28, wherein said expressible site-specific recombinase coding sequence consists of a site-specific recombinase coding sequence selected from the group consisting of Flp from Saccharomyces cerevisiae, Cre from bacteriophage P1 and β-recombinase from Bacillus subtilis.
31. The cell line of claim 30, wherein said site-specific recombinase coding sequence comprises SEQ ID NO: 5, SEQ ID NO: 6, or a functional homologue thereof.
32. Use of a vector as defined in any one of claims 1 to 24, or a cell line as defined in any one of claims 25 to 31, for identifying or selecting an exogenous nucleic acid having a desired feature.
33. A method for identifying nucleic acids encoding a desired feature from a library of exogenous nucleic acids, wherein a plurality of nucleic acids from said library are inserted into a plurality of vectors, said vectors comprising a nucleic acid sequence excisable by site-specific recombination.
34. The method of claim 33, wherein said vectors are inserted into a eukaryotic cell line or into a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable.
35. The method of claim 34, wherein the activity of said inactive site-specific recombinase is restored upon expression by said vector of an exogenous nucleic acid having said desired feature.
36. The method of claim 36, wherein said activated site-specific recombinase excises a fragment of the nucleic acid sequence of said expression vectors, thereby forming recombined expression vectors comprising a nucleic acid having said desired feature.
37. The method of any one of claims 34 to 36, wherein said inactive site-specific recombinase is inactive due to a lack of sufficient expression or due to sequestration outside of the cell nucleus.
38. The method of any one of claims 34 to 37, wherein said nucleic acid having a desired feature is selected from the group consisting of nucleic acids encoding transcription factors, nucleic acids encoding proteins involved in signal transduction pathways, and nucleic acids encoding proteins involved in cell metabolism or differentiation state.
39. The method of claim 36, wherein said vectors are inserted into a suitable eukaryotic host, and wherein said vectors encode and express a site-specific recombinase.
40. A method for screening exogenous nucleic acids having a desired feature within eukaryotic cells, said method comprising the steps of:
a) providing a plurality of expression vectors each capable, when present in a suitable host, of expressing an exogenous nucleic acid inserted therein, said vectors comprising a nucleic acid sequence excisable by site-specific recombination;
b) providing a cell line or a transgenic animal comprising a nucleic acid encoding an inactive site-specific recombinase whose activity is restorable;
c) inserting at least one exogenous nucleic acid from a library of nucleic acids into a plurality of said expression vectors, to provide a library of recombinant expression vectors;
d) introducing, into cells of the cell line or of the transgenic animal of step (b), a plurality of recombinant expression vectors from the library obtained at step (c);
e) allowing the recombinant expression vectors introduced at step (d) to express the exogenous nucleic acid inserted therein, wherein only exogenous nucleic acids encoding said desired feature are capable of restoring the activity of the site-specific recombinase of step (b);
f) allowing the site-specific recombinase whose activity has been restored in step (e) to excise said excisable nucleic acid sequence from recombinant expression vectors which have expressed an exogenous nucleic acid having restored the activity of the site-specific recombinase;
g) recovering recombinant expression vectors from cells of said cell line or transgenic animal; and
h) selecting recombined expression vectors having undergone site-specific recombination at step (f), said recombined vectors containing an exogenous nucleic acid encoding the desired feature.
41. A method for screening exogenous nucleic acids having a transcriptional activity within eukaryotic cells, said method comprising the steps of:
a) providing a vector comprising: i) a site-specific recombinase coding sequence operatively linked to a termination sequence; and ii) a recombinase substrate excisable specifically by a site-specific recombinase encoded by said site-specific recombinase coding sequence;
b) inserting into a plurality of vectors as defined at step (a) at least one exogenous nucleic acid taken from a library of exogenous nucleic acids in order to provide a library of recombinant vectors;
c) inserting a plurality of recombinant vectors from the library obtained at step (b) into a suitable eukaryotic host;
d) allowing the exogenous nucleic acid inserted at step (b) to activate transcription of the site-specific recombinase coding sequence which is comprised in the vector, thereby producing said site-specific recombinase;
e) allowing the site-specific recombinase so produced to excise the recombinase substrate in the recombinant vector harboring the exogenous nucleic acid having activated the transcription of the site-specific recombinase;
f) following step e), recovering a plurality of recombinant vectors from said eukaryotic host; and
g) selecting recombinant vectors having undergone site-specific recombination, most of these vectors containing an exogenous nucleic acid having transcriptional activity.
42. The method of claim 40 or 41, wherein the nucleic acid sequence of said vector encodes an inactive gene conferring resistance to an antibiotic in bacteria, and wherein activity of said inactive gene is restored by site-specific recombination of said nucleic acid sequence.
43. The method of claim 42, wherein the step of selecting recombined expression vectors having undergone site-specific recombination comprises the steps of:
i) extracting DNA from cells into which the expression vectors have been introduced;
ii) transforming bacteria with DNA extracted at step (i);
iii) growing bacteria transformed at step (ii) in presence of said antibiotic; and
iv) selecting bacterial colonies resistant to said antibiotic;
whereby said resistant bacterial colonies comprise excised expression vectors having undergone site-specific recombination.
44. The method of claim 43, further comprising the steps of: v) extracting expression vectors from colonies selected at step (iv); and vi) identifying an exogenous nucleic acid found in said extracted vectors.
45. The method of claim 40 or 41, wherein the nucleic acid sequence of said vector comprises a recombinase substrate having a stuffer region flanked by recombination target sequences, and wherein said stuffer region comprises a cleavable restriction site.
46. The method of claim 45, wherein the step of selecting recombined expression vectors having undergone site-specific recombination comprises the steps of: i. extracting DNA from cells into which the expression vectors have been introduced; and ii. contacting DNA extracted at step a) with a restriction enzyme recognizing said cleavable restriction site; whereby recombined expression vectors are not cleaved by said restriction enzyme, and whereby unrecombined expression vectors are cleaved by said restriction enzyme.
47. The method of claim 46, further comprising the step of degrading DNA fragments cleaved by said restriction enzyme with an exonuclease.
48. The method of claim 46, further comprising the step of amplifying a DNA fragment from the expression vectors, said fragment comprising said exogenous nucleic acid.
49. A screening kit comprising:
1) a vector as defined in any one of claims 1 to 24; or 2) a cell line as defined in any one of claims 25 to 31; and at least one further element selected from the group consisting of instructions for using said kit, reaction buffer(s), enzyme(s), probe(s) and pool(s) of nucleotide molecules to be screened.
EP02744992A 2001-06-28 2002-06-28 Methods, vectors, cell lines and kits for selecting nucleic acids having a desired feature Withdrawn EP1402020A2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US30114901P 2001-06-28 2001-06-28
US301149P 2001-06-28
PCT/CA2002/000997 WO2003002735A2 (en) 2001-06-28 2002-06-28 Methods, vectors, cell lines and kits for selecting nucleic acids having a desired feature

Publications (1)

Publication Number Publication Date
EP1402020A2 true EP1402020A2 (en) 2004-03-31

Family

ID=23162156

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02744992A Withdrawn EP1402020A2 (en) 2001-06-28 2002-06-28 Methods, vectors, cell lines and kits for selecting nucleic acids having a desired feature

Country Status (3)

Country Link
EP (1) EP1402020A2 (en)
CA (1) CA2451957A1 (en)
WO (1) WO2003002735A2 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5919676A (en) * 1993-06-24 1999-07-06 Advec, Inc. Adenoviral vector system comprising Cre-loxP recombination
JP4216350B2 (en) * 1994-09-19 2009-01-28 大日本住友製薬株式会社 Recombinant DNA viral vector for animal cell infection
US6830885B1 (en) * 2000-08-18 2004-12-14 Phenogene Therapeutiques Inc. Nucleic acid molecule, method and kit for selecting a nucleic acid having a desired feature

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO03002735A3 *

Also Published As

Publication number Publication date
WO2003002735A3 (en) 2003-05-30
CA2451957A1 (en) 2003-01-09
WO2003002735A2 (en) 2003-01-09

Similar Documents

Publication Publication Date Title
JP5075833B2 (en) Recombinant expression of multiprotein complexes using polygenes
EP1857549B1 (en) Control of gene expression
JP7418470B2 (en) Integration of nucleic acid constructs into eukaryotic cells using transposase derived from Orydias
JP7418469B2 (en) Transfer of nucleic acid constructs into eukaryotic genomes using transposase from amyelois
US20120135517A1 (en) Synthetic genes and genetic constructs
JP2004000233A (en) Method for modifying expression characteristic of endogenous gene of certain cell system or microorganism
US5789653A (en) Secretory gene trap
E Tolmachov Building mosaics of therapeutic plasmid gene vectors
JP2004501601A5 (en)
CN113302291A (en) Genome editing by targeted non-homologous DNA insertion using retroviral integrase-Cas 9 fusion proteins
EP1815000A1 (en) Enhancer-containing gene trap vectors for random and targeted gene trapping
US7189506B1 (en) DNA binding compound-mediated molecular switch system
JP2004530429A (en) How to target transcriptionally active loci
US20230159958A1 (en) Methods for targeted integration
WO2003002735A2 (en) Methods, vectors, cell lines and kits for selecting nucleic acids having a desired feature
CA2367037A1 (en) Dna binding compound-mediated molecular switch system
JP3713038B2 (en) Recombinant adenovirus
WO2024044673A1 (en) Dual cut retron editors for genomic insertions and deletions
SG191419A1 (en) Means for generating adenoviral vectors for cloning large nucleic acids
WO2004035780A1 (en) Gene expression vector using echinus-origin insulators
JP2008104357A (en) TetR EXPRESSING NON-HUMAN TRANSGENIC ANIMAL AND METHOD FOR PRODUCING THE SAME
AU2004237883A1 (en) Spliceosome mediated RNA trans-splicing
NO845186L (en) VECTOR SYSTEM FOR INTRODUCING HETEROLOGICAL DNA IN EYKARYOTIC CELLS.

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20040126

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17Q First examination report despatched

Effective date: 20040414

RIN1 Information on inventor provided before grant (corrected)

Inventor name: GAUMOND, MARIE-HELENE

Inventor name: GINGRAS, ROCK

Inventor name: LANCTOT, CHRISTIAN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20040825