WO2009112844A1 - Method of preparing an shRNA library

Method of preparing an shRNA library

Info

Publication number
WO2009112844A1
Authority
WO
WIPO (PCT)
Prior art keywords
adaptor
sequence
cdna
primer
dna
Prior art date
Application number
PCT/GB2009/000684
Other languages
French (fr)
Inventor
Henricus Petrus Joseph Te Riele
Camiel Lambert Christiaan Wielders
Original Assignee
Stichting Het Nederlands Kanker Instituut
Dean, John, Paul
Priority date
Filing date
Publication date
Application filed by Stichting Het Nederlands Kanker Instituut, Dean, John, Paul
Publication of WO2009112844A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1072 Differential gene expression library synthesis, e.g. subtracted libraries, differential screening
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/111 General methods applicable to biologically active non-coding nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/14 Type of nucleic acid interfering N.A.
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2330/00 Production
    • C12N2330/30 Production chemically synthesised
    • C12N2330/31 Libraries, arrays

Definitions

  • AAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCAC CGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3), which may optionally be hybridised to the sequence GGAAGACGTAGATGAGC (SEQ ID NO. 4).
  • the processed cDNA sequence which encodes for a shRNA is inserted into an shRNA expression vector.
  • Suitable expression vectors are well known to those skilled in the art.
  • the method provides a technical solution to the problem of high cost and high complexity of current shRNA libraries, which compromise efficient screening. This is achieved by combining subtractive hybridisation and enzymatic production of shRNA libraries. To enable this, newly designed adaptors are used, which direct a more effective subtractive hybridisation procedure, followed by a new, efficient and easy procedure to process the selected sequences into shRNA libraries. The complexity of the resulting libraries is decreased by focusing on relevant sequences only, thus facilitating the subsequent screening. Such a method to produce enzymatically low complexity shRNA libraries that are enriched for differentially expressed genes has not been described or suggested in the prior art.
  • a second aspect of the invention relates to adaptor DNA sequences which can be used in the method described above. Accordingly, there is provided a first adaptor nucleotide sequence for use in preparing a shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the adaptor sequence comprising: a first primer binding site for PCR amplification of a cDNA attached thereto, for example, following subtractive hybridisation; a first restriction recognition site for a first restriction enzyme to cause the cleavage of the cDNA attached to the first adaptor sequence into a fragment; and a second restriction recognition site for a second restriction enzyme to cause the cleavage of the cDNA fragment from at least a portion of the first adaptor sequence.
  • the first adaptor sequence can be used in subtractive hybridisation to pre-identify the cDNAs with differential expression between two cell types. It is then used to process the cDNAs into an shRNA library.
  • the first adaptor sequence is between about 20 and about 200 nucleotides in length. More preferably, the first adaptor sequence is between about 20 and about
  • the second adaptor sequence is between about 20 and about 120 nucleotides in length, more preferably still, the second adaptor sequence is between about 40 and about 100 nucleotides in length, even more preferably, the second adaptor sequence is between about 50 and about 90 nucleotides in length, more preferably still, the second adaptor sequence is between about 60 and about 80 nucleotides in length and, most preferably, the second adaptor sequence is about 70 nucleotides in length.
  • an isolated polynucleotide sequence comprising any one of sequences:
  • a method of identifying at least one gene which is differently expressed in a first and a second cell type comprising the steps of: a. carrying out a restriction digest reaction of cDNA from the first cell type using a first restriction enzyme; b. ligating a first adaptor DNA sequence, comprising a first primer recognition sequence and a second primer recognition sequence and a marker molecule, to DNA from step (a); c. ligating a second adaptor DNA sequence comprising a first primer recognition sequence and a third primer recognition sequence to DNA from step (a); d. carrying out a restriction digest reaction of cDNA from the second cell type using the first restriction enzyme; e.
  • the second primer may have the nucleotide sequence: CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6).
  • the third primer may have the nucleotide sequence: CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7).
  • Figure 1 shows a schematic representation of the PCR based subtractive hybridisation
  • Figure 2a shows normalisation and enrichment of cDNA fragments during the first hybridisation reaction.
  • Figure 4 shows a schematic representation of the procedure to process DNA selected by subtractive hybridisation into an shRNA library using restriction sites on Adaptor A;
  • Figure 6 is a comparison of the abundance of differentially expressed genes as would be found in a completely normalised library (A), and in the shRNA library of the invention, as determined by sequencing (B); the figure shows the bins of differentially expressed genes, sequences which are not annotated as genes and genes which are not expressed in the cell lines under study.
  • Antisense SH Adaptor B GGAAGACGTAGATGAGC (SEQ ID NO. 4)
  • Looped Adaptor C TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTT GAANN (where N = A, G, C, or T) (SEQ ID NO. 8); or
  • N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 complementary to N2) (SEQ ID NO. 12);
  • Adaptors A and B allow subtractive hybridisation and subsequent shRNA library construction.
  • Adaptor A contains sites for library processing and B is biotinylated to allow pull down using beads.
  • 1 µg of tester cDNA preparation was AluI digested and ligated in 20 µl buffer 4 (New England BioLabs), supplemented with ATP (1 mM), and either adaptor A or B (20 µM).
  • AluI 5 U; 1 hour at 37 °C
  • ligase (5 U) was added to the reaction, which was incubated overnight in a thermocycler cycling between 30 °C (10 sec) and 10 °C (10 sec). After heat inactivation of the enzymes (20 min at 80 °C), the samples were filtered twice to remove excess adaptors and concentrated (20 ng/µl tester A and B) using a Microcon® YM 100 column.
  • the beads were resuspended in 100 µl pfu polymerase mix (Invitrogen) containing only dNTPs (10 mM each) and heated to 75 °C before the pfu polymerase was added to hotstart the reaction and fill in the adaptor sequences (5 min). The reaction was stopped by the addition of EDTA and the beads were recovered on a magnet. After cooling, they were washed twice with NaOH (100 nM) and once in Tris buffer (10 mM). After recovery on a magnet, half of the beads are used as a template in a subsequent nested PCR using the pfu polymerase.
  • mRNA sequences overexpressed in one transcriptome (tester) compared to another (driver) at a ratio of t/dr are selected.
  • 5' ends (but not 3' ends) of AluI digested tester cDNA are ligated to either Adaptor A (step 1 of Figure 1) or B (step 2).
  • Both Adaptor A and B ligated tester fragments are mixed separately with 75 fold excess AluI digested driver cDNA (step 3), melted and allowed to hybridise.
  • a normalised number (n) of each fragment remains single stranded. The share of remaining single stranded fragments which carry an adaptor is proportional to their abundance in the tester (t) and driver (dr) cDNA.
  • the Adaptor A and B ligated samples were mixed; part of the remaining single strands were then forced to anneal during a second hybridisation by increasing the polyethylene glycol concentration to raise their effective concentration. Because the strands were first equalised, a normalised number of hybrids were formed for each of the AluI fragments during the second hybridisation, including some hybrids which carry both an Adaptor A and B at their ends. Again, their number depended on the relative expression levels in the two cell lines, and was strongly enriched for sequences which are under-represented in cell line H-. Therefore, the hybrids carrying two different adaptors represent putative target sequences desired to constitute the shRNA library, which are selected in two ways.
  • Adaptor A- and B-ligated tester cDNA H+ and driver cDNA H- were mixed 1:35, and the first hybridisation was allowed to proceed between 6 and 50 hours, before the two
  • AluI sites present in a gene which averages 20. This provides an internal control for off-target effects and increases the chance to come across vectors that give the right degree of functionality.
  • libraries may be produced for every organism.
  • shRNA delivering vector backbones including retroviral or lentiviral vectors, vectors encoding inducible RNAi or microRNA primary transcripts, can be used with only a minor adaptation to meet experimental demands.

Landscapes

  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

There is provided a method of preparing an shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the method comprising the steps of: pre-identifying said one or more cDNA sequences using a first adaptor DNA sequence; and processing said one or more cDNA sequences using the first adaptor DNA sequence. There is further provided an shRNA library produced by such a method and uses thereof.

Description

METHOD OF PREPARING AN SHRNA LIBRARY
Field Of The Invention
The invention relates to low-complexity short hairpin RNA (shRNA) libraries and to a method of producing such a library. The libraries are useful in identifying, among the genes which are differentially expressed in two cell types, those genes which are responsible for phenotypic differences existing between these cell types.
Background
RNA interference (RNAi) has proved to be an important tool to knock down gene function. Retroviral or lentiviral shRNA libraries targeting large parts of the human and murine genomes have been synthetically produced to study long-term effects of gene knock-down in genome-wide loss-of-function screens. Screening is much facilitated by a reduction of the library complexity, because fewer vectors require testing to cover the variety of library sequences, thus reducing the obligatory background. Therefore, smaller sub-libraries, for instance restricted to the deubiquitinating enzymes or the protein kinases, have been produced to reduce library complexity.
Functional screens have become a very powerful tool to identify gene function. For gain-of-function screening, cDNA libraries were expressed in cells to identify genes whose over-expression affects the behaviour of cells. In this way, stimulatory genes (including oncogenes) have been discovered (e.g., Zhu et al. (2005) Genetics 170 767-77; Staudt et al. (2005) PLoS Genet. Oct;1(4):e55 doi:10.1371/journal.pgen.0010055; Molnar et al. (2006) Genetics 174 1635-59; Mindorff et al. (2007) Genetics 176 2247-63). These cDNA libraries, consisting of full-length cDNA copies of mRNA sequences, are enzymatically produced starting from RNA. Because RNA is a complex mixture, in which the number of mRNA copies of each gene varies enormously, this procedure yields very complex cDNA libraries. This means that many cDNA sequences need to be screened, in particular when the gene of interest is poorly expressed and therefore under-represented in the library. To overcome this problem, normalisation or subtractive hybridisation of cDNA preparations for gain-of-function screening have been used to reduce the complexity of cDNA libraries (Soares et al. (1994) Proc Natl Acad Sci USA 91 9228-32; Diatchenko et al. (1996) Proc Natl Acad Sci USA 93 6025-30; Perrett et al. (1998) Gene 208 103-15; Fischer et al. (2003) Biotechniques 34 250-4; Sternberg et al. (2007) Methods Mol Biol. 371 289-305; Akopyants et al. (1995) Proc. Natl. Acad. Sci. USA 95 13108-13113). Normalisation aims to equalize the full-length cDNA sequences, i.e., all genes present in the library are present in equal amounts. Subtractive hybridisation is used to select cDNA sequences that are differentially expressed between cell lines, in order to reduce the complexity of the resulting cDNA libraries even further.
More recently, the advent of RNA interference (RNAi) technology, which allows suppression of gene activity, has made it possible to also perform loss-of-function screens. RNAi is conferred by short (21 nucleotides) RNA sequences, which can either be directly introduced into cells as small interfering RNAs (siRNA), or be transcribed as short hairpins (shRNA) from expression vectors to obtain long-term RNAi. Several genome-wide shRNA libraries have been produced synthetically, i.e., by inserting custom-made oligonucleotide sequences into appropriate expression vectors. Screening of such libraries has identified a series of tumour suppressor genes (Bernards et al. (2006) Nat Methods 3 701-6). Synthetic shRNA libraries are of course normalized, but are also very expensive to produce. Therefore, efforts have been undertaken to produce them enzymatically starting from RNA. Again, the complexity of the RNA presents a problem similar to that encountered with cDNA libraries. Therefore, Xu et al. (J. Biotechnol. (2007) 128 477-485) described a protocol to enzymatically produce shRNA libraries from normalized RNA preparations. Although in these libraries all genes are more equally represented, they are still highly complex.
One way of generating low-complexity RNAi libraries only containing genes that are differentially expressed in the cell lines under study is to perform expression profiling and then select RNAi vectors targeting differentially expressed genes from a synthetic library. However, not every laboratory has easy access to a synthetic RNAi library or the technology to perform expression profiling. Alternatively, subtractive hybridisation may be used to pre-select cDNA sequences that are differentially expressed in the cell lines under study, and these sequences may then be used to produce RNAi libraries (Zhang et al. (2006) Stem Cells 24 2661-2668). However, libraries produced in this way consisted of long double-stranded RNA molecules, which are considered to confer only a mild RNAi effect, inferior to that of shRNA vectors. Further, these long double-stranded RNA molecules can induce an immune response in, for example, mammalian cells. This can cause non-specific gene silencing and apoptosis, neither of which is desired.
Summary Of The Invention
According to a first aspect of the invention, there is provided a method of preparing an shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the method comprising the steps of: pre-identifying said one or more cDNA sequences using a first adaptor DNA sequence; and processing said one or more cDNA sequences using the first adaptor DNA sequence. Preferably, the first and second cell types are phenotypically different from one another. The first and second cell types may be of different cell strains or may be of the same cell strain.
The term "shRNA library" indicates a library of vectors each comprising a short inverted repeat cDNA sequence which, when transcribed, expresses an shRNA species which can be processed by a host cell to siRNA, able to interfere with the normal process of gene expression.
The term "enriched for one or more cDNA sequences" indicates that at least one cDNA sequence, corresponding to at least a portion of a gene which is differently expressed in a first and second cell type, is present in the shRNA library more frequently than in a normalised library in which all genes are equally represented, such as a library synthesised from oligonucleotides (e.g., Bernards et al. (2006) Nat. Methods 3 701-6). Preferably, the shRNA library is enriched 2-500-fold, more preferably 50-200-fold, still more preferably 100-300-fold, most preferably about 200-fold for the one or more cDNA sequences. A 200-fold enrichment would indicate that at least one cDNA sequence, corresponding to at least a portion of a gene which is differently expressed in a first and second cell type, is present in the shRNA library 200 times more frequently than in a normalised library in which all genes are equally represented. It is within the routine abilities of the skilled person to determine such levels of enrichment in a library according to the invention.
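The fold-enrichment definition above can be illustrated with a minimal Python sketch. It assumes that library membership is available as a list of gene labels (for example, from sequencing library inserts) and that the reference is an idealised normalised library in which each of `n_genes` genes is equally represented. The function name, the gene labels and the counts are hypothetical and are not taken from the specification.

```python
from collections import Counter


def fold_enrichment(library_reads, gene, n_genes):
    """Fold enrichment of `gene` relative to a normalised library.

    In a normalised library of `n_genes` equally represented genes, each
    gene has frequency 1 / n_genes. The observed frequency in the shRNA
    library is count / total, so the ratio is count * n_genes / total.
    """
    if not library_reads:
        raise ValueError("library_reads must not be empty")
    counts = Counter(library_reads)
    return counts[gene] * n_genes / len(library_reads)


# Hypothetical example: 20 of 100 sequenced inserts map to GENE_X, compared
# with a normalised library assumed to cover 1000 genes.
reads = ["GENE_X"] * 20 + ["OTHER"] * 80
enrichment = fold_enrichment(reads, "GENE_X", n_genes=1000)
```

With these illustrative numbers the sketch reports the "about 200-fold" case named in the text; real estimates would depend on sequencing depth and on how inserts are assigned to genes.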
Advantageously, the method according to the invention, including the initial step of pre-selection of putative target sequences, enables the production of very low-complexity shRNA libraries.
The term "pre-identifying said one or more cDNA sequences" indicates that genes which are differently expressed between the first and second cell types are initially selected, prior to the preparation of the shRNA library, with cDNA sequences corresponding to at least a portion of such genes being isolated.
In order to identify genetic differences between cell lines which are responsible for a selectable phenotypic difference, the library used for loss-of-function screening may be limited to the genes which are differentially expressed in these cells, because these especially are expected to contribute to the phenotypic change. For example, a low-complexity shRNA library to screen for tumour suppressor genes should primarily consist of genes whose expression is reduced in tumour cells, to test if their knockdown results in proliferation.
Therefore, the method according to the invention produces shRNA libraries for loss-of-function screening, which are strongly enriched (for example, 50-200-fold) for mRNA sequences down-regulated in the transcriptome associated with the selectable phenotype of interest compared to a control transcriptome (e.g., transformed vs. non-transformed cells). First, gene sequences are selected that are underrepresented in one transcriptome compared to another (for example, tumour suppressor genes), for example, using optimized PCR-based subtractive hybridisation. Subsequently, the selected sequences, enriched for potential target sequences, are efficiently processed into a low-complexity shRNA library. The library can then be used in a loss-of-function screen for the selectable phenotype (e.g., cell proliferation), which can identify the genes responsible for the phenotypic transition. shRNA libraries according to the invention can be produced for loss-of-function screening in a broad range of experimental settings and species to identify genetic differences between two cell line counterparts that differ by a selectable phenotype. The shRNA libraries facilitate loss-of-function screening because they focus on putative target genes to reduce their complexity with respect to these genes. In addition, because the sequences which are pre-selected in the library are actually regulated during the phenotypic transition, any resulting "hits" identified by screening are likely to be relevant regulators of the phenotypic transition in the physiological context of cells.
Advantageously, the method according to the first aspect of the invention combines the steps of pre-identifying cDNA sequences and processing them into an shRNA library into one integrated procedure. This is done by using the same adaptor DNA sequence in both the first pre-identifying step and the second processing step. This reduces the complexity of the overall process and so the resulting method is simpler, easier to carry out and can be done in a relatively short period of time.
Preferably, in the method according to the first aspect of the invention, the step of pre-identifying said one or more cDNA sequences comprises using subtractive hybridisation using the first adaptor DNA sequence and a second adaptor DNA sequence. The process of subtractive hybridisation which can be used to select gene sequences that are underrepresented in one transcriptome compared to another is well known to those skilled in the art, for example, see Diatchenko et al. (1996) Proc. Natl. Acad. Sci. USA 93 6025-30. This method generally uses two adaptor DNA sequences which allow for PCR-based amplification of cDNA fragments that differ between a control and experimental transcriptome. The technique relies on the removal of dsDNA formed by hybridization between a control and test sample, thus reducing cDNAs of similar abundance, and retaining differentially expressed transcripts.
The two adaptor DNA sequences are generally ligated to cDNA fragments and allow the manipulation of the cDNA fragments. For example, if the adaptor DNA sequence comprises a marker, such as a stretch of adenine residues bound to Biotin, it allows the isolation of cDNA fragments ligated to the adaptor DNA sequence. Each adaptor DNA sequence will comprise a primer binding site. This allows PCR amplification of the cDNA fragments. Preferably, each adaptor DNA sequence has two different primer binding sites to allow nested PCR which reduces the contamination in products due to amplification of unwanted primer binding sites.
In a preferred embodiment of the invention, the pre-identifying step of the method comprises the steps of:
a. carrying out a restriction digest reaction of cDNA from the first cell type using a first restriction enzyme;
b. ligating the first adaptor DNA sequence, comprising a first primer recognition sequence and a second primer recognition sequence and a marker molecule, to DNA from step (a);
c. ligating a second adaptor DNA sequence comprising a first primer recognition sequence and a third primer recognition sequence to DNA from step (a);
d. carrying out a restriction digest reaction of cDNA from the second cell type using the first restriction enzyme;
e. mixing the denatured products of step (b) with the denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step;
f. mixing the denatured products of step (c) with the denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step;
g. mixing the denatured products of steps (e) and (f) in the presence of a size exclusion agent in an annealing step, wherein the concentration of the size exclusion agent is higher than the concentration of the size exclusion agent in steps (e) and (f), if present;
h. separating and isolating DNA carrying the first adaptor sequence from DNA carrying only the second adaptor sequence and/or no adaptor sequence, or separating and isolating DNA carrying the second adaptor sequence from DNA carrying only the first adaptor sequence and/or no adaptor sequence; and
i. carrying out a nested PCR reaction using the separated DNA from (h) as template DNA and including the use of a first primer which recognises the first primer recognition sequence and, subsequently, a second primer which recognises the second primer recognition sequence and a third primer which recognises the third primer recognition sequence.
The concentration of the size exclusion agent will depend on the particular size exclusion agent used. Preferably, the size exclusion agent is polyethylene glycol (PEG). Preferably, the concentration of size exclusion agent in steps (e) and (f), if present, is 1-10%. Preferably, the concentration of size exclusion agent in step (g) is 11-20%.
Having the concentration of size exclusion agent in step (g) higher than the concentration of size exclusion agent in steps (e) and (f) surprisingly causes a significant improvement in the efficiency of the pre-identifying step.
For the avoidance of doubt, not all of the steps need be carried out in the order given above; e.g., step (f) may be carried out before, or concurrently with, step (e). The term "comprises the steps of", as used throughout this specification, indicates that all steps are present but that their order need not be fixed. The skilled person would readily be able to determine a suitable alternative ordering to the steps set out above.
The first and second adaptor DNA sequences may comprise the nucleotide sequence:
GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC TGGAGAGGCGATCCGACCG (SEQ ID NO. 1), which may optionally be hybridised to the sequence CGGTCGGATCGC (SEQ ID NO. 2); or
AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCAC CGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3), which may optionally be hybridised to the sequence GGAAGACGTAGATGAGC (SEQ ID NO. 4).
The first and second adaptor sequences are preferably different to one another.
The first primer may have the nucleotide sequence: GTATTACCGCACTCACTTGGA (SEQ ID NO. 5).
The second primer may have the nucleotide sequence:
CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6).
The third primer may have the nucleotide sequence:
CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7).
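The relationship between the listed adaptors, their short hybridising oligonucleotides and the three primers can be checked with a short Python sketch. The sequences are copied from SEQ ID NOs. 1 to 7 above, with the wrapped lines joined; the poly-A leader of SEQ ID NO. 3 is left out because it is not needed for this check. The helper names are hypothetical, and the sketch only tests string-level consistency, not PCR behaviour.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_to_3prime_end(adaptor, oligo):
    """True if `oligo` is the reverse complement of the adaptor's 3' end."""
    return adaptor.endswith(reverse_complement(oligo))


# SEQ ID NO. 1 and its short hybridising strand (SEQ ID NO. 2)
SEQ1 = ("GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC"
        "TGGAGAGGCGATCCGACCG")
SEQ2 = "CGGTCGGATCGC"
# SEQ ID NO. 3 without its poly-A leader (an omission made for this sketch)
SEQ3_CORE = ("GTATTACCGCACTCACTTGGACTTCTGTCAC"
             "CGTCACCGCATAGCTCATCTACGTCTTCC")
SEQ4 = "GGAAGACGTAGATGAGC"
# First, second and third primers (SEQ ID NOs. 5, 6 and 7)
PRIMER1 = "GTATTACCGCACTCACTTGGA"
PRIMER2 = "CATCGTCCTGGCGTCTGGCT"
PRIMER3 = "CTTCTGTCACCGTCACCGCATAG"

# Position of each primer sequence in each adaptor (-1 means absent)
positions = {
    "SEQ1": {"P1": SEQ1.find(PRIMER1), "P2": SEQ1.find(PRIMER2),
             "P3": SEQ1.find(PRIMER3)},
    "SEQ3": {"P1": SEQ3_CORE.find(PRIMER1), "P2": SEQ3_CORE.find(PRIMER2),
             "P3": SEQ3_CORE.find(PRIMER3)},
}
```

In this reading, the first primer sequence is shared by both adaptors (outer PCR), while the second and third primer sequences are each specific to one adaptor (inner, nested PCR), which matches the description of nested PCR in step (i).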
The effect of step (i) is that cDNA carrying both the first and second adaptor sequences is amplified by the PCR reaction at an exponential level, so the PCR product is significantly dominated by cDNA carrying both adaptors.
The first restriction enzyme is preferably an enzyme whose recognition site occurs on average once per 250 nucleotides in a typical genomic DNA sequence, most preferably AluI. An advantage of AluI is that the shRNA hairpins formed start with GCT (because this is a part of the AluI recognition sequence, which is AGCT). This is important for their functionality. Other suitable enzymes include: MspI, HpaII, BstUI, HinP1I, HhaI, HaeIII, PhoI, CviKI-1, TaqI, HpyCH4V, HpyCH2IV.
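The "once per 250 nucleotides" figure is consistent with a four-base recognition site: in a random sequence with equal base frequencies, a given 4-mer is expected once every 4^4 = 256 positions. The Python sketch below illustrates this, together with a simple in silico digest. It assumes the commonly documented blunt cut of AluI between G and C of AGCT; the function names and the test sequence are hypothetical.

```python
def expected_site_spacing(site_length, alphabet_size=4):
    """Expected spacing of a fixed site in random, equal-frequency DNA."""
    return alphabet_size ** site_length


def digest(seq, site="AGCT", cut_offset=2):
    """Cut `seq` at every occurrence of `site`.

    `cut_offset` is the position within the site after which the strand is
    cut; 2 models a blunt AG^CT cut (an assumption about AluI made for this
    sketch). Returns the list of top-strand fragments in order.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


spacing = expected_site_spacing(4)
fragments = digest("GGAGCTTTAGCTCC")
```

Real cDNA is not random sequence, so actual fragment sizes will deviate from the 256-nucleotide expectation.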
Preferably, the marker molecule may be any molecule which enables identification and/or separation of cDNA carrying the first adaptor sequence, for example, biotin or streptavidin. The identification of DNA carrying the first adaptor sequence, in step (h), may be carried out by use of the marker molecule, e.g., where the marker molecule is biotin, DNA carrying the marker may be separated by binding to streptavidin. Such methods are within the capabilities of the skilled person, who would be able to put them into effect without difficulty. Preferably, steps (e) and (f) of the method each continue for between 6 and 50 hours, more preferably between 40 and 50 hours, most preferably about 45 hours. In a further preference, step (g) of the method continues for about 24 hours. Preferably, the size exclusion agent (for example, PEG) is present in steps (e) and (f) at 4-8%, more preferably at about 6% and in step (g) at 15-20%, preferably at about 17.5%. The separation of DNA in step (h) may be carried out by any of the means well known to the skilled person, for example, by gel electrophoresis, band excision and purification, or using the marker molecule present on the first adaptor, for example by affinity separation on a column.
The method may further comprise the further step of inserting at least a portion of the PCR product from (i) into an shRNA expression vector.
In the method of the invention, once differentially expressed cDNAs have been pre-identified, they can be processed to produce inverted repeat sequences encoding for shRNA molecules directed to the at least one gene which is differently expressed in the first and second cell type. This is done, at least in part, using the first adaptor DNA sequence which is ligated to the one or more cDNA sequences.
The one or more cDNA sequences can be processed using: the first adaptor DNA sequence which comprises a recognition site for a first restriction enzyme; and a third hairpin adaptor DNA sequence to form hairpin structures, allowing the formation of inverted repeats.
Preferably, in the method of the invention, the processing step comprises the steps of:
a. digesting the one or more cDNA sequences with a first restriction enzyme to produce a fragment of each of said one or more cDNA sequences attached to the first adaptor sequence;
b. ligating the product of (a) with the third hairpin adaptor DNA sequence to form hairpin structures;
c. replicating the hairpin structures of step (b) to form an inverted repeat of each of said one or more cDNA sequences;
d. removing at least a portion of the duplicated first adaptor DNA sequences from the product of step (c) using a second restriction enzyme; and
e. processing the product of step (d), if necessary, to encode for a shRNA.
In step (a), the fragment of the cDNA sequence attached to the first adaptor sequence may be produced by the first restriction enzyme recognising a first recognition site in the first adaptor sequence and cleaving the cDNA attached to the first adaptor sequence at a distance from the first recognition site in the first adaptor sequence. This would thus produce a cDNA fragment attached to the first adaptor sequence. The skilled person is well aware of restriction enzymes which cleave DNA at a distance from their recognition site, for example, MmeI (or another restriction enzyme which produces functional shRNA inserts, i.e., fragments which can be processed into shRNA molecules). The skilled person would have no problem in positioning the first recognition site in the first adaptor sequence in order to produce a fragment of the cDNA attached to the first adaptor sequence. The particular sequence of the first recognition site will depend on the particular restriction enzyme used and the skilled person would be able to determine this and position it accordingly. Preferably, the cDNA fragment attached to the first adaptor sequence after cleavage with the restriction enzyme is between about 10 and about 30 nucleotides in length. More preferably, the cDNA fragment is between about 15 and about 25 nucleotides in length and, most preferably, between about 18 and about 25 nucleotides in length.
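As an illustration of how the position of the recognition site sets the fragment length, the following Python sketch locates an MmeI site in SEQ ID NO. 1 and estimates how many cDNA bases remain attached after cleavage. It assumes the commonly published MmeI specificity (recognition sequence TCCRAC, cleavage 20 bases downstream on the same strand), which is not stated in this description and should be checked against the enzyme supplier's data. The function name is hypothetical.

```python
import re

# SEQ ID NO. 1 (first adaptor), written 5'->3'.
ADAPTOR_A = ("GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC"
             "TGGAGAGGCGATCCGACCG")

MMEI_SITE = re.compile("TCC[AG]AC")  # assumed recognition sequence (TCCRAC)
MMEI_TOP_STRAND_REACH = 20           # assumed cut distance on the same strand


def cdna_bases_retained(adaptor: str) -> int:
    """cDNA bases left on the adaptor strand after the assumed MmeI cut."""
    match = MMEI_SITE.search(adaptor)
    if match is None:
        raise ValueError("no MmeI site found in adaptor")
    # Adaptor bases lying between the end of the site and the cDNA junction.
    spacer = len(adaptor) - match.end()
    return MMEI_TOP_STRAND_REACH - spacer


if __name__ == "__main__":
    print(cdna_bases_retained(ADAPTOR_A))
```

With these assumptions the sketch returns 18 for SEQ ID NO. 1, which lies within the preferred range given above.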
In step (b), the third hairpin adaptor DNA sequence is ligated to the fragment of the cDNA sequence which is attached to the first adaptor sequence. This forms a hairpin structure. The third adaptor sequence can be any sequence which allows the formation of hairpin structures. In order to do this, the third adaptor sequence will contain two substantially complementary sequences which can bind together to form a stable hairpin structure. The two sequences do not have to be entirely complementary in order to produce a stable hairpin structure. They may be at least 90% complementary or, more preferably, at least 95% complementary. A skilled person could readily determine a suitable sequence to give a hairpin structure. Preferably, each of the two substantially complementary sequences is between about 5 and about 50 nucleotides long, more preferably, between about 10 and about 40 nucleotides long, even more preferably, between about 20 and about 30 nucleotides long and, most preferably, about 25 nucleotides long.
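The percentage complementarity of two candidate hairpin arms can be checked with a few lines of code. The Python sketch below is one possible way of doing so, assuming arms of equal length that pair without gaps; the function names and the example arms are hypothetical and are not sequences of the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def percent_complementary(arm_5p: str, arm_3p: str) -> float:
    """Percentage of positions at which the two arms can base-pair."""
    if len(arm_5p) != len(arm_3p):
        raise ValueError("this sketch only compares arms of equal length")
    # The 3' arm pairs antiparallel, so compare against its reverse complement.
    partner = reverse_complement(arm_3p)
    matches = sum(a == b for a, b in zip(arm_5p, partner))
    return 100.0 * matches / len(arm_5p)


if __name__ == "__main__":
    arm = "ACGTTGCACC"                            # made-up 5' arm
    print(percent_complementary(arm, reverse_complement(arm)))
    print(percent_complementary(arm, "GGTGCAACGA"))  # one mismatched base
```

A result of 90 or more would satisfy the broader preference stated above, and 95 or more the narrower one.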
In step (c), the hairpin structures are replicated to form an inverted repeat of each of said one or more cDNA sequences. This can be done in any suitable way. For example, a primer can bind to the denatured hairpin structure at a primer binding site on the first adaptor sequence. Starting from the primer, the rest of the sequence can be replicated using a polymerase, for example, the Klenow fragment of DNA polymerase I, to produce an inverted repeat of the cDNA sequence. The primer binding site can be the same as one of the primer sites used in subtractive hybridisation or it can be different. A person skilled in the art would readily be able to determine suitable primers and binding sites.
Alternatively, the first adaptor sequence may be nicked by a third restriction enzyme which recognises a third recognition site on the first adaptor sequence. The term 'nick' means to cut only one of the strands of a double stranded DNA sequence. Nicking restriction enzymes are well known to those skilled in the art, for example, Nb.BbvCI (or any other enzyme which produces nicks in DNA). Once the first adaptor sequence is nicked, the product can be subjected to conditions which cause the release of the nicked portion of the first adaptor sequence. A primer can then bind in the region from which the nicked portion of the first adaptor sequence was released and the hairpin structure can be replicated starting from this primer, thereby forming an inverted repeat of each of the cDNA sequences.
In step (d), at least a portion of the duplicated first adaptor DNA sequences is removed from the product of step (c) using a second restriction enzyme. Preferably, the whole duplicated first adaptor sequences are removed. The second restriction enzyme recognises a second recognition site in the first adaptor sequence and cleaves the first adaptor sequence from the cDNA sequence. A person skilled in the art would be able to design and position the second recognition site so that the second restriction enzyme cleaves the first adaptor sequence from the cDNA sequence. The second restriction enzyme may be BpmI.
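The positioning of the second recognition site can be illustrated in the same way. The Python sketch below counts the adaptor bases lying between a BpmI site and the end of SEQ ID NO. 1. It assumes the commonly published BpmI specificity (recognition sequence CTGGAG, cleavage 16 and 14 bases downstream on the two strands), which is not stated in this description; the function name is hypothetical.

```python
# SEQ ID NO. 1 (first adaptor), written 5'->3'.
ADAPTOR_A = ("GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC"
             "TGGAGAGGCGATCCGACCG")

BPMI_SITE = "CTGGAG"  # assumed recognition sequence


def bases_after_site(adaptor: str, site: str) -> int:
    """Number of adaptor bases between the last copy of site and the 3' end."""
    start = adaptor.rfind(site)
    if start == -1:
        raise ValueError("site not found in adaptor")
    return len(adaptor) - (start + len(site))


if __name__ == "__main__":
    print(bases_after_site(ADAPTOR_A, BPMI_SITE))
```

For SEQ ID NO. 1 the sketch counts 14 adaptor bases after the site, so under the assumed cleavage distances the cut falls at, or immediately beyond, the end of the adaptor.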
This leaves an inverted repeat of the cDNA sequence, the inverted repeat being separated by the hairpin forming structure of the third adaptor sequence. Upon transcription of this inverted repeat, a shRNA molecule will form which can interfere with gene expression.
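The layout of the resulting insert can be sketched as follows. This Python fragment is a simplified, single-strand model only: it treats the insert as a cDNA fragment, a spacer, and the reverse complement of the fragment. The fragment shown is made up, the spacer is merely the first nine bases of SEQ ID NO. 8 used as a placeholder, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeat(fragment: str, spacer: str) -> str:
    """Fragment, spacer, then the fragment again in inverted orientation."""
    return fragment + spacer + reverse_complement(fragment)


if __name__ == "__main__":
    # Made-up 6-base fragment; real fragments are about 18 to 25 bases long.
    print(inverted_repeat("CTGAAC", "TTCAAGAGA"))
```

When such a sequence is transcribed, the two copies of the fragment can pair with each other while the spacer forms the loop.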
If necessary, in step (e), the inverted repeat of the cDNA sequence can be processed to encode for a functional shRNA. For example, a portion of the third adaptor sequence which separates the inverted repeat may be removed if it is too long. This can be done by having two restriction recognition sites in the third adaptor sequence so that a portion of the third adaptor sequence is cleaved and can then be removed. The remaining ends of the third adaptor sequence can then be ligated together. The two restriction recognition sites may be for the same restriction enzyme or for different restriction enzymes. Preferably, they are for the same restriction enzyme. A suitable restriction enzyme is BsgI.
Preferably, the inverted cDNA repeat is inserted into an shRNA expression vector.
In one embodiment, step (e) can be performed after the inverted cDNA repeat has been inserted into an shRNA expression vector.
When subtractive hybridisation is used for the pre-identifying step, as discussed above, both the first and second adaptor DNA sequences will be attached to the cDNA sequences due to PCR amplification in subtractive hybridisation. Accordingly, these first and second adaptor DNA sequences need to be removed during processing of the cDNA sequences into sequences which correspond to shRNAs. The advantage of the present method is that the first adaptor DNA sequence is used both for subtractive hybridisation and for the processing of the cDNA. This means that the first adaptor sequence does not have to be removed and an alternative adaptor sequence used in the processing step. This ensures that the method is relatively simple.
Preferably, the processed cDNA sequence which encodes for a shRNA is inserted into an shRNA expression vector. Suitable expression vectors are well known to those skilled in the art.
In a slight variation to the method of the processing step described above, the processing step may comprise the steps of:
a. carrying out a restriction digest reaction of the one or more pre-identified cDNA sequences using the first restriction enzyme;
b. ligating the product of (a) with the third hairpin adaptor DNA sequence to form hairpin structures;
c. nicking the first adaptor sequence in the hairpin structure of step (b) using the third restriction enzyme;
d. isolating the product of the reaction in step (c) by binding to a marker molecule of the first adaptor sequence;
e. heating the product isolated in (d) with a primer;
f. extension of the primer by a polymerase to generate the inverted DNA sequence complementary to the product of (f); and
g. isolating at least a portion of the generated DNA product by use of the second restriction enzyme and inserting the resulting DNA into an shRNA expression vector.
The third hairpin DNA adaptor sequence may have the nucleotide sequence:
TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN (where N is A, G, C, or T) (SEQ ID NO. 8); or
TTCAAGAGAN1N2GCGTTGCACCGGTGCTGCACCGGTGCAGCGCn2n1TCTCTTGAAN3N4 (where N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 is complementary to N2) (SEQ ID NO. 12).
The primer used in the processing step may have the nucleotide sequence
CATCGTCCTGGCGTCTGGC (SEQ ID NO. 9).
The shRNA expression vector may be selected from, by way of non-limiting example, pRetroSuper, a retroviral or lentiviral vector, or a vector encoding inducible RNAi or microRNA primary transcripts. A final digestion with a fourth restriction enzyme, such as BsgI, removes part of the hairpin adaptor and, after religation of the vector, a functional shRNA vector is formed. The skilled person would readily be able to determine alternative suitable restriction enzymes to those mentioned above. Other enzymes which may be suitable are: MspI, HpaII, BstUI, HinP1I, HhaI, HaeIII, PhoI, CviKI-1, TaqI, HpyCH4V, or HpyCH2IV.
While normalisation alone generates libraries encompassing all genes expressed in a particular tissue or cell line (Xu et al., 2004), the inventors have found a method to produce libraries of very low complexity, strongly enriched for the small subset of genes which are differentially expressed in two cell types that differ by a selectable phenotype. Because the libraries of the invention focus on the few hundreds of genes that are differently regulated, screening for the actual gene that underlies the phenotypic transition becomes very easy. For the procedure, adaptors have been designed that allow subtractive hybridisation combining elements of the PCR selection method and pull down of desired sequences captured by magnetic beads. By using the same adaptors and specific restriction enzymes, selected cDNA sequences can subsequently be processed into inverted repeats that are finally introduced into shRNA-expression vectors (pSUPER). It has been found that the efficiency of subtractive hybridisation is much increased when a first round of hybridisation is performed in a low concentration (1-10% v/v) of polyethylene glycol (PEG) or another size exclusion agent, yielding a normalised number of single strands once the reaction is at equilibrium. A second hybridisation at a higher PEG concentration (11-30% v/v) shifts the equilibrium and drives further formation of hybrids. Addition of PEG enabled the optimisation of the length of the first hybridisation, ultimately leading to libraries that more directly reflect the relative expression levels.
The method provides a technical solution to the problem of high cost and high complexity of current shRNA libraries, which compromise efficient screening. This is achieved by combining subtractive hybridisation and enzymatic production of shRNA libraries. To enable this, newly designed adaptors are used, which direct a more effective subtractive hybridisation procedure, followed by a new, efficient and easy procedure to process the selected sequences into shRNA libraries. The complexity of the resulting libraries is decreased by focusing on relevant sequences only, thus facilitating the subsequent screening. Such a method of enzymatically producing low-complexity shRNA libraries that are enriched for differentially expressed genes has not been described or suggested in the prior art.
A second aspect of the invention relates to adaptor DNA sequences which can be used in the method described above. Accordingly, there is provided a first adaptor nucleotide sequence for use in preparing a shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the adaptor sequence comprising: a first primer binding site for PCR amplification of a cDNA attached thereto, for example, following subtractive hybridisation; a first restriction recognition site for a first restriction enzyme to cause the cleavage of the cDNA attached to the first adaptor sequence into a fragment; and a second restriction recognition site for a second restriction enzyme to cause the cleavage of the cDNA fragment from at least a portion of the first adaptor sequence.
The first adaptor sequence can be used in subtractive hybridisation to pre-identify the cDNAs with differential expression between two cell types. It is then used to process the cDNAs into an shRNA library.
It is well within the capability of a person skilled in the art to design a first primer binding site to bind a first primer to allow a cDNA attached to the first adaptor sequence to be amplified using PCR. Preferably, the first adaptor sequence comprises a second primer binding site for binding a second primer to allow nested PCR.
The first restriction recognition site is for a restriction enzyme which cuts the nucleotide sequence away from the recognition site, i.e. in the cDNA attached to the first adaptor sequence. This produces a fragment of the cDNA which is still attached to the first adaptor sequence. It is well within the capability of a person skilled in the art to select a suitable restriction enzyme, e.g. MmeI, and position the recognition site in the first adaptor sequence to cause cleavage of the cDNA attached to the adaptor sequence into an appropriately sized fragment.
Similarly, with regard to the second restriction recognition site, it is well within the capability of a person skilled in the art to select a suitable second restriction enzyme to recognise the second restriction recognition site and position the second recognition site in the first adaptor sequence to cause the cleavage of the cDNA fragment from at least a portion of the first adaptor sequence. For example, a suitable restriction enzyme is BpmI. Preferably, the entire first adaptor sequence is cleaved from the cDNA.
Preferably, the first adaptor sequence is between about 20 and about 200 nucleotides in length. More preferably, the first adaptor sequence is between about 20 and about 120 nucleotides in length, more preferably still, the first adaptor sequence is between about 40 and about 100 nucleotides in length, even more preferably, the first adaptor sequence is between about 50 and about 90 nucleotides in length, more preferably still, the first adaptor sequence is between about 60 and about 80 nucleotides in length and, most preferably, the first adaptor sequence is about 70 nucleotides in length.
Preferably, the first restriction recognition site for the first restriction enzyme causes the cleavage of the cDNA attached to the first adaptor sequence into a fragment of between about 10 and about 30 nucleotides in length. More preferably, the cDNA fragment is between about 15 and about 25 nucleotides in length and, most preferably, between about 18 and about 25 nucleotides in length.
Preferably, the first adaptor sequence comprises a third restriction recognition site for a third restriction enzyme to introduce a nick into the first adaptor sequence. A skilled person is well aware of suitable nicking enzymes, e.g. Nb.BbvCI. The nick in the first adaptor sequence provides an entry site for a polymerase, e.g. the Klenow fragment of DNA polymerase I. This allows the cDNA to be replicated into an inverted repeat when used in conjunction with the third hairpin adaptor sequence as described above.
The first adaptor sequence preferably has a third primer binding site for binding a primer to allow replication of the cDNA. When the first adaptor sequence is used in conjunction with the third hairpin adaptor sequence, the primer can be used to replicate the cDNA into an inverted repeat. When the first adaptor sequence comprises a third restriction recognition site for a third restriction enzyme to introduce a nick into the first adaptor sequence, the third primer binding site can be within the region of nucleotides which are exposed when the nicked portion of the first adaptor sequence is released from the first adaptor sequence under suitable conditions, e.g. denaturing conditions.
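The relative positions of a nicking site and a primer binding site within an adaptor can be inspected with a simple search. The Python sketch below does this for SEQ ID NO. 1 and the primer of SEQ ID NO. 9. It assumes the commonly published Nb.BbvCI recognition sequence CCTCAGC, which is not stated in this description, and it reports coordinates only; it does not model which strand is nicked. The function name is hypothetical.

```python
from typing import Optional

# SEQ ID NO. 1 (first adaptor), written 5'->3'.
ADAPTOR_A = ("GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC"
             "TGGAGAGGCGATCCGACCG")
PRIMER_SEQ9 = "CATCGTCCTGGCGTCTGGC"  # SEQ ID NO. 9
NB_BBVCI_SITE = "CCTCAGC"            # assumed recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def locate(seq: str, motif: str) -> Optional[tuple[int, str]]:
    """Return (0-based start, strand) of the first hit, or None."""
    start = seq.find(motif)
    if start != -1:
        return start, "+"
    # Look for the motif as read on the opposite strand.
    start = seq.find(motif.translate(COMPLEMENT)[::-1])
    if start != -1:
        return start, "-"
    return None


if __name__ == "__main__":
    print("primer:", locate(ADAPTOR_A, PRIMER_SEQ9))
    print("nicking site:", locate(ADAPTOR_A, NB_BBVCI_SITE))
```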
The primer binding sites in the first adaptor sequence, and also in other adaptor sequences of the invention, are preferably between about 15 and about 30 nucleotides long and, more preferably, between about 18 and about 25 nucleotides long.
In one embodiment, the first adaptor sequence comprises a marker molecule. The marker molecule may be any molecule which enables identification and/or separation of the first adaptor sequence. Suitable marker molecules are well known to those skilled in the art, for example, biotin or streptavidin.
In one embodiment, the first adaptor sequence has the following sequence: GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC TGGAGAGGCGATCCGACCG (SEQ ID NO. 1); or
The present invention also provides a second adaptor sequence which can be used in conjunction with the first adaptor sequence in subtractive hybridisation to pre-identify cDNAs with different expression between two cell types. The second adaptor sequence comprises a first primer binding site to bind a first primer for PCR amplification of a cDNA attached thereto, for example, following subtractive hybridisation. The first primer binding site of the second adaptor sequence can bind the same primer as the first primer binding site on the first adaptor sequence. Preferably, the second adaptor sequence comprises a second primer binding site for binding a second primer to allow nested PCR. Preferably, the second primer binding site of the second adaptor sequence binds a different primer than the second primer binding site of the first adaptor sequence.
In one embodiment, the second adaptor sequence comprises a marker molecule. The marker molecule may be any molecule which enables identification and/or separation of the second adaptor sequence. This can be used in subtractive hybridisation. Preferably, the second adaptor sequence is between about 20 and about 200 nucleotides in length. More preferably, the second adaptor sequence is between about 20 and about 120 nucleotides in length, more preferably still, the second adaptor sequence is between about 40 and about 100 nucleotides in length, even more preferably, the second adaptor sequence is between about 50 and about 90 nucleotides in length, more preferably still, the second adaptor sequence is between about 60 and about 80 nucleotides in length and, most preferably, the second adaptor sequence is about 70 nucleotides in length.
In one embodiment the second adaptor sequence has the following sequence:
AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCACCGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3).
The present invention also provides a third hairpin adaptor sequence. This can be any suitable sequence which allows the formation of a hairpin structure. The hairpin adaptor sequence can form hairpin structures and allows the production of inverted cDNA repeats using the method described above. In order to do this, the third adaptor sequence comprises two substantially complementary sequences which can bind together to form a stable hairpin structure. A skilled person could readily determine a suitable sequence to form a hairpin structure. As discussed earlier, the two sequences do not need to be entirely complementary. Preferably, each of the two substantially complementary sequences is between about 5 and about 50 nucleotides long, more preferably, between about 10 and about 40 nucleotides long, even more preferably, between about 20 and about 30 nucleotides long and, most preferably, about 25 nucleotides long.
Preferably, the third adaptor sequence is between about 20 and about 200 nucleotides in length. More preferably, the third adaptor sequence is between about 20 and about 120 nucleotides in length, more preferably still, the third adaptor sequence is between about 30 and about 90 nucleotides in length, even more preferably, the third adaptor sequence is between about 40 and about 80 nucleotides in length, more preferably still, the third adaptor sequence is between about 50 and about 70 nucleotides in length and, most preferably, the third adaptor sequence is about 55 nucleotides in length.
Preferably, the third hairpin adaptor sequence comprises two restriction recognition sites for a restriction enzyme. This allows a portion of the third hairpin adaptor sequence to be removed once the cDNA inverted repeat has been produced. In the method of the invention, a suitable amount of the third hairpin adaptor can be removed to allow the formation of functional shRNA molecules as the remaining portion of the third hairpin adaptor sequence forms the hairpin section of the shRNA. Preferably, a portion of the third hairpin adaptor sequence is removed to leave between about 4 and about 15 nucleotides between the cDNA repeats and more preferably, between about 7 and about 11 nucleotides. Accordingly, a skilled person would be able to select a suitable restriction enzyme or enzymes and position the recognition sites to allow the third hairpin adaptor sequence to be processed into a suitable length for producing functional shRNAs. Preferably, the two restriction recognition sites are for the same restriction enzyme.
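Whether a candidate hairpin adaptor carries two recognition sites for the same enzyme can be checked by searching both strands. The Python sketch below does this for SEQ ID NO. 8 (without its two terminal N positions), assuming the commonly published BsgI recognition sequence GTGCAG, which is not stated in this description. The function names are hypothetical, and the sketch does not model the cleavage positions or the length of spacer that remains.

```python
# SEQ ID NO. 8 without its two terminal N positions, written 5'->3'.
HAIRPIN = "TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTT" + "GAA"
BSGI_SITE = "GTGCAG"  # assumed recognition sequence
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_all(seq: str, motif: str) -> list[int]:
    """Return every 0-based start position of motif in seq."""
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def bsgi_sites(seq: str) -> dict[str, list[int]]:
    """Sites read on the given strand and on the opposite strand."""
    opposite = BSGI_SITE.translate(COMPLEMENT)[::-1]
    return {"given strand": find_all(seq, BSGI_SITE),
            "opposite strand": find_all(seq, opposite)}


if __name__ == "__main__":
    print(bsgi_sites(HAIRPIN))
```

Under this assumption the sketch finds one site in each orientation, which is consistent with the preference for two sites recognised by the same enzyme.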
In one embodiment the third hairpin adaptor sequence has the following sequence: TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN (where N is A, G, C, or T) (SEQ ID NO. 8); or
TTCAAGAGAN1N2GCGTTGCACCGGTGCTGCACCGGTGCAGCGCn2n1TCTCTTGAAN3N4 (where N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 is complementary to N2) (SEQ ID NO. 12);
According to a third aspect of the invention, there is provided a kit for preparing an shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the kit comprising a first adaptor sequence as defined above. This kit can be used in the method described above. Preferably, the kit further comprises a second adaptor sequence as defined above for use in subtractive hybridisation. Preferably, the kit further comprises a third hairpin adaptor sequence as defined above.
Preferably, the kit further comprises suitable primers to bind to the primer binding sites on the first adaptor sequence and the second adaptor sequence.
The kit may further comprise other components, for example, chemical reagents, materials for carrying out PCR or restriction enzymes.
According to one embodiment, there is provided a kit for use in the method according to the first aspect of the invention. The kit comprises a polynucleotide sequence comprising the sequence:
GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC TGGAGAGGCGATCCGACCG (SEQ ID NO. 1); and may comprise one or more polynucleotide sequences comprising one or more of sequences:
CGGTCGGATCGC (SEQ ID NO. 2); or
AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCAC CGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3); or
GGAAGACGTAGATGAGC (SEQ ID NO. 4); or
GTATTACCGCACTCACTTGGA (SEQ ID NO. 5); or
CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6); or
CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7); or
CATCGTCCTGGCGTCTGGC (SEQ ID NO. 9); or
TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN (where N is A, G, C, or T) (SEQ ID NO. 8); or
TTCAAGAGAN1N2GCGTTGCACCGGTGCTGCACCGGTGCAGCGCn2n1TCTCTTGAAN3N4 (where N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 is complementary to N2) (SEQ ID NO. 12);
or sequences complementary to any of the above sequences.
The skilled person would readily be able to determine alternative sequences which would enable the method to be performed, subject to the requirements of, for example, the restriction enzymes, primers and marker molecules to be used.
According to a fourth aspect of the invention, there is provided an shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and a second cell type. That is, sequences from one or more genes which are expressed to the same or substantially the same level in the first and second cell types are normalised in the library, while differentially expressed genes are enriched.
The term "enriched for one or more cDNA sequences" indicates that at least one cDNA sequence, corresponding to at least a portion of a gene which is differently expressed in a first and second cell type, is present in the shRNA library more frequently than in a normalised library in which all genes are equally represented, such as a library synthesised from oligonucleotides (e.g., Bernards et al. (2006) Nat. Methods 3 701-6). Preferably, the shRNA library is enriched 2-500-fold, more preferably 50-200-fold, still more preferably 100-300-fold, most preferably about 200-fold for the one or more cDNA sequences. A 200-fold enrichment would indicate that at least one cDNA sequence, corresponding to at least a portion of a gene which is differently expressed in a first and second cell type, is present in the shRNA library 200 times more frequently than in a normalised library in which all genes are equally represented.
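The definition of fold enrichment given above can be expressed as a short calculation. In the Python sketch below, the frequency of a sequence in the library is compared with its frequency in a normalised library in which every gene is equally represented. The function name and all numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def fold_enrichment(clones_for_gene: int, library_size: int,
                    genes_in_normalised_library: int) -> float:
    """Observed frequency divided by the frequency in a normalised library.

    In a normalised library each gene has frequency 1 / number of genes, so
    (clones / library_size) / (1 / genes) = clones * genes / library_size.
    """
    return clones_for_gene * genes_in_normalised_library / library_size


if __name__ == "__main__":
    # Hypothetical: 100 of 10,000 clones target one gene, compared with a
    # normalised library representing 20,000 genes equally.
    print(fold_enrichment(100, 10_000, 20_000))
```

With these hypothetical figures the gene is represented 200 times more frequently than in the normalised library.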
The shRNA library is obtainable by the method of the invention. Preferably, the shRNA library is obtained by the method of the invention.
In one embodiment, a kit may comprise the shRNA library.
Advantageously, with respect to differently expressed genes, which represent potential targets of screening, such a library has a much lower complexity than prior art libraries, such that genes which differ in expression in two different cell types by only a small amount can be selected and, therefore, their effects studied.
According to a fifth aspect of the invention, there is provided a method of determining the effects of reduction in expression of a gene of interest, comprising the use of an shRNA library according to the fourth aspect of the invention.
According to another aspect of the invention, there is provided an isolated polynucleotide sequence comprising any one of sequences:
GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGCTGGAGAGGCGATCCGACCG (SEQ ID NO. 1); or
CGGTCGGATCGC (SEQ ID NO. 2); or
AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCACCGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3); or
GGAAGACGTAGATGAGC (SEQ ID NO. 4); or
GTATTACCGCACTCACTTGGA (SEQ ID NO. 5); or
CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6); or
CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7); or
CATCGTCCTGGCGTCTGGC (SEQ ID NO. 9); or
TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN (where N is A, G, C, or T) (SEQ ID NO. 8); or
TTCAAGAGAN1N2GCGTTGCACCGGTGCTGCACCGGTGCAGCGCn2n1TCTCTTGAAN3N4 (where N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 is complementary to N2) (SEQ ID NO. 12);
or sequences complementary thereto.
The skilled person would readily be able to determine alternative sequences which would enable the method to be performed, subject to the requirements of, for example, the restriction enzymes, primers and marker molecules to be used.
According to the pre-identification step of the invention, there is provided a method of identifying at least one gene which is differently expressed in a first and a second cell type, the method comprising the steps of:
a. carrying out a restriction digest reaction of cDNA from the first cell type using a first restriction enzyme;
b. ligating a first adaptor DNA sequence, comprising a first primer recognition sequence and a second primer recognition sequence and a marker molecule, to DNA from step (a);
c. ligating a second adaptor DNA sequence comprising a first primer recognition sequence and a third primer recognition sequence to DNA from step (a);
d. carrying out a restriction digest reaction of cDNA from the second cell type using the first restriction enzyme;
e. mixing the denatured products of step (b) with denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step;
f. mixing the denatured products of step (c) with denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step;
g. mixing the denatured products of steps (e) and (f) in the presence of a size exclusion agent in an annealing step, wherein the concentration of the size exclusion agent is higher than the concentration of size exclusion agent in steps (e) and (f);
h. separating and isolating DNA carrying the first adaptor sequence from DNA carrying only the second adaptor sequence and/or no adaptor sequence, or separating and isolating DNA carrying the second adaptor sequence from DNA carrying only the first adaptor sequence and/or no adaptor sequence;
i. carrying out a nested PCR reaction using the separated DNA from (h) as template DNA and including the use of a first primer which recognises the first primer recognition sequence and, subsequently, a second primer which recognises the second primer recognition sequence and a third primer which recognises the third primer recognition sequence; and
j. separating and isolating the product(s) of the reaction in step (i), said product being at least a fragment of a gene which is expressed at a lower level in the second cell type than in the first cell type.
Preferably, the concentration of size exclusion agent in steps (e) and (f), if present, is 1-10%. Preferably, the concentration of size exclusion agent in step (g) is 11-30%.
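The concentration requirements of steps (e) to (g) can be summarised as a simple consistency check. The Python sketch below is illustrative only: the class and field names are hypothetical, a value of zero is taken to mean that no size exclusion agent is used in steps (e) and (f), and the ranges are the preferred ranges stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridisationPlan:
    first_round_percent: float   # size exclusion agent in steps (e) and (f)
    second_round_percent: float  # size exclusion agent in step (g)

    def problems(self) -> list[str]:
        """Return a list of deviations from the stated conditions."""
        issues = []
        # The agent is optional in steps (e)/(f); 0 means it is absent.
        if self.first_round_percent and not 1 <= self.first_round_percent <= 10:
            issues.append("steps (e)/(f) outside the preferred 1-10%")
        if not 11 <= self.second_round_percent <= 30:
            issues.append("step (g) outside the preferred 11-30%")
        if self.second_round_percent <= self.first_round_percent:
            issues.append("step (g) must use a higher concentration")
        return issues


if __name__ == "__main__":
    print(HybridisationPlan(6.0, 17.5).problems())
    print(HybridisationPlan(12.0, 10.0).problems())
```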
The first and second adaptor DNA sequences may have the nucleotide sequence: GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGCTGGAGAGGCGATCCGACCG (SEQ ID NO. 1), which may optionally be hybridised to the sequence CGGTCGGATCGC (SEQ ID NO. 2); or
AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCACCGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3), which may optionally be hybridised to the sequence GGAAGACGTAGATGAGC (SEQ ID NO. 4).
The first and second adaptor sequences are preferably different to one another.
The first primer may have the nucleotide sequence: GTATTACCGCACTCACTTGGA (SEQ ID NO. 5).
The second primer may have the nucleotide sequence: CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6).
The third primer may have the nucleotide sequence: CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7).
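The relationship between the three primers and the two adaptors can be verified directly from the sequences. The Python sketch below searches SEQ ID NOs. 1 and 3 for the primer sequences of SEQ ID NOs. 5 to 7; the function name and dictionary keys are hypothetical.

```python
# Adaptor sequences as listed above, written 5'->3'.
ADAPTOR_SEQ1 = ("GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGC"
                "TGGAGAGGCGATCCGACCG")
ADAPTOR_SEQ3 = ("AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCAC"
                "CGTCACCGCATAGCTCATCTACGTCTTCC")

PRIMERS = {
    "primer 1": "GTATTACCGCACTCACTTGGA",    # SEQ ID NO. 5
    "primer 2": "CATCGTCCTGGCGTCTGGCT",     # SEQ ID NO. 6
    "primer 3": "CTTCTGTCACCGTCACCGCATAG",  # SEQ ID NO. 7
}


def primer_sites(adaptor: str, primers: dict[str, str]) -> dict[str, int]:
    """Map each primer found in the adaptor to its 0-based start position."""
    return {name: adaptor.find(seq)
            for name, seq in primers.items() if seq in adaptor}


if __name__ == "__main__":
    print("SEQ ID NO. 1:", primer_sites(ADAPTOR_SEQ1, PRIMERS))
    print("SEQ ID NO. 3:", primer_sites(ADAPTOR_SEQ3, PRIMERS))
```

The first primer sequence occurs in both adaptors, whereas the second occurs only in SEQ ID NO. 1 and the third only in SEQ ID NO. 3, as required for the nested PCR of step (i).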
Preferably, the size exclusion agent is PEG.
The first restriction enzyme is preferably an enzyme having a recognition site an average of once per 250 nucleotides in a typical DNA sequence, most preferably AluI. Preferably, steps (e) and (f) of the method each continue for between 6 and 100 hours, more preferably between 30 and 50 hours, most preferably about 45 hours. In a further preference, step (g) of the method continues for about 24 hours. Preferably, the size exclusion agent (for example, PEG) is present in steps (e) and (f) at 4-8%, more preferably at about 6% and in step (g) at 15-20%, more preferably at about 17.5%. The separation of DNA in step (h) may be carried out by any of the means well known to the skilled person, for example, by gel electrophoresis, band excision and purification, or using the marker molecule present on the first adaptor.
Brief Description Of The Figures
Embodiments of the invention will now be shown, by way of example only, with reference to the accompanying Figures 1-6 in which:
Figure 1 shows a schematic representation of the PCR based subtractive hybridisation;
Figure 2a shows normalisation and enrichment of cDNA fragments during the first hybridisation reaction;
Figure 2b shows the effect of polyethylene glycol on Hyg enrichment;
Figure 3 shows the effect of the amount of driver cDNA on the efficiency of subtractive hybridisation;
Figure 4 shows a schematic representation of the procedure to process DNA selected by subtractive hybridisation into an shRNA library using restriction sites on Adaptor A;
Figure 5 shows the difference in gene expression between cells of a MEF cell line either grown on the surface of a culture dish or arrested in methyl cellulose; and
Figure 6 is a comparison of the abundance of differentially expressed genes as would be found in a completely normalised library (A), and in the shRNA library of the invention, as determined by sequencing (B); the figure shows the bins of differentially expressed genes, sequences which are not annotated as genes and genes which are not expressed in the cell lines under study.
Examples
Oligonucleotides and cDNA
All oligonucleotides were obtained from Sigma.
SH (subtractive hybridisation) Adaptor A
GTATTACCGCACTCACTTGGAGCATCGTCCTGGCGTCTGGCTGAGGGTAGCTGGAGAGGCGATCCGACCG (SEQ ID NO. 1)
(Primer 1 recognition site is bold; primer 2 recognition site is italicised)
Antisense SH Adaptor A
CGGTCGGATCGC (SEQ ID NO. 2)
SH Adaptor B
Biotin-AAAAAAAAAAAAAAAAAAAGTATTACCGCACTCACTTGGACTTCTGTCACCGTCACCGCATAGCTCATCTACGTCTTCC (SEQ ID NO. 3)
(Primer 1 recognition site is bold; primer 3 recognition site is italicised)
Antisense SH Adaptor B
GGAAGACGTAGATGAGC (SEQ ID NO. 4)
SH Primer 1
GTATTACCGCACTCACTTGGA (SEQ ID NO. 5)
SH Primer 2
Biotin-CATCGTCCTGGCGTCTGGCT (SEQ ID NO. 6)
SH Primer 3
CTTCTGTCACCGTCACCGCATAG (SEQ ID NO. 7)
SH Primer 4
CATCGTCCTGGCGTCTGGC (SEQ ID NO. 9)
Looped Adaptor C
TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN (where N is A, G, C, or T) (SEQ ID NO. 8); or
TTCAAGAGAN1N2GCGTTGCACCGGTGCTGCACCGGTGCAGCGCn2n1TCTCTTGAAN3N4 (where N1, N2, N3, N4, n1 and n2 are independently selected from A, G, C, and T, and n1 is complementary to N1 and n2 is complementary to N2) (SEQ ID NO. 12);
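The general Looped Adaptor C (SEQ ID NO. 12) can be read as a template in which n1 and n2 are fixed once N1 and N2 are chosen. The sketch below, with a hypothetical function name and an "N" placeholder for the degenerate 3' positions, builds a member of that family and shows that choosing N1 = A and N2 = C reproduces SEQ ID NO. 8 as transcribed above.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Fixed segments of SEQ ID NO. 12 as transcribed from the listing above.
LEFT_ARM = "TTCAAGAGA"
CORE = "GCGTTGCACCGGTGCTGCACCGGTGCAGCGC"
RIGHT_ARM = "TCTCTTGAA"


def looped_adaptor_c(n1_upper, n2_upper, n3="N", n4="N"):
    """Build a Looped Adaptor C variant from chosen N1 and N2.

    n1 and n2 of the template are derived as the complements of N1 and N2.
    N3 and N4 default to the degenerate placeholder "N".
    """
    if n1_upper not in COMPLEMENT or n2_upper not in COMPLEMENT:
        raise ValueError("N1 and N2 must each be one of A, C, G, T")
    n1_lower = COMPLEMENT[n1_upper]
    n2_lower = COMPLEMENT[n2_upper]
    return (LEFT_ARM + n1_upper + n2_upper + CORE
            + n2_lower + n1_lower + RIGHT_ARM + n3 + n4)


SEQ_ID_NO_8 = "TTCAAGAGAACGCGTTGCACCGGTGCTGCACCGGTGCAGCGCGTTCTCTTGAANN"

if __name__ == "__main__":
    print(looped_adaptor_c("A", "C") == SEQ_ID_NO_8)
```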
To adapt the standard pRetroSuper vector for insertion of inserts produced by the method of the invention, the following sequence is inserted between the BglII and HindIII sites in the pRetroSuper vector, to allow insertion of processed fragments after BsgI digestion:
Adaptor for pRetro Super (sense)
GATCCCCGCTGCCGTGCAGTTAACCTGCACCCGAGCTTTTTGGAAA (SEQ ID NO. 10)
Adaptor for pRetro Super (antisense)
AGCTTTTCCAAAAAGCTCGGGTGCAGGTTAACTGCACGGCAGCGGG (SEQ ID NO. 11)
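The two pRetro Super adaptor strands are designed to anneal into a duplex with a four-nucleotide 5' overhang at each end. The following sketch checks this for the sequences as transcribed in SEQ ID NO. 10 and 11. The BsgI recognition sequence GTGCAG used here is an assumption taken from standard enzyme catalogues rather than from this specification, and the function names are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


SENSE = "GATCCCCGCTGCCGTGCAGTTAACCTGCACCCGAGCTTTTTGGAAA"      # SEQ ID NO. 10
ANTISENSE = "AGCTTTTCCAAAAAGCTCGGGTGCAGGTTAACTGCACGGCAGCGGG"  # SEQ ID NO. 11
BSGI_SITE = "GTGCAG"  # assumed recognition sequence


def anneal(sense, antisense, overhang=4):
    """Return (sense 5' overhang, duplex core, antisense 5' overhang).

    Raises ValueError if the strands are not complementary outside the
    overhangs.
    """
    core = sense[overhang:]
    if reverse_complement(antisense[overhang:]) != core:
        raise ValueError("strands do not anneal with the given overhang")
    return sense[:overhang], core, antisense[:overhang]


def count_sites_both_strands(duplex_top, site):
    """Count occurrences of a site on the top and bottom strands."""
    return duplex_top.count(site) + duplex_top.count(reverse_complement(site))


if __name__ == "__main__":
    left, core, right = anneal(SENSE, ANTISENSE)
    print(left, len(core), right)
    print(count_sites_both_strands(core, BSGI_SITE))
```

Under these assumptions the overhangs are GATC and AGCT, which are compatible with BglII and HindIII cut ends respectively, and the duplex carries one BsgI site in each orientation.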
cDNA synthesis tester and driver cDNA
RNA was isolated from cells using RNeasy columns (Qiagen). cDNA was produced according to the SMART cDNA synthesis method (Clontech). cDNA was amplified using the Expand polymerase (Roche) during a maximum of 20 PCR cycles in 45 separate 50μl reactions. After PCR, the cDNA was precipitated before purification on a Microcon® YM100 column (Millipore).
Subtractive hybridisation
Adaptors A and B (Sigma) allow subtractive hybridisation and subsequent shRNA library construction. Adaptor A contains sites for library processing and B is biotinylated to allow pull down using beads. 1 μg of tester cDNA preparation was AluI digested and ligated in 20μl buffer 4 (New England BioLabs), supplemented with ATP (1mM), and either adaptor A or B (20μM). After incubation with AluI (5U; 1 hour at 37°C), ligase (5U) was added to the reaction, which was incubated overnight in a thermocycler cycling between 30°C (10 sec) and 10°C (10 sec). After heat inactivation of the enzymes (20 min at 80°C), the samples were filtered twice to remove excess adaptors and concentrated (20ng/μl tester A and B) using a Microcon® YM 100 column.
2μg driver cDNA was digested overnight in 40 μl buffer 4 (NEB) using AluI (10U). After heat inactivation of the enzyme (20 min at 80°C), the sample was concentrated (20ng/μl tester A and B, and 200ng/μl driver) using a Microcon® YM 10 column.
For subtractive hybridisation, 10ng of each of the Adaptor A- and Adaptor B-ligated tester fragments was mixed separately with 700ng of driver fragments. In a volume of 5μl, containing PEG (6%; MW 6/8000), NaCl (0.5mM), HEPES (100mM), and covered by mineral oil, the fragments were melted using a thermocycler (3 min at 95°C) and subsequently allowed to hybridise for 45 hours at 68°C. Then, the Adaptor A- and B-ligated samples and 1.25μl PEG 55% were thoroughly mixed. Hybridisation was allowed to continue for 24 hours, before the reaction was diluted by the addition of 250μl of dilution buffer (50mM NaCl, 100mM HEPES) and kept at 68°C for 10 more minutes. The reaction was removed from the thermocycler and added to 100ng of streptavidin-coated magnetic beads (Dynabeads) in 250μl dilution buffer. After 10 minutes at room temperature (while keeping the beads in suspension), the biotinylated Adaptor B ligated fragments were pulled down from the mixture on a magnet and washed twice with dilution buffer (300μl, 5 min, 68°C). The beads were resuspended in 100μl pfu polymerase mix (Invitrogen) containing only dNTPs (10mM each) and heated to 75°C before the pfu polymerase was added to hot-start the reaction and fill in the adaptor sequences (5 min). The reaction was stopped by the addition of EDTA and the beads were recovered on a magnet. After cooling, they were washed twice with NaOH (100nM) and once in tris buffer (10mM). After recovery on a magnet, half of the beads were used as a template in a subsequent nested PCR using the pfu polymerase.
In a nested PCR procedure, the first PCR reaction (1000μl pfu PCR mix divided over 40 separate 25 μl reactions) used primer 1 and consisted of 30 cycles (15 sec 94°C; 30 sec 64°C; 90 sec 94°C). After mixing of the duplicate reactions, 2μl was used as a template in the secondary PCR (1000 μl pfu PCR mix divided over 40 separate 25 μl reactions) using primers 2 and 3 and consisting of 16 cycles (15 sec 94°C; 30 sec 64°C; 90 sec 94°C). Primer 1 was biotinylated for further applications. After PCR, the subtracted library was precipitated before purification on a Microcon® YM100 column.
Subtractive hybridisation using isogenic Hyg+ and Hyg- cell lines
Enrichment of differentially expressed cDNA sequences for library construction is accomplished by PCR based subtractive hybridisation, as outlined above and in Figures 1 and 2.
mRNA sequences overexpressed in one transcriptome (tester) compared to another (driver) at a ratio of t/dr are selected. First, 5' ends (but not 3' ends) of AluI digested tester cDNA are ligated to either Adaptor A (step 1 of Figure 1) or B (step 2). Both Adaptor A and B ligated tester fragments are mixed separately with 75 fold excess AluI digested driver cDNA (step 3), melted and allowed to hybridise. After the first hybridisation, a normalised number (n) of each fragment remains single stranded. The share of remaining single stranded fragments which carry an adaptor is proportional to their abundance in the tester (t) and driver (dr) cDNA. The equalised number n of single strands present in the two samples is mixed and, after addition of PEG (step 4), new hybrids can be formed during the second hybridisation. The number of hybrids which carry two different adaptors at their ends is then proportional to their relative expression levels in the tester and driver cDNA. Subsequently, hybrids containing the adaptor B are fished out from the mixture (step 5) and, after addition of polymerase to fill in the complementary adaptor strand, can be exponentially amplified if they carry the adaptor A at their other end (step 6). Amplification of hybrids carrying two identical adaptors is suppressed due to hairpin formation which prevents primer annealing. Hybrids which carry a single adaptor are only amplified linearly.
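The selection rules in the paragraph above can be summarised as a small decision function. This is only an illustrative sketch: the labels "A", "B" and None for the adaptor at each end of a hybrid, and the function name, are assumptions introduced here.

```python
def amplification_mode(end_1, end_2):
    """Classify a hybrid by the adaptors at its two ends.

    Each end is "A", "B", or None (no adaptor, i.e. a driver strand).
    Returns one of "exponential", "suppressed", "linear", "none".
    """
    for end in (end_1, end_2):
        if end not in ("A", "B", None):
            raise ValueError("each end must be 'A', 'B' or None")
    if {end_1, end_2} == {"A", "B"}:
        return "exponential"   # two different adaptors: nested PCR proceeds
    if end_1 is not None and end_1 == end_2:
        return "suppressed"    # identical adaptors: hairpin blocks priming
    if end_1 is not None or end_2 is not None:
        return "linear"        # single adaptor: one primer site only
    return "none"              # driver-driver hybrid: no primer site


if __name__ == "__main__":
    for ends in [("A", "B"), ("A", "A"), ("B", None), (None, None)]:
        print(ends, amplification_mode(*ends))
```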
Two Adaptors A and B were designed to allow subtractive hybridisation and subsequent processing of selected fragments into shRNA libraries. The Adaptor A contained several restriction sites to process attached sequences into an shRNA library. The subtractive hybridisation procedure was optimised using a pair of isogenic embryonic stem cells differing in their hyg expression. In separate reactions, the Adaptors A and B were ligated to AluI digested cDNA obtained from cell line H+, which expresses the hyg gene (called tester cDNA). During a first hybridisation, the Adaptor A- and B-ligated samples were allowed to anneal separately to excess cDNA obtained from cell line H-, which lacks the hyg gene (driver cDNA).
Over time, a normalized number of each of the different AluI fragments present in the mixture is expected to remain single stranded due to the second order kinetics of re-hybridisation, which causes the more abundant fragments to re-anneal faster until an equilibrium concentration is reached for each different fragment. In addition, for each of the different fragments, the proportion of remaining single strands carrying either one of the adaptors A or B depends on their relative abundance in the cell line H+ and H- cDNA. Progressively more of the remaining single strands will carry an adaptor when the expression of the sequence is reduced in cell line H- relative to cell line H+.
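The normalising effect of second order kinetics can be illustrated with an idealised model. The sketch below assumes that each fragment re-anneals only with its own complement at rate constant k, so that the single-stranded concentration follows C(t) = C0 / (1 + k·C0·t). It also assumes, as one simple reading of the paragraph above, that the fraction of remaining strands carrying an adaptor is the tester share t / (t + excess × dr). The rate constant, the units and the function names are illustrative and are not taken from the specification.

```python
def single_stranded_remaining(c0, k, t):
    """Single-stranded concentration after time t under ideal
    second-order re-annealing, dC/dt = -k * C**2."""
    return c0 / (1.0 + k * c0 * t)


def adaptor_fraction(tester, driver, driver_excess):
    """Assumed share of remaining single strands that carry an adaptor:
    tester copies divided by tester plus (excess-scaled) driver copies."""
    return tester / (tester + driver_excess * driver)


if __name__ == "__main__":
    k = 1.0
    for t in (0.0, 1.0, 100.0):
        abundant = single_stranded_remaining(100.0, k, t)
        rare = single_stranded_remaining(1.0, k, t)
        print(t, abundant / rare)  # ratio shrinks towards 1 as t grows
    print(adaptor_fraction(1.0, 1.0, 75))   # equally expressed sequence
    print(adaptor_fraction(1.0, 0.0, 75))   # sequence absent from driver
```

In this model a 100-fold difference in starting abundance shrinks to about 1% after long hybridisation, while a sequence absent from the driver keeps all of its adaptor-bearing strands.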
After equilibrium was reached, the Adaptor A and B ligated samples were mixed; part of the remaining single strands were then forced to anneal during a second hybridisation by increasing the polyethylene glycol concentration to raise their effective concentration. Because the strands were first equalised, a normalised number of hybrids were formed for each of the AluI fragments during the second hybridisation, including some hybrids which carry both an Adaptor A and B at their ends. Again, their number depended on the relative expression levels in the two cell lines, and was strongly enriched for sequences which are under-represented in cell line H-. Therefore, the hybrids carrying two different adaptors represent putative target sequences desired to constitute the shRNA library, which are selected in two ways.
First, after immobilising adaptor B on magnetic beads, all of the adaptor A strands, which carry the restriction sites necessary to process attached cDNA sequences into shRNA libraries, were washed away unless they were hybridised to an Adaptor B strand. Subsequently, to produce a subtracted library, only fragments carrying both adaptors A and B at their ends were exponentially amplified by PCR, using nested primers located in the two different adaptors, further restricting the resulting shRNA library to the desired sequences.
Optimisation of subtractive hybridisation procedure
To optimise subtractive hybridisation, the time required to reach equilibrium for optimal normalisation during the first hybridisation was first determined.
Adaptor A- and B-ligated tester cDNA H+ and driver cDNA H- were mixed 1:35, and the first hybridisation was allowed to proceed between 6 and 50 hours, before the two Adaptor A and B ligated samples were mixed and the PEG concentration was raised from 6% to 22%. After the second hybridisation (24 hours), the subtracted library was amplified by PCR, and the presence of a panel of 10 genes, including the differently expressed hyg gene and abundant genes (such as actin, tubulin, E cadherin, and msh6) among the selected cDNA fragments was measured using dotblots. The results are shown in Figure 2a. When the first hybridisation was allowed to proceed for 45 hours, those genes expressed equally in the two cell lines H+ and H- were reduced to normalised low levels in the subtracted library. Even the levels of the abundantly expressed sequences were effectively reduced. In contrast, the differentially expressed hyg gene was strongly enriched and exceeded the other sequences approximately 200 fold in the subtracted library. The addition of PEG during the second hybridisation increases the efficiency of the subtractive hybridisation (Fig. 2b).
In a second experiment, the effect of the amount of driver cDNA added in excess on the efficiency of subtractive hybridisation was determined.
The procedure was followed using 15, 30, 45, or 60 fold excess of driver (H-) cDNA compared to tester (H+) cDNA for subtractive hybridisation. As expected, the efficiency of subtractive hybridisation was increased when more driver cDNA was added to the reaction (Figure 3). When the first hybridisation was allowed to proceed for only 24 hours, normalisation and enrichment rapidly improved when increasing amounts of driver cDNA were added to the reaction.
Construction of shRNA libraries
Restriction sites on the Adaptor A were used to process the selected sequences into inverted repeats, which were ligated into a pRetroSuper vector. 10 μg of the amplified subtracted library was processed in 100μl buffer 4 (New England BioLabs), containing SAM (1mM) and the Looped Adaptor C. One hour after addition of MmeI (5U), ligase was added overnight to form hairpins by Looped Adaptor C ligation. After inactivation of these enzymes (80°C, 15 min), Nb.BbvCI (5U) was added for 1 hour to nick the hairpin. On ice, streptavidin-coated magnetic beads (100ng) were used to pull down the biotinylated nicked hairpin. The supernatant was then replaced by 100 μl buffer 2 (New England BioLabs) containing primer 4 (100nM) and dNTPs (10mM each), and the sample was heated (70°C, 20 min) to inactivate the enzyme and release the nicked hairpins from the beads. Exo- Klenow polymerase (10U) was added to elongate primer 4 (1 hour at 37°C) and form an inverted repeat. BpmI (10U) digestion (1 hour at 37°C) removed the duplicated adaptor A ends, generating CT overhangs on both sides. The sample was separated on a 4.5% agarose gel and the 91bp inverted repeat was isolated from gel by freeze squeezing and phenol extracted. After ligation into a modified pRetroSuper vector, the plasmids were used to transform bacteria. At least 40000 individual small colonies were pooled and a maxiprep (Genomed) was performed to isolate plasmids. 1μg was digested with BsgI to remove an obsolete part of the Looped Adaptor C, and the linearised 6834bp vector fragment was isolated from gel (Qiagen). After religation of the vector, the plasmids were used to transform bacteria. The plasmids from 40000 individual small colonies were pooled and a maxiprep (Genomed) was performed. An aliquot of the maxiprep was used to transform E. coli in order to isolate and sequence individual plasmids. Sequenced inserts were compared to annotated sequences (Genbank).
The library of pRETRO SUPER plasmids was calcium transfected into Phoenix cells to produce viruses.
The steps to process the selected cDNA sequences into an shRNA library are shown in Figure 4 and are outlined in Table 1 :
Table 1: outline of experimental steps shown in Figure 4 (the table is an image in the original; the recoverable entries are MmeI digestion and ligation with looped adaptor)
Explaining now in more detail, digestion with MmeI, which cuts away from its recognition site in adaptor A, leaves around 18 base pairs of cDNA attached to the adaptor A. This sequence will constitute the sequence which is expressed in shRNA in the shRNA vector. The cDNA end is ligated to the looped adaptor C, forming a hairpin structure. The looped adaptor C carries a nine nucleotide sequence which ultimately ends up in between the inverted cDNA repeats in the final shRNA vector. In the hairpin consisting of adaptor A, cDNA and adaptor C, a nick is created in adaptor A using Nb.BbvCI. Since a biotinylated primer was used during PCR amplification of the subtracted library, the nicked hairpins can be immobilised on streptavidin coated beads. Due to the nick, part of the hairpin is released from the bead by heating, leaving part of adaptor A attached to the bead. Primer 4, which binds to part of the exposed sequence of adaptor A, is added to restore the nicked hairpin. Using the nick as an entry site for DNA synthesis by the Klenow fragment of DNA polymerase I, a complementary strand is synthesized, starting from primer 4, to form a double stranded inverted repeat. BpmI, which cuts 16 bp away from its restriction recognition site present in adaptor A, is used to remove the now duplicated adaptor A ends. The resulting inverted repeats of cDNA and adaptor C sequence are purified by gel electrophoresis. Subsequently, they are ligated behind an appropriate promoter sequence, preferably the H1 promoter, in a vector of choice, preferably pRETRO-SUPER. Finally, most of the adaptor C sequence is removed using BsgI, which recognizes two restriction sites in adaptor C, to yield a pRETRO-Super shRNA vector. In the vector, inverted cDNA sequences are separated by the nine nucleotide sequence originally present in adaptor C.
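The cut positions described above can be related to the Adaptor A sequence (SEQ ID NO. 1) with a short calculation. The sketch below is hedged: the recognition sequences (MmeI TCCGAC as an instance of TCCRAC, BpmI CTGGAG, Nb.BbvCI CCTCAGC) and the MmeI top-strand offset of 20 nucleotides are assumptions taken from standard enzyme catalogues, whereas the BpmI offset of 16 is stated in the text. Function and constant names are illustrative.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


ADAPTOR_A = (
    "GTATTACCGCACTCACTTGGA"          # primer 1 recognition site
    "G"
    "CATCGTCCTGGCGTCTGGCT"           # primer 2 recognition site
    "GAGGGTAGCTGGAGAGGCGATCCGACCG"   # region carrying the processing sites
)


def insert_bases_before_cut(adaptor, site, top_strand_offset):
    """Number of cDNA bases between the adaptor 3' end and the top-strand cut.

    The enzyme is assumed to cut `top_strand_offset` bases downstream of the
    last base of its recognition site.
    """
    start = adaptor.find(site)
    if start == -1:
        raise ValueError(f"site {site} not found in adaptor")
    distance_to_adaptor_end = len(adaptor) - (start + len(site))
    return top_strand_offset - distance_to_adaptor_end


MMEI_INSERT_BASES = insert_bases_before_cut(ADAPTOR_A, "TCCGAC", 20)  # 18
BPMI_INSERT_BASES = insert_bases_before_cut(ADAPTOR_A, "CTGGAG", 16)  # 2
NICKING_SITE_PRESENT = reverse_complement("CCTCAGC") in ADAPTOR_A     # True

if __name__ == "__main__":
    print(MMEI_INSERT_BASES, BPMI_INSERT_BASES, NICKING_SITE_PRESENT)
```

Under these assumptions MmeI retains 18 bases of cDNA on the top strand, in line with the "around 18 base pairs" stated above, and BpmI cuts two bases into the cDNA, which for an AluI-derived fragment are the CT of the half site.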
The skilled person would readily be able to determine suitable alternative adaptors, primers, vectors and/or restriction enzymes for use in the method of the invention.
An shRNA library was produced using a subtracted library generated from cell line H+ and H- using 60 fold excess of cell line H- as a driver. To validate the resulting shRNA library, the inserted sequences in 88 shRNA plasmids were verified by sequencing. 81 of the sequences corresponded to inverted repeats capable of expressing genuine shRNA molecules. The inserted sequences matched perfectly to mRNA sequences found in GenBank. They all followed AluI sites and were distributed over the full length of the corresponding mRNA. It was found that one of the 88 vectors targeted the hyg gene, the result of efficient enrichment by subtractive hybridization.
Validation of subtracted libraries.
The composition of the subtracted libraries was determined by microarray, dotblot analysis or sequencing. Microarrays were performed by the Netherlands Cancer Institute microarray facility. For dotblots, a panel of 15 genes was spotted onto a membrane and hybridised with radiolabeled subtracted libraries. The abundance of the panel genes in the libraries was quantified by phosphoimaging (Figures 2 and 3). Sequencing was performed at the Netherlands Cancer Institute sequencing facility.
Production of shRNA library from cells either grown on a culture dish or arrested in methyl cellulose
To test the method of the invention, an shRNA library was produced, which was enriched for tumour suppressor sequences that are induced when cells arrest in methyl cellulose. Subsequently, a loss of function screen for colony growth in methyl cellulose was performed. For microarray analysis and library construction, cDNA was produced from a mouse embryonic fibroblast (MEF) cell line expressing RASV12 and lacking the retinoblastoma proteins Rb and p107, which was either grown on a culture dish or arrested in methyl cellulose to induce tumor suppressor genes. Subtractive hybridization was performed using cDNA derived from the arrested cells as the tester and from the growing cells as a driver to enrich for gene sequences induced by methyl cellulose.
Microarray analysis of cells residing in methyl cellulose revealed numerous differences in gene expression, including many genes which were induced in the arrested cells, but not in the growing cells. Figure 5 shows the expression profile of the growing and arrested cells as determined by microarray analysis. Black dots represent genes expressed equally in the two culture situations. Grey dots represent genes either overexpressed (negative M value) or repressed (positive M value) by culturing the cells in methyl cellulose. The 21331 expressed genes were divided into 5 bins: more than 4-fold induction (47 genes), 4-3 fold (83 genes), 3-2 fold (425 genes), 1-2 fold (10490 genes) or reduced expression (10440 genes). In Figure 6, the representation of the different bins as would be found in a normalised library containing all 25000 annotated genes (i.e., genes that encode a protein) (A) is compared to the representation of the bins in the library made according to the invention (B). The genes induced mostly by methyl cellulose (>4-fold induction) were enriched up to 51 fold.
Next, untransformed RASV12 expressing Rb/p107 -/- cells either mock infected or infected with the shRNA library of the invention were inoculated in methyl cellulose.
Although RASV12 expressing Rb/p107 -/- cells normally arrest under these conditions, expression of the library of the invention caused a small number of colonies to form, indicating that this system can be used to screen for tumor suppressor sequences.
Sequencing of vectors indicated that they target both coding and non-coding genes (Table 2).
Table 2 (the table is an image in the original and is not reproduced here)
Conclusions
Expression profiling using microarrays may reveal many genes which are differentially expressed between two different cellular phenotypes. However, it usually fails to identify the genes responsible for bringing about this phenotypic change, because it is impossible to distinguish between cause and effect. To identify the genes responsible for a phenotypic transition, functional genetic screens may be performed. In recent years, loss of function screening using viral RNAi libraries has proved to be a powerful method to study gene function. Loss of function screens are facilitated by reducing the complexity of the RNAi library. Since fewer cells require testing to cover the variety of library sequences, this benefits the ratio between truly positive "hits" and obligatory background. In addition, it becomes easier to perform screens in a single well format. Finally, when the complexity of the library becomes low enough, it may be possible to pick up interactions between genes in screens.
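The link between library complexity and the number of cells that must be screened can be sketched with a simple sampling model. This is an illustration under stated assumptions, not a calculation from the specification: every vector is assumed to be equally represented, each cell is assumed to receive one vector at random, and the chance that a given vector is present at least once among the screened cells is approximated by a Poisson term. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def fraction_represented(library_complexity, cells_screened):
    """Expected fraction of distinct vectors present at least once
    (Poisson approximation, equal representation assumed)."""
    return 1.0 - math.exp(-cells_screened / library_complexity)


def cells_for_coverage(library_complexity, target_fraction):
    """Cells needed so that the expected represented fraction reaches
    the target (0 < target_fraction < 1)."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be between 0 and 1")
    return math.ceil(-library_complexity * math.log(1.0 - target_fraction))


if __name__ == "__main__":
    for complexity in (1_000, 10_000, 100_000):
        print(complexity, cells_for_coverage(complexity, 0.95))
```

In this model the number of cells required grows in direct proportion to library complexity, which is the sense in which a less complex, enriched library reduces the scale of a screen.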
To reduce their complexity, RNAi libraries for loss-of-function screening should ideally be limited to genes which are suspected to suppress the particular selectable phenotype which is the focus of the screen, such as the genes which are actually differentially expressed during the phenotypic change.
Although post-translational modifications of proteins may play an important role in bringing about a phenotypic change, it may be beneficial to use shRNA libraries as disclosed here, if there are accompanying changes in expression patterns, for two reasons. First, the subpopulation of genes which are regulated during a phenotypic transition are strongly suspected to play an underlying role. At the expense of disregarding others, these candidate genes are strongly enriched to lower the complexity of libraries of the invention. Second and equally important, because the sequences selected in the library were regulated during the phenotypical transition, they actually play a role in the physiological context of cells.
Using the method of the invention, more shRNA vectors can be generated per gene compared to conventional synthetic shRNA libraries, depending on the number of AluI sites present in a gene, which averages 20. This provides an internal control for off-target effects and increases the chance to come across vectors that give the right degree of functionality. In addition, libraries may be produced for every organism. In addition to pRetro Super, other shRNA delivering vector backbones, including retroviral or lentiviral vectors, vectors encoding inducible RNAi or microRNA primary transcripts, can be used with only a minor adaptation to meet experimental demands.
An important advantage is the simplicity of the method. After isolation of RNA and the production of cDNA, it requires only three weeks to perform the subtractive hybridisation and to produce tailor-made libraries.

Claims
1. A method of preparing a shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the method comprising the steps of: pre-identifying said one or more cDNA sequences using a first adaptor DNA sequence; and processing said one or more cDNA sequences using the first adaptor DNA sequence.
2. A method according to claim 1 wherein the first and second cell types are phenotypically different from one another.
3. A method according to claim 1 or 2 wherein the first and second cell types are different cell strains.
4. A method according to claim 1 or 2 wherein the first and second cell types are the same cell strain.
5. A method according to any preceding claim, wherein the step of pre-identifying said one or more cDNA sequences comprises using subtractive hybridisation using the first adaptor DNA sequence and a second adaptor DNA sequence.
6. A method according to any preceding claim, wherein said one or more cDNA sequences are processed to produce inverted repeat sequences encoding for shRNA molecules directed to the at least one gene which is differently expressed in the first and second cell type.
7. A method according to any preceding claim, wherein said one or more cDNA sequences are processed using: the first adaptor DNA sequence which comprises a recognition site for a first restriction enzyme; and a third adaptor DNA sequence to form hairpin structures.
8. A method according to any preceding claim, wherein the step of pre-identifying said one or more cDNA sequences comprises the steps of: a. carrying out a restriction digest reaction of cDNA from the first cell type using a first restriction enzyme; b. ligating a first adaptor DNA sequence, comprising a first primer recognition sequence and a second primer recognition sequence and a marker molecule, to DNA from step (a);
c. ligating a second adaptor DNA sequence comprising a first primer recognition sequence and a third primer recognition sequence to DNA from step (a); d. carrying out a restriction digest reaction of cDNA from the second cell type using the first restriction enzyme; e. mixing the denatured products of step (b) with denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step; f. mixing the denatured products of step (c) with denatured products of step (d), optionally in the presence of a size exclusion agent, in an annealing step; g. mixing the denatured products of steps (e) and (f) in the presence of a size exclusion agent in an annealing step, wherein the concentration of the size exclusion agent is higher than the concentration of size exclusion agent in steps (e) and (f), if present; h. separating and isolating DNA carrying the first adaptor sequence from DNA carrying only the second adaptor sequence or no adaptor sequence, or separating and isolating DNA carrying the second adaptor sequence from DNA carrying only the first adaptor sequence or no adaptor sequence; i. carrying out a nested PCR reaction using the separated DNA from (h) as template DNA and including the use of a first primer which recognises the first primer recognition sequence and, subsequently, a second primer which recognises the second primer recognition sequence and a third primer which recognises the third primer recognition sequence.
9. A method according to claim 8 wherein the marker molecule is biotin or streptavidin.
10. A method according to claim 8 or 9 comprising the further step of inserting at least a portion of the PCR product from (i) into an shRNA expression vector.
11. A method according to any one of claims 1 to 9, wherein the processing step comprises the steps of: a. digesting the one or more cDNA sequences with a first restriction enzyme to produce a fragment of each of said one or more cDNA sequences attached to the first adaptor sequence; b. ligating the product of (a) with the third hairpin adaptor DNA sequence to form hairpin structures; c. replicating the hairpin structures of step (b) to form an inverted repeat of each of said one or more cDNA sequences; d. removing at least a portion of the duplicated first adaptor DNA sequences from the product of step (c) using a second restriction enzyme; and e. processing the product of step (d), if necessary, to encode for a shRNA.
12. A method according to any one of claims 1 to 9, wherein the processing step comprises the steps of: a. carrying out a restriction digest reaction of the one or more pre-identified cDNA sequences using a second restriction enzyme; b. ligating the product of (a) with a third adaptor sequence, comprising a recognition site for a third restriction enzyme, to form hairpin structures; c. nicking the hairpin structure of step (b) using a third restriction enzyme; d. isolating the product of the reaction in step (c) by binding to the marker molecule; e. heating the product isolated in (d) with a fourth primer; f. extension of the fourth primer by a polymerase to generate the inverted DNA sequence complementary to the product of (e); and g. isolating at least a portion of the generated DNA product by use of a fourth restriction enzyme and inserting the resulting DNA into an shRNA expression vector.
13. A first adaptor nucleotide sequence for use in preparing a shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the adaptor sequence comprising: a first primer binding site for PCR amplification of a cDNA attached thereto; a first restriction recognition site for a first restriction enzyme to cause the cleavage of the cDNA attached to the first adaptor sequence into a fragment; and a second restriction recognition site for a second restriction enzyme to cause the cleavage of the cDNA fragment from at least a portion of the first adaptor sequence.
14. The adaptor sequence of claim 13, wherein the first restriction recognition site for a first restriction enzyme causes the cleavage of the cDNA attached to the first adaptor sequence into a fragment of between about 10 and about 30 nucleotides in length.
15. The adaptor sequence of claim 13 or claim 14, further comprising a third restriction recognition site for a third restriction enzyme to introduce a nick into the first adaptor sequence.
16. The adaptor sequence of any one of claims 13 to 15, further comprising a third primer binding site for binding a primer to allow replication of the cDNA.
17. A kit for preparing an shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and second cell type, the kit comprising a first adaptor sequence according to any one of claims 13 to 16.
18. The kit of claim 17, further comprising a second adaptor sequence for use in subtractive hybridisation.
19. The kit of claim 17 or claim 18, further comprising a third hairpin adaptor sequence.
20. An shRNA library enriched for one or more cDNA sequences corresponding to at least a portion of at least one gene which is differently expressed in a first and a second cell type, wherein the shRNA library is obtainable by the method of any of claims 1 to 12.
21. A method of determining the effects of reduction in expression of a gene of interest, comprising the use of an shRNA library according to claim 20.
PCT/GB2009/000684 2008-03-13 2009-03-13 Method of preparing an shrna library WO2009112844A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0804690.6 2008-03-13
GB0804690A GB0804690D0 (en) 2008-03-13 2008-03-13 Method

Publications (1)

Publication Number Publication Date
WO2009112844A1 true WO2009112844A1 (en) 2009-09-17

Family

ID=39328071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2009/000684 WO2009112844A1 (en) 2008-03-13 2009-03-13 Method of preparing an shrna library

Country Status (2)

Country Link
GB (1) GB0804690D0 (en)
WO (1) WO2009112844A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003033673A2 (en) * 2001-10-19 2003-04-24 Agy Therapeutics, Inc. High-throughput transcriptome and functional validation analysis
WO2005023991A2 (en) * 2003-09-05 2005-03-17 The General Hospital Corporation Small hairpin rna libraries

Non-Patent Citations (11)

* Cited by examiner, † Cited by third party
Title
BERNARDS RENÉ ET AL: "shRNA libraries and their use in cancer genetics.", NATURE METHODS SEP 2006, vol. 3, no. 9, September 2006 (2006-09-01), pages 701 - 706, XP002533585, ISSN: 1548-7091 *
BERNS K ET AL: "A large-scale RNAi screen in human cells identifies new components of the p53 pathway", NATURE, NATURE PUBLISHING GROUP, LONDON, UK, vol. 428, 25 March 2004 (2004-03-25), pages 431 - 437, XP003002475, ISSN: 0028-0836 *
DIATCHENKO L ET AL: "SUPPRESSION SUBTRACTIVE HYBRIDIZATION: A METHOD FOR GENERATING DIFFERENTIALLY REGULATED OR TISSUE-SPECIFIC CDNA PROBES AND LIBRARIES", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA, NATIONAL ACADEMY OF SCIENCE, WASHINGTON, DC.; US, vol. 93, 1 June 1996 (1996-06-01), pages 6025 - 6030, XP002911922, ISSN: 0027-8424 *
DIATCHENKO L ET AL: "SUPPRESSION SUBTRACTIVE HYBRIDIZATION: A VERSATILE METHOD FOR IDENTIFYING DIFFERENTIALLY EXPRESSED GENES", METHODS IN ENZYMOLOGY, ACADEMIC PRESS INC, SAN DIEGO, CA, US, vol. 303, 1 January 1999 (1999-01-01), pages 349 - 380, XP009016828, ISSN: 0076-6879 *
HANAZAWA M ET AL: "Use of cDNA subtraction and RNA interference screens in combination reveals genes required for germ-line development in Caenorhabditis elegans.", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA 17 JUL 2001, vol. 98, no. 15, 17 July 2001 (2001-07-17), pages 8686 - 8691, XP002533584, ISSN: 0027-8424 *
PACCHIONI B ET AL: "Semi-multiplex PCR technique for screening of abundant transcripts during systematic sequencing of cDNA libraries.", BIOTECHNIQUES OCT 1996, vol. 21, no. 4, October 1996 (1996-10-01), pages 644 - 646 , 648, XP002533587, ISSN: 0736-6205 *
SILVA J M ET AL: "Second-generation shRNA libraries covering the mouse and human genomes", NATURE GENETICS, NATURE PUBLISHING GROUP, NEW YORK, US, vol. 37, no. 11, 2 October 2005 (2005-10-02), pages 1281 - 1288, XP002399751, ISSN: 1061-4036 *
XU ET AL: "Construction of equalized short hairpin RNA library from human brain cDNA", JOURNAL OF BIOTECHNOLOGY, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 128, no. 3, 24 January 2007 (2007-01-24), pages 477 - 485, XP005856745, ISSN: 0168-1656 *
ZHANG JUN-ZHENG ET AL: "Screening for genes essential for mouse embryonic stem cell self-renewal using a subtractive RNA interference library.", STEM CELLS (DAYTON, OHIO) DEC 2006, vol. 24, no. 12, December 2006 (2006-12-01), pages 2661 - 2668, XP002533583, ISSN: 1066-5099 *
ZHOU DEMIN ET AL: "Generation of shRNA pool library: a revision of the biological technique from the viewpoint of chemistry.", CHEMBIOCHEM : A EUROPEAN JOURNAL OF CHEMICAL BIOLOGY 16 JUN 2008, vol. 9, no. 9, 16 June 2008 (2008-06-16), pages 1365 - 1367, XP002533588, ISSN: 1439-7633 *
ZHUMABAYEVA B ET AL: "Generation of full-length cDNA libraries enriched for differentially expressed genes for functional genomics.", BIOTECHNIQUES MAR 2001, vol. 30, no. 3, March 2001 (2001-03-01), pages 512 - 516 , 518, XP002533586, ISSN: 0736-6205 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10876108B2 (en) 2012-01-26 2020-12-29 Nugen Technologies, Inc. Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation
EP2807292A4 (en) * 2012-01-26 2015-08-26 Nugen Technologies Inc Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation
US9650628B2 (en) 2012-01-26 2017-05-16 Nugen Technologies, Inc. Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library regeneration
US10036012B2 (en) 2012-01-26 2018-07-31 Nugen Technologies, Inc. Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation
JP2015516163A (en) * 2012-05-08 2015-06-11 セレクタ,インク Clone analysis of functional genomic analysis and composition for performing the clone analysis
EP2847351A1 (en) * 2012-05-08 2015-03-18 Cellecta, Inc. Clonal analysis of functional genomic assays and compositions for practicing same
US10196634B2 (en) 2012-05-08 2019-02-05 Cellecta, Inc. Clonal analysis of functional genomic assays and compositions for practicing same
EP2847351A4 (en) * 2012-05-08 2015-11-11 Cellecta Inc Clonal analysis of functional genomic assays and compositions for practicing same
US9429565B2 (en) 2012-05-08 2016-08-30 Cellecta, Inc. Clonal analysis of functional genomic assays and compositions for practicing same
US9957549B2 (en) 2012-06-18 2018-05-01 Nugen Technologies, Inc. Compositions and methods for negative selection of non-desired nucleic acid sequences
US11028430B2 (en) 2012-07-09 2021-06-08 Nugen Technologies, Inc. Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
US11697843B2 (en) 2012-07-09 2023-07-11 Tecan Genomics, Inc. Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
US9822408B2 (en) 2013-03-15 2017-11-21 Nugen Technologies, Inc. Sequential sequencing
US10619206B2 (en) 2013-03-15 2020-04-14 Tecan Genomics Sequential sequencing
US10760123B2 (en) 2013-03-15 2020-09-01 Nugen Technologies, Inc. Sequential sequencing
US10570448B2 (en) 2013-11-13 2020-02-25 Tecan Genomics Compositions and methods for identification of a duplicate sequencing read
US11098357B2 (en) 2013-11-13 2021-08-24 Tecan Genomics, Inc. Compositions and methods for identification of a duplicate sequencing read
US11725241B2 (en) 2013-11-13 2023-08-15 Tecan Genomics, Inc. Compositions and methods for identification of a duplicate sequencing read
US9745614B2 (en) 2014-02-28 2017-08-29 Nugen Technologies, Inc. Reduced representation bisulfite sequencing with diversity adaptors
US10102337B2 (en) 2014-08-06 2018-10-16 Nugen Technologies, Inc. Digital measurements from targeted sequencing
US11099202B2 (en) 2017-10-20 2021-08-24 Tecan Genomics, Inc. Reagent delivery system

Also Published As

Publication number Publication date
GB0804690D0 (en) 2008-04-16

Similar Documents

Publication Publication Date Title
JP7053706B2 (en) Increased specificity of RNA-induced genome editing with shortened guide RNA (tru-gRNA)
WO2009112844A1 (en) Method of preparing an shrna library
McMahon et al. TRIBE: hijacking an RNA-editing enzyme to identify cell-specific targets of RNA-binding proteins
Zhao et al. Gene silencing by artificial microRNAs in Chlamydomonas
JP4339852B2 (en) Methods and compositions for gene silencing
CN113166797A (en) Nuclease-based RNA depletion
CN101395281B (en) Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids and utilities
EP2235179B1 (en) Methods for creating and identifying functional rna interference elements
US20050277139A1 (en) Methods and apparatus for the detection and validation of microRNAs
JP7460539B2 (en) IN VITRO sensitive assays for substrate selectivity and sites of binding, modification, and cleavage of nucleic acids
JP2010516284A (en) Methods, compositions and kits for detection of microRNA
Zinshteyn et al. Nuclease-mediated depletion biases in ribosome footprint profiling libraries
CN110343724B (en) Method for screening and identifying functional lncRNA
CN112384620B (en) Methods for screening and identifying functional lncRNA
US9708603B2 (en) Method for amplifying cDNA derived from trace amount of sample
US20050250100A1 (en) Method of utilizing the 5'end of transcribed nucleic acid regions for cloning and analysis
JP2007520221A (en) Composition and production method of short double-stranded RNA using mutant RNase
Zhou et al. Intronic heterochromatin prevents cryptic transcription initiation in Arabidopsis
US8846350B2 (en) MicroRNA affinity assay and uses thereof
CN111334531A (en) High signal-to-noise ratio negative genetic screening method
CN110546275A (en) Method and kit for removing unwanted nucleic acids
CA2547885A1 (en) Methods for obtaining gene tags
Cheng et al. Dense sgRNA library construction using a Molecular Chipper approach
Zhou et al. CRISPRa screen on a genetic risk locus shared by multiple autoimmune diseases identifies a dysfunctional enhancer that affects IRF8 expression through cooperative lncRNA and DNA methylation machinery
WO2011023827A1 (en) Purification process of nascent dna

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 09720154

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 09720154

Country of ref document: EP

Kind code of ref document: A1