Methods of Preparing cRNA
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Ser. No. 60/165,668, filed November 16, 1999, which is hereby incorporated by reference.
FIELD OF THE INVENTION
The present invention provides new and improved methods for generating amplified RNA molecules. The methods are robust and reliable, and can be used to provide RNA gene fragments for use in methods of analyzing gene expression patterns.
BACKGROUND OF THE INVENTION
In recent years, methods have been developed for the analysis of gene expression in individual cells and tissues. These methods are providing powerful insights into the cellular processes that occur, for example, in disease states. For example, the gene expression profile for normal and diseased cells can be compared to provide information regarding the identity of genes whose expression levels are modified in the disease state. This information can provide insights that are useful in developing treatments for the disease, or in understanding the pathology of the disease.
Microfabricated arrays of large numbers of oligonucleotide probes, called "DNA chips," offer great promise for a wide variety of applications. In particular, DNA chips are useful for generating gene expression profiles of the type discussed above. Typically, DNA chip technology involves a microarray containing many thousands of unique DNA probes fixed to a solid support. Mixtures containing fragments of target nucleic acids are applied to the chip, and fragments that hybridize with the probes are retained on the chip while fragments that do not hybridize are simply washed away. The success of DNA chip technology, however, depends on the ability to obtain a sufficient amount of single-stranded nucleic acid molecules of an appropriate size that can be labeled and hybridized to the chips. Moreover, the amounts of the single-stranded nucleic acid molecules should reflect the amount of the corresponding mRNA in the cell or tissue of interest if the gene expression analysis is to provide any useful quantitative information.
It is often desirable to fragment the target nucleic acid molecule prior to hybridization with a probe array, in order to provide segments which are more readily accessible to the probes, which hybridize more rapidly, and which avoid looping and/or hybridization to multiple probes. On the other hand, target molecules that are too short are more likely to hybridize in a non-specific manner, providing an inaccurate assessment of gene expression patterns. RNA molecules can be fragmented in a straightforward manner by heating in basic solution and, accordingly, RNA is often the nucleic acid of choice for generating gene fragments for use in methods of gene expression analysis. Obtaining sufficient mRNA for the study of gene expression often is problematic. Typically, amplification of the mRNA in some fashion is required to provide sufficient material for detection. Linear amplification methods are preferred over exponential amplification methods such as PCR because they provide a more accurate representation of the relative abundance of expressed genes in a given cell or tissue, preserving rare sequences and providing more accurate quantitation.
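The preference for linear amplification can be illustrated with a toy calculation. The sketch below is not part of any disclosed method: it assumes two transcripts present in equal starting amounts whose per-round copying efficiencies differ slightly, and every numerical value in it is hypothetical. Under these assumptions the efficiency difference compounds in each exponential cycle, whereas in linear amplification it stays a fixed factor.

```python
def exponential_yield(start, efficiency, cycles):
    """Each cycle copies every molecule present, with the given efficiency."""
    return start * (1.0 + efficiency) ** cycles


def linear_yield(start, efficiency, rounds):
    """Each round copies only the original templates, with the given efficiency."""
    return start * efficiency * rounds


# Hypothetical values chosen only for illustration.
start_a = start_b = 100.0   # equal starting amounts of transcripts A and B
eff_a, eff_b = 0.95, 0.85   # slightly different per-round copying efficiencies
rounds = 25

exp_ratio = (exponential_yield(start_a, eff_a, rounds)
             / exponential_yield(start_b, eff_b, rounds))
lin_ratio = (linear_yield(start_a, eff_a, rounds)
             / linear_yield(start_b, eff_b, rounds))

print(f"A:B ratio after exponential amplification: {exp_ratio:.2f}")
print(f"A:B ratio after linear amplification:      {lin_ratio:.2f}")
```

Because the true A:B ratio in this toy model is 1, the further a reported ratio departs from 1, the more the amplification has distorted the apparent relative abundance.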
U.S. Patent No. 5,545,522 (Gelder et al.) describes a method in which mRNA molecules are reverse-transcribed using a complementary primer linked to an RNA polymerase promoter region to make a first strand cDNA. Second strand synthesis relies upon self-priming by the formation of a hairpin loop at the end of the first strand of cDNA. Following second strand synthesis, anti-sense RNA (aRNA) is transcribed from the cDNA by introducing an RNA polymerase capable of binding to the promoter region. The resulting aRNA can be fragmented by heating.
This method has the disadvantage of relying on the formation of the hairpin loop at the end of the first cDNA strand to prime second strand synthesis. First strand cDNA does not always reliably generate such a hairpin loop, meaning that second strand synthesis does not occur, and no aRNA molecule is generated upon initiation of transcription.
It is apparent, therefore, that a need exists for improved methods of generating amplified RNA molecules and RNA fragments that are representative of the type and amounts of cellular mRNA. Preferably, the overall methodologies will be capable of amplifying a broad range of target molecules without prior cloning and, in some instances, without knowledge of the mRNA sequence. The present invention fulfills these and other needs.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide improved methods for generating amplified RNA molecules (cRNA molecules) and RNA fragments that can be used in gene expression analysis and other applications.
In accomplishing these objects, there has been provided, in accordance with one aspect of the present invention, a method for amplifying at least one RNA molecule, comprising the steps of (a) preparing a first strand cDNA molecule by reverse transcription using a primer molecule that hybridizes to the RNA molecule where the primer molecule contains an upstream nucleotide sequence that is recognized by a restriction endonuclease having a 6, 7, or 8 base recognition sequence; (b) synthesizing a double stranded cDNA from the first strand cDNA, where synthesis of the second cDNA strand of the double stranded cDNA is primed by a hairpin loop formed at the 3' end of the first cDNA strand during reverse transcription; (c) digesting the double stranded cDNA with a restriction endonuclease that recognizes the upstream nucleotide sequence to provide a double stranded cDNA containing a cohesive terminus; (d) ligating a double stranded promoter oligonucleotide to the cohesive terminus, where the promoter oligonucleotide comprises a promoter region that is recognized by an RNA polymerase; and (e) transcribing copies of RNA initiated from the promoter region.
In accordance with another aspect of the invention there has been provided a method for amplifying at least one RNA molecule, comprising the steps of: (a) preparing a first strand cDNA molecule by reverse transcription using a primer that hybridizes to the RNA molecule where the primer molecule contains an upstream promoter region that is recognized by an RNA polymerase; (b) digesting the resulting mRNA/cDNA double stranded molecule with an RNAse to provide a single stranded cDNA molecule; (c) ligating a partially double-stranded adapter to the single stranded cDNA molecule to produce a partially double stranded cDNA, where the adapter comprises: (i) an overhang that can hybridize to the 3' end sequence of the single-stranded cDNA and (ii) a 5' end of the adapter positioned for ligation to the 3' end of the single-stranded cDNA when the 3' end of an adapter is hybridized to the 3' end of the single-stranded cDNA; (d) synthesizing a double stranded cDNA molecule from the partially double stranded cDNA; and (e) transcribing copies of RNA initiated from the upstream promoter region.
In accordance with still another aspect of the invention there has been provided a method for amplifying at least one RNA molecule, comprising the steps of: (a) preparing a first strand cDNA molecule by reverse transcription using a primer that hybridizes to the RNA molecule wherein the primer molecule contains an upstream promoter region that is recognized by an RNA polymerase; (b) ligating a partially double-stranded adapter to the single-stranded cDNA molecule to produce a partially double stranded cDNA, wherein the adapter comprises: (i) an overhang that can hybridize to the 3' end sequence of the single-stranded cDNA and (ii) a 5' end of the adapter positioned for ligation to the 3' end of the single-stranded cDNA when the 3' end of an adapter is hybridized to the 3' end of the single-stranded cDNA; (c) synthesizing a double-stranded cDNA molecule from the partially double stranded cDNA; and (d) transcribing copies of RNA initiated from the promoter region.
In one embodiment, a mixture of mRNA molecules is amplified, where a mixture of partially double-stranded adapters is used, and where the mixture of adapters comprises 3'-overhangs 4-10 bases long that are complementary in sequence to all the sequences 4-10 bases long that can be formed by the bases A, C, G and T.
In another embodiment, the promoter region is recognized by a T bacteriophage RNA polymerase, such as a T3 or T7 RNA polymerase, or by SP6 bacteriophage RNA polymerase.
In yet another embodiment, the RNA is eukaryotic mRNA, preferably mRNA having a poly(A) tail.
In still another embodiment, the cRNA molecules are fragmented. The fragmentation can be via heat and/or treatment at high pH, for a time sufficient to cleave at least about 95% of the cRNA molecules.
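The relationship between treatment time and the fraction of molecules cleaved can be sketched with a simple kinetic model. The model below is an assumption made for illustration and is not taken from this specification: each phosphodiester bond is treated as cleaving independently with the same first-order rate constant, so a molecule with more bonds is cut sooner. The rate constant is a hypothetical placeholder; in practice it would depend on temperature, pH and buffer composition and would have to be measured.

```python
import math


def fraction_cleaved(length_nt, rate_per_bond_per_min, minutes):
    """Fraction of molecules cut at least once, assuming independent bonds."""
    bonds = length_nt - 1
    return 1.0 - math.exp(-rate_per_bond_per_min * bonds * minutes)


def minutes_to_cleave(length_nt, rate_per_bond_per_min, target_fraction=0.95):
    """Time needed for the target fraction of molecules to be cut at least once."""
    bonds = length_nt - 1
    return -math.log(1.0 - target_fraction) / (rate_per_bond_per_min * bonds)


# Hypothetical rate constant, for illustration only (per bond, per minute).
RATE = 2e-4

for length in (500, 1000, 2000):
    t = minutes_to_cleave(length, RATE)
    print(f"{length} nt: about {t:.1f} min to cleave 95% of molecules")
```

Under this model the required time scales inversely with transcript length, so a treatment time chosen for the shortest transcripts of interest would also cleave the longer ones.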
In yet a further embodiment, the nucleotides used in the synthesis of the first and/or second strand cDNA are labeled with a detectable label. The detectable label may be at least one of a radioisotope, a chromophore, a fluorophore, an enzyme, or a reactive group.
Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 describes one method of preparing cRNA. First strand cDNA synthesis is carried out using a poly(dT) primer containing a recognition site (RE) at the 5' end for a rare cutter restriction endonuclease. Second strand synthesis is primed using the hairpin loop formed at the end of the first strand. After second strand synthesis has occurred, the double stranded cDNA is digested ("cut") with the rare cutter endonuclease and a DNA fragment containing a promoter sequence is ligated to the cohesive termini generated by the digestion. Transcription is initiated using an RNA polymerase that recognizes the promoter sequence.
Figure 2 describes another method of preparing cRNA. First strand cDNA synthesis is carried out using a poly(dT) primer containing a promoter sequence at the 5' end. The RNA then is digested with RNAseH, and second strand synthesis is carried out using a partially double stranded primer having an overhang comprising a random nucleotide sequence at the 3' end of one strand (hatched area) and a phosphate group at the 5' end of the other strand. The primer is ligated to the first strand using the 5'-phosphate group. After second strand synthesis, transcription is initiated using an RNA polymerase that recognizes the promoter sequence.
Figure 3 describes yet another method of preparing cRNA. First strand cDNA synthesis is carried out using a poly(dT) primer containing a promoter sequence at the 5' end. Second strand synthesis is primed using a partially double stranded primer having an overhang comprising a random nucleotide sequence at the 3' end of one strand (hatched area) and a phosphate group at the 5' end of the other strand. The primer is ligated to the first strand using the 5'-phosphate group. Second strand synthesis occurs by strand displacement from the RNA/DNA duplex formed by the first strand synthesis. After second strand synthesis, transcription is initiated using an RNA polymerase that recognizes the promoter sequence.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
In accordance with the present invention, novel methods are provided for the generation of amplified RNA molecules that correspond in sequence and in relative amount to cellular mRNA molecules. That is, the methods provide amplified RNA (hereinafter "cRNA") comprising a sequence that is substantially identical to a sequence found in a cellular mRNA molecule. Moreover, when applied to mixtures of cellular mRNA molecules, the amplification methods of the invention provide cRNA molecules in relative quantities that reflect the relative quantities of those cellular mRNA molecules. In particular, the methods provide gene fragments in a quantity and form suitable for gene expression analysis.
In general, the methods involve an amplification process that generates cRNA by transcription from a double stranded cDNA that comprises a recognition sequence for a bacterial RNA polymerase.
In a first method, shown schematically in Figure 1, first strand cDNA synthesis is carried out by reverse transcription using a primer that recognizes the cellular mRNA molecule. The skilled artisan is well aware of methods of carrying out reverse transcription reactions. See, for example, Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, Second Edition (Cold Spring Harbor). In one embodiment, the recognition by the primer occurs via recognition of the poly(A) tail at the 3' end of the mRNA molecules, i.e., a poly(dT)-containing primer is used. Upstream (to the 5' end) of the primer sequence that recognizes the mRNA molecule, the primer contains a series of nucleotides comprising a recognition sequence for a "rare cutter" restriction endonuclease. A "rare cutter" restriction endonuclease is an endonuclease with a recognition sequence that is at least six, and preferably at least seven or eight, nucleotides long. The endonuclease NotI is an example of a rare cutter endonuclease.
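The reason a longer recognition sequence cuts more rarely can be shown with a short calculation. The sketch below assumes random DNA with equal base frequencies, which real cDNA only approximates, so the figures are rough expectations rather than predictions. The NotI recognition sequence GCGGCCGC is well established; the example first strand sequence and the helper names are hypothetical and serve only to show where the site introduced by the primer would lie.

```python
def expected_spacing(site_length):
    """Expected bp between occurrences of one fixed site in random DNA
    with equal base frequencies (an idealizing assumption)."""
    return 4 ** site_length


def find_sites(sequence, site):
    """Return 0-based start positions of every occurrence of site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


NOTI_SITE = "GCGGCCGC"  # 8-base recognition sequence of NotI

for n in (4, 6, 8):
    print(f"{n}-base site: about one occurrence per {expected_spacing(n):,} bp")

# Hypothetical first strand cDNA, written 5' to 3': the primer-derived NotI
# site, the poly(dT) stretch, then an invented stretch of copied sequence.
first_strand = NOTI_SITE + "T" * 18 + "CAGTCCATGGACTTAGC"
print("NotI site positions:", find_sites(first_strand, NOTI_SITE))
```

With an 8-base site, an internal occurrence within a typical cDNA a few kilobases long is unlikely under these assumptions, so digestion would be expected to act mainly at the primer-derived site.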
Second strand synthesis then is primed using the hairpin loop formed at the end of the first strand by the reverse transcription step. Methods for carrying out second strand cDNA synthesis are well known in the art. See Sambrook, supra. After second strand synthesis has occurred, the resulting double stranded cDNA is digested with the rare cutter endonuclease and a DNA fragment containing a bacterial promoter sequence is ligated to the cohesive termini generated by the digestion. RNA transcription then is initiated using an RNA polymerase that recognizes the promoter sequence. Preferably, the RNA polymerase is a bacteriophage RNA polymerase, such as T3, T7, or SP6 RNA polymerase. Such polymerases are available commercially, for example from Promega (Madison, WI) and Life Technologies (Rockville, MD). The resulting cRNA molecules can be fragmented as desired using heat and/or high pH by methods that are well known in the art.
In a second method, shown schematically in Figure 2, first strand cDNA synthesis is carried out using a poly(dT) primer containing a promoter sequence at the 5' end. The RNA then is digested with an RNAse, such as RNAseH, and second strand synthesis is carried out using a partially double stranded adapter primer having an overhang at the 3' end of one strand (hatched area in Figure 2) and a phosphate group at the 5' end of the other strand. The overhang typically is about 4-10 bases long. The primer is ligated to the first strand using the 5'-phosphate group. After second strand synthesis, transcription is initiated using an RNA polymerase that recognizes the promoter sequence. Preferably, the RNA polymerase is a bacteriophage RNA polymerase, such as T3, T7, or SP6 RNA polymerase. The resulting cRNA molecules can be fragmented as desired using heat and/or high pH by methods that are well known in the art. The transcription reaction can be carried out until the desired number of cRNA copies is produced. Typically, for gene expression analysis, at least about 50 cRNA copies are produced.
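The layout of the promoter-bearing poly(dT) primer used in the second method can be sketched as a sequence. This is an illustration only: the specification does not give a primer sequence, so the promoter string below is the commonly cited T7 promoter consensus, and the poly(dT) length of 24 and the helper names are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def promoter_primer(promoter, dt_length):
    """Build a primer of the form 5'-promoter-(dT)n-3'."""
    return promoter + "T" * dt_length


# Assumption: commonly cited T7 promoter consensus, not taken from this text.
T7_PROMOTER = "TAATACGACTCACTATAGGG"
DT_LENGTH = 24  # hypothetical length of the poly(dT) stretch

primer = promoter_primer(T7_PROMOTER, DT_LENGTH)

# The poly(A) tail is written here with DNA letters for simplicity.
poly_a_tail = "A" * DT_LENGTH
anneals = reverse_complement(primer[-DT_LENGTH:]) == poly_a_tail

print("Primer (5' to 3'):", primer)
print("3' portion complementary to the poly(A) tail:", anneals)
```

Placing the promoter at the 5' end leaves the 3' poly(dT) stretch free to anneal to the poly(A) tail and be extended by reverse transcriptase, while the promoter becomes double stranded only after second strand synthesis.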
In a third method, which is an alternative version of the second method, the mRNA is not digested using an RNAse, and the second strand synthesis occurs via a strand displacement reaction. To drive the strand displacement, the primer for the second strand synthesis can be present in a molar excess. In addition, all or part of the overhang in the hatched area of Figure 3 can be made of ribonucleotides, or other modified bases that preferentially displace RNA from an RNA/DNA duplex. Following strand displacement, the steps of the second method described above are carried out.
In the second and third methods, when a mixture of mRNAs is used for preparing double stranded cDNA (and subsequent preparation of cRNA), the partially double stranded adapter primer contains a random sequence of bases in the overhanging portion of the primer. This ensures that at least one adapter primer will bind to each first strand cDNA sequence. The random sequence typically will be 4-10 nucleotides in length, and will be complementary in sequence to all the sequences 4-10 bases long that can be formed by the bases A, C, G and T.
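The size of such an adapter mixture, and the way one member of it matches a given first strand cDNA, can be sketched as follows. This is an illustrative enumeration only: the example 3' end sequence and the function names are hypothetical, and real hybridization also tolerates some mismatches that this exact-match check ignores.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def all_overhangs(length):
    """Every sequence of the given length over A, C, G and T."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def matching_overhang(cdna_3prime_end, overhang_pool):
    """Return the pool member that is exactly complementary to the 3' end
    of the first strand cDNA, or None if the pool lacks it."""
    wanted = reverse_complement(cdna_3prime_end)
    return wanted if wanted in overhang_pool else None


# Number of distinct overhangs needed to cover every sequence of each length.
for n in range(4, 11):
    print(f"{n}-base overhang: {4 ** n:,} sequences")

pool = set(all_overhangs(4))
cdna_end = "ACGG"  # hypothetical last four bases (5' to 3') of a first strand
print("Complementary overhang (5' to 3'):", matching_overhang(cdna_end, pool))
```

Because a complete pool contains every sequence of the chosen length, a complementary overhang exists for any cDNA 3' end, which is the property the random overhang is relied on to provide.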
Isolation of mRNAs and Synthesis of Double-stranded cDNAs
The target mRNA population for the practice of this invention may be isolated from a cellular source using many available methods well known in the art. The Chomczynski method, i.e., isolation of total cellular RNA by the guanidine isothiocyanate method (described in U.S. Pat. No. 4,843,155), used in conjunction with, for example, oligo-dT streptavidin beads, is an exemplary mRNA isolation protocol.
The mRNAs are converted to cDNA by reverse transcriptase, e.g., poly(dT)-primed first strand cDNA synthesis by reverse transcriptase, followed by second strand synthesis using a DNA polymerase such as DNA Polymerase I. Such methods are well known to the skilled artisan. For a general description of these methods, see Sambrook et al., 1989, Molecular Cloning - A Laboratory Manual, 2nd ed., Vol. 1-3; and Ausubel et al., 1989, Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience, N.Y. When desired, the skilled artisan will recognize that primers specific for gene families can be used to provide cDNA mixtures containing a desired gene family. For example, it is known that G-protein coupled receptors contain regions of conserved sequence that can be used to design primers or primer mixtures that allow selective isolation of cDNAs encoding the receptors.
In preparing the first strand cDNA, the primer is contacted with the mRNA, a reverse transcriptase, and other reagents necessary for primer extension under conditions sufficient for first strand cDNA synthesis, where the additional reagents include: dNTPs; buffering agents, e.g., Tris-Cl; cationic sources, both monovalent and divalent, e.g., KCl and MgCl2; RNAse inhibitor; sulfhydryl reagents, e.g., dithiothreitol; and the like. A variety of enzymes, usually DNA polymerases, possessing reverse transcriptase activity can be used for the first strand cDNA synthesis step. Examples of suitable DNA polymerases include the DNA polymerases derived from organisms selected from the group consisting of thermophilic bacteria and archaebacteria, retroviruses, yeasts, insects, primates and rodents. Preferably, the DNA polymerase will be selected from the group consisting of the polymerases of Moloney murine leukemia virus (M-MLV), including M-MLV reverse transcriptase lacking RNAseH activity, human T-cell leukemia virus type I (HTLV-I), bovine leukemia virus (BLV), Rous sarcoma virus (RSV), human immunodeficiency virus (HIV), Thermus aquaticus (Taq) and Thermus thermophilus (Tth), avian reverse transcriptase, and the like. Suitable DNA polymerases possessing reverse transcriptase activity may be isolated from an organism, obtained commercially, or obtained from cells which express high levels of cloned genes encoding the polymerases by methods known to those of skill in the art, where the particular manner of obtaining the polymerase will be chosen based primarily on factors such as convenience, cost, availability and the like. The order in which the reagents are combined may be modified as desired.
One protocol that may be used involves the combination of all reagents except for the reverse transcriptase on ice, then adding the reverse transcriptase and mixing at around 4°C. Following mixing, the temperature of the reaction mixture is raised to 37°C, followed by incubation for a period of time sufficient for first strand cDNA primer extension product to form, usually about 1 hour.
First strand synthesis produces an mRNA/cDNA hybrid, which is then converted to double-stranded (ds) cDNA as described above. In the first method, involving self-priming by a hairpin loop, the methods described by Efstratiadis et al., Cell (1976) 7: 279; Higuchi et al., Proc. Natl. Acad. Sci. (1976) 73: 3146; Maniatis et al., Cell (1976) 8: 163; and Rougeon and Mach, Proc. Natl. Acad. Sci. (1976) 73: 3418 may be used, where the hybrid is denatured, e.g., by boiling or hydrolysis of the mRNA, and the first strand cDNA is allowed to form a hairpin loop and self-prime the second strand cDNA.
Incorporation of Labels into the Amplification Product
According to a preferred embodiment of the invention, the cRNA molecules are labeled, by any of many methods well known in the art, with a marker for easy detection. The labeled fragments are particularly desired for many purposes in biotechnology, such as the analysis of gene expression patterns and the determination of DNA polymorphisms.
As used herein, the terms "label" and "labeled" refer to incorporation of a detectable marker, e.g., by incorporation of a radioactively or non-radioactively labeled nucleotide. Various methods of labeling RNA molecules are known in the art and may be used.
Labeling of the cRNA according to the present invention may be achieved by incorporating a marker-labeled nucleotide into the transcription product. A large portion of the labeling methods currently in use are radioactive, and the labels can be obtained from a wide variety of commercial sources. Examples of radiolabels include, but are not restricted to, 32P, 3H, 14C, and 35S.
A large number of convenient and sensitive non-isotopic markers are also available. In general, all of the non-isotopic methods of detecting hybridization probes that are currently available depend on some type of derivatization of the nucleotides to allow for detection, whether through antibody binding, enzymatic processing, or the fluorescence or chemiluminescence of an attached "reporter" molecule. The cRNA product labeled with non-radioactive reporters incorporates single or multiple molecules of the labeled nucleotide, which contains the reporter molecule, generally at specific cyclic or exocyclic positions.
Techniques for attaching reporter groups have largely relied upon (a) functionalization of the 5' or 3' termini of the monomeric nucleosides by numerous chemical reactions (see Cardullo et al. (1988) Proc. Natl. Acad. Sci. 85: 8790-8794); and (b) synthesizing modified nucleosides containing (i) protected reactive groups, such as NH2, SH, CHO, or COOH, (ii) activatable monofunctional linkers, such as NHS esters, aldehydes, or hydrazides, or (iii) affinity binding groups, such as biotin, attached to either the heterocyclic base or the furanose moiety.
According to one aspect of the invention, the labeled nucleotide(s) are labeled with fluorogens. Examples of fluorogens include fluorescein and derivatives, isothiocyanate, dansyl chloride, phycoerythrin, allo-phycocyanin, phycocyanin, rhodamine, Texas Red™, SYBR-Green™ or other proprietary fluorogens. The fluorogens are generally attached by chemical modification. The fluorogens can be detected by a fluorescence detector.
In a preferred embodiment, the labeled nucleotide can alternatively be labeled with a chromogen to provide an enzyme or affinity label. For example, the nucleotide may have biotinyl moieties that can be detected by marked avidin (e.g., streptavidin containing a fluorescent marker or enzymatic activity that can be detected by optical or colorimetric methods). The probe can be labeled with peroxidase, alkaline phosphatase or other enzymes giving a chromogenic or fluorogenic reaction upon addition of substrate. For example, additives such as 5-amino-2,3-dihydro-1,4-phthalazinedione (also known as LUMINOL™) (Sigma Chemical Company, St. Louis, Mo.) and rate enhancers such as p-hydroxybiphenyl (also known as p-phenylphenol) (Sigma Chemical Company, St. Louis, Mo.) can be used to amplify the signal from enzymes such as horseradish peroxidase through a luminescent reaction; and luminogenic or fluorogenic dioxetane derivatives of enzyme substrates can also be used.
Usually, the labeled binding component comprises a direct label, such as a fluorescent label, radioactive label, or enzyme-conjugated label that catalyzes the conversion of a chromogenic substrate to a chromophore. However, it is possible, and often desirable for signal amplification, for the labeled binding component to be detected by at least one additional binding component that incorporates a label. Signal amplification can be accomplished by layering of reactants where the reactants are polyvalent.
The invention has been disclosed broadly and illustrated in reference to representative embodiments described above. Those skilled in the art will recognize that various modifications can be made to the present invention without departing from the spirit and scope thereof. Throughout the specification, any and all references to publicly available documents are specifically incorporated by reference.